Developers new to the world of Web services tend to think only in HTTP terms. There are many other ways to transport the data your service produces to one or more clients, but before listing some examples, let's talk about the OSI model of communicating systems.
The way people think about communication between systems has been greatly influenced by the Open Systems Interconnection (OSI) model, which provides a conceptual system of seven "layers," from the physical hardware (layer 1) up to the application (layer 7). In this article we are concerned mostly with the transport layer (4) and the layers above it. For more information, check out my article on data transport formats. First, let's look at approaches that directly connect a server with a single client.
Sockets
Sockets are a programming abstraction for representing connections between processes, typically across a computer network. The concept first appeared in the early 1980s in the BSD Unix operating system as a C library, and it has proven so useful that socket APIs now underlie nearly all inter-computer communication in most languages. Network sockets use an IP (Internet Protocol) address and a "port" number to locate a computer and the communicating process on it.
There are various conventions about the availability and uses of ports which I am not going to get into here. Actual low-level interfacing of the socket concept to physical hardware is up to the operating system. Programs normally reach the TCP and UDP Internet protocols through sockets, but you can also program at a more basic "raw socket" level. Raw socket communication is very fast but tricky to program and not very portable. Learn more about socket programming in Java.
PROS:
- Fastest possible communication.
- Flexible data size.
CONS:
- Considerable custom programming effort required.
- Limited portability between disparate systems.
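As a rough illustration of how an IP address and a port number together identify a communicating process, the following Python sketch binds a socket to the loopback address and lets the operating system pick a free port. The function name `open_listener` is hypothetical, and Python is used here only for brevity.

```python
import socket

def open_listener(host="127.0.0.1"):
    """Bind a stream socket to `host` and let the OS choose a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind((host, 0))   # port 0 asks the OS for any unused port
    sock.listen(1)         # mark the socket as ready to accept a client
    return sock

if __name__ == "__main__":
    listener = open_listener()
    ip, port = listener.getsockname()
    print(f"A client would reach this process at {ip}:{port}")
    listener.close()
```

The address and port pair returned by `getsockname()` is what a client on the same machine would need in order to connect.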
User Datagram Protocol (UDP)
The simplest Internet transport protocol does not provide automatic error recovery or "handshake" verification between communicating processes, and thus programmers sometimes call it the Unreliable Datagram Protocol. UDP data packets may be addressed to a single recipient or to multiple recipients. You should only use UDP if the loss of one or more data packets will not crash the client's application. Examples include streaming media and games.
PROS:
- Single or multiple recipients.
CONS:
- Only usable where data loss can be tolerated.
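To make the lack of a handshake concrete, here is a minimal Python sketch that sends a single datagram over the loopback interface. The function name `udp_round_trip` and the timeout value are assumptions chosen for illustration. On a real network the datagram could be lost, which is why the receiver sets a timeout instead of waiting forever.

```python
import socket

def udp_round_trip(payload: bytes, timeout: float = 2.0) -> bytes:
    """Send one datagram to a local receiver and return what arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # the OS picks a free port
        receiver.settimeout(timeout)      # give up if nothing arrives
        # No connection setup and no handshake: the datagram is just sent.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(65535)
        return data
    finally:
        sender.close()
        receiver.close()

if __name__ == "__main__":
    print(udp_round_trip(b"frame-1"))
```

Nothing tells the sender whether the datagram arrived. Any acknowledgement or retry logic would have to be written by the application.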
Transmission Control Protocol (TCP)
Commonly referred to as TCP/IP, this protocol builds a system of error checking and correction on top of IP to provide reliable point-to-point communication of data. The higher-level protocols which are built on TCP, such as HTTP and FTP, can assume that transmitted data is getting to the recipient correctly and that any failure in communication will result in a detectable error. This advantage comes at the cost of the extra time used to detect errors, retransmit data blocks and control network behavior.
PROS:
- Well understood and widely supported in a variety of languages.
- Flexible adaptation to network conditions.
CONS:
- Slower than UDP.
- Multiple recipients not possible on a single connection.
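For comparison, a TCP exchange needs a listening side and a connecting side, and the data arrives complete and in order. The sketch below is one possible way to show this in Python over the loopback interface. The names `echo_once` and `tcp_round_trip` are hypothetical, and a real server would handle many connections and errors.

```python
import socket
import threading

def echo_once(listener: socket.socket) -> None:
    """Accept one client, read until it stops sending, then echo it all back."""
    conn, _addr = listener.accept()
    with conn:
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:          # empty read: the client finished sending
                break
            chunks.append(chunk)
        conn.sendall(b"".join(chunks))

def tcp_round_trip(payload: bytes) -> bytes:
    """Send `payload` to a local echo server over TCP and return the reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    server = threading.Thread(target=echo_once, args=(listener,), daemon=True)
    server.start()
    received = bytearray()
    with socket.create_connection(listener.getsockname(), timeout=5) as client:
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)   # tell the server no more data is coming
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            received.extend(chunk)
    server.join(timeout=5)
    listener.close()
    return bytes(received)

if __name__ == "__main__":
    print(len(tcp_round_trip(b"x" * 100_000)))
```

Note that the payload is read back in several chunks. TCP delivers a reliable stream of bytes, so the application decides where one message ends and the next begins.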
Using an Intermediate Server
Building further on TCP/IP, we have data transmission systems which rely on an intermediate server to store and eventually forward data to authorized recipients. This approach may be indicated if you need to distribute results to multiple recipients, to reach users who may not be connected when the service result is generated, or to deliver a result that takes a long time to generate. An example would be a RESTful Web service POST which initiates creation of a large PDF-formatted document. Get more information about the TCP/IP protocol.
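The store-and-forward idea can be sketched in a few lines of Python. Everything here is a hypothetical in-memory stand-in, not a real server: `ResultStore`, `submit`, `fetch` and `slow_report` are names invented for illustration. The point is only that the request returns a ticket at once, and the result is picked up later.

```python
import time
import uuid
from concurrent.futures import ThreadPoolExecutor

class ResultStore:
    """Hypothetical stand-in for an intermediate server holding results."""

    def __init__(self):
        self._pool = ThreadPoolExecutor(max_workers=2)
        self._jobs = {}

    def submit(self, work, *args):
        """Start slow work and return a ticket immediately (like a POST)."""
        ticket = uuid.uuid4().hex
        self._jobs[ticket] = self._pool.submit(work, *args)
        return ticket

    def fetch(self, ticket):
        """Return the result if it is ready, otherwise None (like a later GET)."""
        job = self._jobs[ticket]
        return job.result() if job.done() else None

    def wait(self, ticket, timeout=None):
        """Block until the result for `ticket` is available."""
        return self._jobs[ticket].result(timeout=timeout)

    def close(self):
        self._pool.shutdown(wait=True)

def slow_report(title):
    """Pretend to build a large document."""
    time.sleep(0.2)
    return f"report for {title}".encode()

if __name__ == "__main__":
    store = ResultStore()
    ticket = store.submit(slow_report, "Q3")
    print("ticket issued:", ticket)
    print("result size:", len(store.wait(ticket, timeout=5)))
    store.close()
```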
Email
Originating in the need to send text messages on ARPANET, the precursor to the familiar Internet, email has been extended with the Multipurpose Internet Mail Extensions (MIME) standards, which permit sending any type of content as an attachment.
PROS:
- Email clients are widely available.
- Toolkits to create custom client software are available in many computer languages.
- Distribution to multiple recipients is relatively easy.
CONS:
- Some mail servers may limit the size of attachments.
- Security precautions may prevent transmission of some data types.
- Encoding of binary data adds bulk.
- Uncontrollable delay as mail is forwarded between servers.
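Python's standard `email` package can show both the convenience and the bulk. The sketch below builds, but does not send, a message with a binary attachment. The addresses, the file name and the helper name `build_message` are made up for illustration. Actually sending the message would additionally require an SMTP server.

```python
from email.message import EmailMessage

def build_message(sender, recipients, subject, body, attachment, filename):
    """Build a MIME message carrying `attachment` (bytes) as a file."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)   # several recipients in one header
    msg["Subject"] = subject
    msg.set_content(body)
    # Binary data is base64-encoded by default, which grows it by about a third.
    msg.add_attachment(attachment, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg

if __name__ == "__main__":
    raw = bytes(range(256)) * 40
    msg = build_message("service@example.com",
                        ["alice@example.com", "bob@example.com"],
                        "Your result", "The result is attached.",
                        raw, "result.bin")
    print("raw bytes:", len(raw), "encoded message bytes:", len(msg.as_bytes()))
```

Comparing the two sizes printed at the end makes the encoding overhead visible.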
Message-Oriented Middleware (MOM)
Message-oriented middleware, or MOM, is commonly used in corporate networks for reliable movement of data between dissimilar systems and applications. The Java version, called Java Message Service (JMS), is part of the Java Enterprise Edition and is thus widely available. Messages may contain any type of content. A message service supports two models of transmission: point-to-point and publish/subscribe. Point-to-point is similar to email in that the server maintains a queue of messages assigned to a particular client. Publish/subscribe allows a user to subscribe to "topics" which receive messages from publishers. Any number of users can subscribe to a topic, subject to security settings.
PROS:
- Flexibility of message delivery.
- Commonly available in both open-source and proprietary implementations.
CONS:
- Specialized server and client software is required.
- Users must subscribe to a topic.
- Compatibility between vendors and operating systems is incomplete.
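The difference between the two delivery models can be shown with a toy, in-memory broker. This is not JMS or any real product's API. The class `ToyBroker` and its method names are invented for this sketch, and a real broker would add persistence, networking and security.

```python
from collections import defaultdict, deque

class ToyBroker:
    """In-memory illustration of point-to-point and publish/subscribe."""

    def __init__(self):
        self._queues = defaultdict(deque)       # one message queue per client
        self._subscribers = defaultdict(list)   # callbacks registered per topic

    # Point-to-point: a message waits in one client's queue until collected.
    def send(self, client, message):
        self._queues[client].append(message)

    def receive(self, client):
        queue = self._queues[client]
        return queue.popleft() if queue else None

    # Publish/subscribe: every current subscriber of the topic gets a copy.
    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers[topic]:
            callback(message)
        return len(self._subscribers[topic])

if __name__ == "__main__":
    broker = ToyBroker()
    broker.send("alice", "report ready")
    print(broker.receive("alice"))
    inbox = []
    broker.subscribe("reports", inbox.append)
    broker.publish("reports", "monthly report published")
    print(inbox)
```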
Cloud Services
The most extreme data transport problem is the case where the size of the file and the number of intended recipients totally overwhelm the available bandwidth on your server. As discussed in my earlier article, cloud services can come to your rescue very economically.
PROS:
- Removes load from your server.
- Access can be public or tightly controlled.
CONS:
- Requires a cloud service account.
- Specialized server programming required to move the data and provide links to clients.
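One common way to hand out tightly controlled links is to sign them so that they expire. The sketch below illustrates the general idea with an HMAC signature. It is not any cloud provider's actual scheme, and the host name, secret and function names are all hypothetical. A real service supplies its own signing API, which you should use instead.

```python
import hashlib
import hmac
import time

SECRET = b"demo-secret"   # hypothetical shared secret, for illustration only

def _sign(path, expires_at, secret):
    payload = f"{path}?expires={expires_at}".encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()

def make_link(path, expires_at, secret=SECRET):
    """Return a download link that carries an expiry time and a signature."""
    sig = _sign(path, expires_at, secret)
    return f"https://files.example.com{path}?expires={expires_at}&sig={sig}"

def is_valid(path, expires_at, sig, now=None, secret=SECRET):
    """Check that the link has not expired and that the signature matches."""
    now = time.time() if now is None else now
    expected = _sign(path, expires_at, secret)
    return now < expires_at and hmac.compare_digest(expected, sig)

if __name__ == "__main__":
    expires = int(time.time()) + 3600   # valid for one hour
    print(make_link("/reports/big.pdf", expires))
```

Because the signature covers both the path and the expiry time, a client cannot extend the lifetime of a link or reuse its signature for a different file.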
There is more than HTTP
Newcomers to designing Web services should be open to alternative ways of getting results to clients. There is a spectrum of technologies to choose from, and you should not settle on conventional HTTP without considering the alternatives.