7 Host-to-host (transport) layer protocols

Objectives
When you have completed this chapter you should be able to:
• Explain the basic functions of the host-to-host layer
• Explain the basic operation of TCP and UDP
• Explain the fundamental differences between TCP and UDP
• Decide which protocol (TCP or UDP) to use for a particular application
• Explain the meaning of each field in the TCP and UDP headers

The host-to-host communications layer (also referred to as the service layer, or as the transport layer in terms of the OSI model) is primarily responsible for ensuring end-to-end delivery of packets transmitted by the Internet protocol (IP). This additional reliability is needed to compensate for the lack of reliability in IP.

There are only two relevant protocols residing in the host-to-host communications layer, namely TCP (transmission control protocol) and UDP (user datagram protocol). In addition, the host-to-host layer includes the APIs (application programming interfaces) used by programmers to gain access to these protocols from the process/application layer.

Figure 7.1 TCP and UDP within the ARPA model

7.1 TCP (transmission control protocol)

7.1.1 Basic functions
TCP is a connection-oriented protocol and is therefore reliable, although this word is used in a data communications context and not in an everyday sense. TCP establishes a connection between two hosts before any data is transmitted. Because a connection is set up beforehand, it is possible to verify that all packets are received on the other end and to arrange retransmission in the case of lost packets. Because of all these built-in functions, TCP involves significant additional overhead in terms of processing time and header size.

TCP includes the following functions:
• Segmentation of large chunks of data into smaller segments that can be accommodated by IP.
The word ‘segmentation’ is used here to differentiate it from the ‘fragmentation’ performed by IP
• Data stream reconstruction from packets received
• Receipt acknowledgment
• Socket services for providing multiple connections to ports on remote hosts
• Packet verification and error control
• Flow control
• Packet sequencing and reordering

In order to achieve its intended goals, TCP makes use of ports and sockets, connection-oriented communication, sliding windows, and sequence numbers/acknowledgments.

7.1.2 Ports
Whereas IP can route the message to a particular machine on the basis of its IP address, TCP has to know for which process (i.e. software program) on that particular machine it is destined. This is done by means of port numbers ranging from 1 to 65 535. Port numbers are controlled by IANA (the Internet Assigned Numbers Authority) and can be divided into three groups.

Well known ports, ranging from 1 to 1023, have been assigned by IANA and are globally known to all TCP users. For example, HTTP uses port 80.

Registered ports are registered by IANA in cases where the port number cannot be classified as well known, yet it is used by a significant number of users. Examples are port numbers registered for Microsoft Windows or for specific types of PLCs. These numbers range from 1024 to 49 151; the next number, 49 152, is 75% of 65 536.

A third class of port numbers is known as ephemeral ports. These range from 49 152 to 65 535 and can be used by anyone on an ad-hoc basis.

7.1.3 Sockets
In order to identify both the location and the application to which a particular packet is to be sent, the IP address (location) and port number (process) are combined into a functional address called a socket. The IP address is contained in the IP header and the port number is contained in the TCP or UDP header. In order for any data to be transferred under TCP, a socket must exist both at the source and at the destination.
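As a rough illustration of the port groups and of a socket as an (IP address, port) pair, the following Python sketch classifies a port number using the IANA boundaries (1023 / 49 151 / 49 152) and builds a socket address. The function name `port_group` and the address 192.0.2.10 are assumptions made purely for illustration; they do not come from any particular TCP implementation.

```python
def port_group(port: int) -> str:
    """Classify a TCP/UDP port number into one of the three IANA groups."""
    # Port numbers are 16-bit values; port 0 is reserved and not handled here.
    if not 1 <= port <= 65535:
        raise ValueError("port must be in the range 1 to 65535")
    if port <= 1023:
        return "well known"
    if port <= 49151:
        return "registered"
    return "ephemeral"


# A socket address is simply the (IP address, port number) pair.
# 192.0.2.10 is a documentation-only example address, not a real host.
web_server_socket = ("192.0.2.10", 80)

print(port_group(web_server_socket[1]))  # well known
print(port_group(50000))                 # ephemeral
```

The same (address, port) tuple form is what Python's standard `socket` module expects when connecting or binding an IPv4 socket.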
TCP is also capable of creating multiple sockets to the same port.

7.1.4 Sequence numbers
A fundamental notion in the TCP design is that every BYTE of data sent over the TCP connection has a unique 32-bit sequence number. Of course this number cannot be sent along with every byte, yet it is nevertheless implied. The sequence number of the FIRST byte in each segment is included in the accompanying TCP header; for each subsequent byte the receiver simply increments that number in order to keep track of the bytes.

Before any data transmission takes place, both sender and receiver (e.g. client and server) have to agree on the initial sequence numbers (ISNs) to be used. This process is described under ‘establishing a connection’. Since TCP supports full duplex operation, both client and server will decide on their initial sequence numbers for the connection, even though data may only flow in one direction for that specific connection.

The sequence number cannot start at 0 every time, as this would create serious problems in the case of short-lived multiple sequential connections between two machines. A packet with a sequence number from an earlier connection could easily arrive late, during a subsequent connection, and the receiver would have difficulty in deciding whether the packet belongs to the former or to the current connection. It is easy to visualize a similar problem in real life. Imagine tracking a parcel carried by UPS if all UPS agents started issuing tracking numbers beginning with 0 every morning.

The initial sequence number is generated by means of a 32-bit software counter that starts at 0 during boot-up and increments about once every 4 microseconds (although this varies depending on the operating system being used). When TCP establishes a connection, the value of the counter is read and used as the initial sequence number. This creates an apparently random choice of the initial sequence number.
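The counter scheme just described can be modelled in a few lines of Python. This is a simplified sketch, assuming exactly one increment every 4 microseconds; the names `counter_value` and `advance` are hypothetical helpers, and real operating systems use more elaborate schemes.

```python
SEQ_MODULUS = 2 ** 32      # sequence numbers are 32 bits wide
TICK_MICROSECONDS = 4      # assumed counter period, taken from the text


def counter_value(uptime_us: int) -> int:
    """Value of the 32-bit counter uptime_us microseconds after boot-up."""
    return (uptime_us // TICK_MICROSECONDS) % SEQ_MODULUS


def advance(seq: int, nbytes: int) -> int:
    """Sequence number of the byte that follows nbytes bytes starting at seq."""
    return (seq + nbytes) % SEQ_MODULUS


# Initial sequence number for a connection opened 10 seconds after boot-up.
print(counter_value(10 * 1_000_000))       # 2500000

# Sequence numbers wrap around modulo 2**32.
print(advance(SEQ_MODULUS - 2, 5))         # 3

# Time taken for the counter itself to wrap, in hours.
wrap_hours = SEQ_MODULUS * TICK_MICROSECONDS / 1_000_000 / 3600
print(round(wrap_hours, 2))                # 4.77
```

Working modulo 2^32 is what allows the numbering to continue seamlessly when the 32-bit value overflows.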
At some point during a connection the 32-bit sequence number could roll over from 2^32 - 1 (4 294 967 295) and start counting from 0 again. The TCP software takes care of this.

7.1.5 Acknowledgment numbers
TCP acknowledges data received on a PER SEGMENT basis, although several consecutive segments may be acknowledged at the same time. The acknowledgment number returned to the sender to indicate successful delivery equals the number of the last byte received +1, hence it points to the next expected sequence number. For example: 10 bytes are sent, with sequence number 33. This means that the first byte is numbered 33 and the last byte is numbered 42. If received successfully, an acknowledgment number (ACK) of 43 will be returned. The sender now knows that the data has been received properly, as this number matches the sequence number of the next byte it intends to send.

Basic TCP, as described here, does not issue selective acknowledgments, so if a specific segment contains errors, the acknowledgment number returned to the sender will point to the first byte in the defective segment. This implies that the segment starting with that sequence number, and all subsequent segments (even though they may have been transmitted successfully), have to be retransmitted.

From the previous paragraph it should be clear that a duplicate acknowledgment received by the sender means that there was an error in the transmission of one or more bytes starting at that particular sequence number.

Please note that the sequence number and the acknowledgment number in one header are NOT related at all. The former relates to outgoing data, the latter refers to incoming data. During the connection establishment phase the sequence numbers for both hosts are set up independently, hence these two numbers will generally bear no resemblance to each other.

7.1.6 Sliding windows
Some form of acknowledgment has to be returned to the sender in order to guarantee delivery.
This technique, called positive acknowledgment with retransmission, requires the receiver to send back an acknowledgment message within a given time. The transmitter starts a timer so that if no response is received from the destination node within a given time, another copy of the message will be transmitted. An example of this situation is given in Figure 7.2.

Figure 7.2 Positive acknowledgment philosophy

TCP uses the sliding window form of positive acknowledgment, because waiting for an individual acknowledgment to be returned for each packet transmitted is very time consuming. The idea is that a number of packets (with a cumulative number of bytes not exceeding the window size) are transmitted before the source receives an acknowledgment to the first message (due to time delays, etc). As long as acknowledgments are received, the window slides along and the next packet is transmitted.

During the TCP connection phase each host will inform the other side of its permissible window size. For example, for Windows 95/98 this is typically 8K (8192 bytes). This means that, using Ethernet, 5 full data frames comprising 5 × 1460 = 7300 bytes can be sent without acknowledgment. At this stage the remaining window has shrunk to 8192 - 7300 = 892 bytes, which is less than one full frame, so unless an ACK is generated, the sender will have to pause its transmission.

7.1.7 Establishing a connection
A three-way SYN/SYN_ACK/ACK handshake (as indicated in Figure 7.3) is used to establish a TCP connection. As this is a full duplex protocol it is possible (and necessary) for a connection to be established in both directions at the same time.

Figure 7.3 TCP connection establishment

As mentioned before, TCP generates pseudo-random sequence numbers by means of a 32-bit software counter that resets at boot-up and then increments every 4 microseconds.
The host establishing the connection reads a value ‘x’ from the counter (where x can vary between 0 and 2^32 - 1) and inserts it in the sequence number field. It then sets the SYN flag = 1 and transmits the header (no data yet) to the appropriate IP address and port number. Assuming that the chosen sequence number was 132, this action would then be abbreviated as SYN 132.

The receiving host (e.g. the server) acknowledges this by incrementing the received sequence number by one, and sending it back to the originator as an acknowledgment number. It also sets the ACK flag = 1 to indicate that this is an acknowledgment. This results in an ACK 133. The first byte expected would therefore be numbered 133. At the same time the server obtains its own sequence number (y), inserts it in the header, and also sets the SYN flag in order to establish a connection in the opposite direction. The header is then sent off to the originator (the client), conveying the message e.g. SYN 567. The composite ‘message’ contained within the header would thus be ACK 133, SYN 567.

The originator receives this, notes that its own request for a connection has been complied with, and also acknowledges the other node’s request with an ACK 568. Two-way communication is now established.

7.1.8 Closing a connection
An existing connection can be terminated in several ways.

Firstly, one of the hosts can request to close the connection by setting the FIN flag. The other host can acknowledge this with an ACK, but does not have to close immediately as it may need to transmit more data. This is known as a half-close. When the second host is also ready to close, it will send a FIN that is acknowledged with an ACK. The resulting situation is known as a full close.

Secondly, either of the nodes can terminate its connection by issuing an RST, resulting in the other node also relinquishing its connection and (although not necessarily) responding with an ACK.
Both situations are depicted in Figure 7.4.

Figure 7.4 Closing a connection

7.1.9 The push operation
TCP normally breaks the data stream into what it regards as appropriately sized segments, based on some definition of efficiency. However, this may not be swift enough for an interactive keyboard application. Hence the push instruction (PSH bit in the code field) used by the application program forces delivery of the bytes currently in the stream, and the data will be delivered immediately to the process at the receiving end.

7.1.10 Maximum segment size
Both the transmitting and receiving nodes need to agree on the maximum size of the segments they will transfer. This is specified in the options field. On the one hand TCP ‘prefers’ IP not to perform any fragmentation, as this leads to a reduction in transmission speed due to the fragmentation process, and to a higher probability of loss of a packet and the resultant retransmission of the entire packet. On the other hand, there is an improvement in overall efficiency if the data packets are not too small and a maximum segment size is selected that fills the physical packets that are transmitted across the network.

The current specification recommends a maximum segment size of 536 bytes (this is the 576-byte default IP datagram size minus 20 bytes each for the IP and TCP headers). If the size is not correctly specified, for example too small, the framing bytes (headers etc) consume most of the packet size, resulting in considerable overhead. Refer to RFC 879 for a detailed discussion on this issue.

7.1.11 The TCP frame
The TCP frame consists of a header plus data and is structured as follows:

Figure 7.5 TCP frame format

The various fields within the header are as follows:

Source port: 16 bits
The source port number.

Destination port: 16 bits
The destination port number.
Sequence number: 32 bits
The sequence number of the first data byte in the current segment, except when the SYN flag is set. If the SYN flag is set, a connection is still being established and the sequence number in the header is the initial sequence number (ISN). The first subsequent data byte is ISN+1. Refer to the discussion on sequence numbers.

Acknowledgment number: 32 bits
If the ACK flag is set, this field contains the value of the next sequence number the sender of this message is expecting to receive. Once a connection is established this is always sent. Refer to the discussion on acknowledgment numbers.

Data offset: 4 bits
The number of 32-bit words in the TCP header. (Similar to IHL in the IP header.) This indicates where the data begins. The TCP header (even one including options) is always an integral number of 32-bit words long.

Reserved: 6 bits
Reserved for future use. Must be zero.

Control bits (flags): 6 bits
(From left to right)
URG: Urgent pointer field significant
ACK: Acknowledgment field significant
PSH: Push function
RST: Reset the connection
SYN: Synchronize sequence numbers
FIN: No more data from sender

Checksum: 16 bits
The checksum field is the 16-bit one’s complement of the one’s complement sum of all 16-bit words in the header and text. If a segment contains an odd number of header and text octets to be check-summed, the last octet is padded on the right with zeros to form a 16-bit word for checksum purposes. The pad is not transmitted as part of the segment. While computing the checksum, the checksum field itself is replaced with zeros. This is known as the standard Internet checksum, and is the same as the one used for the IP header.

The checksum also covers a 96-bit ‘pseudo header’ conceptually prefixed to the TCP header. This pseudo header contains the source IP address, the destination IP address, the protocol number (06), and TCP length.
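The checksum procedure can be sketched in Python as follows. This is an illustrative sketch only: the function names, the addresses 192.0.2.1 and 192.0.2.2, and the field values in the example segment are all assumptions. The pseudo header is packed as two 32-bit addresses, one zero byte (which pads it to 96 bits, as laid down in RFC 793), the protocol number 6 and the 16-bit TCP length.

```python
import socket
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # pad on the right; never transmitted
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pseudo_header(src_ip: str, dst_ip: str, tcp_length: int) -> bytes:
    """96-bit pseudo header: source, destination, zero byte, protocol 6, length."""
    return struct.pack("!4s4sBBH", socket.inet_aton(src_ip),
                       socket.inet_aton(dst_ip), 0, 6, tcp_length)


def tcp_checksum(src_ip: str, dst_ip: str, segment: bytes) -> int:
    """Checksum of a segment whose checksum field (bytes 16-17) is still zero."""
    return internet_checksum(pseudo_header(src_ip, dst_ip, len(segment)) + segment)


# Hypothetical segment: ports 1025 -> 80, sequence 133, ACK 568, data offset 5,
# flags ACK+PSH (0x18), window 8192, checksum 0, urgent pointer 0, 5 data bytes.
header = struct.pack("!HHIIBBHHH", 1025, 80, 133, 568, 5 << 4, 0x18, 8192, 0, 0)
segment = header + b"hello"
checksum = tcp_checksum("192.0.2.1", "192.0.2.2", segment)

# The receiver repeats the sum with the checksum in place; the result is zero.
received = segment[:16] + struct.pack("!H", checksum) + segment[18:]
check = internet_checksum(pseudo_header("192.0.2.1", "192.0.2.2", len(received)) + received)
print(check == 0)  # True
```

Because the pseudo header includes both IP addresses, a segment delivered to the wrong host fails this check even if its contents are intact.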
It must be emphasized that this pseudo header is only used for computation purposes and is NOT transmitted. This gives TCP protection against misrouted segments.

Figure 7.6 Pseudo TCP header format

Window: 16 bits
The number of data octets, beginning with the one indicated in the acknowledgment field, which the sender of this segment is willing or able to accept. Refer to the discussion on sliding windows.

Urgent pointer: 16 bits
Urgent data is placed at the beginning of a frame, and the urgent pointer points at the last byte of urgent data (relative to the sequence number, i.e. the number of the first byte in the frame). This field is only interpreted in segments with the URG control bit set.

Options:
Options may occupy space at the end of the TCP header and are a multiple of 8 bits in length. All options are included in the checksum.

7.2 UDP (user datagram protocol)

7.2.1 Basic functions
The second protocol that occupies the host-to-host layer is UDP. As in the case of TCP, it makes use of the underlying IP protocol to deliver its datagrams.

UDP is a ‘connectionless’ or non-connection-oriented protocol and does not require a connection to be established between two machines prior to data transmission. It is therefore said to be an ‘unreliable’ protocol, the word ‘unreliable’ being used here as opposed to ‘reliable’ in the case of TCP.

As in the case of TCP, packets are still delivered to sockets or ports. However, no connection is established beforehand and therefore UDP cannot guarantee that packets are retransmitted if faulty, received in the correct sequence, or even received at all. In view of this, one might doubt the desirability of such an unreliable protocol. There are, however, some good reasons for its existence.
Sending a UDP datagram involves very little overhead in that there are no synchronization parameters, no priority options, no sequence numbers, no retransmit timers, no delayed acknowledgment timers, and no retransmission of packets. The header is small, and the protocol is quick and functionally streamlined. The only major drawback is that delivery is not guaranteed. UDP is therefore used for communications that involve broadcasts, for general network announcements, or for real-time data.

A particularly good application is streaming video and streaming audio, where low transmission overheads are a prerequisite, and where retransmission of lost packets is not only unnecessary but also definitely undesirable.

7.2.2 The UDP frame
The format of the UDP frame and the interpretation of its fields are described in RFC 768. The frame consists of a header plus data and contains the following fields:

Figure 7.7 UDP frame format

Source port: 16 bits
This is an optional field. When meaningful, it indicates the port of the sending process, and may be assumed to be the port to which a reply should be addressed in the absence of any other information. If not used, a value of zero is inserted.

Destination port: 16 bits
The port of the receiving process.

Message length: 16 bits
This is the length in bytes of this datagram including the header and the data. (This means the minimum value of the length is eight.)
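To make the header layout concrete, the following Python sketch packs and unpacks a UDP datagram with the standard `struct` module. The function names are hypothetical. RFC 768 defines a fourth 16-bit header field, the checksum; it is simply set to zero here, which under IPv4 indicates that no checksum was computed.

```python
import struct

# Source port, destination port, message length, checksum: four 16-bit fields
# in network byte order, giving the eight-byte header.
UDP_HEADER = struct.Struct("!HHHH")


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Prefix payload with a UDP header; the length covers header plus data."""
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, 0) + payload


def parse_udp_datagram(datagram: bytes):
    """Return (source port, destination port, length, payload)."""
    src_port, dst_port, length, _checksum = UDP_HEADER.unpack_from(datagram)
    return src_port, dst_port, length, datagram[UDP_HEADER.size:length]


# Source port 0 means 'not used'; 5 data bytes give a message length of 13.
datagram = build_udp_datagram(0, 53, b"hello")
print(parse_udp_datagram(datagram))  # (0, 53, 13, b'hello')

# An empty datagram still carries the header, hence the minimum length of eight.
print(len(build_udp_datagram(0, 9, b"")))  # 8
```

Compared with the 20-byte minimum TCP header, these eight bytes are the whole of the per-datagram overhead that UDP adds.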