Voice Over Ethernet 89 www.newnespress.com

TCP needs to keep track of flow state in order to provide the appearance of a stream. The TCP stream is a two-way channel, symmetric in the sense that neither side is favored over the other. Each side keeps sender state and receiver state. A part of that state is that every byte in the TCP stream, since the stream began, is given an increasing sequence number. As packets are pushed out to the network by the sender, the sender keeps copies of those packets. When the receiver gets a packet, it acknowledges it by sending a packet with the ACK flag set and the Acknowledgment field set to the sequence number of the highest byte it has received before a break in the sequence occurs. This acknowledgment can come back as a part of the next return-direction data packet, but if none are queued, then ACKs are generated in their own, otherwise empty, packets every 200 ms. This process is called delayed acknowledgment. If the acknowledgment is never received by the sender, the sender has to assume that either the original data packet or the acknowledgment itself got lost. The sender will then retry the packet some time later. Once an acknowledgment has been received for a packet, the sender can finally free its copy of that packet.

On the receive side, the receiver cannot pass data up to the application unless there is a contiguous run of bytes at the head of the reassembly buffer. If there is not, the buffer holds onto the bytes it has, and advertises the hole in the next acknowledgment.

TCP uses sophisticated flow control techniques to prevent the sender from sending too much. The basic flow control technique is that the sender cannot have more data outstanding than the window size it hears in any given TCP packet in return. This prevents the receive buffer from being overrun. On top of that, however, TCP engages in congestion control. The sender specifically tries to measure the round-trip time of the network, and its loss rate.
Because TCP is a handshaking protocol, and the sender cannot send when the window is full unless it receives an acknowledgment first, TCP performs best if it can put enough packets on the wire to fill the round-trip time. If, after a round-trip time elapses, an acknowledgment does not come in for a packet, the sender can assume the packet did not arrive and retransmit it right away.

However, what if the network is congested? In that case, switches and routers can start dropping packets. TCP reacts to that packet loss. Its first method is to avoid flooding the line to begin with. With TCP, past success begets future success. To make that work, TCP starts off slowly, sending one packet at a time. Every acknowledgment gives it more confidence, and a reason to send one more packet than before in the next round trip. This process, called slow start, continues until the network finally drops a packet. Once a packet is dropped, the sender will notice it, because subsequent packets that are sent to the receiver will cause duplicate acknowledgments: the hole that the loss created prevents the receiver from acknowledging the later sequence numbers, and yet the receiver is required to send an acknowledgment. The back-to-back duplicates cause TCP to back off by cutting its congestion window (the number of packets it thinks it can have outstanding every round trip) in half. The sender then tries to ease back in, by growing its congestion window by one packet every round-trip time. This process is finely tuned to ensure that the network does not become overly crowded by aggressive behavior. In the early days of the Internet, this did, in fact, happen and was the motivation behind introducing congestion control.

Because TCP refuses to allow any loss, it is required to block the sender and receiver until it can resolve outstanding packet matters. This makes TCP generally inappropriate for voice mobility.
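The window behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions, not a TCP implementation: the function name `simulate_cwnd` is hypothetical, and loss is modeled as a given set of round-trip indices rather than being inferred from duplicate acknowledgments. During slow start each acknowledgment adds one packet, so the window doubles per round trip; a loss halves the window; afterward the window grows by one packet per round trip.

```python
def simulate_cwnd(rounds, loss_rounds, initial_cwnd=1):
    """Return the congestion window (in packets) at the start of each round trip.

    loss_rounds is a hypothetical set of round-trip indices in which the
    network drops a packet; real TCP infers loss from duplicate ACKs.
    """
    cwnd = initial_cwnd
    slow_start = True
    history = []
    for rtt in range(rounds):
        history.append(cwnd)
        if rtt in loss_rounds:
            # Loss detected: cut the window in half and leave slow start.
            cwnd = max(1, cwnd // 2)
            slow_start = False
        elif slow_start:
            # One extra packet per ACK means the window doubles per round trip.
            cwnd *= 2
        else:
            # Ease back in: one additional packet per round trip.
            cwnd += 1
    return history


print(simulate_cwnd(8, {4}))  # [1, 2, 4, 8, 16, 8, 9, 10]
```

With a single loss in the fifth round trip, the window climbs quickly, is cut from 16 to 8, and then creeps upward one packet at a time.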
Interestingly, TCP can be used for signaling protocols, such as Secure SIP (Chapter 2), as long as the applications that use it are prepared to handle cases on lossy networks where the application gets stuck. Also, TCP is being used increasingly for video, mostly because of applications such as consumer-oriented video sharing services, which make the assumption that simplicity is best.

4.2 Quality of Service on Wired Networks

The benefit of packet networks is that they are incredibly flexible. The ironic thing about the transition from circuit networks to packet networks, however, is that the best-effort nature of packet delivery requires that packet networks develop the quality-of-service sophistication that was not needed for circuit networks. On a circuit, there is always room for a high-quality call, if there is room at all. On a packet-based network, however, it is very difficult to tell whether that call can just be squeezed in, or whether problems will arise.

There are two general methods for solving the quality-of-service problem with packet networks. Both methods are based on notions designed for IP, to ensure the simplest network that can deliver on the promises.

4.2.1 Integrated Services

The concept of integrated services is that the quality-of-service mechanisms are integrated directly into the forwarding network. Integrated services are based on the notion of resource reservations, similar to those of circuit networks. The difference between circuit-oriented resource reservations and integrated services reservations is that the latter can accommodate a nearly infinite variety of packet rates, sizes, types, and behaviors.

Integrated services is based on the protocol called RSVP, the Resource Reservation Protocol, defined in RFC 2205 (though named to sound like the phrase répondez, s’il vous plaît, as appropriate for reservations). The idea is rather simple.
A receiver that needs to receive a flow of a certain type requests the ability to get that flow with a specific quality of service. The sender uses a multicast group to send out specific messages, called PATH messages, to announce the availability of a stream. When a listener wants to join, it sends a response called a RESV (for reservation) directly back to the sender. Along the way, the intervening routers on the path have the responsibility of listening in on that protocol, and trying to provide just that quality of service.

The crux of the mechanism is the voluntary announcing of the flow qualities required using a format known as a traffic specification, or TSPEC. The TSPEC for RSVP is shown in Table 4.11. The Token Bucket Rate is a floating-point number that represents the number of bytes per second that the flow is expected to take. The Token Bucket Size is a floating-point number specifying the number of tokens, in bytes, that a flow can accumulate. This is something like a measure of the backlog.

Token buckets work to regulate a flow. The idea is that the flow is given tokens at a fixed rate, one token per allowed byte. Thus, a flow that was admitted for one kilobyte a second will get 1000 tokens a second. The bucket has a maximum capacity, the size, to prevent a flow from getting an infinite hall pass to transmit if it has been idle for a while and not using its resources. Every byte that passes by from that flow needs a token to continue, and takes it from the bucket. Once the bucket is empty, the network can either buffer the packet until it gets a token, or police the flow by dropping that packet. RSVP usually works under the latter condition. However, RSVP also endeavors to ensure that the flows that are within their limits get to use the resources before best-effort traffic.
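The token bucket just described can be sketched as follows. This is a minimal illustration rather than anything RSVP itself specifies: the class name `TokenBucket`, the method `allow`, the choice to start with a full bucket, and the explicit `now` argument (used instead of a real clock so the example is repeatable) are all assumptions. It polices the flow, dropping any packet that does not find enough tokens.

```python
class TokenBucket:
    """Token bucket with one token per byte."""

    def __init__(self, rate, size):
        self.rate = rate      # tokens (bytes) added per second
        self.size = size      # maximum number of tokens the bucket can hold
        self.tokens = size    # assumption: the bucket starts full
        self.last = 0.0       # time of the previous update, in seconds

    def _refill(self, now):
        # Add the tokens earned since the last update, capped at the bucket size.
        elapsed = now - self.last
        self.tokens = min(self.size, self.tokens + elapsed * self.rate)
        self.last = now

    def allow(self, packet_bytes, now):
        """Police a packet: return True to forward it, False to drop it."""
        self._refill(now)
        if packet_bytes <= self.tokens:
            self.tokens -= packet_bytes
            return True
        return False


bucket = TokenBucket(rate=1000, size=1500)  # 1000 bytes/s, 1500-byte burst
print(bucket.allow(1500, now=0.0))  # the full bucket covers this packet
print(bucket.allow(500, now=0.1))   # only about 100 tokens have accumulated
print(bucket.allow(500, now=1.0))   # about 1000 tokens are available by now
```

The second packet is dropped because the flow has already spent its burst allowance; a buffering variant would hold the packet instead of returning False.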
The Minimum Policed Unit is set to the size of the smallest packet that is used in the flow: for voice or video, this is likely to be an RTP packet. The Maximum Policed Size field specifies the largest packet size that the sender may want to generate.

Table 4.11: RSVP TSPEC Format

Field                   Size
Token Bucket Rate       4 bytes
Token Bucket Size       4 bytes
Peak Data Rate          4 bytes
Minimum Policed Unit    4 bytes
Maximum Policed Size    4 bytes

RSVP is a form of admission control. When the resource requests are made, if any one of the routers cannot support the flow because it has already exceeded its admissible capacity, it will inject a reject message, to inform the listener that its flow’s quality of service will not be granted.

RSVP’s greatest disadvantage is that it requires all of the routers to keep state (even soft state) on every flow, and to take action based on the behavior of the flow. Because of this, RSVP is not commonly used in voice mobility networks, and the concepts are not used for wireline. The same concepts behind RSVP, however, appear in the context of wireless networks, where the stakes are higher and the number of devices that must maintain state is dramatically reduced (to one base station and one client).

4.2.2 Differentiated Services

So how do wireline networks get quality of service? They do so through the use of prioritization. Instead of asking for, and accounting for, resources and reservations and policing, the network becomes very simple. Traffic is divided up into classes. Some classes are better than others, and will get special treatment. Most likely, this treatment is just to cut to the head of the line. Each packet, not flow, is independently marked with the priority or class it belongs to. Every router and switch along the way that understands the tags will provide that differentiation, and the ones that do not simply ignore the tags and treat the packet as best effort. This is the concept of differentiated services.
For IP networks, the TOS/DSCP field in IPv4 and the Traffic Class field in IPv6 are expected to hold the specific class or priority that the packet belongs to. The sender self-marks the packet, and the network takes it from there.

Here, two conflicting definitions meet: the IPv4 Type of Service (TOS) and the Differentiated Services Code Point (DSCP) both describe the same byte in the header. Each is a mechanism that was created to try to classify packets on a per-packet basis. TOS is the older mechanism, and is now considered to have fallen out of use. However, for the purposes of voice mobility, a lot is similar about TOS and DSCP. TOS defined, among other things, eight priority levels. The format of the now formally deprecated TOS field is shown in Table 4.12.

Table 4.12: The TOS Field in IPv4

Bit:    0–2          3       4            5             6–7
Field:  Precedence   Delay   Throughput   Reliability   Reserved

Table 4.13: The TOS Precedence

Value   Old Meaning            802.1p Meaning        WMM Meaning
7       Network Control        Network Management    Voice
6       Internetwork Control   Voice                 Voice
5       CRITIC/ECP             Video                 Video
4       Flash Override         Controlled Load       Video
3       Flash                  Excellent Effort      Best Effort
2       Immediate              Undefined             Background
1       Priority               Background            Background
0       Routine                Best Effort           Best Effort

The precedence value is a prioritization that is used within the network to determine the packet's handling. The values run from 0 to 7, with 0 being the lower end of the range. The definitions originally conceived for this value are given in Table 4.13. The table suggests a gradual rise in priority from 0 to 7. The problem with this definition is that different technologies assign different meanings to the 0–7 range. Most equipment endeavors to maintain a consistent mapping from the number to a priority level, no matter how the priority got to the packet. The three different meanings are shown in the columns.
The second column is from IEEE 802.1p, which is a per-frame prioritization extension to Ethernet, and uses a special header to advertise the priority. The third column contains the meaning of the same eight values in WMM, the Wi-Fi prioritization standard. In general, it is best to assume the meaning of the final two columns. Note that the priorities for values 1 and 2 are actually less than best effort in that case. When in doubt, do not use those priorities.

The remaining three flags in Table 4.12 represent extra information that may have been useful for the packet. Setting the delay bit was meant to ask for low delay, whereas setting the throughput or reliability bit was meant to signal that throughput or reliability was a greater concern to the application.

TOS is considered to be replaced, and yet many modern devices in the world of IP telephones use the TOS meanings, and not the later DSCP meanings, in order to support older network configurations that may still be in use. DSCP requires that the TOS meanings for the top three bits still be preserved, as long as the remaining bits are zero. However, DSCP looks at the one byte a different way. Table 4.14 shows the new meaning.

Table 4.14: The DSCP Field in IPv4 (Same Byte as TOS; Different Meaning)

Bit:    0–5             6–7
Field:  Code Selector   ECN

Table 4.15: Assured Forwarding DSCP Values

Drop Probability   Class 1     Class 2     Class 3     Class 4
Low                AF11 = 10   AF21 = 18   AF31 = 26   AF41 = 34
Medium             AF12 = 12   AF22 = 20   AF32 = 28   AF42 = 36
High               AF13 = 14   AF23 = 22   AF33 = 30   AF43 = 38

There are a couple of RFCs that define what the code selector maps to. The goal of the DSCP is to interpret the selector as a somewhat arbitrary code, mapping into a specific quality of service type. RFC 2597 defines the concept of Assured Forwarding (AF), the purpose of which is to allow a service provider to accept markings of packets and apply a certain amount of guaranteed bandwidth, as well as allowing more bandwidth to be given.
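The bit layouts in Tables 4.12 and 4.14 and the values in Table 4.15 can be checked with a little arithmetic. The sketch below uses hypothetical helper names (`af_dscp`, `tos_byte`, `precedence`). It reads each AF name as a class digit followed by a drop-probability digit, and relies on the pattern visible in Table 4.15, where the value equals eight times the class plus two times the drop level. The tables number bits from the most significant end, so the code selector is the upper six bits of the byte and the precedence is the upper three.

```python
def af_dscp(af_class, drop):
    """DSCP value for AF<af_class><drop>, e.g. af_dscp(4, 1) for AF41."""
    if af_class not in range(1, 5) or drop not in range(1, 4):
        raise ValueError("AF class must be 1-4 and drop level 1-3")
    return 8 * af_class + 2 * drop


def tos_byte(dscp, ecn=0):
    """Pack a 6-bit code selector and a 2-bit ECN value into the header byte."""
    return (dscp << 2) | ecn


def precedence(byte):
    """Upper three bits of the byte: the old TOS precedence, 0 to 7."""
    return byte >> 5


print(af_dscp(1, 1), af_dscp(3, 2), af_dscp(4, 3))  # 10 28 38
print(tos_byte(af_dscp(4, 1)))                      # 136
print(precedence(tos_byte(af_dscp(4, 1))))          # 4
```

A device that still interprets the byte as TOS would therefore see an AF41 packet as precedence 4, which is why the class digit and the old precedence levels line up.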
Each class is named AFxx, where the first x is a number from one to four, representing the class of traffic, and the second x is a number from one to three, representing the drop probability from low to high (see Table 4.15). The network administrator is expected to assign meanings to the four classes, in terms of assured, set-aside bandwidth that these codes can eat into. The drop probabilities are meant to be set by the traffic originator to make sure that, if resources are getting exhausted, some packets get more protection than others.

A different concept is defined in RFC 2598. Expedited Forwarding (EF) sets up a specific codepoint, 46, to allow packets to be marked as belonging to a “virtual leased line,” a high-performing point-to-point measure of quality of service. (There is a wrinkle with this DSCP code as it applies to Wi-Fi: all EF-tagged packets get transmitted in the class of service designated for video because of the way the EF tag is coded.)

In total, there are 21 commonly seen DSCPs: the twelve AFs, the EF codepoint, and the eight original precedence values, now known as default and CS1 to CS7.

Nothing in DSCP or differentiated services defines just what the qualities of the differentiated services are to be. This is the advantage of differentiated services: the differentiation is up to the administrator, and can grow as the network grows.

4.2.3 Quality-of-Service Mechanisms and Provisioning

There are a few common ways for quality of service to be provided in networks, using enterprise-grade wireline infrastructure. The concepts all revolve around handling the packets differently when it comes to queuing. Why? Most wireline networks can handle a fairly large amount of traffic, because the wireline technologies, such as Gigabit Ethernet, have enough throughput to make congestion less of an issue.
However, certain protocols are designed to take up as much bandwidth as they can, specifically expanding into the space that you give them. It will always be important on voice mobility networks to keep the voice traffic protected from these applications, especially if they cause changes in delay. Moreover, network congestion can cause loss rates to become problematic.

All of the problems happen to the packets not as they are on the wire, but as they back up in queues within the choke points of the network: the routers or switches that connect the links together. What happens in those queues makes the difference. Thankfully, using the packet classification capabilities from differentiated services, enterprise-grade wireline infrastructure can be used to both police flows that get out of hand and give the ones that are being squeezed out the help they need.

These techniques go under the broad category of queuing disciplines, as they provide the discipline that is used to maintain order in the queues. The idea is to take what was once one monolithic queue for the chokepoint, and to create possibly different queues, each queue leading to the same eventual chokepoint. As traffic heads towards the bottoms of the queues, an element called a scheduler chooses from which queues to take packets, and then provides those packets for transmission. We’ll take queuing disciplines and scheduling together for this discussion.

4.2.3.1 FIFO

The simplest behavior is to do nothing new at all. First-in, first-out (FIFO) queuing refers to using the one queue that is there, putting packets in with the same order in which they arrived, and pulling them out the same way. This sort of queuing is precisely what causes congestion and variable delays. For the purposes of voice, the longer the queue gets, the longer the potential maximum delay the queue can cause the voice packet to suffer.
The alternative is not much better: if the queue gets longer than it can handle, the packets will be dropped.

4.2.3.2 Classification

The first step is to determine whether there is any structure in the packets that can be used to differentiate them. Enterprise-grade classification techniques can use a wide, rich array of properties about the individual packet, including the sender, receiver, size, DSCP value, ports, applications, and routes. These can all be applied in a stateless manner, meaning that the router or switch need look at each packet only in isolation. An additional option exists for some routers and switches with a lot of memory and processing ability. They can use flow state to create stateful classification, in which previous packets that are related to the current one dictate the behavior. This distinction is identical to that used in firewalling.

Once packets are classified, they can be placed into queues by their classes. These queues can be administratively created, or they can be created on the fly based on the class divisions, ensuring that packets from each class stay in separate queues.

Class-based queuing (CBQ) is an extension of this basic concept. Instead of having one level of discrimination, the concept can be extended to a hierarchy of queues, all set up by the administrator. This hierarchy can be powerful in preventing flows and users from stepping on each other, and for shaping the bursts and behavior of the traffic. Traffic shaping is a highly important function for variable bitrate, expansive applications, to prevent them from overwhelming other applications that may not deserve the highest prioritization, but still need to be metered.

Once the packets are classified into sibling queues, the schedulers need to be selected, to determine how to get the packets out of the queues.

4.2.3.3 Round-Robin

The simplest scheduler is the round-robin scheduler.
As the name suggests, the round-robin scheduler takes packets in turn from each queue, wrapping around when it hits the last one. Empty queues get skipped over, but otherwise, everyone gets a shot. Round robin is good for creating packet fairness, where every class gets an equal shot at sending a packet. However, if some of the classes should have a higher priority than the others, then round robin will not suffice.

4.2.3.4 Strict Prioritization

Strict prioritization is a very simple scheduler. Classes are ordered, strictly, from highest to lowest. The scheduler always starts with the highest queue. If there are no packets in the highest-priority queue, it checks the one with the next highest priority. This continues until the scheduler finds a packet, which it then sends.

By draining the highest-priority queue before moving on to the others, strict prioritization ensures that the traffic with the highest prioritization moves right to the head of the line. Even if the lower-priority queues are heavily backed up and congested, if the highest-priority queue is empty and a highest-priority packet comes in, it will move right past the long lines and be sent first.

Strict prioritization is often good enough for voice, especially when the issue is preventing data from competing with voice. However, for elastic or variable applications where one should get more resources than the other, but not too much more, strict prioritization will not suffice either.

4.2.3.5 Weighted Fair Queuing

To provide a sense of both prioritization, of which strict prioritization may provide too much, and fairness, of which round robin may provide too little, there is the notion of fair queuing. In fair queuing, the goal is to provide a fair bitrate to each of the classes. Round robin provides a fair packet rate, which is the same only if the packets are all the same size.
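The round-robin and strict-priority schedulers described above can be contrasted in a short sketch. The function names are hypothetical, and the example assumes that all packets are already queued and that none arrive while the scheduler runs, which a real router cannot assume.

```python
from collections import deque


def round_robin(queues):
    """Yield packets by visiting each queue in turn, skipping empty ones."""
    queues = [deque(q) for q in queues]
    while any(queues):
        for q in queues:
            if q:
                yield q.popleft()


def strict_priority(queues):
    """Yield packets, always serving the first (highest-priority) non-empty queue."""
    queues = [deque(q) for q in queues]
    while any(queues):
        for q in queues:
            if q:
                yield q.popleft()
                break  # start again from the highest-priority queue


voice = ["v1", "v2"]
data = ["d1", "d2", "d3"]
print(list(round_robin([voice, data])))      # ['v1', 'd1', 'v2', 'd2', 'd3']
print(list(strict_priority([voice, data])))  # ['v1', 'v2', 'd1', 'd2', 'd3']
```

The only difference between the two is the `break`: strict prioritization returns to the top queue after every packet, while round robin continues on to the next queue.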
On top of fair queuing, however, the bitrate should be adjustable so that higher-quality flows get more throughput, without exhausting all the throughput available. This is the concept of weighted fair queuing (WFQ).

The idea behind WFQ is that each queue gets a relative weight. That relative weight is used to adjust the data rate that the queue gets. The amount of traffic that the queue gets is always based on how many other queues are active and for how long; the goal is not to tightly control throughputs or to ensure that no one queue gets ahead of the other, but that queues with equal amounts to send get their weight’s worth of relative throughput.

The scheduler’s goal is to give the appearance that each queue with a byte in it has a byte taken out fairly (as if, say, by round robin, though order does not matter). This gives rise to thinking about packets flowing through the queues like fluids. The output requires a given data rate, or velocity, and each of the packets is extruded through its queue a little at a time, in equal amounts. The first packet out, then, would be the one whose last byte gets drawn out first; that is, the one that finishes first.

The problem, of course, is that packets are packets, not bytes, and cannot be drawn out in this manner. What can be done is that the scheduler can do the math that simulates the bit-by-bit extraction, and make sure to dequeue packets, then, in that proportion. The scheduler calculates the expected time the packet at the head of each queue would get drawn out, in units of virtual time that don’t depend on real time but still flow forward. This gives the precise order of the packets that should come out. As a packet comes out, the new packet’s virtual end time is calculated, and so on. This technique ensures that packets flow out in the order they should. The weightings come into play by adjusting the velocity, in virtual time, at which a queue extrudes its packets.
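The virtual-time bookkeeping can be illustrated with a simplified sketch. It assumes that every packet is already queued when scheduling begins, so virtual time starts at zero and each packet's virtual finish time is just the previous finish time in its queue plus its size divided by the queue's weight. A full WFQ scheduler must also handle packets that arrive while others are in service. The function name `wfq_order` and the queue names are hypothetical.

```python
import heapq


def wfq_order(queues, weights):
    """Return packet names in order of virtual finish time.

    queues maps a queue name to a list of (packet_name, size_in_bytes);
    weights maps a queue name to its relative weight.
    """
    heap = []
    for index, (name, packets) in enumerate(queues.items()):
        finish = 0.0
        for seq, (packet, size) in enumerate(packets):
            # A larger weight drains bytes faster, so the packet
            # finishes sooner in virtual time.
            finish += size / weights[name]
            heapq.heappush(heap, (finish, index, seq, packet))
    # Pop everything: the smallest virtual finish time goes out first.
    return [heapq.heappop(heap)[3] for _ in range(len(heap))]


queues = {
    "a": [("a1", 300), ("a2", 300), ("a3", 300)],
    "b": [("b1", 200), ("b2", 200), ("b3", 200)],
}
print(wfq_order(queues, {"a": 2, "b": 1}))
# ['a1', 'b1', 'a2', 'b2', 'a3', 'b3']
```

Queue "a" sends larger packets yet keeps pace with "b" packet for packet, because its weight of 2 lets it move 900 bytes in the virtual time that "b" needs for 450.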
Higher-weighted queues extrude packets more quickly, and thus those packets finish more quickly in virtual time, and hit the wire sooner.

It is important to observe that WFQ is a work-conserving process. Work conservation means that the scheduler never delays sending traffic. If there is a packet to send, in any queue, then at least one packet will be sent. At no time will a work-conserving process refuse to send traffic, or delay sending traffic, in hopes of getting a more even throughput. Work conservation is important for not wasting network resources for the sake of “quality.”

4.2.3.6 Traffic Shaping

Traffic shaping is more severe than fair queuing. Whereas fair queuing is concerned with fairness, traffic shaping is concerned with ensuring that a precise rate of traffic is met by a given class. Traffic shaping is usually performed through the use of some form of token bucket, first mentioned in the context of RSVP (Section 4.2.1). To recap, the idea of a token bucket is that virtual tokens, corresponding to permission to send bytes, are deposited into the virtual bucket corresponding to the queue at a fixed rate. This rate is the goal at which traffic should be sent. The token bucket then requires that a packet from the queue have enough tokens before it can be let past. This requirement ensures a constant bit rate to the flow.

Token buckets are general ways of metering the flow of traffic. Using them to shape traffic, by holding up packets until there are enough tokens for them, is clearly not work-conserving, as the holdup will happen regardless of whether the line will go idle because of it. On the other hand, token buckets have a bucket depth for a reason. If traffic does happen to go idle for a while in the queue that owns the tokens, the queue is allowed to save up its backlog of tokens for when it might need it. Once the traffic resumes, it can use up all of its saved tokens without waiting.
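Shaping can be sketched by computing when each packet would be released. The function below is an illustration under stated assumptions rather than any vendor's shaper: the name `shape` is hypothetical, the bucket is assumed to start full, packets leave in arrival order, and a packet larger than the bucket depth is rejected outright because it could never collect enough tokens.

```python
def shape(arrivals, rate, depth):
    """Return the departure time of each packet under token bucket shaping.

    arrivals is a list of (arrival_time_seconds, size_bytes) in time order;
    rate is in tokens (bytes) per second; depth is the bucket capacity.
    """
    tokens = depth   # assumption: the bucket starts full
    clock = 0.0      # time at which `tokens` was last updated
    departures = []
    for arrival, size in arrivals:
        if size > depth:
            raise ValueError("packet larger than bucket depth can never be sent")
        # A packet cannot leave before it arrives or before the one ahead of it.
        start = max(arrival, clock)
        tokens = min(depth, tokens + (start - clock) * rate)
        if tokens < size:
            # Hold the packet until the missing tokens have been deposited.
            start += (size - tokens) / rate
            tokens = size
        tokens -= size
        clock = start
        departures.append(start)
    return departures


print(shape([(0.0, 1000), (0.0, 500), (2.0, 500), (2.0, 500)], rate=1000, depth=1000))
# [0.0, 0.5, 2.0, 2.0]
```

The second packet is held for half a second even though the line is idle, which is the non-work-conserving behavior described above. The last two packets leave together because the queue saved up tokens while it was idle.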
This allows for the average traffic rate to be more manageable, even if the incoming flow is not perfectly regular.

Traffic shaping holds an important place in keeping variable flows in check, so that they do not exceed specific service-level agreements (SLAs), which often specify a minimum available bandwidth. The goal of an SLA is to give a fat pipe that is shared among users the appearance that it really is a dedicated thin pipe for that one user. This is reminiscent of the reason we embarked on this journey, to make packet networks seem more like dedicated circuits. For voice, a constant, inelastic traffic, traffic shaping does not hold much interest in itself for what we need. However, traffic shaping does highlight one advantage of packet-based networks. They are flexible enough to provide circuit-like throughput guarantees for some services when needed while providing expandable prioritization for other services, all on the same wire.

4.2.3.7 Policing

Policing is the other side of the coin of scheduling and queuing discipline. Instead of deciding to hold onto the packet in a queue until it has met its criteria, classes are watched for the same criteria and their packets are dropped when they exceed it. The point of policing is that it does not require building up the long lines of delayed packets as queuing would. Instead, the policer can just observe and drop packets that go over the mark. Policing is a lot less forgiving than queuing, but it requires fewer resources in the network.

Token buckets are often used for policing. With token bucket policing, when a packet comes by that does not have enough tokens, it is simply dropped. Packets are never delayed in this model.

Policing is a tough tactic to get right, because it works necessarily by dropping packets that could have been queued or sent.
For voice networks, where the goal is to prevent data from interfering with voice, policing is useful only for preventing runaway or hijacked voice streams, which are high priority, from taking over the network. Prioritization is a better method to keep data from affecting voice quality.

4.2.3.8 Random Early Detection

Along with policing comes the idea of how to drop a packet when the queue is filling. Congestion, for data, is a major issue, and as data backs up, it can cause major problems for any traffic that shares the link with it.

The concept behind random early detection (RED) is that congestion can be signaled to TCP, or any other elastic and responsive traffic protocol, before the congestion gets so bad that it causes unfair loss. Congestion causes that unfair loss by affecting whichever random flow's packet happens to be the one too many for the queue and gets dropped first. As such a flow loses packets, it slows down, and other flows expand to take its place.

To restore some symmetry between the flows, random early detection uses a sliding scale of random drop probabilities to keep the backup at bay. When the queue is nearly empty, nothing is dropped. As the queue fills, however, RED kicks in by increasing its drop probability. This slow but steady increase starts backing the flows off before the queue gets full.
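The sliding scale of drop probabilities can be sketched as a simple function of queue length. The thresholds and the linear ramp below follow the commonly used formulation of RED, not anything specified in the text: `min_th`, `max_th`, and `max_p` are hypothetical parameter names with arbitrary default values, and the sketch uses the instantaneous queue length where practical implementations usually smooth it with a running average.

```python
import random


def red_drop_probability(queue_len, min_th, max_th, max_p):
    """Drop probability that rises linearly as the queue fills."""
    if queue_len < min_th:
        return 0.0   # queue nearly empty: nothing is dropped
    if queue_len >= max_th:
        return 1.0   # queue effectively full: everything is dropped
    return max_p * (queue_len - min_th) / (max_th - min_th)


def red_should_drop(queue_len, min_th=20, max_th=80, max_p=0.1, rng=random.random):
    """Randomly decide whether to drop an arriving packet."""
    return rng() < red_drop_probability(queue_len, min_th, max_th, max_p)


for length in (10, 35, 50, 79, 80):
    print(length, red_drop_probability(length, 20, 80, 0.1))
```

Because drops are random, a flow's share of the drops tends to follow its share of the arriving packets, so the heaviest flows are the most likely to be told to slow down.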