560 NETWORK SURVIVABILITY

10.4 Why Optical Layer Protection

The optical layer provides lightpaths for use by its client layers, such as the SONET, IP, or ATM layers. (Recall that the layers that use the services provided by the optical layer are called client layers of the optical layer.) We have seen that extensive protection mechanisms are available in the SONET layer, and there is some degree of protection possible in the other client layers as well. These layers were all designed to work independently of each other and not rely on protection mechanisms available in other layers. We will see below that there is a strong need for protection in the optical layer, despite the existence of protection mechanisms in the client layers.

• SONET/SDH networks incorporate extensive protection functions. However, other networks such as IP, ATM, and ESCON networks do not provide the same level of protection. As we saw in Section 10.3, IP traffic for the most part is "best-effort" traffic. However, as carrier networks become more data centric, there is an increasing expectation from both carriers and their customers that these networks will need to provide the same level of availability as SONET and SDH networks. One way of realizing this capability is to develop additional protection mechanisms within the IP, ATM, or other client layers, as we saw in Section 10.3. Another way to protect data networks is to rely on optical layer protection, which can be quite cost-effective and efficient.

• Significant cost savings can be realized by making use of optical layer protection instead of client layer protection. We illustrate this with two examples.

Consider an example of a WDM ring network with lightpaths carrying higher-layer traffic. Figure 10.14 illustrates an example where there is no optical layer protection. Two SONET line terminals (LTEs) are connected to each other through lightpaths provided by the optical layer, as are two IP routers.
For simplicity we look at a unidirectional lightpath from LTE A to LTE B and another lightpath from router C to router D. These two lightpaths are protected by the SONET and IP layers, respectively, using 1 + 1 protection. The working connection from LTE A to LTE B is established on wavelength λ1 along the shortest path in the ring, and the protection connection is established, say, on the same wavelength λ1 around the ring. Likewise, the working connection from router C to router D may be established on λ1 on the shortest path. However, the protection connection from router C to router D, which needs to be routed around the ring, must be allocated another wavelength, say, λ2. Thus two wavelengths are required to support this configuration.

Figure 10.14 A WDM ring built using optical add/drop multiplexers (OADMs), supporting two interconnected SONET line terminals (LTEs) and two interconnected IP routers using protection provided by the SONET and IP layers, respectively. The SONET and IP boxes do not share protection bandwidth.

Figure 10.15 Benefit of optical layer protection. The configuration is the same as that of Figure 10.14. However, the optical layer now uses a single wavelength around the ring to protect both the SONET and IP connections.

Figure 10.15 shows what can be gained by having the optical layer do the protection instead. Now we can eliminate the individual 1 + 1 protection for the SONET LTEs and the IP routers and make them share a common protection wavelength around the ring. Only a single wavelength is required to support this configuration. Note, however, that only a single link cut can be handled by this arrangement, whereas the earlier arrangement of Figure 10.14 can handle some combinations of multiple fiber cuts (see Problem 10.11). Likewise, the
arrangement of Figure 10.14 can support two simultaneous transmitter failures, whereas the arrangement of Figure 10.15 can support only a single such failure. Nevertheless, if we are primarily interested in handling one failure at any given time, the optical layer protection scheme of Figure 10.15 offers a clear savings in capacity.

Consider what would happen if we had to support N such pairs (N being the number of links in the ring), with each of them being adjacent on the ring. Without optical layer protection, N protection wavelengths would be required. With optical layer protection, only one wavelength would be needed. Optical layer protection is more efficient because it shares the protection resources across multiple pairs of client layer equipment. In contrast, client layer protection mechanisms cannot share the protection resources between different or independent clients.

Another example of an IP network operating over WDM links is shown in Figure 10.16. Consider two network configuration options. Figure 10.16(a) shows the IP routers interconnected by two diversely routed WDM links. In this case, no protection is provided by the optical layer, and the protection against fiber cuts as well as equipment failures (for example, router port failure) is handled completely by the IP layer. Note that the configuration shown requires three working ports and three protect ports on each router.

Figure 10.16(b) shows a better way of realizing a network with the same capabilities, by making use of protection within the optical layer. In this case, fiber cuts are handled by the optical layer. A simple bridge-and-switch arrangement is used to connect two diversely routed fiber pairs in a single WDM system.
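Returning briefly to the ring comparison above (N adjacent pairs on an N-link ring), the wavelength counts can be checked with a small sketch. It rests on several assumptions that are not spelled out in the text: pair i is attached to the two nodes joined by link i, each dedicated protection path uses every ring link except the pair's own working link, a path keeps one wavelength end to end, and the ring has at least three links. The function names are hypothetical and chosen only for illustration.

```python
def protection_arc(pair, n):
    """Ring links used by the protection path of `pair`.

    Assumption: pair i works over link i, so its protection path goes
    the other way around the ring and uses every other link.
    """
    return frozenset(link for link in range(n) if link != pair)


def dedicated_protection_wavelengths(n):
    """Protection wavelengths when every pair keeps its own 1 + 1 path.

    Greedy first-fit: two protection paths that share any link must be
    given different wavelengths (wavelength continuity is assumed).
    """
    assigned = []  # (links, wavelength index) for each pair placed so far
    for pair in range(n):
        links = protection_arc(pair, n)
        in_use = {w for other, w in assigned if links & other}
        wavelength = 0
        while wavelength in in_use:
            wavelength += 1
        assigned.append((links, wavelength))
    return len({w for _, w in assigned})


def shared_protection_wavelengths(n):
    """Protection wavelengths when the optical layer shares capacity.

    Only one link cut is handled at a time, so we count how many pairs
    need the protection wavelength for the worst single link cut.
    """
    worst = 0
    for failed_link in range(n):
        # Pair i is hit only when its own working link i is cut.
        affected = [pair for pair in range(n) if pair == failed_link]
        worst = max(worst, len(affected))
    return worst


if __name__ == "__main__":
    for n in (3, 4, 8):
        print(n, dedicated_protection_wavelengths(n),
              shared_protection_wavelengths(n))
```

Under these assumptions the dedicated count grows with the number of pairs while the shared count stays at one, which is the saving described above. The sketch ignores transmitter failures and multiple simultaneous cuts, where the dedicated arrangement retains an advantage.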
In general, it is more efficient to have fiber cuts handled by the optical layer, since a single switch then takes care of restoring all the channels, instead of having each individual IP link take care of the restoration by itself. More importantly, this arrangement can result in a significant savings in equipment cost. In contrast with the previous configuration, this configuration requires each router to have only a single protect port instead of three. If one of the working ports in the router fails, the router directs the traffic onto the protect port. Note that this type of failure cannot be handled by the optical layer.

This example also brings out another value of optical layer protection. Generally the cost of a router port is significantly higher than the cost per port of optical layer equipment. Therefore it is cheaper to reserve protection bandwidth in the optical layer (effectively reserve ports on optical layer equipment), rather than have additional ports in IP routers for this purpose.

Figure 10.16 Example showing the benefit of optical layer protection compared to protecting at the IP layer. (a) All the protection is handled by the routers. Two diversely routed WDM links are used. Each IP router uses three working ports and three protect ports to protect against both fiber cuts and equipment failures. (b) A single WDM line system is deployed, with protection against fiber cuts handled by the optical layer. Equipment failures are handled by the IP layer. The IP routers now use three working ports and an additional protect port in case one of the working ports fails.

• The optical layer can handle some faults more efficiently than the client layers. A WDM network carries several wavelengths of traffic on a single fiber. Without optical layer protection, a fiber cut results in each traffic stream being restored independently by the client layer. In addition, the network management system is flooded with a large number of alarms for this single failure. If instead the optical layer were to restore this failure, fewer entities would have to be rerouted (albeit larger entities), and hence the process would be faster and simpler.

Figure 10.17 Optical layer protection used to enhance SONET protection. The thick lines indicate fiber links, the thin lines indicate lightpaths provided by the optical layer between SONET ADMs, and the dashed line indicates a SONET connection. (a) Normal operation before failure. A SONET ring is realized using lightpaths provided by the optical layer. (b) Due to a fiber failure, a lightpath connecting two adjacent SONET ADMs fails, causing the SONET ADMs to invoke ring switching to rapidly restore the SONET connection. (c) The optical crossconnects (OXCs) perform optical layer restoration and reroute the lightpath around the failure. To the SONET ring, it appears as if the failure has been restored and the ring reverts back to normal operation, ready to tackle another failure.

• Optical layer protection can be used to provide an additional degree of resilience in the network, for instance, to protect against multiple failures. An example of this is shown in Figure 10.17. Consider a SONET BLSR operating over lightpaths provided by the optical layer. Figure 10.17(a) shows normal operation of the network. Figure 10.17(b) shows what happens to a sample SONET connection in the event of a link failure. The BLSR does a ring switch and reroutes the connection around the ring. At this point, until the failed link is repaired, the network cannot handle another failure. Repairing a failed link can take several hours to days, a fairly long period during which the network is vulnerable to additional failures. Optical layer protection can be used to remove this vulnerability.
In Figure 10.17(c), the optical layer reroutes the lightpath on the failed link around the failure over another optical path. At this point, as far as the BLSR is concerned, it appears as if the failed link has been restored, and the ring reverts back to normal operation. This allows the BLSR to handle additional failures while the failed link is actually being repaired.

• Finally, protection in SONET is currently based on rings (UPSR/BLSR). Ring-based schemes require that the capacity in the network reserved for protection be equal to the capacity used for working traffic. Within the optical layer, a variety of mesh-based protection schemes are being developed. These offer the promise of requiring significantly less protection capacity than ring-based schemes. Admittedly, these schemes could also be applied in the SONET layer.

However, optical layer protection does have its limitations:

• Not all failures can be handled by the optical layer. If a laser in an attached client terminal fails, the optical layer cannot do anything about it. Thus, client equipment failures need to be dealt with by the client layer.

• The optical layer may not be able to detect the appropriate conditions that would cause it to invoke protection switching. For instance, a transparent network can only monitor presence or absence of power (and in some cases, the optical signal-to-noise ratio). While it may also be able to measure power degradations, it may not know what the reasonable values for the power levels are because they vary widely depending on the type of signal being carried. Thus it can only trigger protection switching upon detecting loss of light. The bit error rate is a more precise indicator of signal quality, but a transparent network may not be able to measure bit error rate.

• The optical layer protects traffic in units of lightpaths, and it cannot protect part of the traffic within a lightpath and not protect other parts.
Such functions need to be performed by the client layers.

• Protection routes in the optical layer may be longer than the primary routes, and the choice of alternate routes may be severely limited due to link budget considerations.

• We need to pay careful attention to interworking of protection schemes between the different layers. We will discuss some of these issues in Section 10.6.

10.4.1 Service Classes Based on Protection

In Chapter 9, we alluded to the fact that multiple classes of service can be provided by the optical layer based on the type of protection provided. The main differences in these classes lie in the level of connection availability provided and the restoration time for a connection. These different classes will likely be supported using different protection schemes. While no standards have been defined yet, we provide a likely set of services below:

Platinum. This provides the highest level of availability and the fastest restoration times, comparable to SONET/SDH protection schemes, typically around 60 ms. For example, a dedicated 1 + 1 protection scheme could be used to provide this class of service. This class may be viewed as a premium service and is accordingly priced.

Gold. This provides high availability and fast restoration times, typically in the range of hundreds of milliseconds. For example, a shared mesh protection scheme can provide this class of service.

Silver. This class sits below gold in terms of availability and restoration time. For example, a protection scheme that provides "best-effort" restoration may fit into this category. Another example would be a scheme wherein a connection is reattempted from scratch in case of a failure.

Bronze. Here, the optical layer provides unprotected lightpaths. In the event of a failure of the working path, the connection is lost.

Lead. This class of service would have the lowest availability and the lowest priority among all the classes.
For instance, we may support this class by using protection bandwidth reserved for other classes of service. If that bandwidth is needed to protect other higher-priority traffic, connections in this class are preempted.

There is a great deal of debate about what types of applications will use these service classes and which of them will proliferate. For instance, today carriers using SONET/SDH are providing primarily platinum-type services to their customers. However, we expect that the increasing dominance of data traffic will stimulate the need for lower-priced classes of service. For example, carriers interconnecting Internet routers from Internet service providers are providing in some cases platinum services and in other cases bronze (unprotected) services. In the latter case, the IP layer handles all the restoration functions. In the former situation, it is quite possible that some of that traffic could be carried over lightpaths with a lower quality of service.

10.5 Optical Layer Protection Schemes

We next look at the different types of optical layer protection schemes. For the most part, conceptually, the schemes are similar to their SONET and SDH equivalents. However, their implementation is substantially different, for several reasons: the equipment cost for WDM links grows with the number of wavelengths to be multiplexed and terminated, link budget constraints need to be taken into account when designing the protection scheme, and there may be wavelength conversion constraints to deal with.

Table 10.3 A summary of optical protection schemes operating in the optical multiplex section (OMS) layer. Both dedicated protection rings (DPRings) and shared protection rings (SPRings) are possible.

Protection Scheme   1 + 1         1:1           OMS-DPRing   OMS-SPRing
Type                Dedicated     Shared        Dedicated    Shared
Topology            Point-point   Point-point   Ring         Ring

Table 10.4 A summary of optical protection schemes operating in the optical channel layer.

Protection Scheme   1 + 1       OCh-SPRing   OCh-Mesh
Type                Dedicated   Shared       Shared
Topology            Mesh        Ring         Mesh

We saw in Chapter 9 that the optical layer consists of the optical channel (OCh) layer (or path layer), the optical multiplex section (OMS) layer (or line layer), and the optical transmission section (OTS) layer. Just as SONET protection schemes fit into either the line layer (for example, BLSR) or the path layer (for example, UPSR), optical protection schemes also belong to the OCh or OMS layers. An OCh layer scheme restores one lightpath at a time, whereas an OMS layer scheme restores the entire group of lightpaths on a link and cannot restore individual lightpaths separately. Table 10.3 provides an overview of schemes operating in the optical multiplex section layer. Table 10.4 summarizes schemes operating in the optical channel layer. These schemes have not yet been standardized, and there are many variants. We have attempted to use a nomenclature that is consistent with SDH terminology.

In SONET, there is not a significant cost associated with processing each connection separately in the path layer instead of processing all the connections together in the line layer because the processing is done using application-specific integrated circuits, where the incremental cost of processing the path layer compared to the line layer is not significant. In contrast, there can be a significant difference in cost associated with OCh layer schemes relative to OMS layer schemes. An OCh layer scheme has to demultiplex all the wavelengths, whereas an OMS layer scheme operates on all the wavelengths together and thus requires less equipment.
As an example, consider the two protection schemes shown in Figure 10.18. Figure 10.18(a) shows 1 + 1 OMS protection, while Figure 10.18(b) shows 1 + 1 OCh protection. The OMS scheme requires two WDM terminals and an additional splitter and switch. The OCh scheme, on the other hand, requires four WDM terminals and a splitter and switch per wavelength. Thus its equipment cost is higher than the cost of the OMS scheme. Indeed this is the case if all channels are to be protected. However, the cost of OCh protection can be reduced if not all channels need to be protected. Assuming multiplexers, splitters, and switches can be added on a wavelength-by-wavelength basis, the cost of OCh protection grows linearly with the number of channels that are to be protected. The cost of an OMS protection scheme, on the other hand, is independent of the number of channels to be protected. If only a small fraction of the channels are to be protected, then OCh protection is not significantly more expensive than OMS protection.

The choice of protection schemes is dictated primarily by the service classes to be supported (as discussed below) and by the type of equipment deployed. In the SONET/SDH world, protection is performed primarily by the SONET/SDH line terminals (LTEs) and add/drop multiplexers (ADMs) and not by digital crossconnects. This is the case primarily because digital crossconnects were less efficient at performing fast protection than the LTEs and ADMs, partly because they operated on lower-speed tributaries. However, we are likely to see protection functions handled somewhat differently in the optical layer. Multiplexing equipment, such as optical line terminals and add/drop multiplexers, can provide both OCh layer and OMS layer protection in linear or ring configurations. On the other hand, optical crossconnects can provide protection in linear, ring, and mesh configurations.
Unlike their digital crossconnect counterparts in the SONET/SDH world, optical crossconnects are designed to provide efficient protection. Depending on the type of crossconnect (see Section 7.4), the protection could be done either at the optical channel layer (for crossconnects that groom at the wavelength level) or at the STS-1 level (for electrical core crossconnects grooming at STS-1).

Therefore one possibility is to use simple unprotected WDM point-to-point systems and rely on the optical crossconnects to perform the protection functions. Backbone networks handling large numbers of wavelengths may opt for this choice, as may operators who have already deployed a large quantity of unprotected WDM equipment in their networks. The other possibility is to rely on the WDM line terminals and add/drop multiplexers to perform this function. Metropolitan networks using small numbers of channels and not requiring the use of crossconnects may opt for this choice.
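The cost comparison between OCh and OMS protection made earlier in this section (the schemes of Figure 10.18) can be illustrated with a rough sketch. The unit costs below are hypothetical placeholders in arbitrary units, not values from the text, and the model assumes, as the text does, that multiplexer ports, splitters, and switches can be added one wavelength at a time. It counts only the extra equipment needed for protection and ignores any fixed common equipment in the additional WDM terminals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitCosts:
    """Hypothetical unit costs in arbitrary units (illustrative only)."""
    per_channel_mux: float = 1.0  # one wavelength port on a WDM terminal
    splitter: float = 0.5
    switch: float = 1.0


def oms_protection_overhead(costs):
    """Extra equipment for 1 + 1 OMS protection.

    One splitter and one switch act on the whole multiplexed signal,
    so the overhead does not depend on how many channels are protected.
    """
    return costs.splitter + costs.switch


def och_protection_overhead(protected_channels, costs):
    """Extra equipment for 1 + 1 OCh protection.

    Each protected channel needs its own splitter and switch, plus a
    port on each of the two additional WDM terminals (assumed here to
    be populated one wavelength at a time).
    """
    per_channel = 2 * costs.per_channel_mux + costs.splitter + costs.switch
    return protected_channels * per_channel


if __name__ == "__main__":
    costs = UnitCosts()
    for k in (1, 4, 16, 32):
        print(k, oms_protection_overhead(costs),
              och_protection_overhead(k, costs))
```

With these placeholder numbers the OCh overhead grows linearly with the number of protected channels while the OMS overhead stays fixed, so the gap between the two is small when only a few channels are protected. Real equipment pricing would change the crossover point.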