the SRM (Send Routing Message) and the SSN (Send Sequence Numbers) flags, which are defined in ISO 10589. If the SRM flag is set on a link, the corresponding LSP has to be sent out on that link. If the SSN flag is set, the corresponding LSP should be included in the next PSNP PDU. Note that these two flags are kept strictly internal to the router. They do not show up in any PDU that the router generates. What the flags do, however, is determine the links on which LSPs and PSNPs are sent.

Getting back to the header fields proper, the DIS extracts this header information from its link-state database and packages up to 15 LSP-IDs in a single TLV. Given an IS-IS MTU of 1497 bytes over Ethernet LANs, a DIS can fit up to six LSP Entry TLVs (TLV #9) into one PDU, resulting in up to 90 LSP-IDs in a single CSNP. So even in the largest networks in the world there are just a few CSNP packets going over the wire every 10 seconds.

Next, each router on the LAN compares its own link-state database to the CSNP received from the DIS. If the DIS reports the same sequence number for an LSP-ID, then everything is fine. If not, then there are three basic mismatch conditions that can occur:

• The CSNP reports an older version of an LSP
• The CSNP reports a more recent version of an LSP
• The CSNP reports an unknown LSP

If the CSNP reports an older version, then the action is simple. Because it appears that the DIS is not up to date, just tell the DIS about the new version of the LSP by re-flooding the most recent version of the LSP onto the LAN. Figure 8.7 illustrates the chain of events. Router B notices that Router A is still carrying an older version of the LSP RouterX.00-00 in its link-state database. So Router B floods the LSP RouterX.00-00 with the most recent sequence number (0x7a). Note, however, that no receiver acknowledges the re-flooded LSP. This principle is sometimes referred to as implicit acknowledgement. So how can the update be made more reliable?
Just wait for a maximum of 10 seconds, which is the regular CSNP interval. If the LSP RouterX.00-00 is mentioned in the next CSNP with the sequence number 0x7a and the checksum is correct as well, then the update was successful. IS-IS is unusual in that respect: it tries to keep the amount of state-related information very low. The periodic transmission of CSNPs is fundamental to well-synchronized databases.

FIGURE 8.7. Router A reports an outdated LSP in its CSNP, which causes a re-flood by Router B; finally Router A reports that it is in sync again by reporting the latest sequence number. Messages shown between Router A (DIS) and Router B: CSNP (RouterX.00-00, Sequence # 0x79, Lifetime 974, Checksum 0x64fa); LSP (RouterX.00-00, Sequence # 0x7a, Lifetime 1146, Checksum 0x3cce); 10 s later, CSNP (RouterX.00-00, Sequence # 0x7a, Lifetime 1136, Checksum 0x3cce).

If the CSNP reports a more recent version of the LSP, the receiving router needs to tell the DIS that it is out of sync by internally setting the SSN flag for this LSP. Setting the SSN flag triggers the sending of a PSNP to the DIS. Figure 8.8 shows the basic structure discussed earlier in this section. Apart from the Start and End LSP-ID fields, which only the CSNP header carries, the difference from the CSNP PDU is that the PSNP PDUs use different code points. IS-IS PDU type 26 indicates a Level-1 PSNP and IS-IS PDU type 27 is used for a Level-2 PSNP. Once the PSNP PDU is formed, all the LSP-IDs for which the DIS reported a more recent version are packaged again in TLV #9s. Once the PSNP PDU is received at the DIS, the DIS re-floods the most recent version of the requested LSP back onto the LAN.

FIGURE 8.8. The PSNP PDU reports just a subset of the LSPs in the link-state database. Header fields, with length in bytes and value: Intra-domain Routing Protocol Discriminator (1) 0x83; Header Length Indicator (1) 17; Version/Protocol ID Extension (1) 1; ID Length (1) 6 (0); R, R, R and PDU Type (1) 0, 0, 0 and 26 or 27; PDU Version (1) 1; Reserved (1) 0; Maximum Area Addresses (1) 3 (0); PDU Length (2); Source ID (ID Length (6) + 1); TLV section (18–1475).

Figure 8.9 illustrates the chain of events for an unknown LSP. Router A reports an LSP (RouterX.00-00) that is unknown to Router B. So Router B sends a PSNP mentioning RouterX.00-00, but with Sequence Number 0, Checksum 0, and Lifetime 0. Sequence Number 0 is specially reserved for the case where a router wants to get its database synchronized. By setting Sequence Number, Lifetime, and Checksum to zero, a router on a LAN indicates that it wants to get a copy of that LSP. Therefore Router A re-floods the latest copy of the LSP RouterX.00-00 onto the LAN. Once again, this re-flooding is done using “implicit acknowledgments”. What might happen is that the LSP does not arrive at Router B, but this is not a problem: Router B would simply resend the PSNP after 5 seconds.

If the DIS reports a new or unknown LSP-ID in its CSNP PDU, then the router that detects the mismatch sends a PSNP requesting the missing LSP-ID by setting the three fields of Sequence Number, Lifetime and Checksum to zero, indicating to the receiver that the sender does not know anything specific about this LSP. The DIS will again re-flood the missing LSP. This procedure is typically executed when a new router comes online.

Synchronization of LSPs on LAN segments is both simple and lean. Contrary to OSPF, which needs to keep a lot of state information for synchronizing link-state databases, IS-IS uses only two flags per LSP for each link: the SRM and the SSN flag. Next, IS-IS synchronization on point-to-point circuits is discussed. Point-to-point links make different use of PSNPs and CSNPs than broadcast links, such as LANs.
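The three mismatch conditions on a LAN can be summarized in a few lines of code. The following is only a minimal sketch, not the logic of any particular implementation: the names Action, LspEntry and compare_with_csnp are hypothetical, only sequence numbers are compared, and listing the router's own (older) header values in the PSNP for the "more recent" case is an assumption, since the text does not spell out which values are sent.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    IN_SYNC = "in sync"
    FLOOD_OWN_COPY = "re-flood own newer copy onto the LAN"
    REQUEST_IN_PSNP = "list the LSP in the next PSNP"


@dataclass(frozen=True)
class LspEntry:
    """Header fields of one LSP as carried in an LSP Entry TLV #9."""
    lsp_id: str
    sequence: int
    lifetime: int
    checksum: int


def compare_with_csnp(local_db, csnp_entries):
    """Compare the entries of a received CSNP with the local database.

    local_db maps LSP-ID -> LspEntry. Returns one (Action, entry) tuple
    per CSNP entry; entry is the LSP to flood or the PSNP entry to send.
    """
    results = []
    for reported in csnp_entries:
        ours = local_db.get(reported.lsp_id)
        if ours is None:
            # Unknown LSP: request it with all three fields set to zero.
            request = LspEntry(reported.lsp_id, 0, 0, 0)
            results.append((Action.REQUEST_IN_PSNP, request))
        elif reported.sequence < ours.sequence:
            # The DIS is behind: flood our copy, no explicit ACK expected.
            results.append((Action.FLOOD_OWN_COPY, ours))
        elif reported.sequence > ours.sequence:
            # We are behind. Assumption: the PSNP lists our own older header.
            results.append((Action.REQUEST_IN_PSNP, ours))
        else:
            results.append((Action.IN_SYNC, None))
    return results


# Values taken from Figure 8.7: the DIS still reports sequence 0x79.
local_db = {"RouterX.00-00": LspEntry("RouterX.00-00", 0x7A, 1146, 0x3CCE)}
csnp = [LspEntry("RouterX.00-00", 0x79, 974, 0x64FA)]
action, entry = compare_with_csnp(local_db, csnp)[0]
print(action.name, hex(entry.sequence))
```

Nothing in the sketch keeps per-neighbour state: every decision is derived from the local database and the CSNP just received, and the next periodic CSNP serves as the implicit acknowledgement.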
FIGURE 8.9. Router B requests a re-flood of new or unknown LSPs by sending a PSNP with Sequence Number and Checksum set to zero. Messages shown between Router A (DIS) and Router B: CSNP (RouterX.00-00, Sequence # 0x79, Lifetime 974, Checksum 0x64fa); PSNP (RouterX.00-00, Sequence # 0x0, Lifetime 0, Checksum 0x0000); LSP (RouterX.00-00, Sequence # 0x7a, Lifetime 1146, Checksum 0x3cce); 10 s later, CSNP (RouterX.00-00, Sequence # 0x7a, Lifetime 1136, Checksum 0x3cce).

8.3 Synchronizing Databases on p2p Links

All link-state routing protocols start their first synchronization after one common event: an adjacency being declared up. Once an adjacency is up on a point-to-point link, the router will jitter a 5-second timer by 25 per cent before sending a CSNP from its own database. Jittering by 25 per cent means that the router computes a random number between 75 and 100 per cent of the underlying timer; 75 per cent of 5 seconds equals 3.75 seconds. The result is a random timer between 3.75 and 5 seconds. The other router does the same thing.

Jittering timers avoids synchronization effects that would otherwise cause traffic spikes between the routers. See Figure 8.10, with a hub router and many spoke routers, for an illustration of how immediate dispatch of PDUs might harm IS-IS peak load performance. If all spoke routers immediately generate a CSNP after the adjacency is up, then the hub router has to process a large number of CSNPs in a relatively short timeframe. This leads to a short-term peak load on the hub router. Also, sending all this control traffic at once might harm other user traffic that runs on the physical link. Just imagine if the spoke links were not physical links but logical Frame Relay circuits (DLCIs) all on one physical link. The result might be short-term congestion or an abrupt increase in delay for user traffic. However, if routers jitter the timer before the CSNP is sent after the adjacency-up event, this reduces the short-term congestion and peak CPU utilization.
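The jitter arithmetic is easy to reproduce. The sketch below assumes that the random value is drawn uniformly from the 75 to 100 per cent window; the text only says "a random number", so the distribution, the function name jittered_delay and the ten spoke routers are illustrative assumptions.

```python
import random


def jittered_delay(base_seconds=5.0, jitter=0.25, rng=random):
    """Return a delay drawn uniformly from [(1 - jitter) * base, base]."""
    if not 0.0 <= jitter < 1.0:
        raise ValueError("jitter must be in [0, 1)")
    return rng.uniform(base_seconds * (1.0 - jitter), base_seconds)


# Ten hypothetical spoke routers whose adjacencies come up at the same time.
rng = random.Random(42)  # fixed seed so that the run is repeatable
delays = sorted(jittered_delay(rng=rng) for _ in range(10))
print(", ".join(f"{d:.2f}" for d in delays))
```

With the defaults, every value falls between 3.75 and 5 seconds, so the CSNPs from the spokes reach the hub spread over a window of up to 1.25 seconds instead of in one burst.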
After a router has sent its CSNP, it waits for a few seconds until it gets the CSNP from its neighbour. If the router does not receive the other CSNP, then another CSNP is scheduled after 5 seconds (minus jitter) and so on.

FIGURE 8.10. Jittering timers helps to spread the processing load over a broader time window. The figure contrasts two cases for a hub router with spoke routers #1 to #N after Adj-UP: without jitter, all CSNPs hit the hub router at the same time; with jitter, the CSNPs are spread over a time window.

However, if the sending router does receive the remote end’s CSNP, then the router can compute the differences between the two link-state databases. For any LSPs that are missing from the sender’s own link-state database, no action is taken. Just sending the CSNP is enough because the other router will see the sender’s CSNP and realize that in the sender’s link-state database there are a couple of LSPs missing. What does the other router do once it detects a database mismatch? It re-floods the missing LSPs, of course.

On point-to-point links, the LSP updates are required to be reliable and therefore must be acknowledged. This is achieved by setting the SRM flag internally for the LSP being sent. Setting the SRM flag translates into a waiting-for-acknowledgement state. As soon as an acknowledgement for the LSP arrives (by a listing in a PSNP or CSNP), the SRM flag is cleared, or the LSP is removed from the retransmission list. If no acknowledgement arrives, then the IS-IS router will periodically check the SRM flags on all links and retransmit LSPs that have not yet been acknowledged.

See Figure 8.11 for the detailed chain of events that happens once the LSP-IDs in the CSNPs do not match. Router A sends its CSNP.
Router B sends its CSNP. Next, Router A re-floods LSP 0000.0000.0005.00-00. Router B re-floods LSP 0000.0000.0006.00-00 and LSP 0000.0000.0006.00-01. Then, Router B sends a PSNP containing LSP-ID 0000.0000.0005.00-00 formatted in the LSP Entry TLV #9 as an acknowledgement for the LSP. Finally, Router A sends a PSNP containing the LSP-IDs 0000.0000.0006.00-00 and 0000.0000.0006.00-01 packaged in the LSP Entry TLV #9 as an acknowledgement for the two LSP fragments.

FIGURE 8.11. After the 3-way handshake each router sends a CSNP. If the two routers’ LSDBs are de-synced, both routers re-flood the missing LSPs and send subsequent PSNPs to acknowledge them. The figure shows a p2p circuit: after Adj-UP each side delays its CSNP by 5 s with 25% jitter. The CSNPs disagree on 0000.0000.0005.00-00 (Sequence 0x142, Lifetime 1099, Checksum 0xabd4 versus Sequence 0x143, Lifetime 1188, Checksum 0x318b) and on 0000.0000.0006.00-00 (Sequence 0x101, Lifetime 1022, Checksum 0x99fe versus Sequence 0x102, Lifetime 1178, Checksum 0x8812). Each router detects the mismatch in the CSNP, floods its newer version as an LSP, and the receiver sends a PSNP as ACK.

ISO 10589 does not mandate sending CSNPs except for the initial synchronization procedure on point-to-point links. However, sending CSNPs periodically after the startup event results in better synchronization of the link-state database. The following section explains how IS-IS link-state database synchronization is improved by sending periodic CSNPs.

8.4 Periodic Synchronization on p2p Circuits

In the IS-IS world of ISO 10589, there is an assumption that each link that can carry IS-IS Hellos can also carry IS-IS LSPs. At first sight, the previous sentence might sound odd and you may think, “Sure, why should a link that can carry a certain IS-IS packet type, not carry arbitrary IS-IS packet types?” But as demonstrated in Chapter 6 “Generating, Flooding and Ageing LSPs”, there can be situations where the IS-IS flooding topology may get pruned. Mesh-groups are a good example of this situation. Certain redundant links are removed from the flooding topology in mesh-groups. As a result, there might be situations where parts of the network may get de-synchronized because the LSPs do not get through. In this environment especially, it might be a good idea to send some additional CSNPs to make sure that the neighbours are well synchronized. Of the two implementations of IS-IS from Cisco Systems and Juniper Networks that are the subject of this book, only Juniper Networks implements a more robust synchronization scheme.

JUNOS software periodically transmits CSNPs at both IS-IS levels on all (including p2p) circuits. The time base is not fixed, that is, the JUNOS software does not transmit its CSNP at a hard-coded rate. What JUNOS software does is take a base value of 5 seconds and multiply that number by the number of interfaces that have adjacencies in the Up state. This technique is called smearing because one timer is smeared over several interfaces. For an illustration of this technique, look at Figure 8.12. The router has seven adjacencies on four interfaces (two broadcast links and two point-to-point circuits). Each interface carries one or more adjacencies in the Up state. Therefore, 5 seconds times 4 interfaces = a 20-second timer started on each of the interfaces. On average, every 5 seconds, a CSNP is sent, resulting in good synchronization. Even if a link is pruned from the flooding topology, for example by use of mesh-groups, periodic CSNPs ensure good synchronization.
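The smearing arithmetic can be sketched as follows. This is not JUNOS code: the function names, the interface names and, in particular, the even 5-second stagger between interfaces are assumptions made for illustration; the text only states the per-interface timer (5 seconds times the number of interfaces with Up adjacencies) and the resulting average of one CSNP every 5 seconds.

```python
BASE_INTERVAL = 5.0  # seconds, the base value named in the text


def smeared_csnp_schedule(interfaces, base=BASE_INTERVAL):
    """Return (per-interface period, {interface: start offset}).

    interfaces lists only the interfaces that carry at least one
    adjacency in the Up state. Assumption: start offsets are staggered
    evenly, one base interval apart.
    """
    period = base * len(interfaces)
    offsets = {name: index * base for index, name in enumerate(interfaces)}
    return period, offsets


def send_times(interfaces, horizon, base=BASE_INTERVAL):
    """List (time, interface) CSNP transmissions before the horizon."""
    period, offsets = smeared_csnp_schedule(interfaces, base)
    events = []
    for name, offset in offsets.items():
        t = offset
        while t < horizon:
            events.append((t, name))
            t += period
    return sorted(events)


# Hypothetical interface names: two broadcast links, two p2p circuits.
interfaces = ["lan-1", "lan-2", "p2p-1", "p2p-2"]
period, _ = smeared_csnp_schedule(interfaces)
print(period)  # 20.0
for when, name in send_times(interfaces, horizon=40.0):
    print(f"{when:5.1f}s  CSNP on {name}")
```

The design point is that the router-wide CSNP rate stays constant as interfaces are added: more interfaces lengthen each interface's period rather than increase the total number of CSNPs built per second.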
FIGURE 8.12. JUNOS sends a periodic CSNP every 5 s per routing process and distributes the load across all active interfaces. In the figure, each of the four interfaces sends a CSNP every 20 s, offset from the others so that one CSNP leaves the router every 5 s.

Fortunately ISO 10589 neither mandates the sending of periodic CSNPs nor strictly discourages sending them. This hole in the specification is utilized by the smearing hack described previously. There are no known interoperability issues between Cisco IOS and JUNOS software. Except for the support calls that are generated by NOC teams that find the number of CSNPs being generated suspiciously high, there are no relevant issues. A typical show isis statistics output resulting from hooking up a JUNOS router to an IOS router looks like this on the JUNOS side.

JUNOS command output

juniper@London> show isis statistics
IS-IS statistics for London:
PDU type   Received  Processed  Drops    Sent  Rexmit
LSP           41034      41034      0      95       0
IIH              36         36      0      34       0
CSNP              1          1      0  420859       0
PSNP             87         87      0    5125       0
Unknown           0          0      0       0       0
Totals        41159      41159      0  426113       0

Total packets received: 41158 Sent: 426079

SNP queue length: 0 Drops: 0
LSP queue length: 0 Drops: 0
SPF runs: 10772
Fragments rebuilt: 104
LSP regenerations: 27
Purges initiated: 0

London received only a single CSNP and sent hundreds of thousands of CSNPs. By looking at this output you can easily see that there must be Cisco routers on the other end, which typically generate just a single CSNP once an adjacency comes up.

Finally, it should be noted that generating CSNPs comes at almost no cost because they are not difficult to build. CSNPs can be constructed by traversing internal tables in a single pass, and require no complex multi-field operations like checksum calculations. In addition, the content of the CSNP does not change frequently. Therefore the CSNP frame can be pre-computed and when the CSNP timer fires, the pre-computed frames are just transmitted over the wire.
The pre-computed PDU needs to change only if there is an update in the link-state database. To be completely honest, CSNP construction is not quite that simple: the LSP lifetimes need to be adjusted as well, because all the LSPs age every second. Once again, however, inserting the lifetime values is just a simple copy operation.

8.5 Conclusion

CSNPs and PSNPs are very simple and powerful mechanisms to synchronize IS-IS link-state databases. In contrast to OSPF, almost no state information needs to be kept for synchronizing link-state databases. Just two bits per LSP per interface are required. The openness of the base IS-IS spec ensures that more robust synchronization mechanisms can be implemented and all surrounding routers can cooperate and interoperate. Because of the inherent simplicity of the IS-IS protocol there are practically no interoperability issues for database synchronization. Ultimately, a robust synchronization scheme is the main prerequisite for loop-free forwarding paths through the network.

10 SPF and Route Calculation

In order for the hop-by-hop routing paradigm to work, link-state routers need a common algorithm to determine a loop-free path to all destinations in a network. In this chapter you will gain insight into how the IS-IS related route-calculation and route-resolution algorithms work. There will be a step-by-step explanation of the three main elements in the route calculation process, namely SPF calculation, route resolution and prefix insertion.

The SPF calculation process has been practically demonized in the past. There is no need to view this process negatively, in the authors’ opinion. This chapter includes a performance assessment of each of the three elements needed for route calculation to correct this unfortunate perception. Also, common router OS implementation knobs for mitigating the CPU overload side-effects of the SPF calculation and route resolution will be discussed.
Finally there will be an implementation assessment of the most dominant performance-related element of the process, which is prefix insertion. The two common schemes for prefix insertion are presented and finally the cost of inserting a prefix and the metrics of current router hardware will be highlighted.

10.1 Route Calculation

From the time that a link-state PDU arrives to the time traffic is flowing through the changed path in a router, a lot of actions need to be taken. Figure 10.1 shows the three different steps that are applied for each route. First, the SPF calculation needs to be run. Depending on the location in the network topology and which information has changed (topology, prefix), there are three choices of SPF runs:

• Full
• Partial
• Incremental

FIGURE 10.1. The three operations for calculating routes: SPF calculation, route resolution and prefix insertion.