The Complete IS-IS Routing Protocol- P29 ppt

metric). Based on that, the partial run is basically a search operation that tries to find the lowest metric for a given prefix. Figure 10.14 illustrates the simplicity of a partial SPF calculation. The leaf information of all routers on the PATH list, plus that of the Pennsauken root router, is examined: the IPv4 prefixes are extracted and moved to a table. Next, the list is sorted and duplicate entries with a worse cost are eliminated. Finally, the prefixes are sorted by their cost in ascending order. This simple search operation is computationally much less complex than the topological section of the full SPF run.

Both JUNOS and IOS support partial runs for IPv4 and IPv6. In IOS, you can also control the SPF delay for partial route calculations (PRCs). PRC is an IOS term, and the delay is controlled using the prc-interval command in router isis configuration mode. These timers can be more aggressive (shorter) than the spf-interval <a> <b> <c> timers, because a partial SPF run burdens the control plane less than a full run, so the router does not need to protect itself as much.

The following configuration example sets the initial wait before a partial SPF calculation to 100 ms. For the second run, the router holds down for 250 ms. The PRC also employs an exponential back-off timer, which means that after the second run the hold-down value is 500 ms. The first argument of the command sets the maximum hold-down value, here one second.

IOS configuration
In IOS there are three timers to control partial SPF hold-down. The three timers work similarly to the timers of the spf-interval configuration command.

London# show running-config
[…]
router isis
 prc-interval 1 100 250
[…]

JUNOS does not have a dedicated knob to control the PRC behaviour. In JUNOS, there is just one hold-down logic path. For partial SPF runs, therefore, the same hold-down logic applies as for full SPF runs.
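The timer sequence described for prc-interval 1 100 250 can be sketched as follows. This is only an illustration of the behaviour stated in the text (100 ms initial wait, 250 ms for the second run, doubling afterwards, capped at the one-second maximum), not the IOS implementation; the function name prc_wait_times and its parameter names are assumptions made for this example.

```python
def prc_wait_times(max_s, initial_ms, hold_ms, runs):
    """Return the wait in ms applied before each of `runs` back-to-back PRC runs.

    Hypothetical helper: models only the sequence described in the text,
    not the actual IOS scheduler.
    """
    max_ms = max_s * 1000          # first argument is given in seconds
    waits = []
    hold = hold_ms
    for run in range(runs):
        if run == 0:
            # first run: initial wait
            waits.append(min(initial_ms, max_ms))
        else:
            # later runs: hold-down, doubled after each use, capped at max
            waits.append(min(hold, max_ms))
            hold *= 2
    return waits


# prc-interval 1 100 250 from the configuration example
print(prc_wait_times(1, 100, 250, 5))
```

With the values from the configuration example, the waits grow from 100 ms to 250 ms and 500 ms and then stay at the one-second maximum.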
It is recommended to set the three IOS parameters to 5, 200 and 200 for compatibility with the JUNOS default behaviour.

10.3.2.1 Performance and CPU Usage

Partial SPF runs are cheap from the calculation point of view. A router has to scan through all the routers in its link-state database, extract the prefix information, add the prefix cost to the distance to the originating router, and sort the prefixes to find out which is closest. This exhibits linear behaviour, meaning the CPU processing time is directly proportional to the number of routes in the network. Mathematically speaking, this would be O(R), with R being the number of prefixes of an address family. In practical implementations, the cost of the partial SPF run is close to zero. Typically, the partial run takes less than 10 ms of execution time, even if R is unreasonably high (like 10,000 routes). So partial runs are even less of an issue than full SPF runs.

[Figure 10.14: panels "SPF Result list" (destination, via, cost), "Extracted IPv4 Prefix list" (destination, origin, cost), "Sorted IPv4 prefix list" (destination, cost) and "Partial SPF calculation". Recoverable values: costs from Pennsauken are New York 26000, Washington 48000, Frankfurt 298000, London 315000, Paris 385000; router loopbacks are Pennsauken 192.168.0.17, New York 192.168.0.19, London 192.168.0.12, Washington 192.168.0.21, Frankfurt 192.168.0.8, Paris 192.168.0.22. The per-prefix table rows are not recoverable from the extracted text.]
FIGURE 10.14. A partial route calculation (PRC) is basically a simple, computationally cheap sort operation

10.3.3 Incremental SPF Run

The incremental SPF (iSPF) run is an optimized version of the full SPF run. It maintains additional data structures, the so-called Neighbor and Parent lists, which are built during previous full SPF calculations. The paths that have not been used so far are of special interest. Consider Figure 10.15, which shows the SPF tree from the SPF calculation example. Note that the link between London and Frankfurt is not on the shortest path tree from Pennsauken's perspective. If the Pennsauken router receives a new LSP reporting that this particular link is down, then Pennsauken does not need to schedule a full SPF run. The reason is that the router doing the SPF calculation did not use the link when it was up, so it does not have to consider it when it is down.

[Figure 10.15: sample topology (Area 49.0001, Level 2-only) with the routers Pennsauken, New York, Washington, Frankfurt, London and Paris and their link types and metrics.]
FIGURE 10.15. Incremental SPF does not need to re-compute a SPF calculation if a link is not on the shortest path tree
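The local decision described above can be sketched as a lookup in the shortest path tree kept from the previous full run. This is a simplified illustration, not a complete iSPF implementation: the parent-map representation and the function names are assumptions, and the parent assignments below are inferred for illustration only. The one fact taken from the text is that the London to Frankfurt link is not on Pennsauken's tree.

```python
def link_on_spt(parent, a, b):
    """True if the link a-b is an edge of the shortest path tree.

    `parent` maps each router to its parent on the tree computed by the
    last full SPF run (a stand-in for the Parent list mentioned in the text).
    """
    return parent.get(a) == b or parent.get(b) == a


def needs_full_spf_on_link_down(parent, a, b):
    """A link-down event only matters locally if the link was a tree edge.

    Covers the link-down case only; a new or improved link can create
    shorter paths and is not handled by this check.
    """
    return link_on_spt(parent, a, b)


# Hypothetical tree rooted at Pennsauken (assumed for this sketch).
parent = {
    "New York": "Pennsauken",
    "Washington": "New York",
    "Frankfurt": "Washington",
    "London": "Pennsauken",
    "Paris": "Frankfurt",
}

print(needs_full_spf_on_link_down(parent, "London", "Frankfurt"))
print(needs_full_spf_on_link_down(parent, "Washington", "Frankfurt"))
```

The same check run on Frankfurt, with Frankfurt's own tree, may give a different answer for the same link.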
Keep in mind that the decision whether to do a full SPF or an incremental SPF run is purely local and applies only to the local router. For other routers in the network, for example Frankfurt, the link between London and Frankfurt may be meaningful, and therefore on Frankfurt's shortest path tree. The iSPF advantage on the Pennsauken router is meaningless to the Frankfurt router. The incremental SPF run spares the full SPF run on some of the routers in a given area, but not on all of them. Which routers benefit from incremental SPF is heavily dependent on topology.

Another optimization of the incremental SPF run is to track network dependencies. Consider Figure 10.16, which shows a new router (Munich) attached as a leaf to the sample topology. The incremental SPF algorithm figures out that Munich is a leaf node and dependent on the Frankfurt router. That knowledge is used in the SPF calculation. Recall that once the immediate successors on the PATH list are explored, the algorithm knows that Munich is (because of its edge position) an uninteresting node for path searches and hence does not need to be explored.

[Figure 10.16: the sample topology (Area 49.0001, Level 2-only) extended with the router Munich attached as a leaf.]
FIGURE 10.16. Leaf routers also do not need to re-run SPF on every event that would trigger a full SPF run

Two scenarios where the iSPF algorithm may be applicable have been highlighted. It is the authors' opinion that in the first scenario (Figure 10.15) the performance improvement is next to nothing. This is due to the fact that, in a distributed environment, convergence is bound to the worst-case performing router.
It has been shown that not all routers take equal advantage of the optimization, and some routers in the topology need a full SPF run anyway. The second example (Figure 10.16) is far more interesting as it dramatically reduces the number of nodes that need to be explored. Also, the majority of the routers in the network take advantage of this, so there is a real SPF performance improvement.

10.3.3.1 Performance and CPU Usage

Only a few, but profound, things are known about theoretical models of the incremental SPF calculation. This is because there are lots of caveats and "it depends" in the underlying algorithm. Incremental SPF only makes sense if the underlying topology is sparsely meshed and has many edge nodes. Identification and path tracking turned out to have one of the highest overheads in the full SPF run. Stefano Previdi, a Development Engineer at Cisco Systems who maintains their IS-IS routing protocol, claims, based on early field trials, that the average saving is 80 per cent. The first practical examination was conducted by Cengiz Alaettinoglu and Stephen Casner of Packet Design, who monitored the QWEST backbone in the US and analyzed full and incremental SPF runtimes. The results are illustrated in Figure 10.17. It will be shown shortly that this is the misguided reason that people are afraid of frequent SPF runs. It is the post-processing of route resolving and prefix insertion, and not the SPF calculation itself, which makes the control planes of the core routers in the Internet busy.

[Figure 10.17: plot with a logarithmic vertical axis (1 to 10000) and the horizontal axis "Percentage of SPF runs" (0 to 100); series "Dijkstra SPF" and "Incremental SPF"; annotated averages of 1069 usec and 13 usec.]
FIGURE 10.17. Incremental SPF performs by a factor of 80 better than the full (Dijkstra) SPF based on the QWEST topology

The result of the SPF calculation is fed into the route resolution process.
The route resolver checks to see if routes from other routing protocols have been affected by the result of the SPF calculation.

10.4 Route Resolution

Pure reachability protocols like BGP rely on a working IGP like IS-IS to map the reachability information, such as customer and Internet routes, to a topology in order to properly calculate the path cost. After every SPF recalculation, the route resolver needs to track dependent routes and update their forwarding next-hops accordingly. Finally, the changed prefixes are downloaded to the line cards and ASICs. In the past there has been little attention to the nature and performance implications of tracking the dependent routes. However, in an Internet environment with full routing tables, finding out which routes are dependent and which are not turns out to be one of the most dominant factors in the total route-recalculation period.

10.4.1 BGP Recursion and Route Dependency

Routing protocols like BGP are somewhat agnostic to the underlying topology and need an IGP that provides two services:

1. Connectivity between the internal loopback IP addresses of all the routers in an AS so that the BGP speakers can bootstrap their iBGP mesh
2. Topology awareness to calculate the IGP distance to a BGP speaker

Internal BGP neighbours are typically not directly connected, so a router cannot simply take the address of the routing update sender as the next-hop, as other distance vector protocols (RIP and EIGRP) would do. Even if the neighbour is directly connected, the router still cannot use that address, because it does not know whether the neighbour is a BGP Route Reflector or not. The good news is that there is information contained in the BGP message that points to the IP address where the route originated. This field is called the next-hop and is a mandatory BGP attribute that points to the correct forwarding router. In the tcpdump output below, a BGP Update message containing a next-hop attribute is shown.
Tcpdump Output
The BGP Next-hop attribute carries an IP address that the IGP needs to resolve.

08:28:27.945234 IP 192.168.0.19.179 > 192.168.0.21.28161: BGP, length: 77
    Update Message (2), length: 77
      Origin (1), length: 1, Flags [T]: IGP
      AS Path (2), length: 14, Flags [T]: 3320 4711 12788 24896
      Next-hop (3), length: 4, Flags [T]: 192.168.0.8
      Local Preference (5), length: 4, Flags [T]: 100
      Community (8), length: 12, Flags [OT]: 5511:500, 5511:516, 5511:999
      Updated routes:
        81.21.0.0/20

After receiving the BGP update the router needs to look up 192.168.0.8 in the SPF result database and find the local forwarding next-hop. The BGP route 81.21/20 is now dependent on the IS-IS route pointing to 192.168.0.8. Whenever the IS-IS topology is recalculated, the router needs to check all dependent routes and find out if there is a better way to reach the BGP speaker. A given route may arrive at a BGP router via many diverse paths. Certain rules in the BGP route selection process depend on the IGP calculated route.

10.4.2 BGP Route Selection

BGP performs tie-breaking to find the best path according to the following list:

1. Is the BGP next-hop reachable?
2. Prefer the highest Local Preference value
3. Prefer the shortest AS Path length
4. Prefer the lowest Origin value
5. Prefer the lowest MED value
6. Prefer routes learned via EBGP over routes learned via iBGP
7. Prefer routes with the lowest IGP metric
8. Prefer routes from the peer with the lowest RID
9. Prefer routes from the peer with the lowest peer ID

At the very top of the tie-breaking list, BGP is heavily dependent on IS-IS. BGP needs to validate its BGP next-hop and check if it is reachable before further comparing the route. The BGP next-hop is a mandatory BGP attribute that points to the correct forwarding router. In Rule #7, the BGP route again is dependent on IS-IS. This time the lower IGP metric provides BGP with some insight on how close a BGP speaker is. Consider Figure 10.18 for an example.
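The nine rules can be sketched as a reachability filter followed by one ordered comparison. This is a simplified illustration, not a vendor's decision process: the Route class, its field names and best_path are hypothetical, MED is compared across all routes regardless of neighbour AS, and the MED, iBGP and router/peer ID values are assumed because the source does not give them. The AS paths and next-hops are those listed for 81.21.0.0/20 in Figure 10.18, and the IGP costs are those in the SPF result list of Figure 10.14.

```python
from dataclasses import dataclass

ORIGIN_RANK = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}


@dataclass
class Route:
    next_hop: str
    local_pref: int
    as_path: tuple
    origin: str
    med: int        # assumed value, not given in the source
    ebgp: bool      # assumed: all three routes are learned via iBGP
    router_id: str  # assumed equal to the next-hop address
    peer_id: str    # assumed equal to the next-hop address


def ip_key(addr):
    """Turn a dotted-quad string into a tuple so addresses compare numerically."""
    return tuple(int(octet) for octet in addr.split("."))


def best_path(routes, igp_cost):
    """Apply rules 1-9; `igp_cost` maps reachable next-hops to their IGP metric."""
    usable = [r for r in routes if r.next_hop in igp_cost]   # rule 1
    if not usable:
        return None

    def key(r):
        return (
            -r.local_pref,                # rule 2: highest Local Preference
            len(r.as_path),               # rule 3: shortest AS Path
            ORIGIN_RANK[r.origin],        # rule 4: lowest Origin
            r.med,                        # rule 5: lowest MED
            0 if r.ebgp else 1,           # rule 6: EBGP over iBGP
            igp_cost[r.next_hop],         # rule 7: lowest IGP metric
            ip_key(r.router_id),          # rule 8: lowest RID
            ip_key(r.peer_id),            # rule 9: lowest peer ID
        )

    return min(usable, key=key)


# IGP costs from Pennsauken to the three next-hops
igp_cost = {"192.168.0.19": 26000, "192.168.0.12": 315000, "192.168.0.8": 298000}

routes = [
    Route("192.168.0.12", 100, (5511, 2874, 12788, 24896), "IGP", 0, False,
          "192.168.0.12", "192.168.0.12"),
    Route("192.168.0.19", 100, (701, 702, 12788, 24896), "IGP", 0, False,
          "192.168.0.19", "192.168.0.19"),
    Route("192.168.0.8", 100, (3320, 8847, 12788, 24896), "IGP", 0, False,
          "192.168.0.8", "192.168.0.8"),
]

best = best_path(routes, igp_cost)
print(best.next_hop)
```

Under these assumptions the three routes tie on rules 2 to 6, so the IGP metric in rule 7 decides and the route with next-hop 192.168.0.19 is selected.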
Router Pennsauken has learned the prefix 81.21/20 from London, New York and Paris. After applying the BGP tie-breaking process, it turns out that the route from New York is best, due to a lower (better) IGP metric.

There are different ways of implementing route recursion inside the router; the most common one is to store backtracking pointers. Whenever a BGP route is resolved through an IS-IS route, the router stores a pointer from the IS-IS route to the dependent BGP routes. If a change is needed to an IS-IS route, the router simply revisits the stored prefixes and checks whether the old IS-IS route is still the best route. The router does that by checking if the BGP next-hop is still on the shortest path. If it is, the router simply stops there and does not attempt to change forwarding state. If it is not, and there has been a path change (which could be a path becoming better or a path getting worse), then the router re-runs the recursion for the prefixes stored in the backtrack list. The router has to re-check to see if there are better paths pointing to the BGP next-hop. In the worst case, this means that 100 K prefixes need to re-run through the entire BGP tie-breaking process, which can be quite expensive in terms of computational cost (CPU load).

10.4.2.1 Performance and CPU Usage

Both JUNOS and IOS do a proper BGP recursion check, but implement it differently. The difference is in the way the BGP code is written and in its performance implications.

In IOS the BGP code is job-based. That means whenever there is a change to a BGP learned prefix, only a flag in the data structure of the prefix is set or cleared. Then there is a job that scans the BGP table for changed entries (called the BGP walker). Why is this information relevant for a book about IS-IS?
It means that even if IS-IS has detected that a link has been broken, and performs all the relevant actions (flooding, scheduling of a full SPF run etc.), it takes, in the worst case, the duration of the BGP walker in IOS (50 seconds) until the Cisco router starts to change prefixes, update forwarding states, and so on. So the style of the BGP implementation dictates the convergence behaviour of the BGP routes. Perhaps this is not the best design choice. In all fairness, the first implementation of BGP in IOS was coded at a time when the Internet consisted of not even 1000 routes. So it is probably not bad design, but a legacy effect.

In contrast, JUNOS routing software is event-driven. That means that whenever a subsystem in the router notices that something has gone wrong, or is up again, that change is propagated throughout the system immediately and without any delay. Immediately after the SPF run, JUNOS does BGP recursion.

Both implementations result in a list of prefixes that need to change in the main routing table. After that, the router updates the forwarding state in the forwarding plane. Updating the forwarding plane is the most daunting task of all because it makes both the forwarding and control plane CPUs really busy. The reason this keeps both CPUs busy is the sheer amount of data and the table sizes that have to be pumped through a router's chassis.
[Figure 10.18: the sample topology (Area 49.0001, Level 2-only) annotated with three BGP routes for 81.21.0.0/20, all with Origin IGP and Local preference 100: AS Path 5511 2874 12788 24896 with Next-hop 192.168.0.12; AS Path 701 702 12788 24896 with Next-hop 192.168.0.19; AS Path 3320 8847 12788 24896 with Next-hop 192.168.0.8. The community values 3320:4711 and 5511:500, 5511:516, 5511:999 also appear in the figure.]
FIGURE 10.18. The transit route 81.21/20 via Pennsauken wins the BGP tie-breaking process

Currently a full routing table of all Internet routes consumes about 120–200 MB of memory. A full forwarding table consumes about 2 MB of memory on each line-card in the router. So crunching at least 100 MB of BGP tables and generating N*2 MB sized forwarding tables is the main reason the router is busy. The next section covers legacy and state-of-the-art methods of forwarding state change operations that can make the prefix insertion process scale better.

10.5 Prefix Insertion

In the age when the Internet was a network of only 1000 prefixes, no one had to worry about efficiency in changing forwarding state. Figure 10.19 shows an old-style implementation of a forwarding table structure.

10.5.1 Flat Forwarding Table

There are two tables in the figure. The first table holds all the prefixes of the main routing table. The second table holds all the forwarding next-hops of the router. A forwarding next-hop is a local interface plus Layer-2 data like encapsulation method, MAC addresses etc. As a result of the route calculation, the entries in the prefix list all point to forwarding next-hops. To put the two tables into perspective: based on today's Internet routing tables, 100,000s of prefixes point to only 10s of forwarding next-hops.
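The flat structure can be sketched as a single mapping from prefix to forwarding next-hop. This is a toy model under stated assumptions, not a router's forwarding table: the function names, the placeholder prefix names, the even spread of prefixes and the interfaces other than so-7/3/0.0 (which appears in Figure 10.19) are all hypothetical.

```python
def build_flat_fib(prefix_count, next_hops):
    """Build a flat table in which every prefix points directly at a next-hop."""
    fib = {}
    for i in range(prefix_count):
        # placeholder prefix names; spread evenly over the next-hops
        fib[f"prefix-{i}"] = next_hops[i % len(next_hops)]
    return fib


def repoint(fib, failed, replacement):
    """Move every prefix that uses `failed` to `replacement`.

    Returns the number of entries rewritten. In a flat table this is one
    write per affected prefix, because no shared indirection exists.
    """
    changed = 0
    for prefix, next_hop in fib.items():
        if next_hop == failed:
            fib[prefix] = replacement
            changed += 1
    return changed


next_hops = ["so-7/3/0.0", "so-7/3/1.0", "so-7/3/2.0"]
fib = build_flat_fib(120_000, next_hops)

changed = repoint(fib, failed="so-7/3/0.0", replacement="so-7/3/1.0")
print(changed, "of", len(fib), "entries rewritten")
```

A single next-hop failure touches every prefix that pointed at it, so the update work grows with the number of prefixes rather than with the number of next-hops.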
It is exactly that many-to-few mapping that causes problems. Consider the sample topology shown in Figure 10.20, where each router is a public BGP speaker and injects BGP routes into the network. Each of the six routers carries a full BGP load, and after the BGP tie-breaking process the routers figure out which are the best routes. The figures in the boxes indicate how many active routes each router carries. For simplicity, look at the Frankfurt routing and forwarding table only. The forwarding table looks very simple: all 120,000 prefixes map to one of three possible next-hops, which are the SONET/SDH links to London, Paris or Pennsauken.

[Figure 10.19: a forwarding engine in which 100000s of prefixes (for example 81.21.0.0/20) point to 10s of forwarding next-hops (for example so-7/3/0.0).]
FIGURE 10.19. In a flat forwarding table a prefix points directly to a forwarding next-hop

Now, assume the link between Washington and Frankfurt breaks. Both Washington and Frankfurt will quickly detect that one of their SONET/SDH interfaces is down. Next, both routers will originate a new LSP declaring the adjacency down. Because of the default values of the SPF hold-down timers in the network, the SPF run will be scheduled after 100 ms. As the number of nodes and links is low, the results will be available in less than one millisecond.

[Figure 10.20: the sample topology (Area 49.0001, Level 2-only) with the routers Pennsauken, New York, Washington, Frankfurt, London and Paris; the boxes show 13K, 40K, 18K, 17K, 22K and 10K active BGP routes. The assignment of the counts to the routers is not recoverable from the extracted text.]
FIGURE 10.20. Each router in the sample topology is a BGP router and carries several thousand active paths

Now the scary part begins: the recursion and change of forwarding state in the forwarding plane. The routing tables are traversed in 1–2 seconds and the control plane realizes that it has to change 40,000 prefixes. The route processor computes new forwarding tables and loads them down to the line-cards. Because of the fact that the router has to update
