Campus High Availability
JRES 2009
Jean-Marc Barozet, Consulting System Engineer, jbarozet@cisco.com
February 2009
JMB © 2008 Cisco Systems, Inc. All rights reserved. Cisco Confidential

Cisco Switching Portfolio
[Figure: product positioning chart. Vertical axis: Features, Scalability, Longevity. Horizontal axis: Number of Employees/Density (Small, Medium-sized, Large).]
• Distribution/Core: Nexus 7000, Catalyst 6500, Catalyst 4500 E-Series
• Datacenter Access: Catalyst 6500, Nexus 5000, Catalyst 4900 Series, Blade Switches
• Wiring Closet: Catalyst 6500, Catalyst 3560, Catalyst 3560-E, Catalyst 29xx Lite, Catalyst 4500 E-Series, Catalyst 29xx, Catalyst 3750, Catalyst 3750-E

Next Generation Campus Design: Unified Communications Evolution
• IP Telephony is now a mainstream technology
• Ongoing evolution to the full spectrum of Video and Collaboration technologies
• High Definition Executive Communication Applications require stringent Service-Level Agreement (SLA)
  – Reliable Service
  – Highly Available Infrastructure
  – Application Service Management - QoS

Next Generation Campus Design: Unified Communications Evolution
• Availability Requirements for UC are more than just five 9's
• Also need to consider the subjective impact to real time communications
[Figure: impact by Seconds of Data Loss, in the order shown on the slide: Minimal Impact to Video, none to Voice; Minimal Impact to Voice; User Hangs Up; Phone Resets*]
* Phone to reset time depends on the signaling protocol, SCCP or SIP, and call state; active, ringing, …

Cisco's Campus Architecture: Hierarchical, Modular and Resilient Building Blocks
[Figure: campus topology; layer labels: Access, Distribution, Core, Distribution]
• Offers hierarchy for scalability
• Modular building blocks: easy to grow, understand, and troubleshoot
• Predictable traffic patterns under normal and failure conditions
• Small fault domains to isolate problems
• Promotes load balancing and fast failover
• Can be applied to all
campus designs: Multi-Layer L2/L3 and Routed Access designs
[Figure callouts: Redundant Supervisor; Layer 2 or Layer 3; Redundant L3 Links; Layer 3 Equal Cost Links; Redundant Switches. Blocks: Access, WAN, Data Center, Internet]

AGENDA
• Systems Level Resiliency
• Network Level Resiliency
  – Routing
• Campus Core and Foundation Services
• Emerging Campus Design
  – Routed Access
  – Virtual Switch Campus Design
[Figure labels: Data Center, Services Block, Distribution Blocks]

System Level Resiliency: Comprehensive Physical Redundancy
• Nexus 7000, Catalyst 6500 and 4500 highly redundant Modular systems
  – Redundant hot swappable Supervisors
  – Redundant hot swappable Power Supplies
  – N+1 redundant fans with hot swappable fan trays
  – Hot swappable line cards
  – Passive data backplane
  – Redundant system clock modules
• Catalyst 3750/3750E StackwisePlus* technology
  – 1:N Master redundancy
  – Hot swappable stack members
  – Hot swappable Power Supplies*

Graceful Restart: Non-Stop Forwarding/Stateful Switch-Over
[Figure: No Route Flaps During Recovery; an NSF-Capable router between two NSF-Aware neighbors]
• NSF/SSO is a supervisor redundancy mechanism for intra-chassis supervisor failover
• SSO synchronizes layer 2 protocol state, hardware L2/L3 tables (MAC, FIB, adjacency table), ACL and QoS tables
  – SSO synchronizes state for: trunks, interfaces, EtherChannels, port security, SPAN/RSPAN, STP, UDLD, VTP
• Non-Stop Forwarding (NSF) provides the capability for the routing protocols to gracefully restart after an SSO fail-over
  – The newly active redundant supervisor continues forwarding traffic using the synchronized HW forwarding tables
  – The NSF capable Routing Protocol requests a graceful neighbor restart. Routing neighbors reform with no loss of traffic
• Aggressive RP timers may not work in NSF/SSO environment

Nexus 7000 Service Restart
• Stateful Restart with PSS
  – Checkpoints states to PSS
  – Recover states from PSS upon restart
• Stateless Restart with GR
  – Fresh start without traces from former instantiation
  – Graceful Restart (NSF) for L3 Protocols
• Supervisor Switchover
• Non-disruptive In Service Software Upgrade

Nexus 7000: Stateful Fault Recovery Using PSS
• Multiple Service Instances
  – Layer2 Services
  – Layer3 Services
• Independent memory-protected re-startable processes
• Services checkpoint their runtime state to the PSS for recovery in the event of a failure
• Neighbors never see event occur
[Figure: service processes (HA Manager, LACP, OSPF, BGP, STP, IPv6, TCP/UDP, PIM, HSRP, etc.) above the PSS and Kernel, with a "Restart process!" callout; N7K Data Plane with data plane streams: the traffic keeps being forwarded by the Linecard Forwarding Engine]
If a fault occurs in a process:
  – Sysmgr determines best recovery action (restart process, switchover to redundant supervisor)
  – Process restarts with no impact on data plane
  – State checkpointing (PSS) allows instant, stateful process recovery

Routed Access: Layer 3 Distribution with Layer 3 Access
[Figure: EIGRP/OSPF runs between all layers, Layer 3 above the Layer 2 edge; label: GLBP Model. Left access switch: VLAN 20 Data (10.1.20.0), VLAN 120 Voice (10.1.120.0). Right access switch: VLAN 40 Data (10.1.40.0), VLAN 140 Voice (10.1.140.0)]
• Move the Layer 2/3 demarcation to the network edge
• Upstream convergence times triggered by hardware detection of light lost from upstream neighbor
• Beneficial for the right environment

Routed Access Design Considerations: Design Motivations
• Simplified Control Plane
  – No STP feature placement (root bridge, loopguard, …)
  – No default gateway redundancy setup/tuning
  – No matching of STP/HSRP priority
  – No L2/L3 multicast topology inconsistencies
• Ease of Troubleshooting
(leverage well-known toolset)
  – Show ip route
  – Traceroute
  – Ping and extended pings
  – Extensive protocol debugs
  – Consistent troubleshooting: access, dist, core
• Failure differences
  – Routed topologies fail closed, i.e. neighbor loss
  – Layer 2 topologies fail open, i.e. broadcast and unknowns flooded

Routed Access: Simplified Network Recovery
• Routed Access network recovery is dependent on L3 re-route
• Time to restore downstream flows is based on a full routing protocol re-route
  – Time to detect link failure
  – Time to determine new route
  – Process the update of the SW RIB & FIB
  – Update the HW FIB
• Time to restore upstream traffic flows is based on ECMP re-route
  – Time to detect link failure
  – Process the removal of the lost routes from the SW FIB
  – Update the HW FIB
[Figure labels: Upstream: ECMP Recovery; Downstream: Routing Protocol Recovery]

Routed Access Design Considerations: Design Requirements
• VLANs are localized to a single wiring closet switch
• IP addressing: do you have an address allocation plan to support a routed access design?
• Platform requirements:
  – Requires a Cisco Catalyst 3560 or above
  – Cisco Catalyst IOS Feature Set considerations
    • IP Base for EIGRP-Stub and PIM*
    • IP Services for OSPF and PIM

Routed Access Design: Advantages, Yes in the Right Environment
[Figure: Both L2 and L3 Can Provide Sub-Second Convergence]
• Ease of implementation, less to get right
  – No matching of STP/FHRP priority
  – No L2/L3 multicast topology inconsistencies
  – No STP configuration in Dist
• Single control plane and well-known tool set
  – traceroute, show ip route, show ip eigrp neighbor, etc.
• Most Cisco Catalysts support L3 switching today
• EIGRP converges in