Minimizing the Risks with Enterprise Multi-Site Data Center L2 Connectivity
BRKDCT-2840
© 2009 Cisco Systems, Inc. All rights reserved. Cisco Public

Goals of this Session
  - Present alternatives for interconnecting multiple data center locations
  - Present production-tested methods for minimizing the risks associated with meeting these connectivity requirements

Session Agenda
  - Data Center Interconnection - Common Scenarios and Terms
  - Dark Fiber / DWDM Solutions
  - Label Based Solutions
  - IP Based Solutions
  - Encryption
  - Recommended Designs for Optimizing Traffic Flows
  - EoMPLS and VPLS Stability Testing (Reference Material)
  - Q&A

Section topics: Layer 2 / Clusters, Use Cases, Risks, Solution Types

Layer 2 / Clusters
  - Intra-cluster node communications
      Flow types
      Traditionally Layer 2 communications on private and/or public interfaces
      IPv4 and/or IPv6 possible, depending on the clustering package used
      Ability to prioritize interfaces
  - Client access to cluster
      DNS/Active Directory resolution by clients
      Shared virtual IP for service discovery
      Caching issues can inhibit Layer 3 clustering
      Client application can have logic to re-establish connections
  - Quorum considerations to avoid split-brain
      Additional cluster nodes at alternate sites to achieve a majority node set (MNS)
      Possible extensions such as ping-groups (Linux-HA) to provide a quorum mechanism without a member node
      Shoot The Other Node In The Head (STONITH) topologies to resolve conflicts
  - Mechanisms to facilitate service restoration in another location
      VMware Site Recovery Manager (SRM) is one example
      Microsoft Server 2008 Layer 3 Clustering is another
      Remapping of service to a new IP/DNS entry

Some Layer 2 Use Cases
  - Extending operating system / file system clusters
  - Extending database clusters
  - Virtual machine mobility
  - Physical machine mobility
  - Legacy devices/apps with embedded IP addressing
  - Time to deployment and operational reasons
  - Extend DC to solve power/heat/space limitations

Layer 2 Risks
  - Flooding of packets between data centers
  - Rapid Spanning Tree (RSTP) is not easily scalable, and risk grows as the diameter grows
  - RSTP has no domain isolation: an issue in a single DC can propagate
  - First hop resolution and inbound service selection can cause verbose inter-data center traffic
  - In general, Cisco recommends L3 routing for geographically diverse locations
  - This session focuses on making limited L2 connectivity as stable as possible

Layer 2 Solution Types
  - Light customer-owned fiber to build a separate L2 network
      No STP isolation between sites
  - Purchase multiple wavelengths from SP
      Cost rises, still nothing to offer STP isolation
  - Redesign data center RSTP domain using Multiple Spanning Tree (MST) regions
      STP domain concept
      Fundamental change requiring large time investment
      Operational differences and MST database management
  - Implement an L2 solution to virtualize transport over L3
      Virtual Switching System
      L2TPv3 for point to point (possible STP isolation issues)
      EoMPLS for point to point (possible STP isolation issues)
      Multipoint bridging using Virtual Private LAN Services (VPLS)

Dark Fiber / DWDM Solutions

DCI N-PE (Primary & Backup) EoMPLS/VPLS
EEM Active/Backup Pseudowire

  event manager applet VPLS_B_loopback-is-up
   event track 10 state up
   action 1.0 cli command "enable"
   action 2.0 cli command "conf t"
   action 4.0 cli command "int lo2"
   action 4.1 cli command "shutdown"
   action 9.0 syslog msg "Backup N-PE is Active, Force Primary in Standby"
  event manager applet VPLS_B_loopback-is-down
   event track 10 state down
   action 1.0 cli command "enable"
   action 2.0 cli command "conf t"
   action 4.0 cli command "int lo2"
   action 4.1 cli command "no shutdown"
   action 5.0 cli command "end"
   action 5.1 cli command "clear mac-address-table dynamic int g1/1"
   action 9.0 syslog msg "Backup N-PE has become Standby, Primary runs Active"
  !
  !
  track timer ip route
  track 10 ip route 80.0.1.2 255.255.255.255 reachability
  !

  - IOS features such as EEM and tracking are used to shut down a pseudowire on failover recovery of a DCI N-PE
  - IOS features such as EEM and tracking are used to activate a pseudowire when the active DCI N-PE fails
  - Tracking is used to check the availability of the redundant DCI N-PE

Achieving STP Isolation, Redundancy (HA), Multi-Path and Loop Avoidance in the DCI for Catalyst 6500
  - EEM + VSS + MEC for EoMPLS/VPLS or EoMPLS/VPLS over GRE
      Faster convergence
      VSS-capable hardware required
  - EEM + RSTP for EoMPLS/VPLS or EoMPLS/VPLS over GRE
      Slower convergence than VSS
      No special hardware requirement

EoMPLS/VPLS and EoMPLS/VPLS over GRE: STP Isolation
  Diagram labels: L3, DCI, L2, QinQ, Aggregation
  - VPLS is used in the core to isolate Spanning Tree domains

EoMPLS/VPLS and EoMPLS/VPLS over GRE: HA
  Diagram labels: L3, DCI, L2, QinQ, Aggregation
  - Active / back-up VFI PW
  - EEM and route tracking are used to monitor DCI router availability
  - EEM-based route tracking is used to activate the back-up PW after a DCI router failure is detected
  - 3001 for odd VLANs (QinQ)
  - 3002 for even VLANs (QinQ)
  - TCN link is used to flush out MAC addresses on any topology change, as QinQ disables STP
  - STP cost labels on the aggregation links
      STP cost for odd VLANs: 1000, forwarding
      STP cost for even VLANs: 1000, forwarding
      STP cost for even VLANs: 1500, blocking
      STP cost for odd VLANs: 1500, blocking

EoMPLS/VPLS and EoMPLS/VPLS over GRE: Multi-Path
  Diagram labels: L3, DCI, L2, Aggregation
  - Active / back-up VFI PW
  - EEM and route tracking are used to monitor DCI router availability
  - Local RSTP and spanning-tree cost manipulation are used to control link utilization and convergence
  - EEM-based route tracking is used to activate the back-up PW after a DCI router failure is detected
  - STP cost labels on the aggregation links
      STP cost for odd VLANs: 1000, forwarding
      STP cost for even VLANs: 1000, forwarding
      STP cost for even VLANs: 1500, blocking
      STP cost for odd VLANs: 1500, blocking

EoMPLS/VPLS and EoMPLS/VPLS over GRE: Loop Avoidance
  Diagram labels: L3, DCI, L2, Aggregation
  - Split horizon is used in the core to avoid loops for VPLS
  - Active / back-up VFI PW
  - EEM and route tracking are used to monitor DCI router availability
  - Local RSTP and spanning-tree cost manipulation are used to control forwarding and blocking ports per VLAN
  - EEM-based route tracking is used to activate the back-up PW after a DCI router failure is detected
  - STP cost labels on the aggregation links
      STP cost for odd VLANs: 1000, forwarding
      STP cost for even VLANs: 1000, forwarding
      STP cost for even VLANs: 1500, blocking
      STP cost for odd VLANs: 1500, blocking

Aggregation Switch (non-VSS) Link Toward DCI N-PE

  interface GigabitEthernet1/13
   description "Link Connected towards DCI"
   switchport
   switchport trunk encapsulation dot1q
   ! Link connecting to the N-PE is a standard dot1q port participating in local STP
   switchport mode trunk
   switchport nonegotiate
   ! DTP negotiation is useless and wastes time
   spanning-tree vlan Odd cost 1000
   spanning-tree vlan Even cost 1500
  !
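The `Odd` and `Even` keywords in the interface configuration above appear to be slide shorthand for explicit VLAN lists. The following is a minimal Python sketch, not part of the original session, that expands such shorthand into per-parity `spanning-tree vlan ... cost` lines. The VLAN range 10-15 and the function name `stp_cost_lines` are hypothetical, and the sketch assumes the platform accepts a comma-separated VLAN list in that command.

```python
def stp_cost_lines(vlans, odd_cost, even_cost):
    """Build one 'spanning-tree vlan <list> cost <n>' line per VLAN parity."""
    ordered = sorted(vlans)
    odd = [v for v in ordered if v % 2 == 1]
    even = [v for v in ordered if v % 2 == 0]
    lines = []
    for group, cost in ((odd, odd_cost), (even, even_cost)):
        if group:  # skip a parity that has no VLANs
            vlan_list = ",".join(str(v) for v in group)
            lines.append(f"spanning-tree vlan {vlan_list} cost {cost}")
    return lines


# Hypothetical VLAN range; the costs 1000 and 1500 are the ones on the slide.
for line in stp_cost_lines(range(10, 16), odd_cost=1000, even_cost=1500):
    print(line)
```

On the second uplink the two costs would presumably be swapped, so that each link forwards one parity and blocks the other, matching the 1000/1500 labels on the slides.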
  - On the link toward the DCI N-PE, spanning-tree cost is used to control which link forwards the odd VLANs and which forwards the even VLANs
  - To get fast VPLS convergence, RSTP at the edge is recommended

CPOC Tested Failover Numbers

EoMPLS and VPLS Stability Testing
  - Testing of link outage scenarios
      Pulling fiber connections
      Administratively shutting down interfaces
      Pulling active cards and supervisors
  - Testing of failure and fail-back timing
  - Tests grouped by location in the network
      Metro Core failures
      Aggregation failures
      Layer 3 Core failures

Metro Core Failover/Failback Tests

  Test                 Link Down   Link Up
  Top Rail Pull        105 ms      1 ms
  Top Rail Admin       133 ms      1 ms
  Vertical Rail Pull   0
  2x10GE Card Fail     1.2 s       5.4 s
  2x10GE Card AS       718 ms      5.7 s
  Node Power Off       379 ms      6.4 s
  MST Link Pull        0
  Primary Sup Pull     516 ms

Embedded Event Manager
  - Scripting based on events
  - Script initiator is the tracking of node reachability
  - Bring up interfaces in a known order
  - Allow traffic flows based on a time delay

EEM Policy to Handle VPLS Down
  - In case VPLS redundancy is not possible, an EEM policy can be used to prevent black-holing when the VPLS path goes down
  - Because the LAN modules come up before the WAN modules, EEM and EOT are used for control:

  track interface GigabitEthernet3/0/0 line-protocol
  !
  track interface GigabitEthernet3/0/1 line-protocol
  !
  track 20 ip route 10.1.133.226 255.255.255.255 reachability
  !
  track 21 ip route 10.1.133.222 255.255.255.255 reachability
  !
  track 25 list boolean and
   object 20
   object 21
   delay up 90
  !
  track 40 list boolean or
   object
   object
   delay up 90
  !
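To make the behavior of the tracked lists above easier to reason about, here is a small Python model of a boolean track list combined with `delay up`. It is a simplified sketch under stated assumptions, not IOS behavior verified line by line: the names `track_list` and `DelayedUp` are hypothetical, time is passed in explicitly, and only track 25 is exercised because the member objects of track 40 are not legible in the source.

```python
def track_list(op, states):
    """Evaluate a 'track list boolean and|or' over member states (True = up)."""
    if op == "and":
        return all(states)
    if op == "or":
        return any(states)
    raise ValueError(f"unsupported operator: {op}")


class DelayedUp:
    """Report 'up' only after the input has stayed up for delay_s seconds.

    A transition to down takes effect immediately, because the
    configuration sets only 'delay up' and no 'delay down'.
    """

    def __init__(self, delay_s):
        self.delay_s = delay_s
        self.up_since = None  # time at which the input last turned up

    def update(self, raw_up, now):
        if not raw_up:
            self.up_since = None  # any down event cancels the pending timer
            return False
        if self.up_since is None:
            self.up_since = now
        return now - self.up_since >= self.delay_s


# Track 25: both remote N-PE routes (objects 20 and 21) must be reachable.
track25 = DelayedUp(90)
print(track25.update(track_list("and", [True, False]), now=0))    # False
print(track25.update(track_list("and", [True, True]), now=10))    # False, timer starts
print(track25.update(track_list("and", [True, True]), now=100))   # True, up for 90 s
print(track25.update(track_list("and", [False, True]), now=101))  # False, down at once
```

The 90-second hold is what gives the WAN side time to settle before an applet reacts to the list coming up; a single lost route drops the AND list immediately.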
Addresses and interfaces used in this policy:
  - 10.1.133.226 and 10.1.133.222 are the remote N-PEs
  - Gig3/0/0 and Gig3/0/1 are the VPLS uplinks
  - TenGigE4/4 is the link to the local Agg switch

  event manager applet TRACK_ES20_DOWN
   event track 40 state down
   action 1.0 cli command "config t"
   action 2.0 cli command "interface TenGigabitEthernet4/4"
   action 3.0 cli command "shutdown"
   action 4.0 syslog msg "EEM has shutdown the SVI's"
  event manager applet TRACK_ES20_UP
   event track 40 state up
   action 1.0 cli command "config t"
   action 2.0 cli command "interface TenGigabitEthernet4/4"
   action 3.0 cli command "no shutdown"
   action 4.0 syslog msg "EEM has enabled the Ten4/4"
  event manager applet UP_TEN4/4
   event track 25 state up
   action 1.0 cli command "config t"
   action 2.0 cli command "interface TenGigabitEthernet4/4"
   action 3.0 cli command "no shutdown"
   action 4.0 syslog msg "EEM has unshut Ten4/4"
  event manager applet test
   event syslog pattern "Module 4: Passed Online Diagnostics"
   action 1.0 cli command "config t"
   action 2.0 cli command "interface TenGigabitEthernet4/4"
   action 3.0 cli command "shutdown"
   action 4.0 syslog msg "EEM has shutdown Ten4/4"

Alternative to EEM Policy
  - Allows a pseudowire to be established between the co-located N-PEs; this PW transports the MST BPDUs between them
  - This method eliminates both the need for EEM and the need for the L2 links between the co-located N-PEs

  l2 vfi BPDU manual
   vpn id 10
   forward permit l2protocol all
   ! forward the BPDUs to the other PE
   neighbor 10.1.1.1 encapsulation mpls
   ! Loopback of co-located N-PE
  interface Vlan10
   no ip address
   xconnect vfi BPDU

Aggregation Failover/Failback Tests

  Test                 Link Down                   Link Up
  Left Agg Pull        344 ms                      5.7 s
  Left Agg Admin       668 ms                      5.7 s
  Access Link Pull     63 ms (L2), 123 ms (L3)     17 ms (L2), 101 ms (L3)
  Agg Port-Ch Pull     0
  Agg Port-Ch Admin    0

Core Failover/Failback Tests

  Test                            Link Down   Link Up
  Core-Core Shut                  0
  Core-MC Shut                    0
  Core-Core with Core-MC Shut     0
  Core Sup Pull                   734 ms
  Core Reload                     0