DATACENTER SWITCHES IN THE CAMPUS BACKBONE
CONTENTS
Using Fixed Configuration Data Center Switches in the Campus Backbone
• CC-NIE Cyberinfrastructure Challenge (Background)
• Evaluating Backbone Upgrade Options
• Evaluating Upgrade Requirements
• Evaluating What Is Available
• Campus vs Data Center: Features
• CWRU Implementation and Experiences
• Deployment, Topology, Benefits
• Performance Monitoring (SNMP vs Splunk)
• Buffer Monitoring and VXLAN
CC-NIE Cyberinfrastructure Challenge (Background)
• CWRU received a CC-NIE grant in October 2013
– Included a Science DMZ component (100GE to OARnet / Internet2)
– Included network upgrades to extend 10GE to research-centric buildings and labs
• The campus backbone is aged and not ready for 10GE to buildings.
• Current network infrastructure is from circa 2003
– Distribution routers are pairs of Cisco 6509-E / Sup720-Base (L3 HSRP)
– Core to distribution is 10GE (WS-X6704-10GE)
– Distribution to building (access) is 1GE (WS-X6516A-GBIC)
– Buildings are dual-connected to distribution pairs (fairly typical campus design)
• Multiple PIs within the “Bingham” distribution collaborated on CC-NIE.
– The Bingham distribution contains 20 buildings
CC-NIE Cyberinfrastructure Challenge (Background Cont.)
• Solution 1: Status quo in the backbone (install line cards)
– Install a Cisco WS-X6708 and X2-LR optics in each distribution 6509 (list price $149k)
– Provides enough 10GE ports for 8 buildings (16 ports at $9,312.50 list per port).
– No other changes required. One of the benefits of chassis gear.
• Solution 2: Spend that money on something different, possibly even replacing the old equipment altogether.
– Let’s generate a list of requirements and “nice to haves” for this part of the network.
– Let’s look at the market and see what else may meet these requirements at that price point. Seek better per-port value (a rough per-port comparison is sketched after this list).
– Let’s look at feature sets that may provide more options in terms of high-performance networking and ease of operations.
• We went with Option 2.
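For context, a minimal back-of-the-envelope sketch of per-port list cost under the two options, using the $149k / 16-port figure above; the data center ToR per-port range is the rough list-price range quoted later in the evaluation, not a specific quote.

```python
# Rough per-port list-cost comparison, assuming the figures above: $149k list
# for WS-X6708 cards + X2-LR optics yielding 16 ports, versus the ~$2k-$5k per
# 10GE port range quoted later for 48-port data center ToR switches.
option1_total_usd = 149_000
option1_ports = 16                       # 8 x 10GE per chassis, two chassis

print(f"Option 1: ${option1_total_usd / option1_ports:,.2f} per 10GE port "
      f"({option1_ports} ports)")        # -> $9,312.50 per port

tor_per_port_usd = (2_000, 5_000)        # illustrative list range, vendor optics included
tor_ports = 96                           # 48 SFP+ per switch, two switches
print(f"Option 2: ${tor_per_port_usd[0]:,}-${tor_per_port_usd[1]:,} per 10GE port "
      f"({tor_ports} ports)")
```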
Step 1. Take Inventory (What do we really need?)
• Interface count and type
– Need 24 optical 1/10GE ports minimum (12 10GE for a fair comparison to Option 1)
– Having 48 would be better for many reasons.
• L3 requirements (modest table sizes, standard protocols)
– Needs to support OSPF, OSPFv3, IPv6, and an FHRP of some sort.
– Needs to support policy-based routing, standard ACLs, and QoS.
– Other standard campus routing needs like IPv4/IPv6 DHCP relay, RPF, and PIM.
– Needs to support 1,000 IPv4 routes (currently fewer than 500 in these routers)
– Needs to support 1,000 IPv6 routes (currently fewer than 25 in these routers)
• L2 requirements
– Needs to support Spanning Tree Protocol
– Needs to support a 10,000-entry CAM table (currently ~5,600; a quick SNMP check of these counts is sketched below)
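A minimal sketch of how the “current” counts above could be double-checked against one of the existing 6509s over SNMP, assuming the net-snmp command-line tools and an SNMPv2c read community; the hostname and community string are placeholders.

```python
# Quick sanity check of the "current" counts above by polling one of the
# existing 6509s over SNMP. Assumes the net-snmp CLI tools and an SNMPv2c read
# community; hostname and community are placeholders. Note: on Catalyst gear
# the BRIDGE-MIB is per VLAN, so a full CAM count may need community@vlan walks.
import subprocess

HOST = "bingham-dist-old.example.edu"   # hypothetical router name
COMMUNITY = "public"                     # placeholder read community

def snmp_get(oid):
    """Return the value of a single scalar OID."""
    out = subprocess.check_output(
        ["snmpget", "-v2c", "-c", COMMUNITY, "-Oqv", HOST, oid], text=True)
    return out.strip()

def snmp_walk_count(oid):
    """Count rows returned by walking a table column (one line per entry)."""
    out = subprocess.check_output(
        ["snmpwalk", "-v2c", "-c", COMMUNITY, "-Oq", HOST, oid], text=True)
    return len(out.splitlines())

# IP-FORWARD-MIB::ipCidrRouteNumber.0 -- IPv4 route count
print("IPv4 routes:", snmp_get("1.3.6.1.2.1.4.24.3.0"), "(need 1000)")
# BRIDGE-MIB::dot1dTpFdbAddress -- one row per learned MAC (CAM entry)
print("CAM entries:", snmp_walk_count("1.3.6.1.2.1.17.4.3.1.1"), "(need 10000)")
```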
Evaluating Backbone Upgrade Options (What's Available?)
• Traditional campus core and distribution options (overkill and expensive)
(gleaned from vendor web sites and reference designs)
– Cisco Catalyst / N7K, HP 5400/7500, Brocade ICX / MLX, Juniper EX6200/8200
– Most are chassis based; most have huge L2 and L3 table capabilities (~256k+ IPv4).
– Most have power supplies ranging from 2500W to 6000W
– Cost per 10GE port ranges from $4,000 to $7,000+ list with vendor optics.
– Cost per 40GE / 100GE is too much for this exercise or just plain not available.
• Data Center ToR switches (sweet spot for value and functionality?)…
– Cisco Nexus 9372PX, Arista 7050/7280, Dell S4048-ON, HP 5900AF-48XG, etc.
– A lot of 1U fixed 48 SFP+, 4 to 6 QSFP, albeit with smaller L2 and L3 tables.
– Most have efficient power supplies ranging between 500W and 1200W
– Cost per 10GE port is between $2,000 and $5,000 list with vendor optics
– Cost per 40GE / 100GE is still pricey but available with flexible options (breakout cables)
Campus vs Data Center: Features (Differentiators)
• Features are comparable, but not quite the same (see below).
• Data Center switches offer some neat stuff though.
• Traditional campus core and distribution features
– Most offer a virtual chassis system (no FHRP, fewer configs, multi-chassis LAG)
– Most offer full MPLS / VPLS implementations.
– Some offer integrated security / NAC features.
– Some offer services line cards (firewalls / load balancers / wireless controllers)
• Data Center switch features
– Most have some sort of fabric (if you are into that sort of thing) and multi-chassis LAG.
– Most have VRF / VRF Lite
– Most offer “network telemetry” and very low latency forwarding.
– Most have API / OpenFlow integrations and automation tools (Puppet, Chef, XMPP).
Campus vs Data Center: Sanity Check
• Are ToR switches suitable for both IMIX and research flows?
• Data Center pros: more ports, less power, less space, cool features
– We get ~96 10GE-capable ports instead of 16 and an upgrade path for all 20 buildings
– We get at least a 2x40GE EtherChannel between the pair and multi-chassis LAG.
– We get a 40GE or 100GE upgrade path for core links.
– We get features like advanced buffer monitoring, automation, and VXLAN.
– We use far less power, generate less heat, and take up less space.
• Data Center cons: longevity? More risk.
– Shorter life span. No easy (change-less) upgrade path to dense 40GE/100GE.
– No operational experience with most of these devices and OSes.
– Higher risk overall by replacing all L2 and L3 services with new equipment.
– We won’t be able to scale this OSPF area to 256k IPv4 routes… bummer
Data Center Switch Options Abound
• Many other Data Center ToR switches might be a good fit in campus backbones. Some include…
– Dell S4048-ON, Cisco Nexus 9372PX, Brocade ICX-7750, HP 5900AF-48XG, Juniper QFX5100-48S. Choose your favorite vendor; I bet they have something to look at.
• Most are based on merchant silicon. Software and support are key.
• Many campuses have already started using 1U switches like the Cisco 4500-X, Juniper EX4550, etc., as those are cross-marketed for campus and data center. They lack some features of the data center offerings.
• Dense 100GE switches are now on the market or shipping soon.
– Dell Z9100, Z6100
– Arista 7060CX
Let's roll the dice!
• We decided to take a shot.
• If it fails, we can always use the switches in… well… a data center.
• We settled on a really new switch at the time, the Arista 7280SE-64.
• Choosing Arista helped minimize some of the operational risk.
– We had been using Arista in HPC for a while, so engineers were familiar with EOS.
– We also chose the Arista 7500 for HPC / Science DMZ integration.
• The Arista 7280SE-64 specs exceeded our needs (table sizes, port count)
– Based on the Broadcom Arad chipset.
– 48 1/10GE SFP+, 4 40GE QSFP (typically 4 watts per 10GE port)
– 64k IPv4 / 12k IPv6 LPM routes, 128k MACs, 96k ARP / host entries, PIM, VRRP
– Buffer monitoring, VXLAN, a Splunk app for network telemetry (we like Splunk), MLAG, etc.
Data Center Switches in Campus Backbone: Outcomes
• Arista 7280SE-64s are in production today and working really well.
– No VoIP / QoS / multicast issues.
– No packet loss, high CPU, or high latency… that we have seen.
• Five engineering buildings were upgraded to 10GE uplinks. The cost was less than adding line cards and optics to the Catalyst 6509-Es.
• We deployed pairs of Arista 7150S-24s as building aggregators to take care of the other side of the links and provide 10GE ports within the buildings.
• Energy savings add up (nearly $5k/year per pair; the arithmetic is sketched below)
– US average (all sectors) is 10.64 cents/kWh
http://www.eia.gov/electricity/monthly/epm_table_grapher.cfm?t=epmt_5_6_a
– Old equipment costs $5,331.40/yr: ((4 × 1430 W) / 1000) × $0.1064 × 24 × 365
– New equipment costs $354.18/yr: ((4 × 95 W) / 1000) × $0.1064 × 24 × 365
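The same arithmetic, spelled out as a minimal sketch using the wattage and rate figures from the bullets above.

```python
# The annual energy-cost arithmetic from the bullets above, made explicit.
# Uses the slide's figures: 4 x 1430 W (old) vs 4 x 95 W (new) at $0.1064/kWh.
RATE_USD_PER_KWH = 0.1064
HOURS_PER_YEAR = 24 * 365

def annual_cost_usd(units, watts_each):
    """Annual cost of `units` devices drawing `watts_each` continuously."""
    return (units * watts_each / 1000) * RATE_USD_PER_KWH * HOURS_PER_YEAR

old = annual_cost_usd(4, 1430)
new = annual_cost_usd(4, 95)
print(f"Old: ${old:,.2f}/yr  New: ${new:,.2f}/yr  Savings: ${old - new:,.2f}/yr")
# -> roughly $5,331/yr vs $354/yr, saving about $4,977/yr per pair
```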
Data Center Switches in Campus Backbone: Measurement
[Figure: actual perfSONAR throughput test graph]
Traditional SNMP Obscures Traffic Bursts
[Figure: SNMP bandwidth graph. This shows only 750 Mbps. Where are my spikes?]
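The arithmetic behind the flat-looking graph: averaging a counter over a typical 5-minute SNMP poll interval smears a short line-rate test into a sub-gigabit number. The burst duration and background rate below are illustrative, not measured values.

```python
# Why the SNMP graph looks flat: averaging over a typical 5-minute polling
# interval smears a short line-rate burst into a sub-gigabit number.
# Burst length and background rate below are illustrative, not measured.
POLL_INTERVAL_S = 300            # common SNMP counter polling interval
burst_bps = 10e9                 # 10GE perfSONAR test at line rate
burst_s = 20                     # test runs ~20 s inside the interval
background_bps = 100e6           # assumed steady background traffic

avg_bps = (burst_bps * burst_s
           + background_bps * (POLL_INTERVAL_S - burst_s)) / POLL_INTERVAL_S
print(f"Reported average: {avg_bps / 1e6:.0f} Mbps")   # ~760 Mbps, spikes gone
```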
Splunk Network Telemetry App: Bandwidth Chart
Buffer Monitoring (Also with Splunk App)
• Looking at buffer (queue) utilization of Bingham Eth 33 (uplink to core)
• Can you guess when I stopped the 10GE perfSONAR throughput tests?
Buffer Monitoring (No Splunk Required)
• You can see this via the CLI too… (a small sketch of pulling the same data over eAPI follows)
• Might be useful for identifying “microburst” congestion events that could cause packet loss.
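A minimal sketch of pulling that queue-length data off the switch programmatically via Arista eAPI, assuming eAPI and LANZ are enabled and that "show queue-monitor length" is available on the running EOS release; the hostname and credentials are placeholders.

```python
# Minimal sketch of pulling LANZ queue-length data off the switch without
# Splunk, via Arista eAPI (jsonrpclib). Assumes eAPI and LANZ are enabled and
# that "show queue-monitor length" exists on this EOS release; hostname and
# credentials are placeholders.
from jsonrpclib import Server

switch = Server("https://admin:password@bingham-h0-e1.example.edu/command-api")

# Request plain-text output so the sketch doesn't depend on a JSON schema.
result = switch.runCmds(1, ["show queue-monitor length"], "text")
print(result[0]["output"])
```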
Extending your Science DMZ using VXLAN
• No real bandwidth advantage, but it aids in applying consistent security controls and inspection. Make sure the VTEPs have a firewall-free path! (The encapsulation overhead math is sketched below.)
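Why there is no bandwidth win: VXLAN simply adds encapsulation overhead on the wire. A small sketch of the header tax, assuming the standard 50-byte overhead with no outer 802.1Q tag.

```python
# VXLAN adds header overhead rather than bandwidth, which is the "no real
# bandwidth advantage" point above. Standard overhead (no outer 802.1Q tag):
outer_eth, outer_ip, outer_udp, vxlan = 14, 20, 8, 8
overhead = outer_eth + outer_ip + outer_udp + vxlan        # 50 bytes per frame

inner_bytes = 1500
print(f"{overhead} bytes overhead, "
      f"~{overhead / inner_bytes:.1%} of a 1500-byte inner packet")
# Practical note: raise the underlay MTU by at least 50 bytes so encapsulated
# Science DMZ traffic isn't fragmented or dropped along the path.
```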
CWRU Science DMZ Deployment
[Diagram: Science DMZ topology. Elements shown: Internet2; sciencerouter0 (Juniper MX480); DTN1; PerfSonar-dmz; hpc-rhk15-m1-e1 (Arista 7508E) with CWRU HPC (KSL, HPC, Crawford, CASC); inetrouter0 / inetrouter1 (WS-C6509E); Case backbone 129.22.0.0/16; Bingham MLAG pair bingham-h0-e1 / bingham-h0-e2 (Po10, PBR-enabled FW bypass); PerfSonar-bing; 40GE trunks with Science DMZ and private HPC nets to the CC-NIE engineering buildings Glennan, White, Rockefeller, Olin, and Nord; Science DMZ VLAN trunked to building lab systems; link speeds 1GE / 10GE / 40GE / 100GE.]
Summary
• Data Center-class ToR L3 switches can work in campus backbone deployments. Thought must be given to current and mid-term requirements in terms of advanced features.
• The value proposition is compelling in comparison to traditional (or at least marketed as traditional) campus core and distribution options.
• Data Center network equipment is designed with power, heat, and space efficiency in mind. Depending on the size of your backbone, this could make a difference for you.
• Data Center network equipment seems to adopt new networking technology more rapidly than campus-centric offerings, some of which can be helpful to cyberinfrastructure engineers.
• Data Center network equipment has a robust set of API and automation tools that are not as mature in campus or enterprise offerings (we didn’t have time to cover these here).
References
=== Brocade List Pricing ===
http://des.wa.gov/SiteCollectionDocuments/ContractingPurchasing/brocade/price_list_2014-03-28.pdf
=== Cisco List Pricing ===
http://ciscoprice.com/
=== Juniper List Pricing ===
http://www.juniper.net/us/en/partners/mississippi/juniper-pricelist-mississippi.pdf
=== HP List Pricing ===
http://z2z-hpcom-static2-prd-02.external.hp.com/us/en/networking/products/configurator/index.aspx#.VfwJXCBVhBc
http://www.kernelsoftware.com/products/catalog/hewlett-packard.html
=== Dell Campus Networking Reference ===
http://partnerdirect.dell.com/sites/channel/Documents/Dell-Networking-Campus-Switching-and-Mobility-Reference-Architecture.pdf
=== HP Campus Network Design Reference ===
http://www.hp.com/hpinfo/newsroom/press_kits/2011/InteropNY2011/FCRA_Architecture_Guide.pdf
=== Cisco Campus Network Design Reference ===
http://www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Campus/HA_campus_DG/hacampusdg.html
http://www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Campus/campover.html
http://www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Campus/Borderless_Campus_Network_1-0/Borderless_Campus_1-0_Design_Guide.pdf
=== Juniper Campus Network Design Reference ===
http://www.juniper.net/us/en/local/pdf/design-guides/jnpr-horizontal-campus-validated-design.pdf
http://www.juniper.net/techpubs/en_US/release-independent/solutions/information-products/topic-collections/midsize-enterprise-campus-ref-arch.pdf
https://www-935.ibm.com/services/au/gts/pdf/905013.pdf
=== Brocade Campus Network Design Reference ===