Chapter 3. IBM Flex System data center network design basics
3.4 High availability
3.4.1 Highly available topologies
The Enterprise Chassis can be connected to the upstream infrastructure in a number of possible combinations. Some examples of potential L2 designs are included here.
One of the traditional designs for chassis server-based deployments is the looped and blocking design, as shown in Figure 3-3.
Figure 3-3 Topology 1: Typical looped and blocking topology
Topology 1 in Figure 3-3 features each I/O module in the Enterprise Chassis with two direct aggregations to a pair of top-of-rack (ToR) switches. The specific number and speed of the external ports that are used for link aggregation in this and other designs shown in this section depend on the redundancy and bandwidth requirements of the client. This topology is somewhat complicated and is considered dated compared with modern network designs, but it is a tried and true solution nonetheless.
Important: There are many design options available to the network architect, and this section describes a small subset based on some useful L2 technologies. With their large feature set and high port densities, the I/O modules of the Enterprise Chassis can also be used to implement much more advanced designs, including L3 routing within the enclosure. L3 within the chassis is beyond the scope of this document and is not covered here.
Figure 3-3 labels: Chassis (Compute Node with NIC 1 and NIC 2, I/O Module 1), Aggregation, ToR Switch 1, ToR Switch 2, Upstream Network. An X marks a spanning-tree blocked path.
Although this design offers complete network-attached redundancy out of the chassis, the loops in it mean that half of the available bandwidth can potentially be lost to Spanning Tree blocking. It is therefore recommended only if the customer specifically wants this design.
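The bandwidth cost of Spanning Tree blocking can be illustrated with a small calculation. The following Python sketch is only an illustration: the function name, the link counts, and the 10 Gbps link speed are assumptions chosen for the example, not values taken from this document.

```python
def usable_uplink_gbps(links_per_aggregation, link_speed_gbps,
                       aggregations, blocked_aggregations):
    """Return (total, usable) uplink bandwidth in Gbps for one I/O module.

    All inputs are hypothetical example values. A blocked aggregation
    carries no traffic, so its bandwidth is subtracted from the total.
    """
    if not 0 <= blocked_aggregations <= aggregations:
        raise ValueError("blocked_aggregations must be between 0 and aggregations")
    per_aggregation = links_per_aggregation * link_speed_gbps
    total = per_aggregation * aggregations
    usable = per_aggregation * (aggregations - blocked_aggregations)
    return total, usable


# Example: two aggregations of two 10 Gbps links each, with one
# aggregation blocked by Spanning Tree.
total, usable = usable_uplink_gbps(2, 10, 2, 1)
print(total, usable)  # 40 20
```

In this example, one of the two aggregations is blocked, so only half of the installed uplink bandwidth is available for traffic.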
Topology 2 in Figure 3-4 features each switch module in the Enterprise Chassis directly connected to its own ToR switch through aggregated links. This topology is a possible example for when compute nodes use some form of NIC teaming that is not aggregation-related. To ensure that the nodes correctly detect uplink failures from the I/O modules, trunk failover (as described in 3.4.5, “Trunk failover” on page 69) must be enabled and configured on the I/O modules. With failover, if the uplinks go down, the ports to the nodes shut down. NIC teaming or bonding on the node then fails the traffic over to the other NIC in the team. The combination of this architecture, NIC teaming on the node, and trunk failover on the I/O modules provides a highly available environment with no loops and thus no bandwidth wasted on spanning-tree blocked links.
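The interaction between trunk failover and NIC teaming can be sketched as a small model. The following Python code is a simplified illustration and not the I/O module implementation: the class, function, and port names are hypothetical, and the model assumes that failover triggers only when all monitored uplinks are down.

```python
from dataclasses import dataclass


@dataclass
class IOModule:
    """Hypothetical model of one I/O module (names are illustrative)."""
    name: str
    uplinks_up: dict      # uplink port name -> True if link is up
    node_ports_up: dict   # node-facing port name -> True if link is up
    failover_enabled: bool = True

    def apply_trunk_failover(self):
        """Shut node-facing ports when every uplink is down.

        Assumption: the trigger fires only when all uplinks are down,
        and the node-facing ports return when any uplink recovers.
        """
        if not self.failover_enabled:
            return
        any_uplink_up = any(self.uplinks_up.values())
        for port in self.node_ports_up:
            self.node_ports_up[port] = any_uplink_up


def active_nic(team, modules):
    """Return the first NIC in the team whose link to its I/O module is up.

    team is a list of (nic_name, module_name, node_port) tuples in
    preference order, which models a simple active/standby team.
    """
    for nic_name, module_name, node_port in team:
        if modules[module_name].node_ports_up[node_port]:
            return nic_name
    return None


modules = {
    "io1": IOModule("io1", {"ext1": True, "ext2": True}, {"int1": True}),
    "io2": IOModule("io2", {"ext1": True, "ext2": True}, {"int1": True}),
}
team = [("NIC 1", "io1", "int1"), ("NIC 2", "io2", "int1")]

print(active_nic(team, modules))  # NIC 1

# Both uplinks of I/O module 1 fail; failover shuts the node-facing port.
modules["io1"].uplinks_up = {"ext1": False, "ext2": False}
modules["io1"].apply_trunk_failover()
print(active_nic(team, modules))  # NIC 2
```

If `failover_enabled` is set to `False` in this model, the node-facing port stays up after the uplinks fail, so the team keeps using NIC 1 even though that I/O module no longer has a path upstream. This is the situation that trunk failover is meant to prevent.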
Figure 3-4 Topology 2: Non-looped HA design
Topology 3, as shown in Figure 3-5, starts to bring the best of Topologies 1 and 2 together in a robust design that is suitable for use with nodes that run teamed or non-teamed NICs.
Figure 3-5 Topology 3: Non-looped design using multi-chassis aggregation
Important: Because of possible issues with looped designs in general, a good rule of L2 design is to build loop-free if you can still offer nodes highly available access to the upstream infrastructure.
Figure 3-4 labels: Chassis (Compute Node with NIC 1 and NIC 2, I/O Module 1, I/O Module 2), Aggregation, ToR Switch 1, ToR Switch 2, Upstream Network.
Figure 3-5 labels: Chassis (Compute Node with NIC 1 and NIC 2, I/O Module 1, I/O Module 2), Aggregation, Multi-chassis Aggregation, ToR Switch 1, ToR Switch 2, Upstream Network.
Offering a potential improvement in high availability, this design requires that the ToR switches provide a form of multi-chassis aggregation (see “Virtual link aggregations” on page 67), which allows an aggregation to be split between two physical switches. The design requires the ToR switches to appear as a single logical switch to each I/O module in the Enterprise Chassis. At the time of this writing, this functionality is vendor-specific; however, the products of most major vendors, including IBM ToR products, support this type of function. The I/O modules do not need any special aggregation feature to make full use of this design. Only normal static or LACP aggregation support is needed, because the I/O modules see this as a simple point-to-point aggregation to a single upstream device.
To further enhance the design shown in Figure 3-5, enable the uplink failover feature (see 3.4.5, “Trunk failover” on page 69) on the Enterprise Chassis I/O module, which ensures the most robust design possible.
One potential drawback of these first three designs arises when a node in the Enterprise Chassis sends traffic into one I/O module, but the receiving device in the same Enterprise Chassis happens to be hashing to the other I/O module (for example, two VMs, one on each Compute Node, where one VM uses the NIC toward I/O bay 1 and the other uses the NIC toward I/O bay 2). With the first three designs, this communication must be carried to the ToR and back down, which uses extra bandwidth on the uplinks, increases latency, and sends traffic outside the Enterprise Chassis when there is no need.
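The extra hop can be made concrete with a small path model. The following Python sketch is only an illustration under stated assumptions: the function and device names are hypothetical, the ToR switches are collapsed into a single "ToR layer" hop, and the `modules_interconnected` flag assumes a design in which the two I/O modules can pass traffic directly to each other.

```python
def intra_chassis_path(src_module, dst_module, modules_interconnected):
    """Return the devices a frame crosses between two nodes in one chassis.

    src_module and dst_module are the I/O modules used by the sending
    and receiving NICs. modules_interconnected is a hypothetical flag
    for a direct path between the two I/O modules.
    """
    if src_module == dst_module:
        # Both NICs use the same I/O module: traffic is switched locally.
        return [src_module]
    if modules_interconnected:
        return [src_module, dst_module]
    # No direct path: traffic goes up to the ToR layer and back down.
    return [src_module, "ToR layer", dst_module]


print(intra_chassis_path("I/O Module 1", "I/O Module 2", False))
# ['I/O Module 1', 'ToR layer', 'I/O Module 2']
print(intra_chassis_path("I/O Module 1", "I/O Module 1", False))
# ['I/O Module 1']
```

The first call models the case described above, in which two nodes in the same chassis use different I/O modules and the traffic between them consumes uplink bandwidth in both directions.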
As shown in Figure 3-6, Topology 4 takes the design to its natural conclusion of having multi-chassis aggregation on both sides in what is ultimately the most robust and scalable design recommended.
Figure 3-6 Topology 4: Non-looped design using multi-chassis aggregation on both sides
Topology 4 is considered optimal, but not all I/O module configuration options (for example, Virtual Fabric vNIC mode) support the Topology 4 design. In those cases, Topology 3 or 2 is the recommended design.
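The preferences stated in this section can be summarized as a simple decision rule. The following Python sketch is a deliberate simplification with hypothetical function and parameter names. It encodes only the criteria mentioned in this section and is not a substitute for analyzing the requirements of a specific environment.

```python
def recommend_topology(io_config_supports_topology_4,
                       tor_multi_chassis_aggregation,
                       customer_wants_looped=False):
    """Return the topology number (1 to 4) suggested by this section.

    Parameters are hypothetical flags:
    - io_config_supports_topology_4: the I/O module configuration
      supports multi-chassis aggregation on the chassis side.
    - tor_multi_chassis_aggregation: the ToR switches can present an
      aggregation that is split across two physical switches.
    - customer_wants_looped: the customer explicitly asks for the
      looped and blocking design.
    """
    if customer_wants_looped:
        return 1
    if tor_multi_chassis_aggregation and io_config_supports_topology_4:
        return 4
    if tor_multi_chassis_aggregation:
        return 3
    # Non-looped design that relies on NIC teaming and trunk failover.
    return 2


print(recommend_topology(True, True))    # 4
print(recommend_topology(False, True))   # 3
print(recommend_topology(False, False))  # 2
```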
The designs that are reviewed in this section all assume that the L2/L3 boundary for the network is at or above the ToR switches in the diagrams. We touched only on a few of the many possible ways to interconnect the Enterprise Chassis to the network infrastructure. Ultimately, each environment must be analyzed to understand all the requirements to ensure that the best design is selected and deployed.
Figure 3-6 labels: Chassis (Compute Node with NIC 1 and NIC 2, I/O Module 1, I/O Module 2), Multi-chassis Aggregation (vLAG, vPC, mLAG, etc.), ToR Switch 1, ToR Switch 2, Upstream Network.