Cloud computing ushers in an era in which most information technology users do not need to own the system hardware and software infrastructure on which their day-to-day IT applications run. They either pay for their IT infrastructure usage on demand or get it for free, e.g., through subsidies from advertisers. Although the concept of decoupling the use of IT infrastructure from its ownership has only started to gain traction in the enterprise space within the last three years, it has long been common and popular in the consumer space. In this new ecology, it is the cloud service providers that build and own the IT infrastructures on which third-party or their own cloud applications run and deliver services, and that are compensated along the way for the value provided to their respective users.
The name of the game behind most cloud computing business models is economy of scale. By consolidating IT infrastructures within an organization (private cloud) or across multiple organizations (public cloud), both the capital expense (software licensing cost, hardware acquisition cost, etc.) and the operational expense (human system administration and support cost, energy usage cost, etc.) can be cut significantly. In addition, by exploiting statistical multiplexing, a consolidated IT infrastructure can be made more capable, flexible and robust than the sum of the parts from which it is consolidated. Although the IT infrastructure consolidation brought forth by cloud computing has many benefits, it also raises the scalability challenges of IT infrastructure to a new level. One such challenge is the scalability of a cloud data center's network architecture. This paper describes the design, implementation and evaluation of a data center network called Peregrine, which is specifically designed for a container computer built at the Industrial Technology Research Institute (ITRI) in Taiwan.
The ITRI container computer is designed to be a modular building block for constructing a cloud data center computer. In general, a cloud data center computer is composed of multiple container computers that are connected by a data center network, and it is interfaced with the public Internet through one or more IP routers. It is designed as an integrated system whose hardware components, such as servers and switches, are stripped of unnecessary functionality, whose resources are centrally configured, monitored and managed, and which encourages system-wide optimizations to reach better global design tradeoffs. A key design decision of the ITRI container computer is to use only commodity hardware, including compute servers, network switches, and storage servers, and to leave high availability and performance optimization to the systems software. Another key decision is to design a new data center network architecture from the ground up to meet the unique requirements imposed by a cloud data center computer. Before embarking on the design of the network architecture for the ITRI container computer, we carefully reviewed the related research literature, studied possible use cases, and came up with the following requirements:
1. There is only one network, which supports communications among programs, data storage accesses and interactions with the Internet.
2. The network must be buildable from mainstream commodity layer-2 switches for lower cost and better manageability.
3. The network must be able to support up to one million end points, each of which corresponds to a virtual or physical machine.
4. The fail-over latency for any single network link/device failure must be lower than 100 msec.
5. The loads on the network's physical links must be balanced.
6. The network must support private IP address reuse, i.e., multiple instances of the same private IP address can co-exist simultaneously.
The first requirement dictates that the ITRI container computer should not use a separate SAN for storage data accesses, and that its network must interface seamlessly with the container computer's Internet edge logic component. The second requirement mandates that only mainstream rather than high-end enterprise-grade Ethernet switches be used, and that the modifications required on these switches be minimized. The last requirement is included specifically to support an Amazon EC2-like IaaS (infrastructure as a service) cloud service, where multiple virtual data centers are multiplexed on a physical data center, and each virtual data center is given the full private IP address space 10.X.X.X so that a customer's virtual data center can seamlessly inter-operate with its existing on-premise physical data centers without any network/system reconfiguration, such as IP address re-assignment.
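The address reuse requirement can be illustrated with a minimal sketch. It assumes that every end point is identified by the pair (virtual data center, private IP address) rather than by the IP address alone. The class name `AddressTranslator`, the identifiers `vdc-a` and `vdc-b`, and the MAC-style physical identifiers are hypothetical and chosen only for illustration; they are not taken from the Peregrine implementation.

```python
from dataclasses import dataclass, field


@dataclass
class AddressTranslator:
    """Maps (virtual data center ID, private IP) pairs to unique physical end points."""

    table: dict = field(default_factory=dict)

    def register(self, vdc_id: str, private_ip: str, physical_id: str) -> None:
        key = (vdc_id, private_ip)
        # Reuse is rejected only inside the same virtual data center.
        if key in self.table:
            raise ValueError(f"{private_ip} already in use inside {vdc_id}")
        self.table[key] = physical_id

    def resolve(self, vdc_id: str, private_ip: str) -> str:
        # A lookup never crosses virtual data center boundaries.
        return self.table[(vdc_id, private_ip)]


translator = AddressTranslator()
# Two tenants use the same private address 10.0.0.5 at the same time.
translator.register("vdc-a", "10.0.0.5", "02:00:00:00:00:01")
translator.register("vdc-b", "10.0.0.5", "02:00:00:00:00:02")
```

Because the virtual data center identifier is part of the lookup key, the two instances of 10.0.0.5 resolve to different physical end points and never collide.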
Figure 4.1: System architecture of the ITRI container computer and its various system components. The figure labels physical servers running VM0 through VMn, compute server racks, storage servers, a layer-2-only data center network, layer-3 border routers, and load balancing, traffic shaping, intrusion detection and NAT/VPN functions.
A natural choice for building an all-layer-2 data center network is the standard Ethernet architecture. Unfortunately, because conventional Ethernet is based on a spanning tree architecture, it cannot satisfy the third and fourth requirements. Moreover, because the number of forwarding table entries in most mainstream Ethernet switches is between 16,000 and 64,000, they are ill-equipped to meet the third requirement. Finally, IP address reuse is actually considered a run-time configuration error and is thus impossible to support in standard Ethernet networks.
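The gap between forwarding table capacity and the one-million-end-point requirement can be checked with quick arithmetic. The sketch below assumes a flat forwarding table that needs one entry per end point, which is a simplification made here for illustration; the function name `coverage` is hypothetical.

```python
END_POINTS = 1_000_000           # requirement 3: up to one million end points
TABLE_SIZES = (16_000, 64_000)   # forwarding table range cited in the text


def coverage(table_entries: int, end_points: int = END_POINTS) -> float:
    """Fraction of end points a flat table can hold, one entry per end point."""
    return min(1.0, table_entries / end_points)


for size in TABLE_SIZES:
    print(f"{size} entries cover {coverage(size):.1%} of {END_POINTS} end points")
# prints:
# 16000 entries cover 1.6% of 1000000 end points
# 64000 entries cover 6.4% of 1000000 end points
```

Under this assumption, even the largest cited table holds well under a tenth of the required end points.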
Peregrine satisfies all the requirements mentioned above. It uses a two-stage dual-mode packet forwarding mechanism to support up to 1M end points using only mainstream Ethernet switches. It incorporates load-aware routing to make the best use of all the physical network links, and proactively provisions primary and backup routes to anticipate potential network failures. Peregrine supports private IP address reuse through a protected address translation mechanism similar to virtual address translation. Finally, Peregrine requires changing only about 100 lines of code on mainstream Ethernet switches.
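The idea of load-aware routing with proactively provisioned primary and backup routes can be sketched as follows. This is not Peregrine's actual routing algorithm, which is not described in this section. It is an illustrative stand-in that assumes each link carries a known load, picks the least-loaded path with Dijkstra's algorithm, and then computes a backup path that avoids every link of the primary. The topology, switch names and function names are hypothetical.

```python
import heapq


def least_loaded_path(links, src, dst, banned=frozenset()):
    """Dijkstra over undirected links, using 1 + current load as the link cost."""
    graph = {}
    for (a, b), load in links.items():
        if frozenset((a, b)) in banned:
            continue
        graph.setdefault(a, []).append((b, 1 + load))
        graph.setdefault(b, []).append((a, 1 + load))
    best = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == dst:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = node
                heapq.heappush(heap, (new_cost, nxt))
    if dst not in best:
        return None  # destination unreachable
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def provision(links, src, dst):
    """Return (primary, backup); the backup shares no link with the primary."""
    primary = least_loaded_path(links, src, dst)
    if primary is None:
        return None, None
    used = {frozenset(pair) for pair in zip(primary, primary[1:])}
    backup = least_loaded_path(links, src, dst, banned=used)
    return primary, backup


# Hypothetical four-switch topology; values are current link loads.
links = {("s1", "s2"): 5, ("s2", "s4"): 5, ("s1", "s3"): 1, ("s3", "s4"): 1}
primary, backup = provision(links, "s1", "s4")
print(primary, backup)  # ['s1', 's3', 's4'] ['s1', 's2', 's4']
```

Computing the backup in advance means that a failure on the primary route requires only a switch-over rather than a fresh route computation, which is the intuition behind a sub-100-msec fail-over target. Removing the primary's links before the second search is a simple heuristic that may miss a disjoint pair on some topologies.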