A World of
Connected
Possibilities
Virtual Data Center Networks:
Shedding Light On The Telco Edge
Written by:
Ido Shargil, Chief Architect, European Research Center
Table of Contents
Executive Summary
The Telco Challenge
The Enterprise Prism – What does it take to dive into the public cloud?
Introducing the vDCN Layer
Challenges with Virtualizing network devices, availability and management
vDCN Solution Components
Deploying an Overlay Network
Automation, Programmability and Simplification Use Cases
Summary
Executive Summary
One of the fastest growing segments in the ICT market is the public cloud service. As Telcos see their revenues from legacy services decline, providing cloud services is perceived as a new, promising source of revenue. However, the Telcos currently lag behind the dominant players in this field, OTT/IT vendors such as Amazon, Google and Microsoft. In parallel, NFV is becoming a critical foundation for the next-generation network services that Telcos will adopt, and it is a new domain in which Telcos are taking their first steps.
Still, the Telcos are not to be underestimated. They have unique assets: existing business relationships with customers, to whom they provide a variety of services from mobile connectivity to business VPNs; local presence tied to their physical network infrastructure; and control of the network. Our goal is to build on that network control and to promote an advanced approach that turns it into a competitive advantage.
The vDCN (virtual data center network) concept aims to give the Telco and its customers a competitive edge through better performance, control, flexibility and usability, while keeping CAPEX and OPEX low. This is done by virtualizing along two vectors:
1. Network function virtualization – deployment of virtual appliances under the NFV framework. These appliances allow the Telco and the enterprise customer to deploy advanced services over existing cloud infrastructures in a ‘click of a mouse’.
2. Network Resources Virtualization – Overlaying of a virtual network layer on top of the underlying physical infrastructure.
The vDCN concept is implemented on top of industry-standard x86 commodity servers, is available for demonstration, and is now being tested by major Tier 1 operators.
The Telco Challenge
Telcos are facing immense pressure due to declining revenues from traditional services and their inability to capture revenues from Over-The-Top (OTT) services. They are even being threatened in their ‘home territories’ by OTT players moving onto their ground. A recent example is Google winning over AT&T to provide connectivity services to 7,000 Starbucks coffee shops in the US [1].
A new source of revenue for the Telcos is the cloud. The public cloud services market is considered a high-growth market, expected to reach total revenues of $206B by 2016, at a CAGR of 17.7% [2]. Infrastructure as a service (IaaS) is the fastest growing segment, with a CAGR of 41.7%.
Figure 1 The public cloud market forecast; Source: Gartner
Over the last few years, major Telcos have launched public cloud services. Some have done so using internal resources, and others have acquired companies that specialize in the field. By the end of 2012, a total of 300 Telcos worldwide were supplying cloud services.
The Telcos are well positioned to be ‘Cloud Players’. They already operate their own data centers, they have established business relationships with business and residential customers, they have substantial presence and real estate in the territories they operate in, and they own and operate the inter-data-center network.
However, the Telcos currently face difficulties in penetrating this market, and are lagging behind IT/OTT competitors such as Amazon, Google, IBM and others. A recent study in the US revealed that only 11% of SMBs report that they prefer to buy cloud services from the Telco [3].
[1] http://www.forbes.com/sites/kellyclay/2013/07/31/starbucks-replaces-att-with-google-as-wifi-provider/
[2] Informa, “Navigating The Telecom Cloud”, Nov, 2012
[3]
The question ‘how can the Telco win in the Cloud?’ is the ‘million-dollar question’ (or in this case, better to count in billions). Several research firms have dealt with this issue, and one recommendation that all of them evangelize is “make the network count!” [4].
According to the Arthur D. Little report, one of the main assets the Telcos have is their control of the network. Creating ‘context aware networks’ is key to providing a superior service, and the Telco is in a much better position than the OTT players to provide it.
Another research report, from Overture [5], outlines its view of the value the Telco has in ‘bringing cloud to the metro edge’:
• Provide agility and scalability
• Simplify back office integration
• Create application-aware network
• Create network-aware applications
• Provide flexible workload placement
• Reduce OPEX and CAPEX
• Offer fast path to new features and services
As the following sections show, the vDCN concept addresses many of the challenges above.
The Enterprise Prism – What does it take to dive into the public cloud?
For enterprise customers to get their service from the public cloud, they need to be assured that the level of service is not compromised compared with the private cloud option. Their main concerns are the following:
Security and Privacy. Enterprises usually enforce their own security policies in their on-premises infrastructure. For that they deploy a variety of tools and appliances such as firewalls, load balancers, IDS/IPS and traffic monitoring solutions. As they move into the public or hybrid cloud environment, they seek to extend these policies into that realm as well. Privacy is another concern: following recent, publicly reported security breaches, many refrain from putting sensitive data into the cloud.
VM migration between data centers. There are several drivers for migrating VMs between data centers: elastic scaling of resources, optimization based on usage, minimization of electricity cost, and backup and disaster recovery. Because this migration must happen without interrupting service and without the applications being aware of it, the L2 (MAC) to L3 (IP) association has to be maintained. This requires keeping the same IP subnets across the data centers, and hence preserving L2 connectivity throughout the entire domain.
[4] Arthur D. Little, “Cloud from Telcos: Business distraction or a key to growth?”, 2013
[5]
The slow CT vs. fast IT gap. Provisioning of cloud resources, such as VMs or storage, is done in seconds, while provisioning of networks can drag on for weeks. One solution is to over-provision the network for future growth. However, such a policy has its associated costs, and planning ahead may not be accurate enough. Keeping the network flexible and tuned to the fast pace of IT is critical for a successful solution.
Lack of Quality of Service. Naturally, the interconnection of data centers across the WAN cannot enjoy the same level of bandwidth pipes and quality of service as is common inside the data center. However, the customer expects to achieve similar levels of service, mainly with regard to supporting time-critical applications and quick recovery from network failures.
Introducing the vDCN Layer
The Virtual Data Center Network concept is constructed of two vectors of virtualization:
1. Network function virtualization – Based on the ETSI-NFV framework, this concept refers to the virtualization of physical appliances and moving them into the cloud. Such an approach enables deploying advanced services and capabilities without upgrading the current equipment of the underlying physical network.
2. Network Resources Virtualization – Overlaying of a virtual network layer on top of the underlying physical infrastructure. Each tenant gets access to its virtual network, which can be further hierarchically divided into sub-virtual networks. These virtual networks are isolated and are managed and operated by the tenant.
Challenges with Virtualizing network devices, availability and management
As standardization in the field of NFV is still taking shape in the ETSI-ISG, implementing virtual appliances under the NFV framework brings its own challenges:
Performance. The NFV vision of moving from the world of hardware to a pure software implementation could be shattered by reality if implemented by ‘brute force’. Some functions are good candidates for software implementation, mainly those related to higher-layer intelligence, but many are much cheaper to perform in hardware.
An optimal solution would be to complement the software platform with a hardware accelerator (HWA), thus combining the flexibility of software with the superior performance of hardware. The HWA mechanism is being defined as part of the ETSI NFV standardization effort, and supports an abstraction layer and open interfaces, thus allowing interoperability between different vendors and solutions. There are several implementation options for the physical devices of the HWA, such as FPGA, Network Processor, Computing Hardware and more. As part of the vDCN research program, an FPGA-based architecture was implemented.
High Availability. Carrier-grade networks require a high level of availability, typically five nines, equivalent to an allowed downtime of roughly 5 minutes per year. When transitioning to NFV this problem intensifies, as the world of standard x86 servers is not accustomed to such a level of reliability. Traditional protection schemes are not suitable for NFV in the Telco environment, and therefore a new method, vNFPool [6], was created. This method is based on the RSerPool mechanism, an IT framework for access to multiple, coordinated pools of servers. To adapt this mechanism to the implementation of Telco functionality under NFV, the following improvements were added:
• Dynamic discovery of standby servers to receive synchronization state.
• Support for active/active multi-homing for L2/L3 switch/router implementation.
• Risk sharing avoidance (not installing the active and backup vNF on the same hardware).
One of the main values of the High-Availability mechanism as implemented is its support for the N:1 operation mode (N active vNFs and a single backup vNF), which achieves a considerable reduction in resources compared with the traditional 1:1 mode.
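To make the availability and resource figures above concrete, the following minimal sketch computes the yearly downtime budget implied by five nines and compares the number of vNF instances needed under 1:1 and N:1 protection. The function names, the 8 active vNFs and the choice of N = 4 are illustrative assumptions and are not taken from vNFPool.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability: float) -> float:
    """Yearly downtime budget in minutes for a given availability (e.g. 0.99999)."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def instances_needed(active: int, actives_per_backup: int) -> int:
    """Total vNF instances when each group of `actives_per_backup` active vNFs shares one backup."""
    backups = -(-active // actives_per_backup)  # ceiling division
    return active + backups


# Assumed example: 8 active vNFs, protected 1:1 versus 4:1.
five_nines = allowed_downtime_minutes(0.99999)
one_to_one = instances_needed(active=8, actives_per_backup=1)
n_to_one = instances_needed(active=8, actives_per_backup=4)

print(f"five nines: {five_nines:.2f} min/year")
print(f"1:1 protection: {one_to_one} instances, 4:1 protection: {n_to_one} instances")
```

Under these assumptions the downtime budget comes to a little over five minutes per year, and moving from 1:1 to 4:1 protection reduces the instance count from 16 to 10.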
Service automation and configuration. In traditional networks, complex configuration and long service deployment cycles were key factors that slowed CT operations relative to the quick pace of IT.
[6]
In an NFV environment, services are required to scale up or down rapidly and automatically. In addition, service velocity is improved by provisioning remotely in software, with no truck roll required to install new hardware. To enable this, several capabilities are supported:
• Automatic NFV appliance creation for tenants, with NFV appliance resource pool.
• NFV appliance configuration with open API for tenants.
• Automatic vNF redeployment following VM migration.
• Automatic cooperation with the forwarding network devices to redirect traffic to the appropriate vNF node.
Openness and Unified Management & Orchestration. With the introduction of NFV technology, Telcos need to support a multitude of network functions developed by different vendors.
The virtualization of the functions will also ease the fast introduction of value added services using IT cloud technologies.
In order to enable the above, an appropriate infrastructure should be built addressing the following:
• Management and orchestration of NFV-platform capabilities to integrate virtual functions easily across vendors and platforms.
• Service capability templates that are generic enough to support different vendors and network services.
The benefits for Telcos are two-fold. First, this virtualization will ease the whole life-cycle of Telco-based services. It will help them keep pace with the cloud providers while bringing value-added services (e.g. location based). Second, encouraging openness will result in a rich ecosystem of software applications around the Telcos. It opens the virtual appliance market to pure software entrants, small players and academia, encouraging more innovation that brings new services and new revenue streams quickly and at much lower risk.
vDCN Solution Components
The vDCN solution is comprised of the following virtual network elements:
• vDC-GW. Virtual Data Center Gateway installed logically on top of the existing DC gateway that is connecting the data center to the WAN. The vDC-GW is implemented under the NFV platform, and typically implements network functions such as MAC gateway, Overlay Routing, Load balancing, Firewall, IPS and WAN Optimization Controller.
• vAR\vPE. The Virtual Access Router\Virtual Provider Edge connects the enterprise to the WAN. The vPE is deployed when the enterprise prefers to install a physical router on premises; the vAR is deployed when the enterprise installs a simple Layer 2 device and prefers to get the advanced connectivity services (L3 routing, NAT, Firewall, IPS, etc.) from the operator’s virtual device.
• vDCN Center. This is the management entity used to provision, operate, manage and monitor the vDCN network and virtual appliances. The operator has a complete view of the network and its customers, the tenants. Each tenant gets a ‘thin’ version, which gives it full control of its own overlay network and virtual appliances, but prohibits access to other tenants’ resources as well as to any of the shared resources.
Figure 3 The vDCN components - virtual network elements and the management entity
Deploying an Overlay Network
The overlay network is built of virtual networks, each serving a tenant. The virtual network is a collection of virtual nodes connected by a set of virtual links to form a virtual topology. All these virtual elements are mapped onto the underlying physical infrastructure. Thus, a virtual node is hosted on a physical node and a virtual link spans over a path in the physical network, and may include several physical links and nodes.
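The mapping described above can be pictured with a small data structure. This is only a sketch: the class name, method names and node names are hypothetical and do not reflect the vDCN implementation. Each virtual node records the physical node that hosts it, and each virtual link records the physical path it spans.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class VirtualNetwork:
    """One tenant's virtual topology and its mapping onto the physical underlay."""
    tenant: str
    # virtual node name -> physical node that hosts it
    node_host: dict[str, str] = field(default_factory=dict)
    # (virtual node, virtual node) -> ordered physical nodes along the path
    link_path: dict[tuple[str, str], list[str]] = field(default_factory=dict)

    def add_node(self, vnode: str, host: str) -> None:
        self.node_host[vnode] = host

    def add_link(self, a: str, b: str, physical_path: list[str]) -> None:
        # The physical path must start and end at the hosts of the two virtual nodes.
        if physical_path[0] != self.node_host[a] or physical_path[-1] != self.node_host[b]:
            raise ValueError("physical path does not connect the hosting nodes")
        self.link_path[(a, b)] = physical_path

    def physical_hops(self, a: str, b: str) -> int:
        """Number of physical links spanned by the virtual link a-b."""
        return len(self.link_path[(a, b)]) - 1


# Hypothetical tenant with two virtual nodes in different data centers.
vn = VirtualNetwork(tenant="tenant-a")
vn.add_node("v1", host="dc1-server3")
vn.add_node("v2", host="dc2-server7")
vn.add_link("v1", "v2", ["dc1-server3", "dc1-gw", "wan-core", "dc2-gw", "dc2-server7"])
print(vn.physical_hops("v1", "v2"))
```

In this example a single virtual link between `v1` and `v2` spans four physical links and three intermediate physical nodes, which the tenant never needs to see.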
Each virtual network is operated and managed by a single tenant. The operator has a system-wide view, and can manage all the different virtual networks. One of the advantages of deploying an overlay network is the ability to use alternate routing methods. An example of such a method is the Flow Aggregated Service Routing (FASR) algorithm, illustrated in Figure 4.
A conventional routing mechanism forwards traffic along the least-cost path to the destination, as determined by the routing protocol. This may be inadequate for high-level application requirements. For example, an application that is sensitive to delay may traverse a path with longer delay than other available paths in the network. One possible solution, flow-routing, is very complex to configure and manage, and requires considerable resources from the underlying layer.
The FASR method is implemented on top of the overlay. As illustrated in Figure 4, the edge virtual nodes identify the application and tag it according to its type – delay, packet-loss or bandwidth sensitive. The packets are forwarded over the virtual network, based on the ‘Aggregation Tag’ which is a 2-tuple field composed of the destination IP (of the overlay network) and the Type field.
Figure 4 Flow Aggregated Service Routing: Implemented on top of the overlay network
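The tagging and forwarding behaviour of FASR can be sketched as follows. This is an illustrative model only: the port-based classifier, the IP address and the next-hop names are assumptions made for the example, since the paper does not specify how applications are identified or how forwarding tables are populated.

```python
from __future__ import annotations

from enum import Enum


class ServiceType(Enum):
    DELAY = "delay-sensitive"
    LOSS = "packet-loss-sensitive"
    BANDWIDTH = "bandwidth-sensitive"


# Hypothetical edge classification: destination port -> service type.
PORT_TO_TYPE = {
    5060: ServiceType.DELAY,      # e.g. voice signalling
    443: ServiceType.LOSS,        # e.g. transactional web traffic
    873: ServiceType.BANDWIDTH,   # e.g. bulk file synchronisation
}


def aggregation_tag(overlay_dst_ip: str, dst_port: int) -> tuple[str, ServiceType]:
    """Edge virtual node: build the 2-tuple (overlay destination IP, Type)."""
    service = PORT_TO_TYPE.get(dst_port, ServiceType.BANDWIDTH)  # assumed default
    return (overlay_dst_ip, service)


# Hypothetical forwarding table of one virtual node: Aggregation Tag -> next virtual hop.
FORWARDING_TABLE = {
    ("10.0.2.1", ServiceType.DELAY): "vnode-low-latency",
    ("10.0.2.1", ServiceType.LOSS): "vnode-low-loss",
    ("10.0.2.1", ServiceType.BANDWIDTH): "vnode-high-capacity",
}


def next_hop(tag: tuple[str, ServiceType]) -> str:
    """Forward on the Aggregation Tag alone, so all flows sharing a tag share one entry."""
    return FORWARDING_TABLE[tag]


tag = aggregation_tag("10.0.2.1", 5060)
print(next_hop(tag))
```

Because the table is keyed by destination and type rather than by individual flow, its size grows with the number of destinations times the number of types, which is what keeps this approach simpler than per-flow routing.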
Creating an overlay network brings both gains and challenges [7]:
Flexibility. Each tenant can implement its own network topology, routing and forwarding methods and its own policy functions, independent of the underlying physical network. For example, a tenant may consider the Flow Aggregated Service Routing mechanism better suited to its purpose. However, deploying it directly on the underlay is practically impossible without upgrading the existing devices. Implementing it on top of the overlay network is the only practical way to do it in a cost-efficient and flexible manner.
Manageability. Simplification of the operational process for the tenant is critical. The slow, tedious processes of the ‘old-fashioned’ Telco are no match for the fast pace of IT. In the traditional operation mode, a change in service attributes (such as bandwidth, policy, etc.) requires interaction with the service provider and manual provisioning of the service. Management at the overlay network level allows the tenant to autonomously manage and control the operational aspects of its service within the boundaries agreed and defined by the service provider.
[7] Detailed research in this field: N. M. Mosharaf Kabir Chowdhury and Raouf Boutaba, University of Waterloo, “Network Virtualization: State of the Art and Research Challenges”, 2009
Scalability. A typical Tier 1 operator may have thousands of SMB/enterprise customers. Thus, being able to scale the service as it evolves is critical to its success.
Isolation. Tenants expect the provider to ensure complete isolation between their networks and those of other tenants. This is required to guarantee privacy and security, as well as to prevent misconfiguration or misbehavior in one virtual network from affecting its neighbors.
Programmability. Programmable network elements provide the manageability and flexibility the tenant requires. Programmability is the means by which the tenant is able to implement customized protocols and to deploy diverse services.
Heterogeneity. The underlying infrastructure may be composed of different physical networks and different technologies (e.g. optical, wireless, etc.). The tenant should remain agnostic to that diversity and be shielded from the complexity it involves.
Automation, Programmability and Simplification Use Cases
The following use cases illustrate the system’s operational model:
- Automatic Scale-out. In order to provide improved scalability, simplified maintenance and reduced OPEX, automatic scale-out/in is a key capability for an NFV system. For example:
1. vDCN Center gets a status report (e.g. traffic, load,…) from the vNF (e.g. vFW).
2. If excess load is detected, vDCN Center automatically launches a new vFW for that tenant.
3. Then vDCN Center cooperates with the NMS/SDN Controller to redistribute the traffic, and direct part of the flows to the new vFW.
- Installing a new vNF to implement a new service. Anyone (e.g. vendor, Telco, enterprise) can install and update its own vNF in the Telco’s vDCN Layer. The vDCN Layer can support any type of existing and future vNF via a Unified-Template that provides unified management and control. In addition, any developer can develop an innovative vNF using the APIs provided by the vDCN Layer.
- Automatic deployment and redeployment for vNF. Operators can deploy the vNFs in the right location to reduce the trombone flow.
• vNF automatic deployment near the VM
• Automatic redeployment following VM migration (Figure 5)
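The three automatic scale-out steps listed earlier in this section (status report, launch on excess load, traffic redistribution) can be sketched as a simple control loop. This is a toy model under stated assumptions: the class and method names, the 0.8 load threshold and the round-robin redistribution are hypothetical and do not represent the real vDCN Center or NMS/SDN Controller interfaces.

```python
from collections import defaultdict

LOAD_THRESHOLD = 0.8  # assumed: fraction of capacity above which a vFW counts as overloaded


class VdcnCenterSketch:
    """Toy model of the scale-out loop; not the real vDCN Center interface."""

    def __init__(self):
        self.vfws = defaultdict(list)  # tenant -> list of vFW instance names
        self.flows = {}                # flow id -> vFW instance currently serving it

    def launch_vfw(self, tenant):
        name = f"{tenant}-vfw-{len(self.vfws[tenant]) + 1}"
        self.vfws[tenant].append(name)
        return name

    def redistribute(self, tenant):
        # Stand-in for the NMS/SDN Controller step: spread flows round-robin.
        instances = self.vfws[tenant]
        for i, flow in enumerate(sorted(self.flows)):
            self.flows[flow] = instances[i % len(instances)]

    def on_status_report(self, tenant, load):
        # Step 1: a status report arrives. Step 2: launch a new vFW on excess load.
        if load > LOAD_THRESHOLD:
            self.launch_vfw(tenant)
            self.redistribute(tenant)  # Step 3: direct part of the flows to the new vFW


# Single-tenant example: four flows initially served by one vFW.
center = VdcnCenterSketch()
first = center.launch_vfw("tenant-a")
center.flows = {f"flow-{i}": first for i in range(4)}
center.on_status_report("tenant-a", load=0.95)
print(center.vfws["tenant-a"])
print(center.flows)
```

A real deployment would also need scale-in logic and some damping to avoid launching instances on short load spikes; both are omitted here for brevity.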
Summary
The vDCN concept is a research project being developed by Huawei’s network product line research group. The components and technologies described in this white paper are available, and are now being deployed and tested in several Telco labs and test networks.
Because the market for public cloud services is substantial and growing fast, and because Telcos face real challenges in gaining a substantial share of it, Huawei is committed to pushing forward and innovating in this field. The vDCN platform serves as an ‘arrow head’ in the move from the world of Telco to the world of IT. This platform is used for innovation and for the introduction of new concepts and technologies.
For further information, the reader may contact either author of this paper (contact details on the last page).
About the Authors
Ido Shargil,
Chief Architect, Huawei European Research Center
Email: [email protected]
Ido is engaged in research in the field of Telco-Cloud, NFV and advanced networking technologies. Before joining Huawei he was head of R&D and head of System Engineering of the Access Products Business Unit at ECI Telecom.
Michael Yin,
Senior Research Engineer, Huawei Network Innovation R&D Center
Email: [email protected]
Michael is a researcher in Telecom and Networking technologies, focusing on data center devices, Ethernet switches, routers & chips. He is now responsible for promoting the vDCN project, which aims to study how to use SDN and NFV technologies in DC networks and how to implement a demonstration of vDCN.
Copyright © 2014 Huawei Technology Co., Ltd. All rights reserved
You may copy and use this document solely for your internal, reference purposes. No other license of any kind granted herein.
This document is provided "as-is" without warranty of any kind, express or implied. All warranties are expressly disclaimed. Without limitation, there is no warranty of non-infringement, no warranty of merchantability, and no warranty of fitness for a particular purpose. Huawei assumes no responsibility for the accuracy of the information presented. Any information provided in this document is subject to correction, revision and change without notice. Your use of, or reliance on, the information provided in this document is at your sole risk. All information provided in this document on third parties is provided from public sources or through their published reports and accounts.
HUAWEI is a trademark or registered trademark of Huawei Technologies Co., Ltd. All other company names and trademarks mentioned in this document are the property of their respective owners.