
Network Virtualization and Data Center Networks DC Virtualization Basics Part 3. Qin Yin Fall Semester 2013


(1)

Network Virtualization and Data Center Networks

263-3825-00

DC Virtualization Basics – Part 3

Qin Yin

(2)

Outline

A Brief History of Distributed Data Centers

The Case for Layer 2 Extension

Layer 2 Extension

Over Optical Connections

• Virtual PortChannels
• FabricPath

Over MPLS

• Ethernet over MPLS (EoMPLS)
• Virtual Private LAN Service (VPLS)

Over IP

• MPLS over GRE

(3)

Virtual PortChannel Summary

Virtualization characteristics: Virtual PortChannel

Emulation: Single Ethernet switch

Type: Pooling

Subtype: Homogeneous

Scalability: Two physical switches

Technology area: Networking

Subarea: Data and control plane virtualization

(4)

Distributed Data Centers

Data center interconnect (DCI)

Many physical sites, one logical data center

Business goals

Seamless workload mobility

Business continuity

Pool and maximize global resources

Distributed applications

Defining two metrics for each application environment

Recovery Point Objective (RPO): the maximum tolerable amount of time in which data can be lost from an IT service

(5)

From a networking perspective

Connected through Layer 3 routing

Decreases fate sharing in distributed data center

Isolates each site from remote network instabilities

Extending Layer 2 domains

(6)

The Cold Age (Mid-1970s to 1980s)

Computer rooms housing mainframe systems

Applications based on batch processing

RPO and RTO could span days or even weeks

Recovery technologies: data backup and retrieval

Data: stored on tapes

Connectivity: physical transport of tapes to the backup site

Cold-standby and warm-standby

(7)

The Hot Age (1990s to Mid-2000s)

Internet booms and the advent of electronic business

The need for real-time response

Recovery technologies focused on service availability

RPO and RTO confined to hours, or even minutes

Geographic clusters (geocluster)

Application servers installed on at least two geo-separated sites

Active node failure triggers automatic switchover to the standby node

Generally require data replication to the hot-standby site

• Synchronous replication (tens of kilometers apart to avoid latency issues)

• Asynchronous replication (data periodically copied from primary to

(8)

Geocluster

Different types of geocluster communication

– Heartbeat communication
– Application state information (such as cached data for database servers)
– Client traffic (especially when nodes share the same virtual IP address)

(9)

The Active-Active Age (Mid-2000s -)

In a hot-standby site, hardware and software resources are

– Used only in case of a major failure at the main site
– Activated for a small amount of time per year
– Needed by some critical applications (RPO of 0, RTO of seconds)

Active-active design to avoid resource waste

– No luxury of unused sites
– Deploy several active nodes dispersed over multiple data centers
– Server and storage virtualization to provide automatic and quick workload mobility between sites

Challenges of scalability and flexibility

(10)

Requirements of Layer 2 Extension

• Heartbeat and connection state communications are usually directed to multiple destinations

– Broadcast, multicast, or unknown unicast Ethernet frames (flooding)

• Active and standby nodes usually share the same virtual IP and MAC address

– To facilitate traffic handling in the case of failure

• Server migration

– Applications do not support IP readdressing

– Readdressing generates painstakingly complex operations

• Data center expansion

– A data center has reached a physical limitation

– A company hires a colocation service from an outsourcing data center

As a result, standard Layer 2 connections are deployed to provide extended VLANs over multiple data centers.

(11)

Challenges of Layer 2 Extension

Flooding and broadcast

– Loops over the Layer 2 extensions can be easily formed

A spanning tree instance spanning multiple sites presents formidable challenges

– Scalability: recommended STP diameter is 7
– Isolation: reconvergence will affect VLANs within one STP instance
– Multihoming: multiple DCI links will not be used for data comm

(12)

Challenges of Layer 2 Extension (cont.)

Tromboning can be formed between data centers

– Non-optimal internal routing within extended VLANs
– Cause for DCI resource waste: uncontrolled state of an active-standby pair of devices

Data confidentiality

– Mandates strict forms of encryption in the data center interconnect to minimize the risk of data leakage

(13)

Traditional Layer 2 VPNs

Dark Fiber

VPLS

(14)

Ethernet Extensions over Optical Connections

Optical connections

Distance: less than a few hundred kilometers

Dark fiber

Fiber-optic pair to connect networking devices

Wavelength-division multiplexing (WDM) to increase transport capacity

Coarse WDM: multiplex eight optical carrier signals

Dense WDM: aggregate a higher number (128, for example)

Dark fiber and WDM

Communication solutions belonging to Layer 1 (physical)

Can transport any data-link protocol including Ethernet

(15)

Spanning Tree Protocol

STP does not allow Ethernet traffic on all the links between DCI switches

The STP instance

Is spread over both sites

Makes both sites share any internal topology change or reconvergence

(16)

STP and Link Utilization

STP wastes inter-switch

(17)

Link Aggregation

STP only detects one logical interface

Traffic destined to this interface is load balanced among the active physical links that are part of the channel

This virtual interface is

(18)

Virtual PortChannel

Eliminates STP blocked ports

Uses all available uplink bandwidth

Allows a single device to use a port channel across two upstream switches

Dual-homed servers operate in active-active mode

Provides fast convergence

(19)

Virtual PortChannels on a Layer 2 Extension

Virtual PortChannels

– Transform multiple Ethernet links into a single-switch STP connection

Benefits

– Multihoming is enabled: all links are being used (and load balanced)
– Spanning tree topology is simplified: only one connection between sites
– If the vPC peer switch feature is deployed, a device failure will not result in reconvergence

(20)

Virtual PortChannels in Multipoint Data Center Connections

Problem

– vPCs can form a logical looped topology

Solution: hub-and-spoke

– Deploy disjoint STP instances per site
– STP isolation is enabled on all DCI switches

– Avoid loops in the Layer 2

(21)

Traditional Layer 2 VPNs

Dark Fiber

VPLS

(22)

MPLS Labels and Packets

Provides packet

forwarding based on

labels

Layer 2.5 technology

Header fields

Label value

Experimental (Exp)

• To define QoS classes in MPLS networks
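The header fields can be made concrete with a small sketch. The slide names only the label value and the Exp field; the remaining two fields (bottom-of-stack bit and TTL) and the 20/3/1/8-bit split are taken from the standard MPLS label stack entry layout rather than from this deck, and the function names are hypothetical.

```python
import struct


def encode_label_entry(label, exp=0, bottom=False, ttl=64):
    """Pack one 32-bit MPLS label stack entry.

    Layout assumed here (standard MPLS shim header, not shown on the slide):
    label (20 bits) | Exp (3 bits) | bottom-of-stack S (1 bit) | TTL (8 bits).
    """
    if not 0 <= label < 2 ** 20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= exp < 8:
        raise ValueError("exp must fit in 3 bits")
    if not 0 <= ttl < 256:
        raise ValueError("ttl must fit in 8 bits")
    word = (label << 12) | (exp << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)  # network byte order


def decode_label_entry(data):
    """Unpack the first 4 bytes of data into the four header fields."""
    (word,) = struct.unpack("!I", data[:4])
    return {
        "label": word >> 12,
        "exp": (word >> 9) & 0x7,
        "bottom": bool((word >> 8) & 0x1),
        "ttl": word & 0xFF,
    }
```

For example, `decode_label_entry(encode_label_entry(72, exp=5, bottom=True, ttl=2))` returns the same four values that were packed.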

(23)

MPLS Basics

Protocol flexibility

Comes from the capability of stacking labels

MPLS services

Traffic engineering

• Configures and defines unidirectional tunnels using tunnel label
• Override routing protocol decision

Layer 3 virtual private networks

• Connect different VPNs
• Inner label: VPN

Any transport over MPLS (AToM)

• Transport of Layer 2 frames
• Inner label: virtual circuit
• Example: EoMPLS

(24)

MPLS Network

• Forwarding Equivalence Class (FEC)

– MPLS packets sharing the same label

• Two types of routers

– Label Edge Router (LER)
– Label Switch Router (LSR)

• MPLS router elements

– A loopback interface
  • To improve reachability
– A routing protocol
  • To advertise connected subnets
– LDP (Label Distribution Protocol)
  • To enable device discovery and label distribution

(25)

EoMPLS Configuration

In essence

– Within the MPLS network
  • Encapsulates Ethernet frames within MPLS packets
– At the egress of the MPLS network
  • Frames are transported, decapsulated, and delivered as they were

MPLS label stack

– Tunnel label: routing from ingress to egress LER
– VC label: identifying the virtual circuit within the tunnel

Pseudowire

(26)

Pseudo Wire Reference Model

A Pseudo Wire (PW) is a connection between two provider edge

(PE) devices connecting two attachment circuits (ACs)

Label Switched Path (LSP) – MPLS tunnel

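[Figure: pseudo wire reference model. CE devices at two customer sites connect through attachment circuits to PE1 and PE2. Pseudo wires PW1 and PW2 run inside a PSN tunnel (an LSP in MPLS) across the MPLS (or IP) network and together form the emulated service.]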

(27)

VC Distribution Mechanism using LDP

• Unidirectional Tunnel LSP

– To transport PW PDU from PE to PE based on tunnel label(s)

– Both LSPs combined to form a single bi-directional Pseudo Wire

• Directed LDP session

– To exchange VC information, such as VC label and control information

VC Label identifies interface

Tunnel Label(s) gets to PE router

[Figure: PE1 and PE2 connected across an IP/MPLS network by a label switched path created using IGP+LDP or RSVP-TE, with a directed LDP session between PE1 and PE2. CE devices at the customer sites attach to each PE.]

(28)

Ethernet PW Tunnel Encapsulation

Tunnel Encapsulation

One or more MPLS labels associated with the tunnel

Defines the LSP from ingress to egress PE router

[Figure: Ethernet PW packet layout over a 32-bit ruler. Tunnel encapsulation: tunnel label (LDP, RSVP, BGP), EXP, bottom-of-stack bit 0, TTL. PW demultiplexer: VC label, EXP, bottom-of-stack bit 1, TTL (set to 2). Then a word with 0 0 0 0, Reserved, and Sequence Number, followed by the Layer 2 PDU.]

(29)

Ethernet PW Demultiplexer

• Obtained from Directed LDP session

• To identify individual circuits within a tunnel • Used by receiving PE to determine

– Egress interface for L2PDU forwarding (Port based)

– Egress VLAN used on the CE facing interface (VLAN Based)

• EXP can be set to the values received in the L2 frame

[Figure: Ethernet PW packet layout over a 32-bit ruler. Tunnel encapsulation: tunnel label (LDP, RSVP, BGP), EXP, bottom-of-stack bit 0, TTL. PW demultiplexer: VC label, EXP, bottom-of-stack bit 1, TTL (set to 2). Then a word with 0 0 0 0, Reserved, and Sequence Number, followed by the Layer 2 PDU.]

(30)

PW Operation and Encapsulation

This process happens in both directions

[Figure: PW operation across PE1, P1, P2, and PE2 in an IP/MPLS network between two customer sites. A directed LDP session between PE1 and PE2 signals label 72 for PW1. Hop-by-hop LDP sessions advertise labels 24, 38, and pop for the loopback (Lo0) along the LSP. The L2 PDU is shown with label stacks 24/72, 38/72, and 72.]
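The label operations along the path can be sketched as a short trace. This is an illustration under assumptions: the label values 24, 38, and 72 are the ones that appear on the slide, but which hop uses which tunnel label, and the use of penultimate-hop popping at P2, are inferred from the figure residue. The function names are hypothetical.

```python
def push(stack, label):
    """Return a new stack with label on top (index 0 is the outermost label)."""
    return [label] + stack


def swap(stack, label):
    """Replace the outermost label."""
    return [label] + stack[1:]


def pop(stack):
    """Remove the outermost label."""
    return stack[1:]


def trace_pseudowire(vc_label=72, tunnel_labels=(24, 38)):
    """Trace the label stack of one L2 PDU from PE1 to PE2 (one direction)."""
    trace = []
    # Ingress PE1: push the VC label first, then the tunnel label on top.
    stack = push([], vc_label)
    stack = push(stack, tunnel_labels[0])
    trace.append(("PE1 -> P1", list(stack)))
    # P1 is an LSR: it swaps only the outer (tunnel) label.
    stack = swap(stack, tunnel_labels[1])
    trace.append(("P1 -> P2", list(stack)))
    # P2 is the penultimate hop (assumed): it pops the tunnel label.
    stack = pop(stack)
    trace.append(("P2 -> PE2", list(stack)))
    # Egress PE2: the VC label selects the attachment circuit, then is removed.
    stack = pop(stack)
    trace.append(("PE2 -> CE", list(stack)))
    return trace
```

Calling `trace_pseudowire()` shows that the VC label is never touched inside the core; only the tunnel label changes hop by hop. The reverse direction would use its own, independently signaled labels.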

(31)

Virtual Private LAN Service

End-to-end architecture allowing MPLS networks to offer Layer 2 multipoint Ethernet services

Provides emulation of a single virtual Ethernet bridge network

Virtual bridges linked with MPLS Pseudo Wires

VPLS is an architecture

[Figure: CE devices attached to PE devices that are linked across the MPLS network.]

(32)

Virtual Private LAN Service

It is “Virtual”

Multiple instances share the same physical

infrastructure

It is “Private”

Each instance is independent and isolated from

one another

It is “LAN Service”

It emulates Layer 2 multipoint connectivity

(33)

VPLS Components

[Figure: VPLS components. CE routers and CE switches connect to N-PE devices around an MPLS core.]

Attachment circuits: port or VLAN mode

Mesh of LSPs between N-PEs: Pseudo Wires within LSP

Virtual Switch Interface (VSI) terminates PW and provides Ethernet bridge function

LDP between PEs used to exchange VC and tunnel labels for Pseudo Wires

Attachment CE:

(34)

Virtual Switch Interface

Flooding / Forwarding

MAC table instances per customer (port/vlan) for each PE

Associate ports to MAC, flood unknowns to all other ports

Address Learning / Aging

LDP enhanced with additional MAC list TLV (label

withdrawal)

MAC timers refreshed with incoming frames

Loop Prevention

Create a full-mesh of Pseudo Wires (VCs in EoMPLS)

Unidirectional LSP carries VCs between pair of N-PEs
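The bridging behavior of a VSI can be sketched in a few lines. This is a toy model, not any vendor's implementation: the class and port names are hypothetical, aging timers are left out, and the split-horizon rule (a frame received on a pseudo wire is never sent to another pseudo wire) is the standard VPLS rule that makes the full mesh loop-free; the slide itself only names the full mesh.

```python
class VirtualSwitchInstance:
    """Toy VSI: one MAC table shared by attachment circuits and pseudo wires."""

    def __init__(self, attachment_circuits, pseudo_wires):
        self.acs = set(attachment_circuits)
        self.pws = set(pseudo_wires)
        self.mac_table = {}  # MAC address -> port (AC or PW) it was learned on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source MAC and return the set of output ports."""
        self.mac_table[src_mac] = in_port  # learning refreshes the entry
        out_port = self.mac_table.get(dst_mac)
        if out_port is not None:
            # Known unicast: forward to one port, unless that would send the
            # frame back where it came from or from one PW to another PW.
            if out_port == in_port:
                return set()
            if in_port in self.pws and out_port in self.pws:
                return set()
            return {out_port}
        # Broadcast, multicast, or unknown unicast: flood.
        if in_port in self.pws:
            targets = set(self.acs)  # split horizon: never PW to PW
        else:
            targets = self.acs | self.pws
        return targets - {in_port}
```

With one attachment circuit and two pseudo wires, the first frame toward an unknown MAC is flooded to both pseudo wires; once the reply has been seen, later frames use only the pseudo wire on which the remote MAC was learned.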

(35)

VPLS Flooding and Forwarding

Flooding (Broadcast, Multicast, Unknown Unicast)

Dynamic learning of MAC addresses on PHY and VCs

Forwarding

Physical Port

Virtual Circuit

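[Figure: a frame (Data, SA, DA) whose destination address is looked up to choose between a physical port and a virtual circuit.]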

(36)

MAC Learning and Forwarding

• MAC addresses in broadcast, multicast, and unknown unicast frames are learned via the received label associations

• Two LSPs associated with a VC (Tx & Rx)
• If inbound or outbound LSP is down

[Figure: PE1 and PE2 each tell the other which VC label to use ("Send me frames using Label 170", "Send me frames using Label 102"). Each PE's MAC address table maps the locally attached MAC to its CE-facing port (E0/0 or E0/1) and the remote MAC to a VC label (102 or 170); frames between MAC1 and MAC2 cross the core carrying those VC labels.]

(37)

MAC Address Withdraw

• Message speeds up convergence process

– Otherwise PE relies on MAC Address Aging Timer

• Upon failure, PE removes locally learned MAC addresses

• Send LDP Address Withdraw to remote PEs in VPLS (using the Directed LDP

session)

• New MAC List TLV is used to withdraw addresses


(38)

VPLS Functional Components

N-PE provides VPLS termination/L3 services

U-PE provides customer UNI

[Figure: CE, U-PE, N-PE, MPLS core, N-PE, U-PE, CE. The U-PEs sit in customer MxUs and the N-PEs in SP PoPs.]

(39)

Direct Attachment (Flat)

Characteristics

Suitable for simple/small implementations

Full mesh of directed LDP sessions required

N*(N-1)/2 Pseudo Wires required

Scalability issue as the number of PE routers grows

No hierarchical scalability

VLAN and Port level support

Potential signaling and packet replication overhead

Large amount of multicast replication over same physical

CPU overhead for replication

(40)

Direct Attachment VPLS (Flat Architecture)

[Figure: CE, N-PE, MPLS core, N-PE, CE. Ethernet (VLAN/port) attachment on both sides, with a full mesh of PWs plus LDP across the core carrying frames between MAC1 and MAC2.]

(41)

Hierarchical VPLS (H-VPLS)

Best for larger scale deployment

Reduction in packet replication and signaling

overhead

Consists of two levels in a hub-and-spoke topology

Hub consists of full mesh VPLS Pseudo Wires in MPLS core

Spokes consist of L2/L3 tunnels connecting to VPLS (Hub) PEs
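The scaling difference between the flat and hierarchical designs can be checked with simple arithmetic. The sketch below uses the N*(N-1)/2 full-mesh formula from the slides; the assumption that each spoke device needs exactly one pseudo wire to its hub PE (no dual homing) and the function names are illustrative choices.

```python
def full_mesh_pseudo_wires(n_pe):
    """Pseudo wires needed for a full mesh between n_pe PE routers."""
    return n_pe * (n_pe - 1) // 2


def hvpls_pseudo_wires(n_hub, n_spoke):
    """Pseudo wires in H-VPLS: a full mesh among hub PEs plus one spoke
    pseudo wire per spoke device (single-homed spokes assumed)."""
    return full_mesh_pseudo_wires(n_hub) + n_spoke


if __name__ == "__main__":
    # Ten edge devices: flat full mesh versus 4 hub PEs with 6 spokes.
    print(full_mesh_pseudo_wires(10))   # 45
    print(hvpls_pseudo_wires(4, 6))     # 12
```

The flat count grows quadratically with the number of PEs, while the hierarchical count grows quadratically only in the (small) number of hub PEs and linearly in the number of spokes.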

(42)

Why H-VPLS?

VPLS

• Potential signaling overhead
• Full PW mesh from the edge
• Packet replication done at the edge
• Node discovery and provisioning

H-VPLS

• Minimizes signaling overhead
• Full PW mesh among core devices
• Packet replication done at the core
• Partitions the node discovery process

[Figure: flat VPLS, with CEs attached to fully meshed PEs, next to H-VPLS, with CEs attached through MTU-s and PE-r devices to fully meshed PE-rs devices.]

(43)

MPLS Edge H-VPLS

[Figure: H-VPLS with MPLS edge. On each side: CE, 802.1q access, U-PE (PE-rs), MPLS Pseudo Wire over MPLS access, N-PE (PE-rs); the N-PEs are joined by a full mesh of PWs plus LDP across the MPLS core. Three frame formats are shown (1: 802.1q customer, 2: MPLS PW at the SP edge, 3: Pseudo Wire in the SP core), each carrying MAC2, MAC1, Data, and the CE VLAN, with PE and VC labels added inside the MPLS segments.]

Same VCID used in edge and core (labels may differ)

(44)

Layer 2 VPNs

Dark Fiber

VPLS

(45)

Flooding Behavior

Traditional Layer 2 VPN technologies rely on flooding to propagate MAC reachability

The flooding behavior causes failures to propagate to every site in the Layer 2 VPN

Goal

Provide Layer 2 connectivity, yet restrict the reach of the unknown unicast flooding domain in order to contain failures and preserve resiliency

(46)

Pseudo Wires Maintenance

Before any learning can happen, a full mesh of pseudo-wires/tunnels must be in place

For N sites, there will be N*(N-1)/2 pseudo-wires.

Complex to add and remove sites

Head-end replication for multicast and broadcast.

Sub-optimal BW utilization

Goal

providing point-to-cloud provisioning and optimal

(47)

Multi-homing

Requires additional protocols (BGP, ICC, EEM)

STP often extended

Malfunctions impact all sites

Goal

Natively provide automatic detection of multi-homing without the need to extend the STP domains, together with more efficient load balancing

(48)

OTV Changes the Game

Circuits + Data Plane Flooding

– Full mesh of circuits
– MAC learning based on flooding
– Tunnels and Pseudo Wires
– Operationally challenging
  • Loop prevention
  • Multi-homing

Packet + Control Protocol Learning

– Packet switched connectivity
– MAC learning by control protocol
– Dynamic encapsulation
– Operational simplification

• Automatic loop prevention and

(49)

Overlay Transport Virtualization

OTV delivers a virtual L2 transport over any L3 infrastructure

Overlay

Independent of the infrastructure technology and services, flexible over various inter-connect facilities

Transport

Transport services for Layer 2 and Layer 3 Ethernet and IP traffic

Virtualization

Provides virtual stateless multi-access connections.

(50)

OTV Control Plane MAC Learning

a. Server with MAC address X sends frames that are flooded or broadcast within the site.

b. OTV1 learns MAC X and populates its MAC address table.

c. OTV1 advertises MAC X with an IS-IS update.

d. OTV2 and OTV3 become aware that MAC X can be reached through OTV1 and populate their MAC address tables using the virtual Layer 2 interface called Overlay.

(51)

OTV Frame Forwarding

a. Server2 sends a unicast frame destined to MAC X that is flooded to OTV2.

b. OTV2 checks its MAC address table and realizes that the MAC X entry points to an Overlay interface.

c. Internally in OTV2, this Overlay interface provides a mapping to OTV1’s IP address. As a result, the unicast frame is encapsulated into an IP packet directed to OTV1.

d. OTV1 receives the IP packet and decapsulates it, recovering the original Ethernet frame.

e. OTV1 uses its local MAC address table to forward the frame to Server1.
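The learning and forwarding steps can be sketched as a small simulation. It is a simplification under stated assumptions: a direct method call stands in for the IS-IS update, a dictionary stands in for the outer IP and OTV shim headers, and the class name, port names, and IP addresses are hypothetical. Dropping frames for unknown destinations reflects the OTV rule of not flooding unknown unicast between edge devices.

```python
class OtvEdgeDevice:
    """Toy OTV edge device with a MAC table fed by a control protocol."""

    def __init__(self, name, join_ip):
        self.name = name
        self.join_ip = join_ip  # IP address of the join interface
        self.peers = []         # other edge devices on the same overlay
        self.mac_table = {}     # MAC -> ("local", port) or ("overlay", remote IP)

    def learn_local(self, mac, port):
        """Data-plane learning inside the site, then advertise to peers.

        The direct call below stands in for an IS-IS update (assumption).
        """
        self.mac_table[mac] = ("local", port)
        for peer in self.peers:
            peer.receive_advertisement(mac, self.join_ip)

    def receive_advertisement(self, mac, remote_ip):
        """Control-plane learning: the MAC is reachable via the overlay."""
        self.mac_table[mac] = ("overlay", remote_ip)

    def forward(self, dst_mac, frame):
        """Forward a unicast frame according to the MAC table."""
        entry = self.mac_table.get(dst_mac)
        if entry is None:
            return None  # unknown unicast is not flooded across the overlay
        kind, where = entry
        if kind == "local":
            return ("deliver", where, frame)
        # Overlay entry: wrap the Ethernet frame in an IP packet to the peer.
        packet = {"src_ip": self.join_ip, "dst_ip": where, "payload": frame}
        return ("encapsulate", packet)
```

After `otv1.learn_local("MAC-X", "Eth1/1")`, a peer's `forward("MAC-X", frame)` yields an encapsulated packet addressed to OTV1's join IP, and OTV1's own `forward` delivers the recovered frame to the local port.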

(52)

OTV Encapsulation

Outer IP header

Outer OTV shim header

VLAN

Overlay number

(53)

OTV elements

Edge device: network equipment that is actually deploying OTV

- Internal interface: connected to a Layer 2 network
  - To process Ethernet frames
- Join interface: connected to the Layer 3 network
  - To send or receive OTV packets

(54)

OTV elements

Overlay interface:

- A virtual Layer 2 interface that represents an OTV Layer 2 extension to other edge devices

- Used in the edge devices' MAC address tables as the interface associated with remote MAC addresses

(55)

OTV elements

Site VLAN:

- A dedicated VLAN used for discovery and adjacency maintenance between edge devices on the same site
- Should not be extended to other sites.

(56)

Spanning Tree and OTV

OTV is site transparent: no changes to the STP topology

Each site keeps its own STP domain

An Edge Device will send and receive BPDUs

(57)

OTV Loop Avoidance

Blocking unknown unicast traffic between edge devices

Authoritative edge device (AED)

The only edge device on a site handling multicast and

(58)

OTV and Multi-homing

OTV built-in multi-homing

Allows Layer 2 traffic to be load balanced through different IP WAN links

OTV multi-homing options

Automatic distribution of VLANs among the available AED candidates (a hashing function performs this distribution).

Unicast egress OTV traffic can be load balanced among all the equal-cost Layer 3 paths to remote edge devices.

Multidestination egress and ingress traffic can only use the join interface.
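The VLAN distribution among AED candidates can be illustrated with a deliberately simple hash. The slides only say that a hashing function is used; the modulo-over-sorted-names rule below is an assumption chosen for readability, not the actual OTV algorithm, and the function and device names are hypothetical.

```python
def elect_aed(vlan_id, edge_devices):
    """Pick the authoritative edge device for one VLAN.

    Assumed hash: order the candidates by name and index them by
    vlan_id modulo the number of candidates.
    """
    if not edge_devices:
        raise ValueError("at least one edge device is required")
    ordered = sorted(edge_devices)
    return ordered[vlan_id % len(ordered)]


def distribute_vlans(vlans, edge_devices):
    """Map every candidate edge device to the VLANs it is authoritative for."""
    result = {device: [] for device in edge_devices}
    for vlan in vlans:
        result[elect_aed(vlan, edge_devices)].append(vlan)
    return result
```

With two candidates this rule sends even VLANs to one device and odd VLANs to the other, so each extended VLAN has exactly one device forwarding its multidestination traffic while the load is split across both.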

(59)

References

Jeff Apcar. An introduction to VPLS.

http://stor.balios.net/Divers/VPLS_Introduction.ppt

Peter Lam, Patrick Warichet. Simplifying Data

Center Interconnect with Overlay Transport

Virtualization (OTV).

(60)

Ethernet over MPLS Summary

Virtualization characteristics: Ethernet over MPLS

Emulation: Ethernet connection

Type: Abstraction

Subtype: Structural

Scalability: Hardware and software dependent

Technology area: Networking

Subarea: Data plane virtualization

(61)

Virtual Private LAN Summary

Virtualization characteristics: Virtual Private LAN

Emulation: Ethernet bridge

Type: Abstraction

Subtype: Structural

Scalability: Hardware and software dependent

Technology area: Networking

Subarea: Data plane virtualization

Advantages: Layer 2 extension, multipoint connections,

(62)

Overlay Transport Virtualization

Virtualization characteristics: Overlay Transport Virtualization

Emulation: Overlay Ethernet network

Type: Abstraction

Subtype: Structural

Scalability: Hardware and software dependent

Technology area: Networking

Subarea: Data and control plane virtualization

Advantages: Layer 2 extension, multipoint connections, transport
