CSE561
Spring 2001 Venkat Padmanabhan 1
Internet Routing
Venkat Padmanabhan
Microsoft Research
9 April 2001
Outline
• Routing algorithms
– distance-vector (DV)
– link-state (LS)
• Internet Routing
– border gateway protocol (BGP)
• BGP convergence paper
DV Algorithm
• Each router maintains a vector of costs to all destinations as well as routing table
– Initialize neighbors with known cost, others with infinity
• Periodically send copy of distance vector to neighbors
– On reception of a vector, determine whether the path via that neighbor is better and, if so, update the routing table
• If no changes, will converge to shortest paths
– but changes can create loops (count-to-infinity)
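The update step above can be sketched in a few lines of Python. This is a minimal illustration, not a full protocol: the function name `dv_update`, the dictionary layout, and the example costs are all assumptions made for this sketch, and it ignores the case where the cost through the current next hop gets worse.

```python
INFINITY = float("inf")  # cost of a destination we have not heard about


def dv_update(my_vector, next_hop, neighbor, link_cost, neighbor_vector):
    """Merge a neighbor's distance vector into ours; return True if anything changed."""
    changed = False
    for dest, neighbor_cost in neighbor_vector.items():
        candidate = link_cost + neighbor_cost  # cost of reaching dest via this neighbor
        if candidate < my_vector.get(dest, INFINITY):
            my_vector[dest] = candidate
            next_hop[dest] = neighbor
            changed = True
    return changed


# Router A is directly connected to B (cost 1) and C (cost 5).
vector = {"A": 0, "B": 1, "C": 5}
table = {"B": "B", "C": "C"}

# B reports that it reaches C at cost 2 and D at cost 4.
dv_update(vector, table, "B", 1, {"B": 0, "C": 2, "D": 4})
print(vector)  # C is now reached via B at cost 3, D via B at cost 5
print(table)
```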
Alleviating the Problem
• Split horizon
– Router never advertises the cost of a destination back to its next hop (that’s where it learned it from!)
– Solves trivial count-to-infinity problem
• Poison reverse
– go even further: advertise back infinity
– why is this useful?
• Triggered updates
– count to infinity faster!
• However, DV protocols are still subject to the same problem with more complicated topologies
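To make the difference between split horizon and poison reverse concrete, here is a small sketch of how a router might build the vector it sends to one particular neighbor. The function name `advertisement` and the data layout are assumptions for illustration only.

```python
INFINITY = float("inf")


def advertisement(vector, next_hop, to_neighbor, poison_reverse=False):
    """Build the distance vector to send to one neighbor."""
    adv = {}
    for dest, cost in vector.items():
        if next_hop.get(dest) == to_neighbor:
            # This route was learned from the neighbor we are about to talk to.
            if poison_reverse:
                adv[dest] = INFINITY  # explicitly say "do not route via me"
            # plain split horizon: say nothing about this destination
        else:
            adv[dest] = cost
    return adv


# Router A reaches B and D through B.
vector = {"A": 0, "B": 1, "D": 5}
next_hop = {"B": "B", "D": "B"}

print(advertisement(vector, next_hop, "B"))                       # split horizon
print(advertisement(vector, next_hop, "B", poison_reverse=True))  # poison reverse
print(advertisement(vector, next_hop, "C"))                       # unaffected neighbor
```

With poison reverse the neighbor hears an explicit infinity right away instead of having to wait for a stale entry to time out.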
Routing Information Protocol (RIP)
• DV protocol with hop count as metric
– Infinity defined to be 16 hops! Limits network size
– Includes split horizon with poison reverse
• Routers send vectors every 30 seconds
– With triggered updates for link failures
– Time-out in 180 seconds to detect failures
• RIPv1 (RFC1058), RIPv2 (RFC1388)
– v2 includes subnet mask, authentication
• Main advantage: simplicity
Link State Routing
• Same assumptions/goals, but different idea:
– Tell all routers the topology and have each compute best paths
– Two phases:
• Topology dissemination (flooding)
• Shortest-path calculation (Dijkstra’s algorithm)
• Why?
– In DV, routers hide their computation, making it difficult to make good decisions upon change
– With LS, faster convergence and hopefully better stability
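Once flooding has given every router the full topology, each router runs a shortest-path computation on its own. A minimal sketch of Dijkstra's algorithm follows; the graph, its link costs, and the function name `shortest_paths` are made up for illustration.

```python
import heapq


def shortest_paths(graph, source):
    """Dijkstra's algorithm: return (cost to each node, predecessor on the best path)."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry; u was already finalized at a lower cost
        done.add(u)
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, prev


# Hypothetical topology: graph[u][v] is the cost of the link u -> v.
graph = {
    "X": {"A": 1, "C": 4},
    "A": {"X": 1, "B": 2, "C": 1},
    "B": {"A": 2, "D": 3},
    "C": {"X": 4, "A": 1, "D": 5},
    "D": {"B": 3, "C": 5},
}
dist, prev = shortest_paths(graph, "X")
print(dist)
print(prev)
```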
Flooding
• Each router maintains link state database and periodically sends link state packets (LSPs) to its neighbors
– Contain [router, neighbors, costs]
• Each router forwards LSPs not already in its database on all ports except where received
– Each LSP will travel over the same link at most once in each direction
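The forwarding rule can be checked with a small simulation. This is only a sketch: the topology and the function name `flood` are assumptions, a single LSP is flooded, and sequence numbers are left out.

```python
from collections import deque


def flood(links, origin):
    """Flood one LSP from origin; return (nodes holding it, transmissions per directed link)."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    have = {origin}  # nodes whose database already contains the LSP
    sent = {}        # (from, to) -> number of times the LSP crossed that direction
    queue = deque((origin, n) for n in sorted(neighbors[origin]))
    while queue:
        src, dst = queue.popleft()
        sent[(src, dst)] = sent.get((src, dst), 0) + 1
        if dst in have:
            continue  # already in the database: do not forward again
        have.add(dst)
        for n in sorted(neighbors[dst]):
            if n != src:  # all ports except the one it arrived on
                queue.append((dst, n))
    return have, sent


# Hypothetical topology over the nodes X, A, B, C, D.
links = [("X", "A"), ("X", "C"), ("A", "B"), ("A", "C"), ("C", "D"), ("B", "D")]
have, sent = flood(links, "X")
print(sorted(have))
print(max(sent.values()))  # no directed link carries the LSP more than once
```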
Example
• LSP generated by X at T=0
• Nodes become yellow as they receive it
[Figure: the LSP spreading from X through nodes A, B, C and D, shown at T=0, T=1, T=2 and T=3]
Link-State Routing Issues
• Distinguishing between old and new LSPs
– LSPs carry sequence numbers
– Why is this not an issue for DV?
• Scalability
– overhead of flooding, SPF computation
– use hierarchy (OSPF areas, IGP/EGP split)
• Metrics
– LSP can contain multiple metrics
Open Shortest Path First (OSPF)
• Most widely-used Link State protocol today
• Basic link state algorithms plus many features:
– Authentication of routing messages
– Extra hierarchy: partition into routing areas
– Load balancing: multiple equal cost routes
Routing Metrics
• Protocols such as OSPF don’t specify this
• ARPANET history:
– Original metric: instantaneous queue length
– D-SPF (late 70s): delay metric
• okay under light load (delay dominated by static quantities)
• oscillations under heavy load
– HN-SPF (late 80s): normalized “hops” metric
• delay used to estimate link utilization
• link utilization is normalized using a linear transform
• cost of heavily-loaded link ≤ 3 × cost of idle link
Internet Routing
• Main concern: scalability
– size of routing tables
– volume of routing tables
– amount of routing computation
• Tools for scaling
Address Allocation and Aggregation
• IP address indicates topological location
– unlike “flat” Ethernet addresses
• Hosts in a network share a common prefix
– prefix obtained from IANA or ISP
– e.g., 128.32.X.Y for Berkeley
• Address aggregation
– only advertise routes to aggregates
– subnetting
– supernetting (CIDR)
IPv4 Address Formats
Class A: 0 | Network (7 bits) | Host (24 bits)
Class B: 1 0 | Network (14 bits) | Host (16 bits)
Class C: 1 1 0 | Network (21 bits) | Host (8 bits)
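The class of an address is determined by its leading bits, which also fixes where the network part ends. A short sketch, assuming a helper named `address_class` (not part of any library) and handling only the three classes shown here:

```python
import ipaddress


def address_class(addr):
    """Return (class letter, length in bits of the class-plus-network part)."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    if first_octet >> 7 == 0b0:
        return "A", 8    # 1 class bit + 7 network bits
    if first_octet >> 6 == 0b10:
        return "B", 16   # 2 class bits + 14 network bits
    if first_octet >> 5 == 0b110:
        return "C", 24   # 3 class bits + 21 network bits
    return None, None    # other leading bits are not covered here


print(address_class("10.0.0.1"))
print(address_class("128.32.0.1"))
print(address_class("192.4.16.1"))
```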
Subnetting
[Figure: a Class B address (network number, host number) combined with the subnet mask 255.255.255.0 (24 one-bits followed by 8 zero-bits) gives a subnetted address (network number, subnet ID, host ID)]
• Split up one network number into multiple physical networks
• Internal structure isn’t propagated
• Helps allocation efficiency
Subnet Example
[Figure: three subnets joined by routers R1 and R2]
– Subnet number 128.96.34.0, subnet mask 255.255.255.128: H1 (128.96.34.15), R1 (128.96.34.1)
– Subnet number 128.96.34.128, subnet mask 255.255.255.128: R1 (128.96.34.130), R2 (128.96.34.129), H2 (128.96.34.139)
– Subnet number 128.96.33.0, subnet mask 255.255.255.0: R2 (128.96.33.1), H3 (128.96.33.14)
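The subnet number is the bitwise AND of an address and its subnet mask. The sketch below applies that to the three hosts in the example; the helper name `subnet_number` is an assumption for illustration.

```python
import ipaddress


def subnet_number(address, mask):
    """AND the address with the subnet mask to get the subnet number."""
    addr = int(ipaddress.IPv4Address(address))
    m = int(ipaddress.IPv4Address(mask))
    return str(ipaddress.IPv4Address(addr & m))


print(subnet_number("128.96.34.15", "255.255.255.128"))   # H1
print(subnet_number("128.96.34.139", "255.255.255.128"))  # H2
print(subnet_number("128.96.33.14", "255.255.255.0"))     # H3
```

A host compares the subnet number of the destination with its own to decide whether to deliver directly or hand the packet to a router.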
CIDR (Supernetting)
• CIDR: Classless Inter-Domain Routing
• Aggregate advertised network routes
– e.g., ISP has class C addresses 192.4.16 through 192.4.31
– Really like one larger 20 bit address class …
– Advertise as such (network number, prefix length)
– Reduces size of routing tables
• But IP forwarding is more involved
– Based on Longest Matching Prefix operation
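Both halves of this slide can be tried out with Python's standard `ipaddress` module. The sketch first collapses the sixteen class C networks 192.4.16 through 192.4.31 into one prefix, then does a naive linear-scan longest-prefix match. The forwarding table contents and the function name `longest_prefix_match` are assumptions; real routers use trie-like structures rather than a scan.

```python
import ipaddress

# The sixteen class C networks 192.4.16.0/24 ... 192.4.31.0/24.
class_c_blocks = [ipaddress.ip_network(f"192.4.{i}.0/24") for i in range(16, 32)]
aggregate = list(ipaddress.collapse_addresses(class_c_blocks))
print(aggregate)  # a single 20-bit prefix


def longest_prefix_match(table, destination):
    """Return the next hop of the most specific prefix that contains destination."""
    dest = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in table.items():
        net = ipaddress.ip_network(prefix)
        if dest in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Hypothetical forwarding table: an aggregate, a more specific route, and a default.
table = {
    "192.4.16.0/20": "ISP",
    "192.4.18.0/24": "customer",
    "0.0.0.0/0": "default",
}
print(longest_prefix_match(table, "192.4.18.7"))
print(longest_prefix_match(table, "192.4.20.1"))
print(longest_prefix_match(table, "10.1.1.1"))
```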
CIDR Example
[Figure: Corporation X (128.32.2/24) and Corporation Y (128.32.3/24) attach to a regional network through a border gateway that advertises a path to 128.32.2/23]
Hierarchical Routing
• Several levels of hierarchy
• Intra-domain versus inter-domain routing
– break problem down into more manageable pieces
– IGP: RIP, OSPF
– EGP: EGP, IDRP, BGP
• Are RIP and OSPF suitable for inter-domain routing?
Structure of the Internet
[Figure labels: Backbone service provider; Peering point; Large corporation; Small corporation; “Consumer” ISP; You at home; You at work]
Inter-Domain Routing
• Network comprised of many Autonomous Systems (ASs)
– each AS is assigned a number
• Kinds of ASs
– stub AS
– multi-homed AS
– transit AS
• Does the AS number have to be unique?
[Figure: AS graph with ASs 12, 44, 7, 321, 23 and 1123]
Inter-Domain Routing
• Border routers summarize and advertise internal routes to external neighbors and vice-versa
• Border routers apply policy
• Internal routers can use notion of default routes
• Core is “default-free”
[Figure: Autonomous system 1 and Autonomous system 2 with routers R1 through R6, connected via border routers]
[Figure: NSFNET backbone connecting the BARRNET regional (Stanford, Berkeley, PARC), Westnet regional (NCAR, UNM) and MidNet regional (UNL, KU, ISU) networks …]
Exterior Gateway Protocol (EGP)
• First major interdomain routing protocol
• Constrained Internet to tree structure
Border Gateway Protocol (BGP-4)
• EGP used in the Internet backbone today
• Features:
– path vector routing
– incremental updates (except initially)
– application of policy
Path Vectors
• Similar to distance vector, except send entire paths
– e.g. 321 hears [7,12,44]
– stronger avoidance of loops
– multiple BGP speakers per AS
• Shorter paths preferred (modulo policy)
• Reachability only
– announcements & withdrawals
– explicit/implicit withdrawals
– hard to ensure “optimal” routing
[Figure: AS graph with ASs 12, 44, 7, 321, 23 and 1123]
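Carrying the whole path is what makes loop avoidance straightforward: an AS rejects any path that already contains its own number. A minimal sketch using the [7,12,44] example; the function names and the second candidate path are assumptions, and the shortest-path choice ignores policy.

```python
def accept_path(my_as, as_path):
    """Reject any path that already contains our own AS number (it would loop)."""
    return my_as not in as_path


def readvertise(my_as, as_path):
    """Prepend our own AS number before passing the path on."""
    return [my_as] + as_path


heard = [7, 12, 44]                 # what AS 321 hears from AS 7
print(accept_path(321, heard))      # no loop, so 321 may use it
outgoing = readvertise(321, heard)
print(outgoing)
print(accept_path(12, outgoing))    # AS 12 sees itself in the path and rejects it

# Among acceptable paths, prefer the shorter one (modulo policy).
candidates = [[7, 12, 44], [23, 44]]  # second path is hypothetical
best = min((p for p in candidates if accept_path(321, p)), key=len)
print(best)
```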
BGP Policies
• Impact of policies
– which routes to accept and preference
– which routes to advertise
• Policies are generally local to an AS
– business considerations
– cost
– robustness
BGP Policies: Example
[Figure: ISP1, ISP2 and ISP3 with customers C1, C2 and C3]
– ISP2 may not provide transit service for ISP1 and ISP3
– ISP2 may not blindly announce any route it hears from C2
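One common way to express the two rules above is as an import filter and an export filter at ISP2. The sketch below is an assumption-heavy illustration: it treats ISP1 and ISP3 as peers of ISP2 and C2 as its customer, uses a documentation prefix as C2's registered address space, and the names `accept` and `export` are invented here rather than taken from any BGP implementation.

```python
# ISP2's view of its neighbors (assumed relationships).
RELATIONSHIP = {"ISP1": "peer", "ISP3": "peer", "C2": "customer"}

# Prefixes each customer is known to own (hypothetical documentation prefix).
CUSTOMER_PREFIXES = {"C2": {"192.0.2.0/24"}}


def accept(from_neighbor, prefix):
    """Import policy: do not blindly believe what a customer announces."""
    if RELATIONSHIP[from_neighbor] == "customer":
        return prefix in CUSTOMER_PREFIXES.get(from_neighbor, set())
    return True


def export(learned_from, to_neighbor):
    """Export policy: carry traffic for customers, but not between peers."""
    if learned_from == to_neighbor:
        return False
    if RELATIONSHIP[learned_from] == "customer":
        return True  # customer routes go to everyone
    return RELATIONSHIP[to_neighbor] == "customer"  # peer routes go only to customers


print(export("ISP1", "ISP3"))            # no transit between ISP1 and ISP3
print(export("C2", "ISP1"))              # C2's routes are announced to ISP1
print(accept("C2", "192.0.2.0/24"))      # C2's own prefix is accepted
print(accept("C2", "198.51.100.0/24"))   # anything else from C2 is filtered
```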
Impact of Policies – Example #1
• Early Exit / Hot Potato
– “if it’s not for you, bail”
• Combination of best local policies not globally best
• Side-effect: asymmetry
• Inter-domain connectivity cannot be modeled as a simple directed graph!
[Figure: networks A and B]
Impact of Policies: Example #2
• Persistent oscillations
• Example:
– (Varadhan et al. 1996)
– AS1 prefers R2
– AS2 prefers R3
– AS3 prefers R1
• Solution?
[Figure: AS1, AS2, AS3 and R1, R2, R3]
Operation over TCP
• Most routing protocols operate over UDP/IP
• BGP uses TCP
– TCP handles error control; reacts to congestion
– Allows for incremental updates
• Issue: Data vs. Control plane
– Should routing messages receive a higher priority than data?
When should we use BGP?
• Main benefit of BGP is greater control
– makes sense for multi-homed site, transit network
• How about a stub network?
– default/static route will suffice
– several costs to running BGP and advertising a separate prefix
• need BGP router
• additional routing entry in every BGP router
• instability due to transient faults
BGP Convergence
• Paper by Labovitz, Ahuja, Bose, Jahanian
• Fast fail-over of Internet routes is a myth
– can take several minutes
• BGP maintains an alternate path per neighbor
– protocol doesn’t indicate cause of failure
– blindly explores all paths upon failure
Experimental Observations
• Tup & Tshort converge faster than Tdown & Tlong
• No correlation between convergence latency and geographic distance
– topology is the key (# of alternate paths)
• No correlation between convergence latency and congestion
– previous study on routing instability had demonstrated correlation
BGP Convergence Model
• Complete graph: O((n-1)!) time
• Reason:
– monotonically increasing rather than strictly increasing path lengths
• Basic problem:
– nodes advertise new paths as soon as they receive updates
Doing better
• Synchronizing updates
– at most one announce per destination during a MinRouteAdver interval
– ensures that each round only considers paths longer than that in previous rounds
– O(max length path)
• Loop detection
– receiver-side as well as sender-side
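The rate limit on announcements can be pictured as a per-destination timer that holds back all but the most recent path until the interval has passed. This is a rough sketch only: the class name `AdvertisementTimer`, its methods, and the 30-unit interval are illustrative assumptions and do not reproduce any particular BGP implementation.

```python
class AdvertisementTimer:
    """Allow at most one announcement per destination per interval."""

    def __init__(self, min_route_adver):
        self.interval = min_route_adver
        self.last_sent = {}  # destination -> time of the last announcement
        self.pending = {}    # destination -> most recent path being held back

    def announce(self, now, destination, path):
        """Return the path if it may be sent now; otherwise hold it and return None."""
        last = self.last_sent.get(destination)
        if last is None or now - last >= self.interval:
            self.last_sent[destination] = now
            self.pending.pop(destination, None)
            return path
        self.pending[destination] = path  # overwrite any older held path
        return None

    def flush(self, now):
        """Release held announcements whose interval has expired."""
        ready = {}
        for dest in list(self.pending):
            if now - self.last_sent[dest] >= self.interval:
                ready[dest] = self.pending.pop(dest)
                self.last_sent[dest] = now
        return ready


timer = AdvertisementTimer(min_route_adver=30)  # interval value is an assumption
first = timer.announce(0, "d", [1, 2])          # sent immediately
second = timer.announce(5, "d", [1, 3, 2])      # held back
third = timer.announce(10, "d", [1, 4, 3, 2])   # replaces the held path
early = timer.flush(20)                         # interval not yet over
late = timer.flush(30)                          # only the latest path goes out
print(first, second, third, early, late)
```

Only the last path chosen during the interval is ever announced, so neighbors never see the intermediate choices.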
Doing still better
• BGP-CT
– “cause tag” indicates the reason that a route was withdrawn
– can tell if an alternate route is also affected by a failure