Content Delivery Networks
Hamid R. Rabiee
Mostafa Salehi, Fatemeh Dabiran, Hoda Ayatollahi
What are Content Distribution Networks
² A Content Distribution Network or Content Delivery Network (CDN) is a system of computers containing copies of data placed at various nodes of a network.
² When properly designed and implemented, a CDN can improve access to the data it caches by increasing access bandwidth and redundancy and reducing access latency.
² Data content types often cached in CDNs include web objects,
downloadable objects (media files, software, documents), applications, multimedia streams, and database queries.
² The content (information) providers are the CDN customers.
Content replication (mirror)
² CDN company installs hundreds of CDN servers throughout Internet.
² For example: in lower-tier ISPs, close to users.
² CDN replicates its customers’ content in CDN servers. When provider updates content, CDN
Content Distribution Network: idea
origin server in North America
CDN distribution node
CDN Categories
²
Network Infrastructure:
²
Single ISP
²
Overlay networks
²
Enterprise premise
²
Content types:
²
Static images and texts
²
Multimedia content: audio and video streams
²
Dynamic HTML and XML pages
²
Customers:
²
Content providers
CDN: Architectural Layout
.1
Origin server informs RR of Content Availability.
.2
Content Pushed to Distribution System. Request Request Routing(RR) Routing(RR) Distribution Distribution Node Node Origin Origin Server Server 1 2a Client Client Mirror Mirror Server Server 2b 3 5 6 4
Content Routing Principle
(a.k.a. Content Distribution Network)
S ISP Backbone ISP IX IX S S Site S ISP S S S ISP S S Backbone ISP Backbone ISP Hosting Center Hosting Center Sites routing requests ² CDN creates a “map”, indicating distances from leaf ISPs and CDN
nodes
² when query arrives at authoritative DNS server:
² server determines ISP from which query originates ² uses “map” to
determine best CDN server
Content Routing Principle
(a.k.a. Content Distribution Network)
S ISP Backbone ISP IX IX S S Site ISP ISP Backbone ISP Backbone ISP Hosting Center Hosting Center CS CS CS CS CS
Content Origin (CO) here at Origin Server Content Servers (CSs) distributed throughout the Internet OS
Content Routing Principle
(a.k.a. Content Distribution Network)
S ISP Backbone ISP IX IX S S Site S ISP S S S ISP S S Backbone ISP Backbone ISP Hosting Center Hosting Center Sites CS CS CS CS CS
Content is served
from content
servers nearer to
the client
C C OSTwo basic types of CDN: cached and pushed
S ISP Backbone ISP IX IX S S Site S ISP S S S ISP S S Backbone ISP Backbone ISP Hosting Center Hosting Center Sites CS CS CS CS CS OSCached CDN
S ISP Backbone ISP IX IX S S Site S ISP S S S ISP S S Backbone ISP Backbone ISP Hosting Center Hosting Center Sites CS CS CS CS CS 1. Client requests content. C C OSCached CDN
S ISP Backbone ISP IX IX S S Site S ISP S S S ISP S S Backbone ISP Backbone ISP Hosting Center Hosting Center Sites CS CS CS CS CS 1. Client requests content. 2. CS checks cache, if miss gets content from origin server.Cached CDN
S ISP Backbone ISP IX IX S S Site S ISP S S S ISP S S Backbone ISP Backbone ISP Hosting Center Hosting Center Sites CS CS CS CS CS 1. Client requests content. 2. CS checks cache, if miss gets content from origin server. 3. CS cachescontent, delivers to client.
C C
Cached CDN
S ISP Backbone ISP IX IX S S Site S ISP S S S ISP S S Backbone ISP Backbone ISP Hosting Center Hosting Center Sites CS CS CS CS CS 1. Client requests content. 2. CS checks cache, if miss gets content from origin server. 3. CS cachescontent, delivers to client.
4. Delivers content out of cache on
subsequent requests.
Pushed CDN
S ISP Backbone ISP IX IX S S Site S ISP S S S ISP S S Backbone ISP Backbone ISP Hosting Center Hosting Center Sites CS CS CS CS CS 1. Origin Serverpushes content out to all CSs.
C
OS
Pushed CDN
S ISP Backbone ISP IX IX S S Site S ISP S S S ISP S S Backbone ISP Backbone ISP Hosting Center Hosting Center Sites CS CS CS CS CS 1. Origin Serverpushes content out to all CSs.
2. Request served from CSs.
CDN benefits
²
Solve network bandwidth bottleneck
²
Content served closer to client
² Less latency, better performance
²
Load spread over multiple distributed CSs
² More robust (to ISP failure as well as other failures) ² Handle flashes better (load spread over ISPs)
Content distribution services (CDS): business model
² Benefits for Internet Service Providers (ISPs):
² customer benefits: provision of high quality content
² ISP benefit: well-defined relationship with only a few CDSs ² CDS’s are huge bandwidth consumers themselves!
² Benefits for content-owners
² owner benefits: focus only on the production of content ² CDS takes care of access-rights, format-conversions, etc
² Necessary (to be developed) standards
² resource negotiation/mgt protocol ² content-adaptation control protocol
CDN costs and limitations
²
Cached CDNs can’t deal with dynamic/personalized
content
² More and more content is dynamic ² “Classic” CDNs limited to images
²
Managing content distribution is non-trivial
² Tension between content lifetimes and cache performance ² Dynamic cache invalidation
Content delivery service providers
² We have commercial and non-commercial CDNs.
² The commercial CDNs are centralized and are client/server based. For example:
² Advection.NET/ Akamai Technologies/ Amazon CloudFront/ AT&T/ Azion/ Bitgravity/ CDNetworks/ Cotendo
² The non-commercial CDNs that are created mostly based on P2P architecture to reduce the cost. For example:
² Globule/ Coral Content Distribution Network/ coBlitz/ FreeCast/ MediaBlog/ PeerCast/ PPLive/ PPStream/ Xunlei
Layered architecture of a CDN: Basic fabric layer
² Basic fabric layer: this layer consists of basic resources for creating CDNs such as cluster servers that are connected with high bandwidth networks. The distributed application is run on these cluster servers for indexing and managing content in a distribution manner.
Layered architecture of a CDN:
Communication and connectivity layer
² Communication and connectivity layer: this layer consists of basic internet protocols like TCP/UDP and etc as well as CDN specified protocols such as internet cache protocols (ICP), hypertext caching protocol (HTCP) and cluster array routing protocols (CARP). Besides, there are some security protocols for authentication and authorization between cluster servers like the Public Key Infrastructure (PKI) and SSL. There are also some kinds of applications that exist for indexing data to enhance searching contents.
Layered architecture of a CDN: CDN layer
² CDN layer: CDN layer consists of sub-layers like CDN services, types of CDN and a kind of content that could be said that these are kinds of CDN servers. In CDN layer redirecting users to the CDN server, load balancing and managing user communications for sharing resources could be found. In CDNs multimedia content (i.e. text, audio and video) could be shared for users.
Layered architecture of a CDN: End-user layer
² End-user layer: in last layer, we have users that send their requested data from web browsers or multimedia programs to received content from CDN servers.
Main Components of building a CDN
²
Content distribution
² Placing the content to the devices
²
Request routing
² Steer users to a delivery node that is close
²
Content delivery
² Protocol processing, access control, QoS mechanisms
Content Distribution
² Goal: position content objects into delivery devices
² Different content types use different techniques
² Static images and texts: pulled & cached, or pushed ² Multimedia contents: usually pre-positioned
Distribution Mechanisms
² HTTP request for pulling
² Example: standard HTTP reverse proxy
² FTP of tar files
² Some equipment vendors use this technique
² Rate limited tree-form replication
Distribution Mechanisms using Multicast
² Application-level reliable multicast
² Example: Inktomi’s Fast-Forward
² Unreliable IP multicast with file-level error correction
² Example: Digital Fountain, multicast-ftp
² Unreliable IP multicast
Content Consistency Mechanisms
² Expiration times or TTL
² Renaming in the HTML file
² Web Cache Invalidation Protocol (WCIP)
² Nodes receive invalidations when objects change ² Objects are organized into channels
Request Routing
² Goal: steer the client such that it fetches the content from a close node
² Methods
² DNS selection ² HTTP redirection
DNS selection
² Basic idea: xyz.com’s Name Server (NS) returns node close to client
² How to become xyz.com’s NS?
² Rewrite URLs (aka Akamizer)
² Take a subdomain cdn.xyz.com and put all content there
² Accuracy limited to client’s name server
² Only suitable for ISP or overlay networks
HTTP Redirection
² Basic idea: web server tells client to go somewhere else
² Returns “302 redirect … 1.2.4.5/index.html…”
² Mostly used for multimedia objects
² These objects are usually put together in an index file (.sml or .asx) and clients fetch the index file via HTTP before streaming
² Accuracy is at individual client level
Transparent Interception
² Router and switch along the request path can send the request elsewhere
² Mostly used for distributed data centers front-ended with L7 switches
Algorithms for Request Routing
² Map-based
² Create a map of the Internet based on AS domains, pick the node with the shortest hop count to client
² Or, set up coverage zones mapping a node to a collection of subnets
² Racing-based
² Let the delivery nodes all race to the client with A-records ² Winner is selected by client automatically
Interaction between Content Distribution and Request
Routing
² Don’t route request to a node that doesn’t have the content!
² Particularly important for large streaming contents
² Such content are usually pre-positioned to ensure high-bandwidth playbacks
Content Delivery
² Goal: serve content to each client at desired quality of service
² Supported protocols
² HTTP
² Microsoft MMS
² Open standard RTP/RTSP ² RealNetworks RTP/RTSP
QoS of Content Delivery
² Server QoS
² Server needs to make sure it has enough CPU and disk to service the stream at specified bit rate
² Network QoS
² Interoperate with routers via DiffServ bits
² Coordination with request router
Resource Accounting
² Mining the log files
² Log file aggregation: all device sending log files to a central location ² Local mining: analyzing the log file at each delivery device
² Real-time statistics
² Real-time statistics on throughput/latency based on domain, content type or any HTTP header
Multicasting
² IP multicasting is efficient but Not deployed
² Application Level Multicasting (ALM)
² End systems:
² Intermediate node
² Leaf node
² ALM implementation approaches:
CDN for media streaming
² Works by pushing media to some servers. Each server serves a client that is placed in its specified domain.
² Each server contains lots of media files, the bandwidth for each file should be limited.
² Since the cost of maintaining CDN servers is too high, the user should pay a fee to access media files.
² One method to control the cost is to place video with different
rates/qualities in the network. Therefore, the users stream the video based on their available bandwidth.
CDN for media streaming (cont.)
² Some papers propose other methods like stream video with different rates by using techniques like Multiple Description Coding (MDC) or scalable video coding (SVC) to provide such a service to users with different
capabilities.
² In MDC the video is encoded into several descriptions each of which has the same importance. When video is encoded into M descriptions the combination of any different description is decodable. The more
A CDN-P2P for Multimedia Streaming
² P2P: Challenge delivering with constant quality: ² Churn nature of peers
² Limited resource
² CDN: Scalability problem ² Costly
² Maintenance problems
CDN-P2P: Could be an efficient approach for delivering video
CDN-P2P Protocol:
OMNI: Overlay Multicast Network Infrastructure [1]
² Two tire architecture:
² MSNs (Multicast service nodes) ² end-hosts (or clients)
² Using tree:
² Between MSNs ² Under MSNs ² Resilience problem
CDN-P2P Protocol:
A Cost-effective Hybrid P2P-CDN Architecture [2]
Different stages of a media data distribution process ² Consider One CDN server, but it is
applicable to all CDN servers
² Create virtual peer
² When the seed distribute completely in the network, it is released
² Using mesh under CDNs
² CDN servers are NOT participate in other meshes
CDN-P2P Protocol: ChinaCahe [3]
²
A commercial approach
²
Servers is deployed in
Multiple tree
²
Using mesh under CDNs
²
Support Live video
streaming
²
There is no connection
between meshes
Comparison between protocols’ properties
Name Scalability Load on server Startup Delay interconnection between P2P networks Locality awareness General CDN streaming system Costly High Minimum No peer No
General P2P streaming system Yes Low Long One P2P network No OMNI [1] Yes Trade off between
load and delay Shorter No No A cost-effective hybrid architecture [2] Yes Trade off between
load and delay Shorter No No ChinaCache [3] Yes Trade off between
Reference
² [1] S. Banerjee, C. Kommareddy, K. Kar, B. Bhattacharjee, and S. Khuller, ―OMNI: an efficient overlay multicast infrastructure for real-time applications, Computer Networks, vol. 50, 2006, pp. 826–841.
² [2] D. Xu, S.S. Kulkarni, C. Rosenberg, and H.K. Chai, ―Analysis of a CDN–P2P hybrid architecture for cost-effective streaming media distribution, Multimedia Systems, vol. 11, 2006, pp. 383–399.
² [3] X. Liu, H. Yin, and C. Lin, ―A Novel and High-Quality Measurement Study of Commercial CDN-P2P Live Streaming, WRI International Conference on Communications and Mobile Computing, 2009.
CMC'09., 2009, pp. 325–329.
² [4] A.M. Pathan and R. Buyya, ―A taxonomy and survey of content delivery networks, Grid Computing and Distributed Systems (GRIDS) Laboratory, University of Melbourne, Parkville, Australia, vol. 148, 2006.
² [5] Yasser Aeyyedi, MS Thesis, Sharif University of Technology-Kish Branch, 2010.