2.3 Video as an Emerging Application

2.3.2 Infrastructures for Video Delivery

2.3.2.3 Redirection Techniques

As CDNs became popular, the redirection techniques utilised in historic web caches were no longer scalable across many thousands of users. For example, using the proxy gateway approach would be infeasible given the traffic load. It would also be difficult to manage connections to multiple CDNs simultaneously. Clearly, redirection techniques need to be scalable and support the simultaneous use of a number of different CDNs. In the modern Internet, a number of methods are used to achieve this redirection; these are described in the remainder of this section.

HTTP

In the context of a request made over HTTP, the HTTP/1.0 specification [63] defines a number of response codes that can be used to redirect a client to an alternative location. For example, the 301 code is defined as Moved Permanently. This indicates that content is no longer available at the current location, and all future requests should be directed to a given URI. To supplement this, the 302 code was originally described as Moved Temporarily. Despite this specification, most browser implementations would typically modify the subsequent request method to a GET, regardless of the previous method used.

In order to rectify the ambiguity in the use of 301 and 302, three further codes were added in the HTTP/1.1 specification [92]. Specifically, the 303: See Other and 307: Temporary Redirect codes were added to explicitly define this behaviour, with 303 mandating a GET request, and 307 requiring the preservation of the original request type. Similarly, 308 was added to signify a Permanent Redirect.
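
The effect of each status code on the follow-up request can be summarised in a few lines of Python. This is only an illustrative sketch: the function name and the `legacy_browser` flag are assumptions introduced here, with the flag modelling the browser behaviour for 301 and 302 described above.

```python
from http import HTTPStatus

# 307 and 308 oblige the client to repeat the request with the same method.
PRESERVE_METHOD = {HTTPStatus.TEMPORARY_REDIRECT, HTTPStatus.PERMANENT_REDIRECT}


def method_after_redirect(status: int, method: str, legacy_browser: bool = True) -> str:
    """Return the method a client would use for the redirected request."""
    code = HTTPStatus(status)
    if code == HTTPStatus.SEE_OTHER:
        # 303: the follow-up request is always a GET.
        return "GET"
    if code in PRESERVE_METHOD:
        # 307/308: the original method is preserved.
        return method
    if code in (HTTPStatus.MOVED_PERMANENTLY, HTTPStatus.FOUND):
        # 301/302: ambiguous in practice; many browsers switched to GET.
        return "GET" if legacy_browser else method
    raise ValueError(f"{status} is not a redirect handled by this sketch")
```

For instance, a POST answered with 302 becomes a GET under the legacy behaviour, whereas the same POST answered with 307 is repeated as a POST.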

There is a clear limitation in using these HTTP codes to modify requests for content: it requires the participation of the end servers, which must know where the content is now located. In a distributed system, where the content can be located in multiple locations, dynamically rewriting the target of a redirect on a per-client basis is not a particularly scalable solution. Furthermore, if the content is located in multiple alternative locations, as can be the case with a video stream, successive redirects will have to be issued to the client. This requires inter-server collaboration, and would likely require large amounts of messaging overhead and coordination. It would also be infeasible if the servers belong to different organisations or delivery networks.

DNS

DNS resolution is a core part of any connection establishment process initiated over the Internet. Before a client can request content from a remote server, it will seek to resolve the hostname in the given URL to an IP address using a lookup to a DNS server. In the case of redirection, this process can also be used to direct requests to a topographically closer cache [53]. To achieve this, the DNS server will inspect the source of a request, and associate this with a topographical region. A resolver will then return a response to the client, the contents of which will instruct the client to request content from a nearby edge cache.
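
The mapping step described above can be sketched as a simple prefix lookup. The prefixes, region names and cache addresses below are hypothetical values drawn from the IP ranges reserved for documentation; a real deployment would derive them from its own measurements and topology data.

```python
import ipaddress

# Hypothetical mapping from requester prefixes to regions.
REGION_BY_PREFIX = {
    ipaddress.ip_network("192.0.2.0/24"): "region-a",
    ipaddress.ip_network("198.51.100.0/24"): "region-b",
}
# Hypothetical edge cache address per region, plus a fallback origin server.
EDGE_BY_REGION = {"region-a": "203.0.113.10", "region-b": "203.0.113.20"}
ORIGIN = "203.0.113.1"


def resolve(source_ip: str) -> str:
    """Return the address the DNS answer would contain for this requester."""
    address = ipaddress.ip_address(source_ip)
    for prefix, region in REGION_BY_PREFIX.items():
        if address in prefix:
            return EDGE_BY_REGION[region]
    # Unknown sources are sent to the origin rather than to an edge cache.
    return ORIGIN
```

Note that the source address seen by the DNS server is that of whichever resolver forwarded the query, which is not necessarily close to the client itself.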

This method of redirection is by far the most commonly used technique in the Internet today. However, when used in conjunction with a traditional web cache, it can result in inefficiencies. As these caches rely on the URL as the unique identifier for a piece of stored content, each object will be treated individually. However, as the DNS resolution may not be the same between clients, the same piece of content can be delivered to different clients, and thus stored under different identifiers within a cache. This leads to cache duplication and wasted disk utilisation.
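
The duplication can be illustrated with a toy cache keyed on the request URL. The sketch assumes that the identifier seen by the cache includes the surrogate address each client was resolved to; the addresses and the segment path are hypothetical.

```python
def cache_store(cache: dict, url: str, body: bytes) -> None:
    """Store a response under its full URL, as a traditional web cache would."""
    cache.setdefault(url, body)


cache = {}
segment = b"x" * 1000  # stand-in for one video segment

# Two clients are resolved to different surrogates for the same content.
cache_store(cache, "http://192.0.2.10/video/seg1.ts", segment)
cache_store(cache, "http://192.0.2.20/video/seg1.ts", segment)

stored_bytes = sum(len(body) for body in cache.values())
unique_bytes = sum(len(body) for body in set(cache.values()))
```

Here the cache holds two entries and twice the bytes of the single unique object it actually contains.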

Under normal circumstances, a DNS resolver should always resolve clients located within the same network to the same surrogate server. However, many of these DNS resolvers also act as a type of load balancer, ensuring requests are distributed between a number of surrogate servers, each of which evidently has its own address. DNS redirection can also be problematic in cases where a user utilises a third party DNS resolver, rather than one provided by the ISP; this may result in ignorance of content located within an ISP's network [53], leading to the inefficient delivery of content.

The usefulness of DNS-based redirection is also diminished when the client itself caches the DNS response [152]. This caching can result in a slower response to failures and changes in demand, with the client not aware of changes in the availability or location of content. To address this, content providers use low time to live (TTL) values in their DNS entries. This in turn results in frequent DNS cache misses, adding latency to the request process.
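
The trade-off between stale answers and cache misses can be made concrete with a small simulation. Everything here is an assumption chosen for illustration: the `DnsCache` class, the hostname, the addresses, a lookup every 10 seconds for one minute, and an authoritative answer that changes after 30 seconds.

```python
class DnsCache:
    """Minimal client-side DNS cache that honours a fixed TTL (in seconds)."""

    def __init__(self, ttl: int):
        self.ttl = ttl
        self.entries = {}  # name -> (address, expiry time)
        self.misses = 0

    def lookup(self, name: str, now: int, resolve) -> str:
        entry = self.entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]  # cache hit: answer may be out of date
        self.misses += 1     # cache miss: costs a full resolution
        address = resolve(name, now)
        self.entries[name] = (address, now + self.ttl)
        return address


def authoritative(name: str, now: int) -> str:
    # The content moves to a different surrogate after 30 seconds.
    return "192.0.2.10" if now < 30 else "192.0.2.20"


def simulate(ttl: int) -> tuple:
    """Return (cache misses, stale answers) for the given TTL."""
    cache = DnsCache(ttl)
    stale = 0
    for now in range(0, 61, 10):
        answer = cache.lookup("video.example", now, authoritative)
        if answer != authoritative("video.example", now):
            stale += 1
    return cache.misses, stale
```

Under these assumptions, a TTL of 300 seconds gives one miss but four stale answers, whereas a TTL of 5 seconds gives no stale answers at the cost of a miss on all seven lookups.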

Transport Layer

Another approach to request routing is to do so at the transport layer. Typically, this requires the introduction of a request routing middlebox or appliance, to be used as an initial gateway. This intermediary will then select a surrogate from a connected group, and facilitate a connection between it and the client. Once this connection is made, the surrogate will typically deliver the content to the user without traversing the middlebox again. This enables the maximum possible throughput to be used, without the performance penalty of traversing the middlebox on the throughput-intensive return flow of traffic.
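
A minimal sketch of the selection step follows. The text does not specify how the intermediary chooses a surrogate, so a least-connections policy is assumed here purely for illustration; the `Surrogate` class and the `route_request` function are likewise hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Surrogate:
    address: str
    active_connections: int = 0


def select_surrogate(group: list) -> Surrogate:
    """Pick the surrogate with the fewest active connections (assumed policy)."""
    return min(group, key=lambda surrogate: surrogate.active_connections)


def route_request(group: list, client: str) -> dict:
    """Bind a client to a surrogate; only this setup step involves the middlebox."""
    chosen = select_surrogate(group)
    chosen.active_connections += 1
    # The content itself flows from the surrogate straight back to the client.
    return {"client": client, "surrogate": chosen.address, "return_path": "direct"}
```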

This approach is often used in conjunction with another approach to ensure that requests arrive at the appliance. It can therefore be seen as a complementary technique: it offers fine-grained control and redirection, but only once a request is routed appropriately to it. Using an intermediary appliance also requires purchasing, maintaining and housing said equipment. Changing the behaviour of a device can also be a time-consuming process, especially as there is no standard technique for interacting with devices across vendors.

Anycast

This method utilises the behaviour of IP packet routing to select the nearest possible surrogate server. This is typically achieved by using routing protocols (such as the Border Gateway Protocol) to announce the same IP address from many different places within the Internet. When a request for content is sent from a client, the nearest router will automatically forward the packet to the nearest surrogate server, which should theoretically provide the best service to the client.

This technique requires a deterministic approach; identical requests can be handled in different ways if the routing table differs in any way. For a connection-orientated protocol, such as HTTP, this approach can lead to clients attempting to connect to a different surrogate during a long-lived connection, such as that found in video playback. As these surrogates do not share the same connection state, reconnects will occur, which can disrupt availability and thus playback.
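
The effect of a routing change on an established connection can be sketched as follows. The path costs, surrogate names and connection record are hypothetical, and the single "lowest cost wins" rule is a deliberate simplification of real route selection.

```python
def nearest_surrogate(routes: dict) -> str:
    """Pick the surrogate whose announcement has the lowest path cost."""
    return min(routes, key=routes.get)


# Path cost from the client's router to each announcement of the same address.
routes = {"surrogate-a": 3, "surrogate-b": 5}
before = nearest_surrogate(routes)

# Connection state exists only on the surrogate that accepted the connection.
connection_state = {before: {"client": "198.51.100.7", "bytes_sent": 1_000_000}}

# A routing change makes the other announcement the closer one mid-playback.
routes["surrogate-a"] = 7
after = nearest_surrogate(routes)

# Packets now reach a surrogate with no record of the connection.
needs_reconnect = after not in connection_state
```

In this sketch the client starts on surrogate-a, is moved to surrogate-b by the route change, and must reconnect because surrogate-b holds no state for the session.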

The same deterministic nature of Anycast also has implications for the content catalogues stored on a surrogate server, as not all surrogates will replicate the same set of content. This can result in inefficient behaviour, and lead to increased and/or variable content delivery times.

In document OpenCache:a content delivery platform for the modern internet (Page 65-68)