Extensions - On the design of efficient caching systems

4.3.1 Asymmetric and multicast content routing

In the base hash-routing scheme described above, request and content paths are symmetric, i.e., in case of a cache miss, the content (on its way back from the origin) is forwarded first to the cache and then to the requester, following the reverse path of the request. While this routing scheme is simple to implement and manage, the path stretch resulting from request and content routing through off-path caching nodes may lead to increased latency and link load.

To mitigate this, we consider two alternative content routing schemes in addition to the base one proposed above, which we refer to as Symmetric (see Fig. 4.1a). These schemes, which we name Asymmetric and Multicast respectively, differ from the symmetric scheme only in how they deliver contents entering the network as a result of a cache miss. The asymmetric routing scheme (see Fig. 4.1b) always routes contents through the shortest path. The content is cached in the responsible cache only if that cache happens to be on the shortest path between the origin and the requester; otherwise it is not cached at all. In the multicast routing scheme (see Fig. 4.1c), when the egress PRF receives a content, it multicasts it to both the authoritative cache and the requester. If the authoritative cache is on the shortest path between content origin and requester, then the symmetric, asymmetric and multicast schemes operate identically.
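The difference between the three schemes after a cache miss can be illustrated with a minimal sketch. It is not the implementation used in this work: the toy topology, the node names (E for the egress PRF, C for the authoritative cache, R for the requester, A for an intermediate router) and the helper names `shortest_path`, `links` and `deliver` are all assumptions made for illustration, and the graph is assumed to be connected and unweighted.

```python
from collections import deque

def shortest_path(adj, src, dst):
    """Breadth-first search on an unweighted, connected graph (assumption)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in adj[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    path, node = [], dst
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]

def links(path):
    """Set of undirected links traversed by a path."""
    return {frozenset(hop) for hop in zip(path, path[1:])}

def deliver(adj, scheme, egress, cache, requester):
    """Return (link traversals used to deliver the content, whether it gets cached)."""
    direct = shortest_path(adj, egress, requester)
    if scheme == "asymmetric":
        # Shortest path only; cached only if the cache lies on that path.
        return len(direct) - 1, cache in direct
    to_cache = shortest_path(adj, egress, cache)
    if scheme == "symmetric":
        # Egress -> cache -> requester, reversing the request path.
        from_cache = shortest_path(adj, cache, requester)
        return (len(to_cache) - 1) + (len(from_cache) - 1), True
    if scheme == "multicast":
        # One copy per link of the union of the two delivery paths.
        return len(links(to_cache) | links(direct)), True
    raise ValueError(f"unknown scheme: {scheme}")

# Hypothetical topology: the cache C hangs one hop off the E-A-R shortest path.
TOPOLOGY = {"E": ["A"], "A": ["E", "C", "R"], "C": ["A"], "R": ["A"]}
for scheme in ("symmetric", "asymmetric", "multicast"):
    print(scheme, deliver(TOPOLOGY, scheme, "E", "C", "R"))
```

On this toy topology the symmetric scheme uses four link traversals, multicast three and asymmetric two, and only the asymmetric scheme leaves the content uncached. If the authoritative cache is moved to node A, which is on the shortest path, all three schemes return the same result.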

All three content routing options have both advantages and disadvantages. The choice among them, however, gives network operators a knob to tune performance.

Both multicast and asymmetric routing reduce user-perceived latency because contents fetched from the origin are delivered to the requesting user without any detour. However, they may not be applicable to future Internet architectures where only symmetric request/response paths are allowed, e.g., CCN and NDN. This is however likely to change in the near future, since PARC (currently leading the CCN architecture development) is considering supporting asymmetric paths.

In addition, asymmetric routing reduces link load since contents (which are much larger than requests) are always delivered through the shortest path. However, CF instances are populated with contents entering the domain only if they happen to be on the shortest path from origin to requester. This would not be a problem if content popularity varies slowly. On the contrary, it would actually increase the cache hit ratio, since it would reduce the impact of one-timers. If, instead, request patterns exhibit strong temporal locality, a content item may need to be requested several times before it is actually cached (especially if the responsible cache is located on an underutilised path), hence degrading caching performance.

Multicast routing achieves the same latency as asymmetric routing and, as in symmetric routing, requires only one request for a content to be cached. However, since a content item must be forwarded to the authoritative cache at each cache miss, it leads to a greater link load than asymmetric routing, especially when the paths to the cache and to the requester have limited overlap.

4.3.2 Hybrid caching

Another proposed extension to the base design consists of allocating part of the overall caching space to operate autonomously, i.e., to cache any content indiscriminately, regardless of the content-cache mapping. This could be implemented in two different ways, depending on whether CF and PRF entities are co-located or not. If PRF and CF are co-located, this extension can be implemented by allocating a static fraction of the caching space at CF nodes to cache any content for which the CF is not responsible. Otherwise, if PRF and CF are not co-located, a small caching space can be deployed within the PRF, and CF entities can be used to cache only content items for which they are responsible.

This can provide two main advantages. First, it allows a small number of very popular contents to be replicated in multiple nodes, instead of just one, hence possibly reducing overall latency and link load. Second, as extensively discussed in Sec. 3.3.5, a small frontend cache achieves better load balancing across backend caching nodes. In fact, while base hash-routing evens out localised traffic hotspots by spreading traffic originated from each requester across caches, any peak in demand for a specific content item will always be served by the same cache. By caching a small fraction of very popular contents in multiple caching nodes, the system is made robust to variations in content popularity. This also makes the system more robust to adversarial workloads targeted at overloading a specific caching node, as shown by Fan et al. [57]. Additionally, in case asymmetric content routing is adopted, this extension ensures that a requested content is cached the first time it is requested even if the responsible cache is not on the shortest path between the origin and the requester.

Performance can be tuned by selecting what fraction of caching space to dedicate to uncoordinated caching. Increasing the uncoordinated caching space leads to greater robustness towards flash-crowd events and lower latency in accessing the most popular contents. However, it reduces the number of distinct items cacheable in the network, hence affecting the overall cache hit ratio (possibly leading to a greater overall latency) and robustness against localised spikes in traffic. It should be noted that utilising uncoordinated caching space makes the system behave as a hybrid between edge caching and base hash-routing. At one extreme, where all caching space is operated in a coordinated manner, the system is a base hash-routing one. At the other extreme, the system behaves as a pure edge caching system.
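For the co-located case, the split between coordinated and uncoordinated caching space can be sketched as follows. This is an illustrative sketch only: the class names `LRU` and `HybridNode`, the `uncoordinated_fraction` parameter, the `responsible` predicate and the choice of LRU replacement in both partitions are assumptions, not details taken from the design above.

```python
from collections import OrderedDict

class LRU:
    """Fixed-capacity cache with least-recently-used eviction."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return False
        self.items.move_to_end(key)  # mark as most recently used
        return True

    def put(self, key):
        if self.capacity <= 0:
            return  # partition disabled
        self.items[key] = True
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used

class HybridNode:
    """Caching node whose space is statically split into two partitions."""
    def __init__(self, total_slots, uncoordinated_fraction, responsible):
        uncoordinated_slots = int(total_slots * uncoordinated_fraction)
        self.uncoordinated = LRU(uncoordinated_slots)
        self.coordinated = LRU(total_slots - uncoordinated_slots)
        # Hypothetical predicate: True if the hash function maps the content here.
        self.responsible = responsible

    def _partition(self, content):
        return self.coordinated if self.responsible(content) else self.uncoordinated

    def lookup(self, content):
        return self._partition(content).get(content)

    def insert(self, content):
        self._partition(content).put(content)

# Toy usage: 10 slots, 20% uncoordinated; even identifiers "hash" to this node.
node = HybridNode(10, 0.2, responsible=lambda content: content % 2 == 0)
for content in (1, 3, 5, 2):
    node.insert(content)
```

Setting `uncoordinated_fraction` to 0 gives the base hash-routing behaviour at this node, while setting it to 1 makes the node cache indiscriminately, as in edge caching.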

4.3.3 Multiple content replication

Figure 4.2: Single and multiple content replication. Panels: (a) Single replica; (b) Multiple replicas.

The base hash-routing design allows a content item to be cached at most once in a network domain, hence maximising the efficiency of caching space. However, in very large topologies, this may result in high path stretch, possibly leading to high latency (see, for example, Fig. 4.2a).

To address this issue, we propose an extension where the content-to-cache hash function maps to k distinct nodes instead of just one. As a result, multiple nodes (ideally well distributed over the network) are responsible for each content (see Fig. 4.2b). This ensures that the worst case path stretch to reach a responsible cache is reduced.
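One possible way to map a content to k distinct nodes is rendezvous (highest-random-weight) hashing, sketched below. This is a substituted technique chosen for illustration, not necessarily the hash function intended here; the function name `responsible_nodes` and the node and content identifiers are hypothetical. Note that this sketch makes no attempt to spread the k nodes evenly over the topology, which the extension ideally requires.

```python
import hashlib

def responsible_nodes(content_id, nodes, k):
    """Return k distinct nodes responsible for content_id.

    Each node is scored by hashing (node, content) together, and the k
    highest scores win. Every PRF computing this with the same node set
    obtains the same answer, without any coordination.
    """
    def score(node):
        digest = hashlib.sha256(f"{node}|{content_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(nodes, key=score, reverse=True)[:k]

# Hypothetical caching nodes and content name.
CACHES = [f"cf{i}" for i in range(8)]
replicas = responsible_nodes("video/42", CACHES, k=3)
```

With k = 1 the function returns a single authoritative cache, as in the base design, and that cache is always the first of the nodes returned for any larger k.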

When a PRF processes a content request and computes the content hash, it resolves the content to k distinct caching nodes. The PRF forwards the request to the closest of the k CFs responsible for the content. In case of a cache miss, two approaches can be adopted:

1. The CF forwards the request directly to the content origin.

2. The CF forwards the request towards the original source through (all or a subset of) the other CFs responsible for the content, if this can be done with limited path stretch.
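The request handling just described can be sketched as follows, assuming hop distances between nodes are known. The function names `closest_cf` and `miss_route`, the `max_detour` bound and the greedy rule used to decide which other CFs are visited are hypothetical choices made for illustration; the text above does not prescribe how "limited path stretch" is evaluated.

```python
def closest_cf(prf, responsible, dist):
    """Pick the responsible CF nearest to the PRF (by hop distance)."""
    return min(responsible, key=lambda cf: dist[prf][cf])

def miss_route(cf, origin, others, dist, max_detour=0):
    """Nodes visited by a request after a miss at cf, ending at the origin.

    With others empty this is approach 1 (straight to the origin).
    Otherwise it is a greedy variant of approach 2: another responsible
    CF is visited only if doing so adds at most max_detour hops.
    """
    route, current = [cf], cf
    # Consider candidates farthest from the origin first, so that the
    # route keeps moving towards the origin.
    for nxt in sorted(others, key=lambda n: dist[n][origin], reverse=True):
        detour = dist[current][nxt] + dist[nxt][origin] - dist[current][origin]
        if detour <= max_detour:
            route.append(nxt)
            current = nxt
    route.append(origin)
    return route

# Toy example: nodes placed on a line, distance = difference in position.
POSITION = {"C3": -2, "P": 0, "C1": 1, "C2": 3, "O": 5}
DIST = {a: {b: abs(pa - pb) for b, pb in POSITION.items()}
        for a, pa in POSITION.items()}

first = closest_cf("P", ["C1", "C2", "C3"], DIST)
direct = miss_route(first, "O", [], DIST)
via_replicas = miss_route(first, "O", ["C2", "C3"], DIST)
```

In this example the PRF at P selects C1. Under approach 1 the request goes from C1 straight to the origin O. Under the greedy variant of approach 2 it also passes through C2, which lies on the way, but skips C3, which would require a six-hop detour.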

More sophisticated approaches are also possible. For example, PRF instances could dynamically select which authoritative CF to redirect a request to, based for instance on the current load. We reserve this investigation for future work.

Similarly, when a content enters the network after an origin fetch, several options can be adopted. The content can be forwarded symmetrically, asymmetrically or with multicast. Also, in case of symmetric content routing, if several caches are looked up, contents can be placed using various on-path meta algorithms such as LCE [114], LCD [113] or ProbCache [139].

The degree of replication k enables the fine-tuning of performance by trading off latency against cache hit ratio. Hash-routing operating with multiple content replicas is a hybrid between hash-routing and on-path caching meta algorithms. At one extreme (k = 1), the system behaves as a pure hash-routing scheme. At the other extreme (k equal to the number of caching nodes), the system behaves exactly as the on-path meta algorithm used for content routing, or as edge caching if requests are forwarded to the content origin after the first miss.
