Future work - On the design of efficient caching systems

Future research directions can be summarised as follows.

Extension of sharding performance models. The theoretical characterisation of sharded caching systems presented in Chapter 3 was carried out with the main objective of providing theoretical foundations for the decisions made in the design contributions of Chapters 4 and 5. However, as already mentioned, sharding techniques are ubiquitous in a variety of applications, and the modelling contributions provided can be readily applied to the design of many operational systems. While the work presented in this thesis already covers important aspects, such models can be extended to apply to more general cases and to take into account further performance metrics. One possible direction for extension consists in addressing the case of heterogeneous shards. Current models assume that all shards have identical cache size and processing capabilities. However, this may not be the case in practice because, as a result of incremental system upgrades, sharded systems may comprise entities with different hardware configurations. Generalising the cache hit ratio and load balancing results to heterogeneous processing and cache size capabilities would make the model suitable for this case as well.

Another possible direction concerns the generalisation of the analysis of load balancing performance in the presence of frontend caches presented in Sec. 3.3.5. While the work presented in this thesis assumes perfect caching, there is practical interest in generalising these results to the case of recency-based replacement policies such as LRU, FIFO and their derivatives. Early results from numerical evaluations (not presented in this thesis) suggest that the findings presented here also hold for a larger set of replacement policies. However, further investigation is required to validate this.

Finally, and probably most importantly, there would be great benefit in extending the current work to understand how load balancing and caching performance effectively impact more concrete metrics, such as latency and throughput. This could be achieved by making specific assumptions about the operation of each shard and applying concepts from queueing theory.
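As a minimal illustration of this direction, the sketch below assumes that each shard behaves as an independent M/M/1 queue. This is an assumption made here purely for illustration and not a model derived in this thesis. Under that assumption, a shard with service rate mu receiving Poisson arrivals at rate lambda < mu has a mean response time of 1/(mu - lambda), so load imbalance among shards translates directly into latency. The function names and the example figures are hypothetical.

```python
def mm1_shard_latencies(total_rate, load_fractions, service_rate):
    """Mean response time (seconds) of each shard, modelled as an M/M/1 queue.

    total_rate: aggregate request arrival rate (requests/s)
    load_fractions: fraction of requests routed to each shard (should sum to 1)
    service_rate: requests/s that each shard can serve (assumed identical)
    """
    latencies = []
    for fraction in load_fractions:
        arrival_rate = total_rate * fraction
        if arrival_rate >= service_rate:
            # The queue is unstable: the backlog grows without bound.
            latencies.append(float("inf"))
        else:
            latencies.append(1.0 / (service_rate - arrival_rate))
    return latencies


def overall_mean_latency(load_fractions, latencies):
    """Mean latency seen by a request, weighted by where requests are routed."""
    return sum(f * t for f, t in zip(load_fractions, latencies))


# Hypothetical example: 4 shards, 300 requests/s in total, 200 requests/s per shard.
balanced = [0.25, 0.25, 0.25, 0.25]
skewed = [0.55, 0.15, 0.15, 0.15]
balanced_latency = overall_mean_latency(
    balanced, mm1_shard_latencies(300.0, balanced, 200.0))
skewed_latency = overall_mean_latency(
    skewed, mm1_shard_latencies(300.0, skewed, 200.0))
```

With these example figures, the skewed load distribution yields a higher mean latency than the balanced one, even though the aggregate load is the same. A more faithful model would also need to account for cache hits and misses at each shard.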

Investigate optimal configuration of multiple replicas caching. One extension to the base caching framework presented in Chapter 4 is the support for multiple content replicas, as explained in Sec. 4.3.3. By supporting multiple content replicas, a hash-routed caching system operates as a hybrid between pure hash-routing and on-path caching, depending on the number of content replicas in the system. At one extreme, where each content is replicated exactly once, the system operates like a pure hash-routing scheme. At the other extreme, where each content is replicated as many times as the number of caching nodes, the system operates as a pure on-path caching scheme.
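The two extremes can be illustrated with a short sketch that maps each content to k caching nodes. It uses rendezvous (highest-random-weight) hashing, which is chosen here only for brevity and is not the replica assignment method described in this thesis. The function name, node names and content identifier are hypothetical.

```python
import hashlib


def replica_nodes(content_id, nodes, k):
    """Return the k caching nodes responsible for content_id.

    Each (content, node) pair gets a pseudo-random score derived from a hash,
    and the k nodes with the highest scores hold the replicas.
    """
    if not 1 <= k <= len(nodes):
        raise ValueError("k must be between 1 and the number of nodes")

    def score(node):
        digest = hashlib.sha256(f"{content_id}|{node}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return sorted(nodes, key=score, reverse=True)[:k]


nodes = ["cache-a", "cache-b", "cache-c", "cache-d"]
single = replica_nodes("/videos/item42", nodes, 1)        # one responsible node
everywhere = replica_nodes("/videos/item42", nodes, len(nodes))  # every node
```

With k = 1 each content has a single responsible node, as in pure hash-routing. With k equal to the number of nodes every node may cache the content, so no redirection is needed and the behaviour approaches on-path caching.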

A multiple-replica hash-routing scheme is in practice a generalised caching framework encompassing on-path and hash-routing caching. As a result, it could give operators great flexibility, allowing them to fine-tune performance by trading off cache hit ratio, latency, load balancing and scalability through careful selection of the degree of replication, the assignment of content replicas and the routing strategy. Unfortunately, finding the optimal parameter configuration in the presence of multiple replicas is difficult. However, the results presented in Sec. 4.6.5 show that even a simple replica placement based on network clustering can achieve good results, which is very encouraging.

A promising future research direction then consists in investigating how to optimise the configuration of the distributed caching framework of Chapter 4 in the presence of multiple replicas.

Generalisation of caching node design. The caching node implementation presented in Chapter 5 has been primarily designed to operate as a Content Store for an ICN router, and this objective has dictated a number of design decisions. However, most of the design principles adopted are more widely applicable and could be exploited to improve the design of other applications. Therefore, one interesting direction for future work concerns the generalisation of the H2C implementation to support a larger number of applications. Examples of applications that could be targeted include HTTP proxies and caching key-value stores, such as memcached [130].

Another possible extension concerns the decoupling of the H2C implementation from the other stages of packet processing of an ICN router (i.e., FIB and PIT lookups). This would improve the portability of H2C and would make it a suitable drop-in replacement able to operate with other ICN router implementations. A possible approach to achieve this objective would be to adapt the H2C implementation to export a well-defined interface to the rest of the system, such as a filesystem API, using tools like Filesystem in Userspace (FUSE). However, this decoupling may degrade performance as a result of the overhead it introduces.

Simplifying the configuration of controlled network experiments

A.1 Introduction

The setup of a realistic scenario for a controlled network experiment, whether in a simulated or emulated environment, is usually a lengthy and delicate process comprising various tasks.

The first task consists in selecting a suitable network topology. Such a topology can either be parsed from datasets of inferred topologies, such as RocketFuel [157] or the CAIDA AS relationships dataset [27], or synthetically generated according to various models, such as [4], [18], [26], [54] or [179]. Alternatively, it is also possible to use canonical topologies such as stars, rings or dumbbells.

Second, after selecting the topology, it is necessary to configure it with all required parameters to be used in the target simulator or emulator. These include link delays, capacities, weights, buffer sizes and configuration properties of all protocol stacks.

Third, it is necessary to assign a traffic matrix to the topology or decide how the traffic will be modelled, such as deciding on the number of concurrent flows, their origin and destination and their characteristics.

Finally, all this configuration has to be implemented in the target environment before the experiment can be run.
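To make the first three tasks concrete, the following sketch builds a canonical ring topology, assigns constant link parameters and generates a random traffic matrix using only the Python standard library. It does not use the API of the toolchain presented in this chapter. All function names, parameter values and the uniform traffic model are assumptions chosen for illustration.

```python
import itertools
import random


def ring_topology(n):
    """Return the undirected links of an n-node ring as (u, v) pairs."""
    if n < 3:
        raise ValueError("a ring needs at least 3 nodes")
    return [(i, (i + 1) % n) for i in range(n)]


def configure_links(links, delay_ms, capacity_mbps):
    """Attach a constant delay and capacity to every link.

    Link weights are set to the inverse of the capacity.
    """
    return {
        link: {
            "delay_ms": delay_ms,
            "capacity_mbps": capacity_mbps,
            "weight": 1.0 / capacity_mbps,
        }
        for link in links
    }


def uniform_traffic_matrix(n, max_rate_mbps, seed=0):
    """Draw a traffic volume for every ordered origin-destination pair."""
    rng = random.Random(seed)  # fixed seed so the scenario is reproducible
    return {
        (origin, destination): rng.uniform(0, max_rate_mbps)
        for origin, destination in itertools.permutations(range(n), 2)
    }


links = configure_links(ring_topology(5), delay_ms=2, capacity_mbps=1000)
traffic = uniform_traffic_matrix(5, max_rate_mbps=10)
```

Even for such a small scenario, the result still has to be translated by hand into the input format of the target simulator or emulator, which is the fourth task.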

The execution of all these tasks is cumbersome and error-prone, since there are no publicly available tools automating the entire process. In fact, although there are tools taking care of some of the tasks, such as the parsing or the generation of topologies, they do not support the entire setup chain and are generally bound to a specific simulator or emulator. As a result, a user is required to integrate heterogeneous software components or implement a complete experiment scenario from scratch.

Apart from possibly requiring a considerable amount of time, this process can also lead to a greater number of mistakes affecting the reliability of results. In fact, the lack of a framework for automating experiment setup may lead users to configure network and traffic characteristics using unrealistic models. In addition, even if appropriate models are selected, defects may be introduced in their actual implementation. For all these reasons, it is highly desirable to have a tool supporting the entire setup chain.

To address these issues, this chapter presents the design and implementation of a toolchain for easily executing all the tasks listed above. It allows users to parse topologies from various datasets or from other generators, as well as to generate them according to the most common models. These topologies can then be configured with all required parameters and matched with appropriate traffic matrices or traffic source configurations. A fully configured experiment scenario can be exported to a set of XML files, which can then be imported by the desired experiment environment. The toolchain provides adapters for ns-2 [132], ns-3 [77], Omnet++ [174], Mininet [111], AutoNetKit [105] and jFed [95], as well as generic Java, C++ and Python APIs to enable easy integration with other simulators or emulators. In particular, by providing generic APIs for the most common programming languages, we hope to contribute to increasing the reliability and reducing the setup complexity of experiments run with custom-built simulators, which are very common.
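The idea of exchanging a configured scenario through XML can be sketched as follows using the Python standard library. The element and attribute names below form a hypothetical schema invented for this example. They are not the file format used by the toolchain, and the function names are likewise assumptions.

```python
import xml.etree.ElementTree as ET


def topology_to_xml(links):
    """Serialise links, a dict mapping (u, v) to a dict of link properties."""
    root = ET.Element("topology")
    for (u, v), properties in links.items():
        link = ET.SubElement(root, "link", {"from": str(u), "to": str(v)})
        for name, value in properties.items():
            prop = ET.SubElement(link, "property", {"name": name})
            prop.text = str(value)
    return ET.tostring(root, encoding="unicode")


def topology_from_xml(text):
    """Parse the XML produced by topology_to_xml back into a dict.

    Node identifiers and property values are returned as strings.
    """
    root = ET.fromstring(text)
    links = {}
    for link in root.findall("link"):
        key = (link.get("from"), link.get("to"))
        links[key] = {
            prop.get("name"): prop.text for prop in link.findall("property")
        }
    return links


example_links = {("a", "b"): {"delay": "2ms", "capacity": "1Gbps"}}
document = topology_to_xml(example_links)
restored = topology_from_xml(document)
```

Because the scenario is stored in a neutral file format, the code that generates it and the adapter that loads it into a simulator or emulator can be written in different languages.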

The methods provided by the toolchain for generating and configuring network topologies are commonly used in the literature, with the exception of those used for link capacity assignment. For this task, in addition to providing commonly adopted models, we devised and implemented novel algorithms which provide a more realistic link capacity assignment than state-of-the-art methods. This chapter presents these new models and demonstrates their effectiveness by evaluating their performance on a number of real network topologies.

This toolchain was originally named Fast Network Simulation Setup (FNSS) and is publicly available as open-source software [61]. However, since it was first made publicly available, it has evolved to also support emulated environments in addition to simulators.

The remainder of this chapter is organised as follows. Sec. A.2 describes the FNSS toolchain by explaining its architecture and design and illustrating its features. Sec. A.3 introduces novel link capacity assignment algorithms and evaluates their performance. Sec. A.4 presents a complete example of how FNSS can be used. Sec. A.5 provides an overview of the related work. Finally, Sec. A.6 summarises the contribution of this chapter and presents the conclusions.
