Conclusions - Video Popularity Metrics and Bubble Cache Eviction Algorithm Analysis

To conclude the main chapters, this Thesis sought to gain a greater understanding of video request data and of a method by which a cache-enabled network could optimise its delivery by avoiding replicated video data on its network. Chapter 3 contained an analysis of two VoD popularity distribution data-sets provided by BT. Parts of the wider research community have claimed that VoD popularity distributions follow a Zipf distribution [7–18]. Other research findings have contradicted this claim or expressed caution about assuming that the distribution follows Zipf [12, 16, 17], creating a need to settle the dispute. A number of methods were used to deduce which model, Zipf or Zipf-Mandelbrot, is the more appropriate for replicating VoD popularity distributions. The analytical methods were Kullback-Leibler divergence, Pearson Chi-Squared and Pearson Correlation. In addition, an ICN network simulation environment was used to assess how closely the request behaviour of the best-fitting Zipf and Zipf-Mandelbrot distributions resembled that of the real popularity distributions. All tests performed concluded that Zipf-Mandelbrot is the more appropriate model to use when replicating VoD popularity distributions, with emphasis on the performance differential observed in the Icarus ICN environment when comparing Zipf and Zipf-Mandelbrot against the real popularity distributions. Zipf-Mandelbrot resembled the real data to a much greater degree, as quantified in Table 6.1, which summarises the mean difference in cache-hit ratio from the empirical results over all cache sizes and shows how closely Zipf-Mandelbrot resembles the real data-sets as opposed to the Zipf distribution. In determining these results, the largest contribution was the set of methods by which models and data-sets could be compared. The Pearson Chi-Squared method used, though unconventional, is a novel way of comparing two popularity distributions and provides an appropriate additional method alongside the more conventional Kullback-Leibler divergence.

Empirical and Model: Mean Absolute Deviation

              VoD                          TV Catch-up
Cache Size    Zipf      Zipf-Mandelbrot    Zipf      Zipf-Mandelbrot
Total Mean    0.0304    0.0037             0.0385    0.0056

Table 6.1: Difference between cache-hit ratios relative to the empirical results, in a test where the Zipf and Zipf-Mandelbrot models were selected based on a goodness-of-fit produced in KL. This table is a reduced replicate of Table 3.6.

Chapter 4 set out to create an ordered list of requests for video objects made to a CDN that reflects the changing probabilities one may expect to find in a VoD system. The key characteristics to be included for each unique video object were: Probability of Request, Decay of Popularity and Life-Time. The process proposed for creating the list of requests requires the user to supply the properties of the items in the system at a single point of observation. Once the properties of each item are identified, the requests can be generated by a two-step process. The steps are documented as the “Storyboard Generator” (Chapter 4.3.1) and the “Request Generator” (Chapter 4.4).
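The two-step process can be illustrated with a minimal Python sketch. It is not the implementation from Chapter 4: the Item fields, the multiplicative per-step decay, the release step and the function names build_storyboard and generate_requests are all assumptions made for illustration. The first step tabulates, for each time step, which items are live and with what normalised probability; the second step samples requests from that table.

```python
import random
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical per-item properties (names are assumptions, not the thesis API)."""
    name: str
    probability: float  # relative request weight at release
    decay: float        # assumed multiplicative decay per time step, 0 < decay <= 1
    release: int        # first time step at which the item can be requested
    lifetime: int       # number of time steps the item stays available


def build_storyboard(items, n_steps):
    """Step 1: for each time step, map every live item to a normalised probability."""
    storyboard = []
    for t in range(n_steps):
        weights = {}
        for item in items:
            age = t - item.release
            if 0 <= age < item.lifetime:
                weights[item.name] = item.probability * item.decay ** age
        total = sum(weights.values())
        storyboard.append({k: w / total for k, w in weights.items()} if total else {})
    return storyboard


def generate_requests(storyboard, requests_per_step, seed=0):
    """Step 2: sample an ordered request list from the storyboard."""
    rng = random.Random(seed)
    requests = []
    for distribution in storyboard:
        if not distribution:
            continue  # no live items at this step
        names = list(distribution)
        probs = [distribution[n] for n in names]
        requests.extend(rng.choices(names, weights=probs, k=requests_per_step))
    return requests


# Illustrative catalogue; all values are made up.
catalogue = [
    Item("a", probability=0.6, decay=0.8, release=0, lifetime=5),
    Item("b", probability=0.3, decay=0.9, release=0, lifetime=10),
    Item("c", probability=0.5, decay=0.7, release=3, lifetime=4),
]
storyboard = build_storyboard(catalogue, n_steps=10)
requests = generate_requests(storyboard, requests_per_step=20, seed=1)
```

Separating the storyboard from the sampling keeps the time-varying popularity in one table, so the same storyboard can be sampled repeatedly with different seeds.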

The list of requests produced by the request generator may be of use to a range of applications requiring a pseudo-realistic request order for items. One such example is assessing the effectiveness of cache eviction policies in a VoD setting, such as the algorithms proposed in Chapter 5. In conclusion, a successful pseudo-realistic request generator for a VoD system was designed and implemented for the purposes specified.

In Chapter 5, the Bubble cache eviction algorithm was introduced. In addition to Bubble, two variations were introduced: Bubble-LRU and Bubble-Insert. The Bubble algorithm takes inspiration from the Bubble sort algorithm, which iteratively compares items and swaps them based on a predefined rule. Bubble also swaps items; however, it ranks items by the recency of the requests they receive: a requested item is given greater value by swapping it with the item preceding it. This mechanism is introduced in this Thesis as a simple cache eviction algorithm with the potential to rival existing, better-known algorithms such as LRU, FIFO, RAND and even LFU. Bubble-LRU and Bubble-Insert are variations in which new items, which Bubble inserts at the lowest index, are instead inserted at an alternative index; in Bubble-LRU, the positions below that alternative index are additionally handled according to the LRU method.
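The swap mechanism can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the Chapter 5 implementation: the class name BubbleCache and its request method are hypothetical, and the "lowest index" of the text is read here as the lowest-ranked slot, stored at the end of a Python list, so that the front of the list is the most protected position.

```python
class BubbleCache:
    """Sketch of the Bubble policy; list orientation is an assumption (see lead-in)."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.slots = []  # slots[0]: most protected, slots[-1]: next to be evicted

    def request(self, item):
        """Return True on a cache hit, False on a miss."""
        if item in self.slots:            # linear scan, O(n)
            i = self.slots.index(item)
            if i > 0:
                # Hit: swap the requested item with the item preceding it.
                self.slots[i - 1], self.slots[i] = self.slots[i], self.slots[i - 1]
            return True
        if len(self.slots) < self.capacity:
            self.slots.append(item)       # cache not yet full
        else:
            self.slots[-1] = item         # miss: overwrite the lowest-ranked slot
        return False


cache = BubbleCache(3)
trace = [cache.request(x) for x in "abccd"]
print(trace)        # [False, False, False, True, False]
print(cache.slots)  # ['a', 'c', 'd']
```

In this trace the second request for c moves it one place ahead of b, so the later miss on d evicts b rather than c.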

Three separate methods were used to estimate the cache-hit ratios one may expect to find in a video-on-demand environment with caching. The methods produced varying results; however, from the Markov Chain and Icarus simulations it can be concluded that Bubble is an effective eviction algorithm. Where results were obtainable, Bubble achieved a greater cache-hit ratio than the other algorithms in most scenarios, the exception being LFU, whose results closely resembled those of Bubble in most cases. This means that Bubble is the most effective of the eviction algorithms tested that share the same operation cost of O(n). The third method of testing Bubble and its variants submitted them to the Complex Single Cache Request Generator created in Chapter 4. This method did not use a static pool of items available for request but instead used a changing pool of available items and introduced popularity decay. With this new dimension, Bubble performed better than all other cache eviction algorithms tested when the cache size was very small, but did not produce positive results when the cache size was medium or large. In the scenarios where Bubble did not perform well, Bubble-LRU appeared to combine the performance Bubble showed in small caches with the effectiveness of LRU, and performed better than all other algorithms observed.
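A static-pool experiment of the kind described for the first two methods can be approximated with a toy harness. The sketch below is not the Markov Chain model or the Icarus configuration used in Chapter 5. The catalogue size, the Zipf-Mandelbrot parameters alpha and q, the cache size, the seed and all names are assumptions chosen only to make the example run. It draws requests from a fixed Zipf-Mandelbrot popularity and reports the cache-hit ratio of a compact Bubble sketch next to LRU.

```python
import random
from collections import OrderedDict


def zipf_mandelbrot_weights(n_items, alpha, q):
    """Unnormalised weights 1 / (rank + q)**alpha for ranks 1..n_items."""
    return [1.0 / (rank + q) ** alpha for rank in range(1, n_items + 1)]


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def request(self, item):
        if item in self.items:
            self.items.move_to_end(item)     # mark as most recently used
            return True
        if len(self.items) >= self.capacity:
            self.items.popitem(last=False)   # evict least recently used
        self.items[item] = True
        return False


class BubbleCache:
    """Compact Bubble sketch: a hit swaps the item one place toward the front."""

    def __init__(self, capacity):
        self.capacity = capacity             # assumed to be at least 1
        self.slots = []                      # slots[-1] is overwritten on a miss

    def request(self, item):
        if item in self.slots:
            i = self.slots.index(item)
            if i > 0:
                self.slots[i - 1], self.slots[i] = self.slots[i], self.slots[i - 1]
            return True
        if len(self.slots) < self.capacity:
            self.slots.append(item)
        else:
            self.slots[-1] = item
        return False


def hit_ratio(cache, requests):
    """Fraction of requests served from the cache."""
    return sum(cache.request(r) for r in requests) / len(requests)


rng = random.Random(42)
weights = zipf_mandelbrot_weights(n_items=1000, alpha=0.9, q=5.0)
requests = rng.choices(range(1000), weights=weights, k=20000)

results = {
    "Bubble": hit_ratio(BubbleCache(50), requests),
    "LRU": hit_ratio(LRUCache(50), requests),
}
for name, ratio in results.items():
    print(f"{name}: {ratio:.3f}")
```

The figures printed depend entirely on the assumed parameters and should not be read as a reproduction of the results in Chapter 5.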

To conclude, based on the results observed in Chapter 5, Bubble is the most appropriate cache eviction algorithm when cache sizes remain small. When cache sizes are large, a cache eviction algorithm such as Bubble-LRU would be worthy of investigation, as it can achieve a high cache-hit ratio in a range of situations.
