To track hint set statistics for the dynamic benefit/cost model, DP-CLIC maintains the same hint table as CLIC (Section 3.2.2). The dynamic benefit/cost model requires, for each hint set, the distributions of read and write re-references over re-reference distance. Thus, besides the statistics N(H), the hint table entry for H maintains two histograms: a read-reference histogram and a write-reference histogram. When the server receives a request for page p, with sequence number s, it checks both the cache and the outqueue for information about the most recent previous request, if any, for p. If it finds seq(p) and H(p) from a previous request, then it knows that the current request is a re-reference of p. If the request is a read request, DP-CLIC increments the read count in bucket s − seq(p) (the re-reference distance) of H(p). If the request is a write request, DP-CLIC increments the write count in bucket s − seq(p) of H(p).

Figure 4.4: Hint Set Priority Produced by Dynamic Benefit/Cost Model of DB2 TPC-C Workload Traces. Panels: (a) DB2_C300_400, (b) DB2_C540_400. Each panel plots hint priority against re-reference distance (×5000) for hint sets A, B, and C.
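The histogram bookkeeping described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the DP-CLIC implementation: the class names `HintStats` and `HintTable` are hypothetical, a single dictionary `last_seen` stands in for the combined cache and outqueue lookup, N(H) is assumed to count requests tagged with H, the bucket index is assumed to be the re-reference distance divided by the bucket width, and distances beyond the histogram range are simply dropped.

```python
from collections import defaultdict


class HintStats:
    """Hypothetical hint table entry: N(H) plus read/write histograms."""

    def __init__(self, n_buckets, bucket_width):
        self.n_requests = 0               # N(H), assumed to count requests with H
        self.bucket_width = bucket_width
        self.reads = [0] * n_buckets      # read-reference histogram
        self.writes = [0] * n_buckets     # write-reference histogram

    def record(self, distance, is_write):
        b = distance // self.bucket_width  # assumed distance-to-bucket mapping
        if b < len(self.reads):            # out-of-range distances are dropped
            histogram = self.writes if is_write else self.reads
            histogram[b] += 1


class HintTable:
    def __init__(self, n_buckets=4, bucket_width=10):
        self.stats = defaultdict(lambda: HintStats(n_buckets, bucket_width))
        self.last_seen = {}  # page -> (seq(p), H(p)); stands in for cache + outqueue
        self.seq = 0

    def on_request(self, page, hint_set, is_write):
        self.seq += 1
        s = self.seq
        prev = self.last_seen.get(page)
        if prev is not None:
            prev_seq, prev_hint = prev
            # A re-reference is credited to the hint set of the PREVIOUS request.
            self.stats[prev_hint].record(s - prev_seq, is_write)
        self.stats[hint_set].n_requests += 1
        self.last_seen[page] = (s, hint_set)
```

Note that the re-reference is recorded against H(p) from the earlier request, not against the hint set of the current request, which matches the description above.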
The range of the histograms is determined by the number of buckets per histogram (Nbucket) and the width of each bucket (Wbucket), which are parameters of DP-CLIC. These two parameters control a trade-off between the space consumption and the accuracy of the histograms. As shown in Figure 4.2, the most important portion of the histogram is the hill portion. Thus, the overall re-reference distance covered by the histogram (Nbucket × Wbucket) should be large enough to cover the hill portions of all higher-priority hint sets. If Wbucket is small, the histograms are more accurate, but a large Nbucket is needed to cover the hill portions. If Wbucket is large, a small Nbucket is enough to cover the hill portions, which reduces space overhead, but the histograms are coarse-grained and the benefit/cost model may be less accurate.
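To make the trade-off concrete, the following sketch compares two parameter settings that cover the same re-reference range. The function names and all numeric values are hypothetical and chosen only for illustration; the counter count assumes one read and one write counter per bucket per hint set.

```python
def histogram_footprint(n_buckets, bucket_width, n_hint_sets):
    """Return (covered re-reference distance, total number of counters)."""
    covered = n_buckets * bucket_width       # Nbucket x Wbucket
    counters = 2 * n_buckets * n_hint_sets   # read + write histogram per hint set
    return covered, counters


def bucket_index(distance, n_buckets, bucket_width):
    """Map a re-reference distance to a bucket, or None if out of range."""
    b = distance // bucket_width
    return b if b < n_buckets else None


# Fine-grained setting: many narrow buckets.
fine = histogram_footprint(n_buckets=1000, bucket_width=5_000, n_hint_sets=3)
# Coarse setting: the same coverage with ten times fewer counters.
coarse = histogram_footprint(n_buckets=100, bucket_width=50_000, n_hint_sets=3)
```

Both settings cover the same range, but in the coarse setting every distance below 50,000 falls into bucket 0, so any variation of the curve within that interval is lost.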
4.4 Cache Management
Section 3.2.4 describes how CLIC uses the hint set priorities to manage the contents of its cache. The cache management of DP-CLIC, described in Figure 4.5, is similar to that of CLIC. Both are priority-based replacement policies. However, the page priority in CLIC depends only on the hint set with which the page was requested, while the page priority in DP-CLIC also depends on how long the page has stayed in the cache. For each hint set, DP-CLIC maintains a dynamic caching priority expressed as a function of re-reference distance. To determine the caching priority Pr(H, d) for page p, DP-CLIC needs the hint set H with which p was most recently requested, and d, which indicates how long p has been cached. CLIC and DP-CLIC therefore differ in how they set page caching priorities and in how they identify the page with the minimum priority in the cache.

We explain the cache management of DP-CLIC using the algorithm in Figure 4.5. When the server receives a request (p, H), with sequence number s, DP-CLIC assigns page p the initial priority Pr(H, 0) (line 31). For a page v in the cache, DP-CLIC calculates d (how long v has stayed in the cache) as s − seq(v), and then sets the priority of page v to Pr(H(v), s − seq(v)). In principle, to find the page with the minimum priority, DP-CLIC would need to calculate and compare the priorities of all pages in the cache.
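The straightforward full scan described above can be sketched as follows. This is a minimal illustration, not the algorithm of Figure 4.5: the names `page_priority` and `find_victim_naive` are hypothetical, each priority curve is assumed to be stored as a list with one value per histogram bucket, and distances beyond the last bucket are assumed to take the priority of the last bucket.

```python
def page_priority(priority_curves, hint_set, distance, bucket_width):
    """Pr(H, d): look up the priority of hint set H at distance d."""
    curve = priority_curves[hint_set]
    b = min(distance // bucket_width, len(curve) - 1)  # clamp is an assumption
    return curve[b]


def find_victim_naive(cache, priority_curves, s, bucket_width):
    """Scan every cached page v and return the one minimizing Pr(H(v), s - seq(v)).

    cache maps page -> (seq(v), H(v)); s is the current sequence number.
    """
    return min(
        cache,
        key=lambda v: page_priority(
            priority_curves, cache[v][1], s - cache[v][0], bucket_width
        ),
    )


# Hypothetical hill-shaped curves, one value per bucket.
curves = {"A": [0.01, 0.05, 0.02], "B": [0.02, 0.03, 0.015]}
cache = {"x": (95, "A"), "y": (85, "A"), "z": (75, "B")}
victim = find_victim_naive(cache, curves, s=100, bucket_width=10)
```

The cost of this scan grows with the number of cached pages, which is what motivates exploiting the shape of the priority curves instead.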
In our implementation of the DP-CLIC cache management (lines 12 to 30 in Figure 4.5), we take advantage of the fact that the priority curves of hint sets are hill-shaped. When a page has just arrived in the cache, it has a low priority; its priority then gradually grows until it reaches its peak, and decreases after that. For each hint set H, DP-CLIC maintains a peak point, H(peak), which records the re-reference distance at which the priority reaches its maximum. Like CLIC, DP-CLIC maintains a queue of the hint sets. For each hint set H in the queue, all pages with H(p) = H are recorded in a doubly-linked list that is sorted by seq(p). Because the priority curves are hill-shaped, the page with the minimum priority in each list is either the MRU page or the LRU page. DP-CLIC calculates the distance for both the MRU and the LRU page, and there are three cases:
• Neither the MRU page nor the LRU page has passed the peak point H(peak). In this case, the MRU page has the minimum priority.
• Both the MRU page and the LRU page have passed H(peak). In this case, the LRU page has the minimum priority.
• The MRU page has not passed H(peak) but the LRU page has. In this case, DP-CLIC compares the priorities of the two pages and selects the one with the lower priority.
After identifying the minimum-priority page of each hint set, DP-CLIC selects, among these candidates, the page with the minimum priority in the cache. The running time is O(n), where n is the number of hint sets.
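The three cases and the final selection across hint sets can be sketched as below. This is an illustrative sketch only, under several assumptions: the function names are hypothetical, a `deque` ordered by increasing seq(p) stands in for the doubly-linked list (leftmost entry is the LRU page, rightmost is the MRU page), each curve is a list with one priority per bucket, distances past the last bucket take the last bucket's priority, and a page has "passed" the peak when its distance exceeds H(peak).

```python
from collections import deque


def candidate_for_hint_set(pages, curve, peak, s, bucket_width):
    """Return (priority, page) for the minimum-priority page of one hint set."""

    def pr(d):
        return curve[min(d // bucket_width, len(curve) - 1)]

    lru_seq, lru_page = pages[0]    # oldest request, largest distance
    mru_seq, mru_page = pages[-1]   # newest request, smallest distance
    d_lru, d_mru = s - lru_seq, s - mru_seq
    if d_lru <= peak:               # case 1: neither page has passed the peak
        return pr(d_mru), mru_page
    if d_mru > peak:                # case 2: both pages have passed the peak
        return pr(d_lru), lru_page
    # Case 3: the pages straddle the peak, so compare the two ends directly.
    return min((pr(d_mru), mru_page), (pr(d_lru), lru_page))


def find_victim(hint_sets, s, bucket_width):
    """One candidate per hint set, then the overall minimum: O(n) hint sets."""
    candidates = [
        candidate_for_hint_set(h["pages"], h["curve"], h["peak"], s, bucket_width)
        for h in hint_sets.values()
        if h["pages"]
    ]
    return min(candidates)[1]


# Hypothetical state at sequence number 100 with a bucket width of 10.
hint_sets = {
    "A": {"curve": [0.01, 0.05, 0.02], "peak": 10,
          "pages": deque([(70, "a1"), (85, "a2"), (97, "a3")])},
    "B": {"curve": [0.02, 0.03, 0.015], "peak": 10,
          "pages": deque([(60, "b1"), (80, "b2")])},
    "C": {"curve": [0.03, 0.04, 0.06], "peak": 20,
          "pages": deque([(92, "c1"), (98, "c2")])},
}
victim = find_victim(hint_sets, s=100, bucket_width=10)
```

In this example hint set A falls under the third case, B under the second, and C under the first. Only two priority lookups are needed per hint set regardless of how many pages each list holds.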