3.4 Compression in Existing Techniques
3.4.1 Lossy Compression in PLRU Cache Approximation
As introduced in Chapter 2, two main forms of analysis for PLRU caches exist: the LRU approximation given by Heckmann et al. [56] and the Potential Leading Zero (PLZ) approach given by Grund and Reineke [54]. In addition, the practice of considering logical cache states rather than physical cache states, the Collecting Semantics [37] of a PLRU cache, also discards information. However, the only information discarded by the Collecting Semantics is the precise concrete states encountered; because all behaviours encountered are still tracked, one can conclude that the discarded information has no value to the analysis. In contrast, as the LRU and PLZ approaches do not produce useful results for PLRU caches with associativity of at least 8, it can be inferred that these approaches discard information of significant value.
In the approach used by Heckmann et al. [56], the loss of information is by necessity: Heckmann et al. do not propose rules for tracking when the information could be evicted, and hence the only possible method is to discard that information and make no decisions using it. The consequence of losing such a large amount of information is that the analysis is incomplete, and hence highly pessimistic: even for small caches, a large number of definite hits will be classified as “May Hit”.

Element: a b c d
Subtree distance to a: 0 1 2 2
Figure 3.5: Subtree distances to element a in a 4-way cache
Grund and Reineke’s Potential Leading Zeros [54] approach to PLRU Must analysis is provided as a form of abstract interpretation and hence, as previously stated, discards information through its abstraction function. Specifically, an approximation is made of the number of pointers that point to each element of the cache (the “potential leading zeros” that the analysis is named for), as previously shown in Figure 2.12. In order to merge states, similar states are identified by the subtree distances of the elements within the state (Figure 3.5, which provides a partial representation of the tree structure).
Grund and Reineke’s algorithm approximates the subtree distances between two cache lines as being either 0, non-maximal, maximal or unknown. In the case of Figure 3.5, a has subtree distance 0 to a itself, a non-maximal subtree distance to b, and maximal subtree distance to c and d. Effectively, this partitions the cache into two subtrees, which allows the analysis to deduce that an access to a cache line within the left subtree of the root node will not impact the right subtree, and vice versa. Any cache states which have the same subtree distance approximations for their elements can be merged, as seen in Figure 3.6, with the number of potential leading zeros being upper bounded. When the number of potential leading zeros for a cache line reaches the height of the subtree it resides in, then the analysis considers that the cache line could be evicted at the next eviction.
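The subtree distances of Figure 3.5 and their approximation can be illustrated with a minimal sketch. It assumes that ways are numbered left to right across the leaves of the PLRU tree, so that the subtree distance is the height of the smallest subtree containing both ways. The names `subtree_distance`, `classify` and `may_be_evicted` are hypothetical and are not taken from Grund and Reineke [54]. The "unknown" value, which arises only in abstract states, is omitted.

```python
from enum import Enum


class Distance(Enum):
    ZERO = "zero"
    NON_MAXIMAL = "non-maximal"
    MAXIMAL = "maximal"


def subtree_distance(way_a: int, way_b: int) -> int:
    """Height of the smallest subtree that contains both ways.

    Assumes ways are numbered left to right across the leaves, so the
    distance is the position of the highest bit in which the indices differ.
    """
    return (way_a ^ way_b).bit_length()


def classify(way_a: int, way_b: int, ways: int) -> Distance:
    """Approximate a concrete subtree distance as zero, non-maximal or maximal."""
    tree_height = ways.bit_length() - 1  # log2(ways) when ways is a power of two
    distance = subtree_distance(way_a, way_b)
    if distance == 0:
        return Distance.ZERO
    if distance == tree_height:
        return Distance.MAXIMAL
    return Distance.NON_MAXIMAL


def may_be_evicted(potential_leading_zeros: int, subtree_height: int) -> bool:
    """Eviction condition as described in the text (hypothetical helper)."""
    return potential_leading_zeros >= subtree_height


# Figure 3.5: elements a, b, c, d are assumed to occupy ways 0..3 of a 4-way cache.
distances_to_a = [subtree_distance(0, way) for way in range(4)]
classes_to_a = [classify(0, way, 4).value for way in range(4)]
print(distances_to_a)
print(classes_to_a)
```

Under these assumptions, the distances from a reproduce the values in Figure 3.5, and b is classified as non-maximal while c and d are classified as maximal, which is the partition into left and right subtrees described above.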
The main problem with Grund and Reineke’s approach is that it is incomplete: it cannot analyse all elements of caches with at least 8 ways. Hence, even though the approach succeeds in reducing the number of states to be considered, it does not provide a full cache analysis. This can be viewed as discarding too much information. Grund and Reineke’s primary motivation for targeting the number of potential leading zeros for approximation appears to be soundness, as the behaviour of their approach is similar to the technique of Heckmann et al. of using an LRU cache to provide a Must analysis [56]. Specifically, in Grund and Reineke’s algorithm each subtree can be equivalently modelled as an LRU cache of size $\log_2(N/2)$, for an $N$-way cache, which is the same as performing Heckmann et al.’s analysis on each subtree. Hence the property of soundness is easily proved.

Figure 3.6: Merging states in Grund and Reineke’s Potential Leading Zeros analysis
Whilst soundness is a desirable property, the consequence of discarding too much information is the appearance of compression artefacts: discrepancies between the compressed representation and the actual system. In the case of Grund and Reineke’s approach [54], compression artefacts lead to considering additional states that could never be encountered. The major concern is that Grund and Reineke’s approach can never determine that any element has been evicted for a cache with $N \geq 8$ ways. This immediately causes state space explosion: if no element can be provably evicted, then after accessing $k$ unique memory blocks, all possible combinations of those blocks that could occupy the cache must be considered. Hence for $k$ memory blocks, this quantity is $\binom{k}{N} = \frac{k!}{N!\,(k-N)!}$. However, as each cache is represented as two subtrees, the number of states considered will likely be greater. Regardless of the exact quantity, this leaves a lower bound on the number of states considered that grows as $\Omega(k^N)$ for fixed $N$. Given that the number of unique memory locations considered is expected to be large, this is clearly an undesirable property.
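To give a sense of scale for this bound, the following sketch evaluates the binomial coefficient for an 8-way cache and a few values of k. The function name `states_lower_bound` and the chosen values of k are illustrative assumptions only. The figures count combinations of blocks, not the states actually produced by any particular analysis.

```python
from math import comb


def states_lower_bound(k: int, n: int) -> int:
    """Number of ways to choose which n of k accessed blocks are cached."""
    return comb(k, n)


N = 8  # smallest associativity for which, per the text, no eviction can be proved
counts = {k: states_lower_bound(k, N) for k in (8, 16, 32, 64)}

for k, count in counts.items():
    # comb(k, N) >= (k / N) ** N for 1 <= N <= k, hence polynomial growth
    # of degree N in k when N is held fixed.
    print(f"k={k:3d}  combinations={count}  (k/N)^N={(k // N) ** N}")
```

Doubling k from 16 to 32 already raises the count from 12870 to 10518300, which shows how quickly the bound grows even for modest numbers of memory blocks.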
As too much information is discarded by Grund and Reineke’s approach, and this leads to state explosion, it could be argued that considering the problem from an explicitly information-theoretic point of view could help. By determining the effect that removing each type of information contained in the cache state has on the size of the state space considered, it should be possible to devise an analysis which is both sound and does not lead to state explosion. This could be obtained by retaining enough information to perform an effective May analysis, thus allowing elements to be provably evicted and hence reducing the number of cache states to be considered.
In contrast to the approaches of Heckmann et al. [56] and Grund and Reineke [54], using the Collecting Semantics [37] of a cache does not discard valuable information. It does, however, discard information: specifically, when cache states have different physical representations but all exhibit the same logical behaviour, only one of these cache states need be evaluated and the rest can be discarded. This information is not relevant to the analysis, as the analysis is not concerned with which physical states may be reached, but with the logical behaviours that may be observed. Hence, in PLRU analysis, there does exist a form of lossy compression which only discards information that has no value to the analysis. Unfortunately, the technique of Collecting Semantics does not discard enough information to be tractable. Hence any form of tractable analysis must discard information which has value: the main question is which information is of least value and how it may be discarded without impacting the analysis.