4.5 Multitask Timing Analysis
4.5.3 Bounding Cache-related Preemption Delay
In the following we will be concerned with standard techniques to bound CRPD. We unify different techniques in one formal framework and discuss specific inaccuracies in
Chapter 4. Cache Analysis 52
their original definitions. We do not address practical analysis yet but constrain ourselves to concepts.
Useful Cache Blocks
To estimate CRPD, we are ultimately interested in the number of additional cache misses due to preemption. Useful cache blocks (UCB) [88] bound this number by computing sets of memory blocks potentially being accessed and reused during uninterrupted execution of a task. Intuitively, a preemption can not inflict more misses than cached blocks potentially being re-accessed.
Definition 4.2 (Useful Cache Block) A memory block m ∈ M is is a useful cache block in a point ui ∈ V if and only if it is cached in ui and reused in a point uj such that ui uj and m is not evicted before uj is reached.
A safe but pessimistic bound on CRPD is given by the assumption that preemptions evict all UCB.
useful not useful useful not useful useful m m
Figure 4.4: Useful Cache Blocks in a control flow graph
A set of UCB contains all memory blocks which may be cached at some point and may be reused at a later point in time. Figure 4.4 illustrates this notion. The white circles represent accesses to m along a control flow path. Block m is only classified “useful” at those program points from where an access to m is reachable in forward and backward direction without m being evicted along the path. We refer to the number of memory accesses distinct from m as distance between accesses to m.
We shall now formalize the notion of UCB. For an initial state c∅ ∈ Cs, a memory block m ∈ M is cached in a program point ui ∈ V if for a path π = (u0, . . . , ui, . . . , un), it holds that: agem,J(u0, . . . , ui)K(tf lru)(c ∅) ≤ K − 1 (4.21)
where age denotes the age of a block in a cache set state (Equation 4.2) and JπK(tflru) denotes the cache set semantics (Equation 4.4) on π. We define the set of all cached
Chapter 4. Cache Analysis 53
blocks, given a path π as:
cbπ: Π 7→ ℘(M) cbπ(π) = n m ∀m ∈ M : age(m,JπK(tf lru)(c∅)) ≤ K − 1o (4.22)
We define UCB via path-based collecting semantics. Recall that path-based semantics col→/←π (Definition 2.10 on page 11) denote sets of paths from CFG source to a point u or from CFG sink t u, respectively. Let UCB ⊆ V 7→ ℘(M) denote the function that map from points to sets of UCB, then we define:
ucb : UCB
ucb(u) =m∃π ∈ col→π (u) ∧ ∃π0∈ col←π (u) : m ∈ (cbπ(π) ∩ cbπ(π0)) (4.23) For reference, let set : m 7→ [1, N ] denote the cache set a memory block is mapped to, then we define the set of ucb in cache set s as:
ucbs: V 7→ ℘(M)
ucbs(u) = {m | m ∈ ucb(u), set(m) = s} (4.24)
Figure 4.5: UCB overestimation due to a single access
Note that Equation 4.23 overestimates UCB as illustrated in Figure 4.5: Let memory block m be accessed in program point u, then m will be classified useful regardless of an actual reuse in another program point.
CRPD due to a preemption in a point u is then denoted by: crpducbu : V × UCB 7→ N0
crpducbu (u, ucb) = |ucb(u)| × BRT (4.25)
In general, it is statically not possible to precisely determine the preemption point. Hence, a safe bound is denoted by [88]:
crpducb: UCB 7→ N0 crpducb(ucb) = max
u∈V crpd
ucb
u (u, ucb) (4.26)
Let # denote a bound on the number of preemptions, then a bound for the preempted execution of a task is given by:
Chapter 4. Cache Analysis 54
Figure 4.6: Relation of access classification (cf. Section 4.4.2) for CRPD and WCET Recall from Section 4.4.2 that for safe WCET computation all memory accesses classified “always miss” and “not classified” (not cache on all paths) must be accounted for as cache misses in uninterrupted execution. Similarly, function cbπ denotes all blocks classified “always hit” or “not classified” (cached on at least one path). Consequently, function WCETest and function crpd account for all accesses “not classified” twice, as illustrated in Figure 4.6. This gives rise to the notion of definitely cached useful cache blocks (DCUCB) [89] to mitigate redundancy in analysis, and which is defined as the constrained set of “useful” memory blocks classified as:
dcucb : V 7→ ℘(M)
dcucb(u) = {m ∈ ucb(u) ∧ acl(m, u) = ah} (4.28)
Note that Equation 4.27 is only safe if WCET and CRPD bounds are derived from the same cache analysis, which might not be possible in practice due to inaccessibility of proprietary WCET analyses frameworks.
Evicting Cache Blocks
Bound crpducb denotes an approximation on the number additional cache misses due to preemption by any task. It does not take into account the accesses that are actually performed by individual preempters, leading to unnecessary pessimism. In [78, 90], the consideration of evicting cache blocks (ECB) is proposed.
Definition 4.3 (Evicting Cache Block) A memory block m ∈ M is an evicting cache block if it is ever accessed on execution path π.
Consequently, let ECB ⊆ ℘(M) denote sets of memory blocks. Then the set of ECB is simply defined as the same of cached blocks along a path:
ecb : ECB
Chapter 4. Cache Analysis 55
For reference, we define the set of ECB per cache set s as: ecbs: ℘(M)
ecbs = {m | m ∈ ecb, set(m) = s} (4.30)
Symmetrically to function ucb, ecb is an upper bound on the number of additional misses for any preempted task. It is important to recognize that under LRU, a single eviction can cause up to associativity K additional cache misses in a preemptee [91]. This needs to be taken into account when deriving CRPD bounds.
(a) Non-preeemptive execution (b) Preemptive execution Figure 4.7: Example of miss behavior under LRU and preemption
Example Figure 4.7a illustrates the memory access sequence ι∗(π) of a task τ2 and the contents of a cache set for K = 4 under LRU into which the memory blocks are mapped. All initial accesses necessarily lead to compulsory cache misses. After that, all accesses are hits. As opposed to this, Figure 4.7b illustrates the effect of a single access by a preempting task τ1. All hits become misses since the accessed block has just been evicted before its access. Hence, the number of ECB is not a safe upper bound for additional cache misses.
Consequently, a safe bound on CRPD is given by the number of invalidated cache sets times associativity K:
crpdecb: ECB 7→ N0
crpdecb(ecb) = |{s | s ∈ N, ecbs6= ∅}| × K × BRT (4.31)
Originally [90], Equation 4.31 has been proposed for direct mapped caches only. Its extension [92] to non-direct mapped caches, however, has been shown to be unsound [91]. The bound given above has not been proposed before to the best of the author’s knowledge. Cache Block Resilience
Equation 4.31 is based on a safe but not very precise bound on the number of additional misses as it does not take UCB into account. From the example illustrated in Figure 4.7 we conclude that a possible improvement by taking UCB and ECB into account can
Chapter 4. Cache Analysis 56
be achieved by excluding cache sets with empty intersection of UCB and ECB [91]. Recall that otherwise we have to assume complete set invalidation to be safe. However, in [93], the authors recognize that by maintaining age information of UCBs, cache block resilience (CBR) with respect to preemption eviction can be computed to reduce over-approximation.
Figure 4.8: Example of block aging and distance metric of accesses
For an intuition, consider Figure 4.8 which illustrates two memory accesses to a block m on a path π. We ignore cba→/← for the moment. The distance between the two accesses equals 3. If we assume an associativity K = 4, then the resilience of m between the two accesses is (K − 1) − 3 = 0 in all potential preemption points. Consequently, any additional access due to preemption causes a cache miss upon the next access to m. For any shorter distance between two accesses, a block’s resilience increases: we can guarantee m is still cached after an access to some other block that maps into the same cache set if the distance is less than 3.
Definition 4.4 (Cache Block Resilience [93]) The maximal amount of additional accesses distinct from m ∈ M by preempting tasks without causing a cache miss upon the next access to m is referred to as cache block resilience (CBR).
Let us formalize this notion. To compute CBR, information on block ages need to be available in all program points. UCB (cf. Equation 4.23) do not include this information anymore. Let CBR ⊆ V 7→ Age denote the set of functions from program points to block ages. Then ages of memory blocks for a given execution path π are denoted by:
cbaπ: Π 7→ CBR
cbaπ(π) = λm . age(m,JπK(tflru)(c∅)) (4.32)
Recall that uncached memory blocks yield an age of ∞ and that a memory block m may only be cached in point u if m ∈ ucb(u). The maximal age (distance in terms of memory accesses) of a useful block m in point u for all forward paths leading to that point is then defined as:
cba→: V 7→ CBR
cba→(u) = λm .
max {cbaπ(π)(m) | π ∈ col→π (u)} if m ∈ ucb(u)
∞ otherwise
Chapter 4. Cache Analysis 57
Symmetrically, the maximal aging from point u to its next access is defined as: cba←: V 7→ CBR
cba←(u) = λm .
max {cbaπ(π)(m) | π · (u) ∈ col←π (u)} if m ∈ ucb(u)
∞ otherwise
(4.34)
We deviate from the original definition [93] in that we are collecting aging up to but not including respective program points in cba← — as opposed to cba→ — to avoid overestimation in the original proposal. Figure 4.8 illustrates the respective valuation of cba→/←.
CBR is the difference between the associativity (its maximal resilience) and the maximal distance between two accesses. We thus define a bound CBRmu for memory block m and program point u as:
cbr : V 7→ CBR
cbr(u) = λm . (K − 1) − min(K − 1, cba→(u)(m) + cba←(u)(m)) (4.35)
The sum of block ages denotes the distance, which is bounded by K − 1 such that distances equal or greater than K − 1 yield a CBR of 0.
CRPD in a program point u for a single preemption is then bounded by those whose UCB whose CBR is greater than the number of ECB. Formally, we define:
crpducb,cbr,ecbu : V × UCB × ECB × CBR 7→ N0
crpducb,cbr,ecbu (u, ucb, cbr, ecb) =X
s
|ucbs(u) \ {m | cbr(u)(m) ≥ ecbs(u)|}| (4.36)
A globally safe bound for a preemption is then denoted by the maximal interference over all program points:
crpducb,cbr,ecb: U CB × ECB × CBR 7→ N0
crpducb,cbr,ecb(ucb, cbr, ecb) = max
u∈V crpd
cbr
u (u, ucb, cbr, ecb) (4.37)
Summary
In this section we have provided an overview of the basic building blocks to bound CRPD and an initial notion of how these can be used to compute CRPD estimates. Note however, that the bounds so far only account preemptions by a single preempting task and merely serve the purpose of clarifying their conceptual use. For multiple preempters, in particular for bounds based on explicit interference of UCB and ECB, care has to be taken not to underestimate CRPD. We will address this subject in detail below. For a more detailed and general overview of various proposals, see [94]. In the following we leave the conceptual level and propose in detail a specific analysis framework for very precise CRPD estimation, addressing various shortcomings of existing approaches.
Chapter 4. Cache Analysis 58