On Control Flow Representation - Path Problems in Timing Analysis

5.2 Path Problems in Timing Analysis

5.2.2 On Control Flow Representation

We will now discuss important basics of control flow representation, characterize information loss in CFGs, and introduce path expressions as a convenient formal representation of path-related problems. In this context, we also put flow facts into perspective. This section forms the basis of the subsequent discussion of practical path analyses.

Control Flow Abstraction

In Section 2.3 we introduced control flow graphs as an abstraction of execution paths. To obtain a precise notion of control flow reconstruction from some representation, and of the role of flow facts in path analysis, we now formally define CFG abstraction. Let $\widehat{D}_\Pi$ denote a set of CFGs such that $G = (V, E, s, t) \in \widehat{D}_\Pi$ is a (sound) abstraction for a set of (concrete) paths $\Pi \in D_\Pi$. Without loss of generality, we assume that all paths share a common source and a common sink: $\forall (u_s, \ldots, u_t), (v_s, \ldots, v_t) \in \Pi : u_s = v_s \wedge u_t = v_t$.

We first define abstraction. Let $P$ denote a set of paths. Then let $f_v$ denote the set of all nodes that occur on some path in $P$:

\[ f_v := \lambda P .\, \{u \mid u \in \pi, \pi \in P\} \tag{5.18} \]

and let $f_e$ denote the set of consecutive pairs in all sequences:

\[ f_e := \lambda P .\, \{(u_i, u_{i+1}) \mid (u_i, u_{i+1}) \in \pi, \pi \in P\} \tag{5.19} \]

Then abstraction αΠ is defined as:

\[ \alpha_\Pi : D_\Pi \mapsto \widehat{D}_\Pi \]

\[ \alpha_\Pi(P) = (f_v(P), f_e(P), s, t) \tag{5.20} \]
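
The abstraction can be sketched in a few lines of Python. This is a minimal illustration, assuming that the abstraction pairs the node set and the edge set with the shared source and sink; the names `f_v`, `f_e`, `alpha` and `example_paths` are chosen here for illustration only.

```python
from typing import Iterable, Set, Tuple

Path = Tuple[str, ...]


def f_v(paths: Iterable[Path]) -> Set[str]:
    """Collect every node that occurs on any path (Equation 5.18)."""
    return {u for pi in paths for u in pi}


def f_e(paths: Iterable[Path]) -> Set[Tuple[str, str]]:
    """Collect every pair of consecutive nodes on any path (Equation 5.19)."""
    return {(pi[i], pi[i + 1]) for pi in paths for i in range(len(pi) - 1)}


def alpha(paths: Set[Path]):
    """Abstract a non-empty path set into a CFG (V, E, s, t)."""
    first = next(iter(paths))
    s, t = first[0], first[-1]
    # Assumption from the text: all paths share source and sink.
    assert all(pi[0] == s and pi[-1] == t for pi in paths)
    return f_v(paths), f_e(paths), s, t


# Hypothetical input: two paths through a diamond-shaped program.
example_paths = {("s", "a", "t"), ("s", "b", "t")}
V, E, s, t = alpha(example_paths)
```

The resulting graph only records which nodes and which consecutive pairs were seen; it no longer records which pairs belonged to the same path.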

Chapter 5. Path Analysis 94

We now define concretization, which computes a set of paths from a CFG. To this end, we define a helper function that recursively extends all paths in a given set according to a control flow relation $R$, for a source node $u$ and a terminal node $w$, as:

\[ p : V \times V \times V^2 \mapsto \wp(V^*) \]

\[ p(u, w, R) = \begin{cases} \{\pi \cdot (w) \mid \pi \in p(u, v, R), (v, w) \in R\} & \text{if } u \neq w \\ \{(u)\} & \text{otherwise} \end{cases} \tag{5.21} \]

Then concretization γΠ is defined as:

\[ \gamma_\Pi : \widehat{D}_\Pi \mapsto D_\Pi \]

\[ \gamma_\Pi(G) = p(s, t, E) \tag{5.22} \]
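
The recursive helper can be sketched as follows. The length bound `max_len` is not part of Equation 5.21; it is an assumption added here so that the recursion also terminates on cyclic graphs. The graph `diamond` is a hypothetical example.

```python
from typing import Set, Tuple

Node = str
Edge = Tuple[Node, Node]
Path = Tuple[Node, ...]


def p(u: Node, w: Node, R: Set[Edge], max_len: int) -> Set[Path]:
    """Paths from u to w along R that consist of at most max_len nodes."""
    if max_len < 1:
        return set()
    if u == w:
        return {(u,)}  # base case of Equation 5.21
    result = set()
    for (v, x) in R:
        if x == w:
            # Extend every path from u to a predecessor v of w by w.
            for pi in p(u, v, R, max_len - 1):
                result.add(pi + (w,))
    return result


def gamma(G, max_len: int) -> Set[Path]:
    """Bounded variant of the concretization in Equation 5.22."""
    V, E, s, t = G
    return p(s, t, E, max_len)


diamond = ({"s", "a", "b", "t"},
           {("s", "a"), ("s", "b"), ("a", "t"), ("b", "t")},
           "s", "t")
paths = gamma(diamond, max_len=4)
```

For an acyclic graph, any bound of at least the number of nodes recovers the full concretization.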

We also define the corresponding transformers for both domains. For the concrete domain $D_\Pi$, the transformer $tf_\Pi$ is just the path-based forward collecting semantics, defined as:

\[ tf_\Pi : V \mapsto D_\Pi \mapsto D_\Pi \]

\[ tf_\Pi(u) = \lambda P .\, \{\pi \cdot u \mid \pi \in P\} \tag{5.23} \]

Let $succ_\Pi(u) = \lambda P .\, \{v \mid (u, v) \in P\}$ denote the successors of $u$. Then the abstract transformer $\widehat{tf}_\Pi$ is defined as:

\[ \widehat{tf}_\Pi : V \mapsto \widehat{D}_\Pi \mapsto \widehat{D}_\Pi \]

\[ \widehat{tf}_\Pi(u) = \lambda G .\, (V \cup \{u\},\; E \cup \{(v, u) \mid v \in V \wedge u \in succ_\Pi(v)\}) \tag{5.24} \]

Correctness of the abstraction is easy to see. For the analysis to be sound, the abstract domain must be a poset, the abstraction must be sound, and the transformers must be locally consistent.

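1. $(\widehat{D}_\Pi, \sqsubseteq_G)$ is a poset:

The order relation $\sqsubseteq_G$ is defined as set inclusion: $G \sqsubseteq G' \Leftrightarrow V(G) \subseteq V(G') \wedge E(G) \subseteq E(G')$. $\widehat{D}_\Pi$ is a complete lattice with $\top = (V, V^2)$ and $\bot = (V, \emptyset)$, and with join $G \sqcup G' = (V(G) \cup V(G'), E(G) \cup E(G'))$.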

2. Abstraction αΠ and concretization γΠ form a sound abstraction:

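The abstract transformer is monotone by definition (set union), such that $\forall G, G' \in \widehat{D}_\Pi : G \sqsubseteq G' \Rightarrow \widehat{tf}_\Pi(u)(G) \sqsubseteq \widehat{tf}_\Pi(u)(G')$. By construction, a CFG overapproximates sets of paths: $\forall P \in D_\Pi : P \subseteq \gamma_\Pi(\alpha_\Pi(P))$.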

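3. The transfer functions $tf_\Pi$ and $\widehat{tf}_\Pi$ are locally consistent: $\forall G \in \widehat{D}_\Pi : \forall u \in V(G) : (tf_\Pi(u) \circ \gamma_\Pi)(G) \sqsubseteq (\gamma_\Pi \circ \widehat{tf}_\Pi(u))(G)$.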

Note also that abstraction and concretization form the Galois insertion $(D_\Pi, \alpha, \gamma, \widehat{D}_\Pi)$. Computing all possible paths is infeasible in general, so a CFG must be constructed by means of sound heuristics. The problem of control flow reconstruction is to find a suitable approximation $\widehat{succ}_\Pi : V \mapsto \wp(V)$ such that $\forall u \in V : succ_\Pi(u) \subseteq \widehat{succ}_\Pi(u)$, which enables the construction of a sound CFG by Equation 5.24.
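
The effect of an over-approximated successor relation can be illustrated with a small worklist construction. This is only a sketch: the function `build_cfg` and both successor maps are hypothetical, with `approx` standing in for a heuristic that cannot rule out the target `b` of an indirect branch at `s`.

```python
def build_cfg(s, succ):
    """Build (V, E) by following a successor map from the source node s."""
    V, E = {s}, set()
    work = [s]
    while work:
        v = work.pop()
        for u in succ.get(v, ()):
            E.add((v, u))  # add the edge (v, u), as in the abstract transformer
            if u not in V:
                V.add(u)
                work.append(u)
    return V, E


# Exact successors (unknown to a real analysis) and a sound over-approximation.
exact = {"s": {"a"}, "a": {"t"}}
approx = {"s": {"a", "b"}, "a": {"t"}, "b": {"t"}}

V_exact, E_exact = build_cfg("s", exact)
V_approx, E_approx = build_cfg("s", approx)
```

Because every exact successor is also an approximated successor, the exact graph is contained in the approximated one with respect to the inclusion order on nodes and edges.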


Figure 5.8: Two example graphs to demonstrate (a) unboundedness and (b) infeasibility

Unbounded Concretization, Imprecision and Flow Facts

Let us characterize the loss of information due to abstraction of paths by means of CFGs. First, it is easy to see that without further constraints, cycles yield unbounded sets of paths.

Lemma 5.23 For a cyclic CFG G, its concretization γΠ(G) is unbounded.

Example Consider the CFG illustrated in Figure 5.8a. According to Equation 5.22, concretization yields an unbounded set of unbounded paths:

\[
\begin{aligned}
\gamma_\Pi(G) &= p(s, t, E) \\
&= \{p(s, u, E) \cdot (t)\} \\
&= \{p(s, s, E) \cdot (u, t),\; p(s, u, E) \cdot (u, t)\} \\
&= \{(s, u, t),\; p(s, u, E) \cdot (u, t),\; p(s, u, E) \cdot (u, u, t), \ldots\} \\
&= \{(s, u, t),\; p(s, s, E) \cdot (u, u, t),\; p(s, s, E) \cdot (u, u, u, t), \ldots\} \\
&= \ldots \\
&= \{(s, u, t), (s, u, u, t), (s, u, u, u, t), \ldots\}
\end{aligned}
\]
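
The growth of the path set can be observed by enumerating paths up to an increasing length bound. The sketch below assumes that the graph of Figure 5.8a consists of the edges s→u, u→u and u→t, which is what the derivation above suggests; the function `paths_up_to` is a hypothetical helper.

```python
from collections import deque


def paths_up_to(E, s, t, max_len):
    """All paths from s to t with at most max_len nodes (forward search)."""
    succ = {}
    for (a, b) in E:
        succ.setdefault(a, []).append(b)
    found = set()
    queue = deque([(s,)])
    while queue:
        pi = queue.popleft()
        if pi[-1] == t:
            found.add(pi)
            continue
        if len(pi) < max_len:
            for nxt in succ.get(pi[-1], []):
                queue.append(pi + (nxt,))
    return found


# Assumed edge set of Figure 5.8a: a self-loop on node u.
E = {("s", "u"), ("u", "u"), ("u", "t")}
counts = [len(paths_up_to(E, "s", "t", n)) for n in range(3, 8)]
```

Each increment of the bound admits one more traversal of the self-loop, so no finite bound exhausts the concretization.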

Second, another source of imprecision is infeasible (mutually exclusive) paths. Since a CFG does not encode execution history per se, its concretization contains all structurally possible but — under execution — potentially infeasible paths.

Lemma 5.24 Even for a set of acyclic paths Πs,t = {(s, . . . , t)}, it holds that Πs,t ⊆ γΠ(αΠ(Πs,t)).

Example Consider Figure 5.8b, which depicts the CFG $G = \alpha_\Pi(P)$ corresponding to the set of paths $P = \{(s, a, b, c, t), (s, a, x, b, c, t), (s, a, b, y, c, t)\}$. Evidently, nodes $x$ and $y$ are mutually exclusive in the concrete semantics, but concretization yields the set $\gamma_\Pi(G) = P \cup \{(s, a, x, b, y, c, t)\}$, which includes all structurally possible paths.
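
This example is small enough to replay directly. The sketch below abstracts the three paths into an edge set and enumerates all source-to-sink paths of the resulting acyclic graph; the helper names `edges_of` and `all_paths` are hypothetical.

```python
def edges_of(paths):
    """Edge set of the CFG obtained by abstracting the given paths."""
    return {(pi[i], pi[i + 1]) for pi in paths for i in range(len(pi) - 1)}


def all_paths(E, s, t):
    """All paths from s to t; terminates only for acyclic edge sets."""
    succ = {}
    for a, b in E:
        succ.setdefault(a, []).append(b)

    def walk(prefix):
        if prefix[-1] == t:
            yield prefix
            return
        for nxt in succ.get(prefix[-1], []):
            yield from walk(prefix + (nxt,))

    return set(walk((s,)))


P = {("s", "a", "b", "c", "t"),
     ("s", "a", "x", "b", "c", "t"),
     ("s", "a", "b", "y", "c", "t")}
concretized = all_paths(edges_of(P), "s", "t")
spurious = concretized - P
```

The only path in `spurious` is the one that visits both `x` and `y`, which is exactly the structurally possible but infeasible path named in the example.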

Additional information needed to obtain a sound and tight concretization is referred to as flow facts [4, 136–138]. These include, but are not necessarily restricted to, constraints that bound the repetition of nodes on paths to obtain bounded path sets, and constraints that denote mutual exclusion. We cumulatively refer to this subset of flow facts as flow constraints.


Path Expressions

We shall formalize the notion of flow constraints to establish a well-defined connection between program structure, flow constraints and path problems, which subsequently directly leads us to matters of practical path analysis.

Without loss of generality, we assume reducibility (see [9] for irreducible graphs). For practical reasons, we use original notation in the following.

Every path in a CFG $G = (V, E, s, t)$ can be interpreted as a string over its edges³ $E$. For nodes $u, v \in V$, a path expression [139] is a regular expression [17] $P$ of type $(u, v)$ (written as $P(u, v)$) such that every string $\pi$ in the language $L(P) \subseteq V^*$ is a path from $u$ to $v$. Note that $L(P)$ equals the concretization $\gamma_\Pi(G)$ (Equation 5.22).

Let P (u, v) be a path expression of type (u, v). Then subexpressions P1 and P2 of P are also path expressions whose type is recursively defined by the productions:

P (u, v) := P1(u, v) ∪ P2(u, v) (5.25)

P (u, w) := P1(u, v) · P2(v, w) (5.26)

P (u, u) := P∗(u, u) (5.27)

These rules define alternative paths (5.25), concatenation (5.26) and repetition (5.27), respectively. Complete path expressions describe all structurally possible (but yet unconstrained) paths of a CFG.

The underlying algebraic structure of path expressions is a Kleene algebra [140] (an idempotent semiring with an additional "Kleene closure" operator) $(E, \cup, \emptyset, \cdot, \epsilon, {}^*)$, where $\cup$ is addition with neutral element $\emptyset$, $\cdot$ is multiplication with neutral element $\epsilon$ (the empty string), and the additional operator ${}^*$ denotes repetition (Kleene closure). The order of operator precedence is ${}^* > \cdot > \cup$. For convenience, we omit $\cdot$ for multiplication, and parentheses where possible.
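
The algebraic laws can be checked on a small model in which a path expression is represented by its language, that is, a finite set of edge strings. This is a sketch under stated assumptions: the Kleene closure is truncated after `max_iter` repetitions to keep the sets finite, and the edges of `body` are hypothetical.

```python
EMPTY = frozenset()          # the empty language, neutral element of union
EPSILON = frozenset({()})    # the language containing only the empty string


def edge(u, v):
    """Language with a single string that consists of the edge (u, v)."""
    return frozenset({((u, v),)})


def union(a, b):
    return a | b


def concat(a, b):
    return frozenset(x + y for x in a for y in b)


def star(a, max_iter):
    """Kleene closure, truncated after max_iter repetitions."""
    result = current = EPSILON
    for _ in range(max_iter):
        current = concat(current, a)
        result = result | current
    return result


# Hypothetical loop body with two alternative round trips through head h.
body = union(concat(edge("h", "a"), edge("a", "h")),
             concat(edge("h", "b"), edge("b", "h")))
unrolled = star(body, max_iter=2)
```

With two alternatives per iteration, `unrolled` contains the empty string, two strings for one iteration and four strings for two iterations.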

The construction of path expressions corresponds to structural analysis of loops, since we are not just recovering paths as in the concretization $\gamma_\Pi(G)$ (Equation 5.22) but also representing repetitions more efficiently.

For the CFG with edges classified by DFS, let $E_F = E \setminus B = T \cup F \cup C$ refer to the set of non-back edges and let $H \subseteq V$ denote the loop heads. Then the path expression $P$ for a reducible⁴ CFG from source $s$ to sink $t$ is recursively defined as:

\[ P(s, t) = \begin{cases} \bigcup_{(u,t) \in E_F} P(s, u)(u, t) & \text{if } t \notin H \wedge s \neq t \\ \left( \bigcup_{(u,t) \in E_F} P(s, u)(u, t) \right) P(t, t) & \text{if } t \in H \wedge s \neq t \\ \epsilon & \text{if } s = t \end{cases} \tag{5.28} \]

\[ P(h, h) = \left( \bigcup_{(b,h) \in E \setminus E_F} P(h, b)(b, h) \right)^* \tag{5.29} \]

³ Path expressions are defined over edges, but reduction to nodes is straightforward [105].


A path expression P (s, t) in the acyclic case (Equation 5.28) is the union of all paths leading to the predecessors u of node t and the edges leading to t. In the cyclic case (Equation 5.29), if some node t is a loop head, then P (s, t) is a prefix of all paths in the loop body (P (h, h)) from the head to its bottoms, back to its head. The expression P (h, b) denotes the kernels of the loop and every exit path is represented by the expression P (h, t) for a head h and some exit node t.
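
The recursive construction can be sketched as a function that prints the resulting expression as a string. This is an illustrative reading of Equations 5.28 and 5.29, not a reference implementation: it assumes that back edges and loop heads have already been determined (for example by DFS), that the graph is reducible, and that the case s = t yields the empty string. The example graph with a single loop is hypothetical.

```python
EPS = "ε"


def path_expression(E, back_edges, heads, s, t):
    """Path expression from s to t as a string, for a reducible CFG."""
    forward = E - back_edges  # E_F: all non-back edges

    def cat(x, y):
        # Concatenation; None models the empty language, EPS is neutral.
        if x is None or y is None:
            return None
        if x == EPS:
            return y
        if y == EPS:
            return x
        return x + y

    def join(alternatives):
        # Union of alternatives, returned with the number of alternatives.
        alts = sorted({a for a in alternatives if a is not None})
        return (" ∪ ".join(alts), len(alts)) if alts else (None, 0)

    def loop(h):
        # Equation 5.29: repetition of all kernels from h back to h.
        body, _ = join(cat(prefix(h, b), f"({b},{h})")
                       for (b, x) in back_edges if x == h)
        return EPS if body is None else f"({body})*"

    def prefix(src, dst):
        # Equation 5.28: union over the forward predecessors of dst.
        if src == dst:
            return EPS
        expr, n = join(cat(prefix(src, u), f"({u},{dst})")
                       for (u, x) in forward if x == dst)
        if n > 1:
            expr = f"({expr})"
        return cat(expr, loop(dst)) if dst in heads else expr

    return prefix(s, t)


# Hypothetical CFG: s -> h, loop h -> a -> h, exit h -> t.
E = {("s", "h"), ("h", "a"), ("a", "h"), ("h", "t")}
expr = path_expression(E, back_edges={("a", "h")}, heads={"h"}, s="s", t="t")
```

For this graph the function returns `(s,h)((h,a)(a,h))*(h,t)`: the prefix up to the loop head, the repeated kernel, and the exit edge.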

Figure 5.9: Example graph for flow bounds

Example Consider the graph illustrated in Figure 5.9. The minimal path expression for this graph is P = (s, h)((h, a)(a, c) ∪ (h, b)(b, c))∗(c, t). Note that Equation 5.28 yields a much larger but equivalent expression. In particular, common prefixes are not factored out and the final loop iteration is represented explicitly although, in this graph, it equals the kernel expression.
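
Since path expressions are regular expressions, membership in the path language can be tested with an ordinary regex engine. The sketch below encodes the minimal expression from this example with Python's `re` module; the mapping of edges to single letters in `LABEL` is an arbitrary choice made for illustration.

```python
import re

# One letter per edge of the minimal path expression.
LABEL = {("s", "h"): "S", ("h", "a"): "A", ("a", "c"): "C",
         ("h", "b"): "B", ("b", "c"): "D", ("c", "t"): "T"}

# (s,h)((h,a)(a,c) ∪ (h,b)(b,c))*(c,t) in regex syntax.
PATH_EXPR = re.compile(r"S(?:AC|BD)*T")


def in_language(edges):
    """True if the edge string is a word of the path expression."""
    word = "".join(LABEL.get(e, "?") for e in edges)
    return PATH_EXPR.fullmatch(word) is not None


two_iterations = [("s", "h"), ("h", "a"), ("a", "c"),
                  ("h", "b"), ("b", "c"), ("c", "t")]
broken = [("s", "h"), ("h", "a"), ("b", "c"), ("c", "t")]
```

Here `in_language(two_iterations)` holds, while `in_language(broken)` does not, because the second string mixes the two alternatives within one iteration.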

The path language $L(P)$ is unbounded for cyclic graphs, and flow constraints must be introduced to obtain feasible solutions. Let the indicator function $\mathbf{1}_e$ be defined as $\mathbf{1}_e(u) = 1$ if $e = (u, \cdot) \in E$, and $0$ otherwise. Then the multiplicity (or frequency) $m_\pi$ of a node $u \in V$ on a path $\pi$ is defined as:

\[ m_\pi(u) = \sum_{(u, \cdot) \in \pi} \mathbf{1}_{(u, \cdot)}(u) \tag{5.30} \]

Given a set of flow constraints $\mathcal{C}$, the subset of structurally possible paths $L(P, \mathcal{C}) \subseteq L(P)$ that satisfy the constraints is then denoted by:

\[ L(P, \mathcal{C}) = \{\pi \in L(P) \mid \forall C \in \mathcal{C} : C(m_\pi)\} \tag{5.31} \]
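
Multiplicities and the constrained language can be sketched as a counter plus a filter. The following assumes that a path is given as a tuple of edges and that a constraint is a predicate on the multiplicity function; the helper `iterate`, which builds edge strings for a loop with two branches `a` and `b`, and the candidate list are hypothetical.

```python
from collections import Counter


def multiplicity(path):
    """m_pi: for every node u, the number of edges (u, _) on the path."""
    return Counter(u for (u, _v) in path)


def constrained_language(paths, constraints):
    """Keep the paths whose multiplicities satisfy every constraint."""
    return [pi for pi in paths
            if all(c(multiplicity(pi)) for c in constraints)]


def iterate(branches):
    """Edge string for loop iterations that take the given branches."""
    pi = [("s", "h")]
    for x in branches:
        pi += [("h", x), (x, "c")]
    return tuple(pi + [("c", "t")])


# A finite sample of the (unbounded) path language.
candidates = [iterate(w) for w in ("", "a", "b", "ab", "aaa", "abb")]

# Two flow bounds on the branch nodes a and b.
C = [lambda m: 0 <= m["a"] <= 2,
     lambda m: 1 <= m["b"] <= 5]
feasible = constrained_language(candidates, C)
```

Only the candidates that take branch `b` at least once and branch `a` at most twice remain. A `Counter` returns 0 for nodes that do not occur, which matches the definition of the multiplicity.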

A constraint set $\mathcal{C}'$ is an approximation of $\mathcal{C}$ if $L(P, \mathcal{C}) \subseteq L(P, \mathcal{C}')$. A typical approximation of flow bounds that constrain the frequencies of individual nodes is given by loop bounds, which only constrain the frequencies of loop heads and thereby lift the specification of constraints to entire loops.

Example Reconsider Figure 5.9, given the constraints $\mathcal{C} = \{0 \leq m_\pi(a) \leq 2, 1 \leq m_\pi(b) \leq 5\}$. A sound approximation is $\mathcal{C}_L = \{\min(0, 1) \leq m_\pi(h) \leq \max(2, 5)\}$ such that $L(P, \mathcal{C}) \subseteq L(P, \mathcal{C}_L)$.

Node infeasibility for a node $u$ is expressed as the constraint $\{m_\pi(u) = 0\}$. Path infeasibility can be expressed as mutual exclusion of nodes.


Example For Figure 5.9, let CX = {¬(mπ(a) > 0 ∧ mπ(b) > 0)}, then it holds that L(P, C ∪ CX) ⊆ L(P, C) ⊆ L(P, CL).
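
The tightening effect of the exclusion constraint can be observed on a bounded sample of the path language. The sketch below enumerates all loop traversals with at most three iterations over two branches `a` and `b`; this bound, the helper names and the edge layout are assumptions made for illustration, and the loop bound approximation is not modeled here.

```python
from collections import Counter
from itertools import product


def path(branches):
    """Edge string for loop iterations that take the given branches."""
    pi = [("s", "h")]
    for x in branches:
        pi += [("h", x), (x, "c")]
    return tuple(pi + [("c", "t")])


def satisfies(pi, constraints):
    m = Counter(u for (u, _v) in pi)  # multiplicities of source nodes
    return all(c(m) for c in constraints)


# Sample of L(P): every branch sequence of length 0 to 3.
L_P = {path(w) for n in range(4) for w in product("ab", repeat=n)}

C = [lambda m: 0 <= m["a"] <= 2, lambda m: 1 <= m["b"] <= 5]
C_X = [lambda m: not (m["a"] > 0 and m["b"] > 0)]  # a and b exclude each other

L_C = {pi for pi in L_P if satisfies(pi, C)}
L_CX = {pi for pi in L_P if satisfies(pi, C + C_X)}
```

Within this sample, adding the exclusion constraint leaves only the paths that iterate through `b` alone, so `L_CX` is a strict subset of `L_C`.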

Depending on the constraint model, different degrees of tightness can be achieved. In [137], path expressions are used to define a formal framework for parametric WCET formulae. Besides flow bounds and exclusion constraints, [141] proposes a model with conditional constraints that enables the modeling of flows depending on loop iteration numbers, which allows the partitioning of loop iterations. Value constraints, proposed in [138], further increase the level of accuracy for value-dependent (dynamic) control flow. In [142], a flow bound model is extended by predicate logic to express the effect of software configurations on path feasibility.

In document Tight integration of cache, path and task-interference modeling for the analysis of hard real time systems (Page 105-110)