Abstract Interpretation of Basic Program Units

An instance of the data-flow model for a particular program is built from a set of nodes N and a set of edges, which correspond to the set of program labels L ∈ (N ↦ ℘(N)). We assume that the resulting model instance is a connected directed graph ⟨N, L⟩. In [32], Cousot defined the following elementary program units: a single entry node, exit nodes, assignment nodes, test nodes, simple junction nodes and loop junction nodes. As required by Kleene’s first recursion theorem, the evaluation of expressions, both in assignment nodes that can change the environment and in test nodes, has no side effects.

The abstract interpretation of basic program units is given by an interpretation I♯. Given any assignment or test node n ∈ N and an input state vector σ ∈ Σ, I♯ returns an output context I♯(n, σ), or two different output contexts when n is a test node. Since any state vector contains all the labelled abstract environments, each one identified by some node n, the consistency of I♯ with respect to the collecting interpretation I, which is given by I ⊆ ⟨α, γ⟩ I♯, imposes local consistency of the interpretations I and I♯ at the level of basic language constructs. Let Na be the set of assignment nodes. Then, for all n ∈ Na, an assignment is of the form:

$$v = f(v_1, \ldots, v_m)$$

where (v, v1, . . . , vm) ∈ V^(m+1) and f(v1, . . . , vm) is an expression of the language depending on the variables v1, . . . , vm. Let ρ be the abstract environment found in the state vector σ ∈ Σ at the program node n. Let n⃗ be the outgoing node of the assignment node n. We use the square bracket notation map[k ↦ e] to denote the update of the contents of a map at the key k with the new element e. For a program state map σ and an abstract environment map ρ, it follows that:

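$$\forall \sigma \in \Sigma,\ \forall i \in V,\quad i \neq v \implies I^\sharp(n, \sigma)(i) = \sigma \tag{3.13}$$

and

$$\forall \sigma \in \Sigma,\quad I^\sharp(n, \sigma)(v) = \sigma\bigl[\, \vec{n} \mapsto \rho\bigl[\, v \mapsto \alpha\bigl(\{ f(v_1, \ldots, v_m) \mid (v_1, \ldots, v_m) \in \gamma(\rho(v_1)) \times \cdots \times \gamma(\rho(v_m)) \}\bigr) \bigr] \bigr] \tag{3.14}$$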

The absence of side effects in the abstract interpretation of the expression f(v1, . . . , vm) is expressed by Def. (3.13). The local consistency of I♯ is expressed by Def. (3.14): the abstract value p♯ of v in the output environment is the abstraction of the set of values of the expression f(v1, . . . , vm) when the values (v1, . . . , vm) are chosen from the input abstract context ρ, such that ρ′ = ρ[v ↦ p♯]; finally, the output state vector σ′ is updated with the output abstract environment at the program label n⃗, such that σ′ = σ[n⃗ ↦ ρ′].
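The shape of Def. (3.14) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract domain is a small interval domain over integers (a pair `(lo, hi)`, with `None` as the empty set), `alpha` and `gamma` enumerate finite sets, and the names `assign`, `n1` and `n2` are hypothetical. A real analyzer would compute the abstract result symbolically rather than by enumerating the concretization.

```python
from itertools import product

def alpha(values):
    """Abstraction: smallest interval containing a finite set of ints."""
    values = list(values)
    return (min(values), max(values)) if values else None

def gamma(interval):
    """Concretization: the ints described by a finite interval."""
    if interval is None:
        return range(0)
    lo, hi = interval
    return range(lo, hi + 1)

def assign(sigma, n, n_out, v, f, args):
    """Abstract assignment v = f(*args) at node n, in the shape of Def. (3.14)."""
    rho = sigma[n]
    # Evaluate f on every tuple drawn from gamma(rho(v1)) x ... x gamma(rho(vm)).
    results = {f(*vals) for vals in product(*(gamma(rho[a]) for a in args))}
    rho_out = {**rho, v: alpha(results)}   # rho' = rho[v -> p#]
    return {**sigma, n_out: rho_out}       # sigma' = sigma[n_out -> rho']

sigma = {"n1": {"x": (0, 0), "y": (1, 3), "z": (10, 20)}}
sigma2 = assign(sigma, "n1", "n2", "x", lambda y, z: y + z, ["y", "z"])
print(sigma2["n2"])
```

For the assignment x = y + z with y in [1, 3] and z in [10, 20], the environment at the outgoing label maps x to the interval (11, 23), while y and z keep their input values and the environment at `n1` is left unchanged.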

Let Nt be the set of test nodes. Then, for all n ∈ Nt and given an input state vector σ ∈ Σ, the abstract interpretation I♯(n, σ) results in two output state vectors σT and σF, associated with the “true” and “false” edges respectively:

[Figure: test node Q(v1, . . . , vm) with two outgoing edges, carrying σT (“true”) and σF (“false”)]

where Q(v1, . . . , vm) is a boolean expression without side effects depending on the variables v1, . . . , vm. Let nT and nF be the program labels of the outgoing nodes of the test node n ∈ Nt. Also let ρ be the abstract environment found in the input state vector σ at the label n. Then, we define I♯(n, σ) = (σT, σF) such that for all v ∈ V we have:

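$$\sigma_T(v) = \sigma\bigl[\, n_T \mapsto \rho\bigl[\, v \mapsto \alpha\bigl(\{ t \mid t \in \gamma(\rho(v)) \wedge (\exists (v_1, \ldots, v_m) \in \gamma(\rho(v_1)) \times \cdots \times \gamma(\rho(v_m)) \mid Q(v_1, \ldots, v_m)) \}\bigr) \bigr] \bigr] \tag{3.15}$$

$$\sigma_F(v) = \sigma\bigl[\, n_F \mapsto \rho\bigl[\, v \mapsto \alpha\bigl(\{ t \mid t \in \gamma(\rho(v)) \wedge (\exists (v_1, \ldots, v_m) \in \gamma(\rho(v_1)) \times \cdots \times \gamma(\rho(v_m)) \mid \neg Q(v_1, \ldots, v_m)) \}\bigr) \bigr] \bigr] \tag{3.16}$$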

In this way we achieve a path-sensitive data-flow analysis in which, on the “true” edge, the abstract value of a variable v is the abstraction of the set of values t chosen in the input context ρ for which the evaluation of the predicate Q may yield the boolean value True. On the converse path, the state vector contains the abstract value of the variable v when the predicate Q yields False.
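The filtering performed at a test node can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the thesis's implementation: it uses a small interval domain over integers (`None` is the empty set), the names `branch`, `split_at_test`, `nT` and `nF` are hypothetical, and it reads the existential in Defs. (3.15) and (3.16) so that, when v occurs in Q, the value t is the component of v in the satisfying tuple.

```python
from itertools import product

def alpha(values):
    """Abstraction: smallest interval containing a finite set of ints."""
    values = list(values)
    return (min(values), max(values)) if values else None

def gamma(interval):
    """Concretization: the ints described by a finite interval."""
    if interval is None:
        return range(0)
    lo, hi = interval
    return range(lo, hi + 1)

def branch(rho, q, args, outcome):
    """Environment on the edge where q(*args) may evaluate to `outcome`."""
    # Tuples from gamma(rho(v1)) x ... x gamma(rho(vm)) with the wanted outcome.
    sat = [vals for vals in product(*(gamma(rho[a]) for a in args))
           if q(*vals) == outcome]
    out = {}
    for v in rho:
        if v in args:
            i = args.index(v)
            out[v] = alpha({vals[i] for vals in sat})  # project tuples onto v
        else:
            out[v] = rho[v] if sat else None           # edge unreachable: bottom
    return out

def split_at_test(sigma, n, n_true, n_false, q, args):
    """Return (sigma_T, sigma_F) for the test node n."""
    rho = sigma[n]
    sigma_t = {**sigma, n_true: branch(rho, q, args, True)}
    sigma_f = {**sigma, n_false: branch(rho, q, args, False)}
    return sigma_t, sigma_f

sigma = {"n": {"x": (0, 10), "y": (5, 5), "z": (7, 8)}}
sigma_t, sigma_f = split_at_test(sigma, "n", "nT", "nF",
                                 lambda x, y: x < y, ["x", "y"])
print(sigma_t["nT"])
print(sigma_f["nF"])
```

With the predicate x < y, x in [0, 10] and y equal to 5, the “true” edge restricts x to (0, 4) and the “false” edge restricts x to (5, 10); the variable z, which does not occur in the predicate, keeps its input value on both edges.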

As already mentioned, an abstract interpretation I♯ is the least fixpoint solution of a system of recursive data-flow equations. In practice, these equations are iteratively evaluated according to an order of information propagation, in terms of program labels N, across the basic program units. To this end, the transitive closure of the state transition function is used. Therefore, an abstract interpretation amounts to the computation of the limits of Kleene’s sequences along all paths allowed in the program.

There are two possible solutions to the system of recursive equations [98]. The merge over all paths (MOP) solution computes all possible functional compositions of the state transition function along all program paths. For each node n ∈ N inside a path, the transition function is evaluated for every input state and the resulting outputs are combined with the least upper bound operator ⊔. In general there is an infinite number of program paths, which makes the MOP solution not computable. The solution to this problem is to compute an approximation of the optimal MOP solution, called the minimum fixed point (MFP). In this case, the least upper bound of the incoming states is computed before applying the next state transition function.

To demonstrate the computation of the MFP solution, we next describe the abstract interpretation of an execution path that eventually follows from a test node. From the test node, two execution paths are created, each one to be evaluated in pseudo-parallel until reaching either an exit node, in which case the execution of the path ends, or a junction node, in which case the pseudo-parallel execution paths are synchronized. As described before, in order to compute the output state vector of a junction node, we must first compute the least upper bound of the input state vectors of the incoming edges that may be reached by an execution path. For a simple junction node n, we combine all input state vectors point-wise for the program labels {1, 2, . . . , m}:

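$$\sigma = \dot{\bigsqcup}_{i \in [1,m]} \sigma_i$$

[Figure: simple junction node n with incoming state vectors σ1, σ2, . . . , σm and outgoing state vector σ]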
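The point-wise join at a junction, and the loss of precision of MFP with respect to MOP that it can cause, can be illustrated with a small Python sketch. The interval domain, the two branch environments and the assignment c = a + b placed after the junction are all assumptions chosen for the example, and a single environment stands in for the full state vector to keep the sketch short.

```python
def join(a, b):
    """Least upper bound of two intervals; None is bottom (empty set)."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def join_env(r1, r2):
    """Point-wise join of two environments over the same variables."""
    return {v: join(r1[v], r2[v]) for v in r1}

def transfer(rho):
    """Abstract effect of the (hypothetical) assignment c = a + b."""
    a, b = rho["a"], rho["b"]
    return {**rho, "c": (a[0] + b[0], a[1] + b[1])}

# Environments reaching the junction along the "true" and "false" branches.
rho_true = {"a": (1, 1), "b": (2, 2), "c": None}
rho_false = {"a": (2, 2), "b": (1, 1), "c": None}

# MOP: apply the transfer function along each path, then join the results.
mop = join_env(transfer(rho_true), transfer(rho_false))

# MFP: join the incoming environments first, then apply the transfer function.
mfp = transfer(join_env(rho_true, rho_false))

print(mop["c"], mfp["c"])
```

Along each path c is exactly 3, so the MOP result for c is (3, 3). Joining first gives a and b the interval (1, 2), so the MFP result for c is (2, 4), a sound but less precise approximation.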

The limit of a Kleene execution sequence in the abstract domain is computed by a transitive closure that traverses the directed graph ⟨N, L⟩ and applies the state transformation function to each of the different types of nodes. This transformation function specifies the state(s) for the outgoing edges of a node in terms of the state(s) associated with the incoming edges to that node. The algorithm recursively applies the state transformation function until all abstract environments ρ inside a state vector σ stabilize with respect to ⊔̇ (the point-wise version of the least upper bound ⊔). Termination of this algorithm follows from two facts: on the one hand, sequences of state vectors form an ascending chain that stabilizes, possibly after using widening/narrowing operators; on the other hand, we assume that every loop contains a junction node.
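As a minimal sketch of this iteration, consider the hypothetical program `x = 0; while x < 10: x = x + 1`, analysed with intervals. The labels (`entry`, `head`, `true`, `body`, `exit`), the interval domain and the helper names are assumptions made for the example. Since there is a single variable, each label is mapped directly to the interval of x. No widening is used, because the loop bound keeps the ascending chain finite here.

```python
INF = float("inf")

def join(a, b):
    """Least upper bound of two intervals; None is bottom."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def meet(a, b):
    """Greatest lower bound of two intervals; None is bottom."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def step(s):
    """One application of the state transformation function to all labels."""
    head = join(s["entry"], s["body"])        # loop junction node
    true = meet(head, (-INF, 9))              # test x < 10, "true" edge
    body = None if true is None else (true[0] + 1, true[1] + 1)  # x = x + 1
    exit_ = meet(head, (10, INF))             # test x < 10, "false" edge
    return {"entry": (0, 0), "head": head, "true": true,
            "body": body, "exit": exit_}

def kleene(f, bottom):
    """Iterate f from bottom until the state vector stabilizes."""
    state = bottom
    while True:
        nxt = f(state)
        if nxt == state:
            return state
        state = nxt

bottom = dict.fromkeys(["entry", "head", "true", "body", "exit"])
fix = kleene(step, bottom)
print(fix)
```

The iteration stabilizes with x in (0, 10) at the loop head, (0, 9) on the “true” edge, (1, 10) after the loop body and (10, 10) at the exit.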

The specification of an order of information propagation may lead to the instantiation of an optimal transitive closure algorithm. In [20], the notion of weak topological order (w.t.o.) is defined in terms of a total dominance order ⪯ that exists between the basic program units of a particular program, according to the information contained in the set of program labels N. Let i → j denote an edge along which it is possible to jump from point i to point j by executing a single program step. An edge u → v such that v ⪯ u is called a feedback edge. For example, a program loop is a well-parenthesized component in the w.t.o., with the program point v as its head and containing the program point u.

The main advantage of a dominance order is that any sequential algorithm à la Gauss-Seidel [36] can be used to compute the limits of Kleene sequences. Such algorithms are called chaotic iteration algorithms and differ in the particular choice of the order in which the data-flow equations are applied. Two iteration strategies are enumerated in [20]: (1) the iterative strategy simply applies the equations in sequence and stabilizes outermost components; (2) the recursive strategy recursively stabilizes the subcomponents of every component every time the component is stabilized. Of major relevance in our work is the fact that, with the recursive strategy, the stabilization of a component of a w.t.o. can be detected by the stabilization of its head.
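The recursive strategy can be sketched in Python as follows. This is an illustrative reading of the strategy rather than the algorithm of [20] verbatim: the w.t.o. is encoded as a nested list whose sublists are components with their head first, the equations are those of the hypothetical program `x = 0; while x < 10: x = x + 1` over intervals, and all names are assumptions. The stabilization of a component is detected by checking its head only.

```python
INF = float("inf")

def join(a, b):
    """Least upper bound of two intervals; None is bottom."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), max(a[1], b[1]))

def meet(a, b):
    """Greatest lower bound of two intervals; None is bottom."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

def shift(a, k):
    """Abstract effect of adding the constant k."""
    return None if a is None else (a[0] + k, a[1] + k)

# One data-flow equation per program label (interval of the variable x).
equations = {
    "entry": lambda s: (0, 0),
    "head": lambda s: join(s["entry"], s["body"]),
    "true": lambda s: meet(s["head"], (-INF, 9)),
    "body": lambda s: shift(s["true"], 1),
    "exit": lambda s: meet(s["head"], (10, INF)),
}

# Weak topological order: the sublist is the loop component, head first.
wto = ["entry", ["head", "true", "body"], "exit"]

def stabilize(order, state):
    """Apply the equations following the recursive strategy."""
    for elem in order:
        if isinstance(elem, str):
            state[elem] = equations[elem](state)
        else:
            head, *rest = elem
            while True:
                state[head] = equations[head](state)
                stabilize(rest, state)
                # The component is stable as soon as its head is stable.
                if equations[head](state) == state[head]:
                    break
    return state

result = stabilize(wto, dict.fromkeys(equations))
print(result)
```

In this run the `exit` equation is evaluated only once, after the loop component has stabilized with the head at (0, 10), whereas a strategy that re-applies every equation in each round would re-evaluate it on every pass.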
