6.6 Value Analysis
6.6.3 Abstract Domain
The abstract interpreter used for value analysis is obtained by induction as an abstraction of the instruction semantics, using intervals of 32-bit values as the abstract domain. Intervals are abstract values $\nu \in V(W)$, where $V$ is parametrized by the domain $W$ of 32-bit values. Hence, the value analysis is defined in terms of the lattice $V(W)$, with the least upper bound operator ($\sqcup^\sharp_\nu$) and the greatest lower bound operator ($\sqcap^\sharp_\nu$):
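\[ V(W) = \{\bot^\sharp\} \cup \{[l, u] \mid l \in W \cup \{-\infty\} \wedge u \in W \cup \{+\infty\} \wedge l \le u\} \]
\[ [l_1, u_1] \sqcup^\sharp_\nu [l_2, u_2] = [\min(l_1, l_2), \max(u_1, u_2)] \tag{6.8} \]
\[ [l_1, u_1] \sqcap^\sharp_\nu [l_2, u_2] = [\max(l_1, l_2), \min(u_1, u_2)] \tag{6.9} \]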
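As a small worked instance of (6.8) and (6.9), consider the two overlapping intervals $[1, 5]$ and $[3, 9]$. These values are chosen only for illustration and do not come from the analysis itself.

```latex
\[ [1, 5] \sqcup^\sharp_\nu [3, 9] = [\min(1, 3), \max(5, 9)] = [1, 9] \]
\[ [1, 5] \sqcap^\sharp_\nu [3, 9] = [\max(1, 3), \min(5, 9)] = [3, 5] \]
```

The join is the smallest interval covering both arguments, whereas the meet is their intersection.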
The Haskell definition of the abstract domain $V(W)$ is given by the constructor AbstVal, which takes as argument an Interval stored in one of the general-purpose registers of the analysis (R0-R10). The limit values $-\infty$ and $+\infty$ correspond to the minimum and maximum values of a signed 32-bit word. The registers used by the static analyzer to store control information (R11-R15) use the constructor ConcVal of concrete Word32 values. For the reasons previously mentioned, an explicit constructor for the abstract value of the CPSR register is not yet present in the definition of RegVal; it will be included in the re-definition of RegVal given in Section 6.6.4.2, where the backward abstract interpretation of conditional instructions is described. For now, we assume that the CPSR register stores a concrete 32-bit value.
data RegVal = AbstVal Interval | ConcVal Word32 | Bottom
type Interval = (Word32, Word32)
Although the inductive Haskell type constructor RegVal is able to “algebraically” compose the constructors AbstVal and ConcVal, elements built with these two different constructors are not comparable by design. Therefore, the inductive constructor RegVal does not denote a lattice, in the sense that it is not a complete partial order. However, using the coalesced domain definition given in [13], the same constructor can indeed denote the coalesced domain Interval + Word32, lifted with a common undefined element Bottom. Nonetheless, the instance of the type class (Lattice a) must be defined for the co-product datatype RegVal.
The next step is to define the lattice of interval values, $\nu \in V(W)$. Although the elements of $W$ are unsigned 32-bit values, we are still interested in using interval arithmetic for integer values [45,73]. To this end, we have defined the function toInt32, which converts a 32-bit unsigned word into an Integer. The conversion is straightforward: a 32-bit unsigned word takes one of 4294967296 ($2^{32}$) values; words up to 2147483647 ($2^{31} - 1$, obtained by halving 4294967296 and subtracting 1) are kept as non-negative integers, whereas larger words are mapped to negative integers by subtracting 4294967296. The Haskell function fromIntegral is used to convert the type Word32 into the type Integer.
toInt32 :: Word32 -> Integer
toInt32 w = let w' = fromIntegral w :: Integer
            in if w' > 2147483647
                 then w' - 4294967296 else w'
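The behaviour of the conversion at the edges of the signed range can be checked with a minimal sketch. It restates toInt32 with an explicit type signature; the name `boundaryExamples` is an assumption introduced here only for illustration.

```haskell
import Data.Word (Word32)

-- Reinterpret an unsigned 32-bit word as a signed (two's complement) integer.
toInt32 :: Word32 -> Integer
toInt32 w =
  let w' = fromIntegral w :: Integer
  in if w' > 2147483647 then w' - 4294967296 else w'

-- Hypothetical helper (not part of the original text): pairs each boundary
-- word with its signed interpretation.
boundaryExamples :: [(Word32, Integer)]
boundaryExamples =
  [ (w, toInt32 w) | w <- [0, 2147483647, 2147483648, 4294967295] ]
-- evaluates to
-- [(0,0),(2147483647,2147483647),(2147483648,-2147483648),(4294967295,-1)]
```

The word 2147483648 is the first one mapped to a negative integer, and the largest word, 4294967295, is mapped to -1.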
The instantiation of the type class Lattice for the type Interval uses the function toInt32 so that the join and meet operators defined in (6.8) and (6.9) can be directly implemented. For this purpose, the function meet was added to the definition of (Lattice a). Afterwards, the results are converted back to the Word32 type by means of the function fromIntegral. Note that the following instance of Lattice does not implement the function bottom, because the undefined element is defined only for the coalesced domain AbstVal + ConcVal, and not for each of the composed domains.
-- Interval is a type synonym for a pair, so GHC needs FlexibleInstances here.
instance Lattice Interval where
  join (a, b) (c, d)
    = let (a', b') = (toInt32 a, toInt32 b)
          (c', d') = (toInt32 c, toInt32 d)
      in (fromIntegral (min a' c') :: Word32, fromIntegral (max b' d') :: Word32)
  meet (a, b) (c, d)
    = let (a', b') = (toInt32 a, toInt32 b)
          (c', d') = (toInt32 c, toInt32 d)
      in (fromIntegral (max a' c') :: Word32, fromIntegral (min b' d') :: Word32)
On the one hand, assuming that the chaotic fixpoint strategy allows the static analysis to mimic the program execution, the partial order $\sqsubseteq^\delta_W$ on elements of the concrete domain $W$ is induced by the coefficient $\delta$ used in definition (5.15), which indicates the number of fixpoint iterations already performed. Thus, when we write that $a \sqsubseteq^\delta_W b$ implies $a \sqcup^\delta_W b = b$, it means that $a$ is a value computed during iteration $\delta$ and $b$ is a value computed during iteration $\delta + 1$. The same applies to the greatest lower bound operator $\sqcap^\delta_W$. On the other hand, the abstract domain $V(W)$ has the least upper bounds of Def. (6.8) and the greatest lower bounds of Def. (6.9). Finally, the instance of Lattice for the coalesced inductive constructor RegVal is as follows (for the sake of simplicity, the definitions of join and meet involving the Bottom constructor are omitted here):
instance Lattice RegVal where
  bottom = Bottom
  join (ConcVal a) (ConcVal b) = ConcVal b
  join (AbstVal a) (AbstVal b) = AbstVal (join a b)
  meet (ConcVal a) (ConcVal b) = ConcVal b
  meet (AbstVal a) (AbstVal b) = if disjoint a b then Bottom
                                 else AbstVal (meet a b)
The definition of the function meet requires an auxiliary function, disjoint, that returns True if the two intervals given as inputs do not intersect. In such cases, the greatest lower bound of the two intervals is, by definition, Bottom.
disjoint :: Interval -> Interval -> Bool
disjoint (a, b) (c, d)
  = let (a', b') = (toInt32 a, toInt32 b)
        (c', d') = (toInt32 c, toInt32 d)
    in max a' c' > min b' d'
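The interplay between join, meet and disjoint on intervals can be tried out in isolation with the following minimal sketch. It does not use the Lattice type class: the names `joinI`, `meetI` and `bounds` are hypothetical, and `Nothing` stands in for the Bottom constructor of RegVal.

```haskell
import Data.Word (Word32)

type Interval = (Word32, Word32)

toInt32 :: Word32 -> Integer
toInt32 w =
  let w' = fromIntegral w :: Integer
  in if w' > 2147483647 then w' - 4294967296 else w'

-- Hypothetical helper: the signed bounds of an interval.
bounds :: Interval -> (Integer, Integer)
bounds (l, u) = (toInt32 l, toInt32 u)

-- Least upper bound, as in (6.8), computed on the signed bounds.
joinI :: Interval -> Interval -> Interval
joinI x y =
  let ((a, b), (c, d)) = (bounds x, bounds y)
  in (fromIntegral (min a c), fromIntegral (max b d))

-- True when the two intervals do not intersect.
disjoint :: Interval -> Interval -> Bool
disjoint x y =
  let ((a, b), (c, d)) = (bounds x, bounds y)
  in max a c > min b d

-- Greatest lower bound, as in (6.9); Nothing plays the role of Bottom.
meetI :: Interval -> Interval -> Maybe Interval
meetI x y
  | disjoint x y = Nothing
  | otherwise =
      let ((a, b), (c, d)) = (bounds x, bounds y)
      in Just (fromIntegral (max a c), fromIntegral (min b d))
```

For example, `joinI (1, 5) (3, 9)` yields `(1, 9)` and `meetI (1, 5) (3, 9)` yields `Just (3, 5)`, whereas `meetI (1, 2) (5, 9)` yields `Nothing` because the intervals do not intersect. Since the bounds are compared as signed integers, `joinI (4294967295, 3) (0, 5)` yields `(4294967295, 5)`, that is, the interval from -1 to 5.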
As expected, the Galois connection used for value analysis only considers the subset of registers that store abstract values, $\nu \in V(W)$. Therefore, from the previously defined set of register names $N$, we now define a subset $N_V$ of register names to which the interval abstraction applies. Next, we formulate the correctness of the interval abstraction as a Galois connection [73]. The approximation of sets of concrete values, defined as elements of the powerset lattice $\wp(W)$, into intervals inside $V(W)$, is defined by the Galois connection $\langle \alpha, \gamma \rangle$:
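\[ \langle \wp(W), \subseteq \rangle \overset{\gamma}{\underset{\alpha}{\leftrightarrows}} \langle V(W), \sqsubseteq^\sharp_\nu \rangle \tag{6.10} \]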
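The definitions of $\alpha$ and $\gamma$ for the interval abstraction are:
\[ \alpha(S) = \begin{cases} \bot^\sharp_\nu, & \text{if } S = \emptyset \\ [a, b], & \text{if } \min(S) = a \text{ and } \max(S) = b \end{cases} \tag{6.11} \]
\[ \gamma(i) = \begin{cases} \emptyset, & \text{if } i = \bot^\sharp_\nu \\ \{w \in W \mid a \le w \le b\}, & \text{if } i = [a, b] \end{cases} \tag{6.12} \]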
The previous formal definitions of α and γ are in direct correspondence to their Haskell definitions, abst and conc, respectively.
abst :: [Word32] -> RegVal
abst [] = Bottom
abst s  = AbstVal (minimum s, maximum s)
conc :: RegVal -> [Word32]
conc Bottom           = []
conc (AbstVal (l, u)) = [l .. u]
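The soundness property behind the Galois connection, namely that every set of concrete values is contained in the concretization of its abstraction, can be checked on small inputs with the following minimal sketch. The function `sound` and the `ConcVal` case of conc are assumptions added for illustration; like the definitions above, the sketch compares words in the unsigned order of Word32, so it is only indicative for small non-negative values.

```haskell
import Data.Word (Word32)

type Interval = (Word32, Word32)

data RegVal = AbstVal Interval | ConcVal Word32 | Bottom
  deriving (Eq, Show)

-- Abstraction: the smallest interval containing all the given words.
abst :: [Word32] -> RegVal
abst [] = Bottom
abst s  = AbstVal (minimum s, maximum s)

-- Concretization: all the words denoted by an abstract value.
conc :: RegVal -> [Word32]
conc Bottom           = []
conc (AbstVal (l, u)) = [l .. u]
conc (ConcVal w)      = [w]  -- assumption: a concrete value denotes itself

-- Hypothetical check: every element of s belongs to conc (abst s).
sound :: [Word32] -> Bool
sound s = all (`elem` conc (abst s)) s
```

For instance, `abst [3, 7, 5]` gives `AbstVal (3, 7)`, whose concretization `[3, 4, 5, 6, 7]` contains the original values together with the over-approximated ones, 4 and 6.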
The abstract definition $R^\sharp$ of the concrete register set $R$ is obtained by the composition of two abstractions: the first abstraction is called non-relational because all possible relationships between the register values are lost in the abstraction [30]; the second abstraction is called codomain abstraction, as it is based on the Galois connection (6.10), defined for the content of each particular register.
The non-relational abstraction is defined by the Galois connection $\langle \alpha_r, \gamma_r \rangle$ and approximates properties of register sets by ignoring relationships between the possible values associated with register names:
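\[ \langle \wp(N_V \mapsto W), \subseteq \rangle \overset{\gamma_r}{\underset{\alpha_r}{\leftrightarrows}} \langle N_V \mapsto \wp(W), \dot{\subseteq} \rangle \tag{6.13} \]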
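This abstraction approximates sets of concrete maps, $R^\natural \triangleq \wp(N_V \mapsto W)$, by a non-relational collecting concrete semantics, $R^\natural_r \triangleq N_V \mapsto \wp(W)$. Given the register set $\rho \in R$, the definition of the Galois connection is the following:
\[ \alpha_r(R^\natural) = \lambda n \in N_V \bullet \{\rho(n) \mid \rho \in R^\natural\} \]
\[ \gamma_r(R^\natural_r) = \{\rho \mid \forall n \in N_V : \rho(n) \in R^\natural_r(n)\} \]
where the pointwise ordering $\dot{\subseteq}$ is defined by: $R^\natural_r \mathrel{\dot{\subseteq}} {R'}^\natural_r \triangleq \forall n \in N : R^\natural_r(n) \subseteq {R'}^\natural_r(n)$.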
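The codomain abstraction is defined by the Galois connection $\langle \alpha_c, \gamma_c \rangle$ and approximates the codomain of $R^\natural_r$, designated as $R^\sharp_\nu$, using the Galois connection $\langle \alpha, \gamma \rangle$ of Def. (6.10):
\[ \langle N_V \mapsto \wp(W), \dot{\subseteq} \rangle \overset{\gamma_c}{\underset{\alpha_c}{\leftrightarrows}} \langle N_V \mapsto V(W), \dot{\sqsubseteq}^\sharp_\nu \rangle \tag{6.14} \]
where
\[ \alpha_c(R^\natural_r) \triangleq \alpha \circ R^\natural_r, \qquad \gamma_c(R^\sharp_\nu) \triangleq \gamma \circ R^\sharp_\nu, \]
\[ R^\sharp_\nu \mathrel{\dot{\sqsubseteq}^\sharp_\nu} {R'}^\sharp_\nu \triangleq \forall n \in N_V : R^\sharp_\nu(n) \sqsubseteq^\sharp_\nu {R'}^\sharp_\nu(n). \]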
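Therefore, the abstract register set $R^\sharp_\nu \triangleq N_V \mapsto V(W)$ is defined to be a complete lattice for the pointwise ordering $\dot{\sqsubseteq}^\sharp_\nu$. Finally, the composition of the non-relational and codomain abstractions is given by the Galois connection $\langle \dot{\alpha}, \dot{\gamma} \rangle$:
\[ \langle \wp(N_V \mapsto W), \subseteq \rangle \overset{\dot{\gamma}}{\underset{\dot{\alpha}}{\leftrightarrows}} \langle N_V \mapsto V(W), \dot{\sqsubseteq}^\sharp_\nu \rangle \tag{6.15} \]
where
\[ \dot{\alpha}(R^\natural) \triangleq \alpha_c \circ \alpha_r(R^\natural) = \lambda n \in N_V \bullet \alpha(\{\rho(n) \mid \rho \in R^\natural\}), \]
\[ \dot{\gamma}(R^\sharp_\nu) \triangleq \gamma_r \circ \gamma_c(R^\sharp_\nu) = \{\rho \mid \forall n \in N_V : \rho(n) \in \gamma(R^\sharp_\nu(n))\}. \]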
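The loss of relational information in the composed abstraction can be illustrated with a minimal sketch under simplifying assumptions: register names are plain strings, a concrete register map is a function, `Nothing` stands in for the bottom interval, and words are compared in the unsigned order of Word32. The names `Env`, `mkEnv`, `alphaR`, `alphaDot` and `inGammaDot` are hypothetical and are not part of the analyzer.

```haskell
import Data.Word (Word32)

type Name     = String          -- stands in for the register names in N_V
type Env      = Name -> Word32  -- a concrete register map rho
type Interval = (Word32, Word32)

-- Build a register map from a list; unlisted registers default to 0 (assumption).
mkEnv :: [(Name, Word32)] -> Env
mkEnv kvs n = maybe 0 id (lookup n kvs)

-- Interval abstraction of a set of words; Nothing plays the role of bottom.
alpha :: [Word32] -> Maybe Interval
alpha [] = Nothing
alpha s  = Just (minimum s, maximum s)

-- Non-relational abstraction: per register, the values seen in any map.
alphaR :: [Env] -> Name -> [Word32]
alphaR envs n = [rho n | rho <- envs]

-- Composition of the codomain and non-relational abstractions.
alphaDot :: [Env] -> Name -> Maybe Interval
alphaDot envs = alpha . alphaR envs

-- Membership test for the concretization, over a finite list of names.
inGammaDot :: [Name] -> (Name -> Maybe Interval) -> Env -> Bool
inGammaDot names absEnv rho = all ok names
  where
    ok n = case absEnv n of
      Nothing     -> False
      Just (l, u) -> l <= rho n && rho n <= u
```

With the two maps `mkEnv [("R0", 1), ("R1", 10)]` and `mkEnv [("R0", 2), ("R1", 20)]`, `alphaDot` assigns `Just (1, 2)` to "R0" and `Just (10, 20)` to "R1". The map `mkEnv [("R0", 1), ("R1", 20)]` then belongs to the concretization although it was not among the original maps: the correlation between R0 and R1 is not preserved.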
Finally, we have to define the abstract register domain for the entire set of register names $N$. Let $N_W = N \setminus N_V$ be the set of registers storing “concrete” control information, obtained as the set difference between $N$ and $N_V$. Now let $R_w \triangleq N_W \mapsto W$. The coalesced (coproduct) type constructor AbstVal + ConcVal is then formally defined as the disjoint union of the two previously defined maps, $R^\sharp_\nu$ and $R_w$: $R^\sharp \triangleq R^\sharp_\nu \uplus R_w$. Hence, the Haskell definition of $R^\sharp$ given in Section 6.3.1 must be re-defined to include in the domain all the possible sorts of abstract/concrete values:
type R♯ = Array RegisterName RegVal