So we have a way of representing programs or values with missing parts: we simply replace the parts we want to delete with appropriately typed holes. To provide the raw components of the differential slices just described, what we now need is a way of determining how much of the output we can compute given only some prefix of the program (forward slicing), and how little of the program is needed if we need only some prefix of the output (backward slicing). It turns out that these problems are so closely related that in fact each determines the other.
5.2.1 Forward dynamic slicing
The intuition we proposed for forward slicing in Chapter 2 was that if a step in the computation consumes some program part which is unavailable, the output of that step must also be unavailable. In other words, unavailability propagates forward through the computation. This is straightforward to capture by extending the reference semantics ⇓ref with the additional rules for propagating holes given in Figure 5.1. The hole □ itself evaluates to □, and moreover for every type constructor there are variants of the elimination rule which produce a hole whenever the sub-computation in the elimination position produces a hole. The new rules do not affect type preservation (Lemma 1) or determinism (Lemma 2), and so from now on by ⇓ref we shall mean the extended version of the rules, with these lemmas taken to apply to the new definition.
There are two things to note about the hole-propagation rules. First, no special treatment is required when the argument to a function evaluates to a hole; the behaviour we want in this case is precisely that the unavailability of the argument should matter only if that argument is actually consumed by the execution of the function. Second, the rules for primitive operations take them to be strict in both operands. Even when this is true of an actual implementation, it may not accurately reflect the dependency of the operation on its arguments: for example, × need not consult its second argument if the first argument is 0. We discuss a more realistic treatment of primitives in Future Work, §7.2.6.
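The hole-propagation rules can be illustrated with a small interpreter. The following Python sketch is not the reference semantics ⇓ref of Figure 5.1: it assumes a hypothetical toy language with only integers, pairs, `let` and three strict primitives, and the names `HOLE`, `evaluate` and the AST classes are invented for illustration. Here `let` stands in for function application, to show that an unavailable binding matters only if it is consumed.

```python
from dataclasses import dataclass

class _Hole:
    """Singleton standing for a missing expression or an unavailable value."""
    def __repr__(self):
        return "□"

HOLE = _Hole()

@dataclass(frozen=True)
class Num:
    n: int

@dataclass(frozen=True)
class Var:
    name: str

@dataclass(frozen=True)
class BinOp:
    op: str
    left: object
    right: object

@dataclass(frozen=True)
class Let:
    name: str
    bound: object
    body: object

@dataclass(frozen=True)
class Pair:
    fst: object
    snd: object

@dataclass(frozen=True)
class Fst:
    arg: object

# Assumed primitives; integer division keeps all values integral.
PRIMS = {"+": lambda a, b: a + b,
         "*": lambda a, b: a * b,
         "/": lambda a, b: a // b}

def evaluate(env, e):
    """Evaluate a partial expression e in a partial environment env."""
    if e is HOLE:                        # a hole evaluates to a hole
        return HOLE
    if isinstance(e, Num):
        return e.n
    if isinstance(e, Var):
        return env[e.name]
    if isinstance(e, BinOp):             # primitives are strict in both operands
        a, b = evaluate(env, e.left), evaluate(env, e.right)
        return HOLE if a is HOLE or b is HOLE else PRIMS[e.op](a, b)
    if isinstance(e, Let):               # binding a hole needs no special case
        return evaluate({**env, e.name: evaluate(env, e.bound)}, e.body)
    if isinstance(e, Pair):              # constructors may contain holes
        return (evaluate(env, e.fst), evaluate(env, e.snd))
    if isinstance(e, Fst):               # eliminating a hole yields a hole
        p = evaluate(env, e.arg)
        return HOLE if p is HOLE else p[0]
    raise TypeError(f"unknown expression: {e!r}")

print(evaluate({}, BinOp("+", Num(1), HOLE)))                      # hole consumed
print(evaluate({}, Let("x", HOLE, Num(5))))                        # hole bound, never used
print(evaluate({}, Let("x", HOLE, BinOp("*", Var("x"), Num(2)))))  # hole bound, then used
print(evaluate({}, Fst(Pair(Num(1), HOLE))))                       # hole in unused component
```

In this sketch only the first and third programs produce a hole; the other two never consume the missing part.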
First we note that, since evaluation with hole-propagation can produce partial values, environments must also be partial; in other words, they map variables to partial values. This gives rise to a partial order on environments; specifically, we overload ⊑ to mean the relation that has ρ ⊑ ρ′ iff dom(ρ) = dom(ρ′) and ∀x ∈ dom(ρ). ρ(x) ⊑ ρ′(x). For any Γ, we will write □_Γ for the smallest partial environment for Γ, namely the ρ such that ρ(x) = □ for every x ∈ dom(Γ). Again, the set Prefix(ρ) forms a lattice where the join ρ′ ⊔ ρ′′ is the partial environment {x ↦ ρ′(x) ⊔ ρ′′(x) | x ∈ dom(ρ)}, and similarly for meet. Since environments are defined inductively, environment extension with respect to a variable x is a lattice isomorphism in the following sense. Suppose Γ ⊢ ρ and ⊢ v : τ. Then for any x, the bijection −[x ↦ −] from Prefix(ρ) × Prefix(v) to Prefix(ρ[x ↦ v]) satisfies:
(ρ′ ⊔ ρ′′)[x ↦ u′ ⊔ u′′] = ρ′[x ↦ u′] ⊔ ρ′′[x ↦ u′′]    (5.1)
(ρ′ ⊓ ρ′′)[x ↦ u′ ⊓ u′′] = ρ′[x ↦ u′] ⊓ ρ′′[x ↦ u′′]    (5.2)
Extending evaluation with hole-propagation rules has some important consequences, which are summarised in Theorem 2 below. First we define the following family of partial functions indexed by terminating programs.
Definition 2 (evalρ,e). Suppose ρ, e ⇓ref v. Define evalρ,e to be ⇓ref domain-restricted to Prefix(ρ, e).
For readability, we will drop the ρ, e subscript from evalρ,e whenever it is applied to a prefix of (ρ, e) and the (ρ, e) is clear from the context. Now we make three observations. Collectively, they assert that evalρ,e is a meet-semilattice homomorphism from Prefix(ρ, e) to Prefix(v). First, evalρ,e is a total function: because unavailability propagates, introducing a hole into a terminating program cannot yield a non-terminating program, only one which produces less output. Second, least and greatest elements are preserved, which is immediate from the definitions. Finally, evalρ,e preserves meets, or intersection of slices. This third property implies the monotonicity already alluded to.
Theorem 2 (evalρ,e is a meet-semilattice homomorphism). Suppose ρ, e ⇓ref v. Then:
1. If (ρ′, e′) ⊑ (ρ, e) then eval(ρ′, e′) is defined: there exists u such that ρ′, e′ ⇓ref u.
2. eval(□_Γ, □) = □, where (□_Γ, □) is the least element of Prefix(ρ, e), and eval(ρ, e) = v.
3. If (ρ′, e′) ⊑ (ρ, e) and (ρ′′, e′′) ⊑ (ρ, e) then eval(ρ′ ⊓ ρ′′, e′ ⊓ e′′) = eval(ρ′, e′) ⊓ eval(ρ′′, e′′).
Proof. Part (2) is immediate from the definition of ⇓ref and evalρ,e. For parts (1) and (3), we proceed by induction on the derivation of ρ, e ⇓ref v, using the hole-propagation rules from Figure 5.1 whenever the evaluation would otherwise get stuck, and Equation 5.2 for the binder cases.
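The prefix order on partial values and environments, and the isomorphism property of environment extension (Equations 5.1 and 5.2), can be made concrete as follows. This is a minimal Python sketch under stated assumptions: partial values are encoded as nested tuples with a sentinel string `HOLE`, environments as dictionaries, and every function name here is hypothetical rather than taken from the implementation.

```python
HOLE = "□"  # hypothetical sentinel for an unavailable part of a value

def leq(u, v):
    """u ⊑ v: u is obtained from v by replacing sub-values with holes."""
    if u == HOLE:
        return True
    if isinstance(u, tuple) and isinstance(v, tuple) and len(u) == len(v):
        return all(leq(a, b) for a, b in zip(u, v))
    return u == v

def meet(u, v):
    """Greatest lower bound of two prefixes of the same value."""
    if u == HOLE or v == HOLE:
        return HOLE
    if isinstance(u, tuple):
        return tuple(meet(a, b) for a, b in zip(u, v))
    return u  # prefixes of a common value agree on atoms

def join(u, v):
    """Least upper bound of two prefixes of the same value."""
    if u == HOLE:
        return v
    if v == HOLE:
        return u
    if isinstance(u, tuple):
        return tuple(join(a, b) for a, b in zip(u, v))
    return u

def env_meet(r1, r2):
    """Pointwise meet; both environments must have the same domain."""
    return {x: meet(r1[x], r2[x]) for x in r1}

def env_join(r1, r2):
    """Pointwise join; both environments must have the same domain."""
    return {x: join(r1[x], r2[x]) for x in r1}

def extend(rho, x, u):
    """Environment extension rho[x ↦ u]."""
    return {**rho, x: u}

# Two prefixes of the environment {a: 6, b: 12} and of the value (33, 66).
rho1, rho2 = {"a": 6, "b": HOLE}, {"a": HOLE, "b": 12}
u1, u2 = (33, HOLE), (HOLE, 66)

# Equation 5.1: extension commutes with join.
assert extend(env_join(rho1, rho2), "p", join(u1, u2)) == \
    env_join(extend(rho1, "p", u1), extend(rho2, "p", u2))

# Equation 5.2: extension commutes with meet.
assert extend(env_meet(rho1, rho2), "p", meet(u1, u2)) == \
    env_meet(extend(rho1, "p", u1), extend(rho2, "p", u2))
```

The checks exercise only one pair of prefixes; they illustrate the equations rather than prove them.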
Figure 5.2 evalρ,e does not preserve joins: (a) without first input; (b) without second input; (c) with both inputs
Although evalρ,e preserves meets, it does not preserve joins, i.e. unions of slices, as illustrated in Figure 5.2. The program normalises two integers 6 and 12 by computing the proportion that each is of their sum and returning the result as a pair of percentages. In (a), we “damage” the program by (hypothetically) deleting the first input 6. If the sum of the two numbers cannot be calculated, neither can either of the outputs. The situation is reversed in (b), where we hypothetically delete the second input 12. If we combine the two program slices, by taking their join as in (c), we repair the ability of the program to compute the sum. But in so doing we also repair the ability of the program to calculate both components of the output. In general, because removing part of a program can dramatically impair its ability to function, combining program parts can dramatically restore that capability.
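The failure of join-preservation in the normalisation example can be reproduced in a few lines. The Python sketch below makes simplifying assumptions: the program is fixed as an ordinary function `normalise`, only its two inputs are sliced (rather than arbitrary parts of the program text), and holes are modelled by a hypothetical sentinel `HOLE` that strict primitives propagate.

```python
HOLE = "□"  # hypothetical sentinel for an unavailable input or output

def strict(op):
    """Lift a binary primitive so that a hole in either operand yields a hole."""
    def lifted(a, b):
        return HOLE if HOLE in (a, b) else op(a, b)
    return lifted

add = strict(lambda a, b: a + b)
mul = strict(lambda a, b: a * b)
div = strict(lambda a, b: a // b)

def normalise(a, b):
    """Each input as an integer percentage of the sum of both."""
    total = add(a, b)
    return (div(mul(a, 100), total), div(mul(b, 100), total))

def join(u, v):
    """Least upper bound of two prefixes of the same value."""
    if u == HOLE:
        return v
    if v == HOLE:
        return u
    if isinstance(u, tuple):
        return tuple(join(x, y) for x, y in zip(u, v))
    return u

without_first = normalise(HOLE, 12)   # case (a): no sum, so no outputs
without_second = normalise(6, HOLE)   # case (b): likewise
with_both = normalise(join(6, HOLE), join(HOLE, 12))  # case (c)

# Joining the two outputs recovers nothing...
assert join(without_first, without_second) == (HOLE, HOLE)
# ...but evaluating the joined inputs recovers everything.
assert with_both == (33, 66)
```

Evaluating the join of the slices yields strictly more than the join of their outputs, which is exactly the sense in which joins are not preserved.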
If evalρ,e is to provide canonical answers to forward-slicing questions then it must compute as much output as possible for the prefix of (ρ, e) it is given. This is indeed the case, but it will be easier to make sense of this once we have introduced backward slicing.
5.2.2 Backward dynamic slicing
We will now see how forward slicing as just construed uniquely defines the problem of dynamic backward slicing informally sketched in Chapter 2. Our intuition there was that backward slicing tells us how much of the program is still needed if we only need some prefix of the output. To state this formally, we need to make precise the notion of there being “enough” program to compute that part of the output; we can do so by appealing to our monotonic forward-slicing function evalρ,e. Suppose ρ, e ⇓ref v, and let u ⊑ v be a partial output specifying how much of the output is needed. If (ρ′, e′) is capable of computing at least u, we say that (ρ′, e′) is a slice of (ρ, e) for u.
Definition 3 (Slice of (ρ, e) for u). Suppose ρ, e ⇓ref v and u ⊑ v. Then any (ρ′, e′) ⊑ (ρ, e) is a slice of (ρ, e) for u if eval(ρ′, e′) ⊒ u.
Operationally, the intuition is that it is fine to consume a hole during evaluation as long as we are computing a part of the output that is not needed.
Now, a canonical answer to a backward-slicing question is the smallest program slice for the prefix of v in question. As it happens, the fact that evalρ,e preserves meets guarantees the existence of such a slice. This stems from a basic property of meet-semilattice homomorphisms. If A and B are lattices, then every meet-preserving function f_∗ : A → B is the upper adjoint of a unique Galois connection. The lower adjoint of f_∗, written f^∗ : B → A, which preserves joins, inverts f_∗ in the following minimising way: for any output b of f_∗, the lower adjoint yields the smallest input a such that f_∗(a) ⊒ b. In fact each adjoint determines the other:
f_∗(a) ⊒ b ⇐⇒ a ⊒ f^∗(b)
This is easier to understand if we plug in evalρ,e and its lower adjoint, which we shall call unevalρ,e because it maps values to programs. Analogously with evalρ,e we drop the ρ, e subscript from unevalρ,e when the argument is a prefix of v and the (ρ, e) is clear from the context.
Corollary 1 (Existence of least program slices). Suppose ρ, e ⇓ref v. Then there exists a unique function unevalρ,e from Prefix(v) to Prefix(ρ, e) such that for any (ρ′, e′) ⊑ (ρ, e) and any u ⊑ v we have:
eval(ρ′, e′) ⊒ u ⇐⇒ (ρ′, e′) ⊒ uneval(u)
Proof. Immediate from Theorem 2.
For every prefix of the program there is a largest output slice which consumes at most that much of the program; and for every prefix of the output there is a smallest program slice which produces at least that much output.
Figure 5.3 Closure under meet of slices of (ρ, e) for u
It is instructive to consider why the meet-preservation of evalρ,e ensures that unevalρ,e exists. Let S be the set of all slices of (ρ, e) for some fixed u and let (ρ′, e′) be their meet. Because evaluation preserves meets, (ρ′, e′) evaluates to the meet u′ of the values that the elements of S evaluate to. But all these values are at least as large as u, and therefore so is u′. Thus (ρ′, e′) is itself an element of S, namely the smallest one, so we can set this to be the value of uneval(u). This is depicted informally in Figure 5.3. The larger diamond on the left is the lattice Prefix(ρ, e); the smaller diamond on the right is the lattice Prefix(v). For this particular u, there are exactly three elements of S, indicated by the three points in the left-hand lattice.
Thus unevalρ,e satisfies the following:
unevalρ,e(u) = ⨅ {(ρ′, e′) ∈ Prefix(ρ, e) | evalρ,e(ρ′, e′) ⊒ u}    (5.3)
and in principle uneval(u) could be calculated by enumerating all the program slices for u and taking their meet. In the next section we will see a better approach.
Before we move on, we contrast the behaviour of backward slicing with forward slicing. Whereas evalρ,e preserves meets and not joins, unevalρ,e preserves joins and not meets. Figure 5.4 revisits the normalisation example from Figure 5.2 to give an example of how meets are not preserved. Since we are backward-slicing, we manipulate the demand on the output of the computation. In (a), we relinquish demand on the first output 33. Although we then do not need the entire (fst p) * 100 / sum calculation associated with it, we still need to calculate the sum because we need it for the second output, and this in turn means we still need both inputs. The situation is reversed in (b), where we relinquish demand on the second output 66. But when we combine these demand absences, by taking the meet of the output slices as in (c), we no longer need the sum, nor indeed either input. Thus backward slicing exhibits a kind of conservativity: part of the program can be relinquished only if it is not needed anywhere.
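Both the brute-force reading of Equation 5.3 and the normalisation example can be checked mechanically on a small scale. The Python sketch below is an illustration under assumptions, not the approach developed in the next section: the program is fixed as the function `normalise`, only its inputs are sliced, holes are a hypothetical sentinel `HOLE`, and `uneval` enumerates every input prefix, which is exponential in the size of the input.

```python
from functools import reduce
from itertools import product

HOLE = "□"  # hypothetical sentinel for a hole

def strict(op):
    """Lift a primitive so that a hole in either operand yields a hole."""
    return lambda a, b: HOLE if HOLE in (a, b) else op(a, b)

add = strict(lambda a, b: a + b)
mul = strict(lambda a, b: a * b)
div = strict(lambda a, b: a // b)

def normalise(a, b):
    """Plays the role of eval for the fixed normalisation program."""
    total = add(a, b)
    return (div(mul(a, 100), total), div(mul(b, 100), total))

def leq(u, v):
    """u ⊑ v on partial values encoded as nested tuples."""
    if u == HOLE:
        return True
    if isinstance(u, tuple) and isinstance(v, tuple) and len(u) == len(v):
        return all(leq(a, b) for a, b in zip(u, v))
    return u == v

def meet(u, v):
    """Greatest lower bound of two prefixes of the same value."""
    if u == HOLE or v == HOLE:
        return HOLE
    if isinstance(u, tuple):
        return tuple(meet(a, b) for a, b in zip(u, v))
    return u

def prefixes(v):
    """All partial values below v, from the hole up to v itself."""
    if isinstance(v, tuple):
        return [HOLE] + list(product(*(prefixes(x) for x in v)))
    return [HOLE, v]

def uneval(evaluate, inputs, u):
    """Equation 5.3 by brute force: the meet of all slices of inputs for u."""
    candidates = product(*(prefixes(x) for x in inputs))
    slices = [s for s in candidates if leq(u, evaluate(*s))]
    return reduce(meet, slices)

inputs = (6, 12)
v = normalise(*inputs)

# The defining property of the Galois connection (Corollary 1).
for s in product(*(prefixes(x) for x in inputs)):
    for u in prefixes(v):
        assert leq(u, normalise(*s)) == leq(uneval(normalise, inputs, u), s)

# Cases (a) and (b): dropping one output still demands both inputs.
need_a = uneval(normalise, inputs, (HOLE, 66))
need_b = uneval(normalise, inputs, (33, HOLE))
assert need_a == need_b == (6, 12)

# Case (c): the meet of the demands needs neither input.
need_c = uneval(normalise, inputs, meet((HOLE, 66), (33, HOLE)))
assert need_c == (HOLE, HOLE) and meet(need_a, need_b) == (6, 12)
```

The last two assertions show that `uneval` of a meet can be strictly smaller than the meet of the `uneval`s, in line with the example in the text.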
Figure 5.4 unevalρ,e does not preserve meets: (a) without first output; (b) without second output; (c) with neither output
5.2.3 Differential slices
In Chapter 2, §§2.5 and 2.6, we showed several examples where the user selected some part of the program or output in order to initiate a forward or backward dynamic slice. The red highlighting on the selected node was explained as a deletion delta: a comparison between the present state and a hypothetical future state in which that node was deleted. The red parts pick out the complement of a prefix of the expression, a kind of negated slice capturing the difference between two expressions related by ⊑. The complement of a partial expression is not itself a partial expression, since the underlying set of paths is not prefix-closed.
In our implementation, these deltas that arise during slicing are just a special case of the more general form of delta illustrated in §2.2, which can describe complex structural reorganisations. We will be looking at these in detail in Chapter 6. But it is possible to explain differential slices without recourse to the more involved techniques presented there.
A deletion delta can be expressed in the formalism presented so far as a pair of partial expressions (e, e′) where e ⊑ e′. We call such a pair a differential slice. More precisely, for an expression e, we define Diff(e) to be the following sub-lattice of Prefix(e) × Prefix(e):
Definition 4 (Differential slice). Define Diff(e) to be the lattice with carrier set {(e′, e′′) | e′ ⊑ e′′ ⊑ e} and meet and join defined component-wise.
Moreover, the Galois connection for slices of a terminating computation lifts to differential slices in the natural way, yielding differential slices that are minimal because their components are, and because pairing preserves ⊓ and ⊔.
Definition 5 (Differential slicing). Suppose ρ, e ⇓ref v. Then define the Galois connection ⟨eval, uneval⟩Diff(ρ,e) from Diff(ρ, e) to Diff(v), where
eval ≝ evalρ,e × evalρ,e domain-restricted to Diff(ρ, e)
uneval ≝ unevalρ,e × unevalρ,e domain-restricted to Diff(v)
The larger component e′ of a differential slice (e, e′) is the present state; the smaller component e is the future state relative to which e′ is being compared. Because e and e′ are related by ⊑, calculating the difference between them is easy. We simply traverse e and e′ simultaneously, identifying sub-expressions present in e′ but absent in e. Nodes unique to the present are scheduled for deletion in the future and highlighted in red in the user interface.
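The simultaneous traversal just described might look as follows. This Python sketch rests on assumptions that are not part of the implementation: partial expressions are encoded as tuples of the form (constructor, child, ...), atoms are plain Python values, a hole is the hypothetical sentinel `HOLE`, and the function name `deleted` is invented. It reports each sub-expression to be highlighted, together with its path from the root.

```python
HOLE = "□"  # hypothetical sentinel for a hole

def deleted(e, e_prime, path=()):
    """Yield (path, subexpression) for each part of e_prime that is a hole in e.

    Assumes e ⊑ e_prime, so wherever e is not a hole the two expressions
    have the same constructor and the same number of children.
    """
    if e == HOLE:
        if e_prime != HOLE:          # present now, absent in the future state
            yield path, e_prime
        return
    if not isinstance(e, tuple):     # an atom, identical in both expressions
        return
    # Skip the constructor name in position 0 and compare children pairwise.
    for i, (c, c_prime) in enumerate(zip(e[1:], e_prime[1:])):
        yield from deleted(c, c_prime, path + (i,))

present = ("+", ("*", 2, 3), ("Pair", 6, 12))
future = ("+", HOLE, ("Pair", 6, HOLE))

for path, sub in deleted(future, present):
    print(path, sub)
```

Each reported path is a sequence of child indices; because the traversal stops at the first hole on any path, a deleted sub-expression is reported once, at its root.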