3.8 Minimisation and enumeration
3.8.4 Minimisation
An error discovered during constraint solving may not be minimal; that is, some of the labels attributed to the error may be extraneous. The minimisation algorithm is shown in figure 3.19.
The relation $\to_{\text{test}}$ determines whether a given label can be removed from the set of labels associated with an error without causing the error to disappear from the user program. Let $\to^{*}_{\text{test}}$ be the reflexive and transitive closure of $\to_{\text{test}}$.
Minimisation proceeds in two phases (labelled phase1 and phase2 in figure 3.19). In phase one, binders are turned into dummy binders, which can remove large sections of code at once. In phase two, labels are removed one at a time until a minimal set of labels attributed to the error is found.
A minimisation step is written $\langle e, \overline{l}_1, \{l\} \uplus \overline{l}_2 \rangle \to_{\text{test}} \langle e, \overline{l}_3, \overline{l}_4 \rangle$, where $\overline{l}_3$ and $\overline{l}_4$ depend on the solvability of $\text{filt}(e, \overline{l}_1 \cup \overline{l}_2, \{l\})$, referred to from now on as $e'$. The full set of labels for the error the minimiser is working on is $\overline{l}_1 \cup \overline{l}_2 \cup \{l\}$, and $\{l\} \uplus \overline{l}_2$ is the set of labels that the minimiser has yet to try to discard. The new environment $e'$ is obtained from $e$ by filtering out the constraints not labelled by $\overline{l}_1 \cup \overline{l}_2 \cup \{l\}$, by filtering out accessors and equality constraints annotated with the label $l$, and by creating dummy binders and environment variables for the binders and environment variables annotated with $l$. If $e'$ is solvable, then $l$ must be in the error’s label set for the error to occur, and so $\overline{l}_3 = \overline{l}_1 \cup \{l\}$ and $\overline{l}_4 = \overline{l}_2$; otherwise $l$ is extraneous and can be removed.
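The label-by-label phase described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the algorithm of figure 3.19: the names `minimise`, `solvable_without` and `toy_oracle` are hypothetical, phase one (dummy binders) is omitted, and `solvable_without(kept, l)` stands in for testing whether the filtered environment is solvable once label `l` has been discarded and only the labels in `kept` remain.

```python
from typing import Callable, FrozenSet, Hashable, Iterable

Label = Hashable


def minimise(labels: Iterable[Label],
             solvable_without: Callable[[FrozenSet[Label], Label], bool]
             ) -> FrozenSet[Label]:
    """Try to discard each label once; keep only the labels the error needs."""
    needed = set()        # plays the role of l1: labels known to be required
    todo = list(labels)   # plays the role of {l} and l2: labels still to test
    while todo:
        l = todo.pop()
        kept = frozenset(needed) | frozenset(todo)   # l1 union l2
        if solvable_without(kept, l):
            # The error disappears without l, so l belongs to the error.
            needed.add(l)
        # Otherwise l is extraneous and is dropped for good.
    return frozenset(needed)


# Toy oracle (an assumption for illustration): the error persists exactly
# while labels 1 and 3 are both still present.
def toy_oracle(kept, discarded):
    return not {1, 3} <= kept


result = minimise([1, 2, 3, 4], toy_oracle)
```

With the toy oracle, labels 2 and 4 are discarded and `result` contains only labels 1 and 3. Each label is tested exactly once, so the number of solver calls is linear in the size of the initial label set.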
3.8.5 Enumeration
An enumeration step is denoted by the relation $\to_{\text{e}}$, with $\to^{*}_{\text{e}}$ being its reflexive (w.r.t. EnumState) and transitive closure. Enumeration states are defined below:
$\text{EnumState} ::= \text{enum}(e) \mid \text{enum}(e, er, l) \mid \text{errors}(er)$
The enumeration process always starts in the state enum(e) and ends in the state errors(er). The enumeration algorithm creates filters, which form the search space explored when searching for errors, and starts with the empty filter (which causes all constraints to be considered).
After an error has been found and minimised, its labels are used to build new filters (see $l'$ in rule (ENUM4)). When all filters are exhausted, the enumeration algorithm stops. The errors found at that point are all the minimal type errors in the analysed piece of code.
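The filter-driven search can be sketched as follows. This is an illustrative approximation only: the enumeration rules themselves are not reproduced in this section, so the way new filters are built here (the current filter extended by one label of the error just found) is an assumption, and the names `enumerate_errors`, `find_minimal_error`, `CORES` and `toy_find` are hypothetical. `find_minimal_error(flt)` stands in for solving the constraints with the labels in `flt` filtered out and then minimising any error found.

```python
from typing import Callable, FrozenSet, Hashable, Optional, Set

Label = Hashable
LabelSet = FrozenSet[Label]


def enumerate_errors(
    find_minimal_error: Callable[[LabelSet], Optional[LabelSet]]
) -> Set[LabelSet]:
    """Explore filters until none are left; collect every minimal error found."""
    errors = set()
    start = frozenset()      # the empty filter: all constraints are considered
    pending = [start]
    seen = {start}
    while pending:
        flt = pending.pop()
        error = find_minimal_error(flt)
        if error is None:
            continue         # nothing unsolvable remains under this filter
        errors.add(error)
        # Assumed filter construction: block one label of the error at a time.
        for l in error:
            new_filter = flt | {l}
            if new_filter not in seen:
                seen.add(new_filter)
                pending.append(new_filter)
    return errors


# Toy stand-in for the solver and minimiser: three known minimal errors.
CORES = [frozenset({1, 2}), frozenset({2, 3}), frozenset({4})]


def toy_find(flt):
    for core in CORES:
        if not (core & flt):     # the core survives the filter untouched
            return core
    return None


found = enumerate_errors(toy_find)
```

On the toy input the search visits a handful of filters and `found` contains all three label sets. The `seen` set guarantees termination, since every filter is a subset of a finite set of labels.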
3.9 Slicing
3.9.1 Dot Terms
After an error has been located in the user program, a type error slice is made from the labels and the error kind ek. This is done by the slicing function sl defined in figure 3.25. Any program nodes annotated with labels that do not occur in the error’s label set are replaced by “dot” terms, which show that some program nodes have been thrown away because they do not contribute to the error. As an example, if the node annotated with label $l_2$ is removed from $\lceil 1^{l_1}\ ()^{l_2} \rceil^{l_3}$, then $\lceil 1^{l_1}\ \text{dot-e}(\emptyset) \rceil^{l_3}$ results. This is displayed as 1 ⟨..⟩.
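Any syntactic form that can be produced using the grammar rules defined in the combination of figures 3.8 and 3.21 is referred to as a slice, and a type error slice is any slice for which the constraint generation algorithm (which has been extended to dot terms) only generates unsolvable constraints.

Figure 3.21 Extension of the syntax and constraint generator to “dot” terms
extension of the constraint syntax
\[
\begin{array}{lcl}
\text{LabTyCon} & ::= & \cdots \mid \text{dot-e}(\overrightarrow{\mathit{term}}) \\
\text{LabDatCon} & ::= & \cdots \mid \text{dot-e}(\overrightarrow{\mathit{term}}) \\
\text{Ty} & ::= & \cdots \mid \text{dot-e}(\overrightarrow{\mathit{term}}) \\
\text{ConBind} & ::= & \cdots \mid \text{dot-e}(\overrightarrow{\mathit{term}}) \\
\text{DatName} & ::= & \cdots \mid \text{dot-e}(\overrightarrow{\mathit{term}}) \\
\text{Dec} & ::= & \cdots \mid \text{dot-d}(\overrightarrow{\mathit{term}}) \\
\text{AtExp} & ::= & \cdots \mid \text{dot-e}(\overrightarrow{\mathit{term}}) \\
\text{Exp} & ::= & \cdots \mid \text{dot-e}(\overrightarrow{\mathit{term}}) \\
\text{AtPat} & ::= & \cdots \mid \text{dot-p}(\overrightarrow{\mathit{pat}}) \\
\text{Pat} & ::= & \cdots \mid \text{dot-p}(\overrightarrow{\mathit{pat}}) \\
\text{StrDec} & ::= & \cdots \mid \text{dot-d}(\overrightarrow{\mathit{term}}) \\
\text{StrExp} & ::= & \cdots \mid \text{dot-s}(\overrightarrow{\mathit{term}})
\end{array}
\]
extension of the constraint generator
\[
\begin{array}{ll}
\text{(G24)} & \llbracket \text{dot-d}(\langle \mathit{term}_1, \ldots, \mathit{term}_n \rangle) \rrbracket = [\llbracket \mathit{term}_1 \rrbracket; \cdots; \llbracket \mathit{term}_n \rrbracket] \\
\text{(G25)} & \llbracket \text{dot-p}(\langle \mathit{pat}_1, \ldots, \mathit{pat}_n \rangle), \alpha \rrbracket = \llbracket \mathit{pat}_1 \rrbracket; \cdots; \llbracket \mathit{pat}_n \rrbracket \\
\text{(G26)} & \llbracket \text{dot-s}(\langle \mathit{term}_1, \ldots, \mathit{term}_n \rangle), \mathit{ev} \rrbracket = [\llbracket \mathit{term}_1 \rrbracket; \cdots; \llbracket \mathit{term}_n \rrbracket] \\
\text{(G27)} & \llbracket \text{dot-e}(\langle \mathit{term}_1, \ldots, \mathit{term}_n \rangle), \alpha \rrbracket = [\llbracket \mathit{term}_1 \rrbracket; \cdots; \llbracket \mathit{term}_n \rrbracket]
\end{array}
\]

Figure 3.22 Labelled abstract syntax trees
\[
\begin{array}{lcl}
\mathit{class} \in \text{Class} & ::= & \text{lTc} \mid \text{lDcon} \mid \text{ty} \mid \text{conbind} \mid \text{datname} \mid \text{dec} \mid \text{atexp} \\
 & & \mid \text{exp} \mid \text{atpat} \mid \text{pat} \mid \text{strdec} \mid \text{strexp} \\
\mathit{prod} \in \text{Prod} & ::= & \text{tyArr} \mid \text{tyCon} \mid \text{conbindOf} \mid \text{datnameCon} \mid \text{decRec} \mid \text{decDat} \\
 & & \mid \text{decOpn} \mid \text{atexpLet} \mid \text{expFn} \mid \text{strdecDec} \mid \text{strdecStr} \mid \text{strexpSt} \mid \text{id} \mid \text{app} \mid \text{seq} \\
\mathit{dot} \in \text{Dot} & ::= & \text{dotE} \mid \text{dotP} \mid \text{dotD} \mid \text{dotS} \\
\mathit{node} \in \text{Node} & ::= & \langle \mathit{class}, \mathit{prod} \rangle \\
\mathit{tree} \in \text{Tree} & ::= & \langle \mathit{node}, l, \overrightarrow{\mathit{tree}} \rangle \mid \langle \mathit{dot}, \overrightarrow{\mathit{tree}} \rangle \mid \mathit{id}
\end{array}
\]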
An alternative definition of the external labelled syntax presented in figure 3.8 is given here. Figure 3.22 defines labelled abstract syntax trees, where a node in a tree $\mathit{tree}$ is either a labelled node of the form $\langle \mathit{node}, l, \overrightarrow{\mathit{tree}} \rangle$, an unlabelled “dot” node of the form $\langle \mathit{dot}, \overrightarrow{\mathit{tree}} \rangle$, or a leaf of the form $\mathit{id}$. Figure 3.23 defines toTree, which associates a tree with every term (toTree is also defined on sequences of terms).
Figure 3.24 also defines the function getDot which generates terms in Dot from nodes. This function is used in the slicing algorithm to generate dot nodes from labelled nodes.
3.9.2 Tidying
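Defined here is flat, which flattens a sequence of terms. For example, flattening ⟨..1..⟨..()..⟩..⟩ gives ⟨..1..()..⟩. Note that nested dot terms are not always flattened, as flattening can change the semantics. For example,
⟨..val x = false..⟨..val x = 1..⟩..x + 1..⟩
is not flattened to
⟨..val x = false..val x = 1..x + 1..⟩
because in the first slice the declaration val x = 1 is confined to the inner dot term, so x + 1 refers to val x = false and the slice is untypable, whereas in the second slice x + 1 refers to val x = 1 and the slice is typable. The predicates below check the classes of trees:

Figure 3.23 From terms to trees
\[
\begin{array}{lcl}
\text{toTree}(\mathit{tc}^{l}) & = & \langle \langle \text{lTc}, \text{id} \rangle, l, \langle \mathit{tc} \rangle \rangle \\
\text{toTree}(\mathit{dcon}^{l}) & = & \langle \langle \text{lDcon}, \text{id} \rangle, l, \langle \mathit{dcon} \rangle \rangle \\
\text{toTree}(\mathit{tv}^{l}) & = & \langle \langle \text{ty}, \text{id} \rangle, l, \langle \mathit{tv} \rangle \rangle \\
\text{toTree}(\mathit{ty}_1 \stackrel{l}{\to} \mathit{ty}_2) & = & \langle \langle \text{ty}, \text{tyArr} \rangle, l, \langle \text{toTree}(\mathit{ty}_1), \text{toTree}(\mathit{ty}_2) \rangle \rangle \\
\text{toTree}(\lceil \mathit{ty}\ \mathit{ltc} \rceil^{l}) & = & \langle \langle \text{ty}, \text{tyCon} \rangle, l, \langle \text{toTree}(\mathit{ty}), \text{toTree}(\mathit{ltc}) \rangle \rangle \\
\text{toTree}(\mathit{dcon}^{l}_{\text{c}}) & = & \langle \langle \text{conbind}, \text{id} \rangle, l, \langle \mathit{dcon} \rangle \rangle \\
\text{toTree}(\mathit{dcon}\ \text{of}^{l}\ \mathit{ty}) & = & \langle \langle \text{conbind}, \text{conbindOf} \rangle, l, \langle \mathit{dcon}, \text{toTree}(\mathit{ty}) \rangle \rangle \\
\text{toTree}(\lceil \mathit{tv}\ \mathit{tc} \rceil^{l}) & = & \langle \langle \text{datname}, \text{datnameCon} \rangle, l, \langle \mathit{tv}, \mathit{tc} \rangle \rangle \\
\text{toTree}(\text{val rec}\ \mathit{pat} \stackrel{l}{=} \mathit{exp}) & = & \langle \langle \text{dec}, \text{decRec} \rangle, l, \langle \text{toTree}(\mathit{pat}), \text{toTree}(\mathit{exp}) \rangle \rangle \\
\text{toTree}(\text{datatype}\ \mathit{dn} \stackrel{l}{=} \mathit{cb}) & = & \langle \langle \text{dec}, \text{decDat} \rangle, l, \langle \text{toTree}(\mathit{dn}), \text{toTree}(\mathit{cb}) \rangle \rangle \\
\text{toTree}(\text{open}^{l}\ \mathit{strid}) & = & \langle \langle \text{dec}, \text{decOpn} \rangle, l, \langle \mathit{strid} \rangle \rangle \\
\text{toTree}(\mathit{vid}^{l}_{\text{e}}) & = & \langle \langle \text{atexp}, \text{id} \rangle, l, \langle \mathit{vid} \rangle \rangle \\
\text{toTree}(\text{let}^{l}\ \mathit{dec}\ \text{in}\ \mathit{exp}\ \text{end}) & = & \langle \langle \text{atexp}, \text{atexpLet} \rangle, l, \langle \text{toTree}(\mathit{dec}), \text{toTree}(\mathit{exp}) \rangle \rangle \\
\text{toTree}(\text{fn}\ \mathit{pat} \stackrel{l}{\Rightarrow} \mathit{exp}) & = & \langle \langle \text{exp}, \text{expFn} \rangle, l, \langle \text{toTree}(\mathit{pat}), \text{toTree}(\mathit{exp}) \rangle \rangle \\
\text{toTree}(\lceil \mathit{exp}\ \mathit{atexp} \rceil^{l}) & = & \langle \langle \text{exp}, \text{app} \rangle, l, \langle \text{toTree}(\mathit{exp}), \text{toTree}(\mathit{atexp}) \rangle \rangle \\
\text{toTree}(\mathit{vid}^{l}_{\text{p}}) & = & \langle \langle \text{atpat}, \text{id} \rangle, l, \langle \mathit{vid} \rangle \rangle \\
\text{toTree}(\lceil \mathit{ldcon}\ \mathit{atpat} \rceil^{l}) & = & \langle \langle \text{pat}, \text{app} \rangle, l, \langle \text{toTree}(\mathit{ldcon}), \text{toTree}(\mathit{atpat}) \rangle \rangle \\
\text{toTree}(\text{structure}\ \mathit{strid} \stackrel{l}{=} \mathit{strexp}) & = & \langle \langle \text{strdec}, \text{strdecStr} \rangle, l, \langle \mathit{strid}, \text{toTree}(\mathit{strexp}) \rangle \rangle \\
\text{toTree}(\mathit{strid}^{l}) & = & \langle \langle \text{strexp}, \text{id} \rangle, l, \langle \mathit{strid} \rangle \rangle \\
\text{toTree}(\text{struct}^{l}\ \mathit{strdec}_1 \cdots \mathit{strdec}_n\ \text{end}) & = & \langle \langle \text{strexp}, \text{strexpSt} \rangle, l, \text{toTree}(\langle \mathit{strdec}_1, \ldots, \mathit{strdec}_n \rangle) \rangle \\
\text{toTree}(\langle \mathit{term}_1, \ldots, \mathit{term}_n \rangle) & = & \langle \text{toTree}(\mathit{term}_1), \ldots, \text{toTree}(\mathit{term}_n) \rangle \\
\text{toTree}(\text{dot-e}(\overrightarrow{\mathit{term}})) & = & \langle \text{dotE}, \text{toTree}(\overrightarrow{\mathit{term}}) \rangle \\
\text{toTree}(\text{dot-d}(\overrightarrow{\mathit{term}})) & = & \langle \text{dotD}, \text{toTree}(\overrightarrow{\mathit{term}}) \rangle \\
\text{toTree}(\text{dot-p}(\overrightarrow{\mathit{pat}})) & = & \langle \text{dotP}, \text{toTree}(\overrightarrow{\mathit{pat}}) \rangle \\
\text{toTree}(\text{dot-s}(\overrightarrow{\mathit{term}})) & = & \langle \text{dotS}, \text{toTree}(\overrightarrow{\mathit{term}}) \rangle
\end{array}
\]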
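Figure 3.24 Definition of getDot
\[
\begin{array}{lcl}
\text{getDot}(\langle \text{lTc}, \mathit{prod} \rangle) & = & \text{dotE} \\
\text{getDot}(\langle \text{lDcon}, \mathit{prod} \rangle) & = & \text{dotE} \\
\text{getDot}(\langle \text{ty}, \mathit{prod} \rangle) & = & \text{dotE} \\
\text{getDot}(\langle \text{conbind}, \mathit{prod} \rangle) & = & \text{dotE} \\
\text{getDot}(\langle \text{datname}, \mathit{prod} \rangle) & = & \text{dotE} \\
\text{getDot}(\langle \text{dec}, \mathit{prod} \rangle) & = & \text{dotD} \\
\text{getDot}(\langle \text{atexp}, \mathit{prod} \rangle) & = & \text{dotE} \\
\text{getDot}(\langle \text{exp}, \mathit{prod} \rangle) & = & \text{dotE} \\
\text{getDot}(\langle \text{atpat}, \mathit{prod} \rangle) & = & \text{dotP} \\
\text{getDot}(\langle \text{pat}, \mathit{prod} \rangle) & = & \text{dotP} \\
\text{getDot}(\langle \text{strdec}, \mathit{prod} \rangle) & = & \text{dotD} \\
\text{getDot}(\langle \text{strexp}, \mathit{prod} \rangle) & = & \text{dotS}
\end{array}
\]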
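Figure 3.25 Slicing algorithm
\[
\begin{array}{ll}
\text{(SL1)} & \text{sl}(\langle \mathit{node}, l, \overrightarrow{\mathit{tree}} \rangle, \overline{l}) =
\left\{
\begin{array}{ll}
\langle \mathit{node}, l, \text{sl}_1(\overrightarrow{\mathit{tree}}, \overline{l}) \rangle, & \text{if } l \in \overline{l} \text{ and } \text{getDot}(\mathit{node}) \neq \text{dotS} \\
\langle \mathit{node}, l, \text{tidy}(\text{sl}_1(\overrightarrow{\mathit{tree}}, \overline{l})) \rangle, & \text{if } l \in \overline{l} \text{ and } \text{getDot}(\mathit{node}) = \text{dotS} \\
\langle \text{getDot}(\mathit{node}), \text{flat}(\text{sl}_2(\overrightarrow{\mathit{tree}}, \overline{l})) \rangle, & \text{otherwise}
\end{array}
\right. \\
\text{(SL2)} & \text{sl}_1(\langle \mathit{dot}, \langle \mathit{tree}_1, \ldots, \mathit{tree}_n \rangle \rangle, \overline{l}) = \langle \mathit{dot}, \text{flat}(\langle \text{sl}_2(\mathit{tree}_1, \overline{l}), \ldots, \text{sl}_2(\mathit{tree}_n, \overline{l}) \rangle) \rangle \\
\text{(SL3)} & \text{sl}_2(\langle \mathit{dot}, \langle \mathit{tree}_1, \ldots, \mathit{tree}_n \rangle \rangle, \overline{l}) = \langle \mathit{dot}, \text{flat}(\langle \text{sl}_2(\mathit{tree}_1, \overline{l}), \ldots, \text{sl}_2(\mathit{tree}_n, \overline{l}) \rangle) \rangle \\
\text{(SL4)} & \text{sl}_1(\langle \mathit{node}, l, \overrightarrow{\mathit{tree}} \rangle, \overline{l}) = \text{sl}(\langle \mathit{node}, l, \overrightarrow{\mathit{tree}} \rangle, \overline{l}) \\
\text{(SL5)} & \text{sl}_2(\langle \mathit{node}, l, \overrightarrow{\mathit{tree}} \rangle, \overline{l}) = \text{sl}(\langle \mathit{node}, l, \overrightarrow{\mathit{tree}} \rangle, \overline{l}) \\
\text{(SL6)} & \text{sl}_1(\langle \mathit{tree}_1, \ldots, \mathit{tree}_n \rangle, \overline{l}) = \langle \text{sl}_1(\mathit{tree}_1, \overline{l}), \ldots, \text{sl}_1(\mathit{tree}_n, \overline{l}) \rangle \\
\text{(SL7)} & \text{sl}_2(\langle \mathit{tree}_1, \ldots, \mathit{tree}_n \rangle, \overline{l}) = \langle \text{sl}_2(\mathit{tree}_1, \overline{l}), \ldots, \text{sl}_2(\mathit{tree}_n, \overline{l}) \rangle \\
\text{(SL8)} & \text{sl}_1(\mathit{id}, \overline{l}) = \mathit{id} \\
\text{(SL9)} & \text{sl}_2(\mathit{id}, \overline{l}) = \langle \text{dotE}, \langle\rangle \rangle
\end{array}
\]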
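\[
\begin{array}{lcl}
\text{isClass}(\mathit{tree}, \{\mathit{class}\} \cup \overline{\mathit{class}}) & \iff & \mathit{tree} = \langle \langle \mathit{class}, \mathit{prod} \rangle, l, \overrightarrow{\mathit{tree}} \rangle \\
\text{declares}(\mathit{tree}) & \iff & \text{isClass}(\mathit{tree}, \{\text{dec}, \text{strdec}, \text{datname}, \text{conbind}\}) \\
\text{pattern}(\mathit{tree}) & \iff & \text{isClass}(\mathit{tree}, \{\text{atpat}, \text{pat}\})
\end{array}
\]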
These can be used to check whether a given tree has any binders (declares) or is a pattern (pattern). With these in place, flat is formally defined below:
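\[
\begin{array}{lcl}
\text{flat}(\langle\rangle) & = & \langle\rangle \\
\text{flat}(\langle \mathit{tree} \rangle @ \overrightarrow{\mathit{tree}}) & = &
\left\{
\begin{array}{ll}
\langle \mathit{tree}_1, \ldots, \mathit{tree}_n \rangle @ \text{flat}(\overrightarrow{\mathit{tree}}), & \text{if } \mathit{tree} = \langle \mathit{dot}, \langle \mathit{tree}_1, \ldots, \mathit{tree}_n \rangle \rangle \\
 & \text{and } (\forall i \in \{1, \ldots, n\}.\ \neg\text{declares}(\mathit{tree}_i) \text{ or } \overrightarrow{\mathit{tree}} = \langle\rangle) \\
\langle \mathit{tree} \rangle @ \text{flat}(\overrightarrow{\mathit{tree}}), & \text{otherwise}
\end{array}
\right.
\end{array}
\]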
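The definition of flat can be read as a short recursive function. The following Python sketch is an illustration only: the classes `Node` and `Dot`, the use of plain strings for leaves, and the sample trees are assumptions made for this example and are not part of the formal development.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:                    # labelled node: class, production, label, children
    cls: str
    prod: str
    label: int
    children: List["Tree"]


@dataclass
class Dot:                     # unlabelled dot node: dot kind and children
    kind: str
    children: List["Tree"]


Tree = Union[Node, Dot, str]   # leaves (identifiers) are plain strings here

DECLARING = {"dec", "strdec", "datname", "conbind"}


def declares(tree: Tree) -> bool:
    """True for labelled nodes whose class may introduce binders."""
    return isinstance(tree, Node) and tree.cls in DECLARING


def flat(trees: List[Tree]) -> List[Tree]:
    """Splice a dot node's children into the sequence when that is safe."""
    if not trees:
        return []
    head, rest = trees[0], trees[1:]
    safe = isinstance(head, Dot) and (
        not any(declares(t) for t in head.children) or not rest)
    if safe:
        return list(head.children) + flat(rest)
    return [head] + flat(rest)


decl = Node("dec", "decRec", 1, ["x", "1"])   # stands for: val x = 1
spliced = flat(["1", Dot("dotE", ["()"])])    # no declaration inside the dot
kept = flat([Dot("dotD", [decl]), "x"])       # declaration with trees after it
last = flat([Dot("dotD", [decl])])            # declaration, nothing after it
```

In `spliced` the inner dot node disappears and its child joins the outer sequence. In `kept` the dot node is preserved because it contains a declaration and is followed by another tree. In `last` the dot node is spliced even though it contains a declaration, since nothing follows it.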
A function tidy is also defined which merges dot terms containing declarations in structures. It is defined below.
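\[
\begin{array}{lcl}
\text{tidy}(\langle\rangle) & = & \langle\rangle \\
\text{tidy}(\langle \langle \text{dotD}, \overrightarrow{\mathit{tree}}_1 \rangle, \langle \text{dotD}, \overrightarrow{\mathit{tree}}_2 \rangle \rangle @ \overrightarrow{\mathit{tree}}) & = & \text{tidy}(\langle \langle \text{dotD}, \overrightarrow{\mathit{tree}}_1 @ \overrightarrow{\mathit{tree}}_2 \rangle \rangle @ \overrightarrow{\mathit{tree}}), \text{ if } \forall \mathit{tree} \in \text{ran}(\overrightarrow{\mathit{tree}}_1).\ \neg\text{declares}(\mathit{tree}) \\
\text{tidy}(\langle \langle \text{dotD}, \emptyset \rangle \rangle @ \overrightarrow{\mathit{tree}}) & = & \text{tidy}(\overrightarrow{\mathit{tree}}) \\
\text{tidy}(\langle \mathit{tree} \rangle @ \overrightarrow{\mathit{tree}}) & = & \langle \mathit{tree} \rangle @ \text{tidy}(\overrightarrow{\mathit{tree}}), \text{ if none of the above applies}
\end{array}
\]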
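The behaviour of tidy can be illustrated with a small Python sketch. As before, this is only an approximation under stated assumptions: the classes `Node` and `Dot`, the string leaves and the sample inputs are hypothetical, and the final case keeps the head tree and recurses on the remaining trees.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:                    # labelled node: class, production, label, children
    cls: str
    prod: str
    label: int
    children: List["Tree"]


@dataclass
class Dot:                     # unlabelled dot node: dot kind and children
    kind: str
    children: List["Tree"]


Tree = Union[Node, Dot, str]
DECLARING = {"dec", "strdec", "datname", "conbind"}


def declares(tree: Tree) -> bool:
    return isinstance(tree, Node) and tree.cls in DECLARING


def is_dot_d(tree: Tree) -> bool:
    return isinstance(tree, Dot) and tree.kind == "dotD"


def tidy(trees: List[Tree]) -> List[Tree]:
    """Merge adjacent declaration dot nodes and drop empty ones."""
    if not trees:
        return []
    head, rest = trees[0], trees[1:]
    # Two adjacent dotD nodes merge when the first declares nothing.
    if (is_dot_d(head) and rest and is_dot_d(rest[0])
            and not any(declares(t) for t in head.children)):
        merged = Dot("dotD", head.children + rest[0].children)
        return tidy([merged] + rest[1:])
    # An empty dotD node carries no information and is dropped.
    if is_dot_d(head) and not head.children:
        return tidy(rest)
    return [head] + tidy(rest)


decl = Node("dec", "decRec", 1, ["x", "1"])
merged_all = tidy([Dot("dotD", ["a"]), Dot("dotD", ["b"]), Dot("dotD", [])])
unmerged = tidy([Dot("dotD", [decl]), Dot("dotD", ["b"])])
```

In `merged_all` the three dot nodes collapse into a single one with children `a` and `b`. In `unmerged` nothing changes, because the first dot node contains a declaration whose scope must not be widened by merging.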
3.9.3 Algorithm
Figure 3.25 formally defines the slicing algorithm. In this figure, let $\text{sl}(\mathit{strdec}, \overline{l})$ be an abbreviation for $\text{sl}(\text{toTree}(\mathit{strdec}), \overline{l})$.
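To make rules (SL1) to (SL9) concrete, the following Python sketch transcribes them over a simple tree representation. It is a hedged illustration, not the implementation: the classes `Node` and `Dot`, the string leaves, the table `GET_DOT` and the encoding of the constants 1 and () as identifier-like atomic expressions are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass
class Node:
    cls: str
    prod: str
    label: int
    children: List["Tree"]


@dataclass
class Dot:
    kind: str
    children: List["Tree"]


Tree = Union[Node, Dot, str]
DECLARING = {"dec", "strdec", "datname", "conbind"}
GET_DOT = {"lTc": "dotE", "lDcon": "dotE", "ty": "dotE", "conbind": "dotE",
           "datname": "dotE", "dec": "dotD", "atexp": "dotE", "exp": "dotE",
           "atpat": "dotP", "pat": "dotP", "strdec": "dotD", "strexp": "dotS"}


def declares(t: Tree) -> bool:
    return isinstance(t, Node) and t.cls in DECLARING


def flat(ts: List[Tree]) -> List[Tree]:
    if not ts:
        return []
    head, rest = ts[0], ts[1:]
    if isinstance(head, Dot) and (
            not any(declares(t) for t in head.children) or not rest):
        return list(head.children) + flat(rest)
    return [head] + flat(rest)


def tidy(ts: List[Tree]) -> List[Tree]:
    if not ts:
        return []
    head, rest = ts[0], ts[1:]
    head_d = isinstance(head, Dot) and head.kind == "dotD"
    if (head_d and rest and isinstance(rest[0], Dot) and rest[0].kind == "dotD"
            and not any(declares(t) for t in head.children)):
        return tidy([Dot("dotD", head.children + rest[0].children)] + rest[1:])
    if head_d and not head.children:
        return tidy(rest)
    return [head] + tidy(rest)


def sl(tree: Node, labels: Set[int]) -> Tree:
    dot = GET_DOT[tree.cls]
    if tree.label in labels:                                  # (SL1), cases 1-2
        kids = [sl1(t, labels) for t in tree.children]        # (SL6)
        if dot == "dotS":
            kids = tidy(kids)
        return Node(tree.cls, tree.prod, tree.label, kids)
    kids = [sl2(t, labels) for t in tree.children]            # (SL7)
    return Dot(dot, flat(kids))                               # (SL1), case 3


def sl1(tree: Tree, labels: Set[int]) -> Tree:
    if isinstance(tree, Dot):                                 # (SL2)
        return Dot(tree.kind, flat([sl2(t, labels) for t in tree.children]))
    if isinstance(tree, Node):                                # (SL4)
        return sl(tree, labels)
    return tree                                               # (SL8)


def sl2(tree: Tree, labels: Set[int]) -> Tree:
    if isinstance(tree, Dot):                                 # (SL3)
        return Dot(tree.kind, flat([sl2(t, labels) for t in tree.children]))
    if isinstance(tree, Node):                                # (SL5)
        return sl(tree, labels)
    return Dot("dotE", [])                                    # (SL9)


# Application of 1 (label 1) to () (label 2), itself labelled 3.
app = Node("exp", "app", 3, [Node("atexp", "id", 1, ["1"]),
                             Node("atexp", "id", 2, ["()"])])
sliced = sl(app, {1, 3})
```

Slicing the application with labels 1 and 3 keeps the outer node and the constant 1, while the node labelled 2 is replaced by an empty expression dot node, which mirrors the dot-term example given earlier in this section.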