
3.8 Minimisation and enumeration

3.8.4 Minimisation

An error discovered during constraint solving is not always minimal, that is, some of the labels attributed to the error may be extraneous. The minimisation algorithm is shown in figure 3.19.

The $\to_{\mathrm{test}}$ relation determines whether a given label can be removed from the set of labels associated with an error without causing the error to no longer exist in the user program. Let $\to^{*}_{\mathrm{test}}$ be the reflexive and transitive closure of $\to_{\mathrm{test}}$.

Minimisation is separated into two phases (labelled phase1 and phase2 in figure 3.19 respectively). In phase one, binders are turned into dummy binders, which can potentially remove large sections of code. In phase two, labels are removed one at a time until a minimal set of labels attributed to the error is found.

A minimisation step is represented as $\langle e, \overline{l}_1, \{l\} \uplus \overline{l}_2\rangle \to_{\mathrm{test}} \langle e, \overline{l}_3, \overline{l}_4\rangle$, where $\overline{l}_3$ and $\overline{l}_4$ depend upon the solvability of $\mathsf{filt}(e, \overline{l}_1 \cup \overline{l}_2, \{l\})$, referred to now as $e'$. The full set of labels for the error the minimiser is working on is the set $\overline{l}_1 \cup \overline{l}_2 \cup \{l\}$, and $\{l\} \uplus \overline{l}_2$ is the label set where discard attempts must still be made. The new environment $e'$ is obtained from $e$ by filtering out the constraints not labelled by $\overline{l}_1 \cup \overline{l}_2 \cup \{l\}$, by filtering out accessors and equality constraints annotated with the label $l$, and by creating dummy binders and environment variables for binders and environment variables which were annotated with this label. If the new environment is solvable, then label $l$ must be in the error's label set for the error to occur, and so $\overline{l}_3 = \overline{l}_1 \cup \{l\}$ and $\overline{l}_4 = \overline{l}_2$; otherwise this label is extraneous and can be removed.
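The label-by-label loop of phase two can be pictured with a small sketch. It is illustrative only and is not Skalpel's implementation: `still_fails` is a hypothetical oracle standing in for "the filtered environment is unsolvable", `toy_oracle` is an invented example, and phase one (dummy binders) is not modelled.

```python
from typing import Callable, FrozenSet, Iterable, Set

Label = int


def minimise(labels: Iterable[Label],
             still_fails: Callable[[FrozenSet[Label]], bool]) -> Set[Label]:
    """Try to discard the labels of an error one at a time.

    `still_fails(ls)` is a hypothetical oracle: it answers whether the
    error survives when only the constraints labelled by `ls` are kept.
    """
    kept: Set[Label] = set()      # plays the role of l1: labels known to be needed
    todo = list(labels)           # plays the role of {l} + l2: labels still to test
    while todo:
        l = todo.pop()            # the label l under test
        remaining = frozenset(kept) | frozenset(todo)
        if still_fails(remaining):
            continue              # error persists without l, so l is extraneous
        kept.add(l)               # environment became solvable, so l is needed
    return kept


def toy_oracle(ls: FrozenSet[Label]) -> bool:
    # Invented example: the error needs exactly labels 1 and 3 to occur.
    return {1, 3} <= ls


minimal = minimise([1, 2, 3, 4], toy_oracle)
print(sorted(minimal))
```

Each iteration corresponds to one test step: the label under test is either moved into the set of necessary labels or dropped, and the loop ends when no labels are left to try.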

3.8.5 Enumeration

An enumeration step is denoted by the relation $\to_{e}$, with $\to^{*}_{e}$ being its reflexive (w.r.t. EnumState) and transitive closure. Enumeration states are defined below:

EnumState ::= enum(e) | enum(e, er, l) | errors(er)

The enumeration process always starts in the state enum(e) and ends in the state errors(er). The enumeration algorithm creates filters, which form the search space explored when searching for errors, and starts with the empty filter (which causes all constraints to be considered).

After an error has been found and the minimisation process has been completed, the labels of the error that has been located are used to build new filters (see $\overline{l}'$ in rule (ENUM4)). When all filters are exhausted the enumeration algorithm stops. After the enumeration algorithm has stopped, the errors that have been found are all the minimal type errors in the analyzed piece of code.
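The filter-driven search described above can be sketched as a worklist loop. This is a simplified illustration under stated assumptions, not the rules (ENUM1) to (ENUM4): `find_minimal_error` is a hypothetical function that bundles constraint solving and minimisation, and `CONFLICTS` is an invented toy program whose minimal errors are known in advance.

```python
from typing import Callable, FrozenSet, List, Optional, Set

Labels = FrozenSet[int]


def enumerate_errors(
    find_minimal_error: Callable[[Labels], Optional[Labels]]
) -> Set[Labels]:
    """Collect minimal errors by exploring filters, starting from the empty one.

    `find_minimal_error(filt)` is hypothetical: it ignores every constraint
    labelled by `filt` and returns a minimal error, or None if solvable.
    """
    errors: Set[Labels] = set()
    empty: Labels = frozenset()
    pending: List[Labels] = [empty]   # filters still to try
    seen: Set[Labels] = {empty}       # avoid exploring a filter twice
    while pending:
        filt = pending.pop()
        err = find_minimal_error(filt)
        if err is None:
            continue                  # nothing left to find under this filter
        errors.add(err)
        for l in err:                 # each label of the error extends the filter
            new_filter = filt | {l}
            if new_filter not in seen:
                seen.add(new_filter)
                pending.append(new_filter)
    return errors


# Invented toy data: the minimal errors of an imaginary program.
CONFLICTS = [frozenset({1, 2}), frozenset({2, 3}), frozenset({4, 5})]


def toy_find(filt: Labels) -> Optional[Labels]:
    # Return the first known error that uses none of the filtered labels.
    for conflict in CONFLICTS:
        if conflict.isdisjoint(filt):
            return conflict
    return None


found = enumerate_errors(toy_find)
print(sorted(sorted(e) for e in found))
```

Filtering out one label of an error that has already been reported forces the search to find a different error, which is why every label of the minimised error yields a new filter.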

3.9 Slicing

3.9.1 Dot Terms

After an error has been located in the user program, a type error slice is made from the labels and the error kind $ek$. This is done by the slicing function sl defined in figure 3.25. Any program nodes annotated with labels that do not occur in the error's label set are replaced by “dot” terms, which show that some program nodes have been thrown away because they do not contribute to the error. For example, if the node annotated with label $l_2$ is removed from $\lceil 1^{l_1}\ ()^{l_2} \rceil^{l_3}$, then $\lceil 1^{l_1}\ \texttt{dot-e}(\emptyset) \rceil^{l_3}$ results. This is displayed as 1 ⟨..⟩.

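Any syntactic form that can be produced using the grammar rules defined in the combination of figures 3.8 and 3.21 is referred to as a slice, and a type error slice is any slice for which the constraint generation algorithm (which has been extended to dot terms) only generates unsolvable constraints.

Figure 3.21 Extension of the syntax and constraint generator to “dot” terms

extension of the constraint syntax

$$
\begin{aligned}
\mathit{LabTyCon} &::= \cdots \mid \texttt{dot-e}(\overrightarrow{term}) \\
\mathit{LabDatCon} &::= \cdots \mid \texttt{dot-e}(\overrightarrow{term}) \\
\mathit{Ty} &::= \cdots \mid \texttt{dot-e}(\overrightarrow{term}) \\
\mathit{ConBind} &::= \cdots \mid \texttt{dot-e}(\overrightarrow{term}) \\
\mathit{Pat} &::= \cdots \mid \texttt{dot-p}(\overrightarrow{pat}) \\
\mathit{StrDec} &::= \cdots \mid \texttt{dot-d}(\overrightarrow{term}) \\
\mathit{StrExp} &::= \cdots \mid \texttt{dot-s}(\overrightarrow{term})
\end{aligned}
$$

extension of the constraint generator

$$
\begin{aligned}
\text{(G24)}\quad & [\![\texttt{dot-d}(\langle term_1, \ldots, term_n\rangle)]\!] = [[\![term_1]\!]; \cdots; [\![term_n]\!]] \\
\text{(G25)}\quad & [\![\texttt{dot-p}(\langle pat_1, \ldots, pat_n\rangle), \alpha]\!] = [\![pat_1]\!]; \cdots; [\![pat_n]\!] \\
\text{(G26)}\quad & [\![\texttt{dot-s}(\langle term_1, \ldots, term_n\rangle), ev]\!] = [[\![term_1]\!]; \cdots; [\![term_n]\!]] \\
\text{(G27)}\quad & [\![\texttt{dot-e}(\langle term_1, \ldots, term_n\rangle), \alpha]\!] = [[\![term_1]\!]; \cdots; [\![term_n]\!]]
\end{aligned}
$$

Figure 3.22 Labelled abstract syntax trees

$$
\begin{aligned}
dot \in \mathsf{Dot} &::= \mathsf{dotE} \mid \mathsf{dotP} \mid \mathsf{dotD} \mid \mathsf{dotS} \\
node \in \mathsf{Node} &::= \langle class, prod\rangle \\
tree \in \mathsf{Tree} &::= \langle node, l, \overrightarrow{tree}\rangle \mid \langle dot, \overrightarrow{tree}\rangle \mid id
\end{aligned}
$$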

An alternative definition of the external labelled syntax presented in figure 3.8 is given here. Figure 3.22 defines the labelled abstract syntax trees, where a node in a tree $tree$ can be a labelled node of the form $\langle node, l, \overrightarrow{tree}\rangle$, an unlabelled “dot” node of the form $\langle dot, \overrightarrow{tree}\rangle$, or a leaf of the form $id$. The function toTree, defined in figure 3.23, associates a tree with every term (toTree is also defined on a sequence of terms).

Figure 3.24 also defines the function getDot which generates terms in Dot from nodes. This function is used in the slicing algorithm to generate dot nodes from labelled nodes.

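Figure 3.23 From terms to trees

$$
\begin{aligned}
\mathsf{toTree}(tc^{l}) &= \langle\langle \mathsf{lTc}, \mathsf{id}\rangle, l, \langle tc\rangle\rangle \\
\mathsf{toTree}(dcon^{l}) &= \langle\langle \mathsf{lDcon}, \mathsf{id}\rangle, l, \langle dcon\rangle\rangle \\
\mathsf{toTree}(tv^{l}) &= \langle\langle \mathsf{ty}, \mathsf{id}\rangle, l, \langle tv\rangle\rangle \\
\mathsf{toTree}(ty_1 \stackrel{l}{\to} ty_2) &= \langle\langle \mathsf{ty}, \mathsf{tyArr}\rangle, l, \langle \mathsf{toTree}(ty_1), \mathsf{toTree}(ty_2)\rangle\rangle \\
\mathsf{toTree}(\lceil ty\ ltc \rceil^{l}) &= \langle\langle \mathsf{ty}, \mathsf{tyCon}\rangle, l, \langle \mathsf{toTree}(ty), \mathsf{toTree}(ltc)\rangle\rangle \\
\mathsf{toTree}(dcon^{l}_{c}) &= \langle\langle \mathsf{conbind}, \mathsf{id}\rangle, l, \langle dcon\rangle\rangle \\
\mathsf{toTree}(dcon\ \texttt{of}^{l}\ ty) &= \langle\langle \mathsf{conbind}, \mathsf{conbindOf}\rangle, l, \langle dcon, \mathsf{toTree}(ty)\rangle\rangle \\
\mathsf{toTree}(\lceil tv\ tc \rceil^{l}) &= \langle\langle \mathsf{datname}, \mathsf{datnameCon}\rangle, l, \langle tv, tc\rangle\rangle \\
\mathsf{toTree}(\texttt{val rec}\ pat \stackrel{l}{=} exp) &= \langle\langle \mathsf{dec}, \mathsf{decRec}\rangle, l, \langle \mathsf{toTree}(pat), \mathsf{toTree}(exp)\rangle\rangle \\
\mathsf{toTree}(\texttt{datatype}\ dn \stackrel{l}{=} cb) &= \langle\langle \mathsf{dec}, \mathsf{decDat}\rangle, l, \langle \mathsf{toTree}(dn), \mathsf{toTree}(cb)\rangle\rangle \\
\mathsf{toTree}(\texttt{open}^{l}\ strid) &= \langle\langle \mathsf{dec}, \mathsf{decOpn}\rangle, l, \langle strid\rangle\rangle \\
\mathsf{toTree}(vid^{l}_{e}) &= \langle\langle \mathsf{atexp}, \mathsf{id}\rangle, l, \langle vid\rangle\rangle \\
\mathsf{toTree}(\texttt{let}^{l}\ dec\ \texttt{in}\ exp\ \texttt{end}) &= \langle\langle \mathsf{atexp}, \mathsf{atexpLet}\rangle, l, \langle \mathsf{toTree}(dec), \mathsf{toTree}(exp)\rangle\rangle \\
\mathsf{toTree}(\texttt{fn}\ pat \stackrel{l}{\Rightarrow} exp) &= \langle\langle \mathsf{exp}, \mathsf{expFn}\rangle, l, \langle \mathsf{toTree}(pat), \mathsf{toTree}(exp)\rangle\rangle \\
\mathsf{toTree}(\lceil exp\ atexp \rceil^{l}) &= \langle\langle \mathsf{exp}, \mathsf{app}\rangle, l, \langle \mathsf{toTree}(exp), \mathsf{toTree}(atexp)\rangle\rangle \\
\mathsf{toTree}(vid^{l}_{p}) &= \langle\langle \mathsf{atpat}, \mathsf{id}\rangle, l, \langle vid\rangle\rangle \\
\mathsf{toTree}(\lceil ldcon\ atpat \rceil^{l}) &= \langle\langle \mathsf{pat}, \mathsf{app}\rangle, l, \langle \mathsf{toTree}(ldcon), \mathsf{toTree}(atpat)\rangle\rangle \\
\mathsf{toTree}(\texttt{structure}\ strid \stackrel{l}{=} strexp) &= \langle\langle \mathsf{strdec}, \mathsf{strdecStr}\rangle, l, \langle strid, \mathsf{toTree}(strexp)\rangle\rangle \\
\mathsf{toTree}(strid^{l}) &= \langle\langle \mathsf{strexp}, \mathsf{id}\rangle, l, \langle strid\rangle\rangle \\
\mathsf{toTree}(\texttt{struct}^{l}\ strdec_1 \cdots strdec_n\ \texttt{end}) &= \langle\langle \mathsf{strexp}, \mathsf{strexpSt}\rangle, l, \mathsf{toTree}(\langle strdec_1, \ldots, strdec_n\rangle)\rangle \\
\mathsf{toTree}(\langle term_1, \ldots, term_n\rangle) &= \langle \mathsf{toTree}(term_1), \ldots, \mathsf{toTree}(term_n)\rangle \\
\mathsf{toTree}(\texttt{dot-e}(\overrightarrow{term})) &= \langle \mathsf{dotE}, \mathsf{toTree}(\overrightarrow{term})\rangle \\
\mathsf{toTree}(\texttt{dot-d}(\overrightarrow{term})) &= \langle \mathsf{dotD}, \mathsf{toTree}(\overrightarrow{term})\rangle \\
\mathsf{toTree}(\texttt{dot-p}(\overrightarrow{pat})) &= \langle \mathsf{dotP}, \mathsf{toTree}(\overrightarrow{pat})\rangle \\
\mathsf{toTree}(\texttt{dot-s}(\overrightarrow{term})) &= \langle \mathsf{dotS}, \mathsf{toTree}(\overrightarrow{term})\rangle
\end{aligned}
$$

3.9.2 Tidying

Defined here is flat, which flattens a sequence of terms. For example, flattening ⟨..1..⟨..()..⟩..⟩ gives ⟨..1..()..⟩. Note that nested dot terms are not always flattened, as doing so can sometimes change the semantics. For example,

⟨..val x = false..⟨..val x = 1..⟩..x + 1..⟩

is not flattened to become

⟨..val x = false..val x = 1..x + 1..⟩

as the semantics would change: the first is a typable slice and the second is not. The predicates defined below, after figures 3.24 and 3.25, check the classes of trees: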


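Figure 3.24 Definition of getDot

$$
\begin{aligned}
\mathsf{getDot}(\langle \mathsf{lTc}, prod\rangle) &= \mathsf{dotE} \\
\mathsf{getDot}(\langle \mathsf{lDcon}, prod\rangle) &= \mathsf{dotE} \\
\mathsf{getDot}(\langle \mathsf{ty}, prod\rangle) &= \mathsf{dotE} \\
\mathsf{getDot}(\langle \mathsf{conbind}, prod\rangle) &= \mathsf{dotE} \\
\mathsf{getDot}(\langle \mathsf{datname}, prod\rangle) &= \mathsf{dotE} \\
\mathsf{getDot}(\langle \mathsf{dec}, prod\rangle) &= \mathsf{dotD} \\
\mathsf{getDot}(\langle \mathsf{atexp}, prod\rangle) &= \mathsf{dotE} \\
\mathsf{getDot}(\langle \mathsf{exp}, prod\rangle) &= \mathsf{dotE} \\
\mathsf{getDot}(\langle \mathsf{atpat}, prod\rangle) &= \mathsf{dotP} \\
\mathsf{getDot}(\langle \mathsf{pat}, prod\rangle) &= \mathsf{dotP} \\
\mathsf{getDot}(\langle \mathsf{strdec}, prod\rangle) &= \mathsf{dotD} \\
\mathsf{getDot}(\langle \mathsf{strexp}, prod\rangle) &= \mathsf{dotS}
\end{aligned}
$$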

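Figure 3.25 Slicing algorithm

$$
\begin{aligned}
\text{(SL1)}\quad & \mathsf{sl}(\langle node, l, \overrightarrow{tree}\rangle, \overline{l}) =
\begin{cases}
\langle node, l, \mathsf{sl}_1(\overrightarrow{tree}, \overline{l})\rangle, & \text{if } l \in \overline{l} \text{ and } \mathsf{getDot}(node) \neq \mathsf{dotS} \\
\langle node, l, \mathsf{tidy}(\mathsf{sl}_1(\overrightarrow{tree}, \overline{l}))\rangle, & \text{if } l \in \overline{l} \text{ and } \mathsf{getDot}(node) = \mathsf{dotS} \\
\langle \mathsf{getDot}(node), \mathsf{flat}(\mathsf{sl}_2(\overrightarrow{tree}, \overline{l}))\rangle, & \text{otherwise}
\end{cases} \\
\text{(SL2)}\quad & \mathsf{sl}_1(\langle dot, \langle tree_1, \ldots, tree_n\rangle\rangle, \overline{l}) = \langle dot, \mathsf{flat}(\langle \mathsf{sl}_2(tree_1, \overline{l}), \ldots, \mathsf{sl}_2(tree_n, \overline{l})\rangle)\rangle \\
\text{(SL3)}\quad & \mathsf{sl}_2(\langle dot, \langle tree_1, \ldots, tree_n\rangle\rangle, \overline{l}) = \langle dot, \mathsf{flat}(\langle \mathsf{sl}_2(tree_1, \overline{l}), \ldots, \mathsf{sl}_2(tree_n, \overline{l})\rangle)\rangle \\
\text{(SL4)}\quad & \mathsf{sl}_1(\langle node, l, \overrightarrow{tree}\rangle, \overline{l}) = \mathsf{sl}(\langle node, l, \overrightarrow{tree}\rangle, \overline{l}) \\
\text{(SL5)}\quad & \mathsf{sl}_2(\langle node, l, \overrightarrow{tree}\rangle, \overline{l}) = \mathsf{sl}(\langle node, l, \overrightarrow{tree}\rangle, \overline{l}) \\
\text{(SL6)}\quad & \mathsf{sl}_1(\langle tree_1, \ldots, tree_n\rangle, \overline{l}) = \langle \mathsf{sl}_1(tree_1, \overline{l}), \ldots, \mathsf{sl}_1(tree_n, \overline{l})\rangle \\
\text{(SL7)}\quad & \mathsf{sl}_2(\langle tree_1, \ldots, tree_n\rangle, \overline{l}) = \langle \mathsf{sl}_2(tree_1, \overline{l}), \ldots, \mathsf{sl}_2(tree_n, \overline{l})\rangle \\
\text{(SL8)}\quad & \mathsf{sl}_1(id, \overline{l}) = id \\
\text{(SL9)}\quad & \mathsf{sl}_2(id, \overline{l}) = \langle \mathsf{dotE}, \langle\rangle\rangle
\end{aligned}
$$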

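$$
\begin{aligned}
\mathsf{isClass}(tree, \{class\} \cup \overline{class}) &\iff tree = \langle\langle class, prod\rangle, l, \overrightarrow{tree}\rangle \\
\mathsf{declares}(tree) &\iff \mathsf{isClass}(tree, \{\mathsf{dec}, \mathsf{strdec}, \mathsf{datname}, \mathsf{conbind}\}) \\
\mathsf{pattern}(tree) &\iff \mathsf{isClass}(tree, \{\mathsf{atpat}, \mathsf{pat}\})
\end{aligned}
$$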

These can be used to check whether a given tree has any binders (declares) or is a pattern (pattern). With these in place, flat is formally defined below:

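$$
\begin{aligned}
\mathsf{flat}(\langle\rangle) &= \langle\rangle \\
\mathsf{flat}(\langle tree\rangle @ \overrightarrow{tree}) &=
\begin{cases}
\langle tree_1, \ldots, tree_n\rangle @ \mathsf{flat}(\overrightarrow{tree}), & \text{if } tree = \langle dot, \langle tree_1, \ldots, tree_n\rangle\rangle \\
& \text{and } (\forall i \in \{1, \ldots, n\}.\ \neg\mathsf{declares}(tree_i) \text{ or } \overrightarrow{tree} = \langle\rangle) \\
\langle tree\rangle @ \mathsf{flat}(\overrightarrow{tree}), & \text{otherwise}
\end{cases}
\end{aligned}
$$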
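The definition of flat can be exercised with a small sketch. The encoding is an assumption made for illustration: the `Node` and `Dot` classes, the use of tuples for sequences, plain strings for leaf ids, and the production names in the examples are all hypothetical and are not taken from Skalpel's source.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Node:                       # labelled node <<class, prod>, l, trees>
    cls: str
    prod: str
    label: int
    children: Tuple["Tree", ...]


@dataclass(frozen=True)
class Dot:                        # dot node <dot, trees>
    kind: str                     # "dotE", "dotP", "dotD" or "dotS"
    children: Tuple["Tree", ...]


Tree = Union[Node, Dot, str]      # a plain string stands for a leaf id


def declares(tree) -> bool:
    """True when the tree is a labelled node of a binding class."""
    return isinstance(tree, Node) and tree.cls in {"dec", "strdec", "datname", "conbind"}


def flat(trees):
    """Splice a dot node's children into the sequence when that is safe."""
    if not trees:
        return ()
    head, rest = trees[0], trees[1:]
    safe = isinstance(head, Dot) and (
        not any(declares(child) for child in head.children) or not rest
    )
    if safe:
        return head.children + flat(rest)
    return (head,) + flat(rest)


# Hypothetical encodings of the terms used in the examples of this section.
one = Node("atexp", "id", 1, ("1",))
unit = Node("atexp", "id", 2, ("()",))
val_false = Node("dec", "decRec", 3, ("x", "false"))
val_one = Node("dec", "decRec", 4, ("x", "1"))
use_x = Node("exp", "app", 5, ("x", "1"))

flattened = flat((one, Dot("dotE", (unit,))))
kept_nested = flat((val_false, Dot("dotD", (val_one,)), use_x))
```

In the second call the nested dot term holds a declaration and is followed by another term, so it is left in place, which matches the `val x = false` example above.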

A function tidy is also defined which merges dot terms containing declarations in structures. It is defined below.

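$$
\begin{aligned}
\mathsf{tidy}(\langle\rangle) &= \langle\rangle \\
\mathsf{tidy}(\langle\langle \mathsf{dotD}, \overrightarrow{tree_1}\rangle, \langle \mathsf{dotD}, \overrightarrow{tree_2}\rangle\rangle @ \overrightarrow{tree}) &= \mathsf{tidy}(\langle\langle \mathsf{dotD}, \overrightarrow{tree_1} @ \overrightarrow{tree_2}\rangle\rangle @ \overrightarrow{tree}), \quad \text{if } \forall tree \in \mathsf{ran}(\overrightarrow{tree_1}).\ \neg\mathsf{declares}(tree) \\
\mathsf{tidy}(\langle\langle \mathsf{dotD}, \emptyset\rangle\rangle @ \overrightarrow{tree}) &= \mathsf{tidy}(\overrightarrow{tree}) \\
\mathsf{tidy}(\langle tree\rangle @ \overrightarrow{tree}) &= \langle tree\rangle @ \mathsf{tidy}(\overrightarrow{tree}), \quad \text{if none of the above applies}
\end{aligned}
$$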
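A minimal sketch of tidy follows, tried in the order the clauses are listed. It makes two assumptions: the tree encoding (`Node`, `Dot`, tuples, string leaves) is hypothetical, and the final clause is read as keeping the head of the sequence and tidying the tail.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Node:                       # labelled node <<class, prod>, l, trees>
    cls: str
    prod: str
    label: int
    children: Tuple["Tree", ...]


@dataclass(frozen=True)
class Dot:                        # dot node <dot, trees>
    kind: str
    children: Tuple["Tree", ...]


Tree = Union[Node, Dot, str]      # a plain string stands for a leaf id


def declares(tree) -> bool:
    return isinstance(tree, Node) and tree.cls in {"dec", "strdec", "datname", "conbind"}


def is_dot_d(tree) -> bool:
    return isinstance(tree, Dot) and tree.kind == "dotD"


def tidy(trees):
    """Merge adjacent declaration dot terms and drop empty ones."""
    if not trees:
        return ()
    head, rest = trees[0], trees[1:]
    # Two dotD terms in a row merge when the first one binds nothing.
    if (rest and is_dot_d(head) and is_dot_d(rest[0])
            and not any(declares(t) for t in head.children)):
        merged = Dot("dotD", head.children + rest[0].children)
        return tidy((merged,) + rest[1:])
    # An empty dotD term carries no information and is dropped.
    if is_dot_d(head) and not head.children:
        return tidy(rest)
    # Assumed reading of the last clause: keep the head, tidy the tail.
    return (head,) + tidy(rest)


# Hypothetical example terms.
val_x = Node("dec", "decRec", 1, ("x", "1"))
merged_result = tidy((Dot("dotD", ()), Dot("dotD", ("x",)), Dot("dotD", ())))
binding_first = tidy((Dot("dotD", (val_x,)), Dot("dotD", ("x",))))
```

The second example is left unchanged because the first dot term contains a declaration, so merging it with what follows could alter which binder a later identifier refers to.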


3.9.3 Algorithm

Figure 3.25 formally defines the slicing algorithm. In this figure let $\mathsf{sl}(strdec, \overline{l})$ be an abbreviation for $\mathsf{sl}(\mathsf{toTree}(strdec), \overline{l})$.
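Rules (SL1) to (SL9) can be followed in a compact sketch. It is an illustration under assumptions rather than Skalpel's implementation: the `Node` and `Dot` encoding, tuples for sequences, string leaves, the treatment of the constants `1` and `()` as identifier leaves, and the reading of the last tidy clause as "keep the head, tidy the tail" are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Node:                       # labelled node <<class, prod>, l, trees>
    cls: str
    prod: str
    label: int
    children: Tuple["Tree", ...]


@dataclass(frozen=True)
class Dot:                        # dot node <dot, trees>
    kind: str
    children: Tuple["Tree", ...]


Tree = Union[Node, Dot, str]      # a plain string stands for a leaf id

# getDot, keyed on the node class only (the production is ignored).
GET_DOT = {"lTc": "dotE", "lDcon": "dotE", "ty": "dotE", "conbind": "dotE",
           "datname": "dotE", "dec": "dotD", "atexp": "dotE", "exp": "dotE",
           "atpat": "dotP", "pat": "dotP", "strdec": "dotD", "strexp": "dotS"}


def declares(t):
    return isinstance(t, Node) and t.cls in {"dec", "strdec", "datname", "conbind"}


def flat(trees):
    if not trees:
        return ()
    head, rest = trees[0], trees[1:]
    if isinstance(head, Dot) and (not rest or not any(map(declares, head.children))):
        return head.children + flat(rest)
    return (head,) + flat(rest)


def tidy(trees):
    if not trees:
        return ()
    head, rest = trees[0], trees[1:]
    dot_d = lambda t: isinstance(t, Dot) and t.kind == "dotD"
    if rest and dot_d(head) and dot_d(rest[0]) and not any(map(declares, head.children)):
        return tidy((Dot("dotD", head.children + rest[0].children),) + rest[1:])
    if dot_d(head) and not head.children:
        return tidy(rest)
    return (head,) + tidy(rest)   # assumed reading of the final tidy clause


def sl(tree, labels):             # (SL1)
    dot = GET_DOT[tree.cls]
    if tree.label in labels:
        kids = tuple(sl1(c, labels) for c in tree.children)           # (SL6)
        return Node(tree.cls, tree.prod, tree.label,
                    tidy(kids) if dot == "dotS" else kids)
    return Dot(dot, flat(tuple(sl2(c, labels) for c in tree.children)))  # (SL7)


def sl1(t, labels):
    if isinstance(t, Dot):        # (SL2)
        return Dot(t.kind, flat(tuple(sl2(c, labels) for c in t.children)))
    if isinstance(t, Node):       # (SL4)
        return sl(t, labels)
    return t                      # (SL8): an id under a kept node survives


def sl2(t, labels):
    if isinstance(t, Dot):        # (SL3)
        return Dot(t.kind, flat(tuple(sl2(c, labels) for c in t.children)))
    if isinstance(t, Node):       # (SL5)
        return sl(t, labels)
    return Dot("dotE", ())        # (SL9): an id under a removed node vanishes


# The application of 1 (label 1) to () (label 2), itself labelled 3.
one = Node("atexp", "id", 1, ("1",))
unit = Node("atexp", "id", 2, ("()",))
app = Node("exp", "app", 3, (one, unit))
sliced = sl(app, frozenset({1, 3}))
```

Slicing with the label set {1, 3} keeps the application node and the constant 1, and replaces the unit argument by an empty `dotE` node, which corresponds to the dot term shown for this example in section 3.9.1.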

In document New developments to Skalpel : a type error slicing method for explaining errors in type and effect systems (Page 73-78)