
6.3 Challenges for complete type inference

6.3.1 Typed rigid E-unification

First, most treatments of unification deal with untyped terms and attempt to find most general unifiers (which typically contain unassigned unification variables). When using unification in a dependently typed language, however, leaving unification variables unassigned is not acceptable, since they can correspond to unproved lemmas. So in order to accept a program as well-typed, the typechecker needs to find a ground substitution for every unification variable. This is a source of undecidability, since it is undecidable whether a given type has inhabitants or not.

Even if we do not require the algorithm to pick ground unifiers, in a typed setting we still need to make sure to produce only well-typed unifiers. A priori, having type information available could either help or hurt. If we can rule out candidate substitutions because they don't have the right type, then types are helpful. On the other hand, if we must consider all the same candidate solutions as for the untyped problem, and then additionally check that they are well-typed, then the types slow us down.

Gallier and Isakowitz [55] studied rigid E-unification for a simply-typed system, and found that type information could be used to prune the search early. In that setting, when considering candidates for a unification variable X^B, we need only consider expressions of type B.
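The pruning idea for the simply-typed case can be illustrated with a minimal sketch. It is not Gallier and Isakowitz's algorithm, only the filtering step described above. The toy context, the term names, and the helper candidates_for are all hypothetical.

```python
# Hypothetical toy context: each candidate expression paired with its simple type.
# In a simply-typed setting these types are fixed and do not depend on any
# substitution, so filtering by type is sound.
context = {
    "zero": "Nat",
    "succ zero": "Nat",
    "true": "Bool",
    "nil": "List Nat",
}


def candidates_for(expected_type, typed_terms):
    """Return only the terms whose simple type matches the expected type."""
    return [term for term, ty in typed_terms.items() if ty == expected_type]


# Candidates for a unification variable annotated with type Nat.
nat_candidates = candidates_for("Nat", context)
print(nat_candidates)
```

The search over candidate instantiations then only needs to range over nat_candidates rather than over every term in the context.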

Unfortunately, with dependent types, we are not so lucky. Consider the context

f : [T:Type] → T → Type, A : Type, B : Type, y : A, X : B, h1 : A = f [A] y, h2 : f [B] X = B

It is not the case that Γ ⊢ y : B. However, {y/X}Γ ⊢ y : B. In other words, it is not sufficient to consider only expressions which have type B, because the type of the expression may change after we carry out the instantiation. This context is not inconsistent: one can inhabit it, e.g., by setting A and B to Nat and f to the constant Nat function. So this style of example seems hard to avoid.
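The effect of the instantiation on the types can be checked mechanically with a small sketch. It relies on an assumption that is not stated in this section: bracketed arguments such as [A] are erased before terms are compared, so that f [A] y and f [B] y are identified. The term encoding and the helpers erase, substitute and types_equal are hypothetical. A plain union-find over whole erased terms stands in for a full congruence closure, which is enough for this example.

```python
def erase(term):
    """Drop bracketed (erased) arguments, encoded here as ("[]", t)."""
    if isinstance(term, str):
        return term
    head, *args = term
    kept = [erase(a) for a in args if not (isinstance(a, tuple) and a[0] == "[]")]
    return (head, *kept)


def substitute(term, var, replacement):
    """Replace every occurrence of the variable `var` in `term`."""
    if isinstance(term, str):
        return replacement if term == var else term
    return tuple(substitute(t, var, replacement) for t in term)


def types_equal(equations, s, t):
    """Decide s = t from the equations, using union-find on erased terms.

    Simplification: no congruence propagation into subterms is performed.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    for lhs, rhs in equations:
        parent[find(erase(lhs))] = find(erase(rhs))
    return find(erase(s)) == find(erase(t))


# h1 : A = f [A] y   and   h2 : f [B] X = B
h1 = ("A", ("f", ("[]", "A"), "y"))
h2 = (("f", ("[]", "B"), "X"), "B")

# Since y : A, the judgement y : B holds exactly when A and B are provably equal.
before = types_equal([h1, h2], "A", "B")
instantiated = [(substitute(l, "X", "y"), substitute(r, "X", "y")) for l, r in (h1, h2)]
after = types_equal(instantiated, "A", "B")
print(before, after)
```

Under these assumptions A and B fall into the same equivalence class only after X has been replaced by y, which is why filtering candidates by their type before instantiation would wrongly discard y.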

Performance. The Zombie implementation is very cavalier about creating well-typed substitutions. While the elaborator tags each unification variable with an expected type, the unification code currently ignores those annotations, and it is possible to construct examples where the equational constraints are satisfied by an ill-typed term. (Ill-typed unifiers will be detected by the typechecker for the core language.) One annoying example which sometimes happens in practice is due to erased type casts: the terms a.v and a are propositionally equal but have different types, and the unifier can get confused about which one to pick for a given unification variable.

This limitation is partly due to performance concerns. In general, in order to correctly check that a has the right type, one should take congruence classes into account. (Requiring that the type of a and the ascribed type of the unification variable are syntactically identical can rule out correct programs, e.g. if instantiating a unification variable would make the types equal.) However, since any subexpression in the context might be part of a unifier, checking the types up to congruence closure requires calculating the CC equivalence class of every subexpression, which can be very expensive.

The cost of computing the congruence closure of types is also relevant when implementing typeclasses/implicits/instance arguments (Section 6.1.1). This involves instantiating a unification variable with some expression of the right type, so if we want to work up to congruence, we need to compute the congruence closure of the types of all candidate expressions. But the situation here is better, because arbitrary subexpressions of the typing context are not candidates for instance arguments; only the variables in the context (Agda) or some special subset of those variables (Haskell and Scala) are. Similarly, to implement the assumption-up-to-CC rule (Section 5.2) we need to compute the congruence closure of the types in the context, but not of subexpressions of those types.

The following table shows the time it took to run the Zombie test suite with the typechecker instrumented to process different subsets of the context. While the absolute numbers are arbitrary (they depend on what example programs happened to be written in November 2013), the relative magnitudes give a rough indication of the cost of the different options.

Subset of context processed        Test suite time
Equations only                     35.8s
Datatypes in context               36.3s
All types in context               83s
Every subexpression in context     180s

With just the “classic” assumption rule (Section 5.2), the congruence closure algorithm only needs to process equations in the context. To implement a Scala- or Haskell-style typeclass system up to congruence, we also need to process some other subset of the assumptions; in this example we processed every variable in the context that inhabits a datatype. With the full assumption-up-to-CC rule, and to implement an Agda-style system, we need to process the type of every variable in the context, since we do not know a priori which types will turn out to be equal to an equation. Finally, in order to track well-typedness of unifiers, we also want to process all subexpressions of types in the context, and the types of the types of the subexpressions, and so on.
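To make the four options concrete, the following sketch counts how many terms each one would hand to the congruence closure algorithm for a small toy context. The context, the set DATATYPES, and the tuple encoding of terms are hypothetical and are not taken from the Zombie test suite. The sketch also omits the "types of the types of the subexpressions" mentioned above.

```python
DATATYPES = {"Nat", "Vec"}  # hypothetical set of datatype names


def head(ty):
    """Head symbol of a type: the atom itself, or the first tuple component."""
    return ty if isinstance(ty, str) else ty[0]


def subexpressions(term):
    """Yield the term and, recursively, all of its arguments."""
    yield term
    if not isinstance(term, str):
        for arg in term[1:]:
            yield from subexpressions(arg)


# Toy context: (variable, type) pairs; "=" marks an equation.
context = [
    ("n", "Nat"),
    ("xs", ("Vec", "Nat", "n")),
    ("p", ("P", "xs")),
    ("h", ("=", ("plus", "n", "zero"), "n")),
]

types = [ty for _, ty in context]
equations = [ty for ty in types if head(ty) == "="]
datatype_types = [ty for ty in types if head(ty) in DATATYPES]
all_subterms = {s for ty in types for s in subexpressions(ty)}

counts = {
    "Equations only": len(equations),
    "Datatypes in context": len(equations) + len(datatype_types),
    "All types in context": len(types),
    "Every subexpression in context": len(all_subterms),
}
print(counts)
```

Even in this four-variable context the last option processes twice as many terms as the third, and the gap widens as the types in the context become more deeply nested.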

It can be seen that the assumption-up-to-CC rule is quite costly, increasing the time by more than a factor of 2. Tracking every type is worse still, and is not done by the current implementation.
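The relative magnitudes can be read off directly from the measured times. The short calculation below uses only the four timings reported in the table; the variable names are illustrative.

```python
# Test suite timings in seconds, as reported in the table.
timings = {
    "Equations only": 35.8,
    "Datatypes in context": 36.3,
    "All types in context": 83.0,
    "Every subexpression in context": 180.0,
}

baseline = timings["Equations only"]

# Slowdown of each option relative to processing equations only.
slowdown = {label: seconds / baseline for label, seconds in timings.items()}

for label, factor in slowdown.items():
    print(f"{label}: {factor:.2f}x")
```

Relative to the equations-only baseline, processing all types in the context costs a little over twice as much and processing every subexpression about five times as much, while adding only the datatype-inhabiting variables is nearly free.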
