
7.7 Limitations and future work

7.7.3 Surface language concerns

We have presented two different styles of type systems, one inspired by modal logic, and one formulated as a type-and-effect system. Considered as core languages, there is no big difference between them: they can express the same programs (Section 7.5), although perhaps the modal one is more promising for future extensions (reasoning about terms of non-logical type, Section 7.7.2).

On the other hand, when designing a surface language elaborating into the current core language, we noticed several points where the system with @-types does not work smoothly, and these trouble-spots could be avoided by using an effect-style system instead.

@-types everywhere. Because domains of function types and arguments to data constructors are required to be mobile, almost every type in the context is mobile. (The main exception in the current Zombie implementation is top-level definitions.) Some types (e.g. datatypes) are automatically mobile, but because of the handling of type variables, arguments to polymorphic functions and parameterized types need to be tagged with an @θ.

In practice, this means that variable references more often than not need to use the rule TUnboxVal to eliminate the @-qualifier. Similarly, arguments to polymorphic functions need to be checked using TBox*. So these rules can create a lot of syntactic clutter. If they were only used occasionally we could use explicit (erasable) annotations, e.g. unbox x, but to make programs readable it is important to be able to infer these.

Failure of local completeness. Given a bidirectional type system like the one in Chapter 5, the most obvious way to infer uses of TBox* is to make them checking rules. For example, the surface-language rule corresponding to TBoxP would be:

\[
\dfrac{\Gamma \vdash^{\theta} a \Leftarrow A \qquad \Gamma \vdash^{L} A \Leftarrow \mathsf{Type}}
      {\Gamma \vdash^{P} a \Leftarrow A@\theta}\ \textsc{CBoxP}
\]

Since this is a checking (⇐) rule, we do not need an explicit box-annotation on a. However, after implementing the above rule and experimenting with writing programs using it, it becomes clear that it does not always work. The problem is that our Box/Unbox rules do not satisfy local completeness.

A deduction system is locally complete [108] if the elimination-rules are “strong enough” in the following sense: whenever there exists a derivation of a formula A, there exists a derivation ending in an introduction rule for that formula’s connective.

The rules for @-types satisfy this as long as the subject is a value, but it fails in general. Suppose $\Gamma \vdash^{P} a : A@L$. The only @-introduction rule that could prove that formula is TBoxP, and to apply that we need to prove the premise $\Gamma \vdash^{L} a : A$. But if a is a non-value, then TUnboxVal does not apply, and there is no way to derive that.
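For comparison, the value case can be written out as a local expansion. The following is only a sketch: it assumes that TUnboxVal has the shape suggested by the discussion above (from a value v of type A@L typed at P, conclude v : A at L), and the precise premises of both rules are those of the core language definition rather than of this illustration.

\[
\dfrac{\dfrac{\Gamma \vdash^{P} v : A@L}{\Gamma \vdash^{L} v : A}\ \textsc{TUnboxVal}}
      {\Gamma \vdash^{P} v : A@L}\ \textsc{TBoxP}
\]

When v is replaced by a non-value a, the inner step is unavailable, so the derivation cannot be rebuilt to end in the introduction rule.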

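The failure of local completeness complicates typechecking, because one cannot always eagerly apply the box rule. For example, given $\Gamma \vdash^{P} f : \mathsf{Nat} \to (A@L)$ and $\Gamma \vdash^{P} g : (A@L) \to \mathsf{Nat}$, the expression g (f 0) should be typeable. However, naively using the above checking rule gets stuck:

\[
\dfrac{\Gamma \vdash^{P} g \Rightarrow (A@L) \to \mathsf{Nat}
       \qquad
       \dfrac{\dfrac{\text{???}}{\Gamma \vdash^{L} (f\ 0) \Leftarrow A}}
             {\Gamma \vdash^{P} (f\ 0) \Leftarrow A@L}\ \textsc{CBoxP}}
      {\Gamma \vdash^{P} g\ (f\ 0) \Rightarrow \mathsf{Nat}}\ \textsc{IApp}
\]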

Things are still more complicated when working up-to-congruence. For example, suppose that in the above example f instead had the type Nat → B. Then it is not clear whether the implementation should search for a proof that Γ ⊨ B = (A@L), or if it should first apply TBoxP and then search for a proof that Γ ⊨ B = A.

The current Zombie implementation uses a rather poorly motivated hack. When checking an expression a against an @-type A@θ, we look at the syntactic form of a; if it is a variable or an application, we do not apply the box rule, but instead synthesize a type for a and try to prove that the type is CC-equivalent to either A@θ or A. The intuition is that applications and variables do not have checking rules, so there is no value in pushing the type A in and we instead synthesize straight away. This heuristic can fail in at least two ways: there may be other expression forms that would also benefit from being synthesized rather than checked; and when checking against a nested type A@θ1@θ2, even applications or variables really should be checked by peeling off one θ at a time.
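The syntactic test described above can be sketched in a few lines. This is a hypothetical Python model, not Zombie's implementation: the classes Var, App, Lam and At and the functions strategy and candidates are names invented for this illustration, and the congruence-closure check itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:
    fn: object
    arg: object


@dataclass(frozen=True)
class Lam:
    param: str
    body: object


@dataclass(frozen=True)
class At:
    """The type A@theta (hypothetical encoding)."""
    body: object
    theta: str


def strategy(expr, ty):
    """Pick the step taken when checking expr against ty."""
    if not isinstance(ty, At):
        return "check"       # no @-type on top: ordinary checking rules
    if isinstance(expr, (Var, App)):
        return "synthesize"  # variables/applications: synthesize, then compare
    return "box"             # any other form: apply the box rule eagerly


def candidates(ty):
    """Types a synthesized type is compared with: A@theta itself, or A."""
    return [ty, ty.body]


nested = At(At("A", "L"), "P")  # stands for the nested type A@L@P
print(strategy(App(Var("f"), Var("x")), At("A", "L")))  # synthesize
print(strategy(Lam("x", Var("x")), At("A", "L")))       # box
print(candidates(nested))
```

In this model the candidate list for the nested type contains A@L@P and A@L but never the bare A, which is one way to read the remark that nested @-types should be peeled one θ at a time.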

This syntactic condition also interacts with the other typechecking rules. For example, when checking a case-expression against an @-type, we should first apply the case rule and check each branch against the @-type, as opposed to eagerly applying the box rule, because the branches could consist of applications/variables. On the other hand, the CBox rules should fire before ECrec, because we do not want to try to prove an @-type is equal to a function type.

To summarize, it is hard to make the box/unbox rules completely invisible in a bidirectional type system, and harder still when adding congruence closure. One appealing thing about the effect-style system is that these rules are not necessary there.

Mobile types and higher-order functions. Another issue with @-types which becomes apparent when writing programs is an interaction between mobile types and higher-order functions. Consider a polymorphic higher-order function such as map. In order to make its type well-formed, type variables must specify some θ, e.g. L:

map : [a b : Type] ⇒ (f : (x:a@log) → b@log) →
      (xs : List a) → List b

On the other hand, functions which operate on mobile types do not have to tag their arguments with @-types, which would in any case not matter. For example, the Nat type is mobile, so natural number addition can be given simply the type

plus : Nat → Nat → Nat

While this makes the type of plus less cluttered, it has an unwelcome consequence: plus can no longer be used as an argument to map. That is, in the application

map (plus 1) lst

the expression (plus 1) has type Nat → Nat, while map expects (A@log) → B for some A and B. The types do not match.

We saw an example of this problem in the DPLL-solver in Section 2.4. There, the functions interp_lit and interp_clause were used as arguments to the higher-order functions any and all, and their types had a spurious @log qualifying the mobile type List.

Semantically, this is not a big problem. As we did above, we can include subtyping rules (SubMobile1 and SubMobile2 in Figure 7.7) which express that Nat and Nat@θ are equivalent types. One could also contemplate adding a rule stating that these two types are propositionally equal, rather than just equivalent.

However, writing a typechecker for a surface language including such rules is a harder problem. In general, including subtyping in a language tends to make type checking complicated. And even the equational version is not straightforward when combined with unification-based inference. In the above example the implicit type arguments of map generate two unification variables X and Y, and we try to match Nat → Nat with (X@L) → Y. Syntactic unification alone cannot solve this goal, because it will match Nat against X@L and the two expressions have different top-level constructors. On the other hand, the rule for @-qualified mobile types can only fire after the variable X has been instantiated. So the typechecker would have to interleave ordinary unification with operations which depend on the semantics of the language.
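The clash can be reproduced with a textbook first-order unifier. The sketch below is an illustration under simplifying assumptions, not the Zombie typechecker: types are encoded as tuples, the names UVar, arrow, at and unify are hypothetical, and the occurs check is omitted for brevity.

```python
class UVar:
    """A unification variable (hypothetical helper)."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "?" + self.name


NAT = ("Nat",)


def arrow(dom, cod):
    return ("->", dom, cod)


def at(ty, theta):
    return ("@", ty, theta)


def walk(t, subst):
    """Follow variable bindings until reaching a non-variable or an unbound variable."""
    while isinstance(t, UVar) and t in subst:
        t = subst[t]
    return t


def unify(s, t, subst):
    """Purely syntactic unification; returns a substitution or None."""
    s, t = walk(s, subst), walk(t, subst)
    if s is t:
        return subst
    if isinstance(s, UVar):
        return {**subst, s: t}
    if isinstance(t, UVar):
        return {**subst, t: s}
    if isinstance(s, tuple) and isinstance(t, tuple):
        if s[0] != t[0] or len(s) != len(t):
            return None  # different top-level constructors: clash
        for a, b in zip(s[1:], t[1:]):
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return subst if s == t else None


X, Y = UVar("X"), UVar("Y")
# Nat -> Nat against (X@L) -> Y: Nat meets X@L and unification fails.
print(unify(arrow(NAT, NAT), arrow(at(X, "L"), Y), {}))  # None
# With an explicit @L on the domain the same goal succeeds.
print(unify(arrow(at(NAT, "L"), NAT), arrow(at(X, "L"), Y), {}))
```

A rule equating Nat@L with Nat is of no help to this unifier at the point of failure, since it applies only once X has been instantiated.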

Subtyping constraints versus equality constraints. In the future, one may consider moving from the current bidirectional type system (i.e. local typechecking) to a general constraint-based system (Section 6.4). Usually such systems are easier to design if they can be phrased in terms of equality constraints, because those constraints can be solved by unification. On the other hand, asymmetric rules that state that one type should subsume another require more ingenuity. In particular, naively adapting the unification technique to inequality constraints creates semi-unification problems [67], which are undecidable [73].

When creating a constraint-based type system, all the issues about inferring uses of TBox/TUnbox/SubMobile that we mentioned above would become relevant, because these features all involve “asymmetric” constraints (e.g. if the surface language includes implicit unboxing, then A@L is a “better” type than A). Since the effect-style system does not include these rules, it may be an easier target for elaboration. Of course, this optimism is tempered somewhat by the fact that the effect-style calculus includes a subtyping relation, which also creates asymmetric constraints. However, these may be more tractable, for two reasons. First, it may be that subtyping is not necessary in practice. The current Zombie implementation does not implement subtyping, and it can still check our example programs. In the translation of a possible-world program that does not use subtyping, the only use of subtyping is when the subsumption rule TSub is applied to a function type, e.g.

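\[
\dfrac{\Gamma \vdash^{L} f : \mathsf{Nat} \to \mathsf{Nat} \to \mathsf{Nat}}
      {\Gamma \vdash^{P} f : \mathsf{Nat} \to \mathsf{Nat} \to \mathsf{Nat}}\ \textsc{TSub}
\]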

In the corresponding effect-style derivation we use subtyping to say that a function of type Nat →L Nat →L Nat can also be used at type Nat →P Nat →P Nat. But in the typical case this is more than is needed; even if the function has the original L type, the application rule TApp still lets it be applied in a P context. Similarly, if f were used as an argument to a higher-order function, the programmer could work around the need for subtyping by instead η-expanding f.

Second, even if we want subtyping in the language, the effect-style subtyping relation T <: T′ is easier to handle than the possible-world style relation A <: A′. The difference is that while the possible-world relation includes A@L <: A, the effect-style relation never changes the top-level constructor of a type expression, only the θ-annotations on arrows. So it should still be possible to use unification to determine the shape of type arguments, leaving only residual constraints about the values of the θs.
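The two-phase idea in the last sentence can be sketched as follows. This is a minimal Python model under stated assumptions, not part of Zombie: the tuple encoding, the names arrow, residual and holds, the ordering L ≤ P on annotations, and the contravariant treatment of domains are all choices made for this illustration.

```python
NAT = ("Nat",)
ORDER = {"L": 0, "P": 1}  # assumed ordering: an L arrow may be used as a P arrow


def arrow(theta, dom, cod):
    """Effect-annotated function type dom ->theta cod (hypothetical encoding)."""
    return ("->", theta, dom, cod)


def residual(sub, sup):
    """Collect the constraints theta1 <= theta2 needed for sub <: sup.

    Returns None when the shapes differ, because the relation never
    changes a top-level constructor.
    """
    if sub[0] != sup[0]:
        return None
    if sub[0] != "->":
        return [] if sub == sup else None
    _, th1, dom1, cod1 = sub
    _, th2, dom2, cod2 = sup
    dom = residual(dom2, dom1)  # assumption: domains compared contravariantly
    cod = residual(cod1, cod2)
    if dom is None or cod is None:
        return None
    return dom + [(th1, th2)] + cod


def holds(constraints):
    """Check the residual constraints against the assumed ordering."""
    return all(ORDER[a] <= ORDER[b] for a, b in constraints)


f_log = arrow("L", NAT, arrow("L", NAT, NAT))   # Nat ->L Nat ->L Nat
f_prog = arrow("P", NAT, arrow("P", NAT, NAT))  # Nat ->P Nat ->P Nat
print(residual(f_log, f_prog))         # [('L', 'P'), ('L', 'P')]
print(holds(residual(f_log, f_prog)))  # True
print(holds(residual(f_prog, f_log)))  # False
```

Here the shapes of the two types agree before any θ is inspected, so only comparisons between annotations remain to be solved.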

In document A Dependently Typed Language with Nontermination (Page 192-195)