Comparing the Alternatives - Certified Programming with Dependent Types

We have seen four different approaches to encoding general recursive definitions in Coq. Among them there is no clear champion that dominates the others in every important way. Instead, we close the chapter by comparing the techniques along a number of dimensions. Every technique allows recursive definitions with termination arguments that go beyond Coq’s built-in termination checking, so we must turn to subtler points to highlight differences. One useful property is automatic integration with normal Coq programming. That is, we would like the type of a function to be the same, whether or not that function is defined using an interesting recursion pattern. Only the first of the four techniques, well-founded recursion, meets this criterion. It is also the only one of the four to meet the related criterion that evaluation of function calls can take place entirely inside Coq’s built-in computation machinery. The monad inspired by domain theory occupies some middle ground in this dimension, since generally standard computation is enough to evaluate a term once a high enough approximation level is provided.

Another useful property is that a function and its termination argument may be developed separately. We may even want to define functions that fail to terminate on some or all inputs. The well-founded recursion technique does not have this property, but the other three do.

One minor plus is the ability to write recursive definitions in natural syntax, rather than with calls to higher-order combinators. The first two techniques lack this ability, though the downside is rather easy to get around using Coq’s notation mechanism; we leave the details as an exercise for the reader. (For this and other details of notations, see Chapter 12 of the Coq 8.4 manual.)

The first two techniques impose proof obligations that are more basic than termination arguments, where well-founded recursion requires a proof of extensionality and domain-theoretic recursion requires a proof of continuity. A function may not be defined, and thus may not be computed with, until these obligations are proved. The co-inductive techniques avoid this problem, as recursive definitions may be made without any proof obligations.

We can also consider support for common idioms in functional programming. For instance, the thunk monad effectively only supports recursion that is tail recursion, while the others allow arbitrary recursion schemes.

On the other hand, the comp monad does not support the effective mixing of higher-order functions and general recursion, while all the other techniques do. For instance, we can finish the failed curriedAdd example in the domain-theoretic monad.

Definition curriedAdd’ (n : nat) := Return (fun m : nat ⇒ Return (n + m)).
Definition testCurriedAdd := Bind (curriedAdd’ 2) (fun f ⇒ f 3).

The same techniques also apply to more interesting higher-order functions like list map, and, as in all four techniques, we can mix primitive and general recursion, preferring the former when possible to avoid proof obligations.

Fixpoint map A B (f : A → computation B) (ls : list A) : computation (list B) :=
  match ls with
    | nil ⇒ Return nil
    | x :: ls’ ⇒ Bind (f x) (fun x’ ⇒
      Bind (map f ls’) (fun ls’’ ⇒
        Return (x’ :: ls’’)))
  end.

Theorem test_map : run (map (fun x ⇒ Return (S x)) (1 :: 2 :: 3 :: nil)) (2 :: 3 :: 4 :: nil).
  exists 1; reflexivity.
Qed.

One further disadvantage of comp is that we cannot prove an inversion lemma for executions of Bind without appealing to an axiom, a logical complication that we discuss at more length in Chapter 12. The other three techniques allow proof of all the important theorems within the normal logic of Coq.

Perhaps one theme of our comparison is that one must trade off between, on one hand, functional programming expressiveness and compatibility with normal Coq types and computation; and, on the other hand, the level of proof obligations one is willing to handle at function definition time.

Chapter 8 More Dependent Types

Subset types and their relatives help us integrate verification with programming. Though they reorganize the certified programmer’s workflow, they tend not to have deep effects on proofs. We write largely the same proofs as we would for classical verification, with some of the structure moved into the programs themselves. It turns out that, when we use dependent types to their full potential, we warp the development and proving process even more than that, picking up “free theorems” to the extent that often a certified program is hardly more complex than its uncertified counterpart in Haskell or ML.

In particular, we have only scratched the tip of the iceberg that is Coq’s inductive definition mechanism. The inductive types we have seen so far have their counterparts in the other proof assistants that we surveyed in Chapter 1. This chapter explores the strange new world of dependent inductive datatypes outside Prop, a possibility that sets Coq apart from all of the competition not based on type theory.

8.1 Length-Indexed Lists

Many introductions to dependent types start out by showing how to use them to eliminate array bounds checks. When the type of an array tells you how many elements it has, your compiler can detect out-of-bounds dereferences statically. Since we are working in a pure functional language, the next best thing is length-indexed lists, which the following code defines.

Section ilist.
  Variable A : Set.

  Inductive ilist : nat → Set :=
  | Nil : ilist O
  | Cons : ∀ n, A → ilist n → ilist (S n).

We see that, within its section, ilist is given type nat → Set. Previously, every inductive type we have seen has either had plain Set as its type or has been a predicate with some type ending in Prop. The full generality of inductive definitions lets us integrate the expressivity of predicates directly into our normal programming.

The nat argument to ilist tells us the length of the list. The types of ilist’s constructors tell us that a Nil list has length O and that a Cons list has length one greater than the length of its tail. We may apply ilist to any natural number, even natural numbers that are only known at runtime. It is this breaking of the phase distinction that characterizes ilist as dependently typed.
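To make the indexing concrete, here is a minimal standalone sketch rather than the chapter's own code. It assumes that A is written as an ordinary parameter of ilist instead of a section variable, and it declares implicit arguments explicitly with Arguments so that it can be checked on its own; the names two_bools and bad are purely illustrative.

```coq
(* Standalone variant: A is a parameter here, not a section variable. *)
Inductive ilist (A : Set) : nat -> Set :=
| Nil : ilist A O
| Cons : forall n, A -> ilist A n -> ilist A (S n).

(* Assumption: we make A and n implicit so constructors read naturally. *)
Arguments Nil {A}.
Arguments Cons {A n} _ _.

(* The length 2 is part of the type and is verified by the type checker. *)
Definition two_bools : ilist bool 2 := Cons true (Cons false Nil).

(* A list of two elements cannot be given a type claiming length 3. *)
Fail Definition bad : ilist bool 3 := Cons true (Cons false Nil).
```

The Fail command succeeds exactly when the command after it is rejected, so the last line records that the mismatched length is caught statically.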

In expositions of list types, we usually see the length function defined first, but here that would not be a very productive function to code. Instead, let us implement list concatenation.

Fixpoint app n1 (ls1 : ilist n1) n2 (ls2 : ilist n2) : ilist (n1 + n2) :=
  match ls1 with
    | Nil ⇒ ls2
    | Cons x ls1’ ⇒ Cons x (app ls1’ ls2)
  end.

Past Coq versions signalled an error for this definition. The code is still invalid within Coq’s core language, but current Coq versions automatically add annotations to the original program, producing a valid core program. These are the annotations on match discriminees that we began to study in the previous chapter. We can rewrite app to give the annotations explicitly.

Fixpoint app’ n1 (ls1 : ilist n1) n2 (ls2 : ilist n2) : ilist (n1 + n2) :=
  match ls1 in (ilist n1) return (ilist (n1 + n2)) with
    | Nil ⇒ ls2
    | Cons x ls1’ ⇒ Cons x (app’ ls1’ ls2)
  end.

Using return alone allowed us to express a dependency of the match result type on the value of the discriminee. What in adds to our arsenal is a way of expressing a dependency on the type of the discriminee. Specifically, the n1 in the in clause above is a binding occurrence whose scope is the return clause.

We may use in clauses only to bind names for the arguments of an inductive type family. That is, each in clause must be an inductive type family name applied to a sequence of underscores and variable names of the proper length. The positions for parameters to the type family must all be underscores. Parameters are those arguments declared with section variables or with entries to the left of the first colon in an inductive definition. They cannot vary depending on which constructor was used to build the discriminee, so Coq prohibits pointless matches on them. It is those arguments defined in the type to the right of the colon that we may name with in clauses.
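The rule about underscores in parameter positions is easiest to see outside the section, where A becomes a genuine parameter of the type family. The following sketch assumes such a standalone definition of ilist with implicit arguments declared by hand; the bound name k and the names three and app_example are illustrative choices, not taken from the text.

```coq
Inductive ilist (A : Set) : nat -> Set :=
| Nil : ilist A O
| Cons : forall n, A -> ilist A n -> ilist A (S n).

Arguments Nil {A}.
Arguments Cons {A n} _ _.

(* In the "in" clause, the parameter position is an underscore, while the
   index position binds a fresh name k, whose scope is the return clause. *)
Fixpoint app {A : Set} {n1} (ls1 : ilist A n1) {n2} (ls2 : ilist A n2)
  : ilist A (n1 + n2) :=
  match ls1 in (ilist _ k) return (ilist A (k + n2)) with
  | Nil => ls2
  | Cons x ls1' => Cons x (app ls1' ls2)
  end.

(* The result type ilist nat (1 + 2) is convertible with ilist nat 3. *)
Definition three : ilist nat 3 := app (Cons 1 Nil) (Cons 2 (Cons 3 Nil)).

Example app_example : three = Cons 1 (Cons 2 (Cons 3 Nil)).
Proof. reflexivity. Qed.
```

In the Nil branch k is instantiated to O, so the expected type reduces to ilist A n2; in the Cons branch it is S applied to the tail's length, which is why the recursive call fits under Cons.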

Our app function could be typed in so-called stratified type systems, which avoid true dependency. That is, we could consider the length indices to lists to live in a separate, compile-time-only universe from the lists themselves. Compile-time data may be erased such that we can still execute a program. As an example where erasure would not work, consider an injection function from regular lists to length-indexed lists. Here the run-time computation actually depends on details of the compile-time argument, if we decide that the list to inject can be considered compile-time. More commonly, we think of lists as run-time data. Neither case will work with naïve erasure. (It is not too important to grasp the details of this run-time/compile-time distinction, since Coq’s expressive power comes from avoiding such restrictions.)

Fixpoint inject (ls : list A) : ilist (length ls) :=
  match ls with
    | nil ⇒ Nil
    | h :: t ⇒ Cons h (inject t)
  end.

We can define an inverse conversion and prove that it really is an inverse.

Fixpoint unject n (ls : ilist n) : list A :=
  match ls with
    | Nil ⇒ nil
    | Cons h t ⇒ h :: unject t
  end.

Theorem inject_inverse : ∀ ls, unject (inject ls) = ls.
  induction ls; crush.
Qed.
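For readers without the book's crush tactic at hand, the round trip can be reproduced with standard tactics only. This is a sketch under stated assumptions: ilist is redefined standalone with A as a parameter, implicit arguments are declared by hand, the return annotation in inject is spelled out explicitly, and the name round_trip is illustrative.

```coq
Require Import List.

Inductive ilist (A : Set) : nat -> Set :=
| Nil : ilist A O
| Cons : forall n, A -> ilist A n -> ilist A (S n).

Arguments Nil {A}.
Arguments Cons {A n} _ _.

(* The result type mentions the run-time value ls through length. *)
Fixpoint inject {A : Set} (ls : list A) : ilist A (length ls) :=
  match ls as l return ilist A (length l) with
  | nil => Nil
  | h :: t => Cons h (inject t)
  end.

Fixpoint unject {A : Set} {n} (ls : ilist A n) : list A :=
  match ls with
  | Nil => nil
  | Cons h t => h :: unject t
  end.

(* Same statement as in the text, proved without crush. *)
Theorem inject_inverse : forall (A : Set) (ls : list A),
  unject (inject ls) = ls.
Proof.
  intros A ls. induction ls as [| h t IH]; simpl.
  - reflexivity.
  - rewrite IH. reflexivity.
Qed.

Example round_trip : unject (inject (1 :: 2 :: 3 :: nil)) = 1 :: 2 :: 3 :: nil.
Proof. reflexivity. Qed.
```

The induction is on the ordinary list, so no dependent pattern matching is needed in the proof itself; the dependency is confined to the type of inject.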

Now let us attempt a function that is surprisingly tricky to write. In ML, the list head function raises an exception when passed an empty list. With length-indexed lists, we can rule out such invalid calls statically, and here is a first attempt at doing so. We write ??? as a placeholder for a term that we do not know how to write, not for any real Coq notation like those introduced two chapters ago.

Definition hd n (ls : ilist (S n)) : A :=
  match ls with
    | Nil ⇒ ???
    | Cons h _ ⇒ h
  end.

It is not clear what to write for the Nil case, so we are stuck before we even turn our function over to the type checker. We could try omitting the Nil case:

Definition hd n (ls : ilist (S n)) : A :=
  match ls with
    | Cons h _ ⇒ h
  end.

Error: Non exhaustive pattern-matching: no clause found for pattern Nil

Unlike in ML, we cannot use inexhaustive pattern matching, because there is no conception of a Match exception to be thrown. In fact, recent versions of Coq do allow this, by implicit translation to a match that considers all constructors; the error message above was generated by an older Coq version. It is educational to discover for ourselves the encoding that the most recent Coq versions use. We might try using an in clause somehow.

Definition hd n (ls : ilist (S n)) : A :=
  match ls in (ilist (S n)) with
    | Cons h _ ⇒ h
  end.

Error: The reference n was not found in the current environment

In this and other cases, we feel like we want in clauses with type family arguments that are not variables. Unfortunately, Coq only supports variables in those positions. A completely general mechanism could only be supported with a solution to the problem of higher-order unification [15], which is undecidable. There are useful heuristics for handling non-variable indices which are gradually making their way into Coq, but we will spend some time in this and the next few chapters on effective pattern matching on dependent types using only the primitive match annotations.

Our final, working attempt at hd uses an auxiliary function and a surprising return annotation.

Definition hd’ n (ls : ilist n) :=
  match ls in (ilist n) return (match n with O ⇒ unit | S _ ⇒ A end) with
    | Nil ⇒ tt
    | Cons h _ ⇒ h
  end.

Check hd’.

hd’
  : ∀ n : nat, ilist n → match n with
                          | 0 ⇒ unit
                          | S _ ⇒ A
                          end

Definition hd n (ls : ilist (S n)) : A := hd’ ls.

End ilist.

We annotate our main match with a type that is itself a match. We write that the function hd’ returns unit when the list is empty and returns the carried type A in all other cases. In the definition of hd, we just call hd’. Because the index of ls is known to be nonzero, the type checker reduces the match in the type of hd’ to A.
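A short usage sketch may help confirm that the encoding behaves as described. It assumes a standalone ilist with A as a parameter and hand-declared implicit arguments, and it renames the index bound by the in clause to k for readability; hd_example is an illustrative name.

```coq
Inductive ilist (A : Set) : nat -> Set :=
| Nil : ilist A O
| Cons : forall n, A -> ilist A n -> ilist A (S n).

Arguments Nil {A}.
Arguments Cons {A n} _ _.

(* The return type is computed from the index: unit for O, A for S _. *)
Definition hd' {A : Set} {n} (ls : ilist A n) :=
  match ls in (ilist _ k) return (match k with O => unit | S _ => A end) with
  | Nil => tt
  | Cons h _ => h
  end.

(* With index S n, the type of hd' ls reduces to A. *)
Definition hd {A : Set} {n} (ls : ilist A (S n)) : A := hd' ls.

Example hd_example : hd (Cons 7 (Cons 8 Nil)) = 7.
Proof. reflexivity. Qed.

(* Taking the head of an empty list is a type error, not a run-time failure. *)
Fail Check (hd (@Nil nat)).
```

The final command is accepted because the check inside it is rejected: the index O of the empty list does not unify with the S n that hd demands.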

8.2 The One Rule of Dependent Pattern Matching in
