Thesis Statement
1.1 Coq, an Interactive Proof Assistant
This section provides a quick introduction to the main concepts behind the interactive proof assistant Coq. It is not intended to serve as a full tutorial (Coq's web page offers better introductory material than this chapter¹), but it gives the casual reader a big picture of the system, hopefully enough to follow the main concepts behind this thesis.
We start with an analogy borrowed from Xavier Leroy, who described Coq as a game. The proof developer (proof dev for short) starts by stating a theorem she wants to prove. Then Coq asks her for a proof. She responds with a tactic that simplifies the goal in some way: introducing a hypothesis, performing induction on a given variable, rewriting part of the goal with a given equality, applying a previously proven lemma, etc. Coq, after performing the requested change to the goal, responds with the new goal. The game continues, perhaps branching into different subgoals (subcases in a pen-and-paper proof, as when using induction), until every subgoal is solved. Coq then concedes defeat and the proof dev delivers the final blow by typing Qed. If at any moment the step provided by the proof dev is invalid, Coq immediately complains. This back-and-forth interaction between Coq and the proof dev is what the term interactive refers to in an interactive proof assistant, in sharp contrast with proof assistants like Twelf (Pfenning and Schürmann, 1999), which compile a proof in batch fashion, very much like a programming language compiler.
¹ http://coq.inria.fr
01 Lemma addn0 : ∀ n : nat. n + 0 = n.
Figure 1.1: Example of interaction with the Coq proof assistant.
Figure 1.1 shows a very simple example of this “game”. In it, we prove a lemma stating that n + 0 is equal to n, for every natural number n. Note that the proof itself is very short (only four lines long), but we have interleaved comments, enclosed in (* *), showing the proof state (i.e., Coq’s response) after each command or tactic is executed.
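For readers who want to replay the game, the script described in the text can be sketched as follows. This is a reconstruction from the prose, not a copy of the figure: the comments showing Coq's responses are omitted, the ASCII syntax forall and => replaces the math-friendly notation, and the first line assumes a Coq version (8.7 or later) that ships Ssreflect in its standard distribution.

```coq
(* Assumption: Ssreflect is loaded from the standard distribution. *)
From Coq Require Import ssreflect.

Lemma addn0 : forall n : nat, n + 0 = n.
Proof.
(* Induction on n; the inductive case names n0 and the hypothesis IH. *)
elim=> [ | n0 IH].
  (* Base case: 0 + 0 = 0 holds by computation. *)
  by [].
(* Inductive case: compute, rewrite with IH, and close the goal. *)
simpl.
rewrite IH.
by [].
Qed.
```

Stepping through this script in an interactive environment should display, after each sentence, the goals that the text describes.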
Commands in Coq are capitalized, as in Proof or Qed, while tactics are written in lower case, as in elim. Commands modify and query the global environment, while tactics modify the current proof state.
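The distinction can be illustrated with a small sketch. Check and Print are standard Coq commands that query the environment; reflexivity is a tactic from plain Coq (not Ssreflect), used here only to keep the example free of imports.

```coq
(* Commands (capitalized) query or extend the global environment. *)
Check Nat.add.
Print nat.

(* Tactics (lower case) transform the current proof state. *)
Goal 0 + 0 = 0.
Proof.
reflexivity.
Qed.
```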
Notational Convention 1. Throughout this thesis, for Coq programs we will mostly use the original Coq syntax. However, in a few exceptional cases, we will take the liberty of making the syntax more “math friendly”. For instance, we will write functions and products as λx. t and ∀x. u, respectively, instead of fun x ⇒ t and forall x, u.
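For comparison, the same two constructs in original Coq syntax might look as follows. The name twice is hypothetical and chosen only for illustration; the ASCII arrow => is what the typeset ⇒ stands for.

```coq
(* Original Coq syntax: "fun x => t" for functions, "forall x, u" for products. *)
Definition twice : nat -> nat := fun x => x + x.

(* In the math-friendly notation this statement would read: ∀ x. twice x = x + x *)
Check (forall x : nat, twice x = x + x).
```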
Coming back to Figure 1.1, after the lemma is stated in the first line, Coq responds with the proof obligation displayed in comments in lines 3 and 4. This proof obligation can be read as “under no assumptions, you need to prove that for all n, . . . ” If there were any assumptions, as we are going to see next, they would be displayed above the double line. The command Proof in line 2 is just a no-op, but it is a Coq convention that every proof should start with it.
In line 5 the first tactic is provided. It is the elim tactic from Ssreflect, which performs induction on the first variable appearing in the goal (in this case, n). Throughout this thesis we will use the tactic idiom provided by the Ssreflect library (Gonthier et al., 2008), which is better crafted than Coq’s own tactics. The elim tactic takes an intro pattern (what comes after the ⇒), which is a list of lists of names. The outermost list should have one element per subgoal, separated by |. Each element of this list is a list of names separated by spaces. In this case, the induction on natural numbers generates two subgoals, representing the base case, when n = 0, and the inductive case, when n = n0 + 1 for some n0. The definition of natural numbers, together with the addition function and its notation, is standard² and can be found in Figure 1.2.
For the base case, no new hypotheses are added to the context, so no new names are given in the first list of the intro pattern. For the inductive case, two hypotheses are added: the number n0 and the inductive hypothesis stating that n0 + 0 = n0. Coq’s answer is displayed in lines 6–10. In line 6, it communicates that we have to prove two subgoals, which are identified with numbers 1 and 2, respectively, and tells us that we are currently proving goal #1. Then, in lines 7–10, it shows the two subgoals. It does not show the context of subgoal #2.
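To see how an intro pattern names hypotheses per subgoal, consider a second lemma, sketched under the same assumptions as before (Ssreflect loaded from the standard distribution). The lemma name addnS_sketch is hypothetical. The pattern [ | n0 IH] names nothing in the base case and two hypotheses in the inductive case; the trailing m is then introduced in both subgoals.

```coq
From Coq Require Import ssreflect.

Lemma addnS_sketch : forall n m : nat, n + S m = S (n + m).
Proof.
(* Induction on n, then introduce m in each of the two subgoals. *)
elim=> [ | n0 IH] m.
  (* Base case: 0 + S m and S (0 + m) both compute to S m. *)
  by [].
(* Inductive case: simplify, then rewrite with the inductive hypothesis. *)
by rewrite /= IH.
Qed.
```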
The first subgoal is trivial: we need to prove that 0 + 0 = 0 under no assumptions, and this holds by computation. We instruct Coq to dismiss this goal as trivial with Ssreflect’s tactical by in line 11. (A tactical is simply a tactic that takes another tactic as argument.) The by tactical uses the tactic given as argument to prove the goal and checks that the goal was indeed solved by it. In this case we do not need any tactic, since the goal is trivial, so we provide [] as argument.
² It is interesting to note that the Coq language is minimal: it does not even include natural numbers natively; they are instead defined in the standard library. Ssreflect provides a slightly different notation for numbers, but we stick to the standard one for presentation purposes.
Inductive nat : Set := O : nat | S : nat → nat.
Definition addn :=
  fix plus (n m : nat) {struct n} : nat :=
    match n with
    | 0 ⇒ m
    | S p ⇒ S (plus p m)
    end.
Notation "a + b" := (addn a b).
Figure 1.2: Natural numbers in Coq.
After the first subgoal is solved, Coq outputs the remaining subgoal (shown in comments in lines 12–15). We note that the hypotheses for the second subgoal now appear in the context, with the names provided in the intro pattern from line 5. This subgoal requires us to prove that S n0 + 0 = S n0. By performing some steps of computation on the left-hand side, we can change the goal to S (n0 + 0) = S n0. This is done with the simpl tactic in line 16 (Ssreflect also offers a more general way to do the same, by writing rewrite /=). The output of the simpl tactic is in the comments on lines 17–20.
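The computation step can be checked in isolation. The sketch below uses only plain Coq tactics (intros, simpl, reflexivity) rather than the Ssreflect idiom, so that it needs no imports; it states that the goal before simpl and the goal after it have the same left-hand side up to computation.

```coq
(* S n0 + 0 unfolds one step of addition to S (n0 + 0). *)
Goal forall n0 : nat, S n0 + 0 = S (n0 + 0).
Proof.
intros n0.
simpl.        (* both sides are now syntactically equal *)
reflexivity.
Qed.
```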
At this point we can use the inductive hypothesis to rewrite the goal into its final form. This is done in line 21. The new goal is now trivial, as can be seen in lines 22–25, so we finish the proof again using the by tactical. Note that the last three steps can be accomplished in just one line, thanks to Ssreflect’s powerful rewrite tactic. The one-liner equivalent is
by rewrite /= IH.
The proof is now complete and Coq reports that there are no subgoals left. Quod Erat Demonstrandum (Qed): it is then demonstrated. Lastly, Coq announces that the lemma is now part of the global environment, i.e., the global knowledge we can make use of in any other proof.
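As a sketch of what membership in the global environment buys us, the following script proves the lemma in compact form and then uses it in a second proof. The names addn0_short and addn0_twice are hypothetical, chosen to avoid clashing with the lemma in the figure; the script assumes Ssreflect is available in the standard distribution.

```coq
From Coq Require Import ssreflect.

Lemma addn0_short : forall n : nat, n + 0 = n.
Proof.
(* Close the base case immediately, then handle the inductive case. *)
elim=> [ | n0 IH]; first by [].
by rewrite /= IH.
Qed.

(* Once Qed is accepted, the lemma can be used like any other known fact. *)
Lemma addn0_twice : forall n : nat, (n + 0) + 0 = n.
Proof.
move=> n.
(* The ! multiplier repeats the rewrite as long as it applies. *)
by rewrite !addn0_short.
Qed.
```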
1.1.1 Proof Terms
Coq does not store the proof script in its environment (the database of knowledge available to the user). Instead, it stores the proof term that the script helped generate.
A proof term is a λ-term that, following the Curry-Howard isomorphism (e.g., Sørensen and Urzyczyn, 2006), represents a valid proof of the theorem stated in its type. For instance, this is the proof term, followed by its type, generated for the lemma above:
λ n : nat.
  nat_ind (λ n0 : nat. n0 + 0 = n0) eq_refl
    (λ (n0 : nat) (IH : n0 + 0 = n0).
       eq_ind_r (λ n1 : nat. S n1 = S n0) eq_refl IH) n
: ∀ n : nat. n + 0 = n
It is possible in Coq to write proof terms directly, without using tactics. Similarly, in the context of proof automation, it is possible to automate the generation of proof scripts or proof terms. In this dissertation we focus on the latter problem. For this reason, it is important to know Coq’s λ-calculus, the Calculus of Inductive Constructions, which will be covered in the next section.
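As an illustration of writing a proof term directly, the term shown above can be given to Coq as a definition, in original Coq syntax. This is a sketch: the name addn0_term is hypothetical, while nat_ind, eq_refl and eq_ind_r are the standard library constants appearing in the term.

```coq
(* The proof of n + 0 = n written as a term, with no tactics involved. *)
Definition addn0_term : forall n : nat, n + 0 = n :=
  fun n : nat =>
    nat_ind (fun n0 : nat => n0 + 0 = n0)
      (* base case: 0 + 0 = 0 holds by computation *)
      eq_refl
      (* inductive case: transport S n0 = S n0 along IH *)
      (fun (n0 : nat) (IH : n0 + 0 = n0) =>
         eq_ind_r (fun n1 : nat => S n1 = S n0) eq_refl IH)
      n.

Check addn0_term.
```

Coq accepts the definition only if the term has the stated type, which is exactly the sense in which a proof term is a proof of its type.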