Chapter 5 Equality and Object-Level Unification
5.2 first-order unification for constructor forms
5.2.4 cycle
Showing that cycles do not occur in our inductive datatypes is quite a subtle business. Even proving X GÉ £ Ã G4DEHG4D 88 G (= G Ê
requires quite a lot of technology. Let us do it. EliminatingG
, XNË É £ Ã Ë DMS (= S Ê X GÉ £ Ã · GAD}EHG D 88 E wcz D G ¸= G Ê E D = G =B= G Ê
Applying unification (without cycle elimination) to both subgoals, we can at least sim- plify the two constructor-headed equations via the88GÌ>;Ay theorem. This eliminates
theS
case by conflict, while injectivity leaves us with an immediate step:
X ?Â7?w DEpG D 88 E wcz DBG = G Ê E DBG = G Ê
In fact, we can follow the same structure for any number of= ’s, but this is only because
the natural numbers are deceptively symmetrical. Watch what happens if we throw in a spare successor constructor,h , (making type88
C
) and try to prove
X GÉ £ Ã ù GeD}EHGeD 88 C G (=?h G Ê Induction onG yieldsS
andh cases which perish by conflict. The= case is as follows:
X GÉ £ Ã ù GeDVEHG D 88 C E wcz D?G =Bh G Ê E D = G =?h1= G Ê Injectivity yields X*Ã ÍMw D}EHG D 88 C E wcz DBG (=?h G Ê E DBG (h= G Ê
Oh dear! We have the wrong inductive hypothesis! The extra = appeared at the very
inside, rotating the cycle: it is only because one successor usually looks much like another that these theorems are so easy for88 .
In order to prove the result in this style, we must first strengthen it:
X G £ wc D ù D}EHGeD 88 C G (=?h G Ê sÐ: G ¸h= G Ê
By including not just the =Bh cycle, but also all its rotations, we will have more to do:
there will be work in both successor cases, although one is enough to show what hap- pens: X G £ wc D ù ¯ DEHG D 88 C E wcz D G =Bh G Ê sÐ: G h1= G Ê = G =Bh= G Ê sÐ: = G (h=?= G Ê
The right conjunct follows by conflict. The left reduces by injectivity: X ?zc ]Î D}EHG D 88 C E wcz D G ¸= h G Ê ]Ð( G (h= G Ê G (h= G Ê
The rotated conclusion follows by projecting the appropriately rotated conjunct of the inductive hypothesis.
This technique can be generalised to arbitrary cycles in arbitrary datatypes. The draw- back is that (up to rotation), we need a new theorem for every cycle pattern. This leaves us little choice but to generate them on the fly.
A slightly more cunning technique, arising from a conversation with Andrew Adams, is mentioned in [McB96]. It constructs for a given cycle pattern
zt_ 4
in some type
ãåäyæ a quotient functionÏ
Ç ÆÐ D ãäyæ 88 Ï Ç ÆÐ zy· = Ï Ç ÆÐ R Ï Ç ÆÐ S ApplyingÏ Ç ÆÐ
to both sides of the cycle, we get
Ï Ç ÆÐ = Ï Ç ÆÐ R
We have already seen a disproof of that!
While the guarded recursion principles we have constructed for each datatype make these functions relatively easy to manufacture—indeed I have implemented this technique—we have still not escaped from the burden substantial on-the-fly construc- tion work, cycle by cycle.
Remember, though, that in cycle
ztp
,zt·
is a constructor form, and hence we can compute by structural recursion on it: perhaps there is a way to compute the proof we want. Unfortunately, though, there is no way to test whether we have decomposed as far as a non-canonical symbol like
: our programs have no access to the decidable conversion relation of the type theory which describes them.
Nonetheless, we can adopt blunderbuss tactics: for any element
of a datatype, we can construct the property of ‘not being a proper subterm of
’ in such a way that when
has a constructor head, the property reduces to a product explicitly enforcing ‘not
4Without loss of generality, assume
being any of the exposed subterms’. The idea works in the same as the auxiliary data structure with which we earlier constructed guarded recursion. In fact, the predicate we need is just an instance of that structure.
We may define this property for88 , together with its non-strict counterpart as follows:
88G:y>;4¥¦;4f Y 1w D 88 w Ê 88(8:hIÌ,(c¥ £ Y D 88 88 Ê ¥yË 88Õ:¦>;4¥y;Af 88(8:hI(c¥ £ Y 1w D 88 88G:y>;A¥y;Af ºwRsÐ: 88(8(hIÌH(c¥ £ ÙwR
with conversion behaviour
88(8(hIÌ,(c¥ £ S ÿ ÏÏ 88(8(hIÌ,(c¥ £ = G ÿ 88(8:hI(c¥ £ G ÿ 88G:y>;A¥y;Af G sÐ( 88(8(hIÌ,(c¥ £ G 88:8(hIÌ,(c¥ £ w
is thus inhabited exactly when
is not a proper subterm ofw
, whilst
88:8(hI(c¥
£
¸w
adds the requirement%
w
to indicate that
is not any subterm ofw
.
88:8(hIÌ,(c¥
£
Éw
unfolds computationally to reveal a proof that
is not equal to any guarded subterm ofw
. Observe, for example,
88(8(hIÌ,(c¥ £ =?= ÿ 88G:y>;A¥y;4f = R]Ð 88G:y>;A¥y;4f ÙRsÐÞ 88(8:hIÌ,(c¥ £ ÙR
Suppose we can prove
E à D 88KJ Éæ 88(8(hIÌ,(c¥ £ Ã
Then for any hypothetical proof of
zy·
, we have a proof of 88:8(hIÌ,(c¥
£
ªzt·
, which will expand to a product containing
88G:y>;A¥y;Af Ù
from which contradiction the goal should surely follow.
The first step in the proof is to eliminate the equation, leaving the highly plausible
X }wc D D}E D 88 88(8(h2Ì,(c¥ £ Ù
You could be forgiven for hoping that we might get a cheap proof by the88 ¥¦Ësßõ>
theorem we have already built, but sadly, that only proves88
Ê
¥yË
C
whenC
is a constant and all we have to do at each stage is pass on the accumulated information, adding just the new layer. Here, the scheme varies over the recursion, so we must be more cunning.
Our next move is unsurprising: induction on
. X }wc D ® D 88(8:hIÌ,(c¥ £ SÉS X }wc D ¯ DVE D 88 E D 88:8(hIÌ,(c¥ £ Ù 88(8:hIÌ,(c¥ £ = =
The base case is trivial as its type reduces to
ÏÏ
. Unfortunately, the step is genuinely dif- ficult: 88(8:hIÌ,(c¥
£
fixes its first argument, so there is no way, as things stand, that we can reduce the conclusion to the inductive hypothesis. Some intelligent strengthening will be necessary. First reduce the conclusion to its non-strict expansion:
X }wc D ¯ D}E D 88 E D 88(8(h2Ì,(c¥ £ Þ 88(8(hI(c¥ £ = Þ
We must prove that if
is not a proper subterm of itself,=
is certainly not a subterm. We can see that if =
were a subterm,
would be a proper subterm, nomatter what is on the right hand side. Let us make the corresponding generalisation. That is, we introduce the hypotheses, create a hole for the more general version of the goal, then use it to solve the original:
X }wc D ¯§Ö Y D 88 Y D 88(8(hIÌH(c¥ £ Þ XN× GD}E w D 88 E ÉØ w D 88(8:hIÌ,(c¥ £ Ùw 88(8(hI(c¥ £ = Þw × G Ù
Why is this a good move? Well, we have fixed the first argument of the predicates, and we are now free to let the second vary in an induction onw
which corresponds to the computational behaviour of88(8(hIÌ,(c¥
£
XN× G ® D}E ÉØ Ë D 88:8(hIÌ,(c¥ S 88(8(hI(c¥ £ = S XN× G ¯ D}E w D 88 E w DVE ÉØ w D 88(8:hIÌ,(c¥ £ Ùw 88(8(hI(c¥ £ = ºw E ÉØ ?w D 88(8(hIÌ,(c¥ £ = w 88(8(hI(c¥ £ = = w
Applying a little computation, the base case becomes
XN× G ® D}E ÉØ Ë D ÏÏ 88G:y>;4¥¦;4f= S ]Ð ÏÏ
This is easily proven, with a little help from88GÌ>;Ay .
Reducing the step case, we get
XN× G ¯ DVE w D 88 E w D}E É¡Ø w D 88(8(hIÌ,(c¥ £ Þw 88(8:hI(c¥ £ = Ùw E ÉØ ?w D 88G:y>;A¥y;Af ÙwRsÐÞ 88(8(hIÌH(c¥ £ ÞwR 88G:y>;A¥y;Af= = wR]Ð( 88:8(hI(c¥ £ = ÙwR
The implication between the two right conjuncts is exactly given by the inductive hy- pothesis. As for the left conjuncts, expanding the conclusion’s88Õ:¦>;4¥y;Af gives us a
proof of=
=
w
from which we must proveÊ . 88GÌ>;Ay exposes a proof of
w
, for which we have a disproof at the ready.
Having established this property for the natural numbers, there is always the nagging suspicion that we have exploited in some hidden way the symmetry of that datatype, just as we would be wary of generalising to all triangles a property which held in the equilateral case. When there is only one step constructor, with only one recursive argument, the issue of whether phenomena behave conjuctively or disjunctively can become blurred. However, in this case, everything fits together perfectly.
CONSTRUCTION: cycle
ãåäyæ LMNPORQ andá constructors ^ Â D ^ ç ^ D_è ãåäyæ°é ë 9@¦d ^ Â ^ D ãåäyæ
Note that I really should write âMd, as the number of recursive arguments
may vary from constructor to constructor. However, the proof will be even less readable if I start subscripting superscripts.
We may define the inequality property
ãåäyæ î ä Ï Çy[Ao° Y 1w D ã䦿êJ w Ê
We can then add the proper subterm relation
ãåäyæ º ÆB¹,ÅcÇÙ Y D ãäyæÙJFã䦿 ÆKÇ_È ãåäyæ î ä Ï Çy[Ao R
and the non-strict subterm relation
ãåäyæ º ÆBÅcÇÙ Y 1w D ãåäyæêJ ãäyæ î ä Ï Çy[Ao ÙwRsÐÞ ã䦿 º ÆB¹,ÅcÇÙ ÙwR
The computational behaviour of these definitions is as one would hope:
ãåäyæ º ÆB¹,ÅcÇÙ Ý 9@R ^  ^ w ÿ úÚHã䦿 º ÆBÅcÇÙ Ùw ÄpÛ ë Ä
We may now prove the cycle theorem:
XFãåäyæ GÜ o DVE à D ã䦿 E D Éà ãåäyæ º ÆB¹,ÅcÇÙ Ã
First, we eliminate the equation, leaving
XFãåäyæ GÜ o C D}E D ãäyæ ã䦿 º ÆB¹,ÅcÇÙ Ù
Next, we eliminate the
. X<ÝSWw]d DVE ^ Â D ^ ç d E ^ Dsè ãåäyæêé*ë E ^ Dsè ãåäyæ º ÆB¹,ÅcÇÙ Ä ÄFé*ë Ä ãäyæ º ÆB¹HÅcÇ Ù 9@d ^ Â ^ R 9@¦d ^ Â ^ R
The conclusion expands, yielding a product X<ÝSWw]d DVE ^ Â D ^ ç d E^ Dsè ãåäyæêé ë E ^ Dsè ãåäyæ º ÆB¹,ÅcÇÙ Ä ÄFé ë Ä ú è ãäyæ º ÆBÅcÇÙ 9@¦d ^ Â ^ R Ä]é ë Ä
Now we come to the strengthening step. The conclusion we are trying to show isâ -fold now. The trick is to prove each separately, abstracting away
the right hand
Ä inâ separate lemmas: X<ÝSWw]d Ö Y ^  D ^ ç d Y ^ D¬è ã䦿°é*ë Y ^ D¬è ã䦿 º ÆB¹HÅcÇ Ù Ä Ä]é ë Ä X ^ T7w[t D Þß ß ß ß à ß ß ß ßá E w D ãåäyæ E É¡Ø w D ãåäyæ º ÆB¹,ÅcÇÙ Ä w ãäyæ º ÆÅcÇ Ù 9@yd ^  ^ Rw âãß ß ß ßä ß ß ß ß å ë Ä üè T7w[t¨Ä Ä Äsé*ë Ä ý
The proof of each lemma is again inductive. We apply ãåäyæ moq\
, Thus for each of theâ , lemmas, we acquireá constructor cases:
XNT7wtT DVEt^  C D ^ ç E ^ w D¬è ãåäyæ°é ë E ^ D Þ à á E É¡Ø w à D ãåäyæ º ƹ,ÅcÇÙ Ä w à ã䦿 º ÆBÅcÇÙ 9@¦d ^  ^ Rw à â ä å ë à E ÉØ D ãåäyæ º ÆB¹,ÅcÇÙ Ä 9@ ^ ²¸^ wR ãåäyæ º ÆÅcÇ Ù 9@yd ^  ^ R 9@ ^ ²¸^ wR
Now, a little computation is in order:
XNT7wtT DRE ^  C D ^ ç E^ w D¬è ãåäyæ°é ë E ^ D Þ à á E É¡Ø w à D ãåäyæ º ƹ,ÅcÇÙ Ä w à ã䦿 º ÆBÅcÇÙ 9<yd ^  ^ Rw à â ä å ë à E ÉØ D ú Ú ãåäyæ î ä Ï Ç¦[4o Ä w à ]Ð: ãåäyæ º ÆB¹,ÅcÇÙ Ä w Ã Û ë à ú D ãåäyæ î ä Ï Çy[Ao 9@yd ^  ^ R 9@ ^ ² ^ wR ú Ú ãåäyæ º ÆBÅcÇÙ 9@yd ^  ^ Rw ë
Firstly, each ãåäyæ º ÆBÅcÇÙ 9@yd ^  ^ Rw à follows by
à applied to the proof of
ãåäyæ º ÆB¹,ÅcÇÙ Ä w à projected from ÉØ .
Secondly, we must establish
ãåäyæ î ä Ï Çy[Ao 9@yd ^ Â ^ R 9@ ^ ² ^ w
That is, we must prove
9@yd ^  ^ R 9@R ^ ² ^ wR Ê so we apply ã䦿 ¹[
ä . If the constructors are different (æ
èç
), the goal is proved at once, otherwiseæ
éç
and we must show
^ Â ^ ² ^ ^ w Ê But look! ØÉ
contains proofs for eachê of
ãåäyæ î ä Ï Çy[Ao Ä w Ã
We may select the proof of Ä<
w
Ä from the injected equations, and the
proof of ãåäyæ î ä Ï Çy[Ao Ä w Ä from ØÉ
establishing a contradiction and completing the construction.
Let me remark only briefly on the extension to dependent families. For
ZF[A\ DsE·^`MD
^
a
J7LMNPORQ
the appropriate notion of inequality is
ZF[A\î ä Ï Ç¦[4o° Y ^ ^ w D ZF[A\ J ^ ^ wR Ê
We can then construct
Z][4\¨º ƹ,ÅcÇÙ and ZF[4\¨º ÆBÅcÇÙ as before: ZF[A\¨º ÆB¹,ÅcÇÙ Y ^ D ZF[A\ J Z][4\àÆIÇ_È ZF[A\î ä Ï Çy[Ao ^ R ZF[A\¨º ÆBÅcÇÙ Y ^ ^ w D ZF[A\ J ZF[4\î ä Ï Çy[Ao ^ ^ wR]Ð( ZF[4\¨º ÆB¹HÅcÇ Ù ^ ^ w
Since all three of these take two sequences in , rather than two elements in some
Z][4\ ^`
, no problem arises in the strengthening step: we are free to abstract away the whole right hand sequence, ensuring the induction is on the entire family.
As for the equational reasoning, suppose we are trying to prove some inequality
^
yx1
^
wjx1w
Ê where both sequences inhabit ZF[A\
, with both
and w
constructor- headed. Rather than trying to unify the ^
and ^
w
, we may apply the ‘strong’ version of the Peano theorem directly to the telescopic equation, solving the goal in the case of different constructors, and exposing the equations of the predecessors if the construc- tors are the same.
In effect, then, the construction scales up without any difficulty from elements of sim- ple types to sequences in some
Z][4\
.
This construction also generalises easily to datatypes which use higher-order construc- tors to represent infinitely-branching structures. When the higher-order arguments ap- pear as hypotheses they may simply be fixed, so that they may be used as the appro- priate witnesses for higher-order arguments in goal positions. However, it is not easy to exploit this proof automatically, as it is undecidable whether an infinitely-branching