Rewriting Techniques - Automating the Operator Classification

From i) we can easily see that our generators are (Lange) constructors, and from iii) we can see that any bool-sorted operator is a test operator. Condition ii) says that

Q. asg 2 : (m= 1 ) let t ' be the term eq.

2.6.1 Automating the Operator Classification

2.6.1.1 Rewriting Techniques

Inductionless induction methods aim to decide whether, given two terms M and N in T_Σ(X) (terms constructed from the operator symbols and a countably infinite set of variables X), the equation M = N is an inductive theorem of a given rewriting system R. The methods are based on the Knuth-Bendix (KB) completion procedure (or variations thereof) and, essentially, if the algorithm completes on R ∪ {M = N}, then M = N is an equation in the inductive theory. An equation in the inductive theory is not necessarily valid in all models; however, this method is appropriate when using initial algebra semantics.

The approach of [Mus 80a] requires equality predicates and thus an axiomatisation of Bool; if the algorithm completes on R ∪ {M = N} then the resulting theory must be consistent (i.e. true =_E false must not be derivable). The approach of [HuH 82] does not require an equality predicate; instead, the completion algorithm is extended. In order to apply the (extended) algorithm, the specification must satisfy the Principle of Definition. A specification (Σ,E) with generators Σ_G has the Principle of Definition when every ground term is equivalent to a unique ground generator term; i.e.

i) (∀t ∈ T_Σ)(∃s ∈ T_{Σ_G}) t =_E s
ii) (∀s,t ∈ T_{Σ_G}) s =_E t implies s = t.

We have tried, without success, to classify operators automatically using the rewrite rule laboratory Reve 2.4 [FoG 84] (using the approach of [HuH 82]) and with ERIL [Die 87] (using the approach of [Mus 80a]).

Problems arise because, before an operator can be classified, the axioms for leaves and (finite) multisets of primitive sort terms must be added to each specification. The enriched specification must then be organised into a terminating and confluent rewriting system. There are no difficulties with the axioms for leaves; for example, in the Queue specification, we would add an additional sort, set say, with generators _|_ : nat set -> set and [] : set, the operator leaves : queue -> set, and the following equations:

∀n : nat, q : queue.
leaves(add(q,n)) = n | leaves(q)
leaves(eq) = []
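The two equations for leaves can be read as a recursive function on ground queue terms. The following is a minimal Python sketch, not part of the original specification: the tuple encoding of terms and the names EQ and add are assumptions made for illustration, and a Python list stands in for a term built from _|_ and [].

```python
# Ground queue terms as nested tuples (an illustrative encoding):
# ("eq",) is the empty queue, ("add", q, n) adds the natural number n to q.
EQ = ("eq",)


def add(q, n):
    return ("add", q, n)


def leaves(q):
    """Mirror the two equations:
    leaves(eq) = []  and  leaves(add(q, n)) = n | leaves(q)."""
    if q == EQ:
        return []
    _tag, rest, n = q
    return [n] + leaves(rest)


print(leaves(add(add(EQ, 1), 2)))  # prints [2, 1]
```

The most recently added element appears first, because each add contributes its element to the front of the result.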

The usual specification of multisets has an associative-commutative generator. Although Reve allows associative-commutative operators, all generators with these properties violate the Principle of Definition and so the classifications cannot be checked within the current version of Reve. The Principle of Definition is further extended in [Mit 87] to allow for alternate sets of constructors, but this does not solve our problems. However, there are indications in [HuH 82] that the method could be extended to handle associative-commutative generators, and [JoK 86] contains some algorithms, although we are not aware of any implementations of these results at this time.

ERIL (Equational Reasoning Interactive Laboratory) includes an extension to the Knuth-Bendix completion algorithm which allows for the treatment of order-sorted algebras. By using a suitable axiomatisation of Bool, we can perform inductive proofs in ERIL using the inductionless induction approach of [Mus 80a]. When we use the axiomatisation of Horn clauses given in [Pau 85], we also have a form of positive conditional equations. ERIL does not include associative-commutative unification and so it cannot accommodate the associative-commutative nature of multiset construction. As an alternative, when there is a total order on the primitive terms, we can give a specification in which each multiset has a normal form which is an ordered list. Such a specification is really more of an implementation of multisets (cf. chapter five). This approach is also similar to an approach proposed in [Tho 86] where "laws" (rewrite rules) are introduced into ADTs. The types with laws can be thought of as sub-types of the associated free type, and the elements of a sub-type are the elements in normal form.

We give below an (ERIL) example specification of multisets and ordered lists of natural numbers. To aid the reader, we also give a lattice of the sorts. Comments are given between "/*" and "*/", and when the right-hand side of a bool-sorted equation is T, it is omitted. The actual ERIL listings for this specification (and the examples from Appendix 2) are given in Appendix 4.

[Figure: lattice of sorts, showing bool, true, false, nat and set]

sorts !, %, true, false, bool, nat, set

subsorts
  ! < %, ! < true, ! < false, ! < bool, ! < nat, ! < set,
  true < %, false < %, bool < %, nat < %, set < %,
  true < bool, false < bool

ops
  /*bool ops*/
  _∧_   : bool bool -> bool
  _∨_   : bool bool -> bool
  _=>_  : bool bool -> bool
  ~_    : bool -> bool
  _=_   : % % -> bool
          % ! -> bool
          ! % -> bool
          nat nat -> bool
          set set -> bool
          bool bool -> bool
          true false -> false
          false true -> false
  /*nat ops*/
  0     : nat
  succ  : nat -> nat
  _<_   : nat nat -> bool
  /*set ops*/
  [_]   : nat -> set
  _|_   : nat set -> set      /*unordered insert*/
  [_|_] : nat set -> set      /*ordered insert*/
  []    : set
  _⊆_   : set set -> bool     /*inclusion*/
  _mem_ : nat set -> bool     /*membership*/
  true
  false

eqns
  ∀n,n' : nat, b,b' : bool, S,S' : set.
  ~T = F
  ~F = T
  T ∨ b = T
  b ∨ T = T
  b => b' = ~b ∨ b'
  ~(b ∧ b') = ~b ∨ ~b'
  b = b                        /*reflexivity of = */
  n = n                        /*must be specified*/
  S = S
  n < 0 = F
  n < n = F
  0 < succ(n)
  succ(n) < succ(n') = n < n'
  (0 = succ(n)) = F
  (succ(n) = 0) = F
  (succ(n) = succ(n')) = (n = n')
  n | [] = [n]
  [n | []] = [n]
  (n = n') => n | [n' | S] = [n' | S]
  (n < n') => n | [n' | S] = [n | [n' | S]]
  (n < n') => n' | [n | S] = [n | [n' | S]]
  n mem [] = F
  (n = n') => n mem [n' | S]
  (n < n') => (n mem [n' | S] = F)
  (n < n') => (n' mem [n | S] = n' mem S)
  [] ⊆ S
  [n | S] ⊆ S' = (n mem S') ∧ (S ⊆ S')
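To make the idea of an ordered-list normal form concrete, the conditional equations for insertion, membership and inclusion can be sketched as ordinary functions. The following Python sketch is illustrative only and is not the ERIL listing: Python lists stand in for terms built from [_|_], the function names are hypothetical, and the recursive case of insert (a smaller head stays in front while the new element is inserted into the tail) is an assumed reading of the third insertion equation.

```python
def insert(n, s):
    """Ordered insert of n into the sorted list s.

    An equal head absorbs n, a larger head is pushed behind n, and
    (assumed reading) a smaller head stays in front while n is inserted
    into the tail.
    """
    if not s:
        return [n]
    head, tail = s[0], s[1:]
    if n == head:
        return s
    if n < head:
        return [n] + s
    return [head] + insert(n, tail)


def mem(n, s):
    """Membership test; it can stop early because s is sorted."""
    for x in s:
        if x == n:
            return True
        if n < x:
            return False
    return False


def subset(s, t):
    """[] is included in anything; [n | s] is included in t exactly
    when n is a member of t and s is included in t."""
    return all(mem(n, t) for n in s)


def normal_form(items):
    """Build the ordered-list normal form by repeated insertion."""
    out = []
    for n in items:
        out = insert(n, out)
    return out


print(normal_form([3, 1, 2, 1]))  # prints [1, 2, 3]
```

Because every insertion order leads to the same sorted list, equality of normal forms plays the role that associativity and commutativity play in the usual multiset specification.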

As an example, if we were to enrich the Queue specification with the above rules and the axiomatisation of leaves, then in order to classify dequeue as an eliminator using inductionless induction, we would add the following rule:

(T) leaves(dequeue(add(q,n))) ⊆ leaves(add(q,n)).

Unfortunately, running the Knuth-Bendix completion algorithm in ERIL (using the inductive ordering) on the above equations produces an infinite set of rewrite rules. Of course a finite portion of an infinite set of rules may be used as a semi-decision procedure and thus some interesting theorems may be derivable in finite time, but this approach is not adequate for our purposes. For example, the rule (T) is not derivable (in finite time) from the above system.

Infinite sets of rewrite rules may be avoided either by altering the completion procedure (and possibly weakening the results), or by enriching the original set of rewrite rules.

The completion algorithm given in [Fri 86] is often successful where the KB completion algorithm loops because confluence is, in general, too strong for inductive proofs. [Fri 86] shows that ground confluence (confluence for ground terms) along with a certain property of critical pairs (called complete superposability) is sufficient to guarantee the validity of a theorem in the initial algebra. [Gan 87] also contains a completion algorithm based on ground confluence which gives a weaker result than standard KB completion; coincidentally, this paper uses the specification of ordered lists as an example of a specification for which the algorithm terminates.

In [JaT 87], inductive inference techniques are used to synthesise an enrichment of a rewriting system such that the KB completion algorithm terminates when it is applied to the enriched set of rules. The inductive inference techniques [Bar 83] can synthesise new operators and new rewrite rules; in this case a new operator and some rewrite rules expressing the weakest common generalisation of the (previously) infinite set of critical pairs are synthesised. The example used is the rewriting system which consists of (part of) the Queue specification, (R1),...,(R3), a specification of leaves, (R4) and (R5), a very minimal specification of 'sets', (R6) and (R7), and (T). The signature and variable quantification are omitted; -> denotes the rewriting relation.

(R1) dequeue(eq) -> eq
(R2) dequeue(add(eq,d)) -> eq
(R3) dequeue(add(add(q,d),d1)) -> add(dequeue(add(q,d)),d1)
(R4) leaves(add(q,d)) -> d | leaves(q)
(R5) leaves(eq) -> []
(R6) [] ⊆ S -> T
(R7) (d | S) ⊆ [] -> F
(T) leaves(dequeue(add(q,d))) ⊆ leaves(add(q,d)) -> T
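The behaviour of (R1),...,(R7) on ground instances of the left-hand side of (T) can be explored with a small rewriting engine. The sketch below is a hedged illustration in Python, not the procedure of [JaT 87] and not Knuth-Bendix completion: it only normalises ground terms innermost-first. The term encoding and the symbol names sub, cons and nil (standing for ⊆, _|_ and []) are assumptions.

```python
# Terms: variables are strings, applications are tuples (symbol, arg, ...),
# and natural numbers are plain ints. The encoding is illustrative only.
EQ, NIL, T, F = ("eq",), ("nil",), ("T",), ("F",)

RULES = [
    (("dequeue", EQ), EQ),                                      # (R1)
    (("dequeue", ("add", EQ, "d")), EQ),                        # (R2)
    (("dequeue", ("add", ("add", "q", "d"), "d1")),
     ("add", ("dequeue", ("add", "q", "d")), "d1")),            # (R3)
    (("leaves", ("add", "q", "d")), ("cons", "d", ("leaves", "q"))),  # (R4)
    (("leaves", EQ), NIL),                                      # (R5)
    (("sub", NIL, "S"), T),                                     # (R6)
    (("sub", ("cons", "d", "S"), NIL), F),                      # (R7)
]


def match(pattern, term, subst):
    """Return subst extended so that pattern matches term, or None."""
    if isinstance(pattern, str):                 # a pattern variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if not (isinstance(pattern, tuple) and isinstance(term, tuple)):
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst


def substitute(term, subst):
    """Instantiate the variables of term according to subst."""
    if isinstance(term, str):
        return subst[term]
    if isinstance(term, tuple):
        return (term[0],) + tuple(substitute(a, subst) for a in term[1:])
    return term


def normalise(term, rules=RULES):
    """Innermost normalisation of a ground term."""
    if isinstance(term, tuple):
        term = (term[0],) + tuple(normalise(a, rules) for a in term[1:])
    for lhs, rhs in rules:
        subst = match(lhs, term, {})
        if subst is not None:
            return normalise(substitute(rhs, subst), rules)
    return term


def t_instance(q):
    """Left-hand side of (T) for a ground, non-empty queue q."""
    return ("sub", ("leaves", ("dequeue", q)), ("leaves", q))


one = ("add", EQ, 1)
two = ("add", one, 2)
print(normalise(t_instance(one)))  # reduces to T via (R2), (R5) and (R6)
print(normalise(t_instance(two)))  # stuck: neither (R6) nor (R7) applies
```

The second instance reaches a normal form that is neither T nor F, which is one way of seeing how little (R6) and (R7) say about inclusion.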

The rewriting system consisting of only (R1),...,(R7) is terminating and confluent; the addition of (T) results in an infinite set of rules. This is not a surprising result, as the theory of 'sets', i.e. (R4), (R5), (R6) and (R7), is very inexpressive. Indeed, (R4), (R5), (R6) and (R7) together form a specification requirement for a theory of sets. When we use inductive inference to synthesise the operators and rules which make the (enriched) rewriting system terminating and confluent, we are, in effect, synthesising a theory of 'sets' which is just sufficient to make dequeue an eliminator.

The result of applying inductive inference techniques in [JaT 87] is the following rewriting system: a new operator _|*_ : set set -> set is introduced and we use the abbreviations l' and l'' for leaves(dequeue(add(q,d))) and d | leaves(q) resp.

(R1) dequeue(eq) -> eq
(R2) dequeue(add(eq,d)) -> eq
(R3) dequeue(add(add(q,d),d1)) -> add(dequeue(add(q,d)),d1)
(R4) leaves(add(q,d)) -> d | leaves(q)
(R5) leaves(eq) -> []
(R6) [] ⊆ S -> T
(R7) (d | S) ⊆ [] -> F
(Q1) d | l' -> (d | []) |* l'
(Q2) d | l'' -> (d | []) |* l''
(Q3) d | (S |* l') -> (d | S) |* l'
(Q4) d | (S |* l'') -> (d | S) |* l''
(Q5) S |* (S |* l') -> (S |* S) |* l'
(Q6) S |* (S |* l'') -> (S |* S) |* l''
(S) S |* l' ⊆ S |* l'' -> T
(T) l' ⊆ l'' -> T

A specification of 'sets' has been found which ensures that dequeue is an eliminator. These 'sets' are more familiarly known as left-associative sequences, and dequeue is trivially an eliminator because (T) is in the equational theory. If we remove (T), then the remaining rewriting system is not terminating and confluent because (T) is an instance of (S) with [] substituted for S; i.e. (T) is the base case and (S) is the inductive case.

Unfortunately, our objective is a theory of 'sets' in which (T) is not in the equational theory, but in the inductive theory. Can we now use the above specification of 'sets' to derive an alternative specification whose equational theory is contained in the usual specification of multisets (i.e. the specification with associative-commutative union) and whose inductive theory contains (T)? An examination of the rules (Q1), (Q2), (Q3), (Q4), (Q5) and (Q6) leads us to conclude that a 'set' constructor which is left-associative is required. The original constructor _|_ fulfills this requirement and so _|*_ is not required. The obvious (boolean) relationship between left-associative sequences constructed by _|_ is the left subsequence relationship. This leads us to conjecture that ⊆ may be specified as the left subsequence operator; this conjecture seems, informally, to be consistent with (T). For example, one can check that leaves(dequeue(add(eq,d))) is a left subsequence of leaves(add(eq,d)). This relationship is specified by the following rules:

(R8) (d | S) ⊆ [] -> F
(R9) (d | S) ⊆ (d1 | S1) -> (d = d1) ∧ (S ⊆ S1)

The equational theory of (R6), (R8) and (R9) is obviously contained in an associative-commutative theory of multisets of terms of sort nat. Moreover, we can easily check that the rewriting systems {(R1),...,(R6)}, {(R1),...,(R6),(R8),(R9)} and {(R1),...,(R6),(R8),(R9),(T)} are terminating and confluent. (T) is not in the equational theory of the second rewriting system but, because the third rewriting system is terminating and confluent, (T) is in the inductive theory of the second system. Thus, in a non-trivial way, we have now shown that dequeue is an eliminator.
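That (T) holds on ground terms when ⊆ is read as the left subsequence relation of (R6), (R8) and (R9) can be spot-checked by enumeration. The Python sketch below is only a finite test over small queues, under an assumed tuple encoding of terms with hypothetical function names; it is not a substitute for the completion argument above.

```python
from itertools import product

EQ = ("eq",)  # the empty queue; ("add", q, n) adds n to q


def add(q, n):
    return ("add", q, n)


def dequeue(q):
    """(R1)-(R3): drop the oldest element, i.e. the innermost add."""
    if q == EQ:
        return EQ
    _tag, rest, n = q
    if rest == EQ:
        return EQ
    return add(dequeue(rest), n)


def leaves(q):
    """(R4), (R5): leaves(add(q, d)) = d | leaves(q), leaves(eq) = []."""
    if q == EQ:
        return []
    _tag, rest, n = q
    return [n] + leaves(rest)


def left_subseq(s, t):
    """(R6), (R8), (R9): s is a left subsequence (prefix) of t."""
    if not s:
        return True
    if not t:
        return False
    return s[0] == t[0] and left_subseq(s[1:], t[1:])


def queues(max_len, values):
    """All ground queues with at most max_len elements drawn from values."""
    for k in range(max_len + 1):
        for items in product(values, repeat=k):
            q = EQ
            for n in items:
                q = add(q, n)
            yield q


def check_T(max_len=4, values=(0, 1, 2)):
    """Test (T) on every non-empty queue in the enumerated range."""
    return all(
        left_subseq(leaves(dequeue(q)), leaves(q))
        for q in queues(max_len, values)
        if q != EQ
    )


print(check_T())  # prints True
```

The check succeeds because dequeue removes the innermost element, which is the last entry of the leaves sequence, so the remaining leaves always form a prefix.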

Of course the theory of left-associative sequences, (R6), (R8) and (R9), is not expressive enough to prove the classification of eliminators in all keyless specifications. However, our experience with this example leads us to conjecture that the commutative nature of sets is not necessary for classifying the eliminators of a keyless specification; a suitably expressive specification of finite sequences with associativity as the only permutative property should be sufficient. We note that commutativity may well be required in order to classify the operators in implicitly keyed specifications; for example, commutativity would be required for classifying operators in the specification Priority_Queue.

Moreover, when only associativity is required, then we also conjecture that for keyless specifications, inductive inference techniques will always be successful when looking for weakest common generalisations and thus a terminating and confluent rewriting system will be found using these techniques.

In document The imperative implementation of algebraic data types (Page 54-60)