4.4 Tree automata completion extensions
4.4.2 Restricted conditional term rewriting systems
Timbuk 3.0 accepts conditional rules of the form l → r if x1 ↓ x′1 and … and xn ↓ x′n,
where x1, x′1, …, xn, x′n ∈ Var(l). Completion using these conditional rules is defined
according to the theoretical framework of Section 3.1.4. Any non-left-linear rule of the form f(x, x, y) → g(y) can thus be encoded by the conditional rule f(x, x′, y) → g(y) if x ↓ x′. Note also that any rule of the form f(x, x) → g(x) must be encoded by a conditional rule of the form f(x, x′) → g(x) if x ↓ x′ (or f(x, x′) → g(x′) if x ↓ x′), which makes explicit the fact that completion computes no intersection. Indeed, assume that the tree automaton contains a transition f(q2, q3) → q0; the critical pair then adds g(q2) → q (or g(q3) → q) and q → q0. Of course, this is less precise than computing the intersection of the languages recognized by q2 and q3 but, as explained in the previous section, it keeps the size of completed automata reasonable in practice.
Limitation of conditional term rewriting systems w.r.t. non-left-linear rules
In Section 3.1.4, we gave the theory of completion with general conditional TRSs. Let f(x, x′) → g(x) if x ↓ x′ be the conditional rule to apply and f(q2, q3) → q0 be the transition of the tree automaton under consideration. By definition of completion with conditional rules, the transitions g(q2) → q′ and q′ → q0 are added if and only if L(A, q2) ∩ L(A, q3) ≠ ∅. This reveals another weakness of the encoding of non-left-linear rules into conditional rules: the encoding is not adequate for non-left-linear rules in which the same variable occurs strictly more than twice. For instance, a rule of the form f(x, x, x, y) → g(y) can be encoded exactly by the conditional rule f(x, x′, x″, y) → g(y) if x ↓ x′ and x′ ↓ x″ when rewriting terms, but this is no longer the case with completion. A reasonable semantics for completion of a transition f(q2, q3, q4, q5) → q0 w.r.t. the rule f(x, x, x, y) → g(y) would be to add the transitions g(q5) → q′ and q′ → q0 if and only if
(1) L(A, q2) ∩ L(A, q3) ∩ L(A, q4) ≠ ∅.
When using the above encoding into the conditional rewrite rule f(x, x′, x″, y) → g(y) if x ↓ x′ and x′ ↓ x″, the check becomes:
(2) L(A, q2) ∩ L(A, q3) ≠ ∅ and L(A, q3) ∩ L(A, q4) ≠ ∅.
Of course, (1) implies (2) (and thus we still have an over-approximation), but the two conditions are not equivalent. However, in the practical cases we encountered, no non-left-linear rule with three or more occurrences of the same variable was necessary.
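The gap between conditions (1) and (2) can be illustrated with a small Python sketch. It rests on assumptions that are not part of Timbuk: the helper names `linearize`, `exact_check` and `encoded_check` are hypothetical, and each language L(A, q) is replaced by a small finite set of terms so that intersections can be computed directly.

```python
def linearize(variables):
    """Rename repeated variables and chain one condition per extra occurrence.

    `variables` lists the variable occurrences of a left-hand side in order,
    e.g. ["x", "x", "x", "y"] for f(x, x, x, y).
    """
    last_name = {}    # variable -> name given to its most recent occurrence
    occurrences = {}  # variable -> number of occurrences seen so far
    fresh, conditions = [], []
    for v in variables:
        occurrences[v] = occurrences.get(v, 0) + 1
        name = v + "'" * (occurrences[v] - 1)
        if v in last_name:
            # The previous occurrence must be joinable with the new one.
            conditions.append((last_name[v], name))
        last_name[v] = name
        fresh.append(name)
    return fresh, conditions


def exact_check(langs, states):
    """Condition (1): all the given states share at least one term."""
    return bool(set.intersection(*(langs[q] for q in states)))


def encoded_check(langs, subst, conditions):
    """Condition (2): one pairwise non-emptiness test per condition."""
    return all(langs[subst[a]] & langs[subst[b]] for a, b in conditions)


# f(x, x, x, y) becomes f(x, x', x'', y) if x ↓ x' and x' ↓ x''.
fresh, conditions = linearize(["x", "x", "x", "y"])

# Finite stand-ins for L(A, q2), ..., L(A, q5) (illustrative data only).
langs = {"q2": {"a", "b"}, "q3": {"b", "c"}, "q4": {"c"}, "q5": {"d"}}
subst = dict(zip(fresh, ["q2", "q3", "q4", "q5"]))

cond1 = exact_check(langs, ["q2", "q3", "q4"])   # three-way intersection
cond2 = encoded_check(langs, subst, conditions)  # two pairwise intersections
```

On this data the pairwise tests of (2) succeed while the three-way intersection of (1) is empty, so the encoded rule would fire where the original non-left-linear rule would not.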
Now, we give some details about the implementation of completion with these restricted conditional rules. The optimization of the matching of linear left-hand sides of rules on the tree automaton is detailed in Section 4.2. Thus, we focus here on the optimization of the non-emptiness test for the intersection, i.e. L(A, q) ∩ L(A, q′) ≠ ∅ for given q and q′.
For this test, we neither build the intersection automaton nor check it for emptiness separately, but achieve both at once. Let ∆ and Q be respectively the set of transitions and the set of states of A. The algorithm uses two sets ok ⊂ Q × Q and rec ⊂ Q × Q and a recursive function check : Q × Q ↦ bool. The call check(q, q′) answers true if L(A, q) ∩ L(A, q′) ≠ ∅.
The objective of the algorithm is to find, as quickly as possible, a term common to states q and q′ or, conversely, to prove quickly that no such term exists.
The general principle of check(q, q′) is the following: if there are two transitions f(q1, …, qn) → q and f(q′1, …, q′n) → q′ and check(q1, q′1) = true, …, check(qn, q′n) = true, then check(q, q′) = true. Used as is, this function may not terminate on recursive transition sets, for the following reason. Assume that we want to check non-emptiness of the intersection between q and q′ and that ∆ contains the transitions f(q) → q and f(q′) → q′; then this recursive algorithm may go on forever. Note, however, that such recursive transitions need not be considered to decide non-emptiness. Indeed, in the previous case, L(A, q) ∩ L(A, q′) is non-empty only if some other transitions contribute to the languages recognized by states q and q′. This is the reason why we use the rec set to table the couples that have already been recursively inspected.
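The non-termination described above can be reproduced with a short Python sketch of the unguarded principle. The representation is an assumption made for illustration: a transition f(q1, …, qn) → q is written as the tuple `(f, (q1, …, qn), q)`, and the names `naive_check` and `terminates` are hypothetical. Python's recursion limit stands in for "going on forever".

```python
def naive_check(delta, q, q_prime):
    """Unguarded version of the general principle (no ok/rec tabling)."""
    for f, args, target in delta:
        if target != q:
            continue
        for g, args_prime, target_prime in delta:
            if target_prime != q_prime or g != f or len(args) != len(args_prime):
                continue
            # Recurse on the argument states, position by position.
            if all(naive_check(delta, a, b) for a, b in zip(args, args_prime)):
                return True
    return False


def terminates(delta, q, q_prime):
    """Report whether naive_check returns before hitting the recursion limit."""
    try:
        naive_check(delta, q, q_prime)
        return True
    except RecursionError:
        return False


# Only the recursive transitions f(q) -> q and f(q') -> q'.
looping = [("f", ("q",), "q"), ("f", ("q'",), "q'")]

# Same transitions, preceded by a shared constant a -> q and a -> q'.
with_constant = [("a", (), "q"), ("a", (), "q'")] + looping
```

Here `terminates(looping, "q", "q'")` is false, whereas `naive_check(with_constant, "q", "q'")` succeeds only because the constant transitions happen to be inspected first. This dependence on inspection order is one more reason to guard the recursion.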
This general principle needs to be completed with several optimizations for better efficiency. First, we use the ok set to table all couples of states whose intersection has already been proven non-empty. Second, there are some cases where check(q, q′), and thus L(A, q) ∩ L(A, q′) ≠ ∅, can be computed without recursive calls, by a careful inspection of the set of epsilon transitions or of ∆. Here is a more detailed description of the recursive check function.
check(q, q′) =
1. if (q, q′) ∈ ok then return true
2. if (q, q′) ∈ rec then return false
3. if q →A∗ q′ or q′ →A∗ q then ok := {(q, q′)} ∪ ok; return true¹
4. if there exists at least one common constant symbol a ∈ F such that a → q ∈ ∆ and a → q′ ∈ ∆ then ok := {(q, q′)} ∪ ok; return true
5. rec := rec ∪ {(q, q′)}
6. for all function symbols f ∈ F of arity n such that f(q1, …, qn) → q ∈ ∆ and f(q′1, …, q′n) → q′ ∈ ∆ do
• if check(q1, q′1) and … and check(qn, q′n) then ok := {(q, q′)} ∪ ok; return true
done
7. otherwise return false
¹ Provided that the corresponding states are filled (see Definition 94), which we always assume on all states of all
After each completion step, ok and rec are reset to ∅ and the check function is evaluated for all the conditions to check. Note that the content of the ok set could be reused from one step to the next, but this is not the case in the current implementation. Indeed, on the one hand, completion only adds transitions to states, so a non-empty intersection remains non-empty. On the other hand, state couples of ok may have to be renamed when states are merged by simplification with equations. For instance, if (q, q′) ∈ ok, i.e. L(A, q) ∩ L(A, q′) ≠ ∅, and q′ is merged with q″, we trivially have L(A, q) ∩ L(A, q″) ≠ ∅ and we can simply replace (q, q′) by (q, q″) in the ok set so that it remains valid.
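The seven steps of check, the per-step reset and the renaming of ok can be sketched in Python as follows. This is not Timbuk's actual code, only a minimal reading of the description under stated assumptions: a transition f(q1, …, qn) → q is the tuple `(f, (q1, …, qn), q)`, an epsilon transition q → q′ is the pair `(q, q′)`, all states are assumed to be filled, and the names `IntersectionChecker` and `rename_ok` are hypothetical. Step 6 is taken literally, so only transitions whose target is exactly q or q′ are inspected.

```python
from itertools import product


class IntersectionChecker:
    def __init__(self, delta, eps=()):
        self.delta = list(delta)  # normalized transitions (symbol, args, target)
        self.eps = set(eps)       # epsilon transitions (source, target)
        self.reset()

    def reset(self):
        """Empty both tables, as done after each completion step."""
        self.ok, self.rec = set(), set()

    def _reaches(self, src, dst):
        """True if dst is reachable from src by zero or more epsilon steps."""
        seen, todo = {src}, [src]
        while todo:
            s = todo.pop()
            if s == dst:
                return True
            for a, b in self.eps:
                if a == s and b not in seen:
                    seen.add(b)
                    todo.append(b)
        return False

    def check(self, q, q_prime):
        pair = (q, q_prime)
        if pair in self.ok:                                   # step 1
            return True
        if pair in self.rec:                                  # step 2
            return False
        if self._reaches(q, q_prime) or self._reaches(q_prime, q):  # step 3
            self.ok.add(pair)
            return True
        into_q = [(f, args) for f, args, t in self.delta if t == q]
        into_qp = [(f, args) for f, args, t in self.delta if t == q_prime]
        constants_q = {f for f, args in into_q if not args}
        if constants_q & {f for f, args in into_qp if not args}:    # step 4
            self.ok.add(pair)
            return True
        self.rec.add(pair)                                    # step 5
        for (f, args), (g, args_p) in product(into_q, into_qp):     # step 6
            if f == g and args and len(args) == len(args_p):
                if all(self.check(a, b) for a, b in zip(args, args_p)):
                    self.ok.add(pair)
                    return True
        return False                                          # step 7


def rename_ok(ok, old, new):
    """Rename state `old` to `new` in every couple of ok after a merge."""
    def rename(s):
        return new if s == old else s
    return {(rename(a), rename(b)) for a, b in ok}


# Illustrative automaton: q and q' both recognize f(a); s and s' share no term.
delta = [
    ("a", (), "q1"),
    ("f", ("q1",), "q"), ("f", ("q",), "q"),
    ("f", ("q1",), "q'"), ("f", ("q'",), "q'"),
    ("f", ("s",), "s"), ("a", (), "s"),
    ("f", ("s'",), "s'"), ("b", (), "s'"),
]
checker = IntersectionChecker(delta, eps=[("q1", "q0")])
```

With this data, `checker.check("q", "q'")` succeeds through step 6, `checker.check("s", "s'")` fails without looping because the couple is found in rec, and `checker.check("q1", "q0")` succeeds through the epsilon transition of step 3. Calling `rename_ok(checker.ok, "q'", "q''")` then mirrors the merge of q′ with q″ discussed above.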