Tree Automata Completion for Static Analysis of Functional Programs

(1)

Tree Automata Completion for Static Analysis of

Functional Programs

Thomas Genet, Yann Salmon

To cite this version:

Thomas Genet, Yann Salmon. Tree Automata Completion for Static Analysis of Functional Programs. 2013. <hal-00780124v2>

HAL Id: hal-00780124

https://hal.archives-ouvertes.fr/hal-00780124v2

Submitted on 27 May 2013

HAL is a multi-disciplinary open access archive for the deposit and dissemination of sci-entific research documents, whether they are pub-lished or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d’enseignement et de recherche fran¸cais ou étrangers, des laboratoires publics ou privés.

(2)

Tree Automaton Completion for Static

Analysis of Functional Programs

Thomas Genet

∗

Yann Salmon

†

May 24, 2013

Tree Automata Completion is a family of techniques for computing or approxi-mating the set of terms reachable by a rewriting relation. For functional programs translated into TRSs, we give a sufficient condition for completion to terminate. Second, in order to take into account the evaluation strategy of functional pro-grams, we show how to refine completion to approximate reachable terms for a rewriting relation controlled by a strategy. In this paper, we focus on innermost strategy which represents the call-by-value evaluation strategy.

1 Introduction

Computing or approximating the set of terms reachable by rewriting finds more and more applications. For a Term Rewriting System (TRS)Rand a set of termsL₀⊆_T₍_Σ_{), the set of} reachable terms isR∗(L0) = n t∈_T₍_Σ₎ ∃s∈L0, s→ ∗ Rt o

. This set can be computed for specific classes ofRbut, in general, it has to be approximated. Applications of the approximation of

R∗(L0) are ranging from cryptographic protocol verification [GK00, ABB+05], to static analysis

of various programming languages [BGJL07, KO11] or to TRS termination proofs [Mid02, GHWZ05]. Most of the techniques compute such approximations using tree automata as the core formalism to represent or approximate the (possibly) infinite set of termsR∗(L0). Most

of them also rely on a Knuth-Bendix completion-like algorithm to produce an automaton A∗_{recognising exactly, or over-approximating, the set of reachable terms. As a result, these} techniques can be refered as tree automata completion techniques [Gen98, TKS00, Tak04, FGVTT04, BCHK09, GR10, Lis12].

In this paper, we investigate the application of tree automata completion techniques to the static analysis of functional programs. The objective of such an analysis is to over-approximate the set of possible results of higher-order functional programs [OR11, KO11]. First, like the standard Knuth-Bendix completion, tree automata completion is not guaranteed to terminate. For TRSs extracted from functional programs, we show that termination of automata comple-tion is guaranteed using natural constraints on the definicomple-tion of the approximacomple-tion. Second, we show that completion can take the rewriting strategy into account, i.e. over-approximate

∗

IRISA, Campus de Beaulieu, 35042 Rennes Cedex, France,[email protected]

†

(3)

terms reachable by rewriting under a strategy. In this paper, we focus on the innermost strat-egy which corresponds to the call-by-value stratstrat-egy that is at the heart of several functional programming languages, such as Ocaml [LDG+12].

Surprisingly, very little effort has been done on computing or over-approximatingR∗_strat(L0),

i.e. set of reachable terms whenRis applied with a strategystrat. To the best of our knowledge, Pierre RÃľty’s work [RV02] is the only one to have tackled this goal. He gives some sufficient conditions onL0andRforR

∗

strat(L0) to be recognised by a tree automatonA

∗

, wherestratcan be the innermost or the outermost strategy. However, those restrictions onRandL0are strong

and generally incompatible with the analysis of functional programs. In this paper, we define a tree automata completion algorithm over-approximating the setR∗_in(L0) for all left-linear

TRSsRand all regular set of input termsL0.

This paper is organised as follows: Section 2 recalls some basic notions about TRSs and tree automata. Section 3 defines tree automata completion. Section 4 shows how to guarantee the termination of completion when analysing TRSs obtained from functional programs. Section 5 presents some experiments. Finally, Section 6 explains how to tune completion so as to take innermost strategy into account.

2 Basic notions and notations

2.1 Terms

Definition 1 (Signature).

A signature is a set whose elements are called function symbols. Each function symbol has an arity, which is a natural integer. Function symbols of arity 0 are called constants. Given a signatureΣandk∈_{N, the set of its function symbols of arity}_k_{is noted}_Σ_k_. 1J Definition 2 (Term, ground term, linearity).

Given a signatureΣand a setX _{whose elements are called variables and such that}_Σ∩ X ₌_∅, we define the set of terms overΣ andX_,_T₍_Σ,X_{), as the smallest set such that :}

1. X ⊆_T₍_Σ,X_{) and}

2. ∀_k∈_N_,∀_f ∈_Σ_k_,∀_t₁_{, . . . , t}_k∈_T₍_Σ,X₎_{, f}₍_t₁_{, . . . , t}_k₎∈_T₍_Σ,X_).

Terms in which no variable appears, i.e. terms inT(Σ,∅), are called ground; the set of ground terms is notedT(Σ).

Terms in which any variable appears at most once are called linear.1 2J Definition 3 (Substitution).

A substitution overT(Σ,X_{) is an application from}X _to_T₍_Σ,X_{). Any substitution is} induc-tively extended toT(Σ,X_{) by}_σ₍_f₍_t₁_{, . . . , t}_k_{)) =}_f₍_σ₍_t₁₎_{, . . . , σ}₍_t_k_{)). Given a substitution}_σ _{and a}

termt, we notetσ instead ofσ(t). 3J

Definition 4 (Context).

A context overT(Σ,X_{) is a term in}_T₍_Σ∪ X_,{}_{) in which the variable}_{appears exactly} once. A ground context overT(Σ,X_{) is a context over}_T₍_Σ_{). The smallest possible context,}_, is called the trivial context. Given a contextC and a termt, we noteC[t] the termCσt, where

σt:7→t. 4J

(4)

Definition 5 (Position).

Positions are finite words over the alphabet N. The set of positions of term t, Pos(t), is defined by induction overt:

1. for all constantcand all variableX, Pos(c) = Pos(X) ={_Λ}_and 2. Pos(f(t1, . . . , tk)) ={Λ} ∪

k [

i=1

{_i}_._Pos(_t_i_). 5J

Definition 6 (Subterm-at-position, replacement-at-position).

The position of the hole in contextC, Pos(C), is defined by induction onC: 1. Pos() =Λ

2. Pos(f(C1, . . . , Ck)) =i.Pos(Ci), whereiis the unique integer inJ1 ;kKsuch thatCiis a

context.

Given a term u and p ∈ _Pos(_u_{), there is a unique context} _C _{and a unique term} _v _such that Pos(C) =pandu =C[v]. The termv is notedu|_p, and, given another termt, we note

u[t]_p=C[t]. 6J

2.2 Rewriting

Definition 7 (Rewriting rule, term rewriting system).

A rewriting rule over (Σ,X_{) is a couple (}_{`, r}₎∈_T₍_Σ,X₎×_T₍_Σ,X_{), that we note}_`→_r_{, such} that any variable appearing inralso appears in`. A term rewriting system (TRS) over (Σ,X_{) is}

a set of rewriting rules over (Σ,X_). 7J

Definition 8 (Rewriting step, redex, reducible term, normal form).

Given a signature (Σ,X_{), a TRS}_R_{over it and two terms}_{s, t}∈_T₍_Σ_{), we say that}_s_{can be} rewritten intotbyR, and we notes→_R_t_{if there exist a rule}_`→_r∈_R_{, a ground context}_C overT(Σ) and a substitutionσ overT(Σ,X_{) such that}_s₌_C_[_`σ_{] and}_t₌_C_[_rσ_].

In this situation, the termsis said to be reducible byRand the subterm`σ is called a redex ofs. A termsthat is not reducible byRis a normal form ofR. The set of normal forms ofRis noted Irr(R).

We note→∗

Rthe reflexive and transitive closure of→R. 8J Definition 9 (Set of reachable terms).

Given a signature (Σ,X_{), a TRS} _R _{over it and a set of terms} _L⊆ _T₍_Σ_{), we note} _R₍_L_{) =} {_t∈_T₍_Σ₎| ∃_s∈_{L, s}→_R_t}_and_R∗₍_L_{) =}n_t∈_T₍_Σ₎ ∃s∈L, s→ ∗ Rt o . 9J Definition 10 (Left-linearity).

A TRSRis said to be left-linear if for each rule`→_r_of_R_{, the term}_`_{is linear.} 10J Definition 11 (Constructors and defined symbols, sufficient completeness).

Given a TRSRover (Σ,X_{), there is a partition (}C_,D_{) of}_Σ _{such that all symbols occurring} at the root position of left-hand sides of rules ofRare inD_. D_{is the set of defined symbols} ofR,C_{is the set of constructors. Terms in}_T₍C_{) are called data-terms. A TRS}_R_{over (}_Σ,X_{) is} sufficiently complete if for alls∈_T₍_Σ_),_R∗₍{_s}₎∩_T₍C₎_,_∅. 11J

(5)

2.3 Equations

Definition 12 (Equivalence relation, congruence).

A binary relation over some setS is an equivalence relation if it is reflexive, symmetric and transitive.

An equivalence relation≡_over_T₍_Σ_{) is a congruence if for all}_k∈_{N, for all}_f ∈_Σ_k_{, for all}

t1, . . . , tk, s1, . . . , sk∈T(Σ) such that∀i∈J1 ;kK,ti≡si, we havef(t1, . . . , tk)≡f(s1, . . . , sk). 12J

Definition 13 (Equation,≡_E_).

An equation over (Σ,X_{) is a pair of terms (}_{s, t}₎∈_T₍_Σ,X₎×_T₍_Σ,X_{), that we note}_s₌_t_{. A set}_E of equations over (Σ,X_{) induces a congruence}≡_E_over_T₍_Σ_{) which is the smallest congruence} overT(Σ) such that for alls=t∈_E_{and for all substitution}_θ_:X →_T₍_Σ_),_sθ≡

Etθ. The classes

of equivalence of≡_E_{are noted with [}·_]_E_. 13J

Definition 14 (Rewriting moduloE).

Given a TRSRand a set of equationsEboth over (Σ,X_{), we define the}_R_modulo_E_rewriting relation,→_R/E_{, as follows. For any}_{u, v}∈_T₍_Σ_),_u→_R/E_v_{if and only if there exist}_u0_{, v}0∈_T₍_Σ₎ such thatu0≡_E_u_,_v0≡_E_v_and_u0→_R_v0_.

We define→∗

R/E, (R/E)(L) and (R/E) ∗

(L) forL⊆_T₍_Σ_{) as in Definitions 8 and 9.} 14J 2.4 Tree automata

Definition 15 (Tree automaton, delta-transition, epsilon-transition, new state).

An automaton overΣis someA_{= (}_{Σ, Q, Q}_F_{, ∆}_{) where}_Q_{is a finite set of states,}_Q_F_{is a subset} ofQwhose elements are called final states and∆a finite set of transitions. A delta-transition is of the formf(q1, . . . , qk)q 0 wheref ∈_Σ k andq1, . . . , qk, q 0 ∈_Q_{. An epsilon-transition is of} the formqq0 whereq, q0∈_Q_{. A configuration of}A_{is a term in}_T₍_{Σ, Q}_).

A stateq∈_Q_{that appears nowhere in}_∆_{is called a new state. A configuration is elementary} if each of its subconfigurations at depth 1 (if any) is a state. A configuration is trivial if it is

just a state. 15J

Remark. We simply writeA_{to denote an automaton, write}_Q_A_{for the set of states of}A_{. We} assimilate an automaton with its set of transitions. When taking a “new state”, we silently expandQAif needed. We are only rarely interested inQ_F, the set of final states.

Definition 16.

LetA_{= (}_{Σ, Q, Q}_F_{, ∆}_{) be an automaton and let}_{c, c}0 _{be configurations of}A_{. We say that}A recognisescintoc0in one step, and notec

A c 0

if there a transitionτρinA_{and a context}

CoverT(Σ, Q) such thatc=C[τ] andc0=C[ρ]. We note∗

A the reflexive and transitive closure

of

A and, for anyq

∈_Q_,L₍A_{, q}_{) =} t∈_T₍_Σ₎ t∗ A q

. We extend this definition to subsets of

Qand noteL(A_{) =}L₍A_{, Q}_F_). 16J

Definition 17 (Determinism, Completeness).

An automaton is deterministic if it has no epsilon-transition and for all delta-transitions

τρandτ0 ρ0, ifτ=τ0 thenρ=ρ0. An automaton is complete if each of its non-trivial configurations is the left-hand side of some of its transitions. 17J

(6)

Definition 18 (Colours).

Transitions may have “colours”, likeRfor transition qR q0, and we denote byAR the automaton obtained fromA_{by removing all transitions coloured with}R. 18J Definition 19.

Given two statesq,q0 of some automatonA_{and a colour}E, we noteq

E A q0 when we have bothqE A q 0 andq0E A q. 19J Remark. q E, ∗ A q0 is stronger than (qE,∗ A q 0 ∧_q0 E,∗ A q). E,∗ A

is an equivalence relation overQA. This relation is extended to a congruence relation overT(Σ, Q). The equivalence classes are noted with [·_]_E_.

Definition 20.

LetA_{= (}_{Σ, Q, Q}_F_{, ∆}_{) be an automaton and}_E_{a colour. We note}A_/_E_{the automaton over}_Σ whose set of states isQ/E, whose set of final states ifQF/Eand whose set of transitions is

n f([q1]E, . . . ,[qk]E) q0 E f(q1, . . . , qk)q 0 ∈_∆o ∪n_[_q_] E q0 E qq 0 ∈_∆∧_[_q_] E, q0 E o . 20J

Remark. For any configurationsc, c0 ofA_{, we have}_c∗ A c 0 if and only if [c]E ∗ A_/_E c0 E. So the languages recognised byA_andA_/_E_{are the same.}

We now give notations used for pair automata, the archetype of which is the product of two automata.

Definition 21 (Pair automaton).

An automatonA_{= (}_{Σ, Q, Q}_F_{, ∆}_{) is said to be a pair automata if there exists some sets}_Q₁_and

Q2such thatQ=Q1×Q2. 21J

Definition 22 (Product automaton).

LetA_{= (}_{Σ, Q, Q}_F_{, ∆}_A_{) and}B_{= (}_{Σ, P , P}_F_{, ∆}_B_{) be two automata. The product automaton of}A andB_isA × B_{= (}_{Σ, Q}×_{P , Q}_F×_P_F_{, ∆}_{) where} ∆=nf(h_q₁_{, p}₁i_{, . . . ,}h_q_k_{, p}_ki₎ q0, p0 f(q1, . . . , qk)q 0 ∈_∆_A∧_f₍_p₁_{, . . . , p}_k₎_p0∈_∆_Bo ∪nh_{q, p}i q0, p0 qq 0 ∈_∆_Ao∪nh_{q, p}i q, p0 pp 0 ∈_∆_Bo_. 22J Definition 23 (Projections).

LetA_{= (}_{Σ, Q, Q}_F_{, ∆}_{) be a pair automaton, let}_τ _ρ_{be one of its transitions and}h_{q, p}i_be one of its states. We define Π₁(h_{q, p}i_{) =}_q _{and extend} _Π₁₍·_{) to configurations inductively:}

Π1(f(γ1, . . . , γk)) =f(Π1(γ1), . . . , Π1(γk)). We defineΠ1(τρ) =Π1(τ)Π1(ρ). We define

Π1(A) = (Σ, Π1(Q), Π1(QF), Π1(∆)).Π2(·) is defined on all these objects in the same way for

the right components. 23J

Remark. UsingΠ1(A) amounts to forgetting the precision given by the right components of

(7)

2.5 Innermost strategies

In general, a strategy over a TRS R is a set of (computable) criteria to describe a certain subrelation of→_R_{. In this paper, we will be interested in innermost strategies. In these} strate-gies, commonly used to execute functional programs (“call-by-value”), terms are rewritten by always contracting one of the lowest reducible subterms.

Definition 24 (Innermost strategy).

Given a TRSRand two termss, t, we say thatscan be rewritten intotbyRwith an innermost strategy, and we notes→_R

int, ifs→Rtand each strict subterm of the redex insis aR-normal form. We define→∗

Rin,Rin(L) andR ∗

in(L) as in Definitions 8 and 9. 24J

Remark. It is in fact sufficient to check whether subterms ofsat depth 1 are in normal form to decide whetherscan be rewritten with an innermost strategy.

To deal with innermost strategies, we will have to discriminate normal forms. This is possible within the tree automaton framework whenRis left-linear.

Theorem 25 ([CR87]).

LetRbe a left-linear TRS. There is a deterministic and complete tree automatonIRR(R) such thatL(IRR(R)) = Irr(R) and with a distinguished statep

redsuch thatL(IRR(R), pred) =

T(Σ)rIrr(R). 25

Remark. From determinism and the property ofpred follows that for any state pdifferent

frompred,L(IRR(R), p)⊆Irr(R).

3 Classical equational completion

Equational completion of [GR10] is an iterative process on automata that is not guaranteed to terminate. Each iteration comprises two parts: (exact) completion itself, then equational merging. The former tends to incorporate descendants byR of already recognised terms into the recognised language; this leads to the creation of new states. The latter tends to merge states in order to ease termination of the overall process, at the cost of precision of the computed result. Some transition added by equational completion will have coloursRorE; it is assumed that the transitions of the input automatonA₀_{do not have any colour and that}A₀ does not have any epsilon-transition.

3.1 Exact completion

Exact completion is about resolving critical pairs. A critical pair represents a situation where some term is recognised by the current automaton, but not its descendants byR. Its resolution consists in adding transitions to let the descendants be recognised as well. This process can create new critical pairs.

Definition 26 (Critical pair).

A pair (`→_{r, σ , q}_{) where}_`→_r∈_R_,_σ _:X →_Q_A_and_q∈_Q_A_{is critical if (see Figure 1(a))} 1. `σ∗

(8)

2. rσ ∗ A q. 26J We will want to add transitionsrσ q0 andq0 q0. However, doing the former is not generally possible in one step, asrσ might be a non-elementary configuration. We will have to normalise this configuration by adding intermediate states. For example, to setf(g(a, b))q0, we will first addaq_a, then,bq_b,g(q_a, q_b)q_gand finallyf(q_g)q0, exploringf(g(a, b)) with a postfix traversal. At each of these steps, we will look whether we could reuse an already existing transition; if not, we will add a new state.

Definition 27 (NormA(c, q)).

LetA_{be an automaton,}_c_{be a non-trivial configuration and}_q_{be any state. Let}_Ξ_{= (}_ξ₁_{, . . . , ξ}_K₎ be a postfix traversal ofc(ξK is the root position). With∆the set of transitions ofA, let us use an auxiliary function:

NormA(c, q) =NormAuxΞ,_∆1(c, q). (1) Now let us defineNormAux·Ξ,i. Foriranging from 1 toK−1, for any set of transitions∆and any configurationdsuch thatd|_ξ_i is an elementary configuration, let

NormAuxΞ,i_∆ (d, q) ={_d_|_ξ i q 0 } ∪_NormAuxΞ,i+1 ∆∪{_d|_ξiq0}(d q0 ξi, q) (2)

whereq0 is such thatd|_ξ_i q 0

in∆/E(or, if there is no such state,q0is a new state, distinct from

q). Also, for any set of transitions∆and any elementary configurationd, let

NormAuxΞ,K_∆ (d, q) ={_d_q}_. ₍₃₎ 27J Remark. It is necessary to consider equivalence byEwhen searching for an already existing transition. Suppose we work with somef(q₁) and haveq₁E

∆

q₂ andf(q₂)

∆ q

0

: we do not want to create a new state here, but reuseq0 and setf(q1)q

0 . Definition 28 (Completion of a critical pair).

A critical pairCP = (` →_{r, σ , q}_{) in automaton}A _{is completed by first computing} _N ₌

Norm_A

R(rσ , q

0

) whereq0 is a new state, then adding toA_{the new states and the transitions} appearing inN as well as the transitionq0R q. Ifrσ is a trivial configuration (ie. ris just a

variable), only transitionrσ R qis added. 28J

`σ rσ q q0 R A ∗ ∗ _A0 A0 R

(a) A critical pair

sθ tθ q1 q2 E A ∗ ∗ _A E A0 (b) Situation of ap-plication

(9)

Definition 29 (Step of completion).

The automaton resulting from completion of a list comprising of one critical pairCP is notedJ(CP)

R (A). For a list of zero critical pair, we set J

()

R (A) = A and for any listLCP of critical pairs (ofA_{) whose head is}_CP _{and tail is}_LCP0_,JLCP

R (A) =JLCP 0 R J(CP) R (A) . A step of completion of automatonA_{consists in producing automaton}C_R₍A_{) =}JCP

R (A), whereCP is a

list of all critical pairs ofA_. 29J

Example 30.

LetΣ be defined with Σ0 ={n,0}, Σ1 = {s, a, f}, Σ2 ={c} where 0 is meant to represent

integer zero,sthe successor operation on integers,athe predecessor (“antecessor”) operation,

nthe empty list,cthe constructor of lists of integers andf intended by some unwise person to be the function on lists that filters out integer zero. LetR ={_f₍_n₎→_{n, f}₍_c₍_s₍_X₎_{, Y}₎₎ →

c(s(X), f(Y)), f(c(a(X), Y))→ _c₍_a₍_X₎_{, f}₍_Y₎₎_{, f}₍_c₍₀_{, Y}₎₎→_f₍_Y₎_{, a}₍_s₍_X₎₎ →_{X, s}₍_a₍_X₎₎→_X}_{. Let} A₀₌{_n_q_n_,₀_q₀_{, s}₍_q₀₎_q_s_{, a}₍_q_s₎_q_a_{, c}₍_q_a_{, q}_n₎_q_c_{, f}₍_q_c₎_q_f}_.

We haveL(A₀_{, q}_f_{) =}{_f₍_c₍_a₍_s₍₀₎₎_{, n}₎₎}_and_R(L₍A₀_{, q}_f_{)) =}{_f₍_c₍₀_{, n}₎₎_{, c}₍_a₍_s₍₀₎₎_{, f}₍_n₎₎}_.

There is a critical pairCP1inA0with the rulef(c(a(X), Y))→c(a(X), f(Y)), the substitution

σ₁ ={_X 7→ _q

s, Y 7→ qn} and the state qf. It is resolved by adding transitions to recognise

c(a(qs), f(qn)) intoqf. Normalisation finds and reuses the transitiona(qs)qa. It has to create a new stateqN1 such that f(qn)qN1, and qN2 such that c(qa, qN1) qN2. We then add

qN2

R

qf, and have producedJ

(P C1) R (A0).

Another critical pair isCP2inA0with the rulea(s(X))→X, the substitutionσ2={X7→q0}

and the stateqa. It is resolved by adding toJ

(P C1) R (A0) the transitionq0 R qa, producing J(P C1,P C2) R (A0).

There is no more critical pair inA₀_{: thus}C_R₍A₀_{) =}J(P C1,P C2)

R (A0). There is a new critical

pair inC_R₍A₀_{) with}_f₍_n₎→_n_{, the empty substitution and state}_q_N₁_. 30C 3.2 Equational merging

Since completion of a critical pair can create new critical pairs, the process fuels itself, which is problematic for obtaining a fix-point. Equational merging is a way of countering this phenomenon at the cost of precision that is parametrised by equations overT(Σ).

Definition 31 (Situation of application of an equation).

Given an equations=t, an automatonA_{, a}_θ_:X →_Q_A_{and states}_q₁_and_q₂_{, we say that} (s=t, θ, q₁, q₂) is a situation of application inA_{if (see Figure 1(b))}

1. sθ∗ A q1, 2. tθ∗ A q2 and 3. q1 E A q2. 31J

(10)

Definition 32 (Application of an equation).

Given (s=t, θ, q1, q2) a situation of application inA, applying the underlying equation in it

consists in adding transitionsq1

E

q2andq2

E

q1toA. This produces a new automatonA

0

and we noteA;EA0_. 32J

Remark. In [GR10],q1andq2were merged by “renaming”q2intoq1, ie. removingq2from

QA and replacing every occurrence ofq₂byq₁in the transitions ofA. This is equivalent to applying our method, then considering automatonA_/_E_{(see definition 20) and finally choosing} a representative (hereq1for the class{q1, q2}) of each equivalence class of states.

Definition 33 (Simplified automaton).

Given an automatonA_{and a set of equations}_E_{, we call simplified automaton of}A_by_E_and noteS_E₍A_{) the automaton resulting from the successive application of all applicable equations}

inA_. 33J

Remark (;Eis confluent). Indeed, there is a unique automatonA!_{that di}ff_{ers from}A_only by itsE-transitions and is suchAR;∗

E(A!)R and there is no more situation of application of equations in (A!₎R.

Definition 34 (Step of equational completion).

A step of equational completion is the composition of a step of exact completion, then equational simplification:CE_R,E₍A_{) =}S_E₍C_R₍A_)). 34J The following notion is part of an easier discourse about theR/E-coherence notion of [GR10]. Definition 35 (Coherent automaton).

LetA₀_{= (}_{Σ, Q, Q}_F_{, ∆}_{) be a tree automaton and}_E_{a set of equations. The automaton}A₀_is said to beE-coherentif for allq∈_Q_{, there exists}_s∈L₍A₀_{, q}_{) such that}L₍A₀_{, q}₎⊆_[_s_]_E_. 35J 3.3 Known results

We now recall the two main theorems of [GR10]. Theorem 36 (Correctness).

LetA₀_{be some automaton. Assume the equational completion procedure defined above} terminates when applied toA₀_{. Let}A_∗_{be the resulting fix-point automaton. If}_R_{is left-linear,} then the calculated over-approximation is correct, that is

L(A_∗₎⊇_R∗(L₍A₀₎₎_. 36 We will make usage ofE-coherence for the precision theorem.

Lemma 37.

LetA₀_{be a} _E_{-coherent automaton,} _R_{a left-linear TRS and}A _{be an automaton obtained} fromA₀_{after several steps of equational completion with}_{R, E}_{. Then}ARisE-coherent and moreover, for all stateqofA_{, there exists}_s∈L₍AR, q) such thatL(A, q)⊆(R/E)∗(s). 37 Such an automaton is said to beR/E-coherent. The intuition behind this is the following: in the tree automaton,R-transitions represent rewriting steps and transitions ofARrecognize

(11)

are recognized into the same stateqinARthen they belong to the sameE-equivalence class. Otherwise, if at least oneR-transition is necessary to recognize, say,tintoqthen at least one step of rewriting was necessary to obtaintfroms. In [GR10], the following theorem made an assumption ofR/E-coherence forA₀_{, but, given that}A₀_{does not have any}_R_-transition,A₀_is

R/E-coherent if and only if it isE-coherent. Theorem 38 (Upper bound).

LetEbe a set of equations,A₀_a_E_{-coherent tree automaton and}_R_{a left-linear TRS. If}A_is an automaton produced fromA₀_{after several steps of equational completion with}_{R, E}_{, then}

L(A₎⊆₍_R/E₎∗(L₍A₀₎₎ 38

4 Termination of completion for functional programs

Now, we consider functional programs viewed as TRSs. We assume that such TRSs are left-linear, which is a common assumption on TRSs obtained from functional programs [BN98]. In this section, we will restrict ourselves to sufficiently complete constructor-based TRSs and will refer to them asfunctional TRSs. For the moment, types used in the functional program are not taken into account in the TRS, but they will be in a next section. First, we state a sufficient condition for completion to terminate and, next, we will specialize it for functional TRS. Remark. In this section, we will be interested in counting the states of A_/_{E, where} A _is produced by equational completion. As we remarked just after definition 32, dealing with A_/_E _{amounts to act as if applying an equation would e}ff_{ectively merge (or rename) states.} Therefore, we will assimilateA_/_E_withA_{and consider that states in relation modulo} E,

∗ A

are merged. Note that with this convention,AR(that is indeed (A/E)R) has no epsilon-transition. 4.1 Ensuring termination of completion

In this section, we show that ifT(Σ)/≡

E is finite then completion terminates by proving that completion produces no more new states thanE-equivalence classes. Doing so is not possible for an arbitraryEbecause equational completion does not take reflexivity of≡_E _{into account} and there may existq,q0such thats

A qandsA q 0

.2We will enrichEwith a set of reflexivity equations that solve this difficulty without altering≡_E_.

Definition 39 (Set of reflexivity equationsEr).

For a given set of symbolsΣ,Er={_f₍_x₁_{, . . . , x}_k_{) =}_f₍_x₁_{, . . . , x}_k₎|_k∈_N∧_f ∈_Σ_k}_. 39J Note that for all set of equationsE, the relation≡_E_{is the same as}≡_E_∪_Er. However, simpli-fication withEr transforms all automaton into an automatonA_{whose states coincide with} equivalence classes of≡_E_{and such that}ARis deterministic, as stated in the following lemmas.

Lemma 40.

LetA_{be some automaton,}_s₌_t_{be some equation of}_E _and_θ_:X →_Q_A_{. If}_sθ ∗

S_E₍A₎qand tθ ∗ S_E(A)q 0 , thenq ∗ S_E(A)q 0 orq0 ∗ S_E(A)q. If S

E(A) has no epsilon-transition, thenq=q 0

. 40

(12)

Proof.

This translates the fact that, by definition, there cannot be a situation of application for equations=tinS_E₍A_{), and we chose a representation of automata where states in relation} moduloE, ∗ A are merged. Lemma 41.

For all tree automatonA_{without epsilon-transition and all set of equations}_E_{, if}_E⊇_Er_, thenS

E(A) is deterministic. 41

Proof.

SinceA_{has no epsilon-transition, neither does}S

E(A). We prove this lemma by induction on the terms recognized byS_E₍A_{). Let}_a_{be a constant such that}_a ∗

S_E₍A₎qanda ∗ S_E₍A₎q 0 : by Lemma 40,q=q0. For the inductive case, lett=f(t1, . . . , tk) such thatt

∗ S_E₍A₎qandt ∗ S_E₍A₎q 0 . By induction hypothesis, for eachi∈

J1 ;kK, there exists a unique stateqisuch thatti

∗ S_E₍A₎qi. Hencef(q1, . . . , qk) ∗ S_E₍A₎q and f(q1, . . . , qk) ∗ S_E₍A₎q 0 . Since f(x1, . . . , xk) =f(x1, . . . , xk) ∈Er, by Lemma 40,q=q0. Corollary 42.

LetA_{be an automaton produced by some steps of equational completion with}_E⊇_Er_{. The}

automatonARis deterministic. 42

Definition 43 (SetE_Fc of contracting equations forF _).

LetF ⊆_Σ_{. The set of equations}_E_Fc _is_contracting_(forF_{) if its equations are of the form}

u=u|_pwithulinear andp,Λand if the set of normals forms ofT(F) w.r.t. TRS −−→ EFc = n u→_u_| p u=u|p∈E c F o is finite. 43J

Contracting equations define an upper bound on the number of states of a simplified automa-ton.

Lemma 44.

Simplification of a tree automatonA _{using a set} _E _{of equations such that} _E ⊇_Er∪_Ec Σ withE_Σc contracting ends up on an automatonS

E(A) having no more states than terms in Irr(

−−→

E_Σc ). 44

Proof.

Assume that no term of Irr(

−−→

Ec_Σ) is recognised by S_E₍A_{). Then, for all term} _s _{such that}

s ∗

S_E₍A₎q, we know thatsis not in normal form w.r.t. −−→

E_Σc . As a result, the left-hand side of an equation ofEc_Σcan be applied tos. This means that there exists an equationu=u|_p, a ground contextC and a substitutionθ such thats=C[uθ]. Furthermore, since s ∗

S_E₍A₎q, we know thatC[uθ] ∗

S_E₍A₎qand that there exists a stateq 0 such thatC[q0] ∗ S_E₍A₎qanduθ ∗ S_E₍A₎q 0 . From

(13)

uθ ∗ S_E₍A₎q

0

, we know that all substerms ofuθare recognised by at least one state inS_E₍A_). Thus, there exists a stateq00 such thatu|pθ

∗ S_E(A)q

00

. We thus have a situation of application of the equationu=u|_pin the automaton. SinceS_E(A) is simplified, we thus know thatq

0 =q00. As mentioned above, we know thatC[q0] ∗

S_E(A)q. HenceC[u|pθ] ∗ S_E(A)C[q 0 ] ∗ S_E(A)q. IfC[u|pθ]

is not in normal form w.r.t. −E−→c_Σ then we can do the same reasoning onC[u|_pθ] ∗

S_E₍A₎quntil getting a term that is in normal form w.r.t. −E−→_Σc and recognised by the same stateq. Thus, this contradicts the fact thatS_E₍A_{) recognises no term of I}rr(

−−→

E_Σc ) are disjoint. Then, by definition ofE_Σc, Irr(

−−→

E_Σc ) is finite. Let{_t₁_{, . . . , t}_n}_{be the subset of I}rr(

−−→

Ec_Σ) recognised byS_E₍A_{). Let}_q₁_{, . . . , q}_n_{be the states recognising}_t₁_{, . . . , t}_n_{respectively. We know that there is a} finite set of states recognisingt1, . . . , tnbecauseE⊇Erand Corollary 42 entails thatSE(A)R is deterministic. Now, for all terms recognised by a stateqinS_E₍A_{), i.e.} _s ∗

S_E₍A₎q, we can use a reasoning similar to the one carried out above and show thatqis equal to one state of {_q₁_{, . . . , q}_n}_{recognising normal forms of}

−−→

E_Σc inS_E₍A_{). Finally, there are at most}_card_(Irr(

−−→

E_Σc ))

states inS_E₍A_).

Now it is possible to state the Theorem guaranteeing the termination of completion if the set of equationsEcontainsErand a set of contracting equationsE_Σc.

Theorem 45.

LetA_{be a tree automaton,}_R_{a left linear TRS and}_E_{a set of equations. If}_E⊇_Er∪_Ec Σwith

Ec_Σcontracting then completion ofA_by_R_and_E_terminates. 45 Proof.

For completion to diverge it must produce infinitely many new states. This is impossible

with sets of equationE_Σc andEr as shown in Lemma 44.

Remark. Note that ifEcontainsEr and a set of contracting equations,≡_E_{is of finite index} (there are finitely many equivalence classes inT(Σ)/≡_E). However, finiteness ofT(Σ)/≡_E alone is

not enough to guarantee termination of equational completion as defined in Section 3. For instance, ifΣ={_{f , g, a}}_and_E₌{_f₍_x_{) =}_g₍_x₎_{, g}₍_x_{) =}_x}_then_T₍_Σ₎_/≡

Eis finite but completion of a tree automaton recognisingf(a) withf(x)→_f₍_f₍_x_{)) will not terminate because terms having}

gsymbols will not be recognised andg(x) =xwill not be applied.

For TRSs representing functional programs, defining contraction equations ofECc onCrather than onΣ is enough to guarantee termination of completion. This is more convenient and also closer to what is usually done in static analysis where abstractions are usually defined on data and not on function applications. Since the TRSs we consider are sufficiently complete, any term ofT(Σ) can be rewritten into a data-term ofT(C_{). As above, using equations of}_E_Cc _we are going to ensure that the data-terms of the computed languages will be recognised by a bounded set of states. To lift-up this property toT(Σ) it is enough to ensure that∀_{s, t}∈_T₍_Σ_{) if}

s→_R_t_then_s_and_t_{are recognised by equivalent states. This is the role of the set of equations}

(14)

Definition 46 (ER).

LetRbe a TRS, the set ofR-equations isER={l=r|l→r∈R}. 46J Theorem 47.

LetA_{be a tree automaton,}_R_{a su}ffi_{ciently complete left-linear TRS and}_E_{a set of equations.} IfE⊇_Er∪_Ec

C∪ERwithE c

Ccontracting then completion ofAbyRandEterminates. 47

Proof.

Firstly, we can use a reasoning similar to the one used in the proof of Lemma 44 to show that the number of states recognising terms ofT(C_{) are in finite number. Let}_G⊆_T₍C_{) be the} set of normal forms ofT(C_{) w.r.t.} −_E−→c

C. SinceE⊇E r_∪_Ec

C, like in the proof of Lemma 44, we can show that in any completed automaton, terms ofT(C_{) are recognised by no more states than} terms inG. Secondly, sinceRis sufficiently complete, for all terms∈_T₍_Σ₎\_T₍C_{) we know that} there exists a termt∈_T₍C_{) such that}_s→∗

Rt. The fact thatE⊇ER guarantees thatsandtwill be recognised by equivalent states in the completed (and simplified) automaton. Since the number of states necessary to recogniseT(C_{) is finite, so is the number of states necessary to}

recognise terms ofT(Σ).

4.2 Ensuring termination of completion for functional TRS with types

To exploit the types of the functional program, we now see Σ as a many-sorted signature whose set of sorts is S_{. Each symbol} _f ∈_Σ _{is associated to a profile} _f _: _S₁×_{. . .}×_S_k 7→_S where S1, . . . , Sk, S ∈ S and kis the arity off. Well-sorted terms are inductively defined as follows: f(t1, . . . , tk) is a well-sorted term of sort S if f : S1×. . .×Sk 7→S and t1, . . . , tk are well-sorted terms of sortsS1, . . . , Sk, respectively. We denote byT(Σ,X)

S

, T(Σ)S andT(C₎S the set of well-sorted terms, ground terms and constructor terms, respectively. Note that we haveT(Σ,X₎S ⊆_T₍_Σ,X_),_T₍_Σ₎S ⊆_T₍_Σ_{) and}_T₍C₎S ⊆_T₍C_{). We assume that}_R_and_E _are

sort preserving,i.e.that for all rulel→_r ∈_R_{and all equation}_u₌_v∈_E_,_{l, r, u, v}∈_T₍_Σ,X₎S_,_l andr have the same sort and so dou andv. Again, this assumption onRis natural ifRis the translation of a well-typed functional program. We still assume that the (sorted) TRS is sufficiently complete, which is defined in a similar way except that it holds only for well-sorted terms, i.e. for alls∈_T₍_Σ₎S _{there exists a term}_t∈_T₍C₎S _{such that}_s→∗

Rt. We slightly refine the definition of contracting equations as follows. For all sortS, ifS has a unique constant symbol we note itcS.

Definition 48 (SetEFc_,S of contracting equations forF andS).

LetF ⊆_Σ_{. The set of well-sorted equations}_E_Fc

,S iscontracting(forF) if its equations are of the form (a)u=u|_pwithulinear andp,Λ, or (b)u=cS withuof sortS, and if the set of normal forms ofT(F₎S _{w.r.t. the TRS}

−−−−→ EcF_,S = n u→_v u=v∈E c F_,S ∧(v=u|_p∨v=cS) o is finite. 48J

The termination theorem for completion of the sorted TRSs is close to the previous one except that that it takes into account the refined version of contracting equations and that it needsE-coherence of A₀_{. This is useful to ensure that terms recognised by completed} automata are well-sorted.

(15)

Theorem 49.

LetA₀_{be a tree automaton recognising well-sorted terms,}_R_{a su}ffi_{ciently complete} sort-preserving left-linear TRS andEa sort-preserving set of equations. IfE⊇_Er∪_E_Cc

,S∪ERwith

EcC_,S contracting andA0isE-coherent then completion ofA0byRandEterminates. 49 Proof.

LetA_{be any tree automaton obtained by completion of}A₀_by_R_and_E_{. As in Lemma 44,} from finiteness of the set normal forms ofT(C₎S _w.r.t. −−−→_E_Cc

,S, we can obtain finiteness of the set of states recognising terms ofT(C₎S _{in the completed automaton. The only slight di}ff_erence comes from rules of the formu =cS. If a terms ∈_T₍C₎S _{is not in normal form w.r.t.}

−−−→

ECc_,S because the ruleu →_cS _{applies then we have:} _s ₌_C_[_uσ_]∗

A q. Thus there exists a stateq 0

such thatuσ ∗ A q

0

. SincecS is the only constant of sortSand sinceuσ is of sortS, we know thatcS is necessarily a subterm ofuσ. Thus there exists a state q00 such that cS ∗

A q 00

and since completed automata are simplified,q0 =q00and finallyC[cS]∗

A q. As in Lemma 44, we can iterate the process until finding a normal form of −−−→ECc_,S. This entails the finiteness of the set of states recognising terms ofT(C₎S _inA_{. Then, as in the proof of Theorem 47 we can} use the fact thatE⊇_E_R _{to have that terms of}_T₍_Σ₎S _{are recognised in}A_{using a finite set of} states. What remains to be proved is thatA _{recognises only well-sorted terms, i.e. that it} recognises no term ofT(Σ)\_T₍_Σ₎S_{. This is true because}A

0isE-coherent, and by Theorem 38,

L(A₎⊆₍_R/E₎∗(L₍A₀_{)). Besides,}_R_and_E_{being sort-preserving, so is}_R/E_{. Thus, terms of}L₍A₎

are all well-sorted.

5 Experiments

All completions were performed usingTimbukandTimbukCEGAR[BBGL12]. All theIRR(R) au-tomata constructions and auau-tomata intersections were performed using Taml. All completion results have been certified by Coq [BC04] using the Coq-extracted completion checker [BGJ08]. All those tools are freely available onTimbukweb site [Tim12].

5.1 An introductory example

Ops append:2 rev:1 nil:0 cons:2 a:0 b:0 Vars X Y Z U Xs

TRS R

append(nil,X)->X rev(nil)->nil

append(cons(X,Y),Z)->cons(X,append(Y,Z)) rev(cons(X,Y))->append(rev(Y),cons(X,nil)) Automaton A0 States q0 qla qlb qnil qf qa qb Final States q0 Transitions

rev(qla)->q0 cons(qb,qnil)->qlb cons(qa,qla)->qla nil->qnil cons(qa,qlb)->qla a->qa cons(qb,qlb)->qlb b->qb Equations E Rules

append(nil,X)=X a=a b=b nil=nil cons(X,cons(Y,Z))=cons(Y,Z) append(cons(X,Y),Z)=cons(X,append(Y,Z)) cons(X,Y)=cons(X,Y)

rev(nil)=nil append(X,Y)=append(X,Y) rev(cons(X,Y))=append(rev(Y),cons(X,nil)) rev(X)=rev(X)

(16)

In this example, the TRSRencodes the classicalreverseandappendfunctions. The language recognised by automatonA₀_{is the set of terms of the form}_rev₍_l_a_{) where}_l_a _{can be any non} empty list of the form [a, a, . . . , b, b, . . .]. Note that there is at least oneaand onebin the list. We assume thatS ₌{_{T , list}} _{and sorts for symbols are the following:} _a_:_T_, _b_:_T_, _nil _:_list_,

cons:T×_list7→_list_,_append_:_list×_list7→_list_and_rev_:_list7→_list_{. Now, to use Theorem 49, we} need to prove each of its assumptions. The setEof equations containsER,ErandECc_,S. The set of EquationsECc_,S is contracting because the automatonIRR(

−−−→

ECc_,S) recognises a finite language. This automaton can be computed using Taml: it is the intersection between the automaton A_T₍_C₎S3recognisingT(C)S and the automatonIRR({cons(X, cons(Y , Z))→cons(Y , Z)}):

States q2 q1 q0 Final States q0 q1 q2 Transitions b->q2 a->q2 nil->q1 cons(q2,q1)->q0

It is easy to see thatEandRare sort preserving and thatA₀_{recognises well-sorted terms.} We can also prove sufficient completeness ofRonL(A0) using, for instance, Maude [CDE+09] or evenTimbuk[Gen98] itself. The last assumption of Theorem 49 to prove is thatA₀_is_E -coherent. This can be shown by remarking that each stateA₀_{recognises at least one term and} that for all stateqsuch thats∗

A₀qandt ∗

A₀qthens

≡_E_t_{. For instance}_cons₍_{b, cons}₍_{b, nil}₎₎∗ A₀ qlb andcons(b, nil)∗

A₀ qlbandcons(b, cons(b, nil))

≡_E_cons₍_{b, nil}_{). Thus, completion is guaranteed} to terminate: after 4 completion steps (7 ms) we obtain the following fixpoint automatonA_∗_:

States q5 q7 q8 q11 Final States q11 Transitions

b -> q8 nil -> q5 rev(q5) -> q5 append(q5,q11) -> q11 append(q11,q11) -> q11 cons(q7,q5) -> q11 cons(q7,q11) -> q11 cons(q8,q5) -> q11 cons(q8,q11) -> q11 rev(q11) -> q11 a -> q7

To restrain the language to normal forms, it is necessary to compute the intersection with Irr(R).Since we are dealing with sufficiently complete TRSs, we know that Irr(R)⊆T(C)S.

Thus, we can use again the tree automatonA

T(C)S. If we compute the intersection ofA_∗with

the automatonA

T(C₎S we obtain the automaton: States q3 q2 q1 q0 Final States q3 Transitions

a->q0 nil->q1 b->q2 cons(q0,q1)->q3 cons(q0,q3)->q3 cons(q2,q1)->q3 cons(q2,q3)->q3

which recognises any (non empty) flat list ofaandb. Thus, our analysis preserved the property that the result cannot be the empty list but lost the order of the elements in the list. This is not surprising if we have a closer look to our set of equationsE. In particular, the equation

cons(X, cons(Y , Z)) =cons(X, Z) makescons(a, cons(b, nil)) equal tocons(a, nil). It is possible to refine by handECc_,S as follows:

cons(a, cons(a,X))=cons(a,X) cons(b,cons(b,X))=cons(b,X) cons(a,cons(b,cons(a,X)))=cons(a, X)

This set of equations avoids the previous problem. Again E verifies the conditions of Theorem 49 and completion is thus guaranteed to terminate. The result is the automaton:

3_{Such an automaton can be automatically defined. It has one state per sort and one transitions per constructor. For}

instance, on our exampleA

(17)

States q0 q1 q5 q7 q8 q11 q17 Final States q0 Transitions

nil -> q5 rev(q5) -> q5 append(q5,q11) -> q11 append(q11,q11) -> q11 cons(q8,q5) -> q11 cons(q8,q11) -> q11 rev(q11) -> q11 append(q5,q17) -> q17 append(q17,q17) -> q17 cons(q7,q5) -> q17 cons(q7,q17) -> q17 cons(q7,q1) -> q1 cons(q7,q11) -> q1 b -> q8 append(q0,q17) -> q0 append(q11,q17) -> q0 cons(q8,q0) -> q0 cons(q8,q17) -> q0 rev(q1) -> q0 a -> q7

This time, intersection withA

T(C₎S gives: States q4 q3 q2 q1 q0 Final States q4 Transitions

a->q1 b->q3 nil->q0 cons(q1,q0)->q2 cons(q1,q2)->q2 cons(q3,q2)->q4 cons(q3,q4)->q4

This automaton exactly recognises lists of the form [b, b, . . . , a, a, . . .] with at least oneband one

a, as expected. However, equation tuning by hand is difficult. Hopefully, this can be automated using a tree automata completion equipped with a counter-example guided approximation refinement (a.k.a. CEGAR) [BBGL12]. This is what we do in the following example.

5.2 Higher-order function example

Here we show how to take higher-order functions into account and the benefit of a CEGAR for approximation refinement. We choose to illustrate the two aspects on an example taken from [OR11]. The encoding of higher-order functions into first order terms is borrowed from [Jon87]: defined symbols become constants, constructor symbols remain the same, and an additionalapplicationoperator ’@’ of arity 2 is introduced. For instance, the functionnz

testing if a natural is different from 0 and the higher-order functionf ilter filtering out all elements that do not satisfy a predicate are represented by the following TRS:

@(nz, zero)→_{f alse} @(nz, s(X))→_true @(@(f ilter, X), nil)→_nil

@(@(f ilter, X), cons(Y , Z))→_if_(@(_{X, Y}₎_{, cons}₍_{Y ,}_@(@(_{f ilter, X}₎_{, Z}₎₎_,_@(@(_{f ilter, X}₎_{, Z}₎₎

if(true, X, Y)→_X

if(f alse, X, Y)→_Y

The objective of [OR11] is to compute an approximation of all possible results of the function call (f ilter nz l) wherelis any list of naturals. To respect the presentation used in [OR11], we also give the initial set of terms as a grammar generating terms of the form (f ilter nz l) withl

any list of naturals. As a result, the initial automaton recognises only a non terminal symbol genSand the TRS contains the rules used to generate the language. The correspondingTimbuk

specification is the following, whereappstands for @.

Ops app:2 filter:0 zero:0 s:1 nz:0 nil:0 cons:2 if:3 true:0 false:0 genl:1 Vars F X Y Z U Xs X2 X3 Y2 Z2

TRS R

app(nz,zero) -> false app(nz,s(X)) -> true

if(true,X,Y) -> X if(false,X,Y) -> Y app(app(filter,X),nil) -> nil

app(app(filter,X),cons(Y,Z)) -> if(app(X,Y),cons(Y,app(app(filter,X),Z)),app(app(filter,X),Z)) genS -> app(app(filter,nz),genl(zero)) genl(X) -> cons(X,genl(zero))

genl(X) -> nil genl(X) -> genl(s(X)) Automaton A0 States qf Final States qf Transitions genS -> qf

(18)

Equations E Rules

if(true,X,Y)=X app(X,Y)=app(X,Y) s(X)=zero

if(false,X,Y)=Y filter=filter cons(X,Y)=Y

app(nz,zero)=false zero=zero s(X)=s(X)

app(nz,s(X))=true nz=nz nil=nil

app(app(filter,X),nil)=nil cons(X,Y)=cons(X,Y) app(app(filter,X),cons(Y,Z))= if(X,Y,Z)=if(X,Y,Z)

if(app(X,Y2),cons(Y,app(app(filter,X2),Z)), true=true false=false app(app(filter,X3),Z2)) genl(X)=genl(X) Automaton Bad States ql ql0 q1 qz qnil Final States ql0 Transitions

cons(qz,ql)->ql0 cons(q1,ql)->ql cons(q1,qnil)->ql cons(q1,ql0)->ql0 s(qz)->q1 cons(qz,ql0)->ql0 cons(qz,qnil)->ql0 nil->qnil s(q1)->q1 zero->qz Automaton ATCS States qnat qlist Final States qnat qlist Transitions

zero->qnat s(qnat)->qnat nil->qlist cons(qnat,qlist)->qlist

As in previous example, we can check thatR,EandA₀_{respect the assumptions of} Theo-rem 49. Note that contracting equations ofECc_,S are very drastic: the effect of the equation

s(X) =zerois to merge all naturals together and the effect of cons(X, Y) =Y is to scramble all lists together. TheBadautomaton defines the language of undesirable terms: any list of naturals where there is at least one 0. As soon as automatic refinement is used, termination is no longer guaranteed. This is a common limitation of all CEGAR techniques like in [OR11]. However, within 58msTimbukCEGARachieves 8 completion steps, 5 refinement steps and produces a completed tree automaton that have no term in common withL(Bad). Its product with the automatonA_T₍_C₎S gives the automaton recognising lists of naturals strictly greater to

0, which is the expected result:

States q3 Final States q3 Transitions

nil->q2 nil->q3 zero->q0 s(q0)->q1 s(q1)->q1 cons(q1,q3)->q3 cons(q1,q2)->q3

This automaton recognizes lists of naturals strictly greater than 0 which is the expected language of normal forms of the higher-order call (f ilter nz l) with l any list of naturals. Example 8 of [OR11] can be solved in the same way.

Ops app:2 map2:0 kzero:0 kone:0 one:0 zero:0 nil:0 cons:2 genS:0 Vars F X Y Z U X2

TRS R1

app(app(app(map2, X), Y), nil) -> nil app(app(app(map2, X), Y), cons(Z, U)) ->

cons(app(X, Z), app(app(app(map2, Y), X), U)) app(kzero, X) -> zero

app(kone, X) -> one genS -> nil

genS -> cons(zero,genS) Set A0

app(app(app(map2, kzero), kone),genS) Automaton Bad

(19)

States qf q0 q1 qok0 qok1 qnil Final States qf

Transitions

cons(q1,qok1) -> qf cons(q0,qok0) -> qf cons(q0,qf) -> qf cons(q1,qf) -> qf cons(q0,qok1) -> qok0 cons(q1,qok0) -> qok1 cons(q0,qnil) -> qok0 cons(q1,qnil) -> qok1 nil -> qnil zero -> q0 one -> q1

Equations Simple Rules

app(app(app(map2, X), Y), nil) = nil app(X,Y)=app(X,Y) cons(X,Y)=Y app(app(app(map2, X), Y), cons(Z, U)) = zero=zero

cons(app(X, Z), app(app(app(map2, Y), X2), U)) one=one

app(kzero, X) = zero nil=nil

app(kone, X) = one cons(X,Y)=cons(X,Y) kone=kone

kzero=kzero

On this example,timbukCEGARachieves 4 completion steps and 2 refinement steps and gives in 36ms the following tree automaton with 24 transitions:

Automaton Acomp

States q0 q1 q2 q3 q4 q5 q6 q7 q8 q9 q10 q11 q12 q13 q14 q15 q16 q17 q18 Final States q6

Transitions

cons(q10,q16) -> q14 app(q11,q1) -> q12 app(q0,q3) -> q11 app(q1,q8) -> q10 genS -> q9 cons(q8,q9) -> q9 zero -> q8 nil -> q7

app(q4,q9) -> q6 app(q2,q3) -> q4 kone -> q3 app(q0,q1) -> q2 one -> q17 app(q3,q8) -> q17 kzero -> q1 app(q12,q9) -> q16 cons(q17,q6) -> q16 map2 -> q0 q14 -> q6 q10 -> q8

q8 -> q10 q7 -> q16 q7 -> q9 q7 -> q6

Again, the intersection of this automaton withA_T₍_C₎_{results in the automaton:}

States q5:0 q4:0 q3:0 q2:0 q1:0 q0:0 Final States q5

Transitions

nil -> q1 nil -> q4 nil -> q5 one -> q2 zero -> q3 cons(q3,q1) -> q5 cons(q3,q1) -> q0 cons(q3,q4) -> q0 cons(q2,q0) -> q4 cons(q2,q1) -> q4 cons(q2,q5) -> q4 cons(q3,q4) -> q5

This automaton recognizes lists of the form[zero,one,zero,one...], which is the expected result.

6 Adaptation to innermost strategies

6.1 Introduction

The classical equational completion procedure with a left-linear TRSRproduces a correct over-approximaton of R∗(L0) whenever it terminates. AsR

∗

in(L0)⊆R

∗

(L0), this is a correct

over-approximation ofR∗_in(L0) as well. Still, we would like to refine this procedure to deal more

precisely withRin. Indeed, there are some critical pairs that we would not want to complete