Computing the Least Fix-point Semantics of Definite Logic Programs Using BDDs

Frédéric Besson, Thomas Jensen, Tiphaine Turpin

To cite this version:

Frédéric Besson, Thomas Jensen, Tiphaine Turpin. Computing the Least Fix-point Semantics of Definite Logic Programs Using BDDs. [Research Report] PI 1939, 2009, pp.25. <inria-00433820v2>

HAL Id: inria-00433820

https://hal.inria.fr/inria-00433820v2

Submitted on 24 Nov 2009


PI 1939 – November 23, 2009

Computing the Least Fix-point Semantics of Definite Logic Programs Using BDDs

Frédéric Besson*, Thomas Jensen, Tiphaine Turpin*

Abstract: We present the semantic foundations for computing the least fix-point semantics of definite logic programs using only standard operations over boolean functions. More precisely, we propose a representation of sets of first-order terms by boolean functions and a provably sound formulation of intersection, union, and projection (an operation similar to restriction in relational databases) using conjunction, disjunction, and existential quantification. We report on a prototype implementation of a logic solver using Binary Decision Diagrams (BDDs) to represent boolean functions and compute the above-mentioned three operations. This work paves the way for efficient solvers for particular classes of logic programs, e.g., static program analyses, which leverage BDD technologies to factorise similarities in the solution space.

Key-words: Semantics, binary decision diagrams, logic programs


* INRIA Rennes - Bretagne Atlantique
** IRISA/CNRS

1 Introduction

Since their introduction by Bryant, Binary Decision Diagrams (BDDs) [7] have proved to be a very efficient data structure for representing large sets and relations, provided that sufficient similarities exist which can be factorised. They have been successfully used in various areas of computer science as a way to obtain scalability. Success stories include hardware verification [9], symbolic model checking [8], and software model checking [2].

However, BDDs are a low-level data type, and expressing the solution of a problem in terms of BDDs requires considerable work. Hence, higher-level languages have been proposed, which make the specification of a problem easier and can be compiled to BDD operations. Examples include the Ever language [17], the work of Iwaihara and Inoue [18], the Crocopat system [5], and bddbddb [25], which all use relations over finite domains as their basic data structure. In particular, the input language for bddbddb is Datalog, which is basically the subset of Prolog obtained by excluding all first-order terms except constants and variables.

The work presented here is, to the best of our knowledge, the first attempt to introduce first-order terms in a BDD-based logic solver, thus obtaining the expressive power of Prolog. More precisely, the contribution of this report consists in the algorithmic and semantic foundations for such a solver, i.e., a boolean function-based operational semantics which computes the least fix-point of pure definite (i.e., without negation) logic programs.

Overview. The first step toward this goal will be the identification of three basic operations on sets of terms, namely, union, intersection, and projection (analogous to a restriction), which allow the expression of the least fix-point semantics of logic programs. Then we will address the computation of those operations on a boolean function-based representation of sets of terms. Intuitively, the set of instances of a single term will be represented by a conjunction of atoms which describe the function symbols and variable equalities defining this term. Accordingly, a set of terms (which, in our case, is always the set of instances of a finite number of terms) is represented by a disjunction of such conjunctions (i.e., a formula in Disjunctive Normal Form). Therefore, the union of sets of terms is naturally implemented by a disjunction of boolean formulas. The intersection is also correctly implemented by a conjunction, but in an implicit and "lazy" way, as a constructive implementation of the intersection of the instances of two finite sets of terms amounts to unifying the terms pairwise. Then, we will show that the projection of a set of terms with respect to a given path (which identifies a position in the term) can be computed by performing an existential quantification with respect to the set of atoms which are about this path, but this is only sound for a DNF formula whose disjuncts satisfy a particular closure property (mainly, the absence of implied atoms). This hypothesis is met by applying an appropriate closure procedure to the result of conjunction operations, which in fact implements the actual unification process associated with the intersection of sets of terms.

The interest of performing the unification using formulas is that we have a representation of those formulas (namely, BDDs) whose size is not related to the number of disjuncts of the DNF (which correspond to the most general terms satisfying the formula). Therefore, the complexity of unifying two formulas in a single process can be much lower than that of unifying all the corresponding pairs of terms independently.

The most challenging difficulty in this work is the treatment of the term equalities implicitly represented by multiple occurrences of a variable. This is what makes the elaborate closure property necessary, and the soundness proof for the projection operation subtle. We solve this problem in the general case, in particular without requiring that the programs be range-restricted.

Organisation of the report. The definition of the least fix-point semantics of logic programs and its expression with set operations are described in Section 2. Section 3 defines precisely the boolean function-based encoding of sets of terms. Section 4 is dedicated to the implementation of the projection operation on closed boolean functions and its soundness proof. Section 5 describes the closure procedure which provides an implementation of intersection that is compatible with that of projection. Section 6 discusses various optimisations to the formal development presented here that we have used in a prototype implementation, and briefly discusses preliminary experimental results. Section 7 mentions related topics with respect to the representation of sets of terms. In the conclusion (Section 8) we set our work back in the context of static program analysis and detail possible further work.

2 Evaluating logic programs using set operations

We briefly recall the definition of the least fix-point semantics of logic programs, and we show how to express the immediate consequence operator $T_c$ for a clause using basic set operations.

2.1 Least fix-point semantics

The least fix-point, or bottom-up, semantics [19] of a logic program is defined as the set of facts that are recursively deducible from its clauses. It is computed by repeatedly applying the immediate consequence operator associated with each clause, which is defined as follows. Let $c$ be a clause $h \mathrel{:-} b_1, \dots, b_n$. We assume that $h$ and $b_1, \dots, b_n$ are terms (in other words, predicate names are assumed to be particular function symbols), thus the semantics of the program is just a set of terms. The immediate consequences of $c$ for a set $S$ of terms are given by the operator $T_c$ defined by:

$$T_c(S) = \{h\sigma \mid \forall i \le n,\ b_i\sigma \in S\}.$$
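As an illustration only, the operator $T_c$ and its iteration can be sketched in Python on finite sets of ground terms. The tuple encoding of terms and the names `match`, `apply`, `immediate_consequences` and `least_fixpoint` are assumptions made for this sketch and do not come from the report. The sketch enumerates facts explicitly, does not use the boolean-function representation developed in the following sections, and assumes that every variable of a clause head also occurs in its body.

```python
from itertools import product

# Hypothetical encoding: a variable is a str; a compound term is a tuple
# (symbol, arg1, ..., argn); a constant is a 1-tuple such as ("a",).

def match(pattern, term, subst):
    """Extend subst so that pattern under subst equals the ground term, or return None."""
    if isinstance(pattern, str):                      # pattern is a variable
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
        subst = match(sub_pattern, sub_term, subst)
        if subst is None:
            return None
    return subst

def apply(term, subst):
    """Apply a substitution to a term (every variable must be bound)."""
    if isinstance(term, str):
        return subst[term]
    return (term[0],) + tuple(apply(arg, subst) for arg in term[1:])

def immediate_consequences(head, body, facts):
    """T_c(S) for the clause head :- body, restricted to a finite set of ground facts."""
    derived = set()
    for choice in product(facts, repeat=len(body)):
        subst = {}
        for premise, fact in zip(body, choice):
            subst = match(premise, fact, subst)
            if subst is None:
                break
        else:                                         # every premise matched
            derived.add(apply(head, subst))
    return derived

def least_fixpoint(clauses):
    """Iterate the union of the T_c operators until no new fact appears."""
    facts = set()
    while True:
        new = set().union(*(immediate_consequences(h, b, facts) for h, b in clauses)) - facts
        if not new:
            return facts
        facts |= new
```

For instance, with the clauses `edge(a,b)`, `edge(b,c)`, `path(x,y) :- edge(x,y)` and `path(x,z) :- edge(x,y), path(y,z)`, each given as a `(head, body)` pair, `least_fixpoint` returns the two `edge` facts together with the derivable `path` facts.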

2.2 Computing $T_c$ with set operations

The computation of $T_c(S)$ involves three steps. First, for each premise $b_i$, we compute the set of matching substitutions $\{\sigma \mid b_i\sigma \in S\}$. Each substitution $\sigma$ is represented by a term $\mathit{subst}(\sigma(x_1), \dots, \sigma(x_p))$ where $x_1, \dots, x_p$ are the variables appearing in clause $c$, and $\mathit{subst}$ is a particular function symbol. Then we intersect those sets. Finally, we apply the resulting set of substitutions to the head. For the first and last steps, we rely on special terms which represent the notion of instantiation: for each term $u$, we consider the term $\langle u, \mathit{subst}(x_1, \dots, x_p)\rangle$ (where $\langle\cdot,\cdot\rangle$ is a particular function symbol). This term has the property that the set of its instances represents the set of pairs $\{\langle u\sigma, \sigma\rangle\}$. Given this property, the set of matching substitutions for a premise $b_i$ can be computed as:

$$\{\sigma \mid b_i\sigma \in S\} = \pi_{snd}\big(\mathrm{inst}(\langle b_i, \mathit{subst}(x_1, \dots, x_p)\rangle) \cap (S \times \mathcal{T}(\mathcal{F},\mathcal{X}))\big).$$

In this expression, $\pi_{snd}$ represents the projection of a set of pairs onto their second components, the function $\mathrm{inst}$ returns the set of instances of a term, and $\mathcal{T}(\mathcal{F},\mathcal{X})$ is the set of all terms. Similarly, the application of a set of substitutions $S$ to the head is given by:

$$\{h\sigma \mid \sigma \in S\} = \pi_{fst}\big(\mathrm{inst}(\langle h, \mathit{subst}(x_1, \dots, x_p)\rangle) \cap (\mathcal{T}(\mathcal{F},\mathcal{X}) \times S)\big).$$

Thus, we obtain an implementation of the operator $T_c$ using the operations $\cap$, $\pi_{fst}$, $\pi_{snd}$, $\times$, and $\mathrm{inst}$. With the addition of $\cup$, this allows us to express the least fix-point semantics of a logic program. In the remainder of this report, we will show how to compute those operations on a boolean formula-based representation of sets of terms. As a minor variation, instead of $\times$, $\pi_{fst}$, and $\pi_{snd}$, our formalisation will use an existential quantification operator $\exists p$ (where $p$ is a path such as $fst$ or $snd$) which plays the same role, and is informally defined by $\exists snd\ S = \pi_{fst}(S) \times \mathcal{T}(\mathcal{F},\mathcal{X})$ (and the other way around).

3 Representing sets of terms with boolean functions

In this section we define a representation of sets of first-order terms by boolean functions, and show that the union and intersection of sets of terms, as well as the set of instances of a single term, can be expressed on those boolean functions.

3.1 Terms and boolean functions

We first recall standard definitions about first-order terms and boolean functions.

3.1.1 First-order terms

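Let $\mathcal{F}$ be a ranked alphabet and $\mathcal{X}$ a set of variables. We write $\mathcal{T}(\mathcal{F},\mathcal{X})$ for the set of first-order terms on $\mathcal{F}$ and $\mathcal{X}$. The application of a substitution $\sigma$ to a term $u$ is denoted by $u\sigma$, and the substitution which maps the variable $x$ to the term $u$ is denoted by $[u/x]$. For all $S \subseteq \mathcal{T}(\mathcal{F},\mathcal{X})$, we define $\mathrm{inst}(S) \subseteq \mathcal{T}(\mathcal{F},\mathcal{X})$ as the set of instances of the terms of $S$:

$$\mathrm{inst}(S) = \{u\sigma \mid u \in S,\ \sigma \in \mathcal{X} \to \mathcal{T}(\mathcal{F},\mathcal{X})\}$$

and we let $\mathrm{inst}(u) = \mathrm{inst}(\{u\})$.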

3.1.2 Boolean functions

Let $\mathcal{A}$ be a set of boolean variables. We call formula a boolean function over $\mathcal{A}$ which only depends on a finite subset of $\mathcal{A}$. We write $\mathcal{B}(\mathcal{A})$ for the set of formulas on $\mathcal{A}$:

$$\mathcal{B}(\mathcal{A}) = \{\varphi : (\mathcal{A} \to \mathrm{boolean}) \to \mathrm{boolean} \mid |\mathrm{support}(\varphi)| < \infty\}$$

where the support of a boolean function is defined in the obvious way, i.e., as the set of variables on which its value actually depends.

Elements of $\mathcal{B}(\mathcal{A})$ can be denoted by finite propositional formulas over $\mathcal{A}$, which are interpreted up to propositional equivalence. In particular, we will sometimes assume that formulas are in disjunctive normal form, which we define as a disjunction of non-false conjunctions of possibly negated atoms (minimality of the disjuncts is not required).

For a formula $\varphi \in \mathcal{B}(\mathcal{A})$, a valuation $v : \mathcal{A} \to \mathrm{boolean}$, and a positive atom $a \in \mathcal{A}$, we write $v \in \varphi$ if $\varphi(v)$ is true and $a \in v$ if $v(a)$ is true.

3.2 Paths, term formulas and their interpretation

Our aim is to represent sets of terms by boolean functions. The core idea is to use formulas whose atoms describe the function symbols that are reached by following each possible path in the terms. Additional equality atoms will also be required to represent variable equality in terms.

Example 1 Consider the term $u = f(x, g(x))$. Then $u$ will be represented by the formula

$$\epsilon \mapsto f \ \wedge\ f.2 \mapsto g \ \wedge\ f.1 = f.2.g.1.$$

3.2.1 Paths

To identify the position of the sub-terms of (sets of) terms, we use a notion of path. A path is an alternating sequence of function symbols and integers where each integer is the index of an argument of the preceding function symbol. We write $\mathcal{P}$ for the set of paths:

$$\mathcal{P} = \{f_1.p_1.\,\cdots\,.f_n.p_n \mid n \ge 0,\ \forall i \le n,\ f_i \in \mathcal{F} \wedge 1 \le p_i \le \mathrm{arity}(f_i)\}.$$

The empty path is denoted by $\epsilon$. We write $u(p)$ for the sub-term of $u$ obtained by following the path $p$, if any, and $p \in u$ if $u(p)$ is defined. We also write $u[u'/p]$ for the term $u$ where the sub-term reached by path $p$ has been replaced by $u'$. The notation $C_p(u) = f$ expresses that $u(p)$ is defined, is not a variable, and that its top-most function symbol is $f$.

Example 2 For the term $u = f(x, g(x))$ of Example 1, the paths which are defined in $u$ are $\epsilon$, $f.1$, $f.2$, $f.2.g.1$, and we have for example $u(f.1) = x$ and $C_{f.2}(u) = g$.

3.2.2 Formulas

We consider the following set $\mathcal{A}$ of positive atoms $a$:

$$\mathcal{A} \ni a ::= p \mapsto f \mid p = p'$$

where $p, p' \in \mathcal{P}$ and $f \in \mathcal{F}$. Equality atoms are defined up to commutation of their parameters, so that $p = p'$ and $p' = p$ are the same positive atom. Intuitively, the first kind of atoms match the outer-most function symbol of a sub-term, while the second kind force the equality of two sub-terms. In the following, we always consider formulas on $\mathcal{A}$, and we write $\mathcal{B}$ for the set of formulas (i.e., $\mathcal{B} = \mathcal{B}(\mathcal{A})$).

3.2.3 Meaning

Formulas are interpreted as sets of terms in the following way. The meaning of positive atoms is given by a satisfaction relation $\models$ between terms and atoms defined as follows:

$$\begin{array}{lcl}
u \models p \mapsto f & \text{iff} & p \in u \ \wedge\ C_p(u) = f \\
u \models p = p' & \text{iff} & p \in u \ \wedge\ p' \in u \ \wedge\ u(p) = u(p').
\end{array}$$

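As formulas are defined as boolean functions, the meaning of formulas is first defined in terms of valuations. A more intuitive expression is provided by Lemma 1. The meaning of a formula $\varphi$ is defined as:

$$[\![\varphi]\!] = \{u \in \mathcal{T}(\mathcal{F},\mathcal{X}) \mid \exists v \in \varphi,\ \forall a \in \mathcal{A},\ u \models a \iff a \in v\}.$$

Lemma 1 For any positive atom $a$, and any formulas $\varphi$ and $\varphi'$, the following holds:

$$\begin{array}{rcl}
[\![a]\!] & = & \{u \in \mathcal{T}(\mathcal{F},\mathcal{X}) \mid u \models a\} \\
[\![\neg\varphi]\!] & = & \mathcal{T}(\mathcal{F},\mathcal{X}) \setminus [\![\varphi]\!] \\
[\![\varphi \wedge \varphi']\!] & = & [\![\varphi]\!] \cap [\![\varphi']\!] \\
[\![\varphi \vee \varphi']\!] & = & [\![\varphi]\!] \cup [\![\varphi']\!].
\end{array}$$

Proof 1 This follows from the definitions.

This lemma shows that the usual set operations on sets of terms may be expressed straightforwardly on the formula representation using their logical equivalents.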

3.2.4 Representing the instances of a single term

The set of instances of a single term can be represented by a formula, as shown by Lemma 2. This implies that boolean functions are expressive enough to represent any set of terms obtained as the set of instances of a finite number of terms.

Lemma 2 Let $u$ be a term. Define $\varphi_u$ as

$$\varphi_u = \bigwedge_{C_p(u) = f} p \mapsto f \ \ \wedge \bigwedge_{\substack{p \neq p' \\ u(p) \in \mathcal{X} \\ u(p) = u(p')}} p = p'.$$

Then $[\![\varphi_u]\!] = \mathrm{inst}(u)$.

Proof 2 This follows from the definitions.

Remark: The formula of Example 1 corresponds precisely to $\varphi_u$.
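The construction of $\varphi_u$ in Lemma 2 can be sketched as a traversal of the term that collects one function-symbol atom per non-variable position and one equality atom per pair of distinct positions holding the same variable. The encoding and the name `term_formula` are hypothetical; equality atoms are stored as unordered pairs to reflect that they are defined up to commutation.

```python
from itertools import combinations

# Hypothetical encoding: a variable is a str; a compound term is a tuple
# (symbol, arg1, ..., argn); a path is a tuple of (symbol, index) steps.
# Atoms: ("sym", p, f) for "p maps to f", ("eq", frozenset({p, q})) for "p = q".

def term_formula(u):
    """Return the set of atoms whose conjunction is the formula of Lemma 2."""
    symbols = set()
    variable_paths = {}            # variable name -> paths where it occurs

    def walk(t, path):
        if isinstance(t, str):
            variable_paths.setdefault(t, []).append(path)
        else:
            symbols.add(("sym", path, t[0]))
            for index, arg in enumerate(t[1:], start=1):
                walk(arg, path + ((t[0], index),))

    walk(u, ())
    equalities = {("eq", frozenset((p, q)))
                  for paths in variable_paths.values()
                  for p, q in combinations(paths, 2)}
    return symbols | equalities

# The term f(x, g(x)) of Example 1.
example_formula = term_formula(("f", "x", ("g", "x")))
```

For the term of Example 1 the result contains exactly three atoms, which correspond to the three conjuncts of the formula given in that example.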

4 Projecting sets of terms

This section defines the projection of a set of terms with respect to a path and shows that it can be computed by an existential quantification of a formula, provided an additional closure assumption holds on this formula.

4.1 Introducing the projection

The purpose of the projection operator is to abstract a set of terms with respect to a given path. This is very similar to the projection operations $\pi_{fst}$ and $\pi_{snd}$ described in Section 2, except that the sub-terms under the projected path are not removed but replaced by every possible term.

4.1.1 Projection of a set of terms

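The projection $\exists p\ u$ of a term $u$ with respect to a path $p$ is defined as the set of terms obtained by substituting any term in $\mathcal{T}(\mathcal{F},\mathcal{X})$ for the path $p$ (if it appears) in $u$. Formally:

$$\exists p\ u = \begin{cases} \{u[u'/p] \mid u' \in \mathcal{T}(\mathcal{F},\mathcal{X})\} & \text{if } p \in u \\ \{u\} & \text{otherwise.} \end{cases}$$

The projection is extended to sets of terms in the obvious way, by taking the union of the projections of the elements:

$$\exists p\ S = \bigcup_{u \in S} \exists p\ u.$$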

4.1.2 The projection problem

The intuition for implementing the projection on formulas is the same as for the other operations: compute the projection on a path $p$ by projecting (i.e., existentially quantifying) on the atoms which involve $p$ or an extension of $p$. Formally, the set $\mathcal{A}_p$ of atoms involving a path $p$ is defined by

$$\mathcal{A}_p = \{p.p' \mapsto f \in \mathcal{A}\} \cup \{p.p' = p'' \in \mathcal{A}\}.$$

However, the formulas in $\mathcal{B}$ are too general to allow this simple translation, and thus need to be refined. Example 3 illustrates the projection problem.

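Example 3 Consider the ranked alphabet $\mathcal{F} = \{f/3\}$ and the two formulas $\varphi = \epsilon \mapsto f \wedge f.1 = f.2 \wedge f.2 = f.3$ and $\psi = \varphi \wedge f.1 = f.3$. We have

$$[\![\varphi]\!] = [\![\psi]\!] = \{f(u, u, u) \mid u \in \mathcal{T}(\mathcal{F},\mathcal{X})\}.$$

Now suppose that we want to project this set of terms on the path $f.2$. We get

$$\exists f.2\ [\![\varphi]\!] = \{f(u, u', u) \mid u, u' \in \mathcal{T}(\mathcal{F},\mathcal{X})\}.$$

On the formula side, we proceed by quantifying existentially with respect to $\mathcal{A}_{f.2} = \{f.2 = f.1,\ f.2 = f.3,\ \dots\}$, and we obtain $\exists \mathcal{A}_{f.2}\ \varphi = \epsilon \mapsto f$ and $\exists \mathcal{A}_{f.2}\ \psi = \epsilon \mapsto f \wedge f.1 = f.3$. We observe that

$$[\![\exists \mathcal{A}_{f.2}\ \varphi]\!] = \{f(u, u', u'') \mid u, u', u'' \in \mathcal{T}(\mathcal{F},\mathcal{X})\}$$

and thus $[\![\exists \mathcal{A}_{f.2}\ \varphi]\!] \neq \exists f.2\ [\![\varphi]\!]$, but $[\![\exists \mathcal{A}_{f.2}\ \psi]\!] = \exists f.2\ [\![\psi]\!]$. The reason is that the atom $f.1 = f.3$, which is implied in $\varphi$, is not explicitly present in $\varphi$ and is therefore lost by the quantification, whereas it is explicit in $\psi$.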

Given this example we can make two remarks, which sketch the main theoretical results of this work. First, we can apply the existential quantification to a conjunctive formula to compute the projection provided the formula is "closed" with respect to implicit atoms. This is formalised in the remainder of this section, in two steps: after defining the notion of closed conjunction in Section 4.2, we state in Section 4.3 a key result giving a constructive expression of the meaning of closed conjunctions; then this result is used in Section 4.4 to prove the soundness of existential quantification-based projection for (disjunctions of) closed conjunctions. Second, the conjunction operator on formulas, used to implement intersection, does not preserve the closure property (the formula $\varphi$ can be obtained trivially as the conjunction of the two closed conjunctions $\epsilon \mapsto f \wedge f.1 = f.2$ and $\epsilon \mapsto f \wedge f.2 = f.3$). Therefore, closure must be restored after each conjunction operation. This constitutes the main topic of Section 5, with Section 5.3 gathering those results in a boolean function-based domain which is closed under union, intersection, and projection.

4.2 Closed conjunctions

The notion of closed conjunctions is defined by a collection of constraints, which are presented in turn.

4.2.1 Conjunctions

In the following, we will call conjunction a non-false formula which can be expressed as a conjunction of (possibly negated) atoms. This representation is obviously unique up to symmetry (assuming distinct conjuncts), and cannot contain both a positive atom and its negation. We write $a \in \varphi$ (respectively $\neg a \in \varphi$) if $a$ (respectively $\neg a$) is one of the conjuncts of $\varphi$.

4.2.2 Prefix-closed conjunctions

We say that a conjunction $\varphi$ is prefix-closed if for all $p$, $f$, $i$, $p'$, and $f'$,

$$(p.f.i \mapsto f' \in \varphi \ \vee\ p.f.i = p' \in \varphi) \implies p \mapsto f \in \varphi$$

(remember that equality atoms are defined up to commutation, so that $p.f.i = p'$ is identical to $p' = p.f.i$). In other words, for any positive atom involving a path $p.f.i$ occurring in $\varphi$, the positive atom $p \mapsto f$ must also occur. The purpose of this definition is to make explicit the implied constraints on the prefixes of paths appearing in a positive atom, namely, the fact that $(u \models p \mapsto f) \implies p \in u$ and $(u \models p = p') \implies p, p' \in u$.

4.2.3 Positive conjunctions

The second constraint restricts the use of negation to a single case: only function symbol atoms may be negated, provided that a positive atom with the same path is present. Formally, a conjunction $\varphi$ is positive if

$$\forall a \in \mathcal{A},\ (\neg a) \in \varphi \implies \exists p, f, f',\ a = p \mapsto f \ \wedge\ p \mapsto f' \in \varphi.$$

4.2.4 Deduction rules

The last (and essential) constraint is expressed by a set of rules that a conjunction must satisfy to make sure that it has no implied atoms (an atom is implied by a conjunction if it is satisfied by every term which satisfies the conjunction). A rule is composed of a set of atoms $h_i$ which are called the hypotheses and are interpreted in a conjunctive sense, and a set of conclusions $c_i$ interpreted in a disjunctive sense (in practice however we only consider rules with at most one conclusion). The empty set of conclusions is denoted by $\bot$, and rules are written

$$\frac{h_1 \ \dots\ h_n}{c_1, \dots, c_m}\,, \qquad \frac{h_1 \ \dots\ h_n}{\bot}\,.$$

We define the following five rule schemes:

$$\textsf{mat}_1\ \frac{p = p' \quad p \mapsto f}{p' \mapsto f} \qquad\qquad \textsf{mat}_2\ \frac{p = p' \quad p \mapsto f}{p.f.i = p'.f.i}\ \ \text{where } 1 \le i \le \mathrm{arity}(f)$$

$$\textsf{trans}\ \frac{p = p' \quad p' = p''}{p = p''} \qquad \textsf{cycle}\ \frac{p = p.p'}{\bot}\ \ \text{where } p' \neq \epsilon \qquad \textsf{conflict}\ \frac{p \mapsto f \quad p \mapsto f'}{\bot}\ \ \text{where } f \neq f'$$

We first give the intuition behind those rules (whose precise relation with the meaning of formulas is stated in Properties 1 and 2). The rules mat1 and mat2 have a materialisation purpose: they are used to introduce the consequences of sub-term equalities. The rule trans expresses the transitivity of equality, and the rules cycle and conflict detect inconsistent situations.

We denote by $\mathcal{R}$ the (infinite) set of rules generated by the above schemes. In the following, we call rule an element of $\mathcal{R}$. An important property of those rules is that they are both sound and complete with respect to the meaning of formulas. This is stated in Properties 1 and 2.

Property 1 (Soundness) Let $u$ be a term and $R \in \mathcal{R}$. If $u$ satisfies all the hypotheses of $R$ then $u$ satisfies at least one of the conclusions of $R$.

Proof 3 This follows from the definitions.

Property 2 (Completeness) Let $\varphi$ be a conjunction and $\psi$ a disjunction of positive atoms. If $[\![\varphi]\!] \subseteq [\![\psi]\!]$, then $\varphi \vdash_{\mathcal{R}} \psi$, where the deduction relation $\vdash_{\mathcal{R}}\ \subseteq \mathcal{P}(\mathcal{A}) \times \mathcal{P}(\mathcal{A})$ is defined in the standard way. In particular, if $[\![\varphi]\!] = \emptyset$ then $\varphi \vdash_{\mathcal{R}} \bot$.

Idea of the proof. The relation $\vdash_{\mathcal{R}}$ can be shown to simulate a unification algorithm. The result follows from the functional correctness of unification and its termination, which mainly relies on so-called occurs checks, whose equivalent here is the rule cycle.

4.2.5 Rule satisfaction

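The fact that a conjunction $\varphi$ satisfies a rule $R$ (which we write $\varphi \models R$) is defined as follows:

$$\varphi \models \frac{h_1 \ \dots\ h_n}{c_1 \ \dots\ c_m} \quad\text{iff}\quad (\forall i \le n,\ h_i \in \varphi) \implies (\exists j \le m,\ c_j \in \varphi).$$

4.2.6 Closed conjunctions

We can now define closed conjunctions using the above constraints. A conjunction $\varphi$ is closed if a) $\varphi$ is prefix-closed, b) $\varphi$ is positive, and c) $\varphi$ satisfies every rule in $\mathcal{R}$.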

4.3 A characterisation of the meaning of closed conjunctions

We complete the presentation of closed conjunctions by proving a fundamental property about their meaning. In Lemma 3 we define, for any closed conjunction $\varphi$, a term $u_\varphi$ whose set of instances is equal to the meaning of $\varphi$. In other words, $u_\varphi$ is the most general term satisfying $\varphi$. A particularly important consequence of this result is that closed conjunctions are satisfiable. This satisfiability feature is the key to the proof of Lemma 4 (which establishes the soundness of projection): in this lemma, given a term $u$ satisfying $\exists \mathcal{A}_p\ \varphi$ (here, $\exists \mathcal{A}_p\ \varphi$ is the projection of $\varphi$ onto some path $p$), we have to modify $u$ (by substituting some appropriate term for the path $p$) so as to satisfy $\varphi$, thus ensuring that $u \in \exists p\ [\![\varphi]\!]$.

Lemma 3 Let $\varphi$ be a closed conjunction. Then there exists a term $u_\varphi$ such that $[\![\varphi]\!] = \mathrm{inst}(u_\varphi)$.

Proof 4 We give a constructive definition of $u_\varphi$, and prove that $[\![\varphi]\!] = \mathrm{inst}(u_\varphi)$.

First, we remark that, because of rule trans, $p = p' \in \varphi$ is an equivalence relation over paths. We let $\overline{p}$ be the equivalence class of a path $p$. Now, for every path $p$, we define a term $u_p$ and show that for every atom $(\neg)\, p.p' \mapsto f$ in $\varphi$ (either positive or negative), $u_p \models (\neg)\, p' \mapsto f$. The definition and proof are done by induction on the set of paths $p'$ such that some positive atom $p.p' \mapsto f$ belongs to $\varphi$:

• If there exist (possibly negated) atoms of the form $(\neg)\, p \mapsto f$ in $\varphi$, then as $\varphi$ is positive, one of those atoms is positive, and by the rule conflict, it is unique. Let $u_p = f(u_{p.f.1}, \dots, u_{p.f.n})$ where $n = \mathrm{arity}(f)$. By induction hypothesis, for all $i \le n$, $u_{p.f.i}$ satisfies all the atoms $(\neg)\, p' \mapsto f'$ such that $(\neg)\, p.f.i.p' \mapsto f' \in \varphi$, thus $u_p$ satisfies $(\neg)\, f.i.p' \mapsto f'$ for the same $p'$, $f'$. Furthermore, $u_p$ obviously satisfies all the atoms $(\neg)\, \epsilon \mapsto f'$ such that $(\neg)\, p \mapsto f' \in \varphi$. Finally, as $\varphi$ is prefix-closed, there cannot be any positive atom $p.f'.i.p' \mapsto f''$ with $f' \neq f$ in $\varphi$. Thus, $u_p$ satisfies all the atoms $(\neg)\, p' \mapsto f'$ such that $(\neg)\, p.p' \mapsto f' \in \varphi$.

• Otherwise we let $u_p$ be the variable $x_{\overline{p}}$ and, as $\varphi$ is prefix-closed, the result follows by the same argument.

We conclude that the term $u_\varphi = u_\epsilon$ satisfies all the atoms $(\neg)\, p \mapsto c$ in $\varphi$. Now we have to deal with path equalities. In this part we write $u$ for $u_\varphi$. We observe that, because $u$ satisfies the atoms of the form $(\neg)\, p \mapsto c$ in $\varphi$ and $\varphi$ is prefix-closed, for every equality $p = p' \in \varphi$, $p \in u$ and $p' \in u$. We then prove by induction on $u(p)$ that for every such equality, $u(p) = u(p')$. First, because of rule mat1 and since $u$ satisfies the constraints of $\varphi$ of the form $(\neg)\, p \mapsto c$, we know that $u(p)$ is a variable if and only if $u(p')$ is, and otherwise $C_p(u) = C_{p'}(u)$. We consider both cases:

• If $u(p)$ and $u(p')$ are the variables $x_{\overline{p}}$, $x_{\overline{p'}}$, then $\overline{p} = \overline{p'}$ by definition, thus the atom is satisfied.

• If $u(p) = f(u_1, \dots, u_n)$ and $u(p') = f(u'_1, \dots, u'_n)$ then the rule mat2 ensures that for all $i \le n$, $p.f.i = p'.f.i \in \varphi$. By induction hypothesis we deduce that $u(p.f.i) = u(p'.f.i)$ and the result follows.

Therefore $u_\varphi \models \varphi$.

Finally, we must prove that every term $u$ satisfying $\varphi$ is an instance of $u_\varphi$. By definition, for every equality $p = p'$ in $\varphi$, we know that $p \in u$, $p' \in u$, and $u(p) = u(p')$. We denote this term by $u(\overline{p})$ (where $\overline{p}$ is again the equivalence class of $p$ with respect to the relation $p = p' \in \varphi$). It follows by induction on $u_\varphi$ that $u = u_\varphi[u(\overline{p})/x_{\overline{p}}]$.

Remark about the proof of Lemma 3: The fact that closed conjunctions satisfy the rule cycle has not been used in the proof (so the closure hypothesis is slightly stronger than necessary). However, this rule is required for completeness (Property 2), and this property will be used in Section 5.2.

4.4 Projection

Using the characterisation of the meaning of closed conjunctions provided by Lemma 3, we prove the soundness of using existential quantification on formulas to express the projection of sets of terms, for a restricted set of formulas that are disjunctions of closed conjunctions. Lemma 5 establishes the result, the essential step being proved in Lemma 4.

Lemma 4 Let $p$ be a path and $\varphi$ be a closed conjunction. Define $u_\varphi$ and $u_{(\exists \mathcal{A}_p \varphi)}$ as in Lemma 3. Then

$$\mathrm{inst}(\exists p\ u_\varphi) = \mathrm{inst}(u_{(\exists \mathcal{A}_p \varphi)}).$$

Remark: The statement of Lemma 4 implicitly assumes that $\exists \mathcal{A}_p\ \varphi$ is a closed conjunction. The three constraints which define the closure of this formula are easily checked from the definition of $\mathcal{A}_p$.

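Proof 5 There are two cases.

• If $p \notin u_\varphi$, then it is easy to see that $u_\varphi = u_{(\exists \mathcal{A}_p \varphi)}$ and the result follows.

• Otherwise, we define a substitution $\sigma$ such that

$$\exists p\ u_\varphi = \{u_{(\exists \mathcal{A}_p \varphi)}\,\sigma[u/x_p] \mid u \in \mathcal{T}(\mathcal{F},\mathcal{X})\} \qquad (1)$$

where $x_p$ is a particular variable appearing in the range of $\sigma$. First, we remark that, because $\varphi$ satisfies trans, the equivalence classes for the relation $p' = p'' \in \exists \mathcal{A}_p\ \varphi$ are the same ones as for the relation $p' = p'' \in \varphi$, except that the class $\overline{p}$ of the latter may be split into two in the former. More precisely, if we use the notations $\overline{\,\cdot\,}^{\exists \mathcal{A}_p \varphi}$ and $\overline{\,\cdot\,}^{\varphi}$ for those equivalence classes, then either $\overline{p}^{\exists \mathcal{A}_p \varphi} = \overline{p}^{\varphi} = \{p\}$ or $\overline{p}^{\varphi}$ is split into $\{p\}$ and $\overline{p}^{\varphi} \setminus \{p\}$. We thus define $\sigma$ by

$$\begin{array}{rcll}
\sigma(x_{\overline{p'}^{\exists \mathcal{A}_p \varphi}}) & = & x_{\overline{p'}^{\varphi}} & \text{for all } p' \neq p \text{ such that } p = p' \notin A \\
\sigma(x_{\overline{p}^{\varphi} \setminus \{p\}}) & = & x_{\overline{p}^{\varphi}} & \text{if } \overline{p}^{\varphi} \setminus \{p\} \neq \emptyset \\
\sigma(x_{\{p\}}) & = & x_p & \text{where } x_p \text{ is a new variable.}
\end{array}$$

It follows from the definitions that $\sigma$ satisfies (1). Furthermore, as $\sigma$ maps distinct variables to distinct variables, we have

$$\mathrm{inst}(u_{(\exists \mathcal{A}_p \varphi)}) = \mathrm{inst}(u_{(\exists \mathcal{A}_p \varphi)}\,\sigma) = \mathrm{inst}(\{u_{(\exists \mathcal{A}_p \varphi)}\,\sigma[u/x_p] \mid u \in \mathcal{T}(\mathcal{F},\mathcal{X})\}).$$

We conclude by combining this equality with (1).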

The following property of the projection operation is the last argument required for justifying our implementation of projection:

Property 3 For every term $u$ and every path $p$,

$$\exists p\ \mathrm{inst}(u) = \mathrm{inst}(\exists p\ u).$$

We now establish the main soundness lemma.

Lemma 5 Let $p$ be a path and $\varphi$ a disjunction of closed conjunctions. Then

$$\exists p\ [\![\varphi]\!] = [\![\exists \mathcal{A}_p\ \varphi]\!].$$

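Proof 6 We first prove the result in the case of a conjunction. The proof follows from the previous lemmas:

$$\begin{array}{rcll}
\exists p\ [\![\varphi]\!] & = & \exists p\ \mathrm{inst}(u_\varphi) & \text{Lemma 3} \\
& = & \mathrm{inst}(\exists p\ u_\varphi) & \text{Property 3} \\
& = & \mathrm{inst}(u_{(\exists \mathcal{A}_p \varphi)}) & \text{Lemma 4} \\
& = & [\![\exists \mathcal{A}_p\ \varphi]\!] & \text{Lemma 3.}
\end{array}$$

The proof for the general case follows from the distributivity of projection, existential quantification, and meaning with respect to disjunction and union.

5 Closing formulas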

We have described a domain of formulas for which the projection is easily expressed as an existential quantification. However, this domain is not closed under conjunction. Therefore, to obtain a domain which is closed under union, intersection, and projection, we have to enforce the closure property on formulas while preserving their meaning. This is achieved by repeatedly "applying" the rules until they are all satisfied. This idea is slightly complicated by the fact that, to make the process terminate, we need an additional notion of monotone formula.

5.1 Rule application

The basic idea in order to make a conjunction satisfy a rule is to add some conclusion of the rule to the conjunction, if all the premises of the rule are in the conjunction. It is clear that this operation may make some other rules unsatisfied, therefore an iterative process is required, which we describe in Section 5.2.

5.1.1 Applying a rule to a conjunction

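Given a rule $R \in \mathcal{R}$ of the form

$$R = \frac{h_1 \ \dots\ h_n}{c_1 \ \dots\ c_m}\,,$$

we define a formula $\mathrm{app}(R, \varphi)$ as the result of applying $R$ to a conjunction $\varphi$:

$$\mathrm{app}(R, \varphi) = \begin{cases} \displaystyle\bigvee_{\substack{i \le m \\ (\neg c_i) \notin \varphi}} (\varphi \wedge c_i) & \text{if } \forall i \le n,\ h_i \in \varphi \\ \varphi & \text{otherwise.} \end{cases}$$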
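The definition of $\mathrm{app}(R, \varphi)$ on a single conjunction can be transcribed almost literally. In the sketch below, which is only illustrative, a conjunction is a pair of sets (its positive atoms and its negated atoms), atoms are plain strings, and the result is a list of conjunctions read as a disjunction; these representation choices and the name `apply_rule` are assumptions.

```python
def apply_rule(hypotheses, conclusions, positive, negative):
    """app(R, phi) for a conjunction phi given by its positive and negated atoms.

    Returns a list of (positive, negative) pairs read as a disjunction;
    the empty list stands for the false formula.
    """
    if not all(h in positive for h in hypotheses):
        return [(positive, negative)]          # some hypothesis is missing: phi is unchanged
    return [(positive | {c}, negative)         # one disjunct per admissible conclusion
            for c in conclusions if c not in negative]

# The conjunction phi of Example 3 and the instance of trans for f.1 = f.2, f.2 = f.3.
phi = (frozenset({"eps->f", "f.1=f.2", "f.2=f.3"}), frozenset())
trans = (["f.1=f.2", "f.2=f.3"], ["f.1=f.3"])
closed_step = apply_rule(*trans, *phi)
```

Here `closed_step` is a single conjunction that extends $\varphi$ with the atom $f.1 = f.3$. For a rule whose conclusion is $\bot$ (an empty list of conclusions) and whose hypotheses all occur in the conjunction, the function returns the empty list, i.e. the false formula.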

The application of a rule to a conjunction clearly makes this rule satisfied, without changing the meaning of the conjunction. Now we need to extend this definition to formulas. We do this using the DNF representation, thus we have to show that the obtained formulation is independent of the particular expression of a formula. Furthermore, we will eventually have to compute it with basic boolean function operations. For these two reasons we need a more “semantic” definition of rule application, which is given in Lemma 6. This definition requires a notion of monotone conjunction, and in particular a special operator on boolean functions used to ensure monotony.

5.1.2 Making conjunctions monotone

We say that a conjunction $\varphi$ is monotone with respect to a set of atoms $A \subseteq \mathcal{A}$ if it does not contain any negated atom $\neg a$ with $a \in A$. A conjunction is monotone if it is monotone with respect to $\mathcal{A}$. Note that we use the term positive for a distinct purpose in this report.

Given a set of atoms $A$, we define a function $\mathrm{monotone}_A : \mathcal{B} \to \mathcal{B}$ whose purpose is to make conjunctions monotone with respect to $A$, while preserving monotony with respect to any other positive atoms. $\mathrm{monotone}_A$ is defined by the following boolean expression (only the existence of this expression matters):

$$\mathrm{monotone}_A(\varphi) = \Big(\exists A\ \Big(\varphi \ \wedge \bigwedge_{a \in A} (a \implies a')\Big)\Big)\,[a/a' \mid a \in A]$$

where for all $a \in A$, $a'$ is a fresh boolean variable, and $[a/a' \mid a \in A]$ is the substitution which renames each $a'$ to $a$. The only thing that matters about the above expression is that it only uses standard logic operations, namely, conjunction, existential quantification, and renaming.
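The effect of $\mathrm{monotone}_A$ can be observed on a naive model of boolean functions. In the sketch below a formula is a Python predicate over valuations, a valuation being the frozenset of atoms set to true; this truth-table style model and the names `subsets` and `monotone` are assumptions made for illustration and have nothing to do with an actual BDD implementation. After the renaming of $a'$ into $a$, the value given by the valuation to each $a \in A$ plays the role of $a'$, so the quantified copy of $A$ may set to true only atoms of $A$ that are true in the valuation.

```python
from itertools import combinations

def subsets(atoms):
    """All subsets of a finite collection of atoms, as frozensets."""
    atoms = list(atoms)
    return [frozenset(c) for k in range(len(atoms) + 1)
            for c in combinations(atoms, k)]

def monotone(A, phi):
    """Return monotone_A(phi), with formulas modelled as predicates on valuations."""
    A = frozenset(A)

    def result(valuation):
        valuation = frozenset(valuation)
        outside = valuation - A                 # atoms not in A keep their value
        # Existential quantification over A, under the constraints a => a':
        # the quantified atoms set to true must also be true in the valuation.
        return any(phi(outside | chosen) for chosen in subsets(valuation & A))

    return result

# phi = b AND NOT a: the negated atom a is discarded, leaving b.
phi = lambda v: "b" in v and "a" not in v
made_monotone = monotone({"a"}, phi)

# psi = a AND b: already monotone with respect to {a}, hence unchanged.
psi = lambda v: "a" in v and "b" in v
unchanged = monotone({"a"}, psi)
```

On the four valuations over the atoms `a` and `b`, `made_monotone` agrees with the formula `b` alone, while `unchanged` agrees with `psi`.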

5.1.3 Applying a rule to a DNF

Lemma 6 gives a propositional expression of the application of a rule to a conjunction.

Lemma 6 Let φ be a monotone conjunction and R ∈ ℛ, and let

\[ R = \frac{h_1 \;\cdots\; h_n}{c_1 \;\cdots\; c_m}. \]

Then

\[ \mathrm{app}(R, \varphi) = \mathrm{monotone}_{\{h_1, \dots, h_n\}}(\varphi \wedge \varphi_R) \]

where φ_R is defined as

\[ \varphi_R = (h_1 \wedge \dots \wedge h_n) \Rightarrow (c_1 \vee \dots \vee c_m). \]

In this expression, the function monotone_{h1,…,hn} has no effect on the meaning of the computed formula, which suggests that rule application could have been defined just as a conjunction with R. The role of monotony is to ensure that only the consequences of R on the particular formula φ are kept, which is essential for the rule propagation process to terminate.

Given Lemma 6, and given that ∧ and monotone_A are both distributive with respect to ∨, we can safely extend the application function to arbitrary formulas. This is done in Lemma 7.

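Lemma 7 Let φ = ⋁_{i∈I} φ_i be a disjunction of monotone conjunctions and R ∈ ℛ, and define {h_1, …, h_n} and φ_R as in Lemma 6. Then

\[
\mathrm{monotone}_{\{h_1, \dots, h_n\}}(\varphi \wedge \varphi_R) = \bigvee_{i \in I} \mathrm{monotone}_{\{h_1, \dots, h_n\}}(\varphi_i \wedge \varphi_R).
\]

We denote this formula by app(R, φ) and we have

\[
\mathrm{app}(R, \varphi) = \bigvee_{i \in I} \mathrm{app}(R, \varphi_i).
\]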

Proof 7 The first point is a consequence of the distributivity of ∧ and monotone_A with respect to ∨, the latter being itself a consequence of the distributivity of conjunction, existential quantification, and renaming. The second point follows from Lemma 6.

Example 4 Consider again the conjunction φ = ε ↦ f ∧ f.1 = f.2 ∧ f.2 = f.3 of Example 3, as well as a second conjunction ψ = ε ↦ f ∧ f.1 ↦ a ∧ f.2 = f.3. The disjunction of the two represents the set of terms whose form is one of f(u, u, u) or f(a, u, u). Suppose that we want to apply to φ ∨ ψ the rule R corresponding to trans for the equalities f.1 = f.2 and f.2 = f.3. We first take the conjunction with φ_R, which yields the following DNF:

\[
\begin{array}{rcl}
(\varphi \vee \psi) \wedge \varphi_R &=& \varepsilon \mapsto f \;\wedge\; f.1 = f.2 \;\wedge\; f.2 = f.3 \;\wedge\; f.1 = f.3 \\
&\vee& \varepsilon \mapsto f \;\wedge\; f.1 \mapsto a \;\wedge\; f.1 = f.3 \\
&\vee& \varepsilon \mapsto f \;\wedge\; f.1 \mapsto a \;\wedge\; f.1 \neq f.2.
\end{array}
\]

Then we apply the function monotone_{f.1=f.2, f.2=f.3}, which discards the dis-equality f.1 ≠ f.2 in the third disjunct, and we obtain the desired result:

\[
\begin{array}{rcl}
\mathrm{app}(R, \varphi \vee \psi) &=& \varepsilon \mapsto f \;\wedge\; f.1 = f.2 \;\wedge\; f.2 = f.3 \;\wedge\; f.1 = f.3 \\
&\vee& \varepsilon \mapsto f \;\wedge\; f.1 \mapsto a.
\end{array}
\]

The second disjunct is discarded, as it is less general than ε ↦ f ∧ f.1 ↦ a.

Formally, Lemma 8 shows how the function app can be used to enforce the satisfaction of rules in (the disjuncts of) a formula, while keeping the formula monotone and prefix-closed, and preserving its meaning.

Lemma 8 Let φ be a disjunction of monotone, prefix-closed conjunctions, and R ∈ ℛ. Then

• ⟦app(R, φ)⟧ = ⟦φ⟧, and

• app(R, φ) can be expressed as a disjunction of monotone, prefix-closed conjunctions which satisfy R as well as the rules satisfied by φ whose premises do not occur in the conclusion of R.

Proof 8 We first prove the result for the case where φ is a single monotone, prefix-closed conjunction. In this case, the fact that ⟦app(R, φ)⟧ = ⟦φ⟧ follows from the definition of app, the definition of meaning, and Property 1. The second point is easily checked from the definitions.

In the general case, the result follows from the second point of Lemma 7 and the distributivity of meaning with respect to disjunction (Lemma 1).

5.2 Iterating rules

Procedure 1 is used to compute the closure of a formula with respect to ℛ. It uses a simple work-set algorithm which keeps track of the set of rules that may still need to be applied. Rules are applied one by one, following a particular strategy: they are selected by increasing length of the longest path appearing in their atoms. This strategy ensures the termination of rule propagation (see Lemma 10).

The partial correctness of Procedure 1 is proved in Lemma 9.

procedure closure(φ) =
    let w = {R ∈ ℛ | R has a premise in the support of φ}
    while w ≠ ∅ do
        choose a rule R ∈ w with a minimum max path length
        φ ← app(R, φ)
        w ← w \ {R}
        if φ has changed then
            w ← w ∪ {R′ ∈ ℛ | some premise of R′ is a conclusion of R}
    return φ

Procedure 1: Closure of φ with respect to ℛ

Lemma 9 Let φ be a disjunction of monotone, prefix-closed conjunctions. Then

• ⟦closure(φ)⟧ = ⟦φ⟧, and

• closure(φ) can be expressed as a disjunction of monotone, closed conjunctions.

Proof 9 We prove that the closure algorithm maintains the following invariant: φ can be expressed as a disjunction of monotone, prefix-closed conjunctions which satisfy every rule not in w, and ⟦φ⟧ is equal to its initial value.

• Initially, this is clear from the hypotheses, and from the fact that any rule whose premises are not in the support of a conjunction is satisfied by this conjunction.

• Assume that a disjunction φ satisfies the invariant for some set w of rules. Let R ∈ w. Then Lemma 8 ensures that app(R, φ) has the same meaning as φ and that its disjuncts satisfy R and the rules not in w whose premises do not occur in the conclusion of R. Furthermore, if app(R, φ) = φ, then its disjuncts obviously satisfy all the rules not in w by hypothesis. We conclude that the invariant is maintained after one iteration of the loop.

At the end, we conclude from the fact that w = ∅, since monotony implies positiveness.

The termination of Procedure 1 is stated in Lemma 10.

Lemma 10 Let φ be a disjunction of monotone, prefix-closed conjunctions. Then closure(φ) terminates.

Idea of the proof. Let ⋁_{i∈I} φ_i be a DNF of the initial formula to which Procedure 1 is applied. During the execution of the algorithm, the computed formula φ can always be expressed as a disjunction of conjunctions where each disjunct is the result of applying the rules selected so far to some φ_i. Thus, the algorithm can only diverge if for some φ_i the repeated application of the rules yields an infinite set of atoms, and therefore a set of atoms with arbitrarily long paths. This implies that ⟦φ_i⟧ = ∅, and by Proposition 2 we conclude that φ_i ⊢_ℛ ⊥. As the rules are selected by increasing path length, and since rules with arbitrarily long paths are selected, those involved in this deduction will eventually be selected as long as they apply, and the conjunction will be removed, which contradicts the hypothesis.
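The work-set loop of Procedure 1 can be sketched in Python on an explicit DNF. The sketch makes several simplifying assumptions that are not part of the report: a monotone conjunction is a plain set of atoms, the rule set is finite, the "maximum path length" of a rule is given as an explicit integer rank, and app is the syntactic version restricted to monotone conjunctions (so the side condition on negated conclusions is vacuous). The atom and rule names are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

Conj = FrozenSet[str]  # a monotone conjunction, as its set of atoms
# (rank, premises, conclusions); rank stands in for the max path length.
Rule = Tuple[int, Tuple[str, ...], Tuple[str, ...]]


def app(rule: Rule, phi: Set[Conj]) -> Set[Conj]:
    """Apply a rule to every disjunct of a DNF of monotone conjunctions."""
    _, premises, conclusions = rule
    out: Set[Conj] = set()
    for conj in phi:
        if all(h in conj for h in premises):
            # One disjunct per conclusion; no conclusion drops the disjunct.
            out.update(conj | {c} for c in conclusions)
        else:
            out.add(conj)
    return out


def closure(phi: Set[Conj], rules: List[Rule]) -> Set[Conj]:
    support = set().union(*phi)
    work = {r for r in rules if any(h in support for h in r[1])}
    while work:
        rule = min(work)  # smallest rank first
        work.discard(rule)
        new_phi = app(rule, phi)
        if new_phi != phi:
            phi = new_phi
            # Re-schedule the rules that the new conclusions may enable.
            work |= {r for r in rules if any(h in rule[2] for h in r[1])}
    return phi


trans: Rule = (1, ("e12", "e23"), ("e13",))  # transitivity of equalities
clash: Rule = (1, ("f1_a", "f1_b"), ())      # two symbols at one path: false
RULES = [trans, clash]

start = {frozenset({"root_f", "e12", "e23"}),
         frozenset({"root_f", "f1_a", "e23"})}
closed = closure(start, RULES)
```

In this run only the first disjunct gains the atom `e13`; a disjunct containing both `f1_a` and `f1_b` would be removed by the rule without conclusions.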

5.3 The domain of monotone closed formulas

We have defined in Section 3.2 a kind of boolean functions to express sets of terms. We have given in Sections 4.2 and 4.4 sufficient constraints that allow the projection operation on sets of terms defined in Section 4.1 to be expressed as an existential quantification on formulas. Finally, we have shown in Sections 5.1 and 5.2 how to restore the only one of those constraints which is not preserved by the conjunction operation (namely, the satisfaction of the rules in ℛ), at the cost of an additional monotony requirement. Together, those results make possible the definition of a boolean function-based domain for representing and computing on sets of terms, as follows.

Definition 1 The domain 𝒟 ⊆ ℬ is defined as the set of formulas that can be expressed as disjunctions of monotone, closed conjunctions.

Theorem 1 states that 𝒟 can represent the set of instances of a term, is closed under disjunction, quantification with respect to A_p for any p, and conjunction followed by closure, and that these operations compute respectively the union, projection on p, and intersection operators on the meaning of formulas.

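Theorem 1 Let u be a term and φ, φ′ ∈ 𝒟. Let p be a path and A_p defined as in Lemma 4. Then the following holds.

\[
\begin{array}{rcl}
\varphi_u \in \mathcal{D} &\text{and}& \llbracket \varphi_u \rrbracket = \mathrm{inst}(u) \\
\varphi \vee \varphi' \in \mathcal{D} &\text{and}& \llbracket \varphi \vee \varphi' \rrbracket = \llbracket \varphi \rrbracket \cup \llbracket \varphi' \rrbracket \\
\exists A_p\, \varphi \in \mathcal{D} &\text{and}& \llbracket \exists A_p\, \varphi \rrbracket = \exists_p \llbracket \varphi \rrbracket \\
\mathrm{closure}(\varphi \wedge \varphi') \in \mathcal{D} &\text{and}& \llbracket \mathrm{closure}(\varphi \wedge \varphi') \rrbracket = \llbracket \varphi \rrbracket \cap \llbracket \varphi' \rrbracket
\end{array}
\]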

Proof 10 The fact that φ_u ∈ 𝒟 follows from the definitions. By definition, 𝒟 is also obviously closed under disjunction. Given the distributivity of existential quantification, it is easy to check from the definitions (in particular the definitions of A_p and ℛ) that ∃A_p φ ∈ 𝒟. Finally, φ ∧ φ′ can clearly be expressed as a disjunction of monotone, prefix-closed conjunctions. We conclude by applying Lemmas 1, 2, 5, 9, and 10.

6 Experiments

We have implemented our ideas in a prototype solver based on the cudd BDD library [23]. This solver is built from the same basic elements which constitute the theory that we have presented, but with a few optimisations.

Most importantly, the boolean function encoding is slightly different: we relax the monotony constraint used to ensure the termination of the closure procedure, in order to obtain better BDD performance. The reason is that the mutual exclusion of function symbols for a given path cannot be represented in monotone formulas, which authorises spurious boolean valuations and causes an explosion of the number of BDD nodes. Instead, the termination of the closure procedure is enforced by using an additional set of atoms, which intuitively express whether the sub-term reached by a given path is “materialised” or not, and by only ensuring the monotony with respect to these atoms. We believe that this idea would also be useful to enable negation in our formalism, as it introduces negated atoms without compromising the termination of rule propagation.

The implementation of the immediate consequence operator T_c is mostly based on the naive formulation described in Section 2 (adapted to the use of ∃_p instead of π_p), but we apply some well-known optimisations to avoid large intermediate relations as much as possible. One of them is the early projection of the sets of matching substitutions (substitutions are restricted to the variables that appear in the head or in subsequent premises). Another idea is the use of path renaming, boolean identification of atoms, and a limited use of intersection, instead of the costly intersection with the term ⟨u, subst(x_1, …, x_p)⟩ of Section 2.
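The effect of early projection can be illustrated on explicit sets of substitutions. The Python sketch below is not the BDD-based implementation of the prototype; the helpers `sub`, `join` and `project`, the clause p(X, W) :- q(X, Y), r(Y, Z), s(Z, W) and its data are hypothetical. After matching q and r, the variable Y appears neither in the head nor in the remaining premise, so it can be projected away before joining with s.

```python
from typing import FrozenSet, Set, Tuple

# A substitution is a frozen set of (variable, value) bindings.
Subst = FrozenSet[Tuple[str, object]]


def sub(**bindings: object) -> Subst:
    return frozenset(bindings.items())


def join(left: Set[Subst], right: Set[Subst]) -> Set[Subst]:
    """Combine the pairs of substitutions that agree on shared variables."""
    out: Set[Subst] = set()
    for l in left:
        dl = dict(l)
        for r in right:
            if all(dl.get(var, val) == val for var, val in r):
                out.add(l | r)
    return out


def project(subs: Set[Subst], keep: Set[str]) -> Set[Subst]:
    """Restrict every substitution to the variables in `keep`."""
    return {frozenset((v, x) for v, x in s if v in keep) for s in subs}


q = {sub(X=1, Y="a"), sub(X=1, Y="b")}
r = {sub(Y="a", Z="c"), sub(Y="b", Z="c")}
s = {sub(Z="c", W=9)}

# Late projection: Y is carried through the whole body.
late = project(join(join(q, r), s), {"X", "W"})

# Early projection: Y is dropped as soon as it is no longer needed.
intermediate = project(join(q, r), {"X", "Z"})
early = project(join(intermediate, s), {"X", "W"})
```

Both strategies give the same head instances, but the intermediate relation holds one substitution instead of two once Y has been projected away.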

Other optimisations include the use of regular types to reduce the number of atoms, small changes in the choice of rules, and a less conservative initial work-set in the rule propagation process.

The fix-point iteration procedure which computes the least model of a logic program is a standard work-set algorithm with a simple strategy based on the dependency graph between clauses, and in particular its strongly connected components. The termination of the computation over boolean functions does not directly follow from the finiteness of the least fix-point, since the representation of a set of terms is not unique and we do not provide a decision procedure for inclusion. We conjecture that testing implication between boolean functions ensures termination in the same cases, and we have not found any counter-example.

The least fix-point, restricted to a particular (small) predicate of interest, is converted back to a set of terms using Lemma 3, which allows us to (implicitly) enumerate the terms from a disjunctive normal form of the formula. We apply it to the minimal DNF, which can be computed efficiently by iterating over BDD nodes rather than BDD paths (which correspond to a non-minimal DNF). We believe that the disjuncts of the minimal DNF for any formula in 𝒟 can be shown to satisfy the hypotheses of Lemma 3, which makes this strategy sound.

Case studies

We have applied our prototype to simple test cases, such as the n-queens and the dining philosophers problems. Some sets of terms obtained as solutions of our examples exhibit an efficient BDD encoding, which we measure by the fact that the order of magnitude of the number of nodes in the BDD is significantly smaller than the number of most general terms in the corresponding set. For example, the representation of the state space and transition relation for the 14 dining philosophers (which has 228486 states and 2067856 transitions) involves BDDs whose size does not exceed 20000 nodes, with intermediate relations staying under 400000 nodes throughout the computation. However, the obtained performance is not sufficient yet: the execution time (27 hours for the above example) and even the memory usage are often much higher than with an explicit representation of terms, for example with XSB. In particular, we have encountered critical variable ordering issues. So far, our treatment of variable ordering relies on general-purpose heuristics which are part of the cudd package, but a late scheduling of reordering causes an explosion in the BDD size. Some formulas also remain large even after reordering, which suggests that our encoding can be further improved.

7

Related work

A large range of existing and ongoing research is relevant to the goal of the present work, most notably on the topic of logic program compilation. As for the original proposal of this report, which is a boolean function-based domain (targeted towards BDDs), connections exist with other data structures which have been proposed for representing sets of terms in various contexts, and with different properties. In the following, we compare our work with two of those.


7.1 Term indexing

A lot of research about theorem provers has been concerned with various term indexing techniques [15], which are representations (or abstractions) of sets of first-order terms. Those data structures are designed to be compact and to offer fast but generally approximate answers to some particular queries, such as the retrieval of a given term, or set of terms. Some of those representations, tree-based indexing techniques, even provide union and unification of sets of terms (the latter of which is equivalent to our intersection operation). In particular, the adaptive discrimination tree [22] representation is very similar to our boolean function encoding, if boolean functions are represented with BDDs. In those trees, nodes are labelled by paths (similar to the paths that we used here), and edges are labelled with function symbols to be matched with the path labelling their source nodes. Thus the structure is the same as that of a BDD with only “function symbol” atoms. The most obvious difference is that there is no sharing between sub-trees as in (reduced) BDDs. Substitution tree indexing [14] is a more general technique which is able to exactly represent sets of terms. The treatment of variable equality is different from ours: variables are represented in a concrete way, while we treat each equality as a single atom. Still, we are not aware of any term indexing technique which could represent sets of terms in an exact way, and have a much smaller size than the number of terms.

7.2 Tree automata

Another data structure for representing sets of first-order terms is tree automata [11]. In their simplest form, tree automata are able to represent the domain of regular sets of terms, which is incomparable with the domain considered here, i.e., sets of instances of finite sets of terms. On one hand, tree automata can express the nesting of a given pattern any number of times, which yields an infinite number of most general terms. The representation is rather different from BDDs, as a term corresponds to a partial unfolding of the automaton, i.e., a tree, while in BDDs a term corresponds to a BDD path, i.e., a word. On the other hand, tree automata recognise sub-terms independently, thus they are not able to express sub-term equality. Various extensions of tree automata with equality or dis-equality constraints have been proposed. The most general class [11, Chapter 4], which allows the transitions to be constrained by arbitrary boolean combinations of path equalities and dis-equalities, subsumes our propositional formula domain. Unfortunately, the computations that we study are not feasible with this most general class of tree automata with equalities and dis-equalities: emptiness is undecidable even for less expressive sub-classes. It is decidable for some of those however, such as automata with constraints between brothers [6] and reduction automata [12, 10]. We believe that only the latter is able to represent the instances of a finite number of arbitrary terms. This class is closed under union and complementation, but we are not aware of an implementation of a projection operator for it.

8 Conclusions and further work

We have established the semantic foundations for computing with sets of first-order terms in the world of boolean functions. This opens the way for BDD-based manipulations of logic objects, and in particular, for a BDD-based bottom-up execution of logic programs. As a conclusion, we discuss the main motivation for such a goal, which is static program analysis, then we list a few possible directions for pursuing our quest for expressiveness.

8.1 Application to static analysis

First, we should stress that the BDD-based solver that we are building is not intended as a general-purpose Prolog engine.

In the following we try to recount the path which led to the combination of BDDs and logic programs for implementing scalable static program analyses, in particular Control Flow Analysis.

8.1.1 Logic languages and static analysis

The idea of computing the result of a static analysis as the least fix-point of a logic program is at the core of abstract compilation [16], which is a popular static analysis technique among the logic programming community. It consists in abstracting a logic program p by another logic program p♯. The result of the analysis of p is obtained by running the program p♯. In [1], Albert et al. added one more compilation layer. For analysing Java bytecode programs, they compiled them into equivalent Prolog programs that were thereafter themselves analysed. In the Control Flow Analysis domain, some

8.1.2 Using BDDs to solve logic programs

An even closer inspiration for the current study was the work of Whaley and Lam [26] who proposed the BDD-based Datalog engine bddbddb as a framework for designing scalable context-sensitive static analyses in a very natural declarative fashion. An off-the-shelf logic solver such as CORAL [20] or XSB [21] could of course have been used to solve the Datalog clauses, and in fact, the XSB system (which implements a tabulation mechanism allowing to compute the well-founded semantics of logic programs) has been used on a small scale in a program analysis context for computing groundness and strictness information [13]. The reason for designing a dedicated BDD-based solver is that data computed by static analyses is unusual. First, the amount of computed data is huge, and second, so is the amount of redundancy, which seems to be exactly the application range for BDDs.

8.1.3 Introducing first-order terms

In Whaley and Lam’s work however, the lack of expressiveness of Datalog is perceptible, as it fails to represent a key aspect of the analysis, viz., the context-sensitive call graph. For this crucial step, the authors hand-craft an ad hoc algorithm maximising the sharing properties of BDDs. On the other hand, the availability of terms in our solver allows a straightforward representation of calling contexts by lists of callers, with a simple logic specification, and whose BDD encoding features similar sharing. Furthermore, this enables easy experimentation with various choices of abstractions for calling contexts. Overall, our contribution to the field of analysis specification by logic programs is the added expressiveness of first-order terms.

8.1.4 Lifting the “range-restricted” limitation

In [24, Chapter 5], we have used boolean formulas to represent finite sets of ground terms, which is enough to compute the semantics of range restricted programs (i.e., every variable appearing in the head of a clause must also appear in the body). This case was significantly simpler, because no equality atoms were required, and the formulas could just be “materialised” up to a given depth, following the maximum term depth. A new difficulty here is that a common materialisation depth cannot be found for all the terms of a formula, and must instead be chosen at the conjunction level, which is achieved through the rule propagation mechanism described in Section 5.

8.2 Perspectives on expressiveness

A natural direction for future research is to investigate the specification of static program analyses in Prolog, and in particular the possible benefits of terms such as lists and trees, of which we have given an example with the calling contexts in context-sensitive analyses. On the other hand, we can try to accept an even more expressive class of programs with two main ideas.

8.2.1 Magic transformations

The semantics that we have presented here computes the least fix-point of definite logic programs, but using an infinite union. So, the actual computation only terminates if the least fix-point is finite, i.e., can be expressed as the set of instances of a finite number of terms. This is a significant limitation to the programs that we can accept: many standard recursive predicates, such as list operations (e.g., append), do not have a finite least fix-point.

This situation can be greatly improved by applying magic transformations [3] to the programs. The idea of these automatic program transformations is to specialise the predicates such that they only compute terms which contribute to a given query. The transformed program somehow simulates a top-down execution of the original one from the specified query. Many standard infinite predicates can be eliminated by applying magic transformations, and this is what we have done (manually) for our case studies. In the future, we will automate this treatment.
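The effect of such a transformation can be illustrated on a toy predicate with an infinite least fix-point. The Python sketch below is a hand-written, magic-style rewriting for one specific program, not the general algorithm of [3] and not the treatment used in our case studies. The original (hypothetical) program is even(0). even(N+2) :- even(N)., whose bottom-up evaluation never stops. The transformed program adds a demand predicate magic seeded by the query, and guards each clause with it.

```python
from typing import Set, Tuple

Fact = Tuple[str, int]


def evaluate(query: int) -> Set[Fact]:
    """Bottom-up evaluation of the transformed program for the query even(query)."""

    def step(facts: Set[Fact]) -> Set[Fact]:
        magic = {n for pred, n in facts if pred == "magic"}
        even = {n for pred, n in facts if pred == "even"}
        new: Set[Fact] = {("magic", query)}  # magic(query).
        for n in magic:
            if n >= 2:
                new.add(("magic", n - 2))    # magic(N-2) :- magic(N), N >= 2.
            if n - 2 in even:
                new.add(("even", n))         # even(N) :- magic(N), even(N-2).
        if 0 in magic:
            new.add(("even", 0))             # even(0) :- magic(0).
        return new

    facts: Set[Fact] = set()
    while True:
        derived = step(facts) - facts
        if not derived:
            return facts                     # least fix-point reached
        facts |= derived


model = evaluate(6)
answer = ("even", 6) in model
```

The demand facts bound the computation: only the instances of even that can contribute to the query are derived, so the iteration reaches a finite fix-point.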

Finally, even if applied to finite predicates, magic transformations may still be useful for performance reasons, as they make the least fix-point smaller. In particular, for the application to static analysis, we believe that a demand-driven implementation could be obtained almost for free by using this technique.

8.2.2 Stratified negation

In this report we have considered definite programs, i.e., without negation. However, a bottom-up semantics can be given to logic programs with stratified negation, i.e., where the negation is used only between the strongly connected components of the dependency graph, but is not involved in cyclic dependencies. To introduce stratified negation in our formalism, the only operation needed is the complementation of sets of terms. This operation is straightforwardly implemented by a negation of formulas, but the resulting formulas escape from the domain that we used, as they are not positive (and even less monotone). Overall, dealing with negation requires an adaptation of most of our results, but we believe that these difficulties may be overcome, and that our solver can be extended to accept any logic program with stratified negation (and with a finite least fix-point). From the static analysis point of view, this covers most of the uses of negation, due to the monotony properties of such analyses.
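The evaluation order imposed by stratification can be sketched on explicit finite sets. The Python example below is only an illustration under simplifying assumptions: the program, the graph and the names are hypothetical, and complementation is taken relative to a finite set of nodes rather than implemented by negating formulas. The first stratum computes reach; the second one uses its complement.

```python
from typing import Set, Tuple


def reach_from(edges: Set[Tuple[str, str]], start: str) -> Set[str]:
    """Stratum 1: reach(start).  reach(Y) :- reach(X), edge(X, Y)."""
    reached = {start}
    changed = True
    while changed:
        changed = False
        for x, y in edges:
            if x in reached and y not in reached:
                reached.add(y)
                changed = True
    return reached


nodes = {"a", "b", "c", "d"}
edges = {("a", "b"), ("b", "c"), ("d", "a")}

# Stratum 1 is computed to its fix-point before negation is used.
reach = reach_from(edges, "a")

# Stratum 2: unreachable(X) :- node(X), not reach(X).
unreachable = nodes - reach
```

Because the negated predicate belongs to a lower stratum, its extension is final when it is complemented, so no cyclic dependency through negation arises.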

Acknowledgements The authors thank Bertrand Jeannet for his help on BDDs, and Delphine Demange and David Pichardie for their comments on this report.

References

[1] Elvira Albert, Miguel Gómez-Zamalloa, Laurent Hubert, and Germán Puebla. Verification of Java bytecode using analysis and transformation of logic programs. In Proc. of the 9th International Symposium on Practical Aspects of Declarative Languages. Springer, 2007.

[2] Thomas Ball and Sriram K. Rajamani. The SLAM project: debugging system software via static analysis. In Proc. of the 29th symposium on Principles of programming languages (POPL ’02). ACM, 2002.

[3] François Bancilhon, David Maier, Yehoshua Sagiv, and Jeffrey D. Ullman. Magic sets and other strange ways to implement logic programs. In Proc. of the 5th symposium on Principles of database systems (PODS’86). ACM, 1986.

[4] Frédéric Besson and Thomas Jensen. Modular class analysis with Datalog. In Proc. of the 10th Static Analysis Symposium. Springer-Verlag, 2003.

[5] Dirk Beyer, Andreas Noack, and Claus Lewerentz. Simple and efficient relational querying of software structures. In Proc. of the 10th Working Conference on Reverse Engineering (WCRE’03). IEEE Computer Society, 2003.

[6] Bruno Bogaert and Sophie Tison. Equality and disequality constraints on direct subterms in tree automata. In STACS ’92: Proceedings of the 9th Annual Symposium on Theoretical Aspects of Computer Science. Springer-Verlag, 1992.

[7] Randal E. Bryant. Graph-based algorithms for boolean function manipulation. IEEE Transactions on Computers, 35(8), 1986.

[8] Jerry R. Burch, Edmund M. Clarke, Kenneth L. McMillan, David L. Dill, and Lain-Jinn Hwang. Symbolic model checking: 10^20 states and beyond. Information and Computation, 98(2), 1992.

[9] Gianpiero Cabodi and Marco Murciano. BDD-based hardware verification. In Proc. of the 6th International School on Formal Methods for the Design of Computer, Communication, and Software Systems. Springer, 2006.

[10] Anne-Cécile Caron, Hubert Comon, Jean-Luc Coquidé, Max Dauchet, and Florent Jacquemard. Pumping, cleaning and symbolic constraints solving. In ICALP ’94: Proceedings of the 21st International Colloquium on Automata, Languages and Programming. Springer-Verlag, 1994.

[11] H. Comon, M. Dauchet, R. Gilleron, C. Löding, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi. Tree automata techniques and applications. Available on: http://www.grappa.univ-lille3.fr/tata, 2007.

[12] M. Dauchet, A.-C. Caron, and J.-L. Coquidé. Reduction properties and automata with constraints. Journal of Symbolic Computation, 20, 1995.

[13] Steven Dawson, Coimbatore R. Ramakrishnan, and David S. Warren. Practical program analysis using general purpose logic programming systems—a case study. In Proc. of the conference on Programming language design and implementation (PLDI ’96). ACM, 1996.

[14] Peter Graf. Substitution tree indexing. In RTA ’95: Proc. of the 6th International Conference on Rewriting Techniques and Applications. Springer-Verlag, 1995.

[15] Peter Graf. Term Indexing. Springer-Verlag, 1996.

[16] Manuel V. Hermenegildo, Richard Warren, and Saumya K. Debray. Global flow analysis as a practical compilation tool. Journal of Logic Programming, 13, 1992.

[17] A. J. Hu, D. L. Dill, A. Drexler, and C. Han Yang. Higher-level specification and verification with BDDs. In Proc. of the 4th Int. Workshop on Computer Aided Verification (CAV’92). Springer-Verlag, 1993.

[18] Mizuho Iwaihara and Yusaku Inoue. Bottom-up evaluation of logic programs using binary decision diagrams. In Proc. of the 11th International Conference on Data Engineering (ICDE’95). IEEE Computer Society, 1995.

[19] David B. Kemp, Peter J. Stuckey, and Divesh Srivastava. Magic sets and bottom-up evaluation of well-founded models. In Proc. of the 1991 Int. Symposium on Logic Programming. MIT, 1991.

[20] Raghu Ramakrishnan, Divesh Srivastava, S. Sudarshan, and Praveen Seshadri. Implementation of the CORAL deductive database system. In SIGMOD ’93: Proc. of the 1993 international conference on Management of data. ACM, 1993.

[21] Prasad Rao, Konstantinos Sagonas, Terrance Swift, David S. Warren, and Juliana Freire. XSB: A system for efficiently computing well-founded semantics. In Logic Programming and Non-monotonic Reasoning, 1997.

[22] R. C. Sekar, R. Ramesh, and I. V. Ramakrishnan. Adaptive pattern matching. In ICALP ’92: Proc. of the 19th International Colloquium on Automata, Languages and Programming. Springer-Verlag, 1992.

[23] Fabio Somenzi. CUDD: CU Decision Diagram Package. University of Colorado at Boulder, 2009. Available on: http://vlsi.colorado.edu/~fabio/CUDD/.

[24] Tiphaine Turpin. Pruning program invariants. PhD thesis, Université de Rennes 1, December 2008.

[25] John Whaley. Context-Sensitive Pointer Analysis using Binary Decision Diagrams. PhD thesis, Stanford University, March 2007.

[26] John Whaley and Monica S. Lam. Cloning-based context-sensitive pointer alias analysis using binary decision diagrams. In Proc. of the conference on Programming language design and implementation (PLDI ’04). ACM Press, 2004.

A The dining philosophers

We reproduce below the logic program corresponding to our “dining philosophers” example. Our language features regular types, but it should be self-explanatory. Note the use of the subReachable predicate, which ensures the termination of recursive predicates on lists.

type philosopher =
  | thinking
  | hasLeftFork
  | eating

type fork =
  | free
  | used

type element = { philosopher : philosopher; rightFork : fork }

type table = element list

predicate initial(element)
initial({philosopher = thinking ; rightFork = free}).

predicate reachable(table)
reachable([P1, P2, P3]) :- initial(P1), initial(P2), initial(P3).

predicate subReachable(table)
subReachable(T) :- reachable(T).
subReachable(T) :- subReachable([_ | T]).

predicate transition(table, table)
predicate specialTransition(table, table)

transition([{philosopher = P ; rightFork = free}, {philosopher = thinking ; rightFork = F} | T],
           [{philosopher = P ; rightFork = used}, {philosopher = hasLeftFork ; rightFork = F} | T]).
transition([{philosopher = hasLeftFork ; rightFork = free} | T],
           [{philosopher = eating ; rightFork = used} | T]).
transition([{philosopher = P ; rightFork = used}, {philosopher = eating ; rightFork = used} | T],
           [{philosopher = P ; rightFork = free}, {philosopher = thinking ; rightFork = free} | T]).
transition([P | T], [P | T1]) :- subReachable([P | T]), transition(T, T1).

predicate takeLastFork(table, table)
takeLastFork([{philosopher = P ; rightFork = free}], [{philosopher = P ; rightFork = used}]).
takeLastFork([P | T], [P | T1]) :- subReachable([P | T]), takeLastFork(T, T1).

predicate releaseLastFork(table, table)
releaseLastFork([{philosopher = P ; rightFork = used}], [{philosopher = P ; rightFork = free}]).
releaseLastFork([P | T], [P | T1]) :- subReachable([P | T]), releaseLastFork(T, T1).

specialTransition([{philosopher = thinking ; rightFork = F} | T],
                  [{philosopher = hasLeftFork ; rightFork = F} | T1]) :- takeLastFork(T, T1).
specialTransition([{philosopher = eating ; rightFork = used} | T],
                  [{philosopher = thinking ; rightFork = free} | T1]) :- releaseLastFork(T, T1).

reachable(T1) :- reachable(T), transition(T, T1).
reachable(T1) :- reachable(T), specialTransition(T, T1).

predicate waiting(element)
waiting({philosopher = hasLeftFork ; rightFork = used}).

predicate stuck(table)
stuck([]).
stuck([P | T]) :- subReachable([P | T]), waiting(P), stuck(T).

predicate deadLock(table)