2.1 Existential rules framework
2.1.3 Chase and finite expansion set
In order to answer queries over a set of facts and rules, the exhaustive derivation has to be finite. A chase is a mechanism that takes an exhaustive derivation and removes what it considers “redundant” rule applications using a derivation reducer. We follow the formalization of Rocher [2016], chosen for its simplicity, to define a derivation reducer and a chase.
Definition 2.12 (Derivation reducer). Given a set of facts F and a set of rules R, a derivation reducer σ is a function that takes a rule application tuple Di = (Fi, ri, πi) in a derivation δ = ⟨D0, …, Di, …⟩ of F with respect to R and returns a rule application tuple σ(Di) = (F′i, ri, πi) such that F′i ≡ Fi.

Definition 2.13 (σ-chase). Given a set of facts F, a set of rules R, a derivation reducer σ, and an exhaustive breadth-first derivation δ = ⟨D0, …, Di, …⟩ of F with respect to R: σ-chase(F, R) = ⟨σ(D0), …, σ(Di), …⟩, where σ(Di) ∈ σ-chase(F, R) if and only if Facts(σ(Di)) ≠ Facts(σ(Di−1)).
The above definition ensures that only non-redundant, “meaningful” rule applications are kept (i.e., rule applications that generate something new according to the derivation reducer). A chase is finite if there is a breadth-first rule application step k such that for all Dj at step k, no new facts are generated [Baget et al., 2014b].
Applying a chase on a set of facts F and a set of rules R generates the saturated set of facts F∗ that contains all initial and generated facts.

Definition 2.14 (Saturated set of facts). Given a set of facts F and a set of rules R, the saturation (or equivalently, closure) of F is SatR(F) = ⋃_{D ∈ σ-chase(F, R)} Facts(D). We also refer to SatR(F) by F∗ when the set of rules R is obvious.
Saturating a set of facts F with a set of rules R until no new rule application is possible allows us to obtain the universal model. The particularity of this model is that it is representative of all models of (F ∪ R) (we denote the set of models of (F ∪ R) by models(F, R)).
Definition 2.15 (Universal model). Given a set of facts F and a set of rules R, a universal model M of (F ∪ R) is a model of (F ∪ R) such that for all models M′ of (F ∪ R), there is a homomorphism from M to M′.
It is not always possible to obtain the universal model (the saturated set of facts might be infinite); however, if the chase is finite, then the model of the saturated set of facts is a universal model [Baget et al., 2011]. Therefore, query entailment can be expressed using the notion of chase.
Theorem 2.1 (Query entailment and chase [Baget et al., 2011]). Let us consider a set of facts F, a set of rules R and a Boolean conjunctive query Q. If σ-chase(F, R) is finite, then (F ∪ R) ⊨ Q if and only if Facts(σ-chase(F, R)) ⊨ Q.
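When the chase is finite, Theorem 2.1 reduces entailment of a Boolean conjunctive query to finding a homomorphism from the query into the saturated set of facts. The following is a minimal sketch of that check, not a reference implementation. The atom encoding (a predicate name with a tuple of terms), the convention that query variables start with an uppercase letter, and the function names are assumptions made for illustration. The fact set is the saturation obtained in Example 2.7 later in this section.

```python
from typing import Dict, List, Optional, Set, Tuple

Atom = Tuple[str, Tuple[str, ...]]  # (predicate, arguments)


def is_variable(term: str) -> bool:
    # Assumed convention, applied to query terms only:
    # variables start with an uppercase letter.
    return term[:1].isupper()


def find_homomorphism(query: List[Atom], facts: Set[Atom],
                      mapping: Optional[Dict[str, str]] = None
                      ) -> Optional[Dict[str, str]]:
    """Backtracking search for a mapping of the query variables into facts."""
    mapping = dict(mapping or {})
    if not query:
        return mapping
    (pred, args), rest = query[0], query[1:]
    for fact_pred, fact_args in facts:
        if fact_pred != pred or len(fact_args) != len(args):
            continue
        candidate = dict(mapping)
        compatible = True
        for term, value in zip(args, fact_args):
            if is_variable(term):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(term, value) != value:
                    compatible = False
                    break
            elif term != value:
                compatible = False
                break
        if compatible:
            result = find_homomorphism(rest, facts, candidate)
            if result is not None:
                return result
    return None


# Saturated facts of Example 2.7 (Null1 and Null2 are labelled nulls).
saturated = {
    ("alone", ("rex",)), ("hasCollar", ("rex",)), ("hasMicrochip", ("rex",)),
    ("hasOwner", ("rex", "Null1")), ("hasOwner", ("rex", "Null2")),
    ("keep", ("rex",)),
}

# Q1 = exists X, Y: hasOwner(X, Y) and keep(X)
q1 = [("hasOwner", ("X", "Y")), ("keep", ("X",))]
# Q2 = exists X, Y: keep(X) and hasOwner(Y, X)
q2 = [("keep", ("X",)), ("hasOwner", ("Y", "X"))]

print(find_homomorphism(q1, saturated) is not None)  # Q1 is entailed
print(find_homomorphism(q2, saturated) is not None)  # Q2 is not entailed
```

The search is exponential in the size of the query in the worst case; the sketch favours readability over efficiency.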
Different kinds of chases can be defined using different derivation reducers. Each derivation reducer ensures a universal model if its chase is finite. The most common chase is the frontier chase [Baget et al., 2011]. It yields results equivalent to those of the well-known Skolem chase [Marnette, 2009], which relies on a “skolemisation” of the rules: each occurrence of an existential variable Y is replaced with a functional term f^r_Y(X⃗), where X⃗ = fr(r) are the frontier variables of r. The frontier chase and the Skolem chase yield isomorphic results [Baget et al., 2014a], in the sense that they generate exactly the same atoms, up to a bijective renaming of nulls by Skolem terms.
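To make the skolemisation step concrete, here is a small sketch that rewrites the head of a rule by replacing every existential variable with a functional term over the frontier variables. The atom encoding, the uppercase-variable convention, and the textual naming scheme `f_<rule>_<variable>(...)` for Skolem terms are assumptions made for illustration only. The rule used is r2 of Example 2.7.

```python
from typing import List, Tuple

Atom = Tuple[str, Tuple[str, ...]]  # (predicate, arguments)


def variables(atoms: List[Atom]) -> List[str]:
    """Variables (uppercase-initial terms, by assumption), in order of first occurrence."""
    seen: List[str] = []
    for _, args in atoms:
        for term in args:
            if term[:1].isupper() and term not in seen:
                seen.append(term)
    return seen


def skolemise(name: str, body: List[Atom], head: List[Atom]) -> List[Atom]:
    """Return the head with each existential variable replaced by a Skolem term."""
    body_vars = variables(body)
    head_vars = variables(head)
    frontier = [v for v in head_vars if v in body_vars]         # shared variables
    existential = [v for v in head_vars if v not in body_vars]  # head-only variables
    # Hypothetical naming scheme: one function symbol per (rule, variable) pair.
    skolem = {y: f"f_{name}_{y}({','.join(frontier)})" for y in existential}
    return [(pred, tuple(skolem.get(t, t) for t in args)) for pred, args in head]


# r2 : alone(X) and hasCollar(X) -> exists Y hasOwner(X, Y)
r2_body = [("alone", ("X",)), ("hasCollar", ("X",))]
r2_head = [("hasOwner", ("X", "Y"))]

print(skolemise("r2", r2_body, r2_head))
# [('hasOwner', ('X', 'f_r2_Y(X)'))]
```

Because the Skolem term depends only on the rule and on the image of the frontier variables, two applications of the same rule with the same frontier mapping produce the same term, which is the intuition behind the equivalence with the frontier chase.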
The frontier chase considers two rule applications redundant if their mappings of the frontier variables are the same for the same rule.
Definition 2.16 (Frontier/Skolem chase). The frontier chase σfr-chase (equivalent to the Skolem chase) relies on the frontier derivation reducer (denoted by σfr) defined as follows. For any derivation δ, σfr(D0) = D0 and, for every Di = (Fi, ri, πi) ∈ δ:

Facts(σfr(Di)) =
    Fi−1 ∪ πi^safe(Head(ri))    if for every j < i with ri = rj, πj|fr(rj)(Body(rj)) ≠ πi|fr(ri)(Body(ri)),
    Fi−1                         otherwise.
Example 2.7 (Frontier chase). Consider the following set of facts F and set of rules R, inspired by the thesis of Hecham [2018], stating that an animal shelter would keep a dog found alone if it has an owner. If it has a collar or a microchip then it has an owner. A dog named “Rex” with a collar and a microchip is found alone.
• F = {alone(rex), hasCollar(rex), hasMicrochip(rex)}
• R = {r1 : ∀X, Y hasOwner(X, Y) → keep(X),
  r2 : ∀X alone(X) ∧ hasCollar(X) → ∃Y hasOwner(X, Y),
  r3 : ∀X alone(X) ∧ hasMicrochip(X) → ∃Y hasOwner(X, Y)}
A possible frontier chase of F and R is:
σfr-chase(F, R) = ⟨(F, ∅, ∅),
(F1 = F ∪ {hasOwner(rex, Null1)}, r2, π1 = {X → rex}),
(F2 = F1 ∪ {hasOwner(rex, Null2)}, r3, π2 = {X → rex}),
(F3 = F2 ∪ {keep(rex)}, r1, π3 = {X → rex, Y → Null1})⟩.

First, r2 is applied on {alone(rex), hasCollar(rex)} and generates ∃Y hasOwner(rex, Y), which is not redundant since r2 has never been applied before; therefore F1 = F0 ∪ {hasOwner(rex, Null1)}. Then r3 is applied on {alone(rex), hasMicrochip(rex)} and generates ∃Y hasOwner(rex, Y), which is also not redundant because r3 has never been applied before (even if it generates the same atom as r2); therefore F2 = F1 ∪ {hasOwner(rex, Null2)}. Afterwards, r1 is applied on {hasOwner(rex, Null1)} and generates {keep(rex)}, which is not redundant as r1 has never been applied before; therefore F3 = F2 ∪ {keep(rex)}. Finally, r1 is applied on the set of facts {hasOwner(rex, Null2)} with the homomorphism π4 = {X → rex, Y → Null2} and generates {keep(rex)}, which is redundant since this rule application reuses the same rule and the same frontier mapping as the rule application on {hasOwner(rex, Null1)} (i.e., π4|fr(r1) = π3|fr(r1) = {X → rex}). Since any additional rule application would be redundant (all rules have been applied with all possible homomorphisms), the frontier chase stops.
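The derivation of Example 2.7 can be replayed with a short program. The following is a simplified sketch of a breadth-first frontier chase, written under several assumptions: atoms are encoded as a predicate name with a tuple of terms, variables in rules start with an uppercase letter, fresh nulls are named Null1, Null2, and so on, and a `max_rounds` bound (a hypothetical safety parameter) stops the loop if no fixpoint is reached. It treats a rule application as redundant when the same rule has already been applied with the same mapping of its frontier variables, following the intuition of Definition 2.16.

```python
from itertools import count
from typing import Dict, Iterator, List, Tuple

Atom = Tuple[str, Tuple[str, ...]]         # (predicate, arguments)
Rule = Tuple[str, List[Atom], List[Atom]]  # (name, body, head)


def is_var(term: str) -> bool:
    # Assumed convention for terms occurring in rules only.
    return term[:1].isupper()


def homomorphisms(body: List[Atom], facts: List[Atom],
                  mapping: Dict[str, str]) -> Iterator[Dict[str, str]]:
    """Yield every mapping of the body variables that sends the body into facts."""
    if not body:
        yield mapping
        return
    (pred, args), rest = body[0], body[1:]
    for fact_pred, fact_args in facts:
        if fact_pred != pred or len(fact_args) != len(args):
            continue
        extended = dict(mapping)
        if all(extended.setdefault(t, v) == v if is_var(t) else t == v
               for t, v in zip(args, fact_args)):
            yield from homomorphisms(rest, facts, extended)


def frontier_chase(facts: List[Atom], rules: List[Rule], max_rounds: int = 10):
    facts = list(facts)
    fresh = count(1)
    seen = set()   # (rule name, frontier mapping) pairs already applied
    kept = []      # non-redundant rule applications, in order
    for _ in range(max_rounds):
        snapshot = list(facts)  # breadth-first: match against the previous round
        applied = False
        for name, body, head in rules:
            body_vars = {t for _, args in body for t in args if is_var(t)}
            head_vars = {t for _, args in head for t in args if is_var(t)}
            frontier = sorted(body_vars & head_vars)
            for pi in homomorphisms(body, snapshot, {}):
                key = (name, tuple((x, pi[x]) for x in frontier))
                if key in seen:
                    continue  # same rule, same frontier mapping: redundant
                seen.add(key)
                safe = dict(pi)
                for y in sorted(head_vars - body_vars):
                    safe[y] = f"Null{next(fresh)}"  # fresh null per existential
                for pred, args in head:
                    atom = (pred, tuple(safe.get(t, t) for t in args))
                    if atom not in facts:
                        facts.append(atom)
                kept.append((name, pi))
                applied = True
        if not applied:
            return facts, kept
    raise RuntimeError("no fixpoint within max_rounds; the chase may be infinite")


F = [("alone", ("rex",)), ("hasCollar", ("rex",)), ("hasMicrochip", ("rex",))]
R = [
    ("r1", [("hasOwner", ("X", "Y"))], [("keep", ("X",))]),
    ("r2", [("alone", ("X",)), ("hasCollar", ("X",))], [("hasOwner", ("X", "Y"))]),
    ("r3", [("alone", ("X",)), ("hasMicrochip", ("X",))], [("hasOwner", ("X", "Y"))]),
]

saturated, applications = frontier_chase(F, R)
print([name for name, _ in applications])  # ['r2', 'r3', 'r1']
```

Facts are kept in a list rather than a set so that the order of rule applications is deterministic and matches the order used in the example; a production implementation would index facts by predicate instead.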
Even if the frontier reducer removes some redundant rule applications, the frontier chase might be infinite, as shown in the following Example 2.8.

Example 2.8 (Infinite frontier chase). Consider the set of facts F and the set of rules R containing one fact and one rule, respectively.
• F ={p(a)}
• R ={r1 : ∀X p(X ) → ∃Y p(Y )}
A possible frontier chase of F and R is:
σfr-chase(F, R) = ⟨(F, ∅, ∅),
(F1 = F ∪ {p(Null1)}, r1, π1 = {X → a}),
(F2 = F1 ∪ {p(Null2)}, r1, π2 = {X → Null1}),
(F3 = F2 ∪ {p(Null3)}, r1, π3 = {X → Null2}), …⟩.

First, r1 is applied using π1 and generates ∃Y p(Y), which is not redundant since r1 has never been applied before; therefore F1 = F0 ∪ {p(Null1)}. Then r1 is applied on {p(Null1)} using π2 and generates ∃Y p(Y), which is not redundant since π2|fr(r1) = {X → Null1} ≠ π1|fr(r1) = {X → a}; therefore F2 = F1 ∪ {p(Null2)}, and so on infinitely.
Some derivation reducers are “stronger” than others, in the sense that their chase might stop in cases where others do not. This is known as the reducer order relation.
Definition 2.17 (Reducer order relation [Rocher, 2016]). Given two derivation reducers σ1 and σ2, we say that σ1 is weaker than σ2 (denoted by σ1 ≤ σ2) if for any set of rules R and set of facts F, if σ1-chase is finite then σ2-chase is also finite. Furthermore, we say that σ1 is strictly weaker than σ2 if σ1 ≤ σ2 and σ2 ≰ σ1.
In the literature, there are four well-known types of chase: the Oblivious chase (σobl-chase) [Calì et al., 2013], the Skolem/Frontier chase (σfr-chase) [Marnette, 2009; Baget et al., 2011], the Restricted chase (σres-chase) [Fagin et al., 2005], and the Core chase (σcore-chase) [Deutsch et al., 2008].
Proposition 2.1 (Chases finiteness order [Onet, 2013; Rocher, 2016]). The following relations hold: σobl ≤ σfr ≤ σres ≤ σcore.
It is well known that query entailment using a chase is undecidable (the chase might be infinite) [Beeri and Vardi, 1981], even under strong restrictions such as using a single rule or restricting to binary predicates with no constants. However, some restrictions on the set of rules can ensure decidability for a specific type of chase. These restrictions are classified into three broad categories known as “abstract classes”. The first one is “Finite Expansion Set” (FES) [Baget et al., 2014b], which ensures that a finite universal model of the knowledge base exists and can be generated using a chase. For each chase we can define its FES class: oblivious-FES, skolem-FES, restricted-FES, and core-FES. The second class is called “Finite Unification Set” (FUS) [Baget et al., 2011], which guarantees that some backward chaining method halts. Finally, the class called “Greedy Bounded Treewidth Set” (GBTS) [Baget et al., 2011] ensures that the potentially infinite universal model of a knowledge base has a bounded treewidth. Each abstract class has a set of “concrete classes” that classify rules based on their syntactic properties; e.g., the concrete class Datalog describes rules that do not contain existentially quantified variables. The following Figure 2.1 shows the most studied concrete classes in the literature and the relations between them: an upward edge going from a class C to a class C′ means that any set of rules in class C is also in class C′.
In this thesis we rely mainly on the frontier chase to reason with existential rules; for simplicity, we will only give examples and intuitions about concrete classes of skolem-FES.² Restricting ourselves to the frontier chase, and subsequently to the skolem-FES classes of rules, is not a very restrictive constraint since most studied concrete FES classes are skolem-FES (cf. Figure 2.2).

² For more information about these concrete classes, see the work of Baget et al. [2011]. The online tool Kiabora (http://graphik-team.github.io/graal/downloads/kiabora-online) automatically checks whether a set of rules is skolem-FES.

[Figure 2.1 diagram; labels: MFA, Super-weak-acyclic, Jointly-acyclic, Weakly-acyclic, aGRD, Datalog, Weakly-sticky, W-sticky-join, Sticky-join, Sticky, Domain-restricted, Linear, Jointly-fg, Glut-fg, Weakly-frontier-guarded, Weakly-guarded, Frontier-guarded, Guarded, Frontier-1, FES, FUS, GBTS]
Figure 2.1: Abstract and known concrete classes of existential rules [Baget et al., 2011; Rocher, 2016]

[Figure 2.2 diagram; labels: MFA, Super-weak-acyclic, Jointly-acyclic, Weakly-acyclic, aGRD, Datalog, Skolem-FES, Oblivious-FES]
Figure 2.2: Known concrete FES classes and chases finiteness (all skolem-FES concrete classes are restricted-FES and core-FES)

A concrete class is simply a syntactic distinction of rules. The most basic skolem-FES concrete class is the Datalog class (also known as Range Restricted [Abiteboul et al., 1995]), which contains the rules without existential quantifiers. Another simple class is the aGRD class (Acyclic Graph of Rule Dependency) [Baget et al., 2014a]. A Graph of Rule Dependency is a directed graph that encodes possible interactions between rules: the nodes represent the rules and there is an edge from a node r1 to r2 if and only if an application of the rule r1 may create a new application of the rule r2. A GRD is acyclic when it has no circuit. The notions of “weak acyclicity” [Marnette, 2009] and “joint acyclicity” [Krötzsch and Rudolph, 2011] are based on the position of the predicate and the existential and frontier variables. The MFA class (Model Faithful Acyclicity) [Grau et al., 2013] relies on detecting a specific set of facts called critical instance. The following Example 2.9 provides some rules that are skolem-FES.
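The two simplest membership tests described above can be sketched in a few lines. The first function checks the Datalog condition (no head variable is absent from the body); the second checks that a Graph of Rule Dependency has no circuit using a depth-first search. This is only an illustration under stated assumptions: the atom encoding and the uppercase-variable convention are assumed, and the dependency edges are supplied by hand, since computing them requires a unification procedure that is not shown here. The edges r2 → r1 and r3 → r1 are those of the rules of Example 2.7, and the cyclic graph is purely hypothetical.

```python
from typing import Dict, List, Tuple

Atom = Tuple[str, Tuple[str, ...]]  # (predicate, arguments)


def is_datalog(body: List[Atom], head: List[Atom]) -> bool:
    """True if every head variable also occurs in the body (no existential variable)."""
    body_vars = {t for _, args in body for t in args if t[:1].isupper()}
    head_vars = {t for _, args in head for t in args if t[:1].isupper()}
    return head_vars <= body_vars


def is_acyclic(grd: Dict[str, List[str]]) -> bool:
    """Depth-first search for a circuit in a graph given as adjacency lists."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in grd}

    def visit(node: str) -> bool:
        colour[node] = GREY
        for succ in grd.get(node, []):
            state = colour.get(succ, WHITE)
            if state == GREY:
                return False  # back edge: a circuit exists
            if state == WHITE and not visit(succ):
                return False
        colour[node] = BLACK
        return True

    return all(visit(n) for n in grd if colour[n] == WHITE)


# Rules of Example 2.7: r1 is Datalog, r2 has an existential variable Y.
print(is_datalog([("hasOwner", ("X", "Y"))], [("keep", ("X",))]))    # True
print(is_datalog([("alone", ("X",)), ("hasCollar", ("X",))],
                 [("hasOwner", ("X", "Y"))]))                       # False

# Hand-written dependency edges for Example 2.7: r2 and r3 may trigger r1.
grd_example = {"r1": [], "r2": ["r1"], "r3": ["r1"]}
# Hypothetical graph in which two rules may trigger each other.
grd_cyclic = {"ra": ["rb"], "rb": ["ra"]}

print(is_acyclic(grd_example))  # True
print(is_acyclic(grd_cyclic))   # False
```

For a complete classification of a rule set, a dedicated tool such as the Kiabora checker mentioned in this section is the appropriate choice; the sketch only conveys the shape of the two tests.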
Example 2.9 (Skolem-FES rules). Consider the following sets of rules:
• R1 = {∀X, Y, Z p(X, Z) ∧ p(Z, Y) → p(X, Y)} is range restricted (Datalog).
• R2 = {∀X, Y siblingOf(X, Y) → ∃Z parentOf(Z, X, Y)} is aGRD.
• R3 = {r1 : ∀X, Y p(X, Y) → ∃Z r(Y, Z),
  r2 : ∀X, Y r(X, Y) → p(Y, X)}. {r1} is aGRD and {r2} is range restricted; however, R3 is weakly-acyclic and is neither aGRD nor range restricted.
• R4 = {r1 : ∀X, Y p(X, Y) → ∃Z r(Y, Z),
  r2 : ∀X, Y r(X, Y) ∧ r(Y, X) → p(X, Y)}. {r1} is aGRD and {r2} is range restricted; however, R4 is jointly-acyclic.
• R5 = {r1 : ∀X q(X) → ∃Y p(X, Y) ∧ p(Y, X) ∧ p(X, X),
  r2 : ∀X p(X, X) → r(X),
  r3 : ∀X r(X) → q(X)}. {r1} alone is aGRD and {r2, r3} is range restricted; however, R5 is super-weakly-acyclic.
• R6 = {∀X, Y p(X, Y) → ∃Z, T q(Y, Z) ∧ p(Z, T)} is model-faithful-acyclic.

Not all concrete classes are created equal: some might have higher complexity for query answering, and applying a chase on these classes would require more time. In the next section we recall the definitions for some complexity classes and describe the complexity of CQ entailment for the skolem-FES concrete classes.