2.3 Application Dynamics and Consequences
3.1.4 Reasoning and its Computational Complexity
A key feature of DLs is the possibility to reason about the axioms and assertions in a KB and to infer additional implicit knowledge, thereby making it explicit. In general, what can be inferred (the consequences) depends on (i) the rules of inference defined by a (description) logic and (ii) what is known already (the premises). The rules of inference can be defined so that the set of consequences changes either monotonically or non-monotonically with the set of premises. A (description) logic is said to be monotonic if its rules of inference do not allow the set of consequences to shrink when new premises are added; otherwise it is non-monotonic. Simply put, monotonicity means that learning a new piece of knowledge cannot reduce what is implicitly known by inference. Most DLs, in particular SHOIN and SROIQ, are monotonic if interpreted under OWA, whereas interpretation under CWA results in non-monotonicity, since a negative consequence ¬ψ is no longer entailed once a positive assertion/axiom ψ is learned (added to a KB).
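The difference between the two assumptions can be illustrated with a minimal sketch. It is a toy model over ground atoms only, not a DL reasoner; the function names and the example atoms are hypothetical and chosen for illustration.

```python
def entails_open(kb, atom, negated=False):
    """Toy OWA: only asserted atoms are entailed; absence never yields a negation."""
    if negated:
        return False  # this toy KB has no means of stating a negative fact
    return atom in kb

def entails_closed(kb, atom, negated=False):
    """Toy CWA: every atom that is not asserted is assumed to be false."""
    return (atom not in kb) if negated else (atom in kb)

def consequences(kb, atoms, entails):
    """All positive and negative literals over `atoms` that `kb` entails."""
    result = set()
    for atom in atoms:
        if entails(kb, atom):
            result.add(atom)
        if entails(kb, atom, negated=True):
            result.add("not " + atom)
    return result

atoms = ["Bird(tweety)", "Penguin(tweety)"]
kb = {"Bird(tweety)"}
bigger = kb | {"Penguin(tweety)"}  # one new premise is learned

open_before = consequences(kb, atoms, entails_open)
open_after = consequences(bigger, atoms, entails_open)
closed_before = consequences(kb, atoms, entails_closed)
closed_after = consequences(bigger, atoms, entails_closed)

print(open_before <= open_after)      # consequences only grow under the toy OWA
print(closed_before <= closed_after)  # "not Penguin(tweety)" is lost under the toy CWA
```

Under the toy CWA the literal "not Penguin(tweety)" is a consequence of the smaller KB but not of the larger one, which is exactly the reduction of consequences that monotonicity rules out.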
In the following, we briefly introduce the main reasoning tasks in DLs without going into detail on reasoning algorithms, as their internals are not of importance throughout this thesis. Main results on their worst-case computational complexity for SHOIN and SROIQ are given at the end of this section.
Standard Reasoning Tasks
Several reasoning tasks (problems) exist for TBoxes and ABoxes. Reasoning over an ABox can further be done in isolation or w.r.t. a TBox. We start with TBox reasoning tasks, which fall into the category of terminological reasoning.
First, when conceptualizing a domain, it is often necessary to find out whether concepts are contradictory and whether they actually make sense. Intuitively, a concept C makes sense w.r.t. a set of interrelated concepts (a TBox) if there exists an interpretation I that satisfies each concept (cf. Definition 3.6) and where C has at least one individual as a member in I. Such a concept is said to be satisfiable w.r.t. the TBox and unsatisfiable otherwise. Other important reasoning tasks over TBoxes are whether one concept subsumes another one, whether two concepts are equivalent, and whether two concepts are disjoint. These tasks can be transferred analogously to roles. Formally, they are defined as follows.
Definition 3.12 (TBox Reasoning Problems). Let C, D be concepts, T a TBox, and A an ABox.
• Concept Satisfiability: C is satisfiable w.r.t. T iff there exists a model I of T such that C^I is nonempty.
• Concept Subsumption and Equivalence: C is subsumed by (resp. equivalent to) D w.r.t. T iff C^I ⊆ D^I (resp. C^I = D^I) for every model I of T, written T ⊨ C ⊑ D (resp. T ⊨ C ≡ D).¹²
• Concept Disjointness: C and D are disjoint w.r.t. T iff C^I ∩ D^I = ∅ for every model I of T.
In DLs that provide the concept intersection constructor (⊓) and the unsatisfiable concept (⊥), all problems above can be reduced to concept subsumption. This means that it is sufficient to implement subsumption checking in order to implement the others. Similarly, these problems can be reduced to unsatisfiability for DLs having concept intersection (⊓) and negation (¬).
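For reference, the reductions mentioned above can be spelled out as follows. These are the standard equivalences, written with the symbols of Definition 3.12; the first three reduce to subsumption, the last one reduces subsumption to unsatisfiability.

```latex
\begin{align*}
C \text{ is unsatisfiable w.r.t. } T
  &\iff T \models C \sqsubseteq \bot \\
T \models C \equiv D
  &\iff T \models C \sqsubseteq D \text{ and } T \models D \sqsubseteq C \\
C \text{ and } D \text{ are disjoint w.r.t. } T
  &\iff T \models C \sqcap D \sqsubseteq \bot \\
T \models C \sqsubseteq D
  &\iff C \sqcap \neg D \text{ is unsatisfiable w.r.t. } T
\end{align*}
```

Equivalence and disjointness then follow from unsatisfiability as well: C ≡ D holds iff both C ⊓ ¬D and D ⊓ ¬C are unsatisfiable, and C and D are disjoint iff C ⊓ D is unsatisfiable.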
¹² The relation symbol ⊨ is understood either as “satisfies” or as “entails”, depending on whether the left-hand side is an interpretation or a set of axioms and/or assertions.
The task of computing the entire subsumption hierarchy of parents and children of each named concept in a TBox is the so-called classification process. It is an important feature for verification and graphical visualization of a conceptualization.
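Classification can be sketched as repeated subsumption checking followed by removal of indirect parents. The sketch below is a naive illustration under strong assumptions: the subsumption test is a hypothetical oracle, here backed only by the reflexive-transitive closure of told subsumptions between named concepts. Real reasoners use far fewer subsumption tests.

```python
from itertools import product

def told_closure(names, told):
    """Reflexive-transitive closure of told subsumptions given as (sub, sup) pairs."""
    subs = {(c, c) for c in names} | set(told)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that the set can grow inside the loop.
        for (a, b), (c, d) in product(list(subs), repeat=2):
            if b == c and (a, d) not in subs:
                subs.add((a, d))
                changed = True
    return subs

def classify(names, is_subsumed_by):
    """Map every named concept to its direct parents in the subsumption hierarchy."""
    def strictly_below(c, d):
        return c != d and is_subsumed_by(c, d) and not is_subsumed_by(d, c)

    parents = {}
    for c in names:
        ancestors = {d for d in names if strictly_below(c, d)}
        # A direct parent has no other ancestor of c strictly below it.
        parents[c] = {d for d in ancestors
                      if not any(strictly_below(e, d) for e in ancestors)}
    return parents

names = ["Animal", "Bird", "Penguin", "Fish"]
told = [("Bird", "Animal"), ("Penguin", "Bird"), ("Fish", "Animal")]
closure = told_closure(names, told)
hierarchy = classify(names, lambda c, d: (c, d) in closure)
print(hierarchy)
```

Here Penguin gets Bird as its only direct parent, although Animal is also among its subsumers.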
Standard reasoning tasks in the ABox (and w.r.t. a TBox) are instance checking, consistency checking, the retrieval problem, and its dual, the realization problem. Similar to the reduction of TBox reasoning tasks to concept subsumption (or satisfiability), the latter three can all be accomplished by reduction to instance checking.
Definition 3.13 (ABox Reasoning Problems). Let C be a concept, α an ABox assertion, T a TBox, and A an ABox.
• Instance Checking: A entails α w.r.t. T (or α is a consequence of A w.r.t. T) if every interpretation that satisfies T and A also satisfies α, written T, A ⊨ α.
• Retrieval Problem: Find all individuals a such that T, A ⊨ C(a); analogous for roles.
• Realization Problem: Given an individual a and a set of concepts 𝒞, find the most specific concepts C_i ∈ 𝒞 (i.e., there is no C′ ∈ 𝒞 such that C′ ≠ C_i and C′ ⊑ C_i) such that T, A ⊨ C_i(a); analogous for roles.
If the TBox is empty (e.g., when it is acyclic and has been compiled away), then we can drop T in Definition 3.13. Finally, it is worth mentioning that ABox instance checking for negated assertions can be polynomially reduced to ABox consistency without negated assertions, and vice versa [Mil08, Section 2.2]. Formally, if ϕ is either C(a) or R(a, b), then T, A ⊨ ¬ϕ iff A ∪ {ϕ} is inconsistent w.r.t. T.
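The reduction of retrieval and realization to instance checking can be sketched as follows. The instance checker is a hypothetical toy oracle that only handles atomic concepts, asserted memberships, and an already closed set of subsumptions; all names and data are illustrative. Realization is read here in the intended sense of "most specific among the concepts the individual is an instance of".

```python
def make_instance_checker(abox, subsumptions):
    """Toy oracle for T, A |= C(a) over atomic concepts.

    `abox` holds (concept, individual) pairs; `subsumptions` holds (sub, sup)
    pairs and is assumed to be reflexively and transitively closed.
    """
    def entails(concept, individual):
        return any((asserted, concept) in subsumptions
                   for asserted, ind in abox if ind == individual)
    return entails

def retrieve(concept, individuals, entails):
    """Retrieval: one instance check per known individual."""
    return {a for a in individuals if entails(concept, a)}

def realize(individual, concepts, entails, is_subsumed_by):
    """Realization: most specific concepts the individual is an instance of."""
    types = {c for c in concepts if entails(c, individual)}
    return {c for c in types
            if not any(d != c and is_subsumed_by(d, c) for d in types)}

concepts = ["Animal", "Bird", "Penguin"]
subsumptions = {(c, c) for c in concepts} | {
    ("Bird", "Animal"), ("Penguin", "Bird"), ("Penguin", "Animal")}
abox = {("Penguin", "tweety"), ("Bird", "polly")}
individuals = {"tweety", "polly"}

entails = make_instance_checker(abox, subsumptions)
is_subsumed_by = lambda c, d: (c, d) in subsumptions

birds = retrieve("Bird", individuals, entails)
tweety_types = realize("tweety", concepts, entails, is_subsumed_by)
polly_types = realize("polly", concepts, entails, is_subsumed_by)
print(birds, tweety_types, polly_types)
```

Both functions call the oracle once per candidate, which is why an efficient instance check is the critical building block in this reduction.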
Queries as Advanced Instance Retrieval
The retrieval problem can be seen as a very limited querying facility for knowledge bases. Data-intensive applications, however, usually place demanding requirements on querying facilities. More powerful instance retrieval can be done using so-called conjunctive ABox queries [CGL98, HT00] of the form
\[ q := \alpha_1 \wedge \dots \wedge \alpha_n \]
where α_i is an atom of the form C(x) or R(x, y), C is a concept, and R is either a simple but possibly inverse abstract role or a concrete role (concrete roles do not have inverses). Let V_V be a finite set of variable names disjoint from V_I and V_LS. Then x is either an individual x ∈ V_I or a variable x ∈ V_V, and y is either a variable y ∈ V_V, an individual y ∈ V_I if R is an abstract role, or a lexical form y ∈ V_LS if R is a concrete role.¹³ Note that in this form we allow for (i) direct use of individuals in atoms analogous to [KRH07] and (ii) the use of data values and concrete roles analogous to [HM05]. These are unproblematic extensions compared to the original form considered in [CGL98, HT00]. As usual, a variable is represented by a symbol and can be substituted by a value. Variables are partitioned into distinguished variables, also called answer or solution set variables (i.e., the substituted value is part of a solution), and existentially quantified undistinguished variables that are not part of a solution. A query without distinguished variables is called a Boolean query because it can only be used to test whether “something” is entailed by an ABox (true) or not (false).

¹³ The general concept of a conjunctive query refers to the class of FOL formulas that can be built from atomic formulas, the conjunction connective, and the existential quantifier; that is, a conjunctive query is of the form x_1, …, x_k, ∃x_{k+1}, …, ∃x_l (α_1 ∧ ⋯ ∧ α_m), where x_1, …, x_k are free (distinguished) variables, x_{k+1}, …, x_l are bound (undistinguished) variables, and α_1, …, α_m are atomic formulas (i.e., n-ary predicates over constants and the variables x_1, …, x_l).

Let Var(q) be the set of variables occurring in a conjunctive ABox query q. Let A be an ABox, I a model of A, Var_F ⊆ Var(q) the set of variables that occur at the filler position of a concrete role R (i.e., if there is an α = R(x, y) where R is a concrete role and y ∈ Var(q), then y ∈ Var_F), and π : Var(q) ∪ V_I ∪ V_LS → Δ^I ∪ Δ^D a total function such that
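\[
\pi(x) =
\begin{cases}
a^{I} \text{ with } a \in V_I & \text{if } x \in \mathit{Var}(q),\ x \notin \mathit{Var}_F, \text{ and } x \text{ distinguished} \\
l^{D} \text{ with } l \in V_{LS} & \text{if } x \in \mathit{Var}(q),\ x \in \mathit{Var}_F, \text{ and } x \text{ distinguished} \\
\varepsilon \in \Delta^{I} & \text{if } x \in \mathit{Var}(q),\ x \notin \mathit{Var}_F, \text{ and } x \text{ undistinguished} \\
\varepsilon \in \Delta^{D} & \text{if } x \in \mathit{Var}(q),\ x \in \mathit{Var}_F, \text{ and } x \text{ undistinguished} \\
x^{I} & \text{if } x \text{ is an individual name } x \in V_I \\
x^{D} & \text{if } x \text{ is a lexical form } x \in V_{LS}.
\end{cases}
\]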
π is called a match (solution) for I and q if I ⊨_π α for every α ∈ q, where for a concept membership assertion C(x) and a role membership assertion R(x, y)

\[ I \models_\pi C(x) \text{ if } \pi(x) \in C^{I} \quad \text{and} \quad I \models_\pi R(x, y) \text{ if } (\pi(x), \pi(y)) \in R^{I}. \]

If there is a match π for I and q, then it is also said that I satisfies q w.r.t. π, written I ⊨_π q. The query entailment problem (QEP) is deciding whether all models I of a knowledge base K also satisfy q for some match π, written K ⊨ q. In fact, query entailment is the decision procedure used for Boolean queries. The solutions of a query with distinguished variables are tuples of individual names or lexical forms, where each tuple is obtained by substituting the distinguished variables according to a match entailed by K. Finding all solutions corresponds to the query answering problem (QAP) [GLHS08]. QEP and QAP can be mutually reduced [CGL98, HT00].
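The notion of a match can be illustrated by enumerating all candidate functions π over one finite interpretation. Note the limits of this sketch: it checks I ⊨_π q for a single given interpretation, whereas query entailment quantifies over all models of K. Concrete roles and inverse roles are left out, individual names are assumed to denote themselves, and all function and data names are hypothetical.

```python
from itertools import product

def holds(atom, pi, concepts, roles):
    """Check one atom under pi; terms not bound by pi are individual names."""
    name, *terms = atom
    values = tuple(pi.get(t, t) for t in terms)
    if len(values) == 1:
        return values[0] in concepts.get(name, set())
    return values in roles.get(name, set())

def answers(query, distinguished, undistinguished, domain, concepts, roles):
    """Tuples for the distinguished variables, one per match of the query."""
    variables = list(distinguished) + list(undistinguished)
    result = set()
    for values in product(sorted(domain), repeat=len(variables)):
        pi = dict(zip(variables, values))
        if all(holds(atom, pi, concepts, roles) for atom in query):
            result.add(tuple(pi[v] for v in distinguished))
    return result

# A small interpretation: extensions of two concepts and one abstract role.
domain = {"ann", "bob", "cs101"}
concepts = {"Professor": {"ann"}, "Course": {"cs101"}}
roles = {"teaches": {("ann", "cs101")}}

# q := Professor(x) AND teaches(x, y) AND Course(y)
query = [("Professor", "x"), ("teaches", "x", "y"), ("Course", "y")]

solutions = answers(query, ["x"], ["y"], domain, concepts, roles)
boolean = answers(query, [], ["x", "y"], domain, concepts, roles)
print(solutions, bool(boolean))
```

With x distinguished and y undistinguished, only the value of x appears in a solution. With no distinguished variables the result is either empty or contains the empty tuple, which mirrors the true/false reading of Boolean queries.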
Finally, the query containment problem (or query subsumption) is the reasoning task of deciding whether a conjunctive query always has at least the matches that another query has. Formally, given a DL L, a query q is subsumed by a query q′ w.r.t. an L-KB K = (T, A), denoted K ⊨ q ⊑ q′, iff for every L-ABox A′ and the KB K′ = (T, A′) it holds that the solution set of q is a subset of the solution set of q′. Observe that this assumes that q and q′ share the same set of distinguished variables.
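For the special case of an empty TBox and atoms over atomic concepts and plain roles, containment can be tested with the classical canonical-database technique from relational database theory: the variables of q are frozen into fresh constants, and q′ is evaluated over the resulting facts. The sketch below covers only this special case and is not a procedure for the expressive DLs discussed here. The "?" prefix for variables and the "c_" prefix for frozen constants are conventions assumed for the example.

```python
from itertools import product

def is_var(term):
    return term.startswith("?")

def evaluate(query, head, facts):
    """Answers of `query` over a finite set of facts such as ("C", a) or ("R", a, b)."""
    constants = sorted({t for fact in facts for t in fact[1:]})
    variables = sorted({t for atom in query for t in atom[1:] if is_var(t)})
    result = set()
    for values in product(constants, repeat=len(variables)):
        pi = dict(zip(variables, values))
        image = {(atom[0],) + tuple(pi.get(t, t) for t in atom[1:])
                 for atom in query}
        if image <= facts:
            result.add(tuple(pi[v] for v in head))
    return result

def contained_in(q, q_prime, head):
    """Test q ⊑ q' for an empty TBox; `head` lists the shared distinguished variables."""
    def freeze(term):
        # Assumes that no real constant already starts with "c_".
        return "c_" + term[1:] if is_var(term) else term

    canonical = {(atom[0],) + tuple(freeze(t) for t in atom[1:]) for atom in q}
    frozen_head = tuple(freeze(v) for v in head)
    return frozen_head in evaluate(q_prime, head, canonical)

q = [("Professor", "?x"), ("teaches", "?x", "?y"), ("Course", "?y")]
q_prime = [("teaches", "?x", "?y")]

print(contained_in(q, q_prime, ["?x"]))   # q is the more restrictive query
print(contained_in(q_prime, q, ["?x"]))   # the converse fails
```

Every distinguished variable must occur in both queries for this sketch to work, in line with the observation above that q and q′ share their distinguished variables.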
Computational Complexity of Reasoning Tasks
The worst-case computational complexity of subsumption reasoning for SHOIN is known to be intractable since it is NExpTime-complete [Tob01]. In the general case, SROIQ is even harder, namely N2ExpTime-complete, but NExpTime under the syntactic […] role inclusion axioms R_1 ∘ ⋯ ∘ R_n ⊑ R. For a summary of complexity results for various sublanguages of SHOIN, SROIQ, and other DLs see [Zol].
Complexity of conjunctive query answering is analyzed as a function of the size of the query only, the size of the ABox only, or both together. These measures are called query complexity, data complexity, and combined complexity, respectively. At the time of writing, decidability of query answering in SHOIN and SROIQ is still open, though there are growing indications that it is decidable [GR10]. Recently, it has been shown that combined complexity in the Horn fragments of SHOIQ and SROIQ is ExpTime-complete and 2ExpTime-complete, respectively [ORS11], which means that it is not harder than subsumption in the full versions of these DLs. Complexity of the fairly expressive sublanguage SHIQ is known to be 2ExpTime-complete in the presence of inverse roles and ExpTime-complete otherwise [Lut08].
As usual, asymptotic worst-case complexity results say little about average complexity in practice. Several studies have given empirical evidence that optimized reasoning procedures yield reasonable response times in practical settings [Hor98, HM01a, HM08]. It has also been noted that exponentially hard cases can be exponentially rare in practice [Har06]. As a response to intractable complexity results, however, less expressive DLs have been devised for which reasoning is known to be tractable and query answering can be implemented on top of conventional relational database technology (see Section 3.3). They are the result of a profound understanding of which particular interactions of modeling constructs lead to intractability. In other words, these DLs reflect the desire to reach the highest expressivity for which worst-case tractability is retained.