BIROn - Birkbeck Institutional Research Online
Artale, A. and Ryzhikov, V. and Kontchakov, Roman (2012) DL-lite with
attributes and datatypes. In: de Raedt, L. and Bessiere, C. and Dubois, D.
and Doherty, P. and Frasconi, P. and Heintz, F. and Lucas, P. (eds.) Frontiers
in Artificial Intelligence and Applications. Amsterdam, The Netherlands: IOS
Press, pp. 61-66. ISBN 9781614990970.
Downloaded from:
Usage Guidelines:
Please refer to usage guidelines at
or alternatively
BIROn -
B
irkbeck
I
nstitutional
R
esearch
On
line
Enabling open access to Birkbeck’s published research output
DL-lite with attributes and datatypes
Book chapter
(Published Version)
http://eprints.bbk.ac.uk/5784
Citation:
© 2012 IOS Press
Publisher version
______________________________________________________________
All articles available through Birkbeck ePrints are protected by intellectual property law, including copyright law. Any use made of the contents should comply with the relevant law.
______________________________________________________________
Deposit Guide
Contact: [email protected]
Birkbeck ePrints
Birkbeck ePrints
Artale, A., Ryzhikov, V. and Kontchakov, R. (2012) DL-lite with attributes
and datatypes -
De Raedt, L. et al (Eds.) -
Frontiers in Artificial Intelligence and
Applications
, pp. 61 - 66
DL-Lite
with Attributes and Datatypes
Alessandro Artale
and
Vladislav Ryzhikov
1and
Roman Kontchakov
2Abstract.
We extend theDL-Lite languages by means ofattributesand
datatypes. Attributes—a notion borrowed from data models— associate concrete values from datatypes to abstract objects and in this way complement roles, which describe relation-ships between abstract objects. The extended languages remain tractable (with a notable exception) even though they contain both existential and (a limited form of) universal quantification. We present complexity results for two most important reason-ing problems inDL-Lite: combined complexity of knowledge base satisfiability and data complexity of positive existential query answering.
1
Introduction
TheDL-Litefamilyof description logics has recently been proposed and investigated in [7, 8] and later extended in [2, 15, 3]. The rel-evance of theDL-Litefamily is witnessed by the fact that it forms the basis of OWL 2 QL, one of the three profiles of the Web Ontol-ogy Language, OWL 2 (www.w3.org/TR/owl2-profiles). According to the official W3C profiles document, the purpose of OWL 2 QL is to be the language of choice for applications that use very large amounts of data.
This paper extends theDL-Lite languages of [3] with so-called
attributes(A), which associate concrete values from datatypes with abstract objects. These extensions will be formalized in a new fam-ily of languages,DL-Lite(αHN A)withα∈ {core,krom,horn,bool},
which contain role and attribute inclusions with both (unqualified) existential and (a limited form of) universal quantification. Original and tight complexity results for both knowledge base satisfiability and query answering will be presented in this paper.
The notion ofattributes, borrowed from conceptual modelling for-malisms, introduces a distinction between (abstract) objects and con-crete values (integers, reals, strings, etc.) and, consequently, between concepts (sets of objects) and datatypes (sets of values), and between roles (relating objects to objects) and attributes (relating objects to values). The languageDL-LiteA[15] was introduced with the aim of capturing the notion of attributes inDL-Lite in the setting of
ontology-based data access(OBDA). The datatypes ofDL-LiteAare modelled as pairwisedisjointsets of values (which are also disjoint from concepts); a similar choice is made by various DLs encoding conceptual models [9, 6, 1]. Furthermore, datatypes of DL-LiteA
are used for typing attributes globally: e.g., the concept inclusion
∃salary− Realcan be used to constrain the range of attribute
salaryto the typeReal. However, this means that even if associated
1KRDB Research Centre, Free University of Bozen-Bolzano, Italy, email: {lastname}@inf.unibz.it
2Dept. of Comp. Science and Inf. Sys., Birkbeck, University of London, UK,
email:[email protected]
Employee
Researcher
Professor
salary(Real)
salary({35K–70K})
[image:3.609.342.511.196.275.2]salary({55K–100K})
Figure 1. Salary example
with different concepts, attributes sharing the same name must have the same range restriction.
We consider a more expressive language for attributes and datatypes inDL-Lite. We present two main extensions of the orig-inalDL-LiteA: (i) datatypes are not necessarily mutually disjoint; instead, Horn clauses define relations between them (including dis-jointness and subtype relations); (ii) range restrictions for attributes arelocal(rather than global): i.e., concept inclusions of the form
C ∀U.T specify that all values of the attributeU for instances of the conceptCbelong to datatype T. In this way, we capture a wider range of datatypes (e.g., intervals over the reals) and allow re-use of the very same attribute associated to different concepts, but with different range restrictions. As an example, consider the Entity-Relationship diagram in Fig. 1, which says, in particular, that
• employees’ salary is of typeReal, i.e.,Employee ∀salary.Real;
• researchers’ salary is in the range35K–70K, which is an interval type, a subset ofReal, i.e.,Researcher ∀salary.{35K–70K};
• and professors’ salary in the range55K–100K, i.e.,Professor ∀salary.{55K–100K};
• with researchers and professors being employees, i.e.,ResearcherEmployeeandProfessorEmployee. Local attributes are strictly more expressive than global attributes: for example, the concept inclusion ∀salary.Realis equivalent to∃salary−Realmentioned above and implies thateveryvalue of
salaryis aReal, independently from the type of the employee. Us-ing local attributes we can infer concept disjointness from datatype disjointness for thesame(existentially qualified) attribute. For exam-ple, assume that in the scenario of Fig. 1 we add the concept of For-eignEmployeeas having at-least onesalarythat must be aString(to take account of the currency). ThenEmployeeandForeignEmployee
become disjoint concepts—i.e.,EmployeeForeignEmployee ⊥
will be implied—because of disjointness of the respective datatypes and restrictions on the salary attribute. We also allow more general datatype inclusions, which, for instance, can express that the inter-section of a number of datatypes is empty.
Our work lies between theDL-LiteAproposal and the extensions
ECAI 2012
Luc De Raedt et al. (Eds.) © 2012 The Author(s).
This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License.
doi:10.3233/978-1-61499-098-7-61
of DLs with concrete domains (see [13] for an overview). According to the concrete domain terminology, we consider a path-free exten-sion with unary predicates—predicates coincide with datatypes with a fixed interpretation, as inDL-LiteA. Differently from the concrete domain approach, we do not require attributes to be functional; in-stead, we can specify generic number restrictions over them, simi-larly to extensions ofELwith datatypes [5, 11] and the notion of datatype properties in OWL 2 [14, 10]. Our approach works as far as datatypes aresafe, i.e., unbounded—query answering isCONP-hard in presence of datatypes of specific cardinalities [12, 16]—and no covering constraints hold between them—query answering becomes
CONP-hard again in the presence of a datatype, whose extension is a subset of (is covered by) the union of two other datatypes (cf. Theo-rem 2).
We provide tight complexity results showing that for the Bool, Horn and core languages addition oflocalandsaferange restrictions on attributes does not change the complexity of knowledge base sat-isfiability. On the other hand, surprisingly, for the Krom language complexity increases from NLOGSPACEto NP. These results reflect the intuition that universal restrictions on attributes—as studied in this paper—cannot introduce cyclic dependencies between concepts; on the other hand, unrestricted use of universal restrictions (∀R.C) together with sub-roles, by which qualified existential restrictions (∃R.C) can be encoded, results in EXPTIME-completeness [8].
We complete our complexity results by showing thatpositive exis-tential queryanswering (and so, conjunctive query answering) over core and Horn knowledge bases with attributes, local range restric-tions and safe datatypes is still FO-rewritable and so, is in AC0for data complexity.
The paper is organized as follows. Section 2 presentsDL-Liteand its fragments. Section 3 discusses the notion ofsafedatatypes used in this paper. Sections 4 and 5 study combined complexity of KB satisfiability and data complexity of answering positive existential queries, respectively, when attributes and datatypes are present. Sec-tion 6 concludes this paper. Complete proofs of all the results can be found in the full version [4].
2
The Description Logic
DL-Lite
(boolHN A)The language ofDL-Lite(boolHN A) containsobject names a0, a1, . . .,
value names v0, v1, . . ., concept names A0, A1, . . ., role names
P0, P1, . . ., attribute names U0, U1, . . ., and datatype names
T0, T1, . . .. ComplexrolesR,datatypesT andconceptsC are de-fined as follows:
R ::= Pi | Pi−, B ::= | ⊥ | Ai | ≥q R | ≥q Ui T ::= ⊥D | Ti, C ::= B | ¬C | C1C2,
whereqis a positive integer. Concepts of the formBare calledbasic concepts. ADL-Lite(boolHN A)TBox,T, is a finite set of concept, role and attributeinclusionsof the form:
C1C2 and C ∀U. T, R1R2, U1U2,
and anABox,A, is a finite set of assertions of the form:
Ak(ai), ¬Ak(ai), Pk(ai, aj), ¬Pk(ai, aj), Uk(ai, vj).
We standardly abbreviate≥1Rand≥1U by∃Rand∃U, respec-tively. Taken together, a TBox T and an ABox A constitute the
knowledge base(KB)K= (T,A).
It is known [3] that reasoning with role inclusions and number restrictions (even in core TBoxes without attributes) is already rather
costly, EXPTIME-complete. Thus we impose the following syntactic restriction on sub-roles and sub-attributes [3]:
(interR) ifRhas a proper sub-role inT then it contains no negative
occurrences3of≥q Ror≥q R−forq≥2;
(interU) ifU has a proper sub-attribute in T then it contains no
negative occurrences of≥q Uforq≥2.
Semantics. As usual in description logic, aninterpretation, I =
(ΔI,·I), consists of a nonempty domain ΔI and an
interpreta-tion funcinterpreta-tion·I. The interpretation domainΔI is the union of two nonempty disjoint sets: thedomain of objectsΔIOand thedomain of valuesΔIV. We assume that all interpretations agree on the seman-tics of datatypes and values:⊥ID = ∅andTiI = val(Ti) ⊆ ΔIV
is the set of values of each datatypeTi(which does not depend on
a particular interpretation) andvIj =val(vj) ∈ΔIV is the value of
each namevj(which, again, does not depend onI). Note that the
datatypes do not have to be mutually disjoint—instead, we assume that datatype constraints can be captured by Horn clauses—we will clarify the assumptions in Section 3.
The interpretation function·I assigns an elementaIi ∈ ΔIO to
each object nameai, a subsetAIk ⊆ ΔIO of the domain of objects
to each concept nameAk, a binary relationPkI ⊆ΔIO×ΔIO over
the domain of objects to each role namePk, and a binary relation UkI ⊆ΔIO×ΔIV to each attribute nameUk. We adopt theunique
name assumption(UNA):aIi = aIj,for alli= j. It is known [3]
thatnot adopting the UNA inDL-Lite languages with number re-strictions leads to a significant increase in the complexity of reason-ing: KB satisfiability goes from NLOGSPACEto PTIME-hard with functionality constraints and even to NP-hard with arbitrary number restrictions; query answering loses the AC0data complexity. Com-plex roles and concept are interpreted inIin the standard way:
(P−
k )I = {(w, w)∈ΔIO×ΔIO |(w, w)∈PkI}, I = ΔIO, ⊥I = ∅,
(C1C2)I = CI
1 ∩C2I, (¬C)I = ΔIO\CI, (≥q R)I = w∈ΔI
O|{w|(w, w)∈RI} ≥q
, (≥q U)I = w∈ΔOI |{v|(w, v)∈UI} ≥q
, (∀U. T)I = w∈ΔI
O| ∀v.(w, v)∈UI→v∈TI
,
whereXis the cardinality ofX. Thesatisfaction relation|=is also standard:
I |=C1C2iffCI1 ⊆C2I, I |=R1R2iffRI1⊆RI2, I |=U1U2iffU1I⊆U2I, I |=Uk(ai, vj)iff(aIi, vIj)∈UkI,
I |=Ak(ai)iffaIi ∈AIk, I |=Pk(ai, aj)iff(aIi, aIj)∈PkI, I |=¬Ak(ai)iffaiI ∈/AIk, I |=¬Pk(ai, aj)iff(aIi, aIj)∈/PkI.
A KBK= (T,A)is said to besatisfiable(orconsistent) if there is an interpretation,I, satisfying all the members ofT andA. In this case we writeI |=K(as well asI |=T andI |=A) and say thatI is amodel ofK(T andA).
A positive existential query q(x1, . . . , xn) is a first-order
for-mulaϕ(x1, . . . , xn)constructed by means of conjunction,
disjunc-tion and existential quantificadisjunc-tion starting from atoms of the from
Ak(t1), Tk(t1),Pk(t1, t2)andUk(t1, t2), whereAk is a concept
name,Tk a datatype name,Pk a role name,Ukan attribute name, 3An occurrence of a concept on the right-hand (left-hand) side of a concept
inclusion is callednegativeif it is in the scope of an odd (even) number of negations¬; otherwise it is calledpositive.
A. Artale et al. / DL-Lite with Attributes and Datatypes
andt1, t2aretermstaken from the list of variablesy0, y1, . . ., ob-ject namesa0, a1, . . . and value namesv0, v1, . . .; object names and value names will be calledconstants. We writeq(x)for a query with free variablesx = x1, . . . , xnandq(a)for the result of replacing
every occurrence ofxiinϕ(x)with theith componentaiof a
vec-tor of constantsa = a1, . . . , an. We will equivocate between DL
and first-order interpretations and writeI |=q(a)to say thatq(a)is true inI. Aconjunctive queryis a positive existential query without disjunctions.
For a KBK= (T,A), we say that a tupleaof constants fromA is acertain answertoq(x)with respect toK, and writeK |=q(a), if
I |=q(a)wheneverI |=K. Thequery answering problemis: given a KBK= (T,A), a queryq(x)and a tupleaof constants fromA, decide whetherK |=q(a).
Fragments ofDL-Lite(boolHN A).We consider syntactic restrictions on the form of concept inclusions inDL-Lite(boolHN A)TBoxes. Following the naming scheme of the extendedDL-Lite family [3], we adopt the following definitions. A KBKbelongs toDL-Lite(kromHN A)if only negation is used in construction of its complex concepts:
C ::= B | ¬B (Krom)
(here and below theBare basic concepts).Kis inDL-Lite(hornHN A)if its complex concepts are constructed by using only intersection:
C ::= B1 · · · Bk. (Horn)
Finally, we sayKis inDL-Lite(coreHN A)if its concept inclusions are of
the form:
B1B2, B1B2 ⊥. (core) Note that the positive occurrences ofBon the right-hand side of the above inclusions can also be of the form∀U.T. AsB1 ¬B2is equivalent toB1B2 ⊥, core TBoxes can be regarded as sitting in the intersection of Krom and Horn TBoxes.
The following table summarizes the obtained combined complex-ity results for KB satisfiabilcomplex-ity and data complexcomplex-ity results for query answering (with numbers coded in binary):
language KB satisfiability query answering
DL-Lite(coreHN A) NLogSpace[Th.4] AC0[Th.6]
DL-Lite(HN A)
horn PTIME [Th.4] AC0[Th.6]
DL-Lite(HN A)
krom NP[Th.5] CONP[3]
DL-Lite(HN A)
bool NP[Th.4] CONP[3]
3
Safe Datatypes
In this section we define the notion ofsafedatatypes and show that such restrictions are required for preserving data complexity of query answering.
DEFINITION1. A set of datatypesD={T1, . . . , Tn}is calledsafe
if (i)the difference between an arbitrary intersection of datatypes and an arbitrary union of datatypes is either empty or unbounded; (ii)all constraints between datatypes are in the form of Horn clauses
Ti1∩ · · · ∩Tik⊆DTi0.
A set of datatypesDis calledweakly safeif(i)arbitrary inter-sections of datatypes are either empty or unbounded and(ii)holds.
Restriction (i) has been independently introduced by Savkovic [16].
It follows, in particular, that ifDis (weakly) safe we can assume that each non-empty datatypeTiis unbounded (note that query
an-swering becomesCONP-hard in presence of datatypes of specific cardinalities [12]); and ifDis safe then also arbitrary intersections of datatypes are either empty or unbounded. Thus, ifDis safe then it is also weakly safe. Condition (ii) ensures that datatype constraints inD have the form of Horn clauses,T1∩ · · · ∩Tk ⊆D T, and
thus computable in PTIME; we further restrict datatype constraints toT1⊆DT2andT1∩ · · · ∩Tk⊆D⊥Dwhen dealing with thecore
language. Indeed, allowing covering constraints between datatypes leads toCONP-hardness of conjunctive query answering:
THEOREM2. Conjunctive query answering inDL-Lite(coreHN A)with
covering constraints on datatypes isCONP-hard(even without sub-roles, sub-attributes and number restrictions).
Proof. We prove the result by reduction of the complement of 2+2CNF (similar to instance checking inALE [17]). Suppose we are given a CNFψin which every clause contains two positive and two negative literals (including the constantstrue,false). LetT be a datatype covered by non-empty disjointT0andT1. LetT contain the following concept inclusions for an attributeU and conceptsBand
C:B ∃U,B ∀U.T,C ∀U.T0, and consider the following conjunctive query
q=∃y, t, uP1(y, t1)∧P2(y, t2)∧N1(y, t3)∧N2(y, t4) ∧U(t1, u1)∧U(t2, u2)∧U(t3, u3)∧U(t4, u4)
∧T0(u1)∧T0(u2)∧T1(u3)∧T1(u4)
with rolesP1,P2,N1andN2. We construct an ABoxAψwith
in-dividualstrueandfalsefor the propositional constants, an individ-ualxi, for each propositional variable xi inψ, and an individual ci, for each clause ofψ. LetAψ contain assertionB(xi), for each
propositional variablexiinψ, assertionsC(false), U(true, v1), for a
valuev1of datatypeT1, and the following assertions, for each clause
xji1∨xji2∨ ¬xji3∨ ¬xji4ofψ:
P1(ci, xji1), P2(ci, xji2), N1(ci, xji3), N2(ci, xji4)
(here the xj may include propositional constants). It is readily
checked that(T,Aψ) |= qiff ψis satisfiable. Indeed, ifψis
sat-isfiable we constructIby ‘extending’AψbyU(xi, v0)ifxiis false
in the satisfying assignment and byU(xi, v1)otherwise, wherev0
is inT0andv1inT1(recall that these datatypes are non-empty and disjoint). Conversely, if(T,Aψ) |= q then there is a modelIof (T,Aψ)in whichqis false. Then the satisfying assignment can be
defined as follows: a propositional variablexiis true if one of the
attributeU values ofxibelongs to datatypeT1—it does not matter
whether other values belong toT0or not, the negative answer to the queryqguarantees thatψis true under such an assignment.
The following theorem shows that without condition (i) we lose FO-rewritability of conjunctive queries in the presence of number restrictions.
THEOREM3. Conjunctive query answering inDL-Lite(coreHN A)with
datatypes not respecting condition(i)of Definition 1 isCONP-hard
(even without sub-roles and sub-attributes).
Proof. We modify the proof of Theorem 2. Assume that the differ-ence between a datatype,T, and a union of two datatypes,T0and
T1, has a finite cardinality, sayk. We replace the concept inclusion
B ∃U withB ≥(k+ 1)U, which forces a choice of at least
oneU attribute value to be in eitherT0orT1. In the former case, as before, we assume that the propositional variable gets valuefalse, while in the latter case it gets valuetrue.
Thus, the safe condition essentially disallows the use of enumer-ations and any datatypes, whose non-empty intersection or differ-ence has a finite number of elements. From now on we consider only (weakly) safe datatypes.
4
Complexity of KB Satisfiability in
DL-Lite
(αHN A)We first introduce the encoding of aDL-Lite(boolHN A)KBK= (T,A)
into a first-order sentenceK‡awith one variable, adopting the
tech-nique introduced in [3]. We denote byrole±(K)the set of role names inKand their inverses, byatt(K)anddt(K)the sets of attribute and datatype names inK, respectively, and byob(A)andval(A)the set of all object and value names inA, respectively. To simplify the pre-sentation, we will assume that(R−)− is the same asRand will often useHfor a roleRor an attribute nameU. We will also assume that all number restrictions are of the form∃Rand∃T (i.e., only
q= 1is allowed) and that the ABox contains no negative assertions of the form¬Ak(ai)and¬Pk(ai, aj)—see the full version [4] for
the treatment of the full language.
Everyai ∈ ob(A)is associated with the individual constantai,
and every concept nameAiwith the unary predicateAi(x). For each
concept∃R, we take a fresh unary predicateER(x). Intuitively, for a role namePk, the predicateEPkrepresents the objects with aPk
-successor—the domain ofPk—andEPk−the range ofPk. We also
introduce individual constants, as representatives of the objects in the domain (dpk) and the range (dp−k) of each role Pk. Similarly,
for each attribute nameUi, we take a unary predicateEUi(x),
rep-resenting the objects with at least one value of the attributeU. We also need, for each attribute nameUiand each datatype nameTj, a
unary predicateUiTj(x), representing the objects such thatalltheir Uiattribute values belong to the datatypeTj(as usual, if they have
attributeUivalues at all).
The encodingC∗of a conceptCis then defined inductively:
⊥∗=⊥, (A
i)∗=Ai(x), ∗=, (¬C)∗=¬C∗(x),
(∃H)∗=EH(x), (C1C2)∗=C1(x)∗ ∧C∗2(x),
(∀U.⊥D)∗=¬E1U(x), (∀U.Ti)∗=U Ti(x).
The following sentence then encodes the knowledge baseK:
K‡a=∀xT∗(x)∧β(x)∧ R∈role±(K)
R(x)∧
U∈att(K) θU(x)
∧ A‡a,
where
T∗(x) =
C1C2∈T
C1∗(x)→C2∗(x) ∧
H∗ TH
(∃H)∗(x)→(∃H)∗(x),
A‡a= A(ai)∈A
A∗(ai) ∧
P(ai,aj)∈A
(∃P)∗(a
i)∧(∃P−)∗(aj)
∧
U(ai,vj)∈A
(∃U)∗(a
i) ∧
T∈dt(K) val(vj)∈/val(T)
¬(∀U.T)∗(a
i)
,
and ∗T is the reflexive and transitive closure of the sub-role and sub-attribute relations of the TBox, i.e., of the union
{(R, R),(R−, R−)|RR∈ T } ∪ {(U, U)|U U∈ T }.
Roles are interpreted as binary predicates in a DL interpretation and so, the range of a roleRis not empty whenever its domain con-tains an element. So, in order to capture this intuition, inK‡a we
include the following formula, for eachR∈role±(K):
R(x) =ER(x)→ER−(dr−).
Attributes are involved both in existential and universal quantifica-tion. So, the second conjunct ofT∗reflects the fact that if an object has aU value (existential quantifier∃U) then it also has aUvalue, for eachUwithU ∗T U; universal quantification propagates the
datatypes in the opposite direction:
β(x) =
UU∈T
T∈dt(K)
(∀U.T)∗→(∀U.T)∗.
We also need a formula that captures the relationships between datatypes, as defined by the Horn clauses inD, for all attributesU:
θU(x) =
T1∩···∩Tk⊆DT
(∀U.T1)∗∧ · · · ∧(∀U.T
k)∗→(∀U.T)∗
.
We note that the formulaθU(x), in particular for disjoint datatypes,
e.g., withT1∩T2 ⊆D ⊥D, demonstrates a subtle interaction be-tween attribute range constraints, ∀U.T, and minimal cardinality constraints,∃U.
We now show that for the Bool, Horn and core languages the ad-dition of attributes toDL-Lite(αHN)of [3] does not change the
com-bined complexity of KB satisfiability:
THEOREM 4. Checking KB satisfiability with weakly safe datatypes is NP-complete in DL-Lite(boolHN A), PTIME-complete in
DL-Lite(hornHN A)andNLOGSPACE-complete inDL-Lite (HN A) core .
Proof. (Sketch) We show that a given KBKis satisfiable iff the universal first-order sentence K‡a is satisfiable. One direction is
straightforward: if there is a model ofKthen the model ofK‡acan
be defined on the same domain by taking, say,C∗to beCI. The key ingredient of the converse direction is the unravelling construction: every model ofK‡acan be unravelled into a DL interpretation—in
essence, the pointsdpkanddp−k are copied to recover the structure
of roles as binary relations, while recovering attributes requires more subtlety; see [4] for more details.
It is of interest to note that the complexity of KB satisfiability in-creases in the case of Krom TBoxes:
THEOREM5. Satisfiability ofDL-Lite(kromHN A)KBs isNP-hard with a single pair of disjoint datatypes even without role and attribute inclusions nor cardinalities(and so, forDL-LiteAkrom).
Proof. The proof is by reduction of 3SAT. It exploits the structure of the formulaθU(x)inK‡a: if datatypesTandTare disjoint then the
concept inclusion
∀U.T ∀U.T ∃U ⊥,
although not in the syntax of DL-Lite(kromHN A), is a logical conse-quence ofT. Using such ternary intersections with the full negation of the Krom fragment one can encode 3SAT. Letϕ = mi=1Cibe
a 3CNF, where theCiare ternary clauses over variablesp1, . . . , pn.
Now, supposepi1∨ ¬pi2∨pi3is theith clause ofϕ. It is equivalent
to¬pi1∧pi2∧ ¬pi3 → ⊥and so, can be encoded as follows:
¬Ai1 ∀Ui.T, Ai2 ∀Ui.T, ¬Ai3 ∃Ui,
A. Artale et al. / DL-Lite with Attributes and Datatypes
whereA1, . . . , Anare concept names for variablesp1, . . . , pn, and Uiis an attribute for theith clause (note that Krom concept inclusions
of the form¬B Bare required, which is not available in core TBoxes). LetT consist of all such inclusions for clauses inϕ. It can be seen thatϕis satisfiable iffT is satisfiable.
5
Query Answering: Data Complexity
In this section we study the data complexity of answering posi-tive existential queries over a KB expressed in languages with at-tributes and datatypes. As follows from the proof of Theorem 4, for aDL-Lite(boolHN A)KBK= (T,A), every modelMof the first-order sentenceK‡a induces a forest-shaped modelIMofKwith the
fol-lowing properties:
(forest) The namesa∈ob(A)∪val(A)induce a partitioning of the domainΔIMinto disjoint labelled treesTa = (Ta, Ea, a)with
nodesTa, edgesEa, roota, and a labelling functionathat assigns
a role or an attribute name to each edge (indicating a minimal, w.r.t.∗T, role or attribute name that required a fresh successor due to an existential quantifier); the trees forv∈ val(A)consist of a single node,v.
(copy) There is a mapcp: ΔIM →ob(A)∪val(A)∪dr |R∈
role±(K), such thatcp(a) = a, ifa ∈ ob(A)∪val(A), and
cp(w) =dr, ifa(w, w) =R−, for(w, w)∈Ea.
(role) For every role (attribute name)H,
HIM = (a
i, aj)|H(ai, aj)∈ A, H∗T H∪
(w, w)∈E
a|a(w, w) =H, H∗T H, a∈ob(A).
THEOREM6. The positive existential query answering problem for
DL-Lite(hornHN A)andDL-Lite (HN A)
core is inAC0for data complexity.
Proof. We adopt the technique of the proof of Theorem 7.1 [3]. Sup-pose that we are given aconsistentDL-Lite(hornHN A)KBK= (T,A)
and a positive existential query in prenex formq(x) = ∃y ϕ(x, y) in the signature ofK. LetM0be theminimal Herbrand modelof (the universal Horn sentence)K‡a, and letI0 = (ΔI0,·I0)be the
canonicalmodel ofK, i.e., the model induced byM0(see its con-struction in the full version [4]). The following properties hold, for all basic conceptsBand datatypesT:
aIi0∈B
I0 iff K |=B(a
i),forai∈ob(A), (1) w∈BI0 iff K |=∃RB, forwwithcp(w) =dr, (2)
vI0
i ∈T
I0 iff val(v
i)∈val(T),forvi∈val(A), (3) v∈TI0iffw∈BI0
1 , . . . , BkI0,T |=B1 · · · Bk ∀U.T,(4)
for(w, v)∈UI0andv /∈val(A).
Formula (1) describes conditions when a named object,ai, belongs
to a basic concept,B, in the canonical modelI0—we say it describes thetypeofai. Similarly, (2) describes types of unnamed objects,
which are copies of thedr, for rolesR; it is worth pointing out that those types are determined by a single concept,∃R. The same two properties were used in the proof of he proof of Theorem 7.1 [3]. The other two properties are specific to datatypes: (3) describes the type of a named datatype value and (4) the type of an unnamed datatype value. We note that (4) holds only for safe datatypes, and even weakly safe datatypes cannot guarantee that in the process of unravelling it is always possible, for everyw∈(∃U)I0, to pick a fresh attributeU
value of the ‘minimal type’, i.e., a datatype value that belongsonly
to datatypesT withw∈(∀U.T)I0.
It is straightforward to check that the canonical modelI0provides correct answers to all queries:
LEMMA7. K |=q(a)iffI0|=q(a), for all tuplesa.
Thedepthof a pointw ∈ ΔI0 is the length of the shortest path
in the respective tree to its root. Denote byWm the set of points
of depth≤m(including also valuesv ∈ ΔIV0) that were taken to
satisfy existential quantifiers for objects inWm−1. Our next lemma
shows that to check whetherI0|=q(a)it suffices to consider points of depth≤m0inΔI0, for somem0that does not depend on|A|:
LEMMA8. IfI0 |= ∃y ϕ(a, y)then there is an assignmenta0in
Wm0such thatI0|=a0 ϕ(a, y)anda0(yi)∈Wm0, for allyi∈y,
wherem0=|y|+|role±(T)|+ 1.
To complete the proof of Theorem 6, we encode the problem ‘I0 |= q(a)?’ as a model checking problem for first-order formu-las over the ABoxAconsidered as a first-order model, also denoted byA, with domainob(A)∪val(A); we assume that this first-order model also contains all datatype extensions. Now we define a first-order formula ϕT,q(x) in the signature of T andq such that (i)
ϕT,q(x)depends onT andqbut not onA, and (ii)A |=ϕT,q(a)iff I0|=q(a).
Denote bycon(K)the set of basic concepts inKtogether with all concepts of the form∀U.T, for attribute namesU and datatypesT
fromT. We begin by defining formulasψB(x), forB ∈ con(K),
that describe the types of named objects (cf. (1)): for allai∈ob(A), A |=ψB(ai) iff aiI0 ∈BI0, ifBis a basic concept, (5) A |=ψ∀U.T(ai) iff aIi0 ∈B
I0
1 , . . . , BkI0and (6) T |=B1 · · · Bk ∀U.T.
These formulas are defined as the ‘fixed-points’ of sequences
ψ0B(x), ψB1(x), . . . defined by takingψB0(x) = B∗if BisA, ⊥
or,ψB0(x) =
H∗TH∃y H
(x, y)ifB=∃H,ψ0
B(x) =⊥if B=∀U.T and
ψiB(x) =ψ0B(x)∨
B1 ··· BkB∈ext(T)
ψBi−11(x)∧ · · · ∧ψ
i−1 Bk(x)
,
whereext(T)is the extension ofT with the following: – ∃H ∃H, for allH∗T H,
– ∀U.T ∀U.T, for allU∗T UandT ∈dt(K),
– ∀U.T1 · · · ∀U.Tk ∀U.T, for allT1∩ · · · ∩Tk⊆DT.
(We again assume that all number restrictions are of the form∃R and∃U; the full version [4] treats arbitrary number restrictions). It should be clear that there isNwithψN
B(x)≡ψBN+1(x), for allBat
the same time, and thatNdoes not exceed the cardinality ofcon(K). We setψB(x) =ψBN(x).
Next we define sentencesθB,dr, for B ∈ con(K) anddr with R∈role±(K), that describe types of the unnamed points, i.e., copies of thedr(cf. (2)): for allwwithcp(w) =dr,
A |=θB,dr iff w∈BI0,ifBis a basic concept, (7) A |=θ∀U.T,dr iff T |=∃R ∀U.T. (8)
Note that the type of copies ofdris determined by a single concept,
∃R, and therefore, there is no need to consider conjunctions in (8); see also (6). We inductively define a sequenceθ0B,dr, θ1B,dr, . . . by
takingθB,dr0 =ifB=∃Randθ0B,dr=⊥otherwise, withθiB,dr
defined similarly toψBi above. As with theψB, setθB,dr=θB,drN .
Now, supposeI0 |=a0 ϕ(a, y) anda0(yi) ∈ Wm
0, for every
yi∈y, wherem0is as in Lemma 8. Recall that our aim is to compute
the answer to this query in the first-order modelArepresenting the ABox. This model, however, does not contain points inWm0\W0,
and to represent them, we use the following ‘trick.’ By(forest), every
w∈Wm0is uniquely determined by a pair(a, σ), whereais the root
of the treeTacontainingwandσis the sequence of labelsa(u, v)
on the path fromatow. Not every such pair, however, corresponds to an element inWm0. In order to identify points inWm0, we consider
the following directed graphGT = (VT, ET), whereVT is the set of equivalence classes[H] ={H|H ∗T HandH∗T H}and
ET is the set of all pairs([R],[H])such thatT |=∃R− ∃Hand
R− ∗T H, andH has no proper sub-role/attribute satisfying this
property. LetΣT,m0be the set of all paths in the graphGT of length
≤m0: more precisely,
ΣT,m0 =
ε∪VT ∪([H1], . . . ,[Hn])|2≤n≤m0
and([Hj],[Hj+1])∈ET, for1≤j < n
.
By the unravelling procedure, we have σ ∈ ΣT,m0, for all pairs
(a, σ)representing elements ofWm0. We note, however, that a pair
(a, σ)withσ = ([H], . . .) ∈ ΣT,m0 corresponds to aw∈ Wm0
only ifahas not enoughH-witnesses inA.
In the first-order rewritingϕT,qwe are about to define we assume
that the bound variables yi range overW0 and represent the first
component of the pairs(a, σ)(theseyishould not be confused with
theyiin the original queryq, which range overWm0), whereas the
second component is encoded in theith memberσiof a vectorσ.
Note that constants and free variables need no second component,σ, and, to unify the notation, for each termtwe denote itsσ-component bytσ, which is defined as follows:tσ =εiftis a constant or free
variable andtσ=σ
iift=yi.
Letkbe the number of bound variablesyiand letΣkT,m0be the set
ofk-tuplesσ = (σ1, . . . , σk)withσi ∈ ΣT,m0. Given an
assign-menta0inWm0, we denote bysplit(a0)the pair(a, σ)made of an
assignmentainAandσ∈ΣkT,m0such thatt
σ = ([H1], . . . ,[H n]),
for a sequenceH1, . . . , Hnofa-labels on the path fromatoa0(t).
We define now, for everyσ ∈ ΣkT,m0, concept nameA, role or
at-tribute nameHand datatype nameT:
Aσ(t) =
ψA(t), iftσ=ε,
θA,inv(ds), iftσ=σ.[S],
Hσ(t 1, t2) =
⎧ ⎨ ⎩
HT(t1, t2),iftσ
1=tσ2 =ε,
(t1=t2), iftσ
1.[S]=tσ2,ortσ2=tσ1.[S−],forS∗T H,
⊥, otherwise,
Tσ(t) =
⎧ ⎨ ⎩
T(t), iftσ=ε,
ψ∀U.T(t), iftσ= [U],
θ∀U.T,ds−,iftσ=σ.[S].[U].
LEMMA9. For each assignmenta0inWm0with split(a0) = (a, σ),
I0|=a0 A(t) iff A |=aAσ(t), for concept namesA,
I0|=a0H(t1, t2) iff A |=aHσ(t
1, t2), for roles and attribute namesH,
I0|=a0T(t) iff A |=aTσ(t), for datatype namesT.
Finally, we define the first-order rewriting ofqandT by taking:
ϕT,q(x) = ∃y
σ∈ΣkT,m
0
ϕσ(x, y) ∧ 1≤i≤k σi=([Hi],...)=ε
¬ψ0
∃Hi(yi)∧ψ∃Hi(yi)
,
whereϕσ(x, y) is the result of attaching the superscriptσ to each
atom of ϕ; the last conjunct ensures that each pair(a, σi)
corre-sponds an element ofw∈ Wm0. Correctness of this rewriting
fol-lows from Lemma 9, see the full version [4].
6
Conclusions
We extendedDL-Litewithlocal attributes—allowing the use of the same attribute associated to different concepts—andsafedatatypes where datatype constraints can be expressed with Horn-like clauses. Notably, this is the first time thatDL-Lite is equipped with a form of the universal restriction∀U.T. We showed that such an extension is harmless with the only exception of the Krom fragment, where the complexity rises from NLOGSPACEto NP. We studied also the problem of answering positive existential queries and showed that for the Horn and core extensions the problem remains in AC0(i.e., FO-rewritable).
As a future work we are interested in relaxing the safe condition for datatypes, in particular we conjecture that the restriction on the boundedness of datatype difference can be relaxed for particular con-crete domains.
REFERENCES
[1] A. Artale, D. Calvanese, R. Kontchakov, V. Ryzhikov, and M. Za-kharyaschev, ‘Reasoning over extended ER models’, inProc. of the
26th Int. Conf. on Conceptual Modeling (ER 2007), volume 4801 of
Lecture Notes in Computer Science, pp. 277–292. Springer, (2007).
[2] A. Artale, D. Calvanese, R. Kontchakov, and M. Zakharyaschev, ‘
DL-Litein the light of first-order logic’, inProc. of the 22nd Nat. Conf. on
Artificial Intelligence (AAAI 2007), pp. 361–366, (2007).
[3] A. Artale, D. Calvanese, R. Kontchakov, and M. Zakharyaschev, ‘The
DL-Litefamily and relations’, Journal of Artificial Intelligence
Re-search,36, 1–69, (2009).
[4] A. Artale, R. Kontchakov, and V. Ryzhikov, ‘DL-Lite with attributes, datatypes and sub-roles (full version)’, Technical Report BBKCS-12-01, Department of Computer Science and Information Systems, Birk-beck, University of London, (2012).
[5] F. Baader, S. Brandt, and C. Lutz, ‘Pushing theELenvelope’, inProc. of the 19th Int. Joint Conf. on Artificial Intelligence, IJCAI-05. Morgan-Kaufmann Publishers, (2005).
[6] D. Berardi, D. Calvanese, and G. De Giacomo, ‘Reasoning on UML class diagrams’,Artificial Intelligence,168(1–2), 70–118, (2005). [7] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati,
‘DL-Lite: Tractable description logics for ontologies’, inProc. of the
20th Nat. Conf. on Artificial Intelligence (AAAI), pp. 602–607, (2005).
[8] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati, ‘Tractable reasoning and efficient query answering in description log-ics: TheDL-Litefamily’,Journal of Automated Reasoning,39(3), 385– 429, (2007).
[9] D. Calvanese, M. Lenzerini, and D. Nardi, ‘Unifying class-based rep-resentation formalisms’,Journal of Artificial Intelligence Research,11, 199–240, (1999).
[10] B. Cuenca Grau, I. Horrocks, B. Motik, B. Parsia, P. Patel-Schneider, and U. Sattler, ‘OWL 2: The next step for OWL’,Journal of Web
Se-mantics: Science, Services and Agents on the World Wide Web,6(4),
309–322, (2008).
[11] M. Despoina, Y. Kazakov, and I. Horrocks, ‘Tractable extensions of the description logic EL with numerical datatypes’,Journal of Automated
Reasoning, (2011).
[12] E. Franconi, Y. A. Ib´a˜nez-Garc´ıa, and I. Seylan, ‘Query answering with DBoxes is hard’,Electr. Notes Theor. Comput. Sci.,278, 71–84, (2011). [13] C. Lutz, ‘Description logics with concrete domains—a survey’, in
Ad-vances in Modal Logics Volume 4. King’s College Publications, (2003).
[14] J. Pan and I. Horrocks, ‘OWL-Eu: Adding customised datatypes into OWL’,Web Semantics: Science, Services and Agents on the World Wide Web,4(1), (2011).
[15] A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini, and R. Rosati, ‘Linking Data to Ontologies’,Journal on Data Semantics,
X, 133–173, (2008).
[16] O. Savkovi´c,Managing Datatypes in Ontology-Based Data Access, MSc dissertation, European Master in Computational Logic, Faculty of Computer Science, Free University of Bozen-Bolzano, October 2011. [17] A. Schaerf, ‘On the complexity of the instance checking problem in
concept languages with existential quantification’,Journal of
Intelli-gent Information Systems,2, 265–278, (1993).
A. Artale et al. / DL-Lite with Attributes and Datatypes