• No results found

DL lite with attributes and datatypes

N/A
N/A
Protected

Academic year: 2020

Share "DL lite with attributes and datatypes"

Copied!
8
0
0

Loading.... (view fulltext now)

Full text

(1)

BIROn - Birkbeck Institutional Research Online

Artale, A. and Ryzhikov, V. and Kontchakov, Roman (2012) DL-lite with

attributes and datatypes. In: de Raedt, L. and Bessiere, C. and Dubois, D.

and Doherty, P. and Frasconi, P. and Heintz, F. and Lucas, P. (eds.) Frontiers

in Artificial Intelligence and Applications. Amsterdam, The Netherlands: IOS

Press, pp. 61-66. ISBN 9781614990970.

Downloaded from:

Usage Guidelines:

Please refer to usage guidelines at

or alternatively

(2)

BIROn -

B

irkbeck

I

nstitutional

R

esearch

On

line

Enabling open access to Birkbeck’s published research output

DL-lite with attributes and datatypes

Book chapter

(Published Version)

http://eprints.bbk.ac.uk/5784

Citation:

© 2012 IOS Press

Publisher version

______________________________________________________________

All articles available through Birkbeck ePrints are protected by intellectual property law, including copyright law. Any use made of the contents should comply with the relevant law.

______________________________________________________________

Deposit Guide

Contact: [email protected]

Birkbeck ePrints

Birkbeck ePrints

Artale, A., Ryzhikov, V. and Kontchakov, R. (2012) DL-lite with attributes

and datatypes -

De Raedt, L. et al (Eds.) -

Frontiers in Artificial Intelligence and

Applications

, pp. 61 - 66

(3)

DL-Lite

with Attributes and Datatypes

Alessandro Artale

and

Vladislav Ryzhikov

1

and

Roman Kontchakov

2

Abstract.

We extend theDL-Lite languages by means ofattributesand

datatypes. Attributes—a notion borrowed from data models— associate concrete values from datatypes to abstract objects and in this way complement roles, which describe relation-ships between abstract objects. The extended languages remain tractable (with a notable exception) even though they contain both existential and (a limited form of) universal quantification. We present complexity results for two most important reason-ing problems inDL-Lite: combined complexity of knowledge base satisfiability and data complexity of positive existential query answering.

1

Introduction

TheDL-Litefamilyof description logics has recently been proposed and investigated in [7, 8] and later extended in [2, 15, 3]. The rel-evance of theDL-Litefamily is witnessed by the fact that it forms the basis of OWL 2 QL, one of the three profiles of the Web Ontol-ogy Language, OWL 2 (www.w3.org/TR/owl2-profiles). According to the official W3C profiles document, the purpose of OWL 2 QL is to be the language of choice for applications that use very large amounts of data.

This paper extends theDL-Lite languages of [3] with so-called

attributes(A), which associate concrete values from datatypes with abstract objects. These extensions will be formalized in a new fam-ily of languages,DL-Lite(αHN A)withα∈ {core,krom,horn,bool},

which contain role and attribute inclusions with both (unqualified) existential and (a limited form of) universal quantification. Original and tight complexity results for both knowledge base satisfiability and query answering will be presented in this paper.

The notion ofattributes, borrowed from conceptual modelling for-malisms, introduces a distinction between (abstract) objects and con-crete values (integers, reals, strings, etc.) and, consequently, between concepts (sets of objects) and datatypes (sets of values), and between roles (relating objects to objects) and attributes (relating objects to values). The languageDL-LiteA[15] was introduced with the aim of capturing the notion of attributes inDL-Lite in the setting of

ontology-based data access(OBDA). The datatypes ofDL-LiteAare modelled as pairwisedisjointsets of values (which are also disjoint from concepts); a similar choice is made by various DLs encoding conceptual models [9, 6, 1]. Furthermore, datatypes of DL-LiteA

are used for typing attributes globally: e.g., the concept inclusion

∃salary− Realcan be used to constrain the range of attribute

salaryto the typeReal. However, this means that even if associated

1KRDB Research Centre, Free University of Bozen-Bolzano, Italy, email: {lastname}@inf.unibz.it

2Dept. of Comp. Science and Inf. Sys., Birkbeck, University of London, UK,

email:[email protected]

Employee

Researcher

Professor

salary(Real)

salary({35K–70K})

[image:3.609.342.511.196.275.2]

salary({55K–100K})

Figure 1. Salary example

with different concepts, attributes sharing the same name must have the same range restriction.

We consider a more expressive language for attributes and datatypes inDL-Lite. We present two main extensions of the orig-inalDL-LiteA: (i) datatypes are not necessarily mutually disjoint; instead, Horn clauses define relations between them (including dis-jointness and subtype relations); (ii) range restrictions for attributes arelocal(rather than global): i.e., concept inclusions of the form

C ∀U.T specify that all values of the attributeU for instances of the conceptCbelong to datatype T. In this way, we capture a wider range of datatypes (e.g., intervals over the reals) and allow re-use of the very same attribute associated to different concepts, but with different range restrictions. As an example, consider the Entity-Relationship diagram in Fig. 1, which says, in particular, that

employees’ salary is of typeReal, i.e.,Employee ∀salary.Real;

researchers’ salary is in the range35K–70K, which is an interval type, a subset ofReal, i.e.,Researcher ∀salary.{35K–70K};

and professors’ salary in the range55K–100K, i.e.,Professor ∀salary.{55K–100K};

with researchers and professors being employees, i.e.,ResearcherEmployeeandProfessorEmployee. Local attributes are strictly more expressive than global attributes: for example, the concept inclusion ∀salary.Realis equivalent to∃salary−Realmentioned above and implies thateveryvalue of

salaryis aReal, independently from the type of the employee. Us-ing local attributes we can infer concept disjointness from datatype disjointness for thesame(existentially qualified) attribute. For exam-ple, assume that in the scenario of Fig. 1 we add the concept of For-eignEmployeeas having at-least onesalarythat must be aString(to take account of the currency). ThenEmployeeandForeignEmployee

become disjoint concepts—i.e.,EmployeeForeignEmployee ⊥

will be implied—because of disjointness of the respective datatypes and restrictions on the salary attribute. We also allow more general datatype inclusions, which, for instance, can express that the inter-section of a number of datatypes is empty.

Our work lies between theDL-LiteAproposal and the extensions

ECAI 2012

Luc De Raedt et al. (Eds.) © 2012 The Author(s).

This article is published online with Open Access by IOS Press and distributed under the terms of the Creative Commons Attribution Non-Commercial License.

doi:10.3233/978-1-61499-098-7-61

(4)

of DLs with concrete domains (see [13] for an overview). According to the concrete domain terminology, we consider a path-free exten-sion with unary predicates—predicates coincide with datatypes with a fixed interpretation, as inDL-LiteA. Differently from the concrete domain approach, we do not require attributes to be functional; in-stead, we can specify generic number restrictions over them, simi-larly to extensions ofELwith datatypes [5, 11] and the notion of datatype properties in OWL 2 [14, 10]. Our approach works as far as datatypes aresafe, i.e., unbounded—query answering isCONP-hard in presence of datatypes of specific cardinalities [12, 16]—and no covering constraints hold between them—query answering becomes

CONP-hard again in the presence of a datatype, whose extension is a subset of (is covered by) the union of two other datatypes (cf. Theo-rem 2).

We provide tight complexity results showing that for the Bool, Horn and core languages addition oflocalandsaferange restrictions on attributes does not change the complexity of knowledge base sat-isfiability. On the other hand, surprisingly, for the Krom language complexity increases from NLOGSPACEto NP. These results reflect the intuition that universal restrictions on attributes—as studied in this paper—cannot introduce cyclic dependencies between concepts; on the other hand, unrestricted use of universal restrictions (∀R.C) together with sub-roles, by which qualified existential restrictions (∃R.C) can be encoded, results in EXPTIME-completeness [8].

We complete our complexity results by showing thatpositive exis-tential queryanswering (and so, conjunctive query answering) over core and Horn knowledge bases with attributes, local range restric-tions and safe datatypes is still FO-rewritable and so, is in AC0for data complexity.

The paper is organized as follows. Section 2 presentsDL-Liteand its fragments. Section 3 discusses the notion ofsafedatatypes used in this paper. Sections 4 and 5 study combined complexity of KB satisfiability and data complexity of answering positive existential queries, respectively, when attributes and datatypes are present. Sec-tion 6 concludes this paper. Complete proofs of all the results can be found in the full version [4].

2

The Description Logic

DL-Lite

(boolHN A)

The language ofDL-Lite(boolHN A) containsobject names a0, a1, . . .,

value names v0, v1, . . ., concept names A0, A1, . . ., role names

P0, P1, . . ., attribute names U0, U1, . . ., and datatype names

T0, T1, . . .. ComplexrolesR,datatypesT andconceptsC are de-fined as follows:

R ::= Pi | Pi−, B ::= | ⊥ | Ai | ≥q R | ≥q Ui T ::= ⊥D | Ti, C ::= B | ¬C | C1C2,

whereqis a positive integer. Concepts of the formBare calledbasic concepts. ADL-Lite(boolHN A)TBox,T, is a finite set of concept, role and attributeinclusionsof the form:

C1C2 and C ∀U. T, R1R2, U1U2,

and anABox,A, is a finite set of assertions of the form:

Ak(ai), ¬Ak(ai), Pk(ai, aj), ¬Pk(ai, aj), Uk(ai, vj).

We standardly abbreviate1Rand1U by∃Rand∃U, respec-tively. Taken together, a TBox T and an ABox A constitute the

knowledge base(KB)K= (T,A).

It is known [3] that reasoning with role inclusions and number restrictions (even in core TBoxes without attributes) is already rather

costly, EXPTIME-complete. Thus we impose the following syntactic restriction on sub-roles and sub-attributes [3]:

(interR) ifRhas a proper sub-role inT then it contains no negative

occurrences3of≥q Ror≥q R−forq≥2;

(interU) ifU has a proper sub-attribute in T then it contains no

negative occurrences of≥q Uforq≥2.

Semantics. As usual in description logic, aninterpretation, I =

I,·I), consists of a nonempty domain ΔI and an

interpreta-tion funcinterpreta-tion·I. The interpretation domainΔI is the union of two nonempty disjoint sets: thedomain of objectsΔIOand thedomain of valuesΔIV. We assume that all interpretations agree on the seman-tics of datatypes and values:⊥ID = andTiI = val(Ti) ΔIV

is the set of values of each datatypeTi(which does not depend on

a particular interpretation) andvIj =val(vj) ΔIV is the value of

each namevj(which, again, does not depend onI). Note that the

datatypes do not have to be mutually disjoint—instead, we assume that datatype constraints can be captured by Horn clauses—we will clarify the assumptions in Section 3.

The interpretation function·I assigns an elementaIi ΔIO to

each object nameai, a subsetAIk ΔIO of the domain of objects

to each concept nameAk, a binary relationPkI ΔIO×ΔIO over

the domain of objects to each role namePk, and a binary relation UkI ΔIO×ΔIV to each attribute nameUk. We adopt theunique

name assumption(UNA):aIi = aIj,for alli= j. It is known [3]

thatnot adopting the UNA inDL-Lite languages with number re-strictions leads to a significant increase in the complexity of reason-ing: KB satisfiability goes from NLOGSPACEto PTIME-hard with functionality constraints and even to NP-hard with arbitrary number restrictions; query answering loses the AC0data complexity. Com-plex roles and concept are interpreted inIin the standard way:

(P

k )I = {(w, w)∈ΔIO×ΔIO |(w, w)∈PkI}, I = ΔIO, ⊥I = ∅,

(C1C2)I = CI

1 ∩C2I, (¬C)I = ΔIO\CI, (≥q R)I = wΔI

O|{w|(w, w)∈RI} ≥q

, (≥q U)I = w∈ΔOI |{v|(w, v)∈UI} ≥q

, (∀U. T)I = wΔI

O| ∀v.(w, v)∈UI→v∈TI

,

whereXis the cardinality ofX. Thesatisfaction relation|=is also standard:

I |=C1C2iffCI1 ⊆C2I, I |=R1R2iffRI1⊆RI2, I |=U1U2iffU1I⊆U2I, I |=Uk(ai, vj)iff(aIi, vIj)∈UkI,

I |=Ak(ai)iffaIi ∈AIk, I |=Pk(ai, aj)iff(aIi, aIj)∈PkI, I |=¬Ak(ai)iffaiI ∈/AIk, I |=¬Pk(ai, aj)iff(aIi, aIj)∈/PkI.

A KBK= (T,A)is said to besatisfiable(orconsistent) if there is an interpretation,I, satisfying all the members ofT andA. In this case we writeI |=K(as well asI |=T andI |=A) and say thatI is amodel ofK(T andA).

A positive existential query q(x1, . . . , xn) is a first-order

for-mulaϕ(x1, . . . , xn)constructed by means of conjunction,

disjunc-tion and existential quantificadisjunc-tion starting from atoms of the from

Ak(t1), Tk(t1),Pk(t1, t2)andUk(t1, t2), whereAk is a concept

name,Tk a datatype name,Pk a role name,Ukan attribute name, 3An occurrence of a concept on the right-hand (left-hand) side of a concept

inclusion is callednegativeif it is in the scope of an odd (even) number of negations¬; otherwise it is calledpositive.

A. Artale et al. / DL-Lite with Attributes and Datatypes

(5)

andt1, t2aretermstaken from the list of variablesy0, y1, . . ., ob-ject namesa0, a1, . . . and value namesv0, v1, . . .; object names and value names will be calledconstants. We writeq(x)for a query with free variablesx = x1, . . . , xnandq(a)for the result of replacing

every occurrence ofxiinϕ(x)with theith componentaiof a

vec-tor of constantsa = a1, . . . , an. We will equivocate between DL

and first-order interpretations and writeI |=q(a)to say thatq(a)is true inI. Aconjunctive queryis a positive existential query without disjunctions.

For a KBK= (T,A), we say that a tupleaof constants fromA is acertain answertoq(x)with respect toK, and writeK |=q(a), if

I |=q(a)wheneverI |=K. Thequery answering problemis: given a KBK= (T,A), a queryq(x)and a tupleaof constants fromA, decide whetherK |=q(a).

Fragments ofDL-Lite(boolHN A).We consider syntactic restrictions on the form of concept inclusions inDL-Lite(boolHN A)TBoxes. Following the naming scheme of the extendedDL-Lite family [3], we adopt the following definitions. A KBKbelongs toDL-Lite(kromHN A)if only negation is used in construction of its complex concepts:

C ::= B | ¬B (Krom)

(here and below theBare basic concepts).Kis inDL-Lite(hornHN A)if its complex concepts are constructed by using only intersection:

C ::= B1 · · · Bk. (Horn)

Finally, we sayKis inDL-Lite(coreHN A)if its concept inclusions are of

the form:

B1B2, B1B2 ⊥. (core) Note that the positive occurrences ofBon the right-hand side of the above inclusions can also be of the form∀U.T. AsB1 ¬B2is equivalent toB1B2 ⊥, core TBoxes can be regarded as sitting in the intersection of Krom and Horn TBoxes.

The following table summarizes the obtained combined complex-ity results for KB satisfiabilcomplex-ity and data complexcomplex-ity results for query answering (with numbers coded in binary):

language KB satisfiability query answering

DL-Lite(coreHN A) NLogSpace[Th.4] AC0[Th.6]

DL-Lite(HN A)

horn PTIME [Th.4] AC0[Th.6]

DL-Lite(HN A)

krom NP[Th.5] CONP[3]

DL-Lite(HN A)

bool NP[Th.4] CONP[3]

3

Safe Datatypes

In this section we define the notion ofsafedatatypes and show that such restrictions are required for preserving data complexity of query answering.

DEFINITION1. A set of datatypesD={T1, . . . , Tn}is calledsafe

if (i)the difference between an arbitrary intersection of datatypes and an arbitrary union of datatypes is either empty or unbounded; (ii)all constraints between datatypes are in the form of Horn clauses

Ti1∩ · · · ∩Tik⊆DTi0.

A set of datatypesDis calledweakly safeif(i)arbitrary inter-sections of datatypes are either empty or unbounded and(ii)holds.

Restriction (i) has been independently introduced by Savkovic [16].

It follows, in particular, that ifDis (weakly) safe we can assume that each non-empty datatypeTiis unbounded (note that query

an-swering becomesCONP-hard in presence of datatypes of specific cardinalities [12]); and ifDis safe then also arbitrary intersections of datatypes are either empty or unbounded. Thus, ifDis safe then it is also weakly safe. Condition (ii) ensures that datatype constraints inD have the form of Horn clauses,T1∩ · · · ∩Tk ⊆D T, and

thus computable in PTIME; we further restrict datatype constraints toT1⊆DT2andT1∩ · · · ∩Tk⊆D⊥Dwhen dealing with thecore

language. Indeed, allowing covering constraints between datatypes leads toCONP-hardness of conjunctive query answering:

THEOREM2. Conjunctive query answering inDL-Lite(coreHN A)with

covering constraints on datatypes isCONP-hard(even without sub-roles, sub-attributes and number restrictions).

Proof. We prove the result by reduction of the complement of 2+2CNF (similar to instance checking inALE [17]). Suppose we are given a CNFψin which every clause contains two positive and two negative literals (including the constantstrue,false). LetT be a datatype covered by non-empty disjointT0andT1. LetT contain the following concept inclusions for an attributeU and conceptsBand

C:B ∃U,B ∀U.T,C ∀U.T0, and consider the following conjunctive query

q=∃y, t, uP1(y, t1)∧P2(y, t2)∧N1(y, t3)∧N2(y, t4) ∧U(t1, u1)∧U(t2, u2)∧U(t3, u3)∧U(t4, u4)

∧T0(u1)∧T0(u2)∧T1(u3)∧T1(u4)

with rolesP1,P2,N1andN2. We construct an ABoxwith

in-dividualstrueandfalsefor the propositional constants, an individ-ualxi, for each propositional variable xi inψ, and an individual ci, for each clause ofψ. Let contain assertionB(xi), for each

propositional variablexiinψ, assertionsC(false), U(true, v1), for a

valuev1of datatypeT1, and the following assertions, for each clause

xji1∨xji2∨ ¬xji3∨ ¬xji4ofψ:

P1(ci, xji1), P2(ci, xji2), N1(ci, xji3), N2(ci, xji4)

(here the xj may include propositional constants). It is readily

checked that(T,Aψ) |= qiff ψis satisfiable. Indeed, ifψis

sat-isfiable we constructIby ‘extending’byU(xi, v0)ifxiis false

in the satisfying assignment and byU(xi, v1)otherwise, wherev0

is inT0andv1inT1(recall that these datatypes are non-empty and disjoint). Conversely, if(T,Aψ) |= q then there is a modelIof (T,Aψ)in whichqis false. Then the satisfying assignment can be

defined as follows: a propositional variablexiis true if one of the

attributeU values ofxibelongs to datatypeT1—it does not matter

whether other values belong toT0or not, the negative answer to the queryqguarantees thatψis true under such an assignment.

The following theorem shows that without condition (i) we lose FO-rewritability of conjunctive queries in the presence of number restrictions.

THEOREM3. Conjunctive query answering inDL-Lite(coreHN A)with

datatypes not respecting condition(i)of Definition 1 isCONP-hard

(even without sub-roles and sub-attributes).

Proof. We modify the proof of Theorem 2. Assume that the differ-ence between a datatype,T, and a union of two datatypes,T0and

T1, has a finite cardinality, sayk. We replace the concept inclusion

B ∃U withB (k+ 1)U, which forces a choice of at least

(6)

oneU attribute value to be in eitherT0orT1. In the former case, as before, we assume that the propositional variable gets valuefalse, while in the latter case it gets valuetrue.

Thus, the safe condition essentially disallows the use of enumer-ations and any datatypes, whose non-empty intersection or differ-ence has a finite number of elements. From now on we consider only (weakly) safe datatypes.

4

Complexity of KB Satisfiability in

DL-Lite

(αHN A)

We first introduce the encoding of aDL-Lite(boolHN A)KBK= (T,A)

into a first-order sentenceK‡awith one variable, adopting the

tech-nique introduced in [3]. We denote byrole±(K)the set of role names inKand their inverses, byatt(K)anddt(K)the sets of attribute and datatype names inK, respectively, and byob(A)andval(A)the set of all object and value names inA, respectively. To simplify the pre-sentation, we will assume that(R) is the same asRand will often useHfor a roleRor an attribute nameU. We will also assume that all number restrictions are of the form∃Rand∃T (i.e., only

q= 1is allowed) and that the ABox contains no negative assertions of the form¬Ak(ai)and¬Pk(ai, aj)—see the full version [4] for

the treatment of the full language.

Everyai ob(A)is associated with the individual constantai,

and every concept nameAiwith the unary predicateAi(x). For each

concept∃R, we take a fresh unary predicateER(x). Intuitively, for a role namePk, the predicateEPkrepresents the objects with aPk

-successor—the domain ofPk—andEPkthe range ofPk. We also

introduce individual constants, as representatives of the objects in the domain (dpk) and the range (dp−k) of each role Pk. Similarly,

for each attribute nameUi, we take a unary predicateEUi(x),

rep-resenting the objects with at least one value of the attributeU. We also need, for each attribute nameUiand each datatype nameTj, a

unary predicateUiTj(x), representing the objects such thatalltheir Uiattribute values belong to the datatypeTj(as usual, if they have

attributeUivalues at all).

The encodingC∗of a conceptCis then defined inductively:

⊥∗=⊥, (A

i)=Ai(x), =, (¬C)=¬C(x),

(∃H)=EH(x), (C1C2)=C1(x) C2(x),

(∀U.⊥D)=¬E1U(x), (∀U.Ti)=U Ti(x).

The following sentence then encodes the knowledge baseK:

K‡a=∀xT(x)β(x) R∈role±(K)

R(x)

U∈att(K) θU(x)

∧ A‡a,

where

T∗(x) =

C1C2∈T

C1(x)→C2(x)

H∗ TH

(∃H)(x)→(∃H)(x),

A‡a= A(ai)∈A

A∗(ai)

P(ai,aj)∈A

(∃P)(a

i)(∃P)(aj)

U(ai,vj)∈A

(∃U)(a

i)

T∈dt(K) val(vj)∈/val(T)

¬(∀U.T)(a

i)

,

and T is the reflexive and transitive closure of the sub-role and sub-attribute relations of the TBox, i.e., of the union

{(R, R),(R, R)|RR∈ T } ∪ {(U, U)|U U∈ T }.

Roles are interpreted as binary predicates in a DL interpretation and so, the range of a roleRis not empty whenever its domain con-tains an element. So, in order to capture this intuition, inK‡a we

include the following formula, for eachR∈role±(K):

R(x) =ER(x)→ER−(dr).

Attributes are involved both in existential and universal quantifica-tion. So, the second conjunct ofT∗reflects the fact that if an object has aU value (existential quantifier∃U) then it also has aUvalue, for eachUwithU ∗T U; universal quantification propagates the

datatypes in the opposite direction:

β(x) =

UU∈T

T∈dt(K)

(∀U.T)(∀U.T).

We also need a formula that captures the relationships between datatypes, as defined by the Horn clauses inD, for all attributesU:

θU(x) =

T1∩···∩Tk⊆DT

(∀U.T1)∧ · · · ∧(∀U.T

k)∗→(∀U.T)

.

We note that the formulaθU(x), in particular for disjoint datatypes,

e.g., withT1∩T2 ⊆D ⊥D, demonstrates a subtle interaction be-tween attribute range constraints, ∀U.T, and minimal cardinality constraints,∃U.

We now show that for the Bool, Horn and core languages the ad-dition of attributes toDL-Lite(αHN)of [3] does not change the

com-bined complexity of KB satisfiability:

THEOREM 4. Checking KB satisfiability with weakly safe datatypes is NP-complete in DL-Lite(boolHN A), PTIME-complete in

DL-Lite(hornHN A)andNLOGSPACE-complete inDL-Lite (HN A) core .

Proof. (Sketch) We show that a given KBKis satisfiable iff the universal first-order sentence K‡a is satisfiable. One direction is

straightforward: if there is a model ofKthen the model ofK‡acan

be defined on the same domain by taking, say,C∗to beCI. The key ingredient of the converse direction is the unravelling construction: every model ofK‡acan be unravelled into a DL interpretation—in

essence, the pointsdpkanddp−k are copied to recover the structure

of roles as binary relations, while recovering attributes requires more subtlety; see [4] for more details.

It is of interest to note that the complexity of KB satisfiability in-creases in the case of Krom TBoxes:

THEOREM5. Satisfiability ofDL-Lite(kromHN A)KBs isNP-hard with a single pair of disjoint datatypes even without role and attribute inclusions nor cardinalities(and so, forDL-LiteAkrom).

Proof. The proof is by reduction of 3SAT. It exploits the structure of the formulaθU(x)inK‡a: if datatypesTandTare disjoint then the

concept inclusion

∀U.T ∀U.T ∃U ⊥,

although not in the syntax of DL-Lite(kromHN A), is a logical conse-quence ofT. Using such ternary intersections with the full negation of the Krom fragment one can encode 3SAT. Letϕ = mi=1Cibe

a 3CNF, where theCiare ternary clauses over variablesp1, . . . , pn.

Now, supposepi1∨ ¬pi2∨pi3is theith clause ofϕ. It is equivalent

to¬pi1∧pi2∧ ¬pi3 → ⊥and so, can be encoded as follows:

¬Ai1 ∀Ui.T, Ai2 ∀Ui.T, ¬Ai3 ∃Ui,

A. Artale et al. / DL-Lite with Attributes and Datatypes

(7)

whereA1, . . . , Anare concept names for variablesp1, . . . , pn, and Uiis an attribute for theith clause (note that Krom concept inclusions

of the form¬B Bare required, which is not available in core TBoxes). LetT consist of all such inclusions for clauses inϕ. It can be seen thatϕis satisfiable iffT is satisfiable.

5

Query Answering: Data Complexity

In this section we study the data complexity of answering posi-tive existential queries over a KB expressed in languages with at-tributes and datatypes. As follows from the proof of Theorem 4, for aDL-Lite(boolHN A)KBK= (T,A), every modelMof the first-order sentenceK‡a induces a forest-shaped modelIMofKwith the

fol-lowing properties:

(forest) The namesa∈ob(A)∪val(A)induce a partitioning of the domainΔIMinto disjoint labelled treesTa = (Ta, Ea, a)with

nodesTa, edgesEa, roota, and a labelling functionathat assigns

a role or an attribute name to each edge (indicating a minimal, w.r.t.T, role or attribute name that required a fresh successor due to an existential quantifier); the trees forv∈ val(A)consist of a single node,v.

(copy) There is a mapcp: ΔIM ob(A)val(A)dr |R

role±(K), such thatcp(a) = a, ifa ob(A)∪val(A), and

cp(w) =dr, ifa(w, w) =R−, for(w, w)∈Ea.

(role) For every role (attribute name)H,

HIM = (a

i, aj)|H(ai, aj)∈ A, H∗T H∪

(w, w)E

a|a(w, w) =H, H∗T H, a∈ob(A).

THEOREM6. The positive existential query answering problem for

DL-Lite(hornHN A)andDL-Lite (HN A)

core is inAC0for data complexity.

Proof. We adopt the technique of the proof of Theorem 7.1 [3]. Sup-pose that we are given aconsistentDL-Lite(hornHN A)KBK= (T,A)

and a positive existential query in prenex formq(x) = ∃y ϕ(x, y) in the signature ofK. LetM0be theminimal Herbrand modelof (the universal Horn sentence)K‡a, and letI0 = (ΔI0,·I0)be the

canonicalmodel ofK, i.e., the model induced byM0(see its con-struction in the full version [4]). The following properties hold, for all basic conceptsBand datatypesT:

aIi0∈B

I0 iff K |=B(a

i),forai∈ob(A), (1) w∈BI0 iff K |=∃RB, forwwithcp(w) =dr, (2)

vI0

i ∈T

I0 iff val(v

i)∈val(T),forvi∈val(A), (3) v∈TI0iffwBI0

1 , . . . , BkI0,T |=B1 · · · Bk ∀U.T,(4)

for(w, v)∈UI0andv /val(A).

Formula (1) describes conditions when a named object,ai, belongs

to a basic concept,B, in the canonical modelI0—we say it describes thetypeofai. Similarly, (2) describes types of unnamed objects,

which are copies of thedr, for rolesR; it is worth pointing out that those types are determined by a single concept,∃R. The same two properties were used in the proof of he proof of Theorem 7.1 [3]. The other two properties are specific to datatypes: (3) describes the type of a named datatype value and (4) the type of an unnamed datatype value. We note that (4) holds only for safe datatypes, and even weakly safe datatypes cannot guarantee that in the process of unravelling it is always possible, for everyw∈(∃U)I0, to pick a fresh attributeU

value of the ‘minimal type’, i.e., a datatype value that belongsonly

to datatypesT withw∈(∀U.T)I0.

It is straightforward to check that the canonical modelI0provides correct answers to all queries:

LEMMA7. K |=q(a)iffI0|=q(a), for all tuplesa.

Thedepthof a pointw ΔI0 is the length of the shortest path

in the respective tree to its root. Denote byWm the set of points

of depth≤m(including also valuesv ΔIV0) that were taken to

satisfy existential quantifiers for objects inWm−1. Our next lemma

shows that to check whetherI0|=q(a)it suffices to consider points of depth≤m0inΔI0, for somem0that does not depend on|A|:

LEMMA8. IfI0 |= ∃y ϕ(a, y)then there is an assignmenta0in

Wm0such thatI0|=a0 ϕ(a, y)anda0(yi)∈Wm0, for allyi∈y,

wherem0=|y|+|role±(T)|+ 1.

To complete the proof of Theorem 6, we encode the problem ‘I0 |= q(a)?’ as a model checking problem for first-order formu-las over the ABoxAconsidered as a first-order model, also denoted byA, with domainob(A)∪val(A); we assume that this first-order model also contains all datatype extensions. Now we define a first-order formula ϕT,q(x) in the signature of T andq such that (i)

ϕT,q(x)depends onT andqbut not onA, and (ii)A |=ϕT,q(a)iff I0|=q(a).

Denote bycon(K)the set of basic concepts inKtogether with all concepts of the form∀U.T, for attribute namesU and datatypesT

fromT. We begin by defining formulasψB(x), forB con(K),

that describe the types of named objects (cf. (1)): for allai∈ob(A), A |=ψB(ai) iff aiI0 ∈BI0, ifBis a basic concept, (5) A |=ψ∀U.T(ai) iff aIi0 ∈B

I0

1 , . . . , BkI0and (6) T |=B1 · · · Bk ∀U.T.

These formulas are defined as the ‘fixed-points’ of sequences

ψ0B(x), ψB1(x), . . . defined by takingψB0(x) = B∗if BisA,

or,ψB0(x) =

H∗TH∃y H

(x, y)ifB=∃H,ψ0

B(x) =if B=∀U.T and

ψiB(x) =ψ0B(x)

B1 ··· BkB∈ext(T)

ψBi−11(x)∧ · · · ∧ψ

i−1 Bk(x)

,

whereext(T)is the extension ofT with the following: – ∃H ∃H, for allH∗T H,

∀U.T ∀U.T, for allU∗T UandT ∈dt(K),

∀U.T1 · · · ∀U.Tk ∀U.T, for allT1∩ · · · ∩Tk⊆DT.

(We again assume that all number restrictions are of the form∃R and∃U; the full version [4] treats arbitrary number restrictions). It should be clear that there isNwithψN

B(x)≡ψBN+1(x), for allBat

the same time, and thatNdoes not exceed the cardinality ofcon(K). We setψB(x) =ψBN(x).

Next we define sentencesθB,dr, for B con(K) anddr with R∈role±(K), that describe types of the unnamed points, i.e., copies of thedr(cf. (2)): for allwwithcp(w) =dr,

A |=θB,dr iff w∈BI0,ifBis a basic concept, (7) A |=θ∀U.T,dr iff T |=∃R ∀U.T. (8)

Note that the type of copies ofdris determined by a single concept,

∃R, and therefore, there is no need to consider conjunctions in (8); see also (6). We inductively define a sequenceθ0B,dr, θ1B,dr, . . . by

takingθB,dr0 =ifB=∃Randθ0B,dr=otherwise, withθiB,dr

defined similarly toψBi above. As with theψB, setθB,dr=θB,drN .

Now, supposeI0 |=a0 ϕ(a, y) anda0(yi) Wm

0, for every

yi∈y, wherem0is as in Lemma 8. Recall that our aim is to compute

(8)

the answer to this query in the first-order modelArepresenting the ABox. This model, however, does not contain points inWm0\W0,

and to represent them, we use the following ‘trick.’ By(forest), every

w∈Wm0is uniquely determined by a pair(a, σ), whereais the root

of the treeTacontainingwandσis the sequence of labelsa(u, v)

on the path fromatow. Not every such pair, however, corresponds to an element inWm0. In order to identify points inWm0, we consider

the following directed graphGT = (VT, ET), whereVT is the set of equivalence classes[H] ={H|H ∗T HandH∗T H}and

ET is the set of all pairs([R],[H])such thatT |=∃R− ∃Hand

R− ∗T H, andH has no proper sub-role/attribute satisfying this

property. LetΣT,m0be the set of all paths in the graphGT of length

≤m0: more precisely,

ΣT,m0 =

ε∪VT ([H1], . . . ,[Hn])|2≤n≤m0

and([Hj],[Hj+1])∈ET, for1≤j < n

.

By the unravelling procedure, we have σ ΣT,m0, for all pairs

(a, σ)representing elements ofWm0. We note, however, that a pair

(a, σ)withσ = ([H], . . .) ΣT,m0 corresponds to aw∈ Wm0

only ifahas not enoughH-witnesses inA.

In the first-order rewritingϕT,qwe are about to define we assume

that the bound variables yi range overW0 and represent the first

component of the pairs(a, σ)(theseyishould not be confused with

theyiin the original queryq, which range overWm0), whereas the

second component is encoded in theith memberσiof a vectorσ.

Note that constants and free variables need no second component,σ, and, to unify the notation, for each termtwe denote itsσ-component by, which is defined as follows:tσ =εiftis a constant or free

variable and=σ

iift=yi.

Letkbe the number of bound variablesyiand letΣkT,m0be the set

ofk-tuplesσ = (σ1, . . . , σk)withσi ΣT,m0. Given an

assign-menta0inWm0, we denote bysplit(a0)the pair(a, σ)made of an

assignmentainAandσ∈ΣkT,m0such thatt

σ = ([H1], . . . ,[H n]),

for a sequenceH1, . . . , Hnofa-labels on the path fromatoa0(t).

We define now, for everyσ ΣkT,m0, concept nameA, role or

at-tribute nameHand datatype nameT:

(t) =

ψA(t), if=ε,

θA,inv(ds), if=σ.[S],

(t 1, t2) =

⎧ ⎨ ⎩

HT(t1, t2),iftσ

1=2 =ε,

(t1=t2), if

1.[S]=2,or2=1.[S−],forS∗T H,

⊥, otherwise,

(t) =

⎧ ⎨ ⎩

T(t), if=ε,

ψ∀U.T(t), if= [U],

θU.T,ds−,if=σ.[S].[U].

LEMMA9. For each assignmenta0inWm0with split(a0) = (a, σ),

I0|=a0 A(t) iff A |=a(t), for concept namesA,

I0|=a0H(t1, t2) iff A |=a(t

1, t2), for roles and attribute namesH,

I0|=a0T(t) iff A |=a(t), for datatype namesT.

Finally, we define the first-order rewriting ofqandT by taking:

ϕT,q(x) = ∃y

σ∈ΣkT,m

0

ϕσ(x, y) 1≤i≤k σi=([Hi],...)=ε

¬ψ0

∃Hi(yi)∧ψ∃Hi(yi)

,

whereϕσ(x, y) is the result of attaching the superscriptσ to each

atom of ϕ; the last conjunct ensures that each pair(a, σi)

corre-sponds an element ofw∈ Wm0. Correctness of this rewriting

fol-lows from Lemma 9, see the full version [4].

6

Conclusions

We extendedDL-Litewithlocal attributes—allowing the use of the same attribute associated to different concepts—andsafedatatypes where datatype constraints can be expressed with Horn-like clauses. Notably, this is the first time thatDL-Lite is equipped with a form of the universal restriction∀U.T. We showed that such an extension is harmless with the only exception of the Krom fragment, where the complexity rises from NLOGSPACEto NP. We studied also the problem of answering positive existential queries and showed that for the Horn and core extensions the problem remains in AC0(i.e., FO-rewritable).

As a future work we are interested in relaxing the safe condition for datatypes, in particular we conjecture that the restriction on the boundedness of datatype difference can be relaxed for particular con-crete domains.

REFERENCES

[1] A. Artale, D. Calvanese, R. Kontchakov, V. Ryzhikov, and M. Za-kharyaschev, ‘Reasoning over extended ER models’, inProc. of the

26th Int. Conf. on Conceptual Modeling (ER 2007), volume 4801 of

Lecture Notes in Computer Science, pp. 277–292. Springer, (2007).

[2] A. Artale, D. Calvanese, R. Kontchakov, and M. Zakharyaschev, ‘

DL-Litein the light of first-order logic’, inProc. of the 22nd Nat. Conf. on

Artificial Intelligence (AAAI 2007), pp. 361–366, (2007).

[3] A. Artale, D. Calvanese, R. Kontchakov, and M. Zakharyaschev, ‘The

DL-Litefamily and relations’, Journal of Artificial Intelligence

Re-search,36, 1–69, (2009).

[4] A. Artale, R. Kontchakov, and V. Ryzhikov, ‘DL-Lite with attributes, datatypes and sub-roles (full version)’, Technical Report BBKCS-12-01, Department of Computer Science and Information Systems, Birk-beck, University of London, (2012).

[5] F. Baader, S. Brandt, and C. Lutz, ‘Pushing theELenvelope’, inProc. of the 19th Int. Joint Conf. on Artificial Intelligence, IJCAI-05. Morgan-Kaufmann Publishers, (2005).

[6] D. Berardi, D. Calvanese, and G. De Giacomo, ‘Reasoning on UML class diagrams’,Artificial Intelligence,168(1–2), 70–118, (2005). [7] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati,

DL-Lite: Tractable description logics for ontologies’, inProc. of the

20th Nat. Conf. on Artificial Intelligence (AAAI), pp. 602–607, (2005).

[8] D. Calvanese, G. De Giacomo, D. Lembo, M. Lenzerini, and R. Rosati, ‘Tractable reasoning and efficient query answering in description log-ics: TheDL-Litefamily’,Journal of Automated Reasoning,39(3), 385– 429, (2007).

[9] D. Calvanese, M. Lenzerini, and D. Nardi, ‘Unifying class-based rep-resentation formalisms’,Journal of Artificial Intelligence Research,11, 199–240, (1999).

[10] B. Cuenca Grau, I. Horrocks, B. Motik, B. Parsia, P. Patel-Schneider, and U. Sattler, ‘OWL 2: The next step for OWL’,Journal of Web

Se-mantics: Science, Services and Agents on the World Wide Web,6(4),

309–322, (2008).

[11] M. Despoina, Y. Kazakov, and I. Horrocks, ‘Tractable extensions of the description logic EL with numerical datatypes’,Journal of Automated

Reasoning, (2011).

[12] E. Franconi, Y. A. Ib´a˜nez-Garc´ıa, and I. Seylan, ‘Query answering with DBoxes is hard’,Electr. Notes Theor. Comput. Sci.,278, 71–84, (2011). [13] C. Lutz, ‘Description logics with concrete domains—a survey’, in

Ad-vances in Modal Logics Volume 4. King’s College Publications, (2003).

[14] J. Pan and I. Horrocks, ‘OWL-Eu: Adding customised datatypes into OWL’,Web Semantics: Science, Services and Agents on the World Wide Web,4(1), (2011).

[15] A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini, and R. Rosati, ‘Linking Data to Ontologies’,Journal on Data Semantics,

X, 133–173, (2008).

[16] O. Savkovi´c,Managing Datatypes in Ontology-Based Data Access, MSc dissertation, European Master in Computational Logic, Faculty of Computer Science, Free University of Bozen-Bolzano, October 2011. [17] A. Schaerf, ‘On the complexity of the instance checking problem in

concept languages with existential quantification’,Journal of

Intelli-gent Information Systems,2, 265–278, (1993).

A. Artale et al. / DL-Lite with Attributes and Datatypes

Figure

Figure 1.Salary example

References

Related documents

ƒ During the creation of the sales order the system must determine the shipping point from which the material will be shipped and the route the material will take to get from

Our empirical analysis emphasized the salience of two large groups of users in banks, which face different IS risks and have different needs of information in their work

(b) The average pairwise mutual information I SYMB computed from pre-symbolized time series, or (c) the corresponding version I computed from the associated multiplex

There is a strong connection between query answering in data integration and query answering in databases with incomplete information under constraints (or, query answering in

Currently, there is a human investment advisor answering to these questions, but Nordea is planning on integrating their customer service chatbot Nova into Nora to make the process

Ipomoea maurandiodes is a widespread species often growing on rock outcrops or in sandy cerrado extending from NE Argentina (O ’ Donell 1959b ), eastern Paraguay and eastern

“Permission” shall mean a record in the data base (Information System of the Communications Regulatory Authority) administered by the Communications Regulatory