Query inseparability for description logic knowledge bases

(1)

Query Inseparability for Description Logic Knowledge Bases

E. Botoeva,

1

R. Kontchakov,

2

V. Ryzhikov,

1

F. Wolter

3

and M. Zakharyaschev

2

1_{Faculty of Computer Science} 3_{Department of Computer Science} 2_{Department of Computer Science} Free University of Bozen-Bolzano, Italy University of Liverpool, U.K. Birkbeck, University of London, U.K.

{botoeva,ryzhikov}@inf.unibz.it [email protected] {roman,michael}@dcs.bbk.ac.uk

Abstract

We investigate conjunctive query inseparability of description logic (DL) knowledge bases (KBs) with respect to a given signature, a fundamental problem for KB versioning, module extraction, forgetting and knowledge exchange. We study the data and combined complexity of deciding KB query insep-arability for fragments ofHorn-ALCHI, including the DLs underpinningOWL 2 QL andOWL 2 EL. While all of these DLs are P-complete for data complexity, the combined com-plexity ranges from P to EXPTIMEand 2EXPTIME. We also resolve two major open problems forOWL 2 QLby showing that TBox query inseparability and the membership problem for universal UCQ-solutions in knowledge exchange are both EXPTIME-complete for combined complexity.

Introduction

A description logic (DL) knowledge base (KB) consists of a terminological box (TBox), storing conceptual knowledge, and an assertion box (ABox), storing data. Typical applica-tions of KBs involve answering queries over incomplete data sources (ABoxes) augmented by ontologies (TBoxes) that provide additional information about the domain of interest as well as a convenient vocabulary for user queries. The standard query language in such applications, which bal-ances expressiveness and computational complexity, is the language of conjunctives queries (CQs).

With typically large data, often tangled ontologies, and the hard problem of answering CQs over ontologies, vari-ous transformation and comparison tasks are becoming in-dispensable for KB engineering and maintenance. For ex-ample, to make answering certain CQs more efficient, one may want to extract from a given KB a smaller module re-turning the same answers to those CQs as the original KB; to provide the user with a more convenient query vocabu-lary, one may want to reformulate the KB in a new language. These tasks are known as module extraction (Stucken-schmidt, Parent, and Spaccapietra 2009) and knowledge ex-change (Arenas et al. 2012); other relevant tasks include ver-sioning, revision and forgetting (Jim´enez-Ruiz et al. 2011; Wang, Wang, and Topor 2010; Lin and Reiter 1994).

In this paper, we investigate the following relationship be-tween KBs which is fundamental for all such tasks. LetΣ

be a signature consisting of concept and role names. We call KBsK1andK2Σ-query inseparableand writeK1 ≡ΣK2 if any CQ formulated in Σhas the same answers overK1 andK2. Note that even for Σcontaining all concept and role names,Σ-query inseparability does not necessarily im-ply logical equivalence. The relativisation to (smaller) sig-natures is crucial to support the tasks mentioned above:

(versioning) When comparing twoversionsK1 andK2 of a KB with respect to their answers to CQs in a relevant signatureΣ, the basic task is to check whetherK1≡ΣK2.

(modularisation) AΣ-module of a KBKis a KBK0_{⊆ K}

such thatK0 _≡

Σ K. If we are only interested in answer-ing CQs in Σover K, then we can achieve our aim by querying anyΣ-module ofKinstead ofKitself.

(knowledge exchange) In knowledge exchange, we want to transform a KBK1in a signatureΣ1to a new KBK2in a disjoint signatureΣ2 connected toΣ1via a declarative mapping specification given by a TBoxT12. Thus, the tar-get KBK2should satisfy the conditionK1∪ T12≡Σ2 K2,

in which case it is called auniversal UCQ-solution(CQ and UCQ inseparabilities coincide for Horn DLs).

(forgetting) A KBK0 _{results from}_forgetting_{a signature}_Σ

in a KBKifK0 _≡

sig(K)\ΣKandsig(K0)⊆sig(K)\Σ. Thus, the result of forgettingΣdoes not useΣand gives the same answers to CQs without symbols inΣasK.

(2)

restrictions (or role inclusions) that makes Σ-query insep-arability hard. To establish the upper complexity bounds, we develop a uniform game-theoretic technique for check-ing finiteΣ-homomorphic embeddability between (possibly infinite) materialisations of KBs.

Horn-ALCHI

Horn-ALCI

Horn-ALCH

Horn-ALC ELH

EL

DL-LiteH_horn

DL-Litehorn

DL-LiteH_core

DL-Litecore

P Thms. 12, 24

EXPTIME Thms. 12, 25

EXPTIME Thms. 23, 25

P Thms. 16, 24 2EXPTIME

Thms. 23, 25

forward strategy

arbitrary strategy

backward+forward strategy

Σ-query inseparability for KBs has not been investigated systematically before. The polynomial upper bound forEL

was established as a preliminary step to study TBox insep-arability (Lutz and Wolter 2010), and this notion was also used to study forgetting forDL-LiteN_bool (Wang et al. 2010). We apply our results to resolve two important open prob-lems. First, we show that the membership problem for uni-versal UCQ-solutions in knowledge exchange for KBs in

DL-LiteH_core is EXPTIME-complete for combined complex-ity, which settles an open question of (Arenas et al. 2013), where only PSPACE-hardness was established. We also show thatΣ-query inseparability ofDL-LiteH_core TBoxes is EXPTIME-complete, which closes the PSPACE–EXPTIME

gap that was left open by Konev et al. (2011).

Recall that TBoxesT1andT2areΣ-query inseparable if, forallΣ-ABoxesA(which only use concept and role names fromΣ), the KBs(T1,A)and(T2,A)areΣ-query insepa-rable. TBox and KB inseparabilities have different applica-tions. The former supports ontology engineering when data is not known or changes frequently: one can equivalently replace one TBox with another only if they return the same answers to queries foreveryΣ-ABox. In contrast, KB insep-arability is useful in applications where data is stable such as knowledge exchange, module extraction or forgetting for a stable KB in order to re-use it in a new application or as a compilation step to make CQ answering more efficient. As we show below, TBox and KBΣ-query inseparabilities also have different computational properties.

TBoxΣ-query inseparability has been extensively stud-ied (Kontchakov, Wolter, and Zakharyaschev 2010; Lutz and Wolter 2010; Konev et al. 2012). For work on dif-ferent notions of TBox inseparability and the correspond-ing notions of modules and forgettcorrespond-ing, we refer the reader to (Cuenca Grau et al. 2008; Konev, Walther, and Wolter 2009; Del Vescovo et al. 2011; Nikitina and Rudolph 2012; Nikitina and Glimm 2012; Lutz, Seylan, and Wolter 2012).

Omitted proofs can be found in the full version available atwww.dcs.bbk.ac.uk/˜roman.

Horn-

ALCHI

and its Fragments

All the DLs for which we investigate KBΣ-query insepara-bility are Horn fragments ofALCHI. To define these DLs, we fix sequences ofindividual namesai,concept namesAi,

androle namesPi, wherei < ω. Aroleis either a role name Pi or aninverse role P_i−; we assume that (P_i−)− = Pi.

ALCI-concepts,C, are defined by the grammar

C::= Ai | > | ⊥ | ¬C|C1uC2|C1tC2| ∃R.C | ∀R.C, whereRis a role.ALC-conceptsareALCI-concepts with-out inverse roles; EL-concepts are ALC-concepts without the constructs⊥,t,¬and∀R.C.DL-Litehorn-conceptsare

ALCI-concepts withoutt,¬and∀R.C, in whichC =>

in every occurrence of∃R.C. Finally,DL-Litecore-concepts

areDL-Litehorn-concepts withoutu; in other words, they are basic conceptsof the form⊥,>,Aior∃R.>.

For a DLL, anL-concept inclusion(CI) takes the form C v D, where CandD areL-concepts. AnL-TBox,T, contains a finite set ofL-CIs. AnALCHI,DL-LiteH_hornand

DL-LiteH_core TBox can also contain a finite set of role in-clusions(RIs)R1 v R2, where theRiare roles. In ELH TBoxes, RIs do not have inverse roles. DL-Lite TBoxes may also containdisjointness constraintsB1uB2v ⊥and R1uR2v ⊥, for basic conceptsBiand rolesRi.

To introduce the Horn fragments of these DLs, we re-quire the following (standard) recursive definition (Hustadt, Motik, and Sattler 2005; Kazakov 2009): a conceptC oc-curs positively in C; if C occurs positively (respectively, negatively) in C0 then C occurs positively (negatively) in C0 tD, C0 uD, ∃R.C0,∀R.C0, D v C0, and it occurs negatively (positively) in¬C0 _and_C0 _v_D_{. Now, we call a}

TBoxT Hornif no concept of the formCtDoccurs posi-tively inT, and no concept of the form¬Cor∀R.Coccurs negatively inT. In the DLHorn-L, whereLis one of our DLs, only HornLTBoxes are allowed. Clearly, theELand

DL-LiteTBoxes are Horn by definition.

An ABox, A, is a finite set of assertions of the form Ak(ai) orPk(ai, aj). AnL-TBox T and an ABoxA to-gether form anLknowledge base(KB)K = (T,A). The set of individual names inKis denoted byind(K).

The semantics for the DLs is defined in the usual way based on interpretationsI = (∆I,·I₎_{that comply with the}

unique name assumption:aI_i 6=aI_j fori6=j(Baader et al. 2003). We writeI |=αin case an inclusion or assertionα is true inI. IfI |=α, for allα∈ T ∪ A, thenIis amodel

of a KBK= (T,A); in symbols:I |=K.Kisconsistentif it has a model.K |=αmeans thatI |=αfor allI |=K.

Aconjunctive query(CQ)q(~x)is a formula∃~y ϕ(~x, ~y), whereϕis a conjunction of atoms of the form Ak(z1) or Pk(z1, z2)withzi∈~x∪~y. A tuple~a⊆ind(K)(of the same length as~x) is acertain answertoq(~x)overK= (T,A)if

I |=q(~a)for allI |=K; in this case we writeK |=q(~a). If ~

x=∅, the answer toqis ‘yes’ ifK |=qand ‘no’ otherwise. For combined complexity, the problem ‘K |= q(~a)?’ is NP-complete for theDL-Litelogics (Calvanese et al. 2007),

ELandELH(Rosati 2007), and EXPTIME-complete for the remaining Horn DLs above (Eiter et al. 2008). For data com-plexity (with fixedT andq), this problem is in AC0_{for the}

DL-Lite logics (Calvanese et al. 2007) and P-complete for the remaining DLs (Rosati 2007; Eiter et al. 2008).

(3)

Σ-Query Entailment and Inseparability

We define the central notions of this paper.

Definition 1 LetK1andK2be KBs andΣa signature. – K1Σ-query entailsK2ifK2|=q(~a)impliesK1|=q(~a)

forallΣ-CQsq(~x)and all~a⊆ind(K2).

– K1andK2areΣ-query inseparableif theyΣ-query entail each other. In this case we writeK1≡ΣK2.

Observe thatΣ-query inseparability is weaker than log-ical equivalence even if Σ = sig(K1)∪ sig(K2), where sig(Ki)is the signature ofKi. For example,(∅,{A(a)})is

{A, B}-query inseparable from({BvA},{A(a)})but the two KBs are clearly not logically equivalent. Since check-ingΣ-query inseparability can be reduced to twoΣ-query entailment checks, we can prove complexity upper bounds for entailment. Conversely, for most languages we have a semantically transparent reduction ofΣ-query entailment to Σ-query inseparability:

Theorem 2 LetLbe any of our DLs containingELor hav-ing role inclusions. ThenΣ-query entailment forL-KBs is

LOGSPACE-reducible toΣ-query inseparability forL-KBs.

Proof sketch.LetKi= (Ti,Ai),i= 1,2, andΣbe given. We may assume that Σ = sig(K1) ∩sig(K2). We also assume thatL has role inclusions, K1 andK2 are consis-tent and thetrivial interpretationI∅ (with|∆I∅| = 1and SI∅ ₌ ∅_{, for any}_S_{) is a model of the}T

i (a proof without those assumptions is given in the full version). LetK0

i be a copy ofKiin which all symbolsSare replaced by freshSi, and letKΣ

i extendK0i withSi v S, forS ∈ Σ. One can show thatK1Σ-query entailsK2iffK1≡ΣKΣ1 ∪ KΣ2. q ThatI∅ |= Ki is essential in the reduction above. Take

T1 ={AvB, Av ∃R.C},T2 ={> vB, CuB v ⊥} andΣ ={A, B, R, C}. ThenK1 = (T1,{A(a)}) Σ-query entailsK2= (T2,{A(a)})butK16≡ΣKΣ1 ∪ KΣ2.

We now consider the relationship between inseparability and universal UCQ-solutions in knowledge exchange. Sup-poseK1 andK2are KBs in disjoint signaturesΣ1andΣ2. LetT12 be a mapping consisting of inclusions of the form S1 v S2, where theSi are concept (or role) names inΣi. ThenK2 is a universal UCQ-solution for (K1,T12,Σ2) if

K1∪ T12≡Σ2K2. Deciding the latter is called the

member-ship problem for universal UCQ-solutions. For DLsLwith role inclusions, the problem whetherK1∪ T12≡Σ2 K2is a

Σ2-query inseparability problem inL. Conversely, we have:

Theorem 3 Σ-query entailment for any of our DLs L is

LOGSPACE-reducible to the membership problem for uni-versal UCQ-solutions inL.

Proof sketch. We want do decide whether K1 Σ-query entails K2. We again assume that I∅ |= Ti and use the proof of Theorem 2 (for the general case, see the full ver-sion). We may assume thatΣ = sig(K1)∩sig(K2). Let Σ1 = sig(K1). Then K1 Σ-query entails K2 iffK1 Σ1-query entails K2. By the proof of Theorem 2, the latter is the case iff K1 Σ1-query entails KΣ11 ∪ K

Σ1

2 . Clearly,

KΣ1

1 ∪ K Σ1

2 Σ1-query entails K1, and so the two KBs are Σ1-query inseparable. Then K1 Σ-query entailsK2iffK1 is a universal UCQ-solution for(K0

1∪ K02,T12,Σ1), where

T12={S1vS, S2vS|S∈Σ1}. q

Semantic Characterisation

In this section, we give a semantic characterisation of KB Σ-query entailment based on an abstract notion of material-isation and finite homomorphisms between such models.

LetKbe a KB. An interpretationIis called a materiali-sationofKif, for all CQsq(~x)and tuples~a⊆ind(K),

K |=q(~a) iff I |=q(~a).

We say thatKismaterialisableif it has a materialisation. Materialisations can be used to characterise KBΣ-query entailment by means ofΣ-homomorphisms. For an interpre-tationIand a signatureΣ, theΣ-typestI_Σ(x)andrI_Σ(x, y) ofx, y∈∆Iare defined by taking:

tI_Σ(x) ={Σ-concept nameA|x∈AI}, rI_Σ(x, y) ={Σ-roleR|(x, y)∈RI}.

SupposeIi is a materialisation ofKi,i = 1,2. A function h: ∆I2 _→_∆I1 _{is a}_Σ-_homomorphism_from_I

2toI1if, for anya∈ind(K2)and anyx, y∈∆I2_,

– h(aI2_{) =}_aI1 _whenever_tI2

Σ(a)6=∅orr

I2

Σ(a, y)6=∅for somey∈∆I2_{, and}

– tI2

Σ(x)⊆t

I1

Σ(h(x)), r

I2

Σ(x, y)⊆r

I1

Σ(h(x), h(y)). As answers to Σ-CQs are preserved under Σ-homomorph-isms,K1Σ-query entailsK2if there is aΣ-homomorphism fromI2toI1. However, the converse does not hold:

Example 4 SupposeI2 andI1 below are materialisations of KBsK2andK1, whereais the only ABox individual:

a u

P R S T S T

Q

Q Q Q

a x

T , Q R

S, Q R

T , Q R

S, Q

I2

I1

LetΣ ={Q, R, S, T}. Then there is noΣ-homomorphism from I2 toI1 (as r_ΣI2(a, u) = ∅, we can map u to, say, x but then only the shaded part of I2 can be mapped Σ-homomorphically toI1). However, for anyΣ-queryq(~x),

I2|=q(~a)impliesI1|=q(~a)as any finite subinterpretation ofI2can beΣ-homomorphically mapped toI1.

We say thatI2isfinitelyΣ-homomorphically embeddable

intoI1if, for everyfinitesubinterpretationI20 ofI2, there exists aΣ-homomorphism fromI0

2toI1.

To prove the following theorem, one can regard any finite subinterpretation ofI2as a CQ whose variables are elements of∆I2_{, with the answer variables being in}_ind₍_K

2).

Theorem 5 SupposeKiis a consistent KB with a

material-isationIi,i = 1,2. ThenK1 Σ-query entailsK2iffI2 is

finitelyΣ-homomorphically embeddable intoI1.

One problem with applying Theorem 5 is that materiali-sations are in general infinite for any of the DLs considered in this paper. We address this problem by introducing finite representations of materialisations. Let K = (T,A)be a KB and letG = (∆G,·G_, ₎_{be a finite structure such that}

(4)

function on∆GwithAG_i ⊆∆G,P_iG⊆ind(K)×ind(K), and (∆G, )is a directed graph (containing loops) with nodes ∆G and edges ⊆∆G×Ω, in which every edgeu vis labelled with a set(u, v)G 6=∅of roles satisfying the condi-tion: ifu1 vandu2 v, then(u1, v)G = (u2, v)G. We callG agenerating structure forK if the interpretationM

defined below is a materialisation ofK.

ApathinGis a sequenceσ=u0. . . unwithu0∈ind(K) andui ui+1fori < n. Lettail(σ) =unand letpath(G) be the set of paths inG. The materialisationMis given by:

∆M=path(G), aM=a, fora∈ind(K), AM={σ|tail(σ)∈AG},

PM=PG∪ {(σ, σu)|tail(σ) u, P∈(tail(σ), u)G} ∪ {(σu, σ)|tail(σ) u, P− ∈(tail(σ), u)G}. We say that a DLLhasfinitely generated materialisations

if everyL-KB has a generating structure.

Theorem 6 Horn-ALCHI and all of its fragments defined above have finitely generated materialisations. Moreover,

– for any L ∈ {ALCHI,ALCI,ALCH,ALC} and any

Horn-LKB (T,A), a generating structure can be con-structed in time|A| ·2p(|T |)_,_p_{a polynomial}_;

– for anyLin theELandDL-Litefamilies and anyLKB

(T,A), a generating structure can be constructed in time

|A| ·p(|T |),pa polynomial.

Finite generating structures have been defined for

EL(Lutz, Toman, and Wolter 2009),DL-Lite (Kontchakov et al. 2010) and more expressive Horn DLs (Eiter et al. 2008). With the exception ofDL-Lite, however, the relation guiding the construction of materialisations was implicit. We show how the existing constructions can be converted to generating structures in the full version.

Example 7 The materialisationI2from Example 4 can be generated by the structureG2shown below:

a

P R−

S−

T−

S− Q

−

Q−

G2

For a generating structureGforKand a signatureΣ, the Σ-typestG_Σ(u)andrG_Σ(u, v)ofu, v∈∆Gare defined by:

tG_Σ(u) ={Σ-concept nameA|u∈AG},

rG_Σ(u, v) =  



{Σ-roleR|(u, v)∈RG_}_, _if_{u, v}_∈_ind₍_K₎_,

{Σ-roleR|R∈(u, v)G}, ifu v,

∅, otherwise,

where(P−)Gis the converse ofPG. We also definer¯G_Σ(u, v) to contain the inverses of the roles inrG_Σ(u, v); note that ¯

rG_Σ(u, v)is not the same asr_ΣG(v, u); cf. theT−, S−-cycle in Example 7. We writeu Σ_v_if_u _v_and_rG

Σ(u, v)6=∅. In the next section, we show that, for a DL L having finitely generated materialisations, the problem of checking Σ-query entailment betweenL-KBs can be reduced to the problem of finding a winning strategy in a game played on the generating structures for these KBs.

Σ-Query Entailment by Games

Suppose a DLLhas finitely generated materialisations,Ki is a consistentL-KB, fori = 1,2, andΣa signature. Let

Gi= (∆Gi,·Gi, i)be a generating structure forKiand let

Mi be its materialisation; GiΣandMΣi denote the restric-tions ofGiandMitoΣ.

We begin with a very simple game on the finite generating structureGΣ

2 and the possibly infinite materialisationMΣ1.

Infinite game GΣ(G2,M1). This game is played by two players: player 2 and player 1. Thestatesof the game are of the formsi = (ui 7→σi), fori ≥0, whereui ∈ ∆G2 and σi∈∆M1satisfy the following condition:

(s1) tGΣ2(ui)⊆t

M1

Σ (σi).

The game starts in a states0 = (u0 7→ σ0)withσ0 = u0 in caseu0 ∈ ind(K2). In each roundi >0, player 2 chal-lenges player 1 with someui∈∆G2_{such that}_ui

−1 Σ2 ui. Player 1 has to respond with aσi∈∆M1_satisfying_(s

1)and

(s2) rGΣ2(ui−1, ui)⊆rMΣ1(σi−1, σi).

This gives the next statesi= (ui7→σi). Note that of all the uionlyu0may be an ABox individual; however, there is no such a restriction on theσi. Aplay of lengthn≥0starting

froms0is any sequences0, . . . ,snof states obtained as de-scribed above. For an ordinalλ≤ ω, we say that player 1 has aλ-winning strategy in the gameGΣ(G2,M1)starting

from a states0if, for any play of lengthi < λ, which starts froms0and conforms with this strategy, and any challenge of player 2 in roundi+ 1, player 1 has a response.

The following theorem gives a game-theoretic flavour to the criterion of Theorem 5 (see the full paper for a proof).

Theorem 8 M2is finitelyΣ-homomorphically embeddable

intoM1iff the following conditions hold:

(abox) rM2

Σ (a, b)⊆r

M1

Σ (a, b), for anya, b∈ind(K2);

(win) for anyu0∈∆G2andn < ω, there existsσ0∈∆M1

such that player 1 has ann-winning strategy in the game

GΣ(G2,M1)starting from(u07→σ0).

Example 9 LetΣ = {Q, R, S, T}. ConsiderGΣ

2 andMΣ2 shown in the picture below:

a

u

R− S−

T−

S− Q

−

Q−

a σ

T , Q R

S, Q R

T , Q R

S, Q

0

1 2 ₂ 3

3

4 4

GΣ 2

MΣ 1

For anyn < ω andu ∈ ∆G2_{, player 1 has an} _n_-winning

strategy inGΣ(G2,M1). A4-winning strategy starting from (u7→σ)is shown by dotted lines (in round 2, player 2 has two possible challenges). For a largern, a suitableσcan be chosen further away from the rootaofM1.

(5)

by analysing a more complex game on the finite generat-ing structuresG2andG1. We consider four types of strate-gies inGΣ(G2,M1). For each type,τ, we define a game Gτ

Σ(G2,G1)such that, for anyu0∈∆G2, the following con-ditions are equivalent:

(< ω) for everyn < ω, player 1 has ann-winning strategy of typeτinGΣ(G2,M1)starting from some(u07→σ0n);

(ω) player 1 has anω-winning strategy inGτ

Σ(G2,G1) start-ing from some state dependstart-ing onu0andτ.

We start by considering ‘forward’ winning strategies that are sufficient for the DLs without inverse roles.

Forward strategy and gameGf_Σ(G2,G1).We say that aλ -strategy (λ ≤ ω) for player 1 in the gameGΣ(G2,M1)is

forwardif, for any play of length i−1 < λ, which con-forms with this strategy, and any challenge ui−1 Σ2 ui by player 2, the responseσi of player 1 is such that either σi−1, σi∈ind(K1)orσi =σi−1v, for somev∈∆G1.

For example, if theGi,i= 1,2, satisfy the condition

(f) theΣ-labels on i-edges contain no inverse roles, then every strategy in GΣ(G2,M1) is forward. This is clearly the case forHorn-ALCH,Horn-ALC,ELHandEL, which by definition do not have inverse roles.

The existence of a forwardλ-winning strategy for player 1 in GΣ(G2,M1) is equivalent to the existence of such a strategy in the game Gf_Σ(G2,G1), which is defined simi-larly to GΣ(G2,M1) but with two modifications: (1) it is played onG2 andG1; and (2) the responsexi ∈ ∆G1 of player 1 to a challengeui−1 Σ2 uimust be such that either xi−1, xi∈ind(K1)orxi−1 1xi, and(s1)–(s2)hold (with

G1andxiin place ofM1andσi).

Example 10 LetG2 andG1be as shown below. Then, for anyu ∈ ∆G2_{, there is}_x _∈ _∆G1 _{such that player 1 has an}

ω-winning strategy inGf_Σ(G2,G1)starting from(u7→x).

a

R R Q

R R

a

R

Q

0

1 1

2

GΣ 2

GΣ 1

The next theorem follows from K¨onig’s Lemma:

Lemma 11 Foru0 ∈∆G2, condition(< ω)holds for

for-ward strategies inGΣ(G2,M1)iff(ω)holds inG f

Σ(G2,G1)

for some state(u07→x0).

Gf_Σ(G2,G1)is a standard simulation or reachability game on finite graphs, where the existence ofω-winning strate-gies for player 1 follows from the existence of n-winning strategies forn=O(|G2| × |G1|), which can be checked in polynomial time (Mazala 2001; Baier and Katoen 2007). By Theorem 6 and(f), we obtain:

Theorem 12 For combined complexity, checking Σ-query entailment is inPforELandELHKBs, and inEXPTIME forHorn-ALCandHorn-ALCHKBs. For data complexity, it is inPfor all these DLs.

In comparison to forward strategies, the winning strate-gies used in Example 9 can be described as ‘backward.’

Backward strategy and game Gb

Σ(G2,G1). Aλ-strategy for player 1 in GΣ(G2,M1) isbackward if, for any play of length i−1 < λ, which conforms with this strategy, and any challengeui−1 Σ2 uiby player 2, the responseσi of player 1 is theimmediate predecessorofσi−1inM1in the sense thatσi−1 = σiw, for somew ∈ ∆G1 (player 1 loses in caseσi−1∈ind(K1)). Note that, sinceM1is tree-shaped, the response of player 1 to any different challenge ui−1 Σ2 u0imust be the sameσi; cf. Example 9.

That is why the states of the gameGb_Σ(G2,G1)are of the form si = (Ξi 7→ xi), where Ξi ⊆ ∆G2, Ξi 6= ∅, and xi∈∆G1_{satisfy the following condition:}

(s01) tGΣ2(u)⊆t

G1

Σ(xi), for allu∈Ξi.

The game starts in a states0= (Ξ07→x0)such that

(s00) ifu∈Ξ0∩ind(K2), thenx0=u∈ind(K1).

For eachi >0, player 2 always challenges player 1 with the setΞi= Ξi−1, where

Ξ ={v∈∆G2 _|_u Σ

2 v, for someu∈Ξ}, provided that it is not empty (otherwise, player 2 loses). Player 1 responds with xi ∈ ∆G1 _{such that} _xi

1 xi−1 and(s01)and the following condition hold:

(s02) r

G2

Σ(u, v)⊆¯r

G1

Σ(xi−1, xi), for allu∈Ξi−1,v∈Ξi.

Lemma 13 Foru0∈∆G2, condition(< ω)holds for

back-ward strategies inGΣ(G2,M1)iff(ω)holds inGbΣ(G2,G1)

for some state({u0} 7→x0).

Although Lemmas 11 and 13 look similar, the game Gb_Σ(G2,G1)turns out to be more complex thanGfΣ(G2,G1).

Example 14 To illustrate, considerGΣ

2 shown below (with concepts and roles omitted) and an arbitraryG1:

GΣ 2

a u

w1 v1

w2 v2

v3

A play in Gb_Σ(G2,G1) may proceed as: ({u} 7→x0), ({v1, w1} 7→x1),({v2, w2} 7→x2),({v3, w1} 7→ x3), etc. This gives at least 6 different sets Ξi. But ifG2 contained k cycles of lengths p1, . . . , pk, where pi is the ith prime number, then the number of states inGb_Σ(G2,G1)could be exponential (p1× · · · ×pk). In fact, we have the following:

Lemma 15 Checking(ω)in Lemma 13 isCONP-hard.

Observe that in the case of DL-Litecore and DL-Litehorn

(which have inverse roles but no RIs), generating structures

G = (∆G,·G_, ₎_{can be defined so that, for any} _u _∈ _∆G

(6)

Theorem 16 CheckingΣ-query entailment forDL-Litecore and DL-Litehorn KBs is in P for both combined and data complexity.

An arbitrary strategy for player 1 in GΣ(G2,M1) is a combination of a backward strategy and a number of start-bounded strategies to be defined next.

Start-bounded strategy and gameGs

Σ(G2,G1).A strategy for player 1 in the gameGΣ(G2,M1)starting from a state (u0 7→ σ0)isstart-boundedif it never leads to(ui 7→σi) such thatσ0 =σiv, for somevandi >0. In other words, player 1 cannot use those elements ofM1that are located closer to the ABox thanσ0; the ABox individuals inM1can only be used ifσ0∈ind(K1).

Example 17 The strategy starting from (u2 7→ σ1) and shown below is start-bounded:

u2

T W W₁− T₁−

σ1 T , T1 W, W1 0

4

1

3

2

GΣ 2

MΣ 1

In the gameGs_Σ(G2,G1), player 1 will have to guessallthe points ofG2that are mapped to the same point ofM1.

ThestatesofGs_Σ(G2,G1)are of the form(Γi,Ξi 7→xi), i ≥ 0, whereΓi,Ξi ⊆ ∆G2,Ξi 6= ∅,xi ∈ ∆G1 and(s01) holds. The initial state is of the form(∅,Ξ07→x0)such that

(s00)holds. In each roundi >0, player 2 challenges player 1 with someu Σ

2 vsuch thatu∈Ξi−1and

(nbk) ifv∈Γi−1thenrGΣ2(u, v)6⊆¯r

G1

Σ(xi−2, xi−1). Player 1 responds with either a state(Ξi−1,Ξi 7→xi)such thatxi−1 1xi(and soxi∈/ ind(K1)) and(s002)holds, or a state(∅,Ξi7→xi)such thatxi−1, xi∈ind(K1)and

(s00₂) rG2

Σ(u, v)⊆r

G1

Σ(xi−1, xi).

We make challenges u Σ₂ v, for which u ∈ Ξi−1 and

(nbk)does not hold, ‘illegitimate’ becausexi−2can always be used as a response. Because of this, player 1 always moves ‘forward’ inG1, but has to guess appropriate setsΞi in advance. Note thatΓiis always uniquely determined by xi−1,xiandΞi−1(and it is eitherΞi−1or empty).

Example 18 LetGΣ 2 andG

Σ

1 be as follows (cf. Example 17):

u2 _T u6 _W u7 W₁− u8 T₁− u9

x1 T , T1 x3 W, W1 x4

a 0

0

1

2

GΣ 2

GΣ 1

We show that player 1 has an ω-winning strategy in Gs

Σ(G2,G1) starting from (∅,{u2, u9} 7→ x1). Player 2 challenges with u2 Σ2 u6, and player 1 responds with ({u2, u9},{u6, u8} 7→x3). Then player 2 picksu6 Σ2 u7 and player 1 responds with({u6, u8},{u7} 7→ x4), where the game ends. Note the crucial guesses{u2, u9} 7→x1and

{u6, u8} 7→x3made by player 1. If player 1 responded with ({u2, u9},{u6} 7→ x3) (and failed to guess that u8 must also be mapped tox3), then after the challengeu6 Σ2 u7 and response({u6, u8},{u7} 7→x4)), player 2 would pick u7 Σ2 u8, to which player 1 could not respond.

Lemma 19 For anyu0 ∈ ∆G2, condition(< ω)holds for

start-bounded strategies in GΣ(G2,M1) iff (ω) holds in Gs

Σ(G2,G1)for some state(∅,Ξ07→x0)withu0∈Ξ0. As we shall see in the next section, the problem of check-ing the conditions of this lemma is EXPTIME-hard.

Arbitrary strategies and gameGa

Σ(G2,G1). An arbitrary winning strategy in the gameGΣ(G2,M1)can be composed of one backward and a number of start-bounded strategies.

Example 20 ConsiderGΣ

2 andMΣ1 shown below:

u1 u2

R−

u3

u6

S−

T u7

W

u8

W₁−

u9

T₁−

u10

S−₁

u4

U U− u5

σ2

σ1

R

σ3

T,T1

σ4

W, W1

a

S, S1

b

U

0

1 ₁ ₂ ₂

GΣ 2

MΣ 1

Starting from(u1 7→σ2), player 1 can respond to the chal-lenges u1 Σ2 u2 Σ2 u3 according to the backward strategy; the challengesu2 Σ2 u6 Σ2 u7 Σ2 u8 Σ2 u9 according to the start-bounded strategy as in Example 17; the challenges u3 Σ2 u4 Σ2 u5 also according to the obvious start-bounded strategy; finally, the challenge u9 Σ2 u10 needs a response according to the backward strategy. We will combine the two backward strategies into a single one, but keep the start-bounded ones separate.

The game Ga

Σ(G2,G1) begins as GbΣ(G2,G1), but with states of the form(Ξi 7→xi,Ψi),i ≥0, whereΞi ⊆∆G2 and xi ∈ ∆G1 _satisfy _(s0

1) and Ψi is a (possibly empty) subset of Ξ_i , which indicates initial challenges in start-bounded games. The initial state satisfies (s00). In each round i > 0, if xi−1 ∈ ind(K1) then player 2 launches the start-bounded game Gs

Σ(G2,G1) with the initial state (∅,Ξi−1 7→xi−1). Otherwise, ifxi−1 ∈/ ind(K1), player 2 has two options. First, he can challenge player 1 with the set Ψi−1(that is, similar to the backward game but with a possi-bly smallerΨi−1in place ofΞi−1); player 1 responds to this challenge with a state(Ξi 7→xi,Ψi)such thatΨi−1 ⊆Ξi, xi 1 xi−1and(s02)holds. Second, player 2 can launch the start-bounded game Gs_Σ(G2,G1) with the initial state (∅,Ξi−17→xi−1), where the first challenge of player 2 must

be picked fromΦi−1= Ξi−1\Ψi−1.

Example 21 We illustrate the ω-winning strategy for player 1 inGa

Σ(G2,G1)starting from({u1} 7→ x2,{u2}), whereGΣ

2 is from Example 20 andG1Σlooks likeMΣ1 from Example 20 (but withxiin place ofσi):

{u1} 7→x2,{u2}

{u2, u9} 7→x1,{u3,u10}

{u3, u10} 7→a,∅ ∅,{u3, u10} 7→a ∅,{u4} 7→b

u3 u4 ∅,{u5} 7→a

u4 u5

∅,{u2, u9} 7→x1 {u2,u9},{u6, u8} 7→x3

u2 u6

{u6,u8},{u7} 7→x4

(7)

Lemma 22 For any u0 ∈ ∆G2, condition (< ω) holds

for arbitrary strategies in GΣ(G2,M1) iff (ω) holds in Ga

Σ(G2,G1)for some state(Ξ07→x0,Ψ0)withu0∈Ξ0. Condition (ω) in the lemma above is checked in time O(|ind(K2)| ×2|∆G2\ind(K2)|_{× |}_∆G1_|_{), which can be readily}

seen by analysing the full game graph forGa

Σ(G2,G1) (sim-ilar to that in Example 21). By Theorem 6, we then obtain:

Theorem 23 For combined complexity,Σ-query entailment is in2EXPTIME forHorn-ALCHI andHorn-ALCI KBs, and inEXPTIMEforDL-LiteH_hornandDL-LiteH_coreKBs. For data complexity, these problems are all inP.

Lower Bounds

We have shown that, for all of our DLs,Σ-query entailment and inseparability are in P for data complexity. The next theorem establishes a matching lower bound:

Theorem 24 For data complexity,Σ-query entailment and inseparability areP-hard forDL-LitecoreandELKBs.

Proof. The proof is by reduction of the P-complete entail-ment problem foracyclicHorn ternary clauses: given a con-junction ϕof clauses of the form ai andai ∧ai0 → aj,

i, i0 < j, decide whetheran is true in every model ofϕ. Consider theELTBoxT ={V v ∃P.(∃R1.V u ∃R2.V)} and an ABoxAcomprised ofF(an)and

P(ai, ai), R1(ai, ai), R2(ai, ai), for each clauseaiinϕ, P(aj, c), R1(c, ai), R2(c, ai0), forc=ai∧ai0 →ajinϕ.

Set Σ = {F, P, R1, R2}, K2 = (T,A ∪ {V(an)}) and

K1 = (∅,A). Obviously, K2 Σ-query entails K1. On the other hand, the materialisation of K2 is (finitely) Σ-homomorphically embeddable in the materialisation ofK1 iff ϕ derives an (see the full version for details). For

DL-Litecore, we takeT to containV v ∃P,∃P− v ∃Ri and∃R−_i vV, fori= 1,2. _q For combined complexity, EXPTIME-hardness of Σ-query inseparability forHorn-ALCcan be proved by reduc-tion of the subsumpreduc-tion problem: we haveT |=AvBiff (T,{A(a)})and(T ∪ {A v B},{A(a)})are{B}-query inseparable. We now establish matching lower bounds in the technically challenging cases.

Theorem 25 For combined complexity,Σ-query entailment and inseparability are(i)2EXPTIME-hard forHorn-ALCI

KBs and(ii) EXPTIME-hard forDL-LiteH_coreKBs.

Proof. The proof of (i) is by encoding alternating Turing machines (ATMs) with exponential tape and using the fact that AEXPSPACE= 2EXPTIME; see, e.g. (Kozen 2006).

LetM = (Γ, Q, q0, q1, δ)be an ATM with a tape alpha-betΓ, a set of statesQpartitioned into existentialQ∃and

universalQ∀states, an initial stateq0 ∈ Q∃, an accepting

stateq1∈Q, and a transition function

δ: (Q\ {q1})×Γ× {1,2} →Q×Γ× {−1,0,+1}, which, for a state qand symbol a, gives two instructions, δ(q, a,1)andδ(q, a,2). We assume that existential and uni-versal states strictly alternate: any transition from an exis-tential state results in a universal state, and vice versa. We

extendδ with the instructionsδ(q1, a, k) = (q1, a,0), for a ∈ Γ andk = 1,2, which go into an infinite loop ifM reaches the accepting stateq1. Thus, assuming thatM ter-minates on every input, it acceptsw~ iff the modified ATM M0has a run onw~, all branches of which are infinite.

Our aim is to construct, givenM andw~, TBoxesT1and

T2 and a signatureΣsuch thatM0 has a run with only in-finite branches iff the materialisation M2 of (T2,{A(c)}) is finitelyΣ-homomorphically embeddable into the materi-alisationM1 of(T1,{A(c)}). Letf be a polynomial such that, on any input of lengthm,M uses at most2n₋₂_tape cells, withn=f(m), which are numbered from1to2n₋_2, and the head stays to the right of cell 0, which contains the marker[∈Γ. The construction proceeds in five steps.

Step 0. We use tuples of2nconcepts to represent distances of up to2n_{between the cells on the tape in consecutive} con-figurations. We refer to a tupleYn−1, Yn−1, . . . , Y0, Y0 of concept names asY and assume that the TBox contains the following CIs to encode ann-bitR-counter onY:

YkuYk−1u · · · uY0v ∀R.(YkuYk−1u · · · uY0), n > k≥0, YiuYk v ∀R.Yi and YiuYk v ∀R.Yi, n > i > k.

We use the expressionifY₌₂n₋₁on the left-hand side of CIs to say that the Y-value is 2n −1 (which is a shortcut for Yn−1u · · · uY0); we also useifY<2n₋₁on the left-hand side of CIs for the complementary statement (which is a shortcut fornCIs withifY_<₂n₋₁replaced by each ofYn−1, . . . , Y0). Finally, we use setY

0 on the right-hand side of CIs for the reset command (which is equivalent toYn−1u · · · uY0). Note that the counter stops at2n₋_{1: the}_R_{-successors of a} domain element inifY₌₂n₋₁do not have to encode any value.

Step 1. First we encode configurations and transitions ofM0

usingT1. We represent a configuration by ablock, which is a sequence of2n_{+ 1}_{domain elements connected by a role}_P_. The first element distinguishes the blocks for the two alter-native transitions; using aP-counter on a tupleT, we assign indices from 0 to2n−1to all other elements in each block. The element with index 0 is needed for padding. Each of the remaining2n−1elements belongs to a conceptCa, for somea∈ Γ: if the element with indexi+ 1is inCa, then the celliis assumed to containain the configuration repre-sented by the block (in particular, the element with index 1 contains[for cell 0) as shown below:

M1

A

setT0

C[ Ca1 setH0

Ca2 Cam

I

C I

ifT

=2n−1

The first block represents the initial configuration: the input ~

w=a1. . . amis followed by2n−m−2blank symbols and the head is positioned over cell 1, which is indicated by the 0 value of theP-counter on a tupleH. This is achieved by the following CIs in the TBoxT1:

Av ∃P.(setT₀ u ∃P.(C[u ∃P.(Ca1uset

H 0 u

(8)

Step 2. The contents of the tape and the head position in each configuration is encoded in a block of length2n_{+ 1;} the current stateq∈ Qis recorded in the conceptZ0

qathat contains the last element of the block (a ∈ Γspecifies the contents of the active cell scanned by the head). At the end of the block, when theT-value reaches2n−1, we branch out one block for each of the two transitions, reset theP-counter onT, and propagate viaZqa1 andZqa2 the current state and symbol in the active cell: forq ∈Qanda ∈Γ, we add to

T1the CI

ifT₌₂n₋₁uZ_qa0 v

l

k=1,2

∃P.(Xku ∃P.(set₀T uZ_qak )), (T1-4) whereX1andX2are two fresh concept names.

The acceptance condition for M0 is enforced by means of T2, which uses aP-counter on a tuple T0 for a block representing the initial configuration (aT0_-block):

Av ∃P.setT₀0, (T2-1) ifT_<0₂n₋₁v ∃P. (T2-2) Two P-counters, onT1 and T2, are used for blocks rep-resenting configurations with universal sates (T1_{- and}_T2 -blocks respectively) and oneP-counter, on a tupleT3, suf-fices for blocks representing configurations with existential states (T3-blocks). These blocks are arranged into an infi-nite tree-like structure: theT0_{-block is the root, from which} aT1_{- and a}_T2_{-blocks branch out (successors of the initial} stateq0 are universal). Each of them is followed by aT3 -block, which branches out a T1_{- and a}_T2_{-blocks, and so} on. This is achieved by adding toT2the following CIs:

ifT=2kn₋1v

l

j=1,2

∃P.(Xju ∃P.setT0j), fork= 0,3, (T2-3)

ifT_<k₂n₋₁v ∃P.G, fork= 1,2,3, (T2-4) ifT=2kn₋₁v ∃P.∃P.setT

3

0 , fork= 1,2, (T2-5) whereGis a concept name. IfΣ = {A, X1, X2, P}then there is a unique Σ-homomorphism from the T0_{-block in}

M2 to the block of the initial configuration inM1. Next, conceptsX1andX2ensure that theT1- andT2-blocks are Σ-homomorphically mapped (in a unique way) into the re-spective blocks inM1, which reflects the acceptance con-dition of universal states. The following T3_-block, how-ever, contains neither X1 nor X2 and can be mapped to either of the blocks in M1, which reflects the choice in existential states; see the picture below, where possible Σ-homomorphisms are shown by thick dashed arrows:

M2

M1 A

0 1

2

3

A

X1

X2

Step 3. Recall that theP-counter onH measures the dis-tance from the head: if the active cell in the current configu-ration isk, then itsH-value is 0 and theH-value of the cell k−2in asuccessorconfiguration is2n₋_{1. So, until the} H-counter reaches2n₋_{1, the following CIs in}_T

1propagate the state and symbol in the active cell along the blocks: for q∈Q,a∈Γandk= 0,1,2,

ifT_<₂n₋₁uif H

<2n₋₁uZ_qak v

l

b∈Γ

∃P.(CbuZ_qak ) (T1-5)

(for eachb∈Γ, these CIs generate a branch inM1to rep-resent the same cell but with a different symbol, b, tenta-tively assigned to the cell—Step 4 will ensure that the cor-rect branch and symbol are selected to match the cell con-tents in the preceding configuration). When the distance from the last head position is 2n, the contents of the cell and the current state are changed according toδ:

ifT_<₂n₋₁uifH₌₂n₋₁uZ_qak v

l

b∈Γ

∃P.(Cbu∆k_qa,b), (T1-6)

whereδ(q, a, k) = (q0, a0, σ)and∆k

qa,bis the concept

setH₀ uZ_q00_bu ∃P.(Ca0uGa0), ifσ=−1,

∃P.(Ca0uGa0usetH₀ uZ_q00_a0), ifσ= 0,

∃P.(Ca0uGa0u l

b0_∈_Γ

∃P.(Cb0usetH₀ uZ_q00_b0)), ifσ= +1

(the symbol in the active cell is changed according to the in-struction, and the current state and symbol in the next active cell are then recorded inZqa0 ). Since the head never visits cell 0, this happens over cells0to2n₋_{1, that is, at least one} element after theP-counter onT is reset to 0. These three situations are shown below, where grey and hatched nodes denote domain elements withH-values2n₋₁_and_0, respec-tively, and the domain elements in the dashed oval represent the active cell of the preceding configuration:

C

a0 , G

a0

(a)

Zqak

Cb, Zq00_b

C_b0, Z_q00_b0

Zq00_b

Z_q00_b0

Cb, Zq00b

Cb0, Z_q00_b

(b)

Zqak

Cb

Cb0 Zq00a0

Zq00a0 _C

b, Zq00a0 Cb0, Z_q00_a0

(c)

Zqak

Cb

Cb0

Cb, Zq00b

Cb0, Z_q00_b0

(Note that there is only one branch for the modified cell, which corresponds to the new symbol, a0, in that cell; see explanations below.) Then, the current state and the symbol in the active cell are propagated along the tape using (T1-5).

(9)

finiteΣ-homomorphism,h, fromM2 toM1. To exclude wrong choices, we take

Σ = {A, P, X1, X2} ∪ {Da |a∈Γ}. Recall that ifd1∈CaM1, for somea∈Γ, then it represents a cell containinga. The following CIs inT1ensure that, for eachb ∈ Γdifferent froma, there is a block of(2n ₊ 1)-manyP−-connected elements that ends in the conceptDb (called aDb-blockin the sequel):

CavDaul

b∈Γ\{a}Gb, (T1-7) Gbv ∃P−.(SbusetB₀), (T-1) ifB<2n₋₁uSbv ∃P−.Sb, (T-2) ifB₌₂n₋₁uSbv ∃P−.Db, (T-3) where we use aP−-counter on a tupleB(unlikeP-counters in all other cases) and a conceptSbto propagatebalong the whole block. Supposeh(d2) = d1andd2 belongs toGin

M2(it represents a cell in a non-initial configuration). Then the following CI and (T-1)–(T-3), added toT2, generate a Db-block, foreachb∈Γ(includinga):

Gvl

b∈ΓGb. (T2-6) Each of theDb-blocks inM2, forb ∈ Γwithb 6= a, can be mapped byhto the respectiveDb-block inM1. By the choice ofΣ, the only remainingDa-block, in caseais tenta-tively contained in this cell, could be mapped (in the reverse order) along the branch inM1but onlyif the cell containsa in the preceding configuration (that is, the element which is 2n_{+ 1}_{steps closer to the root of}_M

1belongs toDa):

M2

M1

cellk

cellk ₂n_{+ 2}

G setB0 ifB

=2n−1

Da

Db Db0

setB

0

ifB=2n−1

Ca Da

Db Db0

setB0 _ifB

=2n₋₁

Note (see∆k

qa,b) that the cell whose content is changed gen-erates the additionalDa-block inM1to allow the respective Da-block fromM2to be mapped there.

One can show that M0 has a run with only infinite branches iff(T1,{A(c)}) Σ-query entails (T2,{A(c)}). It follows, by Theorem 2, that decidingΣ-query inseparability is 2EXPTIME-hard.

(ii) A proof of EXPTIME-hardness ofΣ-query insepara-bility forDL-LiteHcoreKBs is given in the full paper. It uses

the same idea of encoding computations of ATMs. One es-sential difference is that the expressive power ofDL-LiteHcore

is not enough to representn-bit counters in Step 0, and so we can only encode computations on polynomial tape. _q As a consequence of Theorems 3, 23 and 25 we obtain:

Theorem 26 For combined complexity, the membership problem for universal UCQ-solutions is 2EXPTIME -complete for Horn-ALCHI and Horn-ALCI; EXPTIME -complete for Horn-ALCH, Horn-ALC, DL-LiteH_horn and

DL-LiteHcore; and P-complete for ELand ELH. For data complexity, all these problems areP-complete.

In the case of DL-LiteH_core, we also obtain an EXPTIME

algorithm for checking the existence and computing univer-sal UCQ-solutions. Indeed, given a KBK1, a target signa-tureΣ2and a mappingT12, we first compute theΣ2-ABox overind(K1)that is implied byK1andT12, and then check whether at least one KBK2inΣ2with this ABox is a uni-versal UCQ-solution (there are≤O(2|Σ2|₎_{such KBs). This}

gives an EXPTIMEupper bound for the non-emptiness prob-lem for universal UCQ-solutions in DL-LiteH_core (Arenas et al. 2013). Similarly, we can check in EXPTIMEwhether the result of forgetting a signature in aDL-LiteH_coreKB exists.

Σ-query inseparability ofDL-LiteH_coreTBoxes was known to sit between PSPACE and EXPTIME(Konev et al. 2011). Using the fact that witness ABoxes for DL-LiteHcore TBox

separability can always be chosen among the singleton ABoxes (Konev et al. 2011, Theorem 8), we can modify the proof of Theorem 25 to improve the PSPACElower bound:

Theorem 27 Σ-query inseparability ofDL-LiteH_coreTBoxes isEXPTIME-complete.

For more expressive DLs, TBox Σ-query inseparability is often harder than KB inseparability: forDL-Litehorn, the

space of relevant witness ABoxes for TBox separability is of exponential size and, in fact, TBox inseparability is NP-hard, while KB inseparability is in P. Similarly,Σ-query in-separability ofELKBs is tractable, whileΣ-query insepa-rability of TBoxes is EXPTIME-complete (Lutz and Wolter 2010). The complexity of TBox inseparability for Horn-DLs extendingHorn-ALCis not known.

Future Work

From a theoretical point of view, it would be of interest to investigate the complexity ofΣ-query inseparability for KBs in more expressive Horn DLs (e.g.,Horn-SHIQ) and non-Horn DLs extending ALC. We conjecture that the game technique developed in this paper can be extended to those DLs as well. Our games can also be used to defineefficient approximationsofΣ-query entailment and inseparability for KBs. The existence of a forward strategy, for example, pro-vides a sufficient condition forΣ-query entailment for all of our DLs. Thus, one can extract aΣ-query module of a given KB K by exhaustively removing from K those inclusions and assertionsαfor which player 1 has a winning strategy in the gameGf_Σ(G1,G2), whereG1is a generating structure forK \ {α}andG2forK. The resulting modules are min-imal for our DLs without inverse roles, and we conjecture that in practice they are often minimal for DLs with inverse roles as well; see (Konev et al. 2011) for experiments testing similar ideas for module extraction from TBoxes.

(10)

References

Arenas, M.; Botoeva, E.; Calvanese, D.; Ryzhikov, V.; and Sherkhonov, E. 2012. Exchanging description logic knowl-edge bases. In Proc. of the 13th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR 2012). AAAI Press.

Arenas, M.; Botoeva, E.; Calvanese, D.; and Ryzhikov, V. 2013. Exchanging OWL 2 QL knowledge bases. InProc. of the 23rd Int. Joint Conf. on Artificial Intelligence (IJCAI 2013).

Baader, F.; Calvanese, D.; McGuinness, D.; Nardi, D.; and Patel-Schneider, P. F., eds. 2003. The Description Logic Handbook: Theory, Implementation and Applications. Cam-bridge University Press. (2nd edition, 2007).

Baader, F.; Brandt, S.; and Lutz, C. 2005. Pushing the EL envelope. InProc. of the 19th Int. Joint Conf. on Artificial Intelligence (IJCAI-05), 364–369.

Baier, C., and Katoen, J.-P. 2007. Principles of Model Checking. MIT Press.

Calvanese, D.; De Giacomo, G.; Lembo, D.; Lenzerini, M.; and Rosati, R. 2007. Tractable reasoning and efficient query answering in description logics: TheDL-Litefamily. Jour-nal of Automated Reasoning39(3):385–429.

Cuenca Grau, B.; Horrocks, I.; Kazakov, Y.; and Sattler, U. 2008. Modular reuse of ontologies: theory and practice.

Journal of Artificial Intelligence Research (JAIR) 31:273– 318.

Del Vescovo, C.; Parsia, B.; Sattler, U.; and Schneider, T. 2011. The modular structure of an ontology: Atomic decom-position. InProc. of the 22nd Int. Joint Conf. on Artificial Intelligence (IJCAI 2011), 2232–2237. AAAI Press. Eiter, T.; Gottlob, G.; Ortiz, M.; and Simkus, M. 2008. Query answering in the description logic Horn-SHIQ. In

Proc. of the 11th European Conf. on Logics in Artificial In-telligence (JELIA 2008), volume 5293 ofLecture Notes in Computer Science, 166–179. Springer.

Hustadt, U.; Motik, B.; and Sattler, U. 2005. Data com-plexity of reasoning in very expressive description logics. InProc. of the 19th Int. Joint Conf. on Artificial Intelligence (IJCAI-05), 466–471.

Jim´enez-Ruiz, E.; Cuenca Grau, B.; Horrocks, I.; and Berlanga Llavori, R. 2011. Supporting concurrent ontol-ogy development: Framework, algorithms and tool. Data Knowl. Eng.70(1):146–164.

Kazakov, Y. 2009. Consequence-driven reasoning for Horn SHIQ ontologies. InProc. of the 21st Int. Joint Conf. on Artificial Intelligence (IJCAI 2009), 2040–2045.

Konev, B.; Kontchakov, R.; Ludwig, M.; Schneider, T.; Wolter, F.; and Zakharyaschev, M. 2011. Conjunctive query inseparability of OWL 2 QL TBoxes. InProc. of the 25th AAAI Conf. on Artificial Intelligence (AAAI 2011). AAAI Press.

Konev, B.; Ludwig, M.; Walther, D.; and Wolter, F. 2012. The logical difference for the lightweight description logic EL. Journal of Artificial Intelligence Research (JAIR)

44:633–708.

Konev, B.; Walther, D.; and Wolter, F. 2009. Forgetting and uniform interpolation in large-scale description logic termi-nologies. InProc. of the 21st Int. Joint Conf. on Artificial Intelligence (IJCAI 2009), 830–835. AAAI Press.

Kontchakov, R.; Lutz, C.; Toman, D.; Wolter, F.; and Za-kharyaschev, M. 2010. The combined approach to query answering in DL-Lite. In Proc. of the 12th Int. Conf. on Principles of Knowledge Representation and Reasoning (KR 2010). AAAI Press.

Kontchakov, R.; Wolter, F.; and Zakharyaschev, M. 2010. Logic-based ontology comparison and module extraction, with an application to DL-Lite. Artificial Intelligence

174:1093–1141.

Kozen, D. 2006. Theory of Computation. Springer. Kr¨otzsch, M.; Rudolph, S.; and Hitzler, P. 2013. Complex-ities of horn description logics. ACM Trans. Comput. Log.

14(1):2.

Lin, F., and Reiter, R. 1994. Forget it! InIn Proc. of the AAAI Fall Symposium on Relevance, 154–159.

Lutz, C., and Wolter, F. 2010. Deciding inseparability and conservative extensions in the description logic EL.Journal of Symbolic Computation45(2):194–228.

Lutz, C.; Seylan, I.; and Wolter, F. 2012. An automata-theoretic approach to uniform interpolation and approxi-mation in the description logic EL. In Proc. of the 13th Int. Conf. on Principles of Knowledge Representation (KR 2012). AAAI Press.

Lutz, C.; Toman, D.; and Wolter, F. 2009. Conjunctive query answering in the description logic EL using a rela-tional database system. InProc. of the 21st Int. Joint Conf. on Artificial Intelligence (IJCAI-09), 2070–2075.

Mazala, R. 2001. Infinite games. InAutomata, Logics, and Infinite Games, 23–42.

Nikitina, N., and Glimm, B. 2012. Hitting the sweetspot: Economic rewriting of knowledge bases. InProc. of the 11th Int. Semantic Web Conf. (ISWC 2012), Part I, volume 7649 ofLecture Notes in Computer Science, 394–409. Springer. Nikitina, N., and Rudolph, S. 2012. ExpExpExplosion: Uni-form interpolation in general EL terminologies. In Proc. of the 20th European Conf. on Artificial Intelligence (ECAI 2012), 618–623. IOS Press.

Rosati, R. 2007. On conjunctive query answering in EL. In

Proc. of the 2007 Int. Workshop on Description Logics (DL 2007).

Stuckenschmidt, H.; Parent, C.; and Spaccapietra, S., eds. 2009. Modular Ontologies: Concepts, Theories and Tech-niques for Knowledge Modularization, volume 5445 of Lec-ture Notes in Computer Science. Springer.

Wang, Z.; Wang, K.; Topor, R. W.; and Pan, J. Z. 2010. For-getting for knowledge bases in DL-Lite. Ann. Math. Artif. Intell.58(1-2):117–151.

(11)

Appendix

The proofs of the theorems and lemmas from the paper are presented in the following order. We begin by proving Theorem 6, where we show how to construct finite generat-ing structures for each of the languages and check that these structures give rise to materialisations. Theorem 6 is then used to prove Theorem 5, which characterisesΣ-query en-tailment through finite Σ-homomorphisms. After that we prove Theorems 2 and 3 that establish a connection between query entailment, query inseparability and universal UCQ-solutions.

Next, we prove the results for our games: first, Theorem 8 relating finiteΣ-homomorphisms with infinite games, and second, Lemma 22 saying that arbitrary games cover and admit arbitrary strategies. Proofs of Lemmas 11, 13 and 19 are obtained as corollaries of the proof of Lemma 22. Fi-nally, we prove Lemma 19, establishingCONP-hardness of backward strategies.

We conclude with the proofs of lower bounds. First, we prove Theorem 24. Then, in the proof of Theo-rem 25 we show EXPTIME-hardness ofΣ-query entailment inDL-LiteH_core.

Proof of Theorem 6: construction of

generating structures

Theorem 6 Horn-ALCHIand all of its fragments defined above have finitely generated materialisations. Moreover,

– for any L ∈ {ALCHI,ALCI,ALCH,ALC} and any

Horn-LKB (T,A), a generating structure can be con-structed in time|A| ·2p(|T |)_,_p_{a polynomial}_;

– for anyLin theELandDL-Litefamilies and anyLKB

(T,A), a generating structure can be constructed in time

|A| ·p(|T |),pa polynomial.

We construct the generating structures first for

Horn-ALCHI, then forELH, and finally for DL-LiteH_horn. The construction for Horn-ALCI, Horn-ALCH, and

Horn-ALCis the same as forHorn-ALCHI, the construc-tion forELis the same as for ELH, and the construction forDL-Litecore,DL-LiteHcore, andDL-Litehornis the same as

forDL-LiteH_hornand therefore omitted.

Horn-

ALCHI

To construct the generating structure for Horn-ALCHI

TBoxes we first transform the TBox into normal form (Kr¨otzsch, Rudolph, and Hitzler 2007). AHorn-ALCHI

TBox is innormal formif its concept inclusions are of the following form

AvB, A1uA2vB, Av ⊥, > vB, Av ∃R.B, Av ∀R.B,

∃R.AvB, R1vR2,

whereA, A1, A2, B are concept names andR, R1, R2 are roles. The following result is well-known (Kr¨otzsch, Rudolph, and Hitzler 2007; Eiter et al. 2008):

Theorem A.28 For everyHorn-ALCHI TBoxT one can construct in polynomial time aHorn-ALCHI TBoxT0 _in

normal form such thatT andT0 _are _Σ_{-query inseparable}

for the signatureΣofT. Moreover,

• ifT does not contain any role inclusions thenT0_{does not}

contain any role inclusions;

• ifT does not contain any inverse roles thenT0 _{does not}

contain any inverse roles.

Now assume a consistent KB K = (T,A) with a

Horn-ALCHITBoxT in normal form is given. Bysub(T) we denote the set of subconcepts of concepts inT. TheT -type ofdinIis defined by taking

hI(d) ={C∈sub(T)|d∈CI}.

We say thattis aT-typeif the there exists a modelIofT

such thatt=hI(d), for somed∈∆I. Denote bytype(T) the set of allT-types. It is well known thattype(T)can be computed in exponential time in|T |. We now construct the finitely generating structureG = (∆G,·G, )forK, where ∆G=ind(K)∪Ω.

For any roleRinT, we set

[R] ={S| T |=RvS,T |=SvR}.

We write[R]≤T [S]ifT |=R vS; thus,≤T is a partial

order on the set{[R]|Ra role inT }.Ωwill be a subset of the set of pairs([R], t)witht∈type(T). First define as follows:

• a ([R], t) if a ∈ ind(K) and t is a maximal (with respect to set-inclusion) T-type such that K |=

∃R.(d

C∈tC)(a)andK 6|= R(a, b)for anyb ∈ ind(K) witht⊆ {C∈sub(T)| K |=C(b)};

• ([R1], t1) ([R2], t2)if t2 is a maximalT-type such thatT |= (d

C∈t1C)v ∃R2.( d

C∈t2C).

Ωis defined as the set of all pairs([R], t)such that there are a∈ind(K),R1, . . . , Rn=Randt1, . . . , tn=tsuch that

a ([R1], t1) · · · ([Rn], tn). We define the interpretation function·G_{by setting}

AG = {a∈ind(K)| K |=A(a)} ∪ {([R], t)∈Ω|A∈t}, PG = {(a, b)|there isR(a, b)∈ AwithT |=RvP},

and for every edgeu vwithv= (R, t), we set

(u, v)G ={S|[R]≤T [S]}.

It can now be proved thatGis a generating structure forK. LetMbe the interpretation defined by unravellingG.

Proposition A.29 Mis a model ofK.

Proof. Clearly,Mis a model ofA. We showM |=T by verifyingM |=αfor eachα∈ T.

α=AvB. Here we consider A to be either a concept nameA, or>. Letx∈ AM_{. If}_x ₌ _a_∈ _ind₍_K_{), then}

K |=A(a)by construction ofAM. SinceAvB ∈ T, it followsK |=B(a), hencea∈BM. Ifx=x0·([R], t), thenA ∈tand by construction ofG,tis a maximalT -type such thatT |= (d

C∈t0C)v ∃R.( d

(12)

α=A1uA2vB. The argument is analogous for x ∈ AM₁ ∩AM₂ .

α=Av ∃R.B. Let x ∈ AM. If x = a ∈ ind(K), then K |= A(a) by construction ofAM. Since A v ∃R.B ∈ T, it follows K |= ∃R.B(a). Assume K |=

{R(a, b), B(b)}for someb ∈ind(K), then(a, b)∈RM andb ∈ BM, soM |= α. Otherwise, take a maximal

T-typet withB ∈ t such thatK |= ∃R.(d

C∈tC)(a) andK 6|= R(a, b)for any b ∈ ind(K)witht ⊆ {C ∈

sub(T) | K |= C(b)}. Then it holds thata ([R], t) andB ∈ t. From the construction ofRM, we have that (a, a·([R], t))∈RM, and sinceB ∈t,([R], t)∈BM. For the casetail(x) = ([R], t), the proof is similar. α=Av ∀R.B. Letx∈ AM_{, and assume}₍_{x, y}₎ _∈ _RM_.

We consider various cases ofxandy:

• x =a,y = bfora, b ∈ ind(K). ThenK |= R(a, b) andK |=A(a), consequentlyK |=B(b), sob ∈BM by construction ofM.

• x = a ∈ ind(K), y = a·([S], t) such that T |= S v R. Thent is a maximalT-type such thatK |=

∃S.(d

C∈tC)(a). Moreover, from A v ∀R.B ∈ T it follows that B ∈ t. So, by definition of BM, ([S], t)∈BM.

• tail(x)∈/ind(K),y=x·([S], t)such thatT |=SvR. The proof is as for the case above.

• y = b ∈ ind(K) and x = y · ([S], t) such that

T |= S− v R. Then A ∈ tand soK |= ∃S.A(b), consequentlyK |= ∃R−.A(b), and asA v ∀R.B is equivalent to∃R−.AvB, finally,K |=B(b). Hence, b∈BM.

• tail(y) ∈/ ind(K)andx = y·([S], t)such thatT |= S− vR. Lettail(y) = ([Q], t0), thentis a maximal

T-type such thatT |= (d_C_∈_t0C) v ∃S.( d

C∈tC), moreoverA ∈ t. If follows T |= ∃S.(d_C_∈_tC) v ∃R−.A, and as before,T |=∃R−.A vB. Therefore,

T |= (d_C_∈_t0C)vB. t0is a maximalT-type and so

we can conclude thatB ∈t0. Hence,y∈BM.

α=∃R.AvB. Observe thatαis equivalent to the axiom Av ∀R−.B, whose satisfaction was shown above. That proof can be adjusted to the case ofα.

α=Av ⊥. AssumeAM 6=∅andx∈AM. Ifx=a ∈

ind(K), thenK |=A(a), and, sinceα∈ T, we get a con-tradiction withKbeing consistent. Iftail(x) = ([R], t) for some role R andT-type t, then it follows A ∈ t, which contradicts the definition of aT-type. Therefore, AM=∅.

α=R1vR2. Assume(x, y)∈RM1 . Ifx=aandy =b fora, b ∈ind(K), it followsK |=R1(a, b). Fromαwe obtain thatK |= R2(a, b), therefore (a, b) ∈ RM₂ . If y=x·([R], t)for someRandt, by construction ofRM1 ,

T |= R v R1. Then because ofα,T |= R v R2, so finally,(x, y)∈RM

2 . Ifx=y·([R], t), thenT |=R−v R1andT |=R−vR2, so again(x, y)∈RM2 .

This finishes the proof. q

The proof above also shows thatT-types of a node inM

coincide with the T-types used in the construction of the node.

Lemma A.30 LetMbe the unravelling ofG. Then 1. for alla ∈ ind(K), hM(a) = {C ∈ sub(T) | K |=

C(a)};

2. For allσ·([R], t)∈∆M,hM(σ·([R], t)) =t.

Proposition A.31 IfI is a model ofK, then there exists a homomorphism fromMtoI.

Proof. We define a functionh: ∆M _→_∆I_{for each}_σ _∈

∆M by induction on the length ofσ, and simultaneously show it is a homomorphism, i.e.,

(1) h(aM) =aIfora∈ind(K), (2) hM(σ)⊆hI(h(σ))forσ∈∆M,

(3) rM(σ, σ0)⊆rI(h(σ), h(σ0))forσ, σ0∈∆M.

First, for eacha ∈ ind(K), we seth(aM) = aI. This en-sures (1). Conditions 2 and 3 follow forσ, σ0_∈_ind₍_K₎_from

Lemma A.30, the condition thatI is a model ofK, and the construction ofM.

Letσ·([S], t)∈∆Msuch thath(σ)is defined. By con-struction ofM, it followsK |=∃S.(d

C∈tC)(a)ifσ=a, orT |= (d

C∈t0C)v ∃S.( d

C∈tC)iftail(σ) = ([Q], t 0_).

By the condition thatI is a model ofK, by Lemma A.30, and by the induction hypothesishM(σ)⊆hI(h(σ)), it fol-lows that there existsz ∈ ∆I such thatS ∈ rI(h(σ), z) andt ⊆ hI(z). We seth(σ·([S], t)) = z and show that (2) and (3) hold. (2) follows immediately from the fact thathM(σ·([S], t)) =t(by Lemma A.30). For (3), from R∈rM(σ, σ·([S], t))it followsT |=SvR, and sinceI

is a model ofT, we getR∈rI(h(σ), z). q

ELH

We now construct generating structures forELH. Again we first transform the TBox into normal form (Baader, Brandt, and Lutz 2005). An ELH TBox is in normal form if its concept inclusions are of the following form:

AvB, A1uA2vB,

> vB, Av ∃P.B,

∃P.AvB, P1vP2,

where A, A1, A2, B are concept names and P, P1, P2 are role names. The following result is well known (Baader, Brandt, and Lutz 2005):

Theorem A.32 For everyELHTBoxT one can construct in polynomial time anELHTBoxT0 _{in normal form such}

thatT andT0 _are_Σ_{-query inseparable for the signature}_Σ

ofT.

Assume K = (T,A) with T an ELH TBox in normal form is given. We construct the generating structure G = (∆G,·G_, ₎_for_K_{as follows, where}_∆G₌_ind₍_K₎_∪_Ω_and

(13)

defined in the construction forHorn-ALCHI.) Define as follows:

• a ([P], A)if a ∈ ind(K)andA is a concept name inT such thatK |=∃P.A(a)andK 6|=P(a, b)for any b∈ind(K)withK |=A(b)};

• ([P1], A1) ([P2], A2)ifT |=A1v ∃P2.A2.

Ωis defined as the set of all pairs([P], A)such that there area∈ind(K),P1, . . . , Pn=PandA1, . . . , An=Asuch that

a ([P1], A1) · · · ([Pn], An). We define the interpretation function·G _{by setting}

AG = {a∈ind(K)| K |=A(a)} ∪ {([P], B)∈Ω| T |=BvA},

PG = {(a, b)|there isP0(a, b)∈ AwithT |=P0vP}

and for every edgeu vwithv= ([P], A), we set

(u, v)G ={P0|[P]≤T [P0]}.

It can be shown thatG is a generating structure forK. Let

Mbe the interpretation defined by unravellingG.

Proof. Clearly,Mis a model ofA. We showM |=T by verifyingM |=αfor eachα∈ T.

α=AvB. Here we consider A to be either a concept nameA, or>. Letx∈ AM. Ifx =a ∈ ind(K), then

K |=A(a)by construction ofAM_{. Since}_A_v_B _{∈ T}_{, it}

followsK |=B(a), hencea∈BM. Ifx=x0·([R], C), thenT |=CvA. Because ofα, we have thatT |=Cv

B, thereforex∈BM.

α=A1uA2vB. The argument is analogous for x ∈ AM₁ ∩AM₂ .

α=Av ∃P.B. Letx ∈ AM. Ifx = a ∈ ind(K), then

K |= A(a) by construction of AM. Since α ∈ T, it followsK |= ∃P.B(a). AssumeK |= {P(a, b), B(b)}

for someb∈ind(K), then(a, b)∈PMandb∈BM, so

M |= α. Otherwise, we have thata ([P], B). From the construction ofM, we have that(a, a·([P], B)) ∈

PM, and([P], B)∈BM. For the casetail(x) = ([R], t), the proof is similar.

α=∃P.AvB. Assume(x, y)∈ PMandy ∈ AM. We consider various cases ofxandy:

• x =b,y = afora, b ∈ ind(K). ThenK |= P(b, a) andK |=A(a), consequentlyK |=B(b), sob ∈BM by construction ofM.

• x=b∈ind(K),y=b·([S], C). Then by construction ofM,K |= ∃S.C(b), moreoverT |={S v P, C v

A}. Next,K |= ∃P.A(b), and finallyK |= B(b), so b∈BM.

• tail(x) = ([Q], D),y=x·([S], C). Then by construc-tion ofM,T |= D v ∃S.C, moreoverT |= {S v

P, C vA}. It followsT |=D v ∃P.A, and because ofα,T |=D v B. Hence, by construction ofBM, x∈BM.

Observe that the casex=y·([S], C)for someS andC is not possible as it would required thatT |= S− v R, which is not possible inELH.

α=P1vP2. Assume(x, y)∈P1M. Ifx=aandy =b fora, b ∈ ind(K), it followsK |=P1(a, b). Fromαwe obtain that K |= P2(a, b), therefore (a, b) ∈ P₂M. If y=x·([P], t)for somePandt, by construction ofPM

1 ,

T |= P v P1. Then because ofα,T |= P v P2, so finally,(x, y)∈RM₂ .

q

Proof.Analogous to the proof of Proposition A.31. q

DL-Lite

H_horn

Finally, assumeK = (T,A)with aDL-LiteH_hornTBoxT is given. We construct the finitely generating structureG = (∆G,·G_, ₎_for_K_{as follows, where}_∆G ₌_ind₍_K₎_∪_{Ω. For}

each[R]withRare role inT, we introduce awitnessw[R]. Ωwill be a subset of the set of allw[R]. First define as follows:

• a w[R]ifa∈ind(K)and[R]is≤T-minimal such that

K |=∃R(a)andK 6|=R(a, b)for anyb∈ind(K);

• w[S] w[R]if[R]is≤T-minimal withT |=∃S−v ∃R

and[S−]6= [R].

Ωis defined as the set of all w[R] such that there area ∈ ind(K)andR1, . . . , Rn=Rsuch that

a w[R1] · · · w[Rn]. We define the interpretation function·G_{by setting}

AG= {a∈ind(K)| K |=A(a)} ∪ {w[R]∈Ω| T |=∃R− vA},

PG= {(a, b)|there isR(a, b)∈ AwithT |=RvP},

and for every edgeu vwithv=w[R], we set (u, v)G={R0|[R]≤T [R0]}.

One can show thatGis a generating structure forK. LetM

be the interpretation defined by unravellingG. The follow-ing two propositions can be proved by analogy with Propo-sition 17 and Lemma 18 in the full version of (Konev et al. 2011).

Finally, we show thatMis indeed a materialisation.

Theorem A.37 LetK= (T,A)be a consistentL-KB, and