How many variables are needed to express an existential positive query?

(1)

Bova, S. and Chen, Hubie (2018) How many variables are needed to express

an existential positive query? Theory of Computing Systems , ISSN

1432-4350. (In Press)

Downloaded from:

Usage Guidelines:

Please refer to usage guidelines at

or alternatively

(2)

existential positive query?

Simone Bova

1

_{and Hubie Chen}

2

1 TU Wien (Austria)

[email protected]

2 Birkbeck, University of London

Abstract

The number of variables used by a first-order query is a fundamental measure which has been studied in numerous contexts, and which is known to be highly relevant to the task of query evaluation. In this article, we study this measure in the context of existential positive queries. Building on previous work, we present a combinatorial quantity defined on existential positive queries; we show that this quantity not only characterizes the minimum number of variables needed to express a given existential positive query by another existential positive query, but also that it characterizes the minimum number of variables needed to express a given existential positive query, over all first-order queries. Put differently and loosely, we show that for any existential positive query, no variables can ever be saved by moving out of existential positive logic to first-order logic. One component of this theorem’s proof is the construction of a winning strategy for a certain Ehrenfeucht-Fraïssé type game.

1998 ACM Subject Classification F.1.3 Complexity Measures and Classes; F.4.1 Mathematical Logic; G.2.2 Graph Theory; H.2.1 Logical Design

Keywords and phrases Existential positive queries, finite-variable logics, first-order logic, query optimization

Digital Object Identifier 10.4230/LIPIcs...

1 Introduction

Background. The number of variables used by a first-order query is recognized as a highly useful and fundamental measure, and has been studied in numerous settings, including descriptive complexity, finite model theory, and query evaluation. By the number of variables used by a query, we refer to the total number of variables that appear in the query. Note that this measure is, in essence, equivalent to thewidth of a query, which is defined as the maximum number of free variables over all subformulas of the query: a query having width

kcan be rewritten, just by syntactically renaming variables, as a query using kvariables; and, a query usingkvariables clearly has width at most k. Within this article, all queries dealt with are relational and first-order.

In the setting of query evaluation, the number of variables is a measure of prime and crucial interest. A first reason for this is that the natural bottom-up algorithm for evaluating a first-order query on a finite structure exhibits, in general, an exponential dependence on the number of variables; it also runs in polynomial-time when a constant bound is placed on the number of variables [22]. Furthermore, there are complexity classification theorems [18, 15, 10, 11] on classes of Boolean queries in which the number of variables emerges as the decisive measure for describing whether or not a class of Boolean queries enjoys tractable query evaluation; in particular, these classification theorems show that, if a

licensed under Creative Commons License CC-BY Leibniz International Proceedings in Informatics

(3)

class of queries enjoys tractable query evaluation at all, then there exists a constantk≥1 such that each query in the class can be expressed by (that is, is logically equivalent to) a query using at mostkvariables. Let us remark that these classification theorems are given in a parameterized complexity setting in which a query can be preprocessed independently of the structure on which it is to be evaluated; and, that these theorems concern classes of queries having bounded arity.1 (We refer the reader to the cited articles for precise theorem statements and more information.)

Given the computational relevance of the number of variables as a query measure, it is natural to inquire, given a query, to what extent the number of variables can be minimized; indeed, it is a natural desire to attempt to rewrite/optimize a given query as one that uses the fewest number of variables (and which retains logical equivalence to the original query). In this article, we study this question on existential positive queries. They include and are semantically equivalent to the so-calledunions of conjunctive queries, also known asselect-project-join-union queries; these have been argued to be the most common database queries [1]. Previous work [6] due to the present authors yields a combinatorial characterization (Theorem 21) of the minimum number of variables needed to express a given Boolean existential positive query,by another existential positive query. Let FO denote the class of first-order queries, let EP denote the class of existential positive queries, and let FOk and EPk denote the restrictions of these classes to queries using at mostk variables, respectively; say that a queryφisL-expressibleif there exists a queryψ∈Lthat is logically equivalent to φ. Rephrasing, the combinatorial characterization yields, given a Boolean existential positive queryφ, the minimum value k such that φ is EPk-expressible. This characterization thus indicates how to minimize the number of variableswithin the class of existential positive queries. However, this characterization does not preclude the possibility that a query requiringk variables to be expressed as an existential positive query, could be expressed by a first-order query that uses strictly fewer thankvariables.

Contributions. We prove that the just-mentioned possibility can never occur. We generalize the aforementioned combinatorial quantity so that it is defined on all existential positive queries (both Boolean and non-Boolean), and dub this quantity thecombinatorial width. Our primary theorem states that, foranyexistential positive queryφ, whenk is set equal to the combinatorial width,

φis can be expressed by an existential positive query usingkvariables, but

φcannot be expressed by any first-order query using a number of variables that is strictly less thank.

That is, the combinatorial width not only gives the minimum valuek such thatφis EPk -expressible, it in fact more sharply gives the minimum valueksuch thatφis FOk-expressible. This theorem can be viewed as a collapse result, namely, that for existential positive queries, FOk-expressibility implies (and hence coincides with) EPk-expressibility. We want to emphasize that the theorem applies individually to every single existential positive query; in our view, the theorem contains a certain element of surprise, since it states (essentially) that there isno existential positive query whatsoever for which one can save variables by moving out of existential positive logic to the more general first-order logic.

One corollary of our development is that deciding FOk-expressibility of existential positive sentences is complete for the class Πp₂of the polynomial hierarchy (Corollary 25); this follows

1 _{A class of queries has}_{bounded arity}_{if there is a constant upper bound on the arity of all relation}

(4)

from the present theory in conjunction with a previous theorem on the complexity of deciding EPk-expressibility of existential positive sentences ([6, Theorem 6]). Let us remark that FOk-expressibility is undecidable on first-order sentences ([2, Remark 5.3]), and that (to our knowledge) prior to this work, FOk-expressibility of existential positive sentences was not even known to be decidable.

To establish the inexpressibility portion of our primary theorem, for each Boolean existential positive queryφ, we show how to construct two finite structuresB,B0 on which the query differs, but which are not distinguishable from each other by any first-order query using a number of variables strictly less than the combinatorial width ofφ. To show this non-distinguishability, we make use of a known Ehrenfeucht-Fraïssé type game [5, 20] designed for showing non-FOm-expressibility. We in fact first perform this construction for Boolean conjunctive queries (phrased in terms of homomorphisms; see Section 3.1, Theorem 6); after this, we observe that this result extends to Boolean existential positive queries (Theorem 22), and then build on this understanding to treat general existential positive queries (Theorem 23).

The construction of the aforementioned two structures is based on a construction due to Atserias et al. [3]. This previous work characterized, for each finite structure A, the number of pebbles needed for theexistential pebble game[21] to act as a solution procedure for deciding if there exists a homomorphism fromAto n given structureB(or, equivalently, if the conjunctive query corresponding to A evaluates to true on a given structure B). As we show in the present article (see the discussion of Theorem 28 in Section 5), this previous characterization theorem can be readily derived from our primary theorem, and hence our primary theorem provides a strengthening of and broader perspective on this previous theorem.

Let us mention that, in related work, there are numerous articles that investigate the applicability of pebble games to query evaluation problems, which issue was a motivation for the Atserias et al. article [3]. As examples, we mention the work of Dalmau et al. on conjunctive queries and the existential pebble game [16]; the works of Chen and Dalmau on quantified conjunctive queries [14, 9]; the work of Chen and Dalmau on conjunctive queries and generalized hypertree width [13]; and, the work of Barceló et al. [4] on semantically acyclic query evaluation under database constraints.

A conference version of this article appeared in the proceedings of ICDT 2017 [7].

2 Preliminaries

For an integerk≥0, we use [k] to denote the set{1, . . . , k}, with the convention that [0] =∅. Fori= 1,2, we freely letπidenote the ith projection both over pairs and over sets of pairs,

so for instanceπi((a1, a2)) =ai andπi(A1×A2) =Ai. For an integern, we letn(mod 2)

denote the valueb ∈ {0,1} such thatn andb are congruent modulo 2, that is, such that

n−b is an integer multiple of 2.

Whenf:A→B andg:B→Care functions, we useg(f) to denote their composition. Whenhis a partial function, we use dom(h) to denote the domain ofh.

Graphs, Structures, and Logic

Graphs. All graphsG= (V, E) in this article are undirected and simple, that is, Eis a set of 2-element subsets ofV.

AwalkinGis a sequenceW = (a1, . . . , am)∈Vm,m≥0, such thata1, . . . , am∈V and

(5)

terminology. We say thatW: contains a∈V ifa=ai for somei∈[m]; isfroms∈V to

t∈V ifa1 =s andam=t; isfrom s∈V to T ⊆V if a1=s andam ∈T; is s-cyclic if

a1=am=s. A graph G= (V, E) isconnected if for every two verticesv andv0 in V there

exists a walk fromv to v0.

A tree decomposition of a graphG= (V, E) consists of a treeT where each node uis associated to a nonempty subsetBu ofV (also calledbag) such that the following holds:

For each vertexv∈V, the nodes uofT such thatv∈Bu form a non-empty connected

subtree ofT.

For each edgee∈E, there exists a nodeuin T such thate⊆Bu.

Thewidthof a tree decompositionT of a graphGis defined as the maximum size attained by its bags minus 1, that is, maxu∈T|Bu| −1. Thetreewidth of a graphG, denoted bytw(G),

is the minimum width over all tree decompositions ofG.

LetG= (V, E) be a graph. We say that two subsetsU andU0 ofV touchifU∩U0 6=∅ or there existu∈U,u0∈U0 such that{u, u0} ∈E. A setMof mutually touching connected subsets ofV is called abrambleofG. A subsetH ofV hitsMifH∩M 6=∅for allM ∈ M. Theorder of a bramble is the minimum size attained over its hitting sets. We will make use of the following duality theorem.

ITheorem 1. (refer to [17]) For k≥1, a graph has treewidth≥k if and only if it has a bramble of order> k.

Structures. A relational vocabulary σis a set ofrelation symbols R, each of which has an associated natural numberar(R) called itsarity.

Letσbe a relational vocabulary. Aσ-structureAis specified by a nonempty setA, called theuniverseofAand denoted by the corresponding italic letter, and a relationRA_⊆_Aar(R)

for each relation symbolR∈σ. A structure isfinite if its universe is finite.

LetAandB beσ-structures. Ahomomorphism fromAtoBis a mappingh:A→B

such that for each symbol R ∈ σ: if the tuple (a1, . . . , aar(R)) is in RA, then the tuple

(h(a1), . . . , h(aar(R))) is inRB. We writeA→Bto indicate that there exists a homomorphism

fromAtoB. AandBarehomomorphically equivalent ifA→BandB→Aboth hold. An endomorphism ofAis a homomorphism fromAtoA. Anautomorphism ofAis a bijective mappingh: A→Asuch that (a1, . . . , aar(R))∈RAif and only if (h(a1), . . . , h(aar(R)))∈RA,

for allR∈σand (a1, . . . , aar(R))∈Aar(R); note that the inverse of an automorphism is an

automorphism. A structureAis acore if every endomorphism ofAis an automorphism of

A.

The structure B is a substructure of the structure A ifB ⊆A and RB _⊆_RA _{for all} relation symbolsR∈σ. WhenBis a substructure of A, there exists a homomorphismh

fromAto B, andhfixes each elementb∈B, the mappinghis said to be aretraction from

A toB; when there exists a retraction from A toB, it is said that A retracts toB. A core of a structure A is a structureC such thatAretracts toC, butAdoes not retract to any proper substructure ofC. It is well known that each finite structure has a core and all cores of a finite structure are isomorphic [19]; we therefore freely refer tothecore of a finite structureA, and denote this object bycore(A). The following facts are known and straightforwardly verified: whenAis a core, thenAis a core ofA; and, whenCis a core of

A, thenCis a core.

TheGaifman graphof a structure Ais the graph with vertex setAand having an edge {a, a0}if and only ifa6=a0 andaanda0 cooccur in a tuple ofA. The treewidth of a structure

(6)

Logic. In this article, we deal with first-order logic. An atom (over vocabularyσ) is an equality of variables (x=y) or is a predicate applicationR(x1, . . . , xr), wherex1, . . . , xr are

variables, andR∈σis a relation symbol of arityr. Aformula (over vocabularyσ) is built from atoms (overσ), negation (¬), conjunction (∧), disjunction (∨), universal quantification (∀), and existential quantification (∃). We definefree(φ) to be the set of free variables of a

formulaφ. Asentenceis a formula having no free variables.

We let FO denote the class of first-order formulas. An existential positiveformula (over vocabularyσ) is a formula built from atoms (overσ) using conjunction, disjunction, and existential quantification; we let EP denote the class of existential positive formulas. A primitive positive formula (over vocabularyσ) is a formula built from atoms (overσ) using conjunction and existential quantification; we let PP denote the class of primitive positive formulas. Let L⊆FO. A formulaφ∈FO is called an L-formula (respectively, an L-sentence) ifφis in L (respectively, ifφis a sentence in L).

When A is a structure,f is an assignment of variables in A, andφis a formula over the vocabulary ofA, we write A, f |=φto indicate thatφis true inA underf; ifφis a sentence, we simply writeA|=φ. Letφandψbe formulas over the vocabularyσhaving the same free variables. We writeφ|=ψto indicate thatφentailsψ, that is, for allσ-structures

Aand assignments f inAit holds that,A, f |=φimpliesA, f |=ψ. We say thatφandψ

arelogically equivalent (denotedφ≡ψ) ifφ|=ψand ψ|=φ. Letφbe an FO-formula and let L⊆FO. We say thatφis L-expressibleif there exists an L-formulaφ0 such thatφ≡φ0. For any PP-sentenceφoverσ, we letC[φ] denote the canonical structure induced byφ. Thecanonical structure ofφis obtained by first prenexing the quantifiers and eliminating the equalities inφ, obtaining a logically equivalent PP-sentenceφ0 in prenex form and equality free; and second by defining C[φ] to be the structure having a universe element for each existentially quantified variable inφ0, and where, for eachR∈σ, the relationRC[φ] _contains

(x1, . . . , xr) if and only ifR(x1, . . . , xr) appears in the quantifier free part ofφ0.

For a finite σ-structure A, we let Q[A] denote the canonical query of A, namely, if

A={a1, . . . , an}, then

Q[A] =∃a1. . .∃an ^

R∈σ

^

(a0

1,...,a0k)∈RA

R(a01, . . . , a0k).

It is straightforward to verify that any PP-sentence φis logically equivalent toQ[C[φ]], and that every finite structureAis homomorphically equivalent toC[Q[A]]. We will use the following known fact [8].

ITheorem 2(Chandra and Merlin [8]). Letφbe aPP-sentence and letAbe a finite structure, such thatφ=Q[A] or A=C[φ]. Then, for any structure B, it holds that A→B if and only ifB|=φ.

Finite Variable Logics

For each class L of first-order formulas and each integer k ≥ 1, we let Lk denote those formulas in L that use at most kdistinct variable symbols. Fragments of first-order logic using only finitely many variables, called finite variable logics, are central in finite model theory. Equivalence in these fragments can be characterized by Ehrenfeucht-Fraïssé style games calledpebble games [20], which we now introduce.

(7)

for allR∈σanda1, . . . , aar(R) ∈dom(h) it holds that (a1, . . . , aar(R))∈RA if and only if

(h(a1), . . . , h(aar(R)))∈RB.

Thek-pebble game is played by two players, a spoiler and a duplicator, over two (finite) relational structuresAandBover the same vocabulary. A position of the game is a subset of

A×B of size at mostk. The game starts in the empty position and continues in a sequence of rounds. In each round of the game, the spoiler removes a pair from the current position if its size isk, and then selects an elementa∈Aor b∈B; the duplicator answers by selecting an elementb∈B ora∈A, respectively. The new position is defined by adding the pair (a, b) to the old position. The duplicator wins the game if each position occurring along the

rounds is a partial isomorphism fromAtoB.

IDefinition 4. [20] Letk≥0. Aduplicator winning strategy in the k-pebble game on A

andB is a familyS of partial isomorphismshfromAto Bwith|dom(h)| ≤k such that: (S1) ∅ → ∅is in S.

(S2) Ifh∈ S and|dom(h)|< k, then:

(S2.F) For every a∈Athere existsb∈B such thath∪ {(a, b)}is inS. (S2.B) For everyb∈B there existsa∈Asuch thath∪ {(a, b)} is inS. (S3) Ifh∈ S anda∈dom(h), thenh|dom(h)\{a} is inS.

It is clear that the duplicator wins the above described k-pebble game onA andB if and only if the game admits a duplicator winning strategy.

We say that two structuresA,B areindistinguishable byFOk-sentences(in short FOk -indistinguishable), if for each FOk-sentenceφ it holds that A |= φ if and only ifB |= φ. As anticipated, k-pebble games characterize expressibility in the k-variable fragment of first-order logic.

ITheorem 5(Immerman [20]). Letk≥0and letA andB be relational structures on the same vocabulary. The following are equivalent.

1. There exists a duplicator winning strategy in thek-pebble game on AandB.

2. AandB areFOk-indistinguishable.

3 Construction of Structures

In this section, we show a theorem implying that certain PP-sentences, namely those corresponding (via Chandra-Merlin, Theorem 2) to cores of treewidth at leastk, cannot be expressed by FO-sentences usingkvariables (Theorem 6). This inexpressibility result allows us to later derive our primary theorem (Theorem 23).

ITheorem 6. Let A be a core on the relational vocabulary σ such that tw(A) ≥k ≥1. There existσ-structuresB andB0 such thatB→A;A→B0;A6→B; and, B andB0 are

FOk-indistinguishable.

The remainder of the current section is devoted to the construction (Section 3.1) and the study (Section 3.2 and Section 3.3) of the structuresBandB0 mentioned in Theorem 6. We remark that the structureBis essentially equal to the structure defined by Atserias et al. in [3, Section 4].

(8)

Proof of Theorem 6. It is readily verified thattw(A)≥k ≥1 implies the existence of a connected component (C, E) in the Gaifman graph ofAsuch thattw((C, E))≥k≥1. Lets

be an arbitrary but fixed vertex inC. LetBandB0 be the twoσ-structures defined relative toA ands in Section 3.1; thenB →A by Observation 9 in Section 3.1. In Section 3.2, Lemma 11 and Lemma 12 show that A→ B0 and A6→ B, respectively. In Section 3.3, Lemma 16 gives a duplicator winning strategy in thek-pebble game on (B,B0), which suffices

to yield the theorem, via Theorem 5. _J

INotation 7. The following names are reserved throughout the current section:

Adenotes a core on the relational vocabularyσ, with universeA, such thattw(A)≥k≥1.

G = (C, E) denotes a connected component in the Gaifman graph of A such that

tw(G)≥k≥1. Note that, in particular, |C| ≥2.

sdenotes an arbitrary but fixed vertex inC.

Mdenotes an arbitrary but fixed bramble ofGhaving order> k(such a bramble exists by Theorem 1). Recall that, therefore, any hitting set forMhas size at leastk+ 1.

3.1 Construction of

B

and

B

0

In this section, relative toAands, we define twoσ-structuresBandB0 as follows. For alla∈A, letEa denote the edges incident on ain the Gaifman graph ofA. Let:

UC= (

(a, f)

a∈C, f: Ea→ {0,1} is such that X

e∈Ea

f(e) (mod 2) =

(

0 ifa6=s

1 ifa=s

)

,

U_C0 =

(

(a, f)

a∈C, f: Ea→ {0,1} is such that 0 = X

e∈Ea

f(e) (mod 2)

)

,

U_A\C={(a, f:Ea → {0})|a∈A\C}.

ThenBandB0 have universesB andB0 defined as follows:

B=UC∪UA\C,

B0=U_C0 ∪U_A\C.

The vocabulary is interpreted as follows. For all R ∈ σ, let (a1, f1), . . . ,(ar, fr) be

elements ofB(respectively, of B0), wherer=ar(R). Then ((a1, f1), . . . ,(ar, fr)) is in RB

(respectively, inRB0_{) if and only if} (a1, . . . , ar)∈RA;

for alli, j∈[r], ife={ai, aj} ∈E thenfi(e) =fj(e).

For the sake of intuition, suppose thatAis a connected graph, so thatAis isomorphic to its Gaifman graph andC=A. The universes ofBandB0 are formed by pairs (a, f) where

a is a vertex of A andf is a Boolean labelling of the edges incident on a. The Boolean labellings have even parity with the only exception of those paired withsinB which have odd parity. Moreover there is an edge{(a, f),(a0, f0)}inB(respectively,B0) if and only if the edge {a, a0} is inA and the labellingsf andf0 agree on {a, a0}. A concrete example follows.

IExample 8. LetA= (A, EA_{) where}_A₌_{_{a, s}_}_and_EA₌_{(_{a, s}₎_,₍_{s, a}_{)}, that is,}_A_{is the} graph formed by the single edge{a, s}. Letfi:{{a, s}} → {0,1} be such thatfi({a, s}) =i

fori= 0,1. ThenB= (B, EB_{) where}_B ₌_{(_{a, f}

0),(s, f1)} andEB=∅, andB0= (B0, EB

0 ) whereB0={(a, f0),(s, f0)}andEB

0

(9)

Note that, by construction ofB, the mapπ1:B→Ais a homomorphism fromBto A.

We inline this observation for later use.

IObservation 9. B→A.

3.2 A

Treats

B

and

B

0

Differently

LetBandB0 be the twoσ-structures defined relative to Aandsin Section 3.1. We show thatAmaps homomorphically toB0 (Lemma 11) but not toB (Lemma 12).

IExample 10. Let A, B, andB0 be as in Example 8. Then the mapping hdefined by

h(a0_{) = (}_a0_{, f}

0) for alla0∈Ais a homomorphism fromAtoB0, butAhas no homomorphisms

toBby direct inspection, asEB=∅.

We show thatAmaps homomorphically toB0.

ILemma 11. A→B0.

The essence of the proof is to conduct a rather direct inspection of the construction in Section 3.1; this shows that a copy ofAsits insideB0, by looking at the pairs inB0 carrying an identically 0 labelling.

Proof. Leth:A→B0 _{be defined as follows. For all}_a_∈_A_{, let}_h₍_a_{) = (}_{a, f}_:_E

a→ {0})∈B0.

We claim that h is a homomorphism from A to B0. Let R ∈ σ, ar(R) = r, and let (a1, . . . , ar)∈RA.

Note that by the definition of Gaifman graph, either{a1, . . . , ar}∩C=∅or{a1, . . . , ar} ⊆

C. In the former case, (h(a1), . . . , h(ar)) is trivially in RB

0

because no i, j ∈ [r] satisfy {ai, aj} ∈ E. In the latter case, let i, j ∈ [r] be such that i 6= j. Then, {ai, aj} ∈ E;

set e = {ai, aj}. We have π2(h(ai))(e) = 0 = π2(h(aj))(e) by definition of h. Hence

(h(a1), . . . , h(ar)) is inRB

0

. _J

On the other hand,Bhas no homomorphisms from A.

ILemma 12. [3, Lemma 1] A6→B.

For the sake of understanding this lemma intuitively, let A be a connected graph. A homomorphism fromAtoBmapsAto B preserving all edges. By construction (here we use thatA is a core), the homomorphic image ofA inB is a copy ofA where all edges carrytwoBoolean labels equal to each other. Summing these Boolean labels in two ways, edgewise and vertexwise, we get the contradiction that the edgewise sum has even parity but the vertexwise sum, by the contribution of the labels ofs, has odd parity.

We give a proof for completeness.

Proof. Assume for a contradiction thathis a homomorphism fromAtoB. By a standard argument, we may assume, without loss of generality, thatπ1(h(a)) =afor alla∈A.2

2 _{This can be justified as follows. We write}_g_◦_f₌_g₍_f_{) to denote the composition of}_g_and_f_{, with}_f

applied first. We have thatπ1is a homomorphism fromBtoA. Henceπ1◦his an endomorphism of

A, and sinceAis a core,π1◦his an automorphism ofA. By associativity of function composition,

idA= (π1◦h)◦(π1◦h)−1=π1◦(h◦(π1◦h)−1) =π1◦h0

that is, ifh0=h◦(π1◦h)−1, thenπ1(h0(a)) =afor alla∈A. Moreover,h0is a homomorphism from

(10)

We claim that for alle={a0, a00} ∈E it holds that

π2(h(a0))(e) =π2(h(a00))(e). (1)

Lete={a0, a00} ∈E. Note thate∈Ea0∩E_a00. By the definition of the Gaifman graph of

A, there exist R∈σand (a1, . . . , ai, . . . , aj, . . . , ar)∈RA withai=a0 andaj=a00. Then,

sinceA→Bviah,

(h(a1), . . . , h(ar)) = ((a1, f1), . . . ,(ar, fr))∈RB.

By construction, for alli, j∈[r], ife={ai, aj} ∈E, thenfi(e) =fj(e). In particular,

π2(h(a0))(e) =π2(h(ai))(e) =fi(e) =fj(e) =π2(h(aj))(e) =π2(h(a00))(e),

and we have that (1) holds.

We conclude observing that, by construction,

1 = X

e∈Es

π2(h(s))(e) (mod 2)

and for alla∈A\ {s},

0 = X

e∈Ea

π2(h(a))(e) (mod 2).

Then, lettingbe denote the quantity in (1), we have:

1 = X

e∈Es

π2(h(s))(e) +

X

a∈A\{s} X

e∈Ea

π2(h(a))(e) (mod 2)

= 2·X

e∈E

be (mod 2) by (1)

= 0,

a contradiction. _J

3.3 B

and

B

0

_are

_FO

k

-Indistinguishable

LetBandB0 be the twoσ-structures defined relative toAandsin Section 3.1. We show thatB andB0 are FOk-indistinguishable (Lemma 16).

We describe informally a winning strategy for the duplicator in thek-pebble game onB

andB0. For the sake of illustration, consider the simple case whereBandB0 are constructed relative to a connected graph A so that A = G (the lemma lifts the idea to arbitrary relational structures A, whose Gaifman graphs are possibly disconnected). The duplicator fixes a brambleMofGand maintains along the rounds of the game the following position (i= 0,1, . . . , k):

i pebble pairs are placed on elements (a1, f1), . . . ,(ai, fi)∈B and correspondingly on

elements (a1, f10), . . . ,(ai, fi0)∈B0 such that for some walkW in Gfrom sto a bramble

set inMavoidinga1, . . . , aiit holds thatfj(e) equals the parity offj0(e) plus the number

of timeseoccurs inW (for all j∈[i] ande∈Eaj).

That such a position exists and is a partial isomorphism is the content of Lemma 14; that the duplicator can maintain such a position along the rounds of the game is the content of Lemma 16.

(11)

INotation 13. LetW = (a1, . . . , am) be a walk inG. Fore∈E, we let

occW(e) =|{i∈[m−1]|e={ai, ai+1}}|

denote the number of times the edgeeis used inW. Moreover, for everyS⊆A, let

avoidM(S) =

[

{M∈M|M∩S=∅}

M

be the union of the bramble sets inMdisjoint fromS.

I Lemma 14. Let 0 ≤ i ≤ k and let {(a1, f1), . . . ,(ai, fi)} ⊆ B. Let W be a walk in

G from s to avoidM({a1, . . . , ai}). For all j ∈ [i], let fj0 : Eaj → {0,1} be defined by

f_j0(e) =fj(e) +occW(e) (mod 2). Then the mapping hsending (aj, fj) to (aj, fj0) for all

j∈[i] is a partial isomorphism fromB toB0.

Towards proving Lemma 14, we claim the following.

IClaim 15. Leta∈A and letW be a walk in Gfrom stot6=a. Then:

If (a, f)∈B andf0(e) =f(e) +occW(e) (mod 2) for alle∈Ea, then (a, f0)∈B0.

If (a, f0₎_∈_B0 _and_f₍_e_{) =}_f0₍_e₎₋_occ

W(e) (mod 2) for alle∈Ea, then (a, f)∈B.

The idea underlying the claim is that, for (a, f)∈B witha=s, both the sum of the labellings ofEa underf, and the number of occurrences of edges inEa in a walkW that

starts ataand does not end ata, are odd. For (a, f)∈B witha6=sboth the sum of the labellings underf of edges incident ona, and the number of occurrences of edges incident onain a walkW that neither starts nor ends ata, are even. It follows that (a, f0)∈B0 by construction.

Proof of Claim 15. For the first item, leta∈A, let (a, f)∈B, and letf0:Ea → {0,1}be

such thatf0(e) =f(e) +occW(e) (mod 2) for alle∈Ea.

If a∈A\C, then (a, f) ∈B implies that f(e) =occW(e) = 0 for all e∈Ea, so that

f0₍_e_{) = 0 for all}_e_∈_E

a. By construction, (a, f0)∈B0. Otherwise, ifa∈C, we distinguish

two cases.

Case: Ifa=s, then

1 = X

e∈Es

occW(e) (mod 2)

becauseW starts atsand ends att6=s=a(andGdoes not contain loops). On the other hand, by definition ofB,

1 = X

e∈Es

f(e) (mod 2),

so 0 =P e∈Esf

0₍_e_{) (mod 2), and (}_{s, f}0_{) = (}_{a, f}0₎_∈_B0 _{by definition of}_B0_.

Case: Ifa6=s, then

0 = X

e∈Ea

occW(e) (mod 2)

becauseW neither starts nor ends ata(andGdoes not contain loops). On the other hand, by definition,

0 = X

e∈Ea

f(e) (mod 2),

so 0 =P e∈Eaf

0₍_e_{) (mod 2) and (}_{a, f}0₎_∈_B0_.

For the second item, ifa∈ A\C we similarly have that f is identically 0 onEa and

(12)

such thatf(e) =f0(e)−occW(e) (mod 2). We distinguish two cases. Ifa=s, then 1 = P

e∈EsoccW(e) (mod 2) and 0 =

P e∈Esf

0₍_e_{) (mod 2), therefore 1 =}P

e∈Esf(e) (mod 2) and (s, f) = (a, f)∈B. Ifa6=s, then 0 =P

e∈EaoccW(e) (mod 2) and 0 =

P e∈Eaf

0₍_e_{) (mod 2)}

imply 0 =P

e∈Eaf(e) (mod 2), so that (a, f)∈B and we are done. J We are now ready to prove the lemma. For the sake of intuition, consider the case whereA

is a connected graph, so thatA=G. If the edge{(a, f),(b, g)}is inBandh((a, f)) = (a, f0_),

h((b, g)) = (b, g0), then on the one handf({a, b}) =g({a, b}), which is the content of (4); and on the other handf0({a, b}) andg0({a, b}) equal the parity of the sum off({a, b}) and

g({a, b}), respectively, and the number of occurrences of{a, b}in a fixed walk W, which is the content of (3). Thereforef0({a, b}) =g0({a, b}) and the edge{(a, f),(b, g)}is inB0. The converse is symmetric.

Proof of Lemma 14. By Claim 15, (aj, fj0)∈B0 for allj ∈[i], hencehis a partial function

from B to B0; moreover, h is injective by definition. LetR ∈ σ, let ar(R) = r, and let (b1, f1), . . . ,(br, fr)∈dom(h). It is sufficient to show that

((b1, f1), . . . ,(br, fr))∈RB⇐⇒(h(b1, f1), . . . , h(br, fr))∈RB

0

(=⇒) Assume ((b1, f1), . . . ,(br, fr))∈RB. Then by construction (b1, . . . , br)∈RA. We

distinguish two cases.

Case: {b1, . . . , br} ∩C =∅. Note that ifbj ∈A\C, thenfj(e) = 0 for all e∈Ebj by construction andoccW(e) = 0 for all e∈Ebj becauseW lies entirely inC. Thenf

0 j(e) = 0

for alle∈Ebj. Therefore, by construction, (h(b1, f1), . . . , h(br, fr))∈R B0_.

Case: If {b1, . . . , br} ⊆ C, then let e = {bj, bj0} ∈ E, j, j0 ∈ [r]. We claim that

f_j0(e) =f_j00(e). By hypothesis, there exists a walkW inGfromstot∈avoidM({a1, . . . , ai})

such that for allj∈[i] and alle∈Eaj it holds that

f_j0(e) =fj(e) +occW(e) (mod 2). (2)

It follows from (2) that, for allj∈[r] ande∈Ebj,

f_j0(e) =fj(e) +occW(e) (mod 2). (3)

Moreover, by construction, ifj, j0∈[r] ande={bj, bj0} ∈E, then

fj(e) =fj0(e). (4)

Therefore,

f_j0(e) =fj(e) +occW(e) (mod 2) by (3)

=fj0(e) +occ_W(e) (mod 2) by (4)

=fj00(e) by (3)

We conclude that ((b1, f10), . . . ,(br, fr0)) = (h(b1, f1), . . . , h(br, fr))∈RB

0 . (⇐=) Assume ((b1, f10), . . . ,(br, fr0))∈RB

0

. Then (b1, . . . , br)∈RA. If{b1, . . . , br} ∩C=

∅, then (h(b1, f10), . . . , h(br, fr0)) ∈ R

B _{reasoning as above. If} _{_b

1, . . . , br} ⊆ C, then let

j, j0∈[r] ande={bj, bj0} ∈E. We claim thatf_j(e) =f_j0(e). By hypothesis, there exists a walkW inGfroms tot∈avoidM({a1, . . . , ai}) such that for allj ∈[i] and all e∈Eaj it holds that

(13)

By (5) we have that for allj∈[r] ande∈Ebj

fj(e) =fj0(e)−occW(e) (mod 2). (6)

Moreover, by construction, ifj, j0 ∈[r] and e={bj, bj0} ∈E, then

f_j0(e) =f_j00(e). (7)

Therefore,

fj(e) =fj0(e)−occW(e) (mod 2) by (6)

=f_j00(e)−occW(e) (mod 2) by (7)

=fj0(e) by (6)

We conclude that ((b1, f1), . . . ,(br, fr))∈RB. J

We now formalize the strategy informally described above and show that, indeed, it is a winning strategy for the duplicator in the k-pebble game onBand B0.

ILemma 16. LetS be the family of partial isomorphisms from BtoB0 that contains an injective map h:B → B0 when |dom(h)| ≤ k and there exists a walk W in G from s to avoidM(π1(dom(h)))such that, for all(a, f)∈dom(h), it holds thath((a, f)) = (a, f0)where

f0(e) =f(e) +occW(e) (mod 2)

for alle∈Ea. ThenS is a duplicator winning strategy in thek-pebble game onB andB0.

The crux of the proof is the following. Suppose thati < k pebble pairs are placed on elements (a1, f1), . . . ,(ai, fi)∈B and correspondingly on elements (a1, f10), . . . ,(ai, fi0)∈B0

such that for some walkW inGfroms tot in a bramble set in Mavoiding a1, . . . , ai it

holds thatf_j0(e) equals the parity offj(e) plus the number of timeseoccurs inW (for all

j∈[i] ande∈Eaj).

The spoiler pebbles, say, (ai+1, fi+1)∈B. The duplicator obtains a walkW0 in Gfrom

sto a vertex t0 _{lying in a bramble set of}_M_{that avoids}_a

1, . . . , ai, ai+1 (such a set exists

because the bramble has order greater thank≥i+ 1) by walking fromstot overW and then fromttot0 using only vertices in the bramble sets oftandt0 (which is feasible by the properties of the bramble). The duplicator pebbles (ai+1, fi0+1)∈B, wherefi0+1(e) equals the

parity offi+1(e) plus the number of timeseoccurs inW0 for alle∈Eai+1; thus maintaining its winning position, because the number of occurrences of edges incident to any ofa1, . . . , ai

does not change in passing fromW toW0.

Proof of Lemma 16. We check thatS satisfies Definition 4 relative to BandB0.

For (S1), we show that the functionh:∅ → ∅is inS. Since any hitting set of the bramble Mhas size at leastk+ 1≥2, there exists a setM inMsuch thats6∈M. Leta∈M. Then

a∈avoidM(π1(dom(h))). Moreover, Gis connected, hence there is a walk W inGfromsto

a. Thush∈ S.

We now verify thatS satisfies (S2.F) and (S2.B). Leth∈ S and let|dom(h)|< k. By definition ofS, there exists a walkW inGfromstot∈avoidM(π1(dom(h))) such that, for

all (a, f)∈dom(h) it holds that h((a, f)) = (a, f0) where f0(e) =f(e) +occW(e) (mod 2)

for alle∈Ea.

For (S2.F), let (a, f)∈B. Since|dom(h)|< k, we have that|π1(dom(h))∪ {a}| ≤k. So,

(14)

that M0∩(π1(dom(h))∪ {a}) = ∅. Let t ∈M ∈ M. Obtain a walk W0 inG from s to

t0 ∈avoidM(π1(dom(h))∪ {a}) by concatenatingW and a walk in Gfromt∈M tot0∈M0

containing only vertices inM andM0. Let (a, f0) be such thatf0(e) =f(e)+occW0(e) (mod 2) for alle∈Ea. Now defineh0=h∪ {((a, f),(a, f0))}.

We want to show thath0 is inS. We claim that, for all (b, fb)∈dom(h0), ifh0((b, fb)) =

(b, f_b0), thenf_b0(e) =fb(e) +occW0(e) (mod 2) for all alle∈E_b. It follows thath0 is a partial isomorphism fromB toB0 by Lemma 14, so that h0 is inS witnessed byW0. Henceh0∈ S.

For the claim, let (b, fb)∈dom(h0) and let h0((b, fb)) = (b, fb0). Ifb =a, then (b, fb) =

(a, f) andh0((a, f)) = (a, f0) such thatf0(e) =f(e) +occW0(e) (mod 2) for alle∈E_a by construction. Ifb6=a, then notice thatb6∈M andb6∈M0, so that for everye∈Eb it holds

thatoccW(e) =occW0(e), and therefore,

f_b0(e) =fb(e) +occW(e) (mod 2) by hypothesis onh

=fb(e) +occW0(e) (mod 2) and the claim is settled.

For (S2.B), let (a, f0)∈B0. Along the lines above, we obtain a walkW0 inGfromsto

avoidM(π1(dom(h))∪ {a}), andf:Ea→ {0,1}such thatf0(e) =f(e) +occW0(e) (mod 2) for alle∈Ea; we puth0=h∪ {((a, f),(a, f0))}, and show thath0 ∈ S appealing to Lemma 14.

We conclude by verifying that S satisfies (S3). Leth∈ S and let (a, f)∈dom(h). We want to show that the restriction ofhtodom(h)\{(a, f)}, namely,h0=h|dom(h)\{(a,f)}is inS.

Partial isomorphisms are closed under restrictions, henceh0is a partial isomorphism fromBto

B0 of domain size at mostk. LetW be a walk inGfromstoavoidM(π1(dom(h))) witnessing

thathis in S. We claim thatW also witnesses thath0 is inS. By definition,W is from

stoavoidM(π1(dom(h)))⊆avoidM(π1(dom(h0))). Moreover, ifh0((a0, f0)) = (a0, f00), then

h((a0, f0)) = (a0, f00) and for alle∈Ea0 it holds thatf00(e) =f0(e) +occ_W(e) (mod 2). J

4 Existential Positive Logic

In this section, we presentcombinatorial width, the combinatorial measure on EP-formulas. While the specialization of this measure to EP-sentences is due (implicitly) to previous work, we here give a definition that applies to all EP-formulas.

In order to define our measure, we first associate a structure to each PP-formula, as follows. In the following definition, one should conceive ofψas a disjunct of an EP-formula which is a disjunction of PP-formulas and which has free variablesv1, . . . , v`.

IDefinition 17. For each vocabularyσand each integer `≥1, we fixUσ,` to be a relation

symbol of arity`outside ofσ.

For each vocabulary σ, each list v1, . . . , v` of variables, and each PP-formula ψ over

vocabularyσwithfree(ψ)⊆ {v1, . . . , v`}, we define a structureC[ψ;σ;v1, . . . , v`] as follows:

Define ψ0 as the formula obtained fromψby prenexingψ and then renaming quantified variables (if necessary) so that none of the variablesv1, . . . , v` are quantified.

Define ψ+ _as_∃_v

1. . . v`(Uσ,`(v1, . . . , v`)∧ψ0) if` >0, and asψ0 if`= 0.

Define C[ψ;σ;v1, . . . , v`] as C[ψ+].

Note that, in the case that`= 0, it holds thatC[ψ;σ;v1, . . . , v`] is isomorphic toC[ψ], since

in this caseψ0 andψare identical up to renaming variables, andψ+₌_ψ0_.

In essence, the structure just defined is the canonical structure obtained by conjoining, to the PP-formula, the atomUσ,`(v1, . . . , v`), where the symbolUσ,` is a fresh one; and then

(15)

We now define a notion ofnormalized EP-formula.

IDefinition 18. (extends [6, Definition 3]) Letφbe an EP-formula over vocabularyσand whose free variables arev1, . . . , v`. We say thatφisnormalized if it is equal to a disjunction W

i∈[m]ψi (withm≥0) where:

eachψi is a prenex PP-formula,

for eachi∈[m], the structureC[ψi;σ;v1, . . . , v`] is a core, and

for eachi, j∈[m] withi6=j, it holds thatC[ψi;σ;v1, . . . , v`]6→C[ψj;σ;v1, . . . , v`].

IProposition 19. There exists an algorithm that, given as input an EP-formulaφ, outputs a normalized EP-formulaφ0=W

i∈[m]ψi0 that is logically equivalent toφ, and such that (for

anyk≥0) ifφis EPk-expressible, thenψ0_i is PPk-expressible for everyi∈[m].

This proposition was observed in the particular case of sentences by [6, Section 3]; the con-struction is, in essence, a solution to a classic exercise in database theory [1, Exercise 6.14(c)].

Proof. Letσdenote the vocabulary of φ, and letv1, . . . , v` denote the free variables ofφ.

By the proof of [6, Lemma 4(3)],φcan be syntactically transformed (while preserving logical equivalence) to a disjunctionW

i∈[m]ψi where eachψi is a PP-formula, via

transfor-mations that do not introduce variables. Hence, eachψi is PPk-expressible assuming that φ

was EPk-expressible.

For each i∈ [m], we may further assume (by rewriting ψi if necessary) that ψi =ψ0i,

whereψ_i0 is as defined in Definition 17. As this operation preserves logical equivalence, it does not affect whether or not the disjunctionW

i∈[m]ψi is EPk-expressible.

Suppose thati, j∈[m] andhare such thathis a homomorphism fromC[ψi;σ;v1, . . . , v`]

toC[ψj;σ;v1, . . . , v`]. Due to the interpretation ofUσ,` by these structures as{(v1, . . . , v`)},

it holds that h fixes each of v1, . . . , v`. We claim that when B is a structure and when

f :{v1, . . . , v`} →B is an assignment, it is straightforward to verify thatB, f |=ψj implies B, f(h)|=ψi, which in turn impliesB, f |=ψi.

In the case that there exists such a homomorphism fori, j∈[m] with i6=j, we have that

ψj entailsψi, and hence removingψj from the disjunction preserves logical equivalence of

the disjunction. In the case that there exists such a homomorphism that is not surjective for

i=j, then setψ to be the modification ofψi where one removes all atoms whose variables

are not in the image ofh, as well as the quantifications of such variables; then, we have

C[ψi;σ;v1, . . . , v`] → C[ψ;σ;v1, . . . , v`]; from this, it follows that ψ and ψi are logically

equivalent, andψi can be replaced withψin the disjunction. By iteratively performing such

removals and replacements until none can be performed, a normalized EP-formula is obtained. Moreover, these removals and replacements preserve PPk-expressibility of all disjuncts. _J

We now define the notion ofcombinatorial width. Although we only define it directly on normalized EP-formulas, the definition can be naturally extended to all EP-formulas in light of Proposition 19.

IDefinition 20. Letφ=W

i∈[m]ψibe a normalized EP-formula with free variablesv1, . . . , v`

and over vocabularyσ. We define comb-width(φ) = maxi∈[m](tw(C[ψi;σ;v1, . . . , v`]) + 1).

Note that, in the case that φ is a sentence (equivalently, when ` = 0), it holds that

comb-width(φ) = maxi∈[m](tw(C[ψi]) + 1).

(16)

ITheorem 21. Let φ=W

i∈[m]ψi be a normalizedEP-sentence. Let k=comb-width(φ).

The sentenceφis EPk-expressible, but (assumingk >0) is notEPk−1-expressible.

Proof. We will use the fact, which follows from [16, Theorem 12], that (when w ≥1) a PP-sentenceψhastw(core(C[ψ]))< w if and only ifψis PPw-expressible.

We have that φis EPk-expressible via this fact, since each disjunctψi hastw(C[ψi])<

comb-width(φ).

For the non-expressibility result, suppose that φ is EPn-expressible; we prove that

n ≥k. It follows from Proposition 19 that φ is logically equivalent to a normalized EP-formula that is a disjunction W

i∈[m0_]ψ0_i of PPn-formulas. We have that each C[ψ0_i] is a core; by the fact, we obtain thattw(C[ψ_i0]) + 1≤n. By Lemma 4(2) of [6], it follows that

k= maxi∈[m0_](tw(C[ψ_i0]) + 1). We thus havek≤n. J

5 Main Theorems and Consequences

We first prove our number-of-variables characterization for EP-sentences.

I Theorem 22. Let φ= W

i∈[m]ψi be a normalized EP-sentence; let w= comb-width(φ).

The sentenceφis EPw-expressible, but (assumingw >1) is notFOw−1-expressible. This theorem is transparently seen to be a strengthening of Theorem 21 (when the stated assumption holds).

Proof. That the sentenceφis EPw-expressible follows directly from Theorem 21, so we prove thatφis not FOw−1-expressible. Choose i∈[m] such thatcomb-width(φ) =tw(C[ψi]) + 1;

setA=C[ψi]. By the definition ofnormalized, we have thatAis a core. Let B,B0 be the

structures provided by Theorem 6 relative toAandk=tw(A); note thatk=w−1. We show in the next paragraph that B0 |=φ andB 6|=φ. Then, sinceB0 andB are FOk-indistinguishable by Theorem 6, and sinceφdistinguishes betweenB0 andB, it follows thatφis not FOk-expressible.

We prove the claim. We have that A→B0, henceB0|=ψi by Theorem 2, andB0 |=φ.

Now, assume for a contradiction that B |= φ. Then B |= ψj for some j ∈ [m]. Then C[ψj]→Bby Theorem 2. We have B→A=C[ψi] by Theorem 6. HenceC[ψj]→C[ψi],

soi=j by the hypothesis that φis normalized. But we also haveA=C[ψi]6→B, hence

i6=j, a contradiction. _J

We now extend the previous theorem to address all normalized EP-formulas. The following is our primary theorem.

I Theorem 23 (Primary theorem). Let φ = W

i∈[m]ψi be a normalized EP-formula; let

w = comb-width(φ). The formula φ is EPw-expressible, but (assuming w > 1) is not FOw−1-expressible.

Proof. Letσdenote the vocabulary ofφ. Letv1, . . . , v` denote the free variables ofφ. We

assume that`≥1 (otherwise, the theorem follows from Theorem 22).

We first establish thatφis EPw-expressible. It suffices to prove that, for eachi∈[m], the formulaψi is expressible using b=tw(C[ψi;σ;v1, . . . , v`]) + 1 variables. By definition

of treewidth, there exists a tree decomposition of C[ψi;σ;v1, . . . , v`] where each bag has

size b or less. It is readily verified that this tree decomposition is a tree decomposition of C[∃v1. . .∃v`ψ0i] (whereψi0 is derived from ψi as described in Definition 17) which has

(17)

Lemma 5.11 of [11, 12]. We give an explanation for the sake of completeness. Add a vertexr

adjacent tosto the tree and defineBr={v1, . . . , v`}. LetT be the resulting object, which

is straightforwardly verified to be a tree decomposition ofC[∃v1. . .∃v`ψi0] where each bag

has sizebor less.

We now describe how to construct the desired formula. Each variable ofψ_i0, other than

v1, . . . , v`, will be existentially quantified exactly once in the formula; and all atoms ofψ0i,

will appear in the formula. In this way, the constructed formula will be clearly logically equivalent toψ0_i, and henceψi.

Root the treeT atr; note thatrhas a single child,s.

For each non-root vertext ofT, we define a PP-formulaθtinductively, as follows. Define

θtas∃w1. . . wmθ0twherew1, . . . , wmis a list of variables that are in the bagBt oftbut

not in the bag oft’s parent, andθ_t0 is the conjunction ofθuover all childrenuoftand all

atomsR(z1, . . . , zk) ofψi0 where{z1, . . . , zk} ⊆Bt.

The desired formula isθs.

Observe (by induction) that, for each non-root vertext ofT, it holds that free(θ0t)⊆Btand

free(θt)⊆Bp where pis the parent oft. It follows that each formulaθt, and in particularθs,

has width≤b, and is thus PPb-expressible.

We now establish thatφis not FOw−1-expressible.

Ifw≤`, then it is straightforward to verify that the formulaφis not FOw−1-expressible, sincew−1 is strictly lower than the number of free variables ofφ. (We can remark that

w≥`, since each structureC[ψi;σ;v1, . . . , v`] interpretsUσ,` as{(v1, . . . , v`)}, and so the

treewidth of each such structure plus 1 is` or more.)

We thus assume in the sequel thatw > `. For eachi∈[m], we may assume without loss of generality that eachψiis prenexed and that in eachψi, none of the variablesv1, . . . , v`are

quantified. Defineψ+_i as in Definition 17; we then haveψ_i+=∃v1. . .∃v`(Uσ,`(v1, . . . , v`)∧ψi).

Defineφ+ _asW i∈[m]ψ

+

i . We have thatC[ψi;σ;v1, . . . , v`] =C[ψi+]. We have

comb-width(φ) = max

i∈[m]

(tw(C[ψi;σ;v1, . . . , v`])+1) = max i∈[m]

(tw(C[ψ+_i ])+1) =comb-width(φ+),

where the last equality holds by the note in Definition 20. Observe that

φ+= _

i∈[m]

ψ+_i = _

i∈[m]

∃v1. . .∃v`(Uσ,`(v1, . . . , v`)∧ψi)

≡ ∃v1. . .∃v` _

i∈[m]

(Uσ,`(v1, . . . , v`)∧ψi)≡ ∃v1. . .∃v`(Uσ,`(v1, . . . , v`)∧( _

i∈[m]

ψi))

=∃v1. . .∃v`(Uσ,`(v1, . . . , v`)∧φ).

Suppose, for a contradiction, thatφ is FOw−1-expressible. Then, φcan be expressed just using the variablesx1, . . . , xw−1 (recall the assumption that w > `, which gives w−1≥

`). Since φ+ _{≡ ∃}_v

1. . .∃v`(Uσ,`(v1, . . . , v`)∧φ), this immediately implies that φ+ can be

expressed just using the variables x1, . . . , xw−1, and that φ+ is FOw−1-expressible. As

w=comb-width(φ) =comb-width(φ+_{), this contradicts Theorem 22.}

J

I Corollary 24. For each k ≥ 1, FOk-expressibility and EPk-expressibility coincide on existential positive formulas; that is, an existential positive formula is FOk-expressible if and only if it isEPk-expressible.

(18)

that is logically equivalent toφ(which exists by Proposition 19). We have thatφ0 is FOk -expressible. We have thatk≥comb-width(φ0); if not, we would have 1≤k <comb-width(φ0), contradicting Theorem 23. It follows from Theorem 21 thatφ0 is EPk-expressible. _J

ICorollary 25. The problem of deciding FOk-expressibility ofEP-sentences isΠp₂-complete. By this, we refer to the problem of deciding, given an EP-sentenceφand an integer k≥1, whether φisFOk-expressible.

Proof. The problem of deciding, given an EP-sentenceφand an integerk≥1, whetherφis EPk-expressible, is Πp₂-complete by [6, Theorem 6]. The present corollary thus follows from

Corollary 24. _J

We conclude the article by explaining how the main result of [3] follows from our development. We first present the necessary definitions.

I Definition 26. Let σ be a relational vocabulary and letA and B be σ-structures. A partial homomorphism from A to B is a partial function h from A to B such that, for all R ∈ σ and all a1, . . . , aar(R) ∈ dom(h), it holds that (a1, . . . , aar(R)) ∈ RA implies

(h(a1), . . . , h(aar(R)))∈RB.

IDefinition 27. [21] Let k≥0. Aduplicator winning strategy in the existential k-pebble game on (A,B) is a family S of partial homomorphismshfromAtoBwith|dom(h)| ≤k

such that:

(E1) ∅ → ∅is in S.

(E2) Ifh∈ Sand|dom(h)|< k, then for everya∈Athere existsb∈Bsuch thath∪ {(a, b)} is in S.

(E3) Ifh∈ S anda∈dom(h), thenh|dom(h)\{a} is inS.

ITheorem 28. (Main theorem of [3]) LetA be a core on the relational vocabularyσ such that tw(A) ≥ k ≥ 1. There exists a σ-structure B such that A 6→ B and there exists a duplicator winning strategy in the existentialk-pebble game on (A,B).

To prove Theorem 28, we will make use of the following transitivity property.

I Lemma 29. Let k ≥ 0. If there exist duplicator winning strategies in the existential

k-pebble games on(A,B)and(B,C), then there exists a duplicator winning strategy in the existentialk-pebble game on(A,C).

Proof. Suppose that Gis a duplicator winning strategy in the existentialk-pebble game on (A,B), and that H is a duplicator winning strategy in the existentialk-pebble game on (B,C). LetF be the set containing each mapping of the formh(g) whereg∈G,h∈H, and

dom(h) =g(A). It is straightforward to verify thatF is a duplicator winning strategy in the

existential k-pebble game on (A,C). _J

We now give a proof of Theorem 28 using the construction of this article.

Proof of Theorem 28. LetAbe aσ-structure. By hypothesis,Ais a core andtw(A)≥k; so, by Theorem 6 there existσ-structuresBandB0, such thatA6→B,A→B0, and (via Theorem 5) the duplicator has a winning strategy in thek-pebble game onB andB0.