Two-variable First-Order Logic with Counting in Forests

(1)

Volume 57, 2018, Pages 214–232

LPAR-22. 22nd International Conference on Logic for Programming, Artificial Intelligence and Reasoning

Two-variable First-Order Logic with Counting in Forests

Witold Charatonik

1∗

_{, Yegor Guskov}

2

_,

Ian Pratt-Hartmann

2,3

_{, and Piotr Witkowski}

1∗

1

Institute of Computer Science, University of Wroc law, Poland

[email protected] [email protected] 2 _{School of Computer Science, University of Manchester, UK}

[email protected] [email protected] 3 _{Instytut Informatyki, Uniwersytet Opolski}

Abstract

We consider an extension of two-variable, first-order logic with counting quantifiers and arbitrarily many unary and binary predicates, in which one distinguished predicate is interpreted as the mother-daughter relation in an unranked forest. We show that both the finite satisfiability and the general satisfiability problems for the extended logic are decidable inNExpTime. We also show that the decision procedure for finite satisfiability can be extended to the logic where two distinguished predicates are interpreted as the mother-daughter relations in two independent forests.

1 Introduction

Two-variable logics. The two-variable fragment of first-order logic, here denoted FO2, is the set of function-free, first-order formulas (with equality) featuring at most two variables. The two-variable fragment with counting, here denoted_C2, is the set of function-free, first-order formulas featuring at most two variables, but with the counting quantifiers_∃[_≤C], ∃[_≥C] and ∃[=C], (for every C ≥ 0) allowed. The following facts are known. The logic FO

2

has the finite model property [7], and its satisfiability (= finite satisfiability) problem isNExpTime -complete [5]. The logic_C2is expressive enough for the finite model property to fail; nevertheless, its satisfiability and finite satisfiability problems remainNExpTime-complete [6,8,9].

Contributions, techniques. Aforest is a well-founded directed graph (possibly infinite) in which each vertex has at most one incoming edge. It is impossible, in first-order logic, to express the fact that the graph of a given binary relation is a forest. This suggests enriching FO2 and_C2 _{by adding such a facility. Denote by}

C2_[

↓] and_C2_[

↓1,↓2] the extensions of C2 in which respectively one or two distinguished binary predicates are required to be interpreted as unranked forests. Within the logic_C2_[

↓] one can state that the graph of a given binary relation is connected, since every connected graph has a spanning tree.

(2)

A restriction of the logic _C2_[

↓1,↓2] was investigated in [4], in which the forests in question are required to have bounded rank: for some fixed numberm, no element of either forest has more thanmdaughters. It was shown that the finite satisfiability problem for this logic is in

NExpTime. In the present paper, the restriction to forests of bounded rank is lifted, without

affecting the complexity of finite satisfiability. It is also shown that the satisfiability problem for the logic_C2_[

↓] is in_NExpTime.

Theorem 1. The finite satisfiability problems for_C2[_↓]and_C2[_↓1,↓2] are inNExpTime.

Theorem 2. The satisfiability problem for_C2_[

↓] is in_NExpTime.

Theorem 1 strengthens the result of [4], while simplifying its proof. The key to this simplification is a closer look at the algorithm given in [10] for deciding finite satisfiability in_C2_, which the proof in [4] modifies. By undertaking a more fine-grained analysis of the complexity of this algorithm, we can reduce the problem of determining finite satisfiability of a formula in_C2[_↓] (or_C2[_↓1,↓2]) to that of determining the finite satisfiability of anexponentially larger C2-formula. This allows us to treat the result of [10] as, essentially, a black box, which was not possible in [4]. Theorem2 uses new techniques to reduce the general satisfiability problem for _C2[_↓] to the corresponding finite satisfiability problem. In the context of extensions of_C2, proving general satisfiability usually is not an easy task. In fact, most such extensions, including [4,3,1], do not touch this problem; an exception is [11]. Thus, the contributions of the present paper are: (a) we lift the earlier restriction to forests of bounded rank; and (b) we consider general satisfiability for_C2_[

↓] (thus allowing interpretation over a single, infinite forest). The techniques employed here also work for other extensions of_C2 _{over trees; in a separate paper, we prove the} decidability of the finite satisfiability problem for_C2_[

↓,_→], where the additional predicate_→is interpreted as the next-sister-relation.

Related work. The languages considered in the present paper contain both counting quantifiers and arbitrarily many binary predicates; however, they employ only a single nav-igational predicate, namely the mother-daughter relation_↓. When counting quantifiers are absent, and only unary predicates appear in the signature, we can extend the palette of navi-gational possibilities while retaining decidability of finite satisfiability. Thus, for example [2] considers the logic FO2and its guarded fragment, GF2, interpreted over a single finite tree, and accessed by various combinations of navigational predicates including↓(mother-daughter),↓+ (mother-descendant),→ (next-sister) and→+ _{(older-sister). Complexity of satisfiability for} these logics varies fromPSpacetoExpSpace. Note that all these logics allow vocabularies with arbitrarily many unary-, but no uninterpreted (i.e. non-navigational) binary predicates, and some of them additionally impose the unary alphabet restriction (exactly one predicate holds on each node of a structure). With these extended sets of navigational possibilities, the addition of both counting quantifiers and uninterpreted binary predicates is known to produce significant increases in complexity. Thus, for example, taking the logic FO2[_∗,0,↓,↓+_,

(3)

2 The finite satisfiability problem

Overview of the decision procedure. Although _C2 _{cannot express that a distinguished} predicatetM _{is a forest in every finite model}_M_{, it can express that the graph of}_tM _{consists of}

a number of trees and a number of cycles. Under certain conditions, by a simple rewiring of the model we may implant such a cycle into a tree thus removing the cycle; by repeating this procedure we remove all cycles one by one and obtain a model wheretis interpreted as a forest. Given an input_C2[_↓] formula ψwhose finite satisfiability is to be determined we produce an equisatisfiable _C2 formula ψS by adding to ψ some conjuncts that encode a forest. The

parameterS of ψS, called a shrubbery, is a description of tree components in a model, and

enables the implantation procedure. The shrubberyS is of size exponential in the signature ofψand can be simply guessed; this however givesψS of exponential size, which might lead to

an exponentially slower algorithm. One of contributions of this paper is the notion ofeffective size of a formula. We show thatψS has polynomial effective size and that the satisfiability of

such formulas can be tested without moving to a higher complexity class.

The implantation procedure is very similar to the one used in [4]. However, the techniques developed here not only lift the restriction in [4] to ranked trees, but also enable us to use the result of [10] directly, without having to reconstruct the proof given there (as was done in [4]).

2.1 Preliminaries

In the sequel,formulameans a formula of _C2_{. A formula}_ϕ_{is in}_{normal form} _{if it conforms to} the pattern:

∀x_∀y(x=y_∨α)_∧

m ^

h=1

∀x_∃[≺hCh]y(βh∧x6=y), (1)

whereαand theβhare quantifier-free, equality-free formulas,mis a positive integer, the≺h are

either = or_≤, and theCh are (bit-strings representing) non-negative integers. The integermis

themultiplicity ofϕ, and the integerC= max (C1, . . . , Cm) theceilingofϕ. Themodulus ofϕ

is the formulaθ=β1∨. . .∨βm. We assume without loss of generality thatC≥1. Observe

that if a_≺his = then ψis not satisfiable over any domain of cardinality less than or equalCh.

The following lemma uses a familiar technique originally employed in [12] in the context of FO2.

Lemma 1 ([10, Lemma 1]). Given a _C2-formula ϕ, we can compute, in polynomial time, a normal-form_C2-formulaψ, with ceilingC, such that, for any setA of cardinality greater thanC,

ψis satisfiable (in either_C2,_C2[_↓] or_C2[_↓1,↓2]) over the domainA if and only ifϕis.

In the sequel, we shall require fine control over the various parameters in a normal form formula, and in order to obtain this, we generalize the notion slightly. Letθbe a quantifier-free formula. A formulaψwith free variablexisθ-eclipsed if it is a Boolean combination of formulas which are either (i) quantifier-free or (ii) of the form_∃[./B]y χ, whereBis a non-negative integer, the symbol./ is chosen from_{≤,=,_≥}, andχis a quantifier-free formula such that_|=χ_→θ. A formulaψis in weak normal formif it conforms to the pattern:

ϕ_∧

` ^

g=1

∃[./gBg]x. ξg

∧ ∀x.η, (2)

whereϕis in normal form with modulusθ,B1, . . . , B` are non-negative integers,ξ1, . . . , ξ`are

quantifier-free formulas with free variablex, the symbols./1, . . . , ./` are chosen from{≤,=,≥},

(4)

ofϕrespectively. We take formulas that are trivially equivalent to (weak) normal form formulas to be in (weak) normal form by courtesy. Thus, any normal-form formula is automatically also in weak normal form.

The satisfiability and finite satisfiability problems for_C2_{are decidable and in fact}

NExpTime

-complete [6,8,9]. In this paper, we require fine control over both the parameters of the formula and the features of the model in question. Letψ be a (weak) normal-form_C2_{-formula over a} signature Σ having ceilingC and multiplicitym. Thesizeofψ, denoted_|ψ_|, is the number of symbols it contains (with quantifier subscripts coded in binary). We say that theeffective size ofψis the quantity_|Σ_|+ log(_|ψ_|) + log(mC).

Theorem 3. There exists a non-deterministic procedure which, given a_C2_-formula_ψ _{in weak} normal form, runs in time bounded by a fixed exponential function of the effective size ofψ, and which has a successful run if and only ifψ is finitely satisfiable.

Proof. An immediate corollary of the proof of [10, Theorem 1], in which the finite satisfiability of a normal-form formulaψwith multiplicitymand ceilingCis reduced to the solvability overNof a guessed system_E of Boolean combinations of linear inequalities, together with a check that_E verifies the satisfiability ofψ. The number of linear inequalities in_E is, by inspection [10, p. 51, formulas (C1)–(C6)], exponential in_|Σ_|+ log(mC). Furthermore, checking that_E verifies the satisfiability ofψrequires time polynomial in_|ψ|. Thus, the procedure runs in (nondeterministic) exponential time. The relaxation toweak normal form requires just two changes. First, to deal with the conjuncts_∃[./gBg]x. ξg, we must add to the systemE an additional collection of`

(in)equalities. Second, when checking that_E verifies the satisfiability ofψ, we take into account the eclipsed conjunct_∀x.η. Both changes are completely routine.

A forest is a directed graphG= (V, E) in whichE is inverse-functional and well-founded. That is: for allv_∈V there exists at most oneu_∈V such thatuEv, andV contains no infinite reverse chainsv0, v1, . . . withvi+1Evi for alli≥0. It follows from well-foundedness thatE is

anti-symmetric (for allu, v∈V,uEvimplies_¬vEu), and hence irreflexive. A connected forest is called atree.

In this section, we employ the distinguished binary predicate t whose interpretation, in the logic_C2_[

↓], is constrained to be a forest. In order to be able to discuss both_C2_and C2_[

↓] together, we call any structureAin which the graph (A,tA_{) is a forest}_{dendral. Thus,}

C2_[ ↓] is the logic_C2 _{restricted to dendral structures. The universe of a structure}_A_{is denoted}_A_.

For technical reasons we employ a further distinguished binary predicate,s, and two distin-guished unary predicates,s+_,_s−_{. We define the}_C2_{-formula ∆, featuring these predicates, to be} the conjunction ∆0∧∆1∧∆2∧∆3, where

∆0:=∀x∀y(t(x, y)→ ¬t(y, x))∧ ∀x∃[≤1]y.t(y, x) ∆1:=_∀x∀y(s(x, y)_→t(x, y))_{∧ ∀}x∃[_≤1]y.s(x, y) ∆2:=_∀x(s+₍_x₎

→(_∃[=1]y.s(x, y)∧ ∀y¬s(y, x))) ∆3:=_∀x((_∃[=1]y.s(y, x)∧ ∀y¬s(x, y))→s−(x)).

For any finite modelAof ∆0, the relationtAis anti-symmetric and inverse functional. Hence, it is easy to see that the graphG= (A,tA_{) consists of components which are either trees or which}

containt-cycles, namely sequences of distinct elementsa0, . . . , an−1 (n≥3) such that, writing

an=a0, we haveA|=t[ai, ai+1] for all i(0≤i < n). Elements belonging to tree-components

(5)

dendral. Any elementa_∈Asuch that there is nob withA_|=t[b, a] will be called aroot. By definition, elements not incident on anyt-edge (which form isolated vertices of G) are roots. It is obvious that all roots are dendral. Of course, ifAis dendral, thenA_|= ∆0, and, moreover, every element ofAis dendral.

Assuming that ∆0 holds, ∆1 then states that the graph ofsconsists of zero or more disjoint, linear sequences oft-edges. A maximal sequencea0, . . . , an (n≥1) such thatA|=s[ai, ai+1] for

alli(0_≤i < n), will be called ans-chain. The formula ∆2 ensures thats-chains begin with all elements satisfyings+_{, while ∆3} _{ensures that}_s_{-chains end only with elements satisfying}_s−_.

Any formula with two free variables is assumed to have those variables taken in the order

x, y. Thus, we writeA_|=θ[a, b], wherea, b are elements ofA, to indicate thatθ is satisfied inAunder the assignmenta7→xandb7→y. We denote byθ(y, x) the result of transposing the variables inθ. Finally, ifξhasxas its only free variable, we denote byξ(y) the result of replacingxbyy inξ.

Let Σ be a signature of unary and binary predicates. A1-typeis a maximal consistent set of literals over Σ involving only the variablex. Likewise, a2-typeis a maximal consistent set of literals over Σ involving only the variablesxandy and containing the literalx₆=y. Ifτ is a 2-type, we denote by τ−1 _{the 2-type obtained by exchanging the variables} _x_and_y _in_τ_{, and} callτ−1 _the_inverse _of_τ_{. We denote by tp}

1(τ) the 1-type obtained by removing fromτ any literals containingy; and we denote by tp₂(τ) the 1-type obtained by first removing from τ any literals containingx, and then replacing all occurrences ofy byx. Evidently, tp₂(τ) = tp₁(τ−1_). We equivocate freely between finite sets of formulas and their conjunctions; thus, we treat 1-types and 2-types as formulas, where convenient. LetAbe any structure interpreting Σ. If

a_∈A, then there exists a unique 1-type πsuch thatA_|=π[a]; we denoteπby tpA[a] and say thata realizesπ. If, in addition, b∈ A\ {a}, then there exists a unique 2-typeτ such that

A_|=τ[a, b]; we denoteτ by tpA[a, b] and say that the paira, brealizesτ. Evidently, in that case,

τ−1_{= tp}A_[_{b, a}_{]; tp}

1(τ) = tpA[a]; and tp2(τ) = tpA[b].

The following terminology helps us to characterize configurations of pairs of elements in structures. Let θ be a quantifier-free, equality-free formula over Σ, and A any structure interpreting Σ. A θ-ray in A is an ordered pair of distinct elements _ha, b_{i ∈} A2 _{such that}

A_|=θ[a, b]; we say that the ray in question isemitted byaandabsorbed byb, or simply thata

sends aθ-ray tob. Aθ-ray-type is a 2-typeρover Σ such that_|=ρ_→θ. (Thus, aθ-ray-type is the 2-type of some possibleθ-ray.) We refer to tp₁(ρ) and tp₂(ρ) as theemission-1-type and absorption-1-type ofρ, respectively. ForY a positive integer, we say thatAhasθ-degree Y if no element ofAemits more than Y θ-rays. As an illustration of these concepts, letψbe a formula over Σ of the form (1), and supposeA_|=ψ. Settingθ:=β1∨ · · · ∨βmandY =C1+· · ·+Cm,

we see thatAhasθ-degreeY. Aθ-ray_ha, b_iis said to beinvertibleif alsoA_|=θ[b, a]. Similarly, with ray-types: a θ-ray-typeρis said to beinvertible if_|=ρ−1

→θ. A 2-typeτ is said to be

θ-dark if neitherτ norτ−1 _{is a} _θ_-ray-type.

The following two lemmas can be established by simple counting arguments.

Lemma 2. Let θ be a quantifier-free, equality-free formula over a signature Σ. Let A be a structure interpretingΣ, ofθ-degreeY, and supposeB, B0_{are subsets of}_A_{of cardinality}₂₍_Y_+1).

Then there exista_∈B,b_∈B0 _{such that}_a₆₌_b _{and tp}A_[_{a, b}_] _is_θ_-dark.

Lemma 3. Let θ be a quantifier-free, equality-free formula over a signature Σ. Let A be a structure interpreting Σ, of θ-degree Y, and suppose B, B0 are subsets of A of cardinalities (3Y + 2)and2(Y + 1)respectively. Then, for anyb0∈A, there exista∈B,b∈B0, such that

a6=b, tpA_[_{a, b}_] _is_θ_{-dark, and} _b0 _{sends no}_θ_{-ray to}_a_.

(6)

elements that arise in them. LetAbe a structure interpreting signature Σ,X a positive integer andθa quantifier-free, equality-free formula over Σ. We callAX-differentiated if, for every 1-typeπover Σ, the setAπ={a∈A|tpA[a] =π} satisfies either|Aπ| ≤1 or|Aπ|> X. We

callA: (i)θ-semichromatic if no invertibleθ-ray has the same emission- and absorption 1-type; (ii) θ-chromatic if it is θ-semichromatic and no element emits two or more invertible θ-rays having the same absorption-type as each other; and (iii)θ-superchromatic if it isθ-semichromatic and no element emits two or moreθ-rays at least one of which is invertible, having the same absorption-type as each other. Note that aθ-semichromatic structureAisθ-chromatic if there is no triple of distinct elementsb1, a, b2, with tpA[b1] = tpA[b2], such that tpA[b1, a] and tpA[a, b2] are invertibleθ-ray-types; likewise,Aisθ-superchromatic if there is no triple of distinct elements

b1, a, b2, with tpA[b1] = tpA[b2], such that tpA[b1, a] is an invertibleθ-ray-type and tpA[a, b2] a

θ-ray-type. We can write the definitions of these concepts as various types of_C2_-formulas.

Lemma 4. LetAbe a structure interpreting a signature Σ, andθ a quantifier-free, equality-free formula over Σ. There exist θ-eclipsed formulas χ−θ, χθ and χ+θ such that: (i) A is

θ-semichromatic if and only ifA_|=_∀x.χ−θ,(ii)Aisθ-chromatic if and only ifA|=∀x.χθ; (iii) Aisθ-superchromatic if and only ifA_|=_∀x.χ+

θ. All formulas have size at mostO((|θ|+|Σ|)·2|

Σ|_).

For structures with boundedθ-degree Y,θ-(super)chromaticity and X-differentiation can be ensured by expanding the signature with a suitable collection of unary predicates:

Lemma 5. Let θ be a quantifier-free, equality-free formula interpreting a signature Σ, and supposeAis a structure interpretingΣwithθ-degreeY. ThenAcan be expanded to aθ-chromatic structure over a signature extendingΣwith at most_dlog(Y2₊₁₎

enew unary predicates; moreover,

Acan be expanded to a θ-superchromatic structure over a signature extending Σwith at most dlog(2Y2_{+ 1)}

enew unary predicates.

Lemma 6 ([10, Lemma 5]). Let Abe a structure and X a positive integer. Then Acan be expanded to anX-differentiated structureA0 by interpreting at most_dlogX_eadditional unary

predicates. IfAisθ-(super)chromatic, for someθ, then so isA0.

We now construct apparatus for describing the ‘local environment’ of elements in super-chromatic structures interpreting Σ. Letθ be a quantifier-free, equality-free formula over Σ, and let the θ-ray-types be listed in some fixed order (depending on Σ and θ) as ρ1, . . . , ρM.

A θ-star-type is an (M + 1)-tupleσ=_hπ, v1, . . . , vMi, whereπis a 1-type over Σ and the vj

are cardinal numbers (not-necessarily finite) such thatvj 6= 0 implies tp1(ρj) = π for all j

(1_≤ j _≤ M). We denote the 1-type π by tp(σ). To motivate this terminology, suppose A

is a structure interpreting Σ. For any a∈ A, we define stA_θ[a] =_htpA[a], v1, . . . , vMi, where

vj = |{b ∈ A : b 6= aand tpA[a, b] = ρj}|. Evidently, stAθ[a] is a θ-star-type; we call it the

θ-star-type of ainA, and say thatarealizesstA_θ[a]. Intuitively, the θ-star-type of an element records the number ofθ-rays of each type emitted by some element. It helps to think, informally, of aθ-star-typeσasemitting a collection ofθ-rays of various types, as shown in Fig.1b). To understand the significance ofθ-star-types, consider again the formulaψgiven in (1), and again let θ := β1∨ · · ·,∨βm. If A is a structure interpreting the signature of ψ, whether A |= ψ

is determined entirely by the 2-types and the θ-star-types realized in A. More formally, we say that a 2-typeτ iscompatible with ψ ifτ_∧α_∧α(y, x) is consistent; similarly a star-type

σ=_hπ, v1, . . . , vMiis compatible withψif (i) each of the ray-types emitted byσis compatible

(7)

a ρ b tpA_[_a_{] = tp}

1(ρ) tpA[b] = tp2(ρ)

(a)

π v1

v2

vM

ρ1

ρ2

ρM ρM

(b)

Figure 1: Depiction of: (a) an elementasending a ray of typeρto an elementb in a structure

A; and (b) a star-type _hπ, v1, v2, . . . , vMi, emittingvj rays of typeρj for allj (1≤j≤M).

respectively, equal toCh(if≺h is =) or bounded byCh (if≺h is≤):

m X

j=1

{vj |1≤j ≤M and |=ρj→βh} ≺hCh

Thus,A_|=ψ just in case all theθ-star-types andθ-dark 2-types realized in Aare compatible withψ. These notions are extended to a formulaψin weak normal form in the obvious way.

Aθ-star-typeσis: (i)semichromaticif it does not emit any invertibleθ-rays with absorption type tp(σ); (ii)chromaticif it is semichromatic and does not emit any two invertibleθ-rays that have the same absorption-type as each other;superchromatic if it is semichromatic and does not emit twoθ-rays, at least one of which is invertible, that have the same absorption-type as each other. Thus, a structure interpreting Σ isθ-(semi/super-) chromatic if and only if every

θ-star-type it realizes is (semi/super-) chromatic.

We finish these preliminaries with some notation for labelling elements in structures. Let ¯

d=d1, . . . , dn be a sequence of unary predicates. For all k (0≤k <2n), we abbreviate by

¯

d_hk_ithe quantifier-free, equality-free formulaδ1∧ · · · ∧δn, where, for allj (1≤j ≤n), δj is

dj(x) if thej-th bit in then-digit binary representation ofkis 1, and¬dj(x) otherwise. We call

¯

d_hk_i(x) thek-th labelling formula (over d1, . . . , dn). Evidently, ifA={a0, . . . , aM−1}is a set of cardinalityM _≤2n_{, then we can interpret the predicates in}_d

j (1≤j≤n) overA so as to

ensure that, for allk(0_≤k < M),ak satisfies ¯dhki.

2.2 Shrubberies and their logical encodings

Turning to the logic_C2_[

↓], for the rest of this section, we assume that all signatures feature the distinguished binary predicatet. Let Σ be a signature. A 2-type over Σ containing either of the atomst(x, y) ort(y, x) will be said to bearboreal. Ashrubberyover Σ is a tripleS= (V, E, L), where (V, E) is a non-empty, finite forest andLa labelling function defined onV _∪Esuch that:

(i) for allv∈V,L(v) is a 1-type (over Σ);

(ii) for all (u, v)_∈E,L(u, v) is either a 2-type (over Σ) containing the atomt(x, y) or is the special symbol $.

We refer to any edge labelled $ as aspecial edge; all other edges areordinary. We define _|S_|, thesizeofS, to be the cardinality of the setV, and we assume thatV is enumerated in some fixed way as_{v0, . . . , v_|S|−1}. For any positive integerX, we say thatS isX-differentiatedif, for every 1-typeπover Σ, either _|L−1₍_π₎

| ≤1 or_|L−1₍_π₎

(8)

By way of motivation, we note that, ifAis dendral, then we can always construct a shrubbery

S by taking any subgraphGof the forest (A,tA_{), with vertices and edges labelled by their}

1-and 2-types, 1-and then optionally collapsing some number of linear paths inGto single edges (i.e. taking a topological minor), labelling the newly-formed edges with the special symbol, $ (Fig.2). Indeed, ifAisX-differentiated, then, by selectingGappropriately, and being careful which chains we collapse, we can ensure thatS also is X-differentiated. This observation is formalized in Lemma7.

Suppose thatS= (V, E, L) is a shrubbery over Σ. We define a formula ∆S_{= ∆}S

1 ∧ · · · ∧∆S8 encoding S. Our formula features collectionsp1, . . . , pn, s1, . . . , sn of new unary predicates,

wheren=_dlog(_|S_|+ 1)_e, in addition to the predicatest, s, s+ _and_s− _{featured in ∆. Recall}

that ¯p_hk_i denotes the k-th labelling formula (0 _≤k <2n_{) over} _p

1, . . . , pn, and similarly for

¯

s_hk_i. For ease of reading, we present the conjuncts of ∆S _{using their English glosses under}

the (helpful) assumption that the formula ∆ is true; of course, however, these are really C2_{-formulas in weak normal form (modulo trivial logical manipulation). Thus, for instance, we} have ∆S1 ≡

V|S|−1

k=0 ∃[=1]x.p¯hki; ∆S2–∆S8 are purely universal. The complete list is as follows.

∆S

1: For 0≤k <|S|, there exists a unique element satisfying ¯phki, intuitively, thek-th element in the enumeration ofV.

∆S₂: No element corresponding to a root of the forest (V, E) has any incomingt-edges.

∆S3: The element corresponding to any vertex ofV has the 1-type given to that vertex byL, and the elements corresponding to any ordinary edge ofE have the 2-type given to that edge byL.

∆S

4: Any vertexusuch thathu, viis a special edge corresponds to an element satisfyings+ and hence starts ans-chain.

∆S

5: For 0 ≤ k < |S|, if the first element of an s-chain satisfies ¯phki, then all subsequent elements satisfy ¯s_hk_i.

∆S₆: If (vi, vj) is a special edge ofS, then any s-chain with non-initial elements satisfying ¯shii,

has final element satisfying ¯phji.

∆S

7: The only 1-types realized in the structure are those labelling the vertices ofS, while the only arboreal 2-types realized in the structure are those labelling the ordinary edges ofS.

∆S

8: The only 1-types realized more than once in the structure are those labelling more than one vertex ofS.

Observe that ∆S4, ∆S5 and ∆S6 together ensure that vertices ofSlinked by special edges correspond to elements ofAjoined bys-chains.

2.3 The reduction

We present a non-deterministic procedure,FinSatF(ψ), for determining whether a given weak normal-form_C2_[

↓]-formula,ψ, has a finite, dendral model. Sinceψ is in weak normal form, we may write it as

ϕ∧

` ^

g=1

∃[./gBg]x. ξg

(9)

where ϕ is a normal-form formula with multiplicity m, ceiling C, and modulus θ0_{. Define}

Y = mC+ 1 and X = 3Y + 2, and let Σ− be the signature of ψ (which we may assume contains the distinguished predicatet), and let Σ be Σ−together with_dlog(2Y2_{+ 1)}

e+_dlogX_e

fresh unary predicates. Letθ:=θ0_∨_t₍_{y, x}_{), and let}_χ+

θ be the formula of Lemma 4encoding

the property ofθ-superchromaticity over Σ. Recall the sentence ∆ governing the predicates

s, s+ _and_s− _{(which, we may assume, do not belong to Σ). A simple check shows that the}

formulaψS :=ψ∧ ∀x χ+θ ∧∆∧∆

S _{is in weak normal form with modulus}_θ

∨s(x, y)_∨s(y, x). (The additional disjuncts are required by ∆.) LetFinSat(_·) be the procedure for testing finite

satisfiability of a_C2-formula in weak normal form, as guaranteed by Theorem3. The procedure FinSatF(ψ) consists of two steps:

1. Non-deterministically guess an X-differentiated shrubbery S over Σ of size at most 5(X_·2|Σ|+ 24|Σ|), and compute the formula ψS:=ψ∧ ∀x χ+θ ∧∆∧∆

S_{, in weak normal}

form.

2. RunFinSat(ψS) and report the result.

Recall thatFinSatruns in time bounded by an exponential function of the effective size of its argument.

2.4 Correctness: direction 1

We show that, ifψ has a finite, dendral model, then the procedureFinSatF(ψ) has a successful run. We begin with a property secured by the formula ∆_∧∆S_.

Lemma 7. If Ais a finite,X-differentiated dendral structure interpretingΣ, then there exists anX-differentiated shrubberyS overΣ, of size at most 5(X_·2|Σ|_{+ 2}4|Σ|_{), such that}_A_{can be}

expanded to a model A+

|= ∆_∧∆S_.

Proof. Let L = 2|Σ| _{be the number of 1-types realized in} _A_{, and} _M _≤₂4|Σ| _{the number of}

arboreal 2-types realized inA. For every 1-type πrealized inAexactly once, mark the unique element satisfying π. For every 1-type π realized in A more than once, select X elements satisfyingπ, and mark them. For every arboreal 2-typeτ realized inA, pick distinct elements

a, brealizing it (in either direction), and mark those elements. Let V0 be the set of marked elements; bearing in mind that arboreal 2-types are never equal to their inverses, it is enough to pick_|V0| ≤XL+M elements. Let W be the set of elements of V0 together with all their ancestors in the forest (A,tA_{), and let}_F _{be the restriction of}_tA _to_W_{. Thus,}_H _{= (}_{W, F}_{) is}

a forest in which every leaf vertex is inV0, so thatH has at mostXL+M branches. Let V1 be the set of vertices inW having at least two daughters in H; thus, _|V1| ≤XL+M. Let

V2 be the set of daughters of any vertex ofV0∪V1 inH; thus|V2| ≤ |V0∪V1|+ (XL+M). LetV =V0∪V1∪V2; thus|V| ≤5(XL+M)≤5(X·2|Σ|+ 24|Σ|). LetV be enumerated as

v0, . . . , v|V|−1. Observe that, for allw∈W \V,whas exactly oneF-predecessor and exactly

oneF-successor. That is: the elements ofW\V correspond to linear strands in the forestH. Remembering thatV ⊆A, for every v∈V, defineL(v) = tpA_[_v_{]. Let}_E

0 be the restriction of F to V, and, for any edge_hu, vi ∈ E0, defineL(u, v) = tpA[u, v]. By construction of the graphH, L(u, v) contains the atomt(x, y). LetE1 be the set of ordered pairshu, vifromV such that there exists a path in the forestH of the formu=a0, . . . , am=v (m≥2) such that

for alli(1_≤i < m),ai ∈W \V. In that case, call the sequence of elements{a0, . . . , am} a

special chain, denotedS(u), and we defineL(u, v) = $. In other words, special chains are linear strands in the forestF leading from one element ofV to another, having no elements of V

(10)

•

ph0i s+

◦

•

ph4i s−

•

ph5i

•

ph8_i

• ph6i

• ph9_i

• ph7i

• ph10_i

• ph1i

• ph3i

• ph2i

A+

s

•

v0 _s+

•

v4 s−

•

v5

•

v8

• v6

• v9

• v7

• v10

• v1

• v3

• v2

S= (V, E, L)

Figure 2: Extraction of a shrubberyS from a dendral structureAand its encoding inA+_(proof of Lemma7). White nodes represent elements ofW \V, in this case forming a singles-chain.

shrubbery. Fig2shows a small example. Observe that the edges inE0 are ordinary edges ofS (labelled with 2-types), and those inE1,special edges(labelled with $).

Recalling the enumeration v0, . . . , v|V|−1 of V, we proceed to expand A to a model A+ satisfying ∆S_{. First we interpret the predicates} _p

1, . . . , pn so that vk satisfies the labelling

formula ¯p_hk_ifor allk(0_≤k <_|V_|), and so that all elements ofA_\V satisfy ¯p_h|V_|i. Recalling that each special edge (u, v) corresponds to a special chainu=a0, . . . , am=v, we take u=a0 to satisfy the unary predicates+_, _v ₌ _a

m to satisfy the unary predicate s−, and each pair

hai, ai+1i(0≤i < m) to satisfy the binary predicate s. Finally, we interpret the predicates

s1, . . . , sn so that each elementai (1≤i≤m) satisfies the labelling formula ¯shki, whereu=vk.

Thus, the ¯sindex of each element in a special chain (except the first) equals the ¯p-index of the first element. All elements ofAwhich are not members of special chains can be taken to satisfy ¯

s_h|V_|i. It is straightforward to check thatA+_|= ∆_∧∆S.

We are now in a position to show that, if ψ, as given in (3), has a finite, dendral model, then the procedure FinSatF(ψ) has a successful run. SupposeA− _|= ψ, withA− finite and dendral, interpreting the signature Σ−. Let Y = mC+ 1 and X = 3Y + 2. Since A− _|=ϕ,

A− hasθ-degree at mostY. By Lemmas5and6, we can expandA− to a θ-superchromatic,

X-differentiated structureAover Σ. Observe that Σ has the requisite number of spare unary predicates. By Lemma4,A_|=_∀x.χ+

θ. By Lemma7, letS be anX-differentiated shrubberySof

size at most 5(X_·2|Σ|_{+ 2}4|Σ|_{), such that}_A_{can be further expanded to a model}_A+

|= ∆_∧∆S_.

SinceψS has a finite model,FinSat(ψS) has a successful run, and so therefore doesFinSatF(ψ).

2.5 Correctness: direction 2

We show that, if the procedureFinSatF(ψ) has a successful run, thenψ has a finite, dendral model. Recall that, in a finite modelA_|= ∆0, an element is said to be dendral if it belongs to a tree-component of the graph (A,tA_{). Let}_X _{be a positive integer. We say that a finite model} A_|= ∆0 isX-viableif:

(i) every 1-type realized inAis realized by a dendral element;

(ii) every 1-type realized by more than one element in Ais realized by at least X dendral elements; and

(iii) every arboreal 2-type realized inAis realized by a pair of dendral elements.

Lemma 8. LetA+ _{be a finite model of} _∆

∧∆S_{, where} _S _{is an}_X_{-differentiated shrubbery over}

(11)

b′

·

a′

a

·

b

rewiring

b′

·

a′

a

·

b

(a) Noninvertible case

b′

·

a′ ·

a

b

rewiring

b′

·

a′ ·

a

b

(b) Invertible case

Figure 3: Model rewiring

Sketch proof. Write S= (V, E, L), enumerateV as_{v0, . . . , v|S|−1}, and letA be the domain ofA+. SinceA+_|= ∆S1, there exists, for eachk(0≤k <|S|), a uniquebk ∈Asatisfying ¯phki.

Thus, the mapι:vk 7→bk is an embedding ofV inA. The main idea of the proof is to show

that every element in the image ofι is dendral, making essential use of the formulas ∆, ∆2, ∆3, ∆4, ∆5and ∆6. TheX-viability ofA+ then follows almost immediately from ∆S1, ∆S3, ∆S7, ∆S8

and the assumedX-differentiation ofA.

Lemma 9. Let Σ be a signature, θ a quantifier-free, equality-free formula over Σ, and Aa finite,θ-superchromatic structure of θ-degreeY, interpreting Σ. LetX = 3Y + 2, and suppose

A is X-viable and contains at least one t-cycle. Then there exists an X-viable structure A0

interpretingΣover the same domain, realizing the same 2-types and the sameθ-star-types, and containing fewert-cycles.

Proof. Let _ha0_{, b}0_i _{be any edge of some} _t_{-cycle in} _A_{, and let} _π _{= tp}A_[_a0_], _π0 _{= tp}A_[_b0_{], and}

τ= tpA_[_a0_{, b}0_{]. We proceed to break the}_t_{-cycle containing}_a0 _and_b0_{, thus rendering all of its}

elements dendral. SinceA_|=t[a0_{, b}0_{] and}_t₍_{y, x}_{) is a conjunct in}_θ_,_τ−1 _{is necessarily a}_θ_-ray; thisθ-ray may be invertible or non-invertible.

We consider first the case where τ−1 _{is non-invertible. Since} _A_is _X_-viable,_π_and_π0 _are

realized more than once inA, and so must be realized at leastX times by dendral elements. SettingB to be the set of dendral elements of 1-typeπandB0 _{the set of dendral elements of}

1-typeπ0_{, by Lemma}₃_{, there exist dendral elements}_a_, _b_{, such that tp}A_[_a_{] =}_π_{, tp}A_[_b_{] =}_π₀_,

tpA[a, b] isθ-dark, andb0sends noθ-ray toa. Thus, tpA[a, b0] is either a non-invertibleθ-ray-type or isθ-dark. Indeed, sinceais dendral, butb0 is not, tpA[a, b0] is not arboreal. Thus, we may define the structureA0 to be exactly likeAexcept that

tpA0[a, b] = tpA[a, b0] tpA0[a, b0] = tpA[a0, b0] tpA0[a0, b0] = tpA[a, b],

as illustrated in Fig.3a. It is immediate thatA0 andArealize the same 2-types, and clear by inspection of Fig.3athat the star-type of every element is the same inA0 as inA. On the other hand, thet-cycle containinga0 _and_b0 _{has been broken in}_A0_{, and all its elements have become}

dendral.

We consider next the case where τ−1_{—and hence} _τ _{= tp}A_[_a0_{, b}0_{]—is an invertible} _θ

-ray-type. Since A is X-viable, let a, b be dendral elements such that tpA_[_{a, b}_{] =} _τ_{. Since} _A _is

θ-superchromatic, we know that tpA_[_{a, b}0_{] and tp}A_[_a0_{, b}_{] are}_θ_{-dark. In that case, we simply set}

tpA0[a, b] =tpA[a, b0] tpA0[a, b0] =τ

tpA0[a0, b] =τ tpA0[a0, b0] =tpA[a0, b],

(12)

other hand, thet-cycle containinga0 _and_b0 _{has again been broken in}_A0_{, and all its elements}

have become dendral.

We are now in a position to show that, if the procedure FinSatF(ψ) has a successful run, thenψhas a finite, dendral model. For supposeS is the shrubbery guessed in the first step: letA+ _{be a finite model of}_ψ

S, and letAbe the reduct ofA+ to Σ. SinceA+|= ∆∧∆S and

S is, by assumption, (3Y + 2)-differentiated, it follows by Lemma8thatAis (3Y + 2)-viable. SinceA+ _|=_∀x χ+θ, Ais alsoθ-superchromatic. Now apply Lemma 9to obtain the structure A0. Since this process leaves theθ-star-types of elements unchanged, and never causes dendral elements to become non-dendral,A0 is a θ-superchromatic, (3Y + 2)-viable structure. Thus, we may continue this process until we obtain a structureA∗containing no cycles at all. ThenA∗is the desired dendral model ofψ.

This shows that FinSatF(ψ) yields the correct result. To analyse the running time, let the multiplicity and ceiling of ψ bem andC, respectively, and let the signature of ψ be Σ. Examination of the size of the formulaψS then establishes the following lemma.

Lemma 10. Let ψ be a weak normal-form _C2[_↓]-formula with multiplicity m and ceiling C

over signatureΣ. We can non-deterministically compute a_C2-formulaψS in weak normal-form

with multiplicity mS and ceiling CS over signature ΣS, such that: (i) ψ has a finite dendral

model if and only if, for some run of the computation, ψS has a finite model; (ii) |ψS| is at

most_|ψ| ·2O(|Σ|)₍_mC₎O(1)_; _(iii) _m

S=m+ 2 andCS =C; (iv)|ΣS| isO(|Σ|+ log(mC)). The

computation ofψS requires time polynomial in |ψS|.

Lemmas1 and10and Theorem3imply the first part of Theorem 1.

2.6 Generalization to two forests

The above argument can be unproblematically extended to the logic _C2_[

↓1,↓2], where two distinguished predicates,t1 andt2 are, required to be interpreted as forests.

Lemma 11. The finite satisfiability problem for the logic_C2_[

↓1,↓2]is in NExpTime.

In the proof of this lemma we use the following well-known fact about graph-colouring.

Lemma 12. IfG= (A, E)is a directed graph with out-degreem, then the underlying undirected graph ofGhas a proper(2m+ 1)-colouring.

Sketch proof of Lemma11. Here, we sketch the principal differences to_C2[_↓]. Call any structure in which the predicates t1 and t2 are interpreted as forests dendral. Given a formula ψ of C2_[

↓1,↓2] in weak normal-form, we again construct a shrubbery together with a C2-formula

ψS =ψ∧ ∀x χ+θ ∧∆∧∆

S_{, such that}_ψ _{has a finite dendral model if and only if}_ψ

S has a finite

model. The notion of a shrubbery is modified so that it is the union of two (not necessarily disjoint) forests. In the case whereAhas a finite dendral model, a shrubbery is obtained as a subgraph of the coloured graph (A,tA

1,tA2), including sufficiently many ordinary edges, and with any very long connecting strands contracted to special edges. (We now need two types of special edges: one for each forest.) The formulas ∆ and ∆S _{are modified in the obvious way. In}

particular, the conjunct ∆0 of ∆ states that botht1 andt2 are irreflexive and inverse functional. The key idea behind the construction, given in Lemma9, then proceeds almost identically to the case_C2_[

(13)

for each i. That is, either a0, . . . , an−1 is a t2-cycle or an−1, . . . , a0 is. This means that we can break both thet1-cycle and thet2-cycle simultaneously at the elementsai andai+1, using

the argument of Lemma 9, which works unproblematically. If, on the other hand, for any

i (0_≤ i < n), A_6|= t2[ai, ai+1] and A6|= t2[ai+1, ai], then we can break the t1-cycle at that

point, takinga0₌_a_i_,_b0₌_a_i₊₁ _and_τ _{= tp}A_[_a0_{, b}0_{], again selecting dendral elements to absorb}

the relevantt1-ray. We must show that, in doing so, we create not2-cycles. If tp[b0_{, a}0_{] is an}

invertible θ-ray-type, then we select dendral elements a, b such that tp[a, b] = tp[a0_{, b}0_{]. By}

θ-superchromaticity, tpA0[a, b0] and tpA0[a0, b] areθ-dark, and so we may set

tpA0[a, b] =tpA[a, b0] tpA0[a, b0] =τ

tpA0[a0, b] =τ tpA0[a0, b0] =tpA[a0, b],

exactly as in the argument of Lemma9(Fig. 3b).

If, on the other hand, tp[ai+1, ai] = tp[b0, a0] is a non-invertibleθ-ray-type, a complication

arises. Following Lemma 9 (Fig. 3a), we wish to select dendral elements a, b such that tpA[a] = tpA[a0_{], tp}A_[_b_{] = tp}A_[_b₀_{], and tp}A_[_{a, b}_{] is}_θ_{-dark, and set}

tpA0[a, b] = tpA[a, b0] tpA0[a, b0] = tpA[a0, b0] tpA0[a0, b0] = tpA[a, b].

This does indeed have the desired effect of eliminating at1-cycle without introducing anyt2-cycle, providedb0 _{is not the}_t_{2-mother of}_a_{, that is, provided}_A_6|₌_t_2[_b0_{, a}_{]. This is important, because}

bmight be at2-ancestor ofa, in which case the assignment tpA0[a, b] = tpA[a, b0_{] would create a}

t2-loop.

Fortunately, we can easily prevent this situation from arising. Suppose that the_C2[_↓1,↓ 2]-formula ψ has a dendral model AA, and consider the graph (A, E) where E is the set of

ordered pairs_ha, cifor which there existsb∈A such that eitherA_|=t1[c, b] and A_|=t2[b, a] orA_|=t2[c, b] and A|=t1[b, a]. Since each node has at most one mother in each forest, this graph has degree at most 2, and so can be properly 5-coloured by Lemma12. Now letp1, . . . , p5 be fresh predicates encoding these colours, and letζ be a two-variable formula saying that the graph (A, E) is 5-coloured. Then, if ψ is replaced by ψ_∧ζ, it can never happen that there is a triple of elementsa, b0_, _a0 _with_A_|₌_t

1[a0, b0], A|=t2[b0, a] and tpA[a0] = tpA[a], and the construction of Lemma9goes through. On the other hand,ψ_∧ζ has a dendral model if and only ifψhas, which proves the theorem.

This is the second part of Theorem1. This argument does not work for three distinguished predicates t1, t2, and t3. The reason is that a t1-cycle may be composed entirely of t2- and

t3-edges that do not form parts oft2- andt3-cycles. In this case, the swapping construction of Lemma9, when used to eliminate at1 cycle, may create newt2- andt3-cycles. The argument then fails. It is not known whether the finite satisfiability problem for the logic_C2_[

↓1,· · ·,↓k] is

decidable for anyk_≥3; similarly for the satisfiability problem.

3 The general satisfiability problem

The purpose of this section is to prove Theorem 2: the satisfiability problem for_C2_[ ↓] is in

NExpTime. We proceed by reduction to the finite satisfiability problem forC2[↓]. For the

remainder of this section, we fix a _C2_[

(14)

dlog(2(mC+ 1)2_{+ 1)}

e+_dlog(3(mC+ 1))_eadditional unary predicates. Thus, by Lemmas5and6, any model ofϕinterpreting Σ0can be expanded to aθ-super-chromatic, 3(mC+ 1)-differentiated model ofϕinterpreting Σ. Henceforth, all 1-types and 2-types are to be understood as 1-types and 2-types over Σ. Moreover, since the formulaθwill not vary, we speak ofθ-ray-types,θ-dark 2-types,θ-star-types,θ-chromatic structures etc. simply asray-types, dark2-types, star-types, chromaticstructures etc.

Overview of the decision procedure. We reduce the general satisfiability of _C2_[

↓] to its finite satisfiability. The main idea is to explore regularity in infinite models of_C2_[

↓] formulas. Elements realizing star-types that occur only finitely often in the model, all their ancestors in the forest and elements absorbing rays from them are members of a finite “irregular” part called theinitial segment. The remainder of the model is represented as a finite, cyclic graph of star-types, thestar chart. The initial segment and rays connecting its elements with the remainder constitute a finite structure which is described by a translated_C2[_↓] formula. The translation involves reasoning about 1- and 2-types and its size is exponential in the size of an input formula. However, its effective size is again polynomial, which allows us to use the decision procedure for finite satisfiability of_C2_[

↓] in the previous section as a black box.

3.1 Star charts

We begin with some technical machinery for reasoning about the star-types realized infinitely often in models of_C2[_↓]-formulas. Recall that an arboreal ray-type is one containing either of the atomst(x, y) ort(y, x). Because we will be particularly concerned in the sequel with the directions oft-edges, the following terminology will be useful. Ifρis a ray-type containing the atomt(x, y), we callρBoreal, and ifρis a non-invertible ray-type containing the atom t(y, x), then we callρAustral. We use the termsBoreal rayandAustral rayin the obvious sense. Note that, since_|=t(y, x)_→θ, Boreal ray-types are necessarily invertible.

Ifσ is a star-type andρan invertible ray-type, we writeσ ρifσemits a ray of typeρ, andρ σifσemits a ray of typeρ−1 _{(or, as we might say: if}_σ_absorbs_{a ray of type}_ρ_{). If}_σ andσ0 are chromatic star-types, then there can be at most one invertible ray-typeρsuch that

σ ρandρ σ0_{. In that case we write}_σ₋_→ρ _σ0_{. If}_σ_,_σ0_,_σ00 _{are chromatic star-types and}_ρ_,_ρ0

are distinct invertible ray-types such thatσ00₋_→ρ _σ_and_σ00_−→ρ0 _σ0_{, we write}_σ00_→₍_σ_;_σ0_{). Notice}

that, in this case,σ,σ0 andσ00 must be distinct, by the chromaticity ofσ00. Astar chartis a set Ω of chromatic star-types with the property that, ifσ∈Ω andσ ρfor some invertible ray-typeρ, then there existsσ0 _∈_{Ω such that}_σ₋_→ρ _σ0_{. As we might say: star-charts absorb all}

the invertible ray-types they emit. A star-chart may be regarded as a directed graph in the following way: the vertices are star-types, and the edges are theBoreal ray-types which those star-types emit or absorb. Notice that the edges of this graph are directed by the predicate

t, not by the directedness of the ray-types: all ray-types labelling the edges of this graph are Boreal and hence by definition invertible.

Where a star chart Ω is clear from context, we writeσ_⇒σ0 _{if, for some}_m_≥_{0, there exists}

a sequence σ0, . . . , σm of star-types from Ω and a sequence ρ0, . . . , ρm₋1 of Boreal ray-types such thatσ=σ0,σ0 =σm andσi

ρi

−→σi+1 for alli(0≤i < m). Likewise, we writeσVσ0 if either: (i)σ_⇒σ0 _and_σ0_⇒_σ_{; or (ii) there exist}_σ00_{, σ}

a, σb∈Σ such thatσ⇒σ00,σ00→(σa;σb),

σa ⇒σandσb⇒σ0. Thus, whileσ⇒σ0 states that there is a path in Ω fromσtoσ0,σVσ0

states that there is a path in Ω from σ to itself which proceeds either via σ0_{, or via some}

(15)

σ _σ0 σ σ σ00 σa

σb

σ

σ0

Figure 4: The relationσVσ0 in a star chart.

Let Ω be a star chart. Atrek(through Ω) is a (possibly infinite) Ω-labelled treeT = (V, E, L), whereL:V _→Ω is the labelling function, such that, for every vertexvofT: (i) if (v, w)_∈E, then there exists a Boreal ray-typeρsuch that L(v)₋_→ρ L(w), and (ii) for every Boreal ray-type

ρemitted byL(v), there exists (v, w)_∈E such thatL(v)₋_→ρ L(w). IfT is a trek andσis the label of the root ofT, we say that theorigin of T isσ. Intuitively, a trek through Ω with origin

σis a tree of the possible routes that can be taken through the star-chart Ω starting atσ. A labelled tree satisfying only condition (i) is called apartial trek. A union of disjoint treks is called amulti-trek. Note that, for a star-chart Ω, there may be many treks with originσ∈Ω. This is because, ifρis a Boreal ray-type such thatσ ρ, there may be more than oneσ0∈Ω such thatρ σ0. Thus, we have more than one choice as to how to unfold the trek at this point, and similarly for the Boreal ray-types emitted by whicheverσ0 we choose.

Lemma 13. IfΩis a star chart andσ_∈Ω, then there exists a trek throughΩwith origin σ. In fact, any partial trek throughΩ can be extended to a trek.

Lemma 14. IfΩ is a star chart,Ξ0 _⊆Ω and σ∈Ω such that, for allσ0∈Ξ0, σVσ0, then there exists a trekT through Ωin which, for everyσ0∈Ξ0, there are infinitely many vertices labelled withσ0.

Lemma 15. Let Ωbe a star chart,σ, σ0 _∈_Ω_and _T _{a trek through}_{Ω. Suppose}_B _{is an infinite}

branch ofT such that (i)every vertex ofB has a descendant inT labelledσ0_{, and} _(ii)_σ _occurs

infinitely often in B. ThenσVσ0_.

Lemma 16. LetΩ be a star chart and ρa Boreal ray-type. Let Ωρ =_{σ_∈Ω_|ρ σ_}, and Ω?ρ ={σ∈Ω|for someσ0∈Ωρ,σ0Vσ}. If Ωρ 6=∅, then there is a trek T through Ω with originσ∗∈Ωρ such that, for allσ∈Ω?ρ,T has infinitely many vertices labelled with σ.

3.2 The reduction

With these preliminaries behind us, we present our promised non-deterministic exponential time Turing reduction of the satisfiability problem for_C2_[

↓] to the finite the satisfiability problem for C2_[

↓].

Recall thatϕis a_C2[_↓]-formula of the form (1) over a signature Σ0 with multiplicitymand ceiling C, thatθ is the formulaWm_h₌₁βh∨t(y, x), and that Σ is the signature ofϕtogether

with_dlog(2(mC+ 1)2+ 1)_e+_dlog(3(mC+ 1))_eadditional unary predicates. Recall also the formulaχθ stating that a Σ-structure is (θ-) chromatic. Let Ω be a set of star-types and ρan

invertible ray-type. As in Lemma16, we write Ωρ _{for the set of star-types in Ω that absorb a}

ray of typeρ. We write Ω◦ for the set of star-types in Ω that absorb noBoreal ray. We write Inv(Ω) for the set of invertible ray-types absorbed by some star-type in Ω, and Bor(Ω) for the set of Boreal ray-types absorbed by some star-type in Ω. Thus, Bor(Ω)_⊆Inv(Ω). Finally, we write tp(Ω) to denote the set_{tp(σ)_|σ_∈Ω_} of 1-types of the star-types in Ω. Aparameter set forϕis a tuple X=_hΩ,Ξ,ΠS,ΠNi, where Ω is a star chart, Ξ⊆Ω\Ω◦, and ΠS, ΠN, disjoint

(16)

(I1) for allσ_∈Ω, either there existsσ0 ∈Ω◦ such that σ0⇒σ, or there existsσ0∈Ξ such thatσ0Vσ;

(I2) ifσ_∈Ω andσ ρ, then tp₂(ρ)_∈ΠN ∪ΠS;

(I3) tp(Ω)_⊆ΠN;

(I4) ifσ_∈Ω andπ_∈ΠS, thenσemits at most one ray with absorption typeπ;

(I5) for allσ∈Ω and allπ∈ΠS, eitherσemits a non-invertible ray with absorption typeπor

there exists a dark 2-typeτ, compatible withϕ, such thatτ includes both tp(σ) andπ(y);

(I6) Every star-type in Ω is compatible withϕ_∧∆0 (and in particular emits at most one ray containing the atomt(y, x)).

By way of motivation, it helps to think of the various components of a parameter set in terms of a putative modelA_|=ϕ. Here, Ω is the star-chart consisting of those star-types that are realized infinitely often inA, while ΠS and ΠN are the sets of 1-types realized, respectively,

uniquely and more than once, inA. The construction of Ξ is more complicated; however, the idea can be explained roughly as follows. Consider the graphG= (A,tA_{). Now consider the}

subgraphH ofGobtained by removing a certain finite initial segment containing all elements whose star-types are realized only finitely many times, and then dropping all Austral rays (i.e. retaining only the Boreal rays). Thus,H is also a forest—which we might call theBoreal forest of A—every component of which is therefore a tree. Note that H may have infinitely many components. The initial segment is chosen in such a way that all star-types realized by the elements ofH are realized infinitely often inA. Of particular interest, however, are those star-types which are realized infinitely often inA, but in only finitely many components of H: these will need special treatment in our construction. The star-types in Ξ are the roots of these finitely many components ofH.

We will be working with the signature Σ ofϕtogether with a fresh unary predicate init. If

Ais a structure interpreting Σ_{∪ {}init_}, we call the set initA_⊆Atheinitial segment, and we call any ray from an element in the initial segment to an element outside the initial segment afrontierray. LetX =_hΩ,Ξ,ΠS,ΠNibe a parameter set for ϕ, then. We define a formula

ϕX:=χθ∧ψ0∧ · · · ∧ψ7 over the signature Σ∪ {init}, whereχθis as in Lemma 4, andψ0–ψ7,

may be glossed in English as follows.

ψ0: Every realized 2-type is compatible withϕ, and every star-type realized by an element of the initial segment is compatible withϕ.

ψ1: For everyρ∈Bor(Ξ), there is a frontier ray of this type.

ψ2: The star-types in Ω are able to absorb all the invertible frontier rays.

ψ3: If_ha, b_isatisfytandb is in the initial segment, then so isa.

ψ4: Every 1-type in ΠS is uniquely realized and is realized in the initial segment.

ψ5: Every 1-type in ΠN is realized at least 3(mC+ 1) times in the initial segment.

ψ6: Elements realizing 1-types in ΠS do not emit frontier rays.

ψ7: The 1-type of any element outside the initial segment is consistent with some star-type in Ω.

(17)

A0

T T∗

σ0∈Ω◦

Tσ

Boreal frontier rays privileged rays

Tρ Ta,a0

Figure 5: Constructing a modelBofϕfrom a finite model of ϕX (Lemma17).

Lemma 17. LetX be a parameter set for ϕ. If ϕX has a finite dendral model, thenϕhas a

dendral model.

Proof idea. SupposeA+_{is a finite dendral model of}_ϕ

X, whereX =hΩ,Ξ,ΠS,ΠNi. We build a

structureBover a domain consisting of the initial segmentA0ofA+together with the vertex-set

V of an infinite multi-trek T through Ω. To construct T, we take all star typesσ_∈ Ω and build treks originating with star-typesσ0(whose existence is guaranteed by condition(I1)) that either do not absorb Boreal rays (and thus come from Ω0_{) or do absorb a Boreal ray (and thus} come from Ξ). The former lead to the multi-trekT∗ _{in Fig.}₅_{. The latter, by formula}_ψ₁_{, are}

connected by Boreal frontier rays to the initial segment. We call these rays privileged; they lead to treksTρin Fig.5. The conditions(I1)–(I6) and the conjunctsψ0–ψ7 ensure that the rest

of the modelB can be built. In particular, for all remaining Boreal frontier rays we are able to take the emitting nodea, add a fresh nodea0 toV and construct a trekTa,a0. OverA₀,B

is the Σ-reduct ofA+_{, and thus 1-types of elements and 2-types of element pairs in}_A

0are all defined. OverV, we define Bin such a way that elements realize the star-types with which they are labelled in the multi-trek. In particular, condition(I1)ensures that every star type

σ_∈Ω is indeed realized infinitely often in B, as it is realized inT∗ _{or in}_T

ρ, for some Boreal

rayρ. This allows to establish the required invertible (non-arboreal) ray-types between elements ofT and invertible (both Boreal and non-arboreal) ray-types between elements ofA0andT. Austral ray-types betweenT andA0are all originating inT∗ and can be assigned an absorbtion site inA0, as every 1-type in ΠS∪ΠN is realized there. Non-invertible non-arboreal ray-types

betweenA0 andT can be set using the well-known cyclic construction. The remaining pairs of elements ofBcan be assigned dark 2-types. All 2-types assigned as well as star-types ofB are compatible withϕ.

Lemma 18. If ϕhas a dendral model, then there exists a parameter setX such that ϕX has a

finite, dendral model.

Proof idea. SupposeAis a dendral model ofϕ. We select a finite subsetA0 consisting (roughly) of those elements ofAwhose star-types are realized only finitely often, together with all their ancestors in the graph (A,tA_{). This subset is the initial segment of the constructed model} _B_.

(18)

defined in this way does not satisfy(I1), we recover this property by an appropriate expansion ofA0. Let ΠS and ΠN be the sets of 1-types realized inA, respectively, once and more than

once. We putX =_hΩ,Ξ,ΠS,ΠNi. We then build a finite dendral modelBofϕXover a domain

consisting ofA0 together with finitely many representatives of the star-types in Ω.

Lemma 19. LetX be a parameter set forϕ. IfϕX has a finite dendral model, then there exists

a parameter setX0 such that ϕX0 =ϕ_X andX0 is exponential in |ϕ|.

Proof idea. LetX =_hΩ,Ξ,ΠS,ΠNi. We begin with some definitions. Ajunction is any

star-typeλsuch that 1) there are at most fourθ-ray-typesρsuch that λ ρ, 2) among them there are at most two Boreal ray-types, at most one ray-type whose inverse is a Boreal ray-type and at most one other (i.e. non-arboreal) invertible ray-type. A junction is intended to represent some selected subset of information that an ordinary star-type captures. The information in question concerns: (i) an invertible ray-type connecting a node to its mother in a tree (a type whose inverse is a Boreal ray-type), provided that the mother exists, (ii) Boreal ray-types connecting the node to one or two of its daughters, and (iii) a ray-type connecting the node to another node in the structure. Ifσis any star-type compatible withϕandλis a junction, we say thatσ

is anexpansion of λif 1) for everyρsuch thatλ ρwe haveσ ρand 2) ifσ τ whereτ−1 is a Borealθ-ray-type thenλ τ. Note that there are at most exponentially many junctions over Σ, in contrast to the number of ordinary star-types compatible withϕ, which is doubly exponential in_|ϕ_|.

Ifλis a junction and Ψ is a set of star-types, define Ψλ={σ∈Ψ|σis an expansion ofλ}.

Recalling now the sets of star-types Ξ and Ω featured inX, from each non-empty Ξλ, where

λis a junction, select one representative and define Ξ0 as the set of all these representatives (asλ varies over all junctions); similarly, from each non-empty Ωλ select one representative

and and define Ω0 as the set consisting of all these representatives and of all elements of Ξ0. DefineX0 ₌_h_Ω0_,_Ξ0_,_Π_S_,_Π_N_i_{. As there are exponentially many junctions, both Ω}0 _{and Ξ}0 _are

exponential in _|ϕ_|. Thus X0 _{is also exponential in} _|_ϕ_|_{. It is a routine to show that} _X0 _{is a}

parameter set andϕX0 =ϕ_X.

This yields the sought-after procedure, SatF(ϕ), for determining the satisfiability of a normal-form_C2[_↓]-formulaϕ. The procedureSatF(ϕ) consists of two steps:

1. Non-deterministically guess a parameter setX of size exponential in_|ϕ|, and compute the formulaϕX in weak normal form.

2. RunFinSatF(ϕX) and report the result.

Recall that multiplicity ofϕismand ceiling ofϕisC. Examination of the construction ofϕX

establishes:

Lemma 20. We can non-deterministically compute a_C2_[

↓]-formulaϕX in weak normal-form

with multiplicitym and ceilingC over a signature ΣX, such that: (i) ϕhas a dendral model

if and only if, for some run of the computation, ϕX has a finite dendral model; (ii) |ϕX| is

O _|ϕ_{| · |}Σ_{| ·}2O(|Σ|)

; (iii) _|ΣX| is O(|Σ|+ log(mC)). The computation of ϕX requires time

polynomial in_|ϕX|.

(19)

References

[1] Bartosz Bednarczyk, Witold Charatonik, and Emanuel Kieronski. Extending two-variable logic on trees. In Valentin Goranko and Mads Dam, editors,26th EACSL Annual Conference on Computer

Science Logic, CSL 2017, August 20-24, 2017, Stockholm, Sweden, volume 82 ofLIPIcs, pages

11:1–11:20. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2017. URL:https://doi.org/10. 4230/LIPIcs.CSL.2017.11,doi:10.4230/LIPIcs.CSL.2017.11.

[2] Saguy Benaim, Michael Benedikt, Witold Charatonik, Emanuel Kieronski, Rastislav Lenhardt, Filip Mazowiecki, and James Worrell. Complexity of two-variable logic on finite trees.ACM Trans.

Comput. Log., 17(4):32:1–32:38, 2016. URL:http://dl.acm.org/citation.cfm?id=2996796.

[3] Witold Charatonik and Piotr Witkowski. Two-variable logic with counting and a linear order.

Logical Methods in Computer Science, 12(2), 2016. Extended abstract in CSL 2015. URL:

https://doi.org/10.2168/LMCS-12(2:8)2016,doi:10.2168/LMCS-12(2:8)2016.

[4] Witold Charatonik and Piotr Witkowski. Two-variable logic with counting and trees. ACM

Trans. Comput. Log., 17(4):31:1–31:27, 2016. Extended abstract in LICS 2013. URL: http:

//dl.acm.org/citation.cfm?id=2983622.

[5] E. Gr¨adel, P. Kolaitis, and M. Vardi. On the decision problem for two-variable first-order logic.

Bulletin of Symbolic Logic, 3(1):53–69, 1997.

[6] E. Gr¨adel, M. Otto, and E. Rosen. Two-variable logic with counting is decidable. In Logic in

Computer Science, pages 306–317. IEEE, 1997.

[7] M. Mortimer. On languages with two variables.Zeitschrift f¨ur Mathematische Logik und Grundlagen

der Mathematik, 21:135–140, 1975.

[8] L. Pacholski, W. Szwast, and L. Tendera. Complexity of two-variable logic with counting. InLogic

in Computer Science, pages 318–327. IEEE, 1997.

[9] I. Pratt-Hartmann. Complexity of the two-variable fragment with counting quantifiers. Journal of

Logic, Language and Information, 14(3):369–395, 2005.

[10] Ian Pratt-Hartmann. The two-variable fragment with counting revisited. In A. Dawar and R. de Queiroz, editors,WoLLIC, number 6188 in LNAI. Springer, 2010.

[11] Ian Pratt-Hartmann. Logics with counting and equivalence. In Thomas A. Henzinger and Dale Miller, editors,Joint Meeting of the Twenty-Third EACSL Annual Conference on Computer Science Logic (CSL) and the Twenty-Ninth Annual ACM/IEEE Symposium on Logic in Computer Science

(LICS), CSL-LICS ’14, Vienna, Austria, July 14 - 18, 2014, pages 76:1–76:10. ACM, 2014. URL:

http://doi.acm.org/10.1145/2603088.2603117,doi:10.1145/2603088.2603117.