
4.2.4 Solving Threshold Games in Polynomial Space

In this section we conclude the proof of PSpace-membership of the threshold problem for parity games with costs. To this end, we first show that solving Gb is indeed equivalent to solving Gbf. Subsequently, we show how to solve Gbf on-the-fly given only G and b, i.e., without explicitly constructing it. This on-the-fly technique requires only polynomial space in the size of G and thus yields the desired result.

To show equivalence between Gb and Gbf, we employ a technique similar to that used in the proof of Theorem 4.8 (Sec. 4.1): We provide strategies for both players in Gb by simulating play prefixes in Gbf. As strategies in the latter game, however, only prescribe “useful” moves as long as the play prefix is not settled, we ensure that the play prefixes in Gbf remain unsettled.

We again split the proof of equivalence between Gb and Gbf into two lemmas. First, in Lemma 4.24, we show that Player 0 winning Gbf from some designated vertex implies her winning Gb from the same vertex. We then show the analogous result for Player 1 in Lemma 4.25. This suffices due to determinacy of Gbf.

Lemma 4.24. Let v∗ be a vertex of G. If Player 0 wins Gbf from (v∗, init(v∗)), then she wins Gb from (v∗, init(v∗)).

Proof. Let σ_f be a winning strategy for Player 0 in Gbf from (v∗, init(v∗)). We construct a winning strategy σ for her from (v∗, init(v∗)) in Gb by mimicking in Gbf the moves made in Gb, using a simulation function h that maps play prefixes in Gb to play prefixes in Gbf. We construct h to satisfy the following invariant:

Let π be consistent with σ and end in (v, o, r). Then, h(π) is consistent with σ_f, is unsettled, and ends in (v, o_f, r_f) with (o, r) ⊑ (o_f, r_f).

To this end, recall that Gb and Gbf share the set of vertices V′. We define h and σ inductively and simultaneously, starting with h((v∗, init(v∗))) = (v∗, init(v∗)), which clearly satisfies the invariant. Now let π be a play prefix of Gb consistent with σ, ending in (v, o, r) such that h(π) is defined. If (v, o, r) is a vertex of Player 1, then let (v∗, o∗, r∗) be an arbitrary successor of (v, o, r) in A′. Otherwise, i.e., if (v, o, r) is a vertex of Player 0, then, due to the invariant, h(π) ends in some (v, o_f, r_f) with (o, r) ⊑ (o_f, r_f). Let (v∗, o∗_f, r∗_f) be the unique vertex such that h(π) · (v∗, o∗_f, r∗_f) is consistent with σ_f and define σ(π) = (v∗, o∗, r∗), where (o∗, r∗) = upd((o, r), (v, v∗)). This concludes the definition of σ. In either case, let π∗ = π · (v∗, o∗, r∗). It remains to define h(π∗).

To this end, let (o∗_f, r∗_f) be the unique memory state such that π_f = h(π) · (v∗, o∗_f, r∗_f) is a play prefix of Gbf. If π_f is unsettled, we define h(π∗) = π_f. This choice satisfies the invariant: If the vertex (v∗, o∗_f, r∗_f) is the destination of a shortcut, then let (v∗, o∗_→, r∗_→) be the destination of its corresponding detour. We obtain (o∗_f, r∗_f) ⊒ (o∗_→, r∗_→) ⊒ (o∗, r∗) due to Lemma 4.16 and due to Remark 4.20.1. Otherwise, i.e., if (v∗, o∗_f, r∗_f) is not the destination of a shortcut, then Lemma 4.16 yields the invariant directly.

Now consider the case that π_f is settled. Due to the invariant and due to π_f being consistent with the winning strategy σ_f for Player 0, it is settled because it contains an even dominating cycle. We define h(π∗) by removing the settling dominating cycle as follows: Since h(π) is not settled, the dominating cycle is a suffix of π_f. Thus, the cycle starts in a vertex (v_{j′}, o_{j′}, r_{j′}) with v_{j′} = v∗ and r_{j′} ⊒ r∗_f. We define

h(π · (v∗, o∗, r∗)) = (v_0, o_0, r_0) ··· (v_{j′}, o_{j′}, r_{j′}),

which satisfies the invariant due to transitivity of ⊑, as stated in Remark 4.14.2.
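The case distinction defining h(π∗) in this proof can be collected in a single display. The following LaTeX block only restates the construction given in the text (it assumes the amsmath package for the cases environment); π_f denotes h(π) · (v∗, o∗_f, r∗_f), and j′ is the position at which the settling even dominating cycle starts.

```latex
\[
  h(\pi^*) =
  \begin{cases}
    \pi_f
      & \text{if } \pi_f \text{ is unsettled,} \\[2pt]
    (v_0, o_0, r_0) \cdots (v_{j'}, o_{j'}, r_{j'})
      & \text{if } \pi_f \text{ is settled by an even dominating cycle starting at position } j'.
  \end{cases}
\]
```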

It remains to show that σ is winning for Player 0 from (v∗, init(v∗)) in Gb. To this end, consider a play ρ starting in (v∗, init(v∗)) and consistent with σ and let π_j be the prefix of length j + 1 of ρ.

As all π_j start in (v_0, o_0, r_0) and are consistent with σ, all h(π_j) are consistent with σ_f due to the invariant of h. Since σ_f is winning for Player 0 from (v_0, o_0, r_0) in Gbf, this implies that the overflow counter of the h(π_j) never reaches n. Thus, again due to the invariant of h, neither does the overflow counter of the π_j. Hence, the colors of the last vertices of π_j and h(π_j) coincide for all j ∈ N.

Towards a contradiction, assume that the maximal color occurring infinitely often along ρ is odd, call it c. After some finite prefix, c cannot occur on even dominating cycles in the h(π_j) anymore, since each occurrence on such a cycle implies at least one occurrence of a larger even color in ρ. Hence, after this prefix, each time a vertex of color c is visited, say at the end of the prefix π_j, a vertex of the same color is appended to the simulated play h(π_j). Moreover, this vertex is never removed from the simulated play, since only vertices occurring on even dominating cycles are removed from the simulated play. Hence, the simulated play becomes longer with each visit to a vertex of color c after a finite prefix. This contradicts the h(π_j) being unsettled, as every play of length ℓ + 1 is settled due to Lemma 4.21. Thus, the maximal color occurring infinitely often in ρ is even, i.e., σ is winning for Player 0 in Gb from (v∗, init(v∗)).
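The bookkeeping behind the simulation function h can be pictured as a list that grows by one vertex per move and is cut back whenever the extension would be settled. The following Python sketch illustrates only this mechanism. The names `update_simulation`, `toy_is_settled`, and `toy_cycle_start` are hypothetical, and the toy notion of "settled" (the last vertex has occurred before) is an assumption standing in for the dominating-cycle condition of Gbf.

```python
from typing import Callable, List, Sequence, TypeVar

V = TypeVar("V")


def update_simulation(
    sim_prefix: Sequence[V],
    new_vertex: V,
    is_settled: Callable[[Sequence[V]], bool],
    cycle_start: Callable[[Sequence[V]], int],
) -> List[V]:
    """Extend the simulated prefix by one vertex, removing a settling cycle.

    Mirrors the two cases in the definition of h: keep the extension if it is
    unsettled, otherwise truncate it to the position where the cycle starts.
    """
    extended = list(sim_prefix) + [new_vertex]
    if not is_settled(extended):
        return extended
    j = cycle_start(extended)
    return extended[: j + 1]  # keep positions 0..j, drop the rest of the cycle


# Toy stand-ins (assumptions for illustration only): a prefix counts as
# "settled" once its last vertex has occurred before, and the cycle starts at
# the first occurrence of that vertex.
def toy_is_settled(prefix: Sequence[str]) -> bool:
    return prefix[-1] in prefix[:-1]


def toy_cycle_start(prefix: Sequence[str]) -> int:
    return list(prefix).index(prefix[-1])


sim = ["a"]
for vertex in ["b", "c", "b", "d"]:
    sim = update_simulation(sim, vertex, toy_is_settled, toy_cycle_start)
print(sim)  # ['a', 'b', 'd']
```

Since every settled extension is cut back immediately, the list stays unsettled throughout. In the proof, Lemma 4.21 bounds the length of unsettled prefixes, which is what rules out unbounded growth of the simulated play.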

Having shown that Player 0 can leverage a winning strategy from (v, init(v)) in Gbf in order to obtain a winning strategy from the same vertex in Gb, we now show the analogous statement for Player 1. This then implies equivalence of Gbf and Gb due to determinacy of Gbf (cf. Corollary 4.23).

Lemma 4.25. Let v∗ be a vertex of G. If Player 1 wins Gbf from (v∗, init(v∗)), then he wins Gb from (v∗, init(v∗)).

Proof. Let τ_f be a winning strategy from (v∗, init(v∗)) for Player 1 in Gbf. We construct a winning strategy τ for him from (v∗, init(v∗)) in Gb by simulating play prefixes in Gb by unsettled prefixes in Gbf from which we remove shortcut cycles and dominating cycles. We again define a simulation function h that maintains the following invariant:

Let π be consistent with τ and end in (v, o, r) with o < n. Then, h(π) is consistent with τ_f, is unsettled, and ends in (v, o_f, r_f) with (o_f, r_f) ⊑ (o, r).

We define h and τ inductively and simultaneously, starting with h((v∗, init(v∗))) = (v∗, init(v∗)), which clearly satisfies the invariant. Now let π be a play prefix of Gb consistent with τ and ending in (v, o, r). If (v, o, r) is a vertex of Player 0, then let (v∗, o∗, r∗) be an arbitrary successor of (v, o, r) in A′. Otherwise, if (v, o, r) is a vertex of Player 1, then, due to the invariant, h(π) = π′ ends in some (v, o_f, r_f) with (o_f, r_f) ⊑ (o, r). Let (v∗, o∗_f, r∗_f) be the unique vertex such that h(π) · (v∗, o∗_f, r∗_f) is consistent with τ_f and define τ(π) = (v∗, o∗, r∗), where (o∗, r∗) = upd((o, r), (v, v∗)). This concludes the definition of τ.

It remains to define the simulation function h. To this end, let π∗ = π · (v∗, o∗, r∗) and let (o∗_f, r∗_f) be the unique memory state such that π_f = h(π) · (v∗, o∗_f, r∗_f) is a play prefix of Gbf. First consider the case that π_f is unsettled. If (v∗, o∗_f, r∗_f) is not the destination of a shortcut, we define h(π∗) = π_f, which satisfies the invariant due to Lemma 4.16. If, however, (v∗, o∗_f, r∗_f) is the destination of a shortcut, let (v∗, o∗_→, r∗_→) be the destination of the corresponding detour. We differentiate whether taking the shortcut to (v∗, o∗_f, r∗_f) merely allows Player 1 to “catch up” to the play prefix constructed in Gb, or whether it is more advantageous for him than the position (v∗, o∗, r∗) actually reached in Gb. In the former case, i.e., if (o∗, r∗) ⊒ (o∗_f, r∗_f), we define h(π∗) = π_f, which satisfies the invariant by assumption. In the latter case, however, i.e., if (o∗, r∗) ⊒ (o∗_f, r∗_f) does not hold true, we remove the shortcut cycle similarly to the removal of a settling dominating cycle in the proof of Lemma 4.24, obtaining π′_f, and define h(π∗) = π′_f. This satisfies the invariant due to (o∗, r∗) ⊒ (o∗_→, r∗_→), which we obtain via Lemma 4.16 and Remark 4.14.2.

Now consider the case that π_f is settled. We distinguish two cases: If π_f is settled due to o∗_f = n, then, due to the invariant and Lemma 4.16, we obtain o∗ = n. Thus, the invariant of h is vacuously true and we define h(π∗) arbitrarily. If, however, π_f is settled due to reaching a dominating cycle, we remove this cycle from π_f similarly to the removal of dominating cycles in the proof of Lemma 4.24.
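For orientation, the five cases of the definition of h in this proof can be tabulated as follows. The LaTeX block is a restatement of the text above rather than an additional definition (it assumes the amsmath package); π_f denotes h(π) · (v∗, o∗_f, r∗_f), and "shortcut" abbreviates "(v∗, o∗_f, r∗_f) is the destination of a shortcut".

```latex
\[
  h(\pi^*) =
  \begin{cases}
    \pi_f
      & \pi_f \text{ unsettled, no shortcut,} \\
    \pi_f
      & \pi_f \text{ unsettled, shortcut, }
        (o^*, r^*) \sqsupseteq (o^*_f, r^*_f), \\
    \pi_f \text{ without the shortcut cycle}
      & \pi_f \text{ unsettled, shortcut, }
        (o^*, r^*) \not\sqsupseteq (o^*_f, r^*_f), \\
    \text{arbitrary}
      & \pi_f \text{ settled due to } o^*_f = n, \\
    \pi_f \text{ without the dominating cycle}
      & \pi_f \text{ settled due to a dominating cycle.}
  \end{cases}
\]
```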

It remains to show that τ is indeed winning for Player 1 from (v∗, init(v∗)) in Gb. To this end, consider a play ρ consistent with τ and let π_j be the prefix of length j + 1 of ρ. If the overflow counter along ρ eventually saturates, ρ is clearly winning for Player 1. Hence, assume that the overflow counter along ρ does not saturate. Then, due to the invariant of h, the colors of the last vertices of π_j and h(π_j) coincide for all j ∈ N.

Let c be the largest color occurring infinitely often along ρ and assume towards a contradiction that c is even. Similarly to the argument in the proof of Lemma 4.24, after some finite prefix, the color c may only occur on odd dominating cycles and on removed shortcut cycles, as these are the only play infixes that are removed from the simulation: If this is not the case, then a vertex with color c would be appended to the h(π_j) without ever being removed from the simulation. As the h(π_j) are unsettled due to the invariant of h, this unbounded growth contradicts the bounded length of unsettled play prefixes due to Lemma 4.21. Moreover, again analogously to the proof of Lemma 4.24, the color c can only occur finitely often on odd dominating cycles, as each such occurrence implies one occurrence of some larger, odd color. Hence, it remains to show that the color c does not occur infinitely often on removed shortcut cycles.

Towards a contradiction, assume that the color c occurs infinitely often on removed shortcut cycles. Since, by assumption, the overflow counter along ρ never saturates, none of the h(π_j) contains a saturated overflow counter either due to the invariant of h. Moreover, as both the removal of an odd dominating cycle and that of a shortcut retain the value of the overflow counter, the values of the overflow counter of the h(π_j) eventually stabilize. For all j ∈ N let

π_j = (v_0, o_0, r_0) ··· (v_j, o_j, r_j) as well as h(π_j) = (v^j_0, o^j_0, r^j_0) ··· (v^j_{k_j}, o^j_{k_j}, r^j_{k_j}).

Furthermore, pick the position p such that the overflow counter in both the play ρ as well as in the simulations h(π_j) has stabilized and such that no color larger than c occurs after position p. Formally, we pick p such that for all j > p we have o_p = o_j and o^p_{k_p} = o^j_{k_j} and such that c is the largest color occurring on the suffix of ρ starting at position p.

We show o^p_{k_p} = o_p by contradiction, i.e., we assume o^p_{k_p} ≠ o_p. Due to the invariant, we obtain o^p_{k_p} ≤ o_p, i.e., o^p_{k_p} < o_p. We claim that o^p_{k_p} < o_p implies that h(π_j) results from h(π_{j−1}) by removing a shortcut cycle only finitely often. In fact, after the position p, no shortcut cycle is removed anymore in this case: If, for some j > p, a shortcut is used in the move from h(π_{j−1}) to h(π_j), then (o_j, r_j) ⊒ (o^j_{k_j}, r^j_{k_j}), and thus the shortcut cycle is not removed. Hence, only finitely many shortcut cycles are removed, which contradicts the assumption of c occurring on infinitely many such cycles. Since we have o^p_{k_p} ≤ o_p due to the invariant of h, we obtain o^p_{k_p} = o_p, which implies r^p_{k_p} ⊑ r_p, again due to the invariant of h. In particular, for each j > p, for each relevant request for some color c′ that is open in r^j_{k_j}, a request for some color c′′ ≥ c′ is open in r_j.

Whenever c occurs on a removed shortcut cycle, then c must be smaller than the smallest relevant request that is open during that cycle: Otherwise it would answer that relevant request, due to c being even, and thus cause the detour corresponding to the infix to violate the shortcut condition. While there may be some requests for colors c′ < c in the infix corresponding to the shortcut cycle in ρ, visiting c does not answer all relevant requests in that corresponding infix in ρ, as argued above. This implies that traversing the shortcut cycle increases the cost of some request in ρ. Furthermore, since c is the maximal color visited in the considered suffix, one such request eventually causes an overflow after traversing at most b + 1 many edges of nonzero weight. This contradicts the choice of p such that no overflows occur after π_p.

If fewer than b + 1 edges of nonzero weight occur during the remainder of the play, then also at most b + 1 shortcuts occur, since each shortcut requires the traversal of at least one such edge. This in turn contradicts c occurring on infinitely many removed shortcut cycles.

Hence, we conclude that vertices of color c occur only finitely often on odd dominating cycles and on removed shortcut cycles. As these cycles are the only cycles that are removed from the simulation, almost all visited vertices of color c are added to the simulated play and are never removed. Thus, the h(π_j) grow increasingly longer. Such unbounded growth contradicts them being unsettled due to Lemma 4.21. This, in turn, contradicts the invariant of h.

Thus, the maximal color visited infinitely often during ρ is odd. Hence, ρ is winning for Player 1, i.e., τ is winning for him in Gb from (v∗, init(v∗)).

The combination of the above two lemmas together with determinacy of Gbf due to Corollary 4.23 yields the desired equivalence of Gb and Gbf. Moreover, as the winner of a play in Gbf is determined after at most polynomially many moves, Gbf can be solved by simulating it on an alternating Turing machine whose runtime is polynomially bounded. As such machines can be simulated using deterministic Turing machines with polynomially bounded space due to Chandra, Kozen, and Stockmeyer [CKS81], this yields PSpace membership of the threshold problem for parity games with costs.

Theorem 4.26. The following problem is in PSpace:

“Given a parity game with costs G, a vertex v of G, and a bound b ∈ N, does Player 0 have a strategy σ with Cost_v(σ) ≤ b?”

Proof. Let n be the number of vertices of G and let W be the largest weight occurring in G. If b ≥ nW, then the given problem reduces to solving G. As the problem of solving parity games with costs is in UP ∩ coUP due to Proposition 2.33 (Sec. 2.4), and since PSpace subsumes both UP and coUP, this concludes the proof for the case b ≥ nW.

If, however, b < nW, we show how to simulate the finite-duration game Gbf on an alternating Turing machine using the game semantics of such machines, i.e., two players construct a single path of a run of the machine. The existential and universal player take the roles of Player 0 and Player 1, respectively. The Turing machine keeps track of the complete prefix of the simulated play of Gbf.

Every vertex of the underlying arena of Gbf can be represented in polynomial size. Moreover, the length of the play is bounded from above by (log(nW) + 1)(n + 1)^6 due to Lemma 4.21. Thus, the Turing machine can keep track of the play prefix constructed thus far explicitly and check whether a vertex picked by either player is a valid continuation of this prefix. Moreover, the Turing machine can check whether a dominating cycle has occurred after each step in polynomial time. If the play is settled due to an even dominating cycle, the machine accepts; if it is settled otherwise, the machine rejects.
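To get a feeling for the size of the play-length bound, the expression (log(nW) + 1)(n + 1)^6 can be evaluated directly. The helper below is a hypothetical illustration; it assumes that the logarithm is taken to base 2 and rounded up, which the text does not specify.

```python
import math


def play_length_bound(n: int, max_weight: int) -> int:
    """Evaluate (log(n*W) + 1) * (n + 1)**6 for n vertices and largest weight W.

    Assumption: log is base 2 and rounded up to an integer.
    Requires n >= 1 and max_weight >= 1.
    """
    return (math.ceil(math.log2(n * max_weight)) + 1) * (n + 1) ** 6


print(play_length_bound(1, 1))  # 64
print(play_length_bound(4, 4))  # 78125
```

The value grows polynomially in n and only logarithmically in W, so it stays polynomial in the size of G even when weights are encoded in binary.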

This algorithm involves neither the explicit construction of Gb nor that of Gbf. Due to this construction, the Turing machine accepts G and b if and only if Player 0 wins Gbf and, due to Lemma 4.21, this machine terminates after polynomially many steps. Since polynomially time-bounded alternating Turing machines are equivalent to polynomially space-bounded classical Turing machines due to Chandra, Kozen, and Stockmeyer [CKS81], we obtain the desired result.
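The alternating simulation can be mirrored by a deterministic depth-first search that stores only the current play prefix. The following Python sketch shows this under simplifying assumptions: `player0_wins`, `toy_verdict`, and the three-vertex arena are hypothetical stand-ins, whereas in the proof the successors would be computed on the fly from G and b and the verdict would be the settledness check of Gbf.

```python
from typing import Callable, Dict, List, Optional, Sequence


def player0_wins(
    prefix: List[str],
    owner: Callable[[str], int],
    successors: Callable[[str], Sequence[str]],
    verdict: Callable[[Sequence[str]], Optional[bool]],
) -> bool:
    """Evaluate a finite-duration game by depth-first alternation.

    `verdict` returns True/False once the prefix is settled (True meaning
    Player 0 wins) and None while it is unsettled. Only the current prefix
    is kept in memory; it is extended and shrunk in place.
    """
    outcome = verdict(prefix)
    if outcome is not None:
        return outcome
    existential = owner(prefix[-1]) == 0  # Player 0 moves existentially
    for nxt in successors(prefix[-1]):
        prefix.append(nxt)
        result = player0_wins(prefix, owner, successors, verdict)
        prefix.pop()
        if result == existential:
            # Player 0 found a winning move, or Player 1 found a refuting one.
            return result
    return not existential


# Toy arena (assumption for illustration only).
EDGES: Dict[str, List[str]] = {"a": ["b", "c"], "b": ["a"], "c": ["c"]}
OWNER: Dict[str, int] = {"a": 0, "b": 1, "c": 1}


def toy_verdict(prefix: Sequence[str]) -> Optional[bool]:
    # Settled once the last vertex repeats; Player 0 wins iff that vertex is "c".
    if prefix[-1] in prefix[:-1]:
        return prefix[-1] == "c"
    return None


print(player0_wins(["a"], lambda v: OWNER[v], lambda v: EDGES[v], toy_verdict))  # True
```

The recursion depth equals the length of the play, so the memory used is proportional to the maximal play length times the size of a vertex, while the running time may be exponential. This trade-off corresponds to simulating a polynomially time-bounded alternating machine in polynomial space.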

This concludes our work on upper bounds on the complexity of the threshold problem for parity games with weights. We have argued in Theorem 4.12 (Sec. 4.1) that the general case is in ExpTime and we have shown in Theorem 4.26 that its complexity drops to PSpace when only considering parity games with costs. In the following section, we provide matching lower bounds.