Solving Threshold Games in Polynomial Space

4.2 Playing Parity Games with Costs Optimally in Polynomial Space

4.2.4 Solving Threshold Games in Polynomial Space

In this section we conclude the proof of PSpace-membership of the threshold problem for parity games with costs. To this end, we first show that solving G_b is indeed equivalent to solving G_bf. Subsequently, we show how to solve G_bf on-the-fly given only G and b, i.e., without explicitly constructing it. This on-the-fly technique requires only polynomial space in the size of G and thus yields the desired result.

To show equivalence between G_b and G_bf, we employ a technique similar to that used in the proof of Theorem 4.8 (Sec. 4.1, p. 91): We provide strategies for both players in G_b by simulating play prefixes in G_bf. As strategies in the latter game, however, only prescribe “useful” moves as long as the play prefix is not settled, we ensure that the play prefixes in G_bf remain unsettled.

We again split the proof of equivalence between G_b and G_bf into two lemmas. First, in Lemma 4.24, we show that Player 0 winning G_bf from some designated vertex implies her winning G_b from the same vertex. We then show the analogous result for Player 1 in Lemma 4.25. This suffices due to determinacy of G_bf.

Lemma 4.24. Let v∗ be a vertex of G. If Player 0 wins G_bf from (v∗, init(v∗)), then she wins G_b from (v∗, init(v∗)).

Proof. Let σ_f be a winning strategy for Player 0 in G_bf from (v∗, init(v∗)). We construct a winning strategy σ for her from (v∗, init(v∗)) in G_b by mimicking in G_bf the moves made in G_b, using a simulation function h mapping play prefixes in G_b to play prefixes in G_bf. We construct h to satisfy the following invariant:

Let π be consistent with σ and end in (v, o, r). Then, h(π) is consistent with σ_f, is unsettled, and ends in (v, o_f, r_f) with (o, r) ⊑ (o_f, r_f).

To this end, recall that G_b and G_bf share the set of vertices V′. We define h and σ inductively and simultaneously, starting with h(v∗, init(v∗)) = (v∗, init(v∗)), which clearly satisfies the invariant. Now let π be a play prefix of G_b consistent with σ, ending in (v, o, r) such that h(π) is defined. If (v, o, r) is a vertex of Player 1, then let (v∗, o∗, r∗) be an arbitrary successor of (v, o, r) in A′. Otherwise, i.e., if (v, o, r) is a vertex of Player 0, then, due to the invariant, h(π) ends in some (v, o_f, r_f) with (o, r) ⊑ (o_f, r_f). Let (v∗, o∗_f, r∗_f) be the unique vertex such that h(π) · (v∗, o∗_f, r∗_f) is consistent with σ_f and define σ(π) = (v∗, o∗, r∗), where (o∗, r∗) = upd((o, r), (v, v∗)). This concludes the definition of σ. In either case, let π∗ = π · (v∗, o∗, r∗). It remains to define h(π∗).

To this end, let (o∗_f, r∗_f) be the unique memory state such that π∗_f = h(π) · (v∗, o∗_f, r∗_f) is a play prefix of G_bf. If π∗_f is unsettled, we define h(π∗) = π∗_f. This choice satisfies the invariant: If the vertex (v∗, o∗_f, r∗_f) is the destination of a shortcut, then let (v∗, o∗_→, r∗_→) be the destination of its corresponding detour. We obtain (o∗_f, r∗_f) ⊒ (o∗_→, r∗_→) ⊒ (o∗, r∗) due to Lemma 4.16 (Sec. 4.2, p. 101) and due to Remark 4.20.1 (Sec. 4.2, p. 105). Otherwise, i.e., if (v∗, o∗_f, r∗_f) is not the destination of a shortcut, then Lemma 4.16 yields the invariant directly.

Now consider the case that π∗_f is settled. Due to the invariant and due to π∗_f being consistent with the winning strategy σ_f for Player 0, it is settled due to containing an even dominating cycle. We define h(π∗) by removing the settling dominating cycle as follows: Since h(π) is not settled, the dominating cycle is a suffix of π∗_f. Thus, the cycle starts in a vertex (v_j′, o_j′, r_j′) with v_j′ = v∗ and r_j′ ⊒ r∗_f. We define

h(π · (v∗, o∗, r∗)) = (v_0, o_0, r_0) · · · (v_j′, o_j′, r_j′),

which satisfies the invariant due to transitivity of ⊑, as stated in Remark 4.14.2 (Sec. 4.2, p. 100).
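The inductive definition of σ and h can be illustrated by a minimal Python sketch. It is not the construction from the text itself: `sigma_f`, `upd` and `find_cycle_start` are hypothetical callables standing in for the strategy σ_f, the memory update upd, and a test for a settling dominating cycle, none of which are defined in this section. The function `toy_cycle_start` is only a toy stand-in for that test, used to exercise the code.

```python
from typing import Callable, Hashable, List, Optional, Tuple

Vertex = Tuple[Hashable, int, Hashable]  # (v, o, r): arena vertex, overflow counter, requests
Prefix = List[Vertex]


def player0_move(pi: Prefix, sim: Prefix,
                 sigma_f: Callable[[Prefix], Vertex],
                 upd: Callable) -> Vertex:
    """sigma(pi): copy the arena move that sigma_f prescribes on sim = h(pi)."""
    v, o, r = pi[-1]
    v_star, _, _ = sigma_f(sim)                 # only the arena vertex is copied
    o_star, r_star = upd((o, r), (v, v_star))   # memory is updated as in G_b
    return (v_star, o_star, r_star)


def extend_simulation(sim: Prefix, new_vertex_f: Vertex,
                      find_cycle_start: Callable[[Prefix], Optional[int]]) -> Prefix:
    """Compute h(pi*) from sim = h(pi) and the next vertex in G_bf.

    find_cycle_start returns the index j' at which a settling dominating
    cycle (a suffix of the extended prefix) starts, or None if unsettled.
    """
    extended = sim + [new_vertex_f]
    j = find_cycle_start(extended)
    if j is None:
        return extended          # unsettled: keep the extended prefix
    return extended[: j + 1]     # settled: cut back to the start of the cycle


def toy_cycle_start(prefix: Prefix) -> Optional[int]:
    """Toy stand-in (assumption): a cycle closes when the arena vertex repeats."""
    last_v = prefix[-1][0]
    for j, (v, _, _) in enumerate(prefix[:-1]):
        if v == last_v:
            return j
    return None
```

Cutting back to index j′ keeps the simulated prefix unsettled and leaves it ending in a vertex with the same arena component v∗ as the vertex just reached, which is what the invariant requires.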

It remains to show that σ is winning for Player 0 from (v∗, init(v∗)) in G_b. To this end, consider a play ρ starting in (v∗, init(v∗)) and consistent with σ and let π_j be the prefix of length j + 1 of ρ.

As all π_j start in (v_0, o_0, r_0) and are consistent with σ, all h(π_j) are consistent with σ_f due to the invariant of h. Since σ_f is winning for Player 0 from (v_0, o_0, r_0) in G_bf, this implies that the overflow counter of the h(π_j) never reaches n. Thus, again due to the invariant of h, neither does the overflow counter of the π_j. Hence, the colors of the last vertices of π_j and h(π_j) coincide for all j ∈ N.

Towards a contradiction, assume that the maximal color occurring infinitely often along ρ is odd, call it c. After some finite prefix, c cannot occur on even dominating cycles in the h(π_j) anymore, since each occurrence on such a cycle implies at least one occurrence of an even higher even color in ρ. Hence, after this prefix, each time a vertex of color c is visited, say at the end of the prefix π_j, a vertex of the same color is appended to the simulated play h(π_j). Moreover, this vertex is never removed from the simulated play, since only vertices occurring on even dominating cycles are removed from the simulated play. Hence, the simulated play becomes longer with each visit to a vertex of color c after a finite prefix. This contradicts the h(π_j) being unsettled, as every play of length ℓ + 1 is settled due to Lemma 4.21 (Sec. 4.2, p. 106). Thus, the maximal color occurring infinitely often in ρ is even, i.e., σ is winning for Player 0 in G_b from (v∗, init(v∗)).

Having shown that Player 0 can leverage a winning strategy from (v, init(v)) in G_bf in order to obtain a winning strategy from the same vertex in G_b, we now show the analogous statement for Player 1. This then implies equivalence of G_bf and G_b due to determinacy of G_bf (cf. Corollary 4.23).

Lemma 4.25. Let v∗ be a vertex of G. If Player 1 wins G_bf from (v∗, init(v∗)), then he wins G_b from (v∗, init(v∗)).

Proof. Let τ_f be a winning strategy from (v∗, init(v∗)) for Player 1 in G_bf. We construct a winning strategy τ for him from (v∗, init(v∗)) in G_b by simulating play prefixes in G_b by unsettled prefixes in G_bf from which we remove shortcut cycles and dominating cycles. We again define a simulation function h that maintains the following invariant:

Let π be consistent with τ and end in (v, o, r) with o < n. Then, h(π) is consistent with τ_f, is unsettled, and ends in (v, o_f, r_f) with (o_f, r_f) ⊑ (o, r).

We define h and τ inductively and simultaneously, starting with h((v∗, init(v∗))) = (v∗, init(v∗)), which clearly satisfies the invariant. Now let π be a play prefix of G_b consistent with τ and ending in (v, o, r). If (v, o, r) is a vertex of Player 0, then let (v∗, o∗, r∗) be an arbitrary successor of (v, o, r) in A′. Otherwise, if (v, o, r) is a vertex of Player 1, then, due to the invariant, h(π) = π′ ends in some (v, o_f, r_f) with (o_f, r_f) ⊑ (o, r). Let (v∗, o∗_f, r∗_f) be the unique vertex such that h(π) · (v∗, o∗_f, r∗_f) is consistent with τ_f and define τ(π) = (v∗, o∗, r∗), where (o∗, r∗) = upd((o, r), (v, v∗)). This concludes the definition of τ.

It remains to define the simulation function h. To this end, let π∗ = π · (v∗, o∗, r∗) and let (o∗_f, r∗_f) be the unique memory state such that π∗_f = h(π) · (v∗, o∗_f, r∗_f) is a play prefix of G_bf. First consider the case that π∗_f is unsettled. If (v∗, o∗_f, r∗_f) is not the destination of a shortcut, we define h(π∗) = π∗_f, which satisfies the invariant due to Lemma 4.16 (Sec. 4.2, p. 101). If, however, (v∗, o∗_f, r∗_f) is the destination of a shortcut, let (v∗, o∗_→, r∗_→) be the destination of the corresponding detour. We differentiate whether taking the shortcut to (v∗, o∗_f, r∗_f) merely allows Player 1 to “catch up” to the play prefix constructed in G_b, or whether it is more advantageous for him than the position (v∗, o∗, r∗) actually reached in G_b. In the former case, i.e., if (o∗, r∗) ⊒ (o∗_f, r∗_f), we define h(π∗) = π∗_f, which satisfies the invariant by assumption. In the latter case, however, i.e., if (o∗, r∗) ⊒ (o∗_f, r∗_f) does not hold true, we remove the shortcut cycle similarly to the removal of a settling dominating cycle in the proof of Lemma 4.24, obtaining π_f, and define h(π∗) = π_f. This satisfies the invariant due to (o∗, r∗) ⊒ (o∗_→, r∗_→), which we obtain via Lemma 4.16 (Sec. 4.2, p. 101) and Remark 4.14.2 (Sec. 4.2, p. 100).

Now consider the case that π∗_f is settled. In this case, we distinguish two cases: If π∗_f is settled due to o∗_f = n, then, due to the invariant and Lemma 4.16, we obtain o∗ = n. Thus, the invariant of h is vacuously true and we define h(π∗) arbitrarily. If, however, π∗_f is settled due to reaching a dominating cycle, we remove this cycle from π∗_f similarly to the removal of dominating cycles in the proof of Lemma 4.24.
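The case distinction in the definition of h for Player 1 can be summarized as a small decision function. The following Python sketch is only a reading aid under stated assumptions: the four Boolean inputs are hypothetical flags that a caller would have to compute from π∗_f and (o∗, r∗), and the names `Step` and `simulation_step` do not occur in the text.

```python
from enum import Enum, auto


class Step(Enum):
    APPEND = auto()                   # h(pi*) = pi*_f
    REMOVE_SHORTCUT_CYCLE = auto()    # cut the shortcut cycle out of pi*_f
    REMOVE_DOMINATING_CYCLE = auto()  # cut the settling dominating cycle
    ARBITRARY = auto()                # invariant holds vacuously, since o* = n


def simulation_step(settled_by_overflow: bool,
                    settled_by_cycle: bool,
                    via_shortcut: bool,
                    actual_dominates: bool) -> Step:
    """Select how h(pi*) is obtained from pi*_f.

    actual_dominates is True iff (o*, r*) dominates (o*_f, r*_f), i.e. the
    shortcut merely lets Player 1 catch up with the prefix in G_b.
    """
    if settled_by_overflow:
        return Step.ARBITRARY
    if settled_by_cycle:
        return Step.REMOVE_DOMINATING_CYCLE
    if via_shortcut and not actual_dominates:
        return Step.REMOVE_SHORTCUT_CYCLE
    return Step.APPEND
```

Only the two removal steps shorten the simulated prefix, which is why the later parts of the proof only have to bound how often color c can occur on removed dominating cycles and removed shortcut cycles.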

It remains to show that τ is indeed winning for Player 1 from (v∗, init(v∗)) in G_b. To this end, consider a play ρ consistent with τ and let π_j be the prefix of length j + 1 of ρ. If the overflow counter along ρ eventually saturates, ρ is clearly winning for Player 1. Hence, assume that the overflow counter along ρ does not saturate. Then, due to the invariant of h, the colors of the last vertices of π_j and h(π_j) coincide for all j ∈ N.

Let c be the largest color occurring infinitely often along ρ and assume towards a contradiction that c is even. Similarly to the argument in the proof of Lemma 4.24, after some finite prefix, the color c may only occur on odd dominating cycles and on removed shortcut cycles, as these are the only play infixes that are removed from the simulation: If this is not the case, then a vertex with color c would be appended to the h(π_j) without ever being removed from the simulation. As the h(π_j) are unsettled due to the invariant of h, this unbounded growth contradicts the bounded length of unsettled play prefixes due to Lemma 4.21. Moreover, again analogously to the proof of Lemma 4.24, the color c can only occur finitely often on odd dominating cycles, as each such occurrence implies one occurrence of some larger, odd color. Hence, it remains to show that the color c does not occur infinitely often on removed shortcut cycles.

Towards a contradiction, assume that the color c occurs infinitely often on removed shortcut cycles. Since, by assumption, the overflow counter along ρ never saturates, none of the h(π_j) contains a saturated overflow counter either due to the invariant of h. Moreover, as both the removal of an odd dominating cycle and that of a shortcut retain the value of the overflow counter, the values of the overflow counter of the h(π_j) eventually stabilize. For all j ∈ N let

π_j = (v_0, o_0, r_0) · · · (v_j, o_j, r_j)

as well as

h(π_j) = (v^j_0, o^j_0, r^j_0) · · · (v^j_{k_j}, o^j_{k_j}, r^j_{k_j}).

Furthermore, pick the position p such that the overflow counter in both the play ρ as well as in the simulations h(π_j) has stabilized and such that no color larger than c occurs after position p. Formally, we pick p such that for all j > p we have o_p = o_j and o^p_{k_p} = o^j_{k_j} and such that c is the largest color occurring on the suffix of ρ starting at position p.

We show o^p_{k_p} = o_p by contradiction, i.e., we assume o^p_{k_p} ≠ o_p. Due to the invariant, we obtain o^p_{k_p} ≤ o_p, i.e., o^p_{k_p} < o_p. We claim that o^p_{k_p} < o_p implies that h(π_j) results from h(π_{j−1}) by removing a shortcut cycle only finitely often. In fact, after the position p, no shortcut cycle is removed anymore in this case: If, for some j > p, a shortcut is used in the move from h(π_{j−1}) to h(π_j), then (o_j, r_j) ⊒ (o^j_{k_j}, r^j_{k_j}) holds, and thus the shortcut cycle is not removed. Hence, only finitely many shortcut cycles are removed, which contradicts the assumption of c occurring on infinitely many such cycles. Since we have o^p_{k_p} ≤ o_p due to the invariant of h, we obtain o^p_{k_p} = o_p, which implies r^p_{k_p} ⊑ r_p, again due to the invariant of h. In particular, for each j > p, for each relevant request for some color c′ that is open in r^j_{k_j}, a request for some color c″ ≥ c′ is open in r_j. Whenever c occurs on a removed shortcut cycle, then c must be smaller than the smallest relevant request that is open during that cycle: Otherwise it would answer that relevant request, due to c being even, and thus cause the detour corresponding to the infix to violate the shortcut condition. While there may be some requests for colors c′ < c in the infix corresponding to the shortcut cycle in ρ, visiting c does not answer all relevant requests in that corresponding infix in ρ, as argued above. This implies that traversing the shortcut cycle increases the cost of some request in ρ. Furthermore, since c is the maximal color visited in the considered suffix, one such request eventually causes an overflow after traversing at most b + 1 many edges of nonzero weight. This contradicts the choice of p such that no overflows occur after π_p.

If fewer than b + 1 edges of nonzero weight occur during the remainder of the play, then also at most b + 1 shortcuts occur, since each shortcut requires the traversal of at least one such edge. This in turn contradicts c occurring on infinitely many removed shortcut cycles.

Hence, we conclude that vertices of color c occur only finitely often on odd dominating cycles and on removed shortcut cycles. As these cycles are the only cycles that are removed from the simulation, almost all visited vertices of color c are added to the simulated play and are never removed. Thus, the h(π_j) grow increasingly longer. Such unbounded growth contradicts them being unsettled due to Lemma 4.21. This, in turn, contradicts the invariant of h.

Thus, the maximal color visited infinitely often during ρ is odd. Hence, ρ is winning for Player 1, i.e., τ is winning for him in G_b from (v∗, init(v∗)).

The combination of the above two lemmas together with determinacy of G_bf due to Corollary 4.23 (Sec. 4.2, p. 111) yields the desired equivalence of G_b and G_bf. Moreover, as the winner of a play in G_bf is determined after at most polynomially many moves, G_bf can be solved by simulating it on an alternating Turing machine whose runtime is polynomially bounded. As such machines can be simulated using deterministic Turing machines with polynomially bounded space due to Chandra, Kozen, and Stockmeyer [CKS81], this yields PSpace membership of the threshold problem for parity games with costs.

Theorem 4.26. The following problem is in PSpace:

“Given a parity game with costs G, a vertex v of G, and a bound b ∈ N, does Player 0 have a strategy σ with Cost_v(σ) ≤ b?”

Proof. Let n be the number of vertices of G and let W be the largest weight occurring in G. If b ≥ nW, then the given problem reduces to solving G. As the problem of solving parity games with costs is in UP ∩ coUP due to Proposition 2.33 (Sec. 2.4, p. 26), and since PSpace subsumes both UP and coUP, this concludes the proof for the case b ≥ nW.
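The containment used in this case follows from a standard chain of inclusions between complexity classes, stated here only for reference and not taken from the text:

```latex
\mathrm{UP} \cap \mathrm{coUP}
  \;\subseteq\; \mathrm{NP} \cap \mathrm{coNP}
  \;\subseteq\; \mathrm{PSPACE}
```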

If, however, b < nW, we show how to simulate the finite-duration game G_bf on an alternating Turing machine using the game semantics of such machines, i.e., two players construct a single path of a run of the machine. The existential and universal player take the roles of Player 0 and Player 1, respectively. The Turing machine keeps track of the complete prefix of the simulated play of G_bf.

Every vertex of the underlying arena of G_bf can be represented in polynomial size. Moreover, the length of the play is bounded from above by (log(nW) + 1)(n + 1)^6 due to Lemma 4.21 (Sec. 4.2, p. 106). Thus, the Turing machine can keep track of the play prefix constructed thus far explicitly and check whether a vertex picked by either player is a valid continuation of the play prefix of G_bf constructed thus far. Moreover, the Turing machine can check whether a dominating cycle has occurred after each step in polynomial time. If the play is settled due to an even dominating cycle, the machine accepts; if it is settled otherwise, the machine rejects.
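To make the size of this bound concrete, the following Python sketch evaluates (log(nW) + 1)(n + 1)^6 with exact integer arithmetic. The base-2 logarithm and the rounding up are assumptions, since the text does not fix either here; the function name `play_length_bound` is hypothetical.

```python
def play_length_bound(n: int, W: int) -> int:
    """Evaluate (log(nW) + 1) * (n + 1)**6 for n vertices and largest weight W.

    Assumption: log is the base-2 logarithm, rounded up to an integer.
    """
    if n < 1 or W < 1:
        raise ValueError("n and W must be positive")
    # For x >= 1, (x - 1).bit_length() equals ceil(log2(x)) without floats.
    ceil_log = (n * W - 1).bit_length()
    return (ceil_log + 1) * (n + 1) ** 6
```

The value grows polynomially in n and only logarithmically in W, so it stays polynomial in the size of G even when the weights are encoded in binary.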

This algorithm involves neither the explicit construction of G_b nor that of G_bf. Due to this construction, the Turing machine accepts G and b if and only if Player 0 wins G_bf and, due to Lemma 4.21, this machine terminates after polynomially many steps. Since polynomially time-bounded alternating Turing machines are equivalent to polynomially space-bounded classical Turing machines due to Chandra, Kozen and Stockmeyer [CKS81], we obtain the desired result.
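The idea behind evaluating such a finite-duration game in small space can be sketched as a depth-first search that stores only the current play prefix. This is a generic illustration, not the machine from the proof: `owner`, `successors` and `outcome` are hypothetical callables, and the toy game below (a play is settled once a vertex repeats, with the winner given by the parity of the largest color on the closed cycle) merely stands in for G_bf.

```python
from typing import Callable, Hashable, Iterable, List, Optional

Vertex = Hashable


def player0_wins(prefix: List[Vertex],
                 owner: Callable[[Vertex], int],
                 successors: Callable[[List[Vertex]], Iterable[Vertex]],
                 outcome: Callable[[List[Vertex]], Optional[bool]]) -> bool:
    """Decide whether Player 0 wins from the given play prefix.

    outcome(prefix) is True or False once the prefix is settled (True means
    Player 0 wins) and None while it is unsettled.
    """
    verdict = outcome(prefix)
    if verdict is not None:
        return verdict
    existential = owner(prefix[-1]) == 0   # Player 0 needs one good move
    for nxt in successors(prefix):
        prefix.append(nxt)                 # extend the single stored prefix
        sub = player0_wins(prefix, owner, successors, outcome)
        prefix.pop()                       # backtrack, reusing the space
        if sub == existential:
            return existential
    return not existential                 # Player 1 needs all moves to be bad


# Toy game (assumption, for illustration only).
EDGES = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}
COLOR = {"a": 1, "b": 2, "c": 1}


def toy_successors(prefix: List[Vertex]) -> Iterable[Vertex]:
    return EDGES[prefix[-1]]


def toy_outcome(prefix: List[Vertex]) -> Optional[bool]:
    last = prefix[-1]
    if last in prefix[:-1]:
        start = prefix.index(last)
        return max(COLOR[v] for v in prefix[start:]) % 2 == 0
    return None
```

The recursion depth equals the length of the longest play, so the memory in use at any time is one prefix plus one stack frame per move. For plays as long as the bound from Lemma 4.21 permits, an explicit stack would be needed in place of Python recursion.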

This concludes our work on upper bounds on the complexity of the threshold problem for parity games with weights. We have argued in Theorem 4.12 (Sec. 4.1, p. 96) that the general case is in ExpTime, and we have shown in Theorem 4.26 that its complexity drops to PSpace when only considering parity games with costs. In the following section, we provide matching lower bounds.

In document Optimality and Resilience in Parity Games (Page 121-126)