Algorithms for Solving Parity Games

Marcin Jurdziński

University of Warwick

Abstract

This is a selective survey of algorithms for solving parity games, which are infinite games played on finite graphs. Parity games are an important class of omega-regular games, i.e., games whose payoff functions are computable by a finite automaton on infinite words. The games considered here are zero-sum, perfect-information, and non-stochastic. Several state-of-the-art algorithms for solving parity games are presented, exhibiting disparate algorithmic techniques, such as divide-and-conquer and value iteration, as well as hybrid approaches that dovetail the two. While the problem of solving parity games is in NP and co-NP, and also in PLS and PPAD, and hence unlikely to be complete for any of the four complexity classes, no polynomial time algorithms are known for solving it.

3.1 Games on graphs

A game graph (V, E, V0, V1) consists of a directed graph (V, E), with a set of vertices V and a set of edges E, and a partition V0 ⊎ V1 = V of the set of vertices. For technical convenience, and without loss of generality, we assume that every vertex has at least one outgoing edge. An infinite game on the game graph is played by two players, called player 0 and player 1, player Even and player Odd, or player Min and player Max, etc., depending on the context. A play of the game is started by placing a token on a starting vertex v0 ∈ V, after which infinitely many rounds follow. In every round, if the token is on a vertex v ∈ V0 then player 0 chooses an edge (v, w) ∈ E going out of vertex v and places the token on w, and if the token is on a vertex v ∈ V1 then player 1 moves the token in the same fashion. The outcome of such a play is then an infinite path v0, v1, v2, . . . in the game graph.
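The definition of a game graph can be made concrete with a small data structure. The following is a minimal sketch, not part of the original text: the class name `GameGraph`, its field names, and the helper methods are illustrative assumptions. It checks the two conditions stated above, namely that V0 and V1 partition V and that every vertex has at least one outgoing edge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GameGraph:
    """A game graph (V, E, V0, V1); all names here are illustrative."""
    vertices: frozenset
    edges: frozenset      # set of (v, w) pairs
    v0: frozenset         # vertices owned by player 0
    v1: frozenset         # vertices owned by player 1

    def __post_init__(self):
        # V0 and V1 must be disjoint and cover V.
        if self.v0 & self.v1 or (self.v0 | self.v1) != self.vertices:
            raise ValueError("V0 and V1 must partition V")
        # Every edge must connect two vertices of V.
        for v, w in self.edges:
            if v not in self.vertices or w not in self.vertices:
                raise ValueError(f"edge {(v, w)} leaves the vertex set")
        # Every vertex must have at least one outgoing edge.
        sources = {v for v, _ in self.edges}
        if sources != self.vertices:
            raise ValueError("some vertex has no outgoing edge")

    def successors(self, v):
        """Vertices w with an edge (v, w) in E."""
        return {w for (u, w) in self.edges if u == v}

    def owner(self, v):
        """0 if player 0 moves from v, otherwise 1."""
        return 0 if v in self.v0 else 1


# A three-vertex example: player 0 owns a and c, player 1 owns b.
g = GameGraph(
    vertices=frozenset({"a", "b", "c"}),
    edges=frozenset({("a", "b"), ("b", "a"), ("b", "c"), ("c", "c")}),
    v0=frozenset({"a", "c"}),
    v1=frozenset({"b"}),
)
```

In this example player 1 chooses at b between returning to a and moving to the sink c, so `g.successors("b")` contains both a and c.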

An infinite game on a graph consists of a game graph and a payoff function π : V^ω → R. A payoff function assigns a payoff π(v) to every infinite sequence of vertices v = v0, v1, v2, . . . in the game graph. In this chapter we consider only zero-sum games, i.e., if the outcome of a play is the infinite path v ∈ V^ω then player 0 (player Min) has to pay π(v) to player 1 (player Max).

We call a game on a graph a qualitative game if the payoff function is Boolean, i.e., if π(V^ω) ⊆ {0, 1}. In qualitative games, we say that an outcome v is winning for player 0 if π(v) = 0, and it is losing for player 0 otherwise; and vice versa for player 1. An alternative, and popular, way of formalising qualitative games is to specify a set W ⊆ V^ω of outcomes that are winning for player 1, which in our formalisation is π⁻¹(1), i.e., the indicator set of the Boolean payoff function π.

A strategy for player 0 is a function μ : V⁺ → V, such that if v ∈ V^∗ and w ∈ V0 then (w, μ(vw)) ∈ E. Strategies for player 1 are defined analogously. Players 0 and 1 follow their strategies μ and χ, respectively, to produce an outcome Outcome(v0, μ, χ) = v0, v1, v2, . . . if for all i ≥ 0, we have that vi ∈ V0 implies vi+1 = μ(v0 v1 · · · vi), and that vi ∈ V1 implies vi+1 = χ(v0 v1 · · · vi). A strategy μ : V⁺ → V for player 0 is a positional strategy if for all w, u ∈ V^∗ and v ∈ V0, we have μ(wv) = μ(uv), i.e., the values of μ are uniquely determined by the last element of its argument. It follows that a function μ : V0 → V uniquely determines a positional strategy for player 0, and we often do not distinguish between the two. Positional strategies for player 1 are defined analogously.
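When both players use positional strategies on a finite graph, the outcome is ultimately periodic: the play must eventually revisit a vertex, and from then on it repeats the same cycle forever. The sketch below computes this outcome as a finite prefix followed by a cycle. The function name `outcome`, the dictionary representation of the strategies, and the example vertices are assumptions made for illustration.

```python
def outcome(start, mu, chi, v0):
    """Return (prefix, cycle) such that the outcome is prefix . cycle^omega.

    mu and chi are positional strategies for players 0 and 1, given as
    dicts mapping each owned vertex to its chosen successor; v0 is the
    set of vertices owned by player 0.
    """
    path = []
    first_seen = {}          # vertex -> index of its first visit
    v = start
    while v not in first_seen:
        first_seen[v] = len(path)
        path.append(v)
        # The owner of the current vertex moves the token.
        v = mu[v] if v in v0 else chi[v]
    k = first_seen[v]        # the play returns to path[k] and then repeats
    return path[:k], path[k:]


# Player 0 owns a and c, player 1 owns b (hypothetical example).
v0 = {"a", "c"}
mu = {"a": "b", "c": "c"}

# If player 1 moves from b to c, the play is a, b, c, c, c, ...
prefix, cycle = outcome("a", mu, {"b": "c"}, v0)

# If player 1 moves from b back to a, the play is a, b, a, b, ...
prefix2, cycle2 = outcome("a", mu, {"b": "a"}, v0)
```

The loop runs at most |V| times, because each iteration visits a vertex that has not been seen before.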

We say that a game, with a game graph (V, E, V0, V1) and a payoff function π : V^ω → R, is determined if:

sup_χ inf_μ π(Outcome(v, μ, χ)) = inf_μ sup_χ π(Outcome(v, μ, χ))   (3.1)

for all v ∈ V, where μ and χ range over the sets of strategies for player 0 and player 1, respectively. Note that the inequality

sup_χ inf_μ π(Outcome(v, μ, χ)) ≤ inf_μ sup_χ π(Outcome(v, μ, χ))   (3.2)

always, and trivially, holds. One interpretation of determinacy, i.e., when the converse of inequality (3.2) holds, is that player 0 (player Min) does not undermine her objective of minimising the payoff if she announces her strategy to player 1 (player Max) before the play begins, rather than keeping it secret and acting ‘by surprise’ in every round. An analogous interpretation holds for player 1.
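For completeness, here is the short argument behind the claim that inequality (3.2) always holds. As a notational assumption for this sketch only, write f(μ, χ) for the payoff π(Outcome(v, μ, χ)) at a fixed starting vertex v, where μ ranges over strategies of player 0 (Min) and χ over strategies of player 1 (Max). The first line holds for every fixed pair μ', χ'. Its left-hand side does not depend on μ' and its right-hand side does not depend on χ', so taking the supremum over χ' on the left and the infimum over μ' on the right gives the second line.

```latex
\begin{align*}
\inf_{\mu} f(\mu, \chi') \;\le\; f(\mu', \chi') \;&\le\; \sup_{\chi} f(\mu', \chi)
  && \text{for all } \mu' \text{ and } \chi', \\
\sup_{\chi'} \inf_{\mu} f(\mu, \chi') \;&\le\; \inf_{\mu'} \sup_{\chi} f(\mu', \chi).
\end{align*}
```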


The following fundamental theorem establishes that determinacy of games on graphs holds for a rich class of payoff functions.

Theorem 3.1 (Martin [1998]) If the payoff function is bounded and Borel measurable then the game is determined.

A game is positionally determined if the equality (3.1) holds for all v ∈ V, where χ on the left-hand side of the equality, and μ on the right-hand side of the equality, are restricted to range over the sets of positional strategies for player 1 and player 0, respectively. In other words, if a game is positionally determined then players can announce their positional strategies with impunity. We say that a class of games enjoys positional determinacy if all games in this class are positionally determined.

If a game is determined then we define the game value Val(v) at vertex v ∈ V to be the value of either side of equation (3.1). We say that a strategy μ of player 0 is an optimal strategy if sup_χ π(Outcome(v, μ, χ)) = Val(v) for all v ∈ V. Optimal strategies of player 1 are defined analogously. If the value of a qualitative game at a vertex v is 1 and player 1 has an optimal strategy then we say that the strategy is a winning strategy for player 1 from v. Similarly, if the value of a qualitative game at a vertex v is 0 and player 0 has an optimal strategy then we say that the strategy is a winning strategy for player 0 from v. We define the winning sets win0(G) and win1(G) to be the sets of vertices from which players 0 and 1, respectively, have winning strategies.

All games G considered in this chapter are determined, the payoff functions are Boolean, and both players have optimal strategies from every starting vertex. It follows that game values at all vertices are well defined, and from every vertex exactly one of the players has a winning strategy, i.e., win0(G) ⊎ win1(G) = V.

The central algorithmic problem studied in this chapter is the computation of the values and optimal strategies for both players in games on graphs. The corresponding decision problem is, given a game graph, a starting vertex v, and a number t, to determine whether Val(v) ≥ t. For the special case of qualitative games, the problem of deciding the winner is, given a starting vertex v, to determine whether v∈ win1(G).

In order to formalise such algorithmic problems we have to agree on finitary representations of relevant classes of payoff functions. In this chapter we only consider Boolean payoff functions that can be uniquely specified by their indicator sets, i.e., the sets of outcomes winning for player 1.

Given a set of target vertices T ⊆ V, we define the reachability payoff function by setting its indicator set to:

Reach(T) = {v0, v1, v2, . . . : vi ∈ T for some i ≥ 0}.

Similarly, for a set of safe vertices S ⊆ V, we define the safety payoff function by setting its indicator set to:

Safe(S) = {v0, v1, v2, . . . : vi ∈ S for all i ≥ 0}.

Observe that Reach(T) = V^ω \ Safe(V \ T). This implies that from a reachability game with the target set T ⊆ V, by swapping the roles of players 0 and 1 and their payoff functions, we get a safety game with the safe set V \ T, and vice versa.
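The reachability and safety conditions, and the duality between them, can be checked mechanically on ultimately periodic sequences. The sketch below assumes that an outcome is given as a finite prefix followed by a cycle repeated forever, which is a restriction made here for illustration since arbitrary infinite sequences have no finite representation. The function names `reach` and `safe` are likewise illustrative.

```python
def reach(prefix, cycle, targets):
    """True iff the sequence prefix . cycle^omega belongs to Reach(targets)."""
    # Every vertex of the infinite sequence occurs in prefix or cycle.
    return any(v in targets for v in prefix + cycle)


def safe(prefix, cycle, safe_set):
    """True iff the sequence prefix . cycle^omega belongs to Safe(safe_set)."""
    return all(v in safe_set for v in prefix + cycle)


V = {"a", "b", "c"}
T = {"c"}

# The sequence a, b, b, b, ... never visits the target c.
avoids_target = reach(["a"], ["b"], T)

# The sequence a, c, b, b, b, ... visits c once, which is enough for Reach(T).
visits_target = reach(["a", "c"], ["b"], T)

# Duality: a sequence is in Reach(T) exactly when it is not in Safe(V \ T).
duality_holds = all(
    reach(p, c, T) == (not safe(p, c, V - T))
    for p, c in [(["a"], ["b"]), (["a", "c"], ["b"]), ([], ["c"])]
)
```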

Given a set of target vertices T ⊆ V , we define the repeated reachability payoff function, also referred to as Büchi payoff, by setting its indicator set to:

Büchi(T) = {v0, v1, v2, . . . : vi ∈ T for infinitely many i ≥ 0},

and for a set S ⊆ V of safe vertices, we define the eventual safety payoff function, also known as co-Büchi payoff, by setting its indicator set to:

co-Büchi(S) = {v0, v1, v2, . . . : vi ∈ S for all but finitely many i ≥ 0}.

Again, the repeated reachability and eventual safety payoffs are dual in the sense that from a repeated reachability game with a set of target vertices T ⊆ V, by swapping the roles of players 0 and 1 and their payoff functions, we get an eventual safety game with the set of safe vertices V \ T, and vice versa.
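On an ultimately periodic sequence, the vertices visited infinitely often are exactly those on the repeated cycle, so both conditions depend only on the cycle. The following sketch uses the same illustrative assumption of a prefix followed by a cycle repeated forever. The function names `buchi` and `co_buchi` are not from the original text.

```python
def buchi(prefix, cycle, targets):
    """True iff prefix . cycle^omega visits targets infinitely often."""
    # The prefix is visited only finitely often, so it is irrelevant.
    return any(v in targets for v in cycle)


def co_buchi(prefix, cycle, safe_set):
    """True iff prefix . cycle^omega leaves safe_set only finitely often."""
    return all(v in safe_set for v in cycle)


V = {"a", "b", "c"}
T = {"c"}

# The sequence c, a, b, a, b, ... reaches c, but only once.
visits_once = buchi(["c"], ["a", "b"], T)

# The sequence a, b, c, b, c, ... visits c infinitely often.
visits_forever = buchi(["a"], ["b", "c"], T)

# Duality: a sequence is in Büchi(T) exactly when it is not in co-Büchi(V \ T).
duality_holds = all(
    buchi(p, c, T) == (not co_buchi(p, c, V - T))
    for p, c in [(["c"], ["a", "b"]), (["a"], ["b", "c"]), ([], ["c"])]
)
```

The first example shows how repeated reachability differs from plain reachability: a single visit to the target satisfies Reach(T) but not Büchi(T).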

For an infinite sequence a = a0, a1, a2, . . . ∈ A^ω, we define its infinity set Inf(a) by:

Inf(a) = {a ∈ A : ai = a for infinitely many i ≥ 0}.

Note that the infinity set of every infinite sequence must be non-empty if the set A is finite. For a priority function p : V → { 1, 2, . . . , d }, we define the parity payoff function by setting its indicator set to:

Parity(p) = {v0, v1, v2, . . . ∈ V^ω : max Inf(p(v0), p(v1), p(v2), . . .) is odd}.
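For an ultimately periodic outcome, the infinity set of the priority sequence is the set of priorities that occur on the repeated cycle, so the parity condition can be evaluated by inspecting the cycle alone. The sketch below again assumes, for illustration only, that an outcome is given as a prefix followed by a cycle repeated forever. The function names and the example priorities are hypothetical.

```python
def infinity_set(prefix, cycle):
    """Inf of the sequence prefix . cycle^omega: the elements of the cycle."""
    return set(cycle)


def in_parity(prefix, cycle, priority):
    """True iff prefix . cycle^omega belongs to Parity(priority).

    priority maps each vertex to a positive integer. The outcome is in the
    indicator set, i.e., winning for player 1, when the largest priority
    seen infinitely often is odd.
    """
    priorities = [priority[v] for v in prefix], [priority[v] for v in cycle]
    return max(infinity_set(*priorities)) % 2 == 1


priority = {"a": 1, "b": 2, "c": 3}

# c, a, b, a, b, ...: priority 3 occurs once; infinitely often only 1 and 2.
even_wins = in_parity(["c"], ["a", "b"], priority)

# b, a, c, a, c, ...: priorities 1 and 3 occur infinitely often.
odd_wins = in_parity(["b"], ["a", "c"], priority)
```

In the first example the largest priority occurring infinitely often is 2, so the outcome is winning for player 0, even though the odd priority 3 appears in the prefix.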

3.2 Solving repeated reachability and eventual safety games
