Computing equilibria - Stochastic Games - Stochastic multiplayer games : theory and algorithms

Stochastic Games

3.4 Computing equilibria

ε-equilibria in these games for all ε > 0. Moreover, they showed that subgame-perfect equilibria do exist in deterministic SSMGs with arbitrary nonnegative payoffs on terminal vertices.

3.4 Computing equilibria

The first computational problem coming to mind when one considers equilibria is computing an equilibrium for a given game. For this problem to be meaningful, we need to make sure that both the possible inputs and the possible outputs are representable by finite means. In order to ensure this, we will restrict the inputs to finite SMGs with ω-regular objectives, and the outputs to equilibria in pure finite-state strategies. Moreover, for the sake of simplicity, we concentrate on parity SMGs.²

Computing Nash equilibria

For Nash equilibria, it is easy to see that the problem of computing an equilibrium lies in the class FNP of function problems where a potential solution can be verified in polynomial time.

Theorem 3.19. The problem of computing a pure finite-state Nash equilibrium (of polynomial size) in a finite parity SMG is in FNP.

Proof. To prove membership in FNP, we need to show that, given a finite parity SMG (G, v₀) and a pure strategy profile σ with finite memory ℳ = (M, δ, m₀), we can decide in polynomial time whether σ is a Nash equilibrium of the game. This can be achieved as follows: First, for each player i, we calculate the payoff zᵢ of σ by computing the probability of the event χ⁻¹(Winᵢ) in the Markov chain (G^σ, (v₀, m₀)). To check whether σ is a Nash equilibrium, we additionally need to compute for each player i the value rᵢ of the MDP G^{σ₋ᵢ} from (v₀, m₀). Clearly, σ is a Nash equilibrium if and only if rᵢ ≤ zᵢ for each player i. Since we can compute the values of an MDP (or a Markov chain) with a parity objective in polynomial time, all this can be done in polynomial time. □
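The first step of the proof, computing the payoff of a profile in the generated Markov chain, can be illustrated by a minimal sketch. It assumes the convention that a play satisfies the parity objective if the least priority seen infinitely often is even (the convention is not fixed in this section, so treat it as an assumption), and it represents the chain by a hypothetical dictionary `P` of exact transition probabilities. The idea is the standard one: almost every run ends in a bottom strongly connected component, so it suffices to classify these components by their least priority and to solve a linear system for the probability of reaching the winning ones. The function name and data layout are illustrative only; computing the MDP values rᵢ is not shown.

```python
from fractions import Fraction


def parity_probability(P, priority, start):
    """Probability that the least priority seen infinitely often is even.

    P:        dict state -> dict successor -> probability (rows sum to 1)
    priority: dict state -> int
    Assumption: min-even parity convention; names are hypothetical.
    """
    states = list(P)

    def reachable(s):
        # States reachable from s via transitions of positive probability.
        seen, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for w, p in P[u].items():
                if p > 0 and w not in seen:
                    seen.add(w)
                    stack.append(w)
        return seen

    reach = {s: reachable(s) for s in states}
    # s lies in a bottom SCC iff every state reachable from s can return to s.
    bottom = {s for s in states if all(s in reach[t] for t in reach[s])}
    # For s in a bottom SCC, that component is exactly reach[s].
    win = {s for s in bottom if min(priority[t] for t in reach[s]) % 2 == 0}
    if start in bottom:
        return Fraction(1 if start in win else 0)

    transient = [s for s in states if s not in bottom]
    idx = {s: k for k, s in enumerate(transient)}
    n = len(transient)
    # Augmented matrix of (I - P_TT) x = b, where b[s] is the probability
    # of moving from s directly into a winning bottom SCC.
    A = [[Fraction(0)] * (n + 1) for _ in range(n)]
    for s in transient:
        row = idx[s]
        A[row][row] += 1
        for w, p in P[s].items():
            if w in idx:
                A[row][idx[w]] -= p
            elif w in win:
                A[row][n] += p
    # Gauss-Jordan elimination; the matrix is nonsingular for transient states.
    for c in range(n):
        piv = next(k for k in range(c, n) if A[k][c] != 0)
        A[c], A[piv] = A[piv], A[c]
        pivot = A[c][c]
        A[c] = [a / pivot for a in A[c]]
        for k in range(n):
            if k != c and A[k][c] != 0:
                f = A[k][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return A[idx[start]][n]


# Usage: from "s" the chain loops with probability 1/2, moves to the
# absorbing state "a" (priority 0) with 1/3 and to "b" (priority 1) with 1/6.
P = {
    "s": {"s": Fraction(1, 2), "a": Fraction(1, 3), "b": Fraction(1, 6)},
    "a": {"a": Fraction(1)},
    "b": {"b": Fraction(1)},
}
priority = {"s": 1, "a": 0, "b": 1}
payoff = parity_probability(P, priority, "s")  # Fraction(2, 3)
```

The sketch uses a quadratic reachability computation for brevity; a linear-time SCC algorithm would be the natural choice for larger chains.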

Arguably more interesting is the following theorem, which essentially states that we can reduce the problem of computing a Nash equilibrium to the problem of computing optimal strategies. For any class C of parity S2Gs, let C* be the class of all parity SMGs G such that for each player i the coalition game Gᵢ is in C.

² One problem with computing equilibria for games with more complex objectives is that optimal strategies might be of exponential size (Dziembowski et al. 1997; Horn 2005).

Theorem 3.20. Let C be any class of finite parity S2Gs. There exists a polynomial-time Turing reduction from the problem of computing a Nash equilibrium for games in C* to the problem of computing globally optimal positional strategies for games in C.

Proof. We describe a deterministic polynomial-time algorithm for computing Nash equilibria for games in C* with access to an oracle for computing globally optimal positional strategies for games in C. On input (G, v₀), where G ∈ C*, the algorithm starts by requesting from the oracle, for each player i, globally optimal positional strategies σᵢ and τᵢ for both players in the coalition game Gᵢ ∈ C. Then, the algorithm constructs a finite-state Nash equilibrium of (G, v₀) by combining the strategies σᵢ and τᵢ in the way it is done in the proof of Lemma 3.7, which can be done in polynomial time. □

Since optimal strategies can be computed in polynomial time for deterministic two-player zero-sum parity games with a bounded number of priorities, Theorem 3.20 implies that a Nash equilibrium of a deterministic multiplayer parity game with a bounded number of priorities can be computed in polynomial time. We will prove a stronger result below, namely that we can even compute a subgame-perfect equilibrium of such a game in polynomial time. Finally, it follows from Theorem 3.20 that computing a finite-state Nash equilibrium in a parity SMG can be done in polynomial time if and only if the quantitative decision problem for parity S2Gs and related problems are solvable in polynomial time.

Corollary 3.21. Either none or all of the following problems are solvable in polynomial time:

1. the quantitative decision problem for parity S2Gs,
2. computing the values of a parity S2G,
3. computing globally optimal positional strategies in a parity S2G,
4. computing a pure finite-state Nash equilibrium of a parity SMG,
5. computing a finite-state Nash equilibrium of a parity SMG.

Proof. The polynomial-time equivalence of 1., 2. and 3. is the subject of Proposition 2.14. That 4. can be done in polynomial time if 3. can follows from Theorem 3.20, and that 5. can be done in polynomial time if 4. can is trivial. Finally, to compute val^G(v) for a parity S2G G, we can compute a finite-state Nash equilibrium (σ, τ) of (G, v). It follows from Proposition 3.2 that the payoff of (σ, τ) for player 0 equals val^G(v). This payoff can be computed in polynomial time from (σ, τ) by analysing the generated Markov chain. Hence, 2. can be done in polynomial time if 5. can. □

Algorithm 3.1. Computing the set of consistent memory-vertex pairs.

Input: SMG G = (Π, V, (Vᵢ)ᵢ∈Π, ∆, χ, (Winᵢ)ᵢ∈Π), v₀ ∈ V, memory ℳ = (M, δ, m₀)

Output: {(m, v) ∈ M × V : there exists a history xv of (G, v₀) with δ*(m₀, x) = m}

    X := {(m₀, v₀)}
    repeat
        X′ := X
        X := X ∪ {(n, w) ∈ M × V : there exists (m, v) ∈ X with δ(m, v) = n and w ∈ v∆}
    until X = X′
    output X
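The fixed-point iteration of Algorithm 3.1 translates almost literally into code. The following is a minimal sketch under an assumed encoding: `successors[v]` stands for the successor set v∆ and `delta[(m, v)]` for the memory update δ(m, v); both dictionaries and the function name are hypothetical and not part of the original text.

```python
def consistent_pairs(successors, delta, m0, v0):
    """Compute the set of consistent memory-vertex pairs (Algorithm 3.1).

    successors: dict vertex -> iterable of successor vertices (v∆)
    delta:      dict (memory state, vertex) -> memory state
    Returns all pairs (m, v) such that some history xv from v0 leads the
    memory from m0 to m after reading x.
    """
    X = {(m0, v0)}
    while True:
        X_prev = set(X)
        # Read the current vertex v in memory state m, then move to a successor w.
        X |= {(delta[(m, v)], w) for (m, v) in X_prev for w in successors[v]}
        if X == X_prev:
            return X


# Usage: a two-state memory that flips whenever vertex "a" is read.
successors = {"a": ["b"], "b": ["a"], "c": ["a"]}
delta = {(m, v): (1 - m if v == "a" else m) for m in (0, 1) for v in "abc"}
pairs = consistent_pairs(successors, delta, 0, "a")
# pairs == {(0, "a"), (1, "b"), (1, "a"), (0, "b")}; vertex "c" is unreachable.
```

Every round of the loop either adds at least one pair or terminates, so there are at most |M × V| rounds, which is where the polynomial bound comes from.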

Computing subgame-perfect equilibria

For subgame-perfect equilibria, the problem of computing a pure finite-state equilibrium of polynomial size in a parity SMG can again easily be put into FNP. The restriction to polynomial size is important: we do not know whether the existence of a pure finite-state subgame-perfect equilibrium in a parity SMG implies the existence of one with polynomial size.

Theorem 3.22. The problem of computing a pure finite-state subgame-perfect equilibrium of polynomial size in a finite parity SMG is in FNP.³

Proof. We need to show that, given a finite parity SMG (G, v₀) and a pure strategy profile σ with finite memory ℳ = (M, δ, m₀), we can decide in polynomial time whether σ is a subgame-perfect equilibrium of the game. Our algorithm starts by computing the set C of consistent pairs of a memory state and a vertex, i.e. the set of all pairs (m, v) ∈ M × V such that there exists a history xv of (G, v₀) with δ*(m₀, x) = m. This can be achieved in polynomial time (for any kind of SMG) by Algorithm 3.1.

After having computed the set C, the algorithm proceeds by computing (in polynomial time) for each i ∈ Π and (m, v) ∈ C the probability zᵢ(m, v) of the event χ⁻¹(Winᵢ) in the Markov chain (G^σ, (m, v)) and the value rᵢ(m, v) of the MDP G^{σ₋ᵢ} from (m, v). Clearly, σ is a subgame-perfect equilibrium of (G, v₀) if and only if rᵢ(m, v) ≤ zᵢ(m, v) for each i ∈ Π and each (m, v) ∈ C. □

³ More precisely, the problem of computing a pure finite-state subgame-perfect equilibrium of size at most p(n) in a finite parity SMG of size n is in FNP for any polynomial p.
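The final comparison in this proof is simple enough to write down directly. The sketch below assumes that the set of consistent pairs and the tables of payoffs zᵢ(m, v) and values rᵢ(m, v) have already been computed by Markov chain and MDP solvers that are not shown; the nested dictionaries `z` and `r` and both function names are hypothetical. Returning the violating pairs rather than a bare truth value makes it visible which player has a profitable deviation and in which subgame.

```python
from fractions import Fraction


def violations(consistent, players, z, r):
    """List every (player, (m, v)) where deviating strictly improves the payoff.

    z[i][(m, v)]: payoff of player i under the profile, started in (m, v)
    r[i][(m, v)]: best payoff player i can secure against the others' strategies
    """
    return [(i, mv) for i in players for mv in consistent if r[i][mv] > z[i][mv]]


def is_subgame_perfect(consistent, players, z, r):
    """The profile is subgame-perfect iff no consistent pair admits a violation."""
    return not violations(consistent, players, z, r)


# Usage with made-up numbers: player 1 can improve in the subgame at (0, "b").
consistent = [(0, "a"), (0, "b")]
players = [0, 1]
z = {0: {(0, "a"): Fraction(1, 2), (0, "b"): Fraction(1)},
     1: {(0, "a"): Fraction(1, 2), (0, "b"): Fraction(0)}}
r = {0: {(0, "a"): Fraction(1, 2), (0, "b"): Fraction(1)},
     1: {(0, "a"): Fraction(1, 3), (0, "b"): Fraction(1, 4)}}
bad = violations(consistent, players, z, r)  # [(1, (0, "b"))]
```

Restricting `consistent` to the single initial pair turns the same check into the Nash equilibrium test of Theorem 3.19.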

For deterministic games, we know how to construct a finite-state subgame-perfect equilibrium (Theorem 3.16). It is easy to see that the equilibrium can be computed in polynomial time if globally optimal positional strategies can be computed in polynomial time. For a class C of parity S2Gs, the class C* is defined as above.

Theorem 3.23. Let C be any class of finite deterministic two-player zero-sum parity games. There exists a polynomial-time Turing reduction from the problem of computing a subgame-perfect equilibrium for games in C* to the problem of computing globally optimal positional strategies for games in C.

Theorem 3.23 makes the entire machinery that has been developed for solving (subclasses of) deterministic two-player zero-sum parity games available for the computation of subgame-perfect equilibria in deterministic multiplayer parity games. For example, the deterministic subexponential algorithm due to Jurdziński et al. (2008) can be adapted to compute subgame-perfect equilibria. Moreover, we can compute subgame-perfect equilibria in polynomial time for games on arenas that admit a polynomial-time algorithm for solving deterministic two-player zero-sum parity games, such as the ones mentioned in Section 2.5. In particular, we can compute a subgame-perfect equilibrium of a deterministic multiplayer parity game with a bounded number of priorities in polynomial time.

Corollary 3.24. For each d ∈ ℕ, there exists a polynomial-time algorithm for computing a subgame-perfect equilibrium of a finite deterministic multiplayer parity game with at most d priorities.

Finally, it follows from Theorem 3.23 that computing a Nash or subgame-perfect equilibrium of a deterministic multiplayer parity game is polynomial-time equivalent to deciding the winner of a deterministic two-player zero-sum parity game.

Corollary 3.25. Either none or all of the following problems are solvable in polynomial time:

In document Stochastic multiplayer games : theory and algorithms (Page 69-73)