
(1)

Lecture 9: Repeated Games

Advanced Microeconomics II

Yosuke YASUDA

Osaka University, Department of Economics


January 6, 2015

(2)

Finitely Repeated Games (1)

A repeated game, a specific class of dynamic game, is a suitable framework for studying the interaction between immediate gains and long-term incentives, and for understanding how a reputation mechanism can support cooperation.

Let $G = \{A_1, \dots, A_n; u_1, \dots, u_n\}$ denote a static game in which players $1$ through $n$ simultaneously choose actions $a_1$ through $a_n$ from the action spaces $A_1$ through $A_n$, and payoffs are $u_1(a_1, \dots, a_n)$ through $u_n(a_1, \dots, a_n)$.

Definition 1

The game $G$ is called the stage game of the repeated game. Given a stage game $G$, let $G(T)$ denote the finitely repeated game in which $G$ is played $T$ times, with the outcomes of all preceding plays observed before the next play begins.

Assume that the payoff for $G(T)$ is simply the sum of the payoffs from the $T$ stage games.

(3)

Finitely Repeated Games (2)

Theorem 2

If the stage game $G$ has a unique Nash equilibrium, then, for any finite $T$, the repeated game $G(T)$ has a unique subgame perfect Nash equilibrium: the Nash equilibrium of $G$ is played in every stage irrespective of the past history of play.

Proof.

We can solve the game by backward induction, that is, by starting from the smallest subgames and working backward through the game.

In stage $T$, players choose the unique Nash equilibrium of $G$.

Given that, in stage $T-1$, players again choose the same Nash equilibrium, since no matter what they play the outcome of the last stage game will be unchanged.

This argument carries backward through stage 1, so the unique Nash equilibrium outcome is played in every stage.

(4)

Finitely Repeated Games (3)

When there is more than one Nash equilibrium in the stage game, multiple subgame perfect Nash equilibria may exist.

Furthermore, an action profile that does not constitute a stage game Nash equilibrium may be sustained (in any period $t < T$) in a subgame perfect Nash equilibrium.

Q The following stage game will be played twice. Can the players sustain the non-equilibrium outcome $(M_1, M_2)$ in the first period?

| 1 \ 2 | $L_2$ | $M_2$ | $R_2$ |
|---|---|---|---|
| $L_1$ | 1, 1 | 5, 0 | 0, 0 |
| $M_1$ | 0, 5 | 4, 4 | 0, 0 |
| $R_1$ | 0, 0 | 0, 0 | 3, 3 |

Rm Note that there are two Nash equilibria in the stage game, i.e., $(L_1, L_2)$ and $(R_1, R_2)$: future behavior can influence current behavior.
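One way to approach the question is to check a candidate plan numerically. The sketch below is only an illustration: it assumes the plan "play $(M_1, M_2)$ first; then play $(R_1, R_2)$ if $(M_1, M_2)$ was observed, and $(L_1, L_2)$ otherwise", and the function names are hypothetical. It lists the pure-strategy Nash equilibria of the stage game and computes the best first-stage deviation gain for each player, using undiscounted sums as in $G(T)$.

```python
from itertools import product

# Stage-game payoffs (player 1, player 2), copied from the table above.
ROWS = ["L1", "M1", "R1"]
COLS = ["L2", "M2", "R2"]
PAYOFF = {
    ("L1", "L2"): (1, 1), ("L1", "M2"): (5, 0), ("L1", "R2"): (0, 0),
    ("M1", "L2"): (0, 5), ("M1", "M2"): (4, 4), ("M1", "R2"): (0, 0),
    ("R1", "L2"): (0, 0), ("R1", "M2"): (0, 0), ("R1", "R2"): (3, 3),
}


def pure_nash():
    """Return all pure-strategy Nash equilibria of the stage game."""
    equilibria = []
    for r, c in product(ROWS, COLS):
        u1, u2 = PAYOFF[(r, c)]
        best1 = all(PAYOFF[(r_alt, c)][0] <= u1 for r_alt in ROWS)
        best2 = all(PAYOFF[(r, c_alt)][1] <= u2 for c_alt in COLS)
        if best1 and best2:
            equilibria.append((r, c))
    return equilibria


def second_stage(first_outcome):
    """Assumed plan: reward (M1, M2) with (R1, R2); otherwise play (L1, L2)."""
    return ("R1", "R2") if first_outcome == ("M1", "M2") else ("L1", "L2")


def total(first_outcome, player):
    """Undiscounted two-stage payoff of `player` (0 or 1) under the plan."""
    return PAYOFF[first_outcome][player] + PAYOFF[second_stage(first_outcome)][player]


def first_stage_deviation_gains():
    """Best gain from a unilateral first-stage deviation, per player."""
    on_path = ("M1", "M2")
    gain1 = max(total((r, "M2"), 0) for r in ROWS if r != "M1") - total(on_path, 0)
    gain2 = max(total(("M1", c), 1) for c in COLS if c != "M2") - total(on_path, 1)
    return gain1, gain2


print(pure_nash())
print(first_stage_deviation_gains())
```

A negative gain for both players means that no first-stage deviation pays under this plan. Since the plan prescribes a stage-game Nash equilibrium after every first-stage outcome, it is also a Nash equilibrium in every second-stage subgame.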

(5)

Infinitely Repeated Games (1)

Even if the stage game has a unique Nash equilibrium, there may be subgame perfect outcomes of the infinitely repeated game in which no stage's outcome is a Nash equilibrium of $G$.

Let $G(\infty, \delta)$ denote the infinitely repeated game in which $G$ is repeated forever and the players share the discount factor $\delta$.

For each $t$, the outcomes of the $t-1$ preceding plays of the stage game are observed before the $t$-th stage begins. Each player's payoff in $G(\infty, \delta)$ is the average payoff defined as follows.

Definition 3

Given the discount factor $\delta$, the average payoff of the infinite sequence of payoffs $\pi_1, \pi_2, \dots$ is

$$(1-\delta)\left(\pi_1 + \delta\pi_2 + \delta^2\pi_3 + \cdots\right) = (1-\delta)\sum_{t=1}^{\infty} \delta^{t-1}\pi_t.$$
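As a quick check of the normalization, consider the illustrative case (an assumption, not part of the definition) in which the stage payoff is the same constant $\pi$ in every stage. The factor $(1-\delta)$ then makes the average payoff equal to the per-stage payoff:

```latex
% Assumption for illustration: \pi_t = \pi for every stage t, with 0 <= \delta < 1.
(1-\delta)\sum_{t=1}^{\infty}\delta^{t-1}\pi
  = (1-\delta)\cdot\frac{\pi}{1-\delta}
  = \pi .
```

So average payoffs are measured on the same scale as stage-game payoffs.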

(6)

Infinitely Repeated Games (2)

There are a few important remarks:

The history of play through stage $t$ is the record of the players' choices in stages $1$ through $t$.

The players might have chosen $(a_1^s, \dots, a_n^s)$ in stage $s$, where for each player $i$ the action $a_i^s$ belongs to $A_i$.

In the finitely repeated game $G(T)$ or the infinitely repeated game $G(\infty, \delta)$, a player's strategy specifies the action the player will take in each stage, for every possible history of play.

In the finitely repeated game $G(T)$, a subgame beginning at stage $t+1$ is the repeated game in which $G$ is played $T-t$ times, denoted $G(T-t)$.

In the infinitely repeated game $G(\infty, \delta)$, each subgame beginning at any stage is identical to the original game.

In a repeated game, a Nash equilibrium is subgame perfect if the players' strategies constitute a Nash equilibrium in every subgame, i.e., after every possible history of play.

(7)

Unimprovability (1)

Definition 4

A strategy $\sigma_i$ is called a perfect best response to the other players' strategies when player $i$ has no incentive to deviate following any history.

Consider the following requirement that, at first glance, looks much weaker than the perfect best response condition.

Definition 5

A strategy for $i$ is unimprovable against a vector of strategies of her opponents if there is no $t-1$ period history (for any $t$) such that $i$ could profit by deviating from her strategy in period $t$ only (conforming thereafter).

To verify the unimprovability of a strategy, one checks only “one-shot” deviations from the strategy, rather than arbitrarily complex deviations (such as defecting in every period).

(8)

Unimprovability (2)

Theorem 6

Let the payoffs of $G$ be bounded. In the repeated game $G(T)$ or $G(\infty, \delta)$, strategy $\sigma_i$ is a perfect best response to a profile of strategies $\sigma$ if and only if $\sigma_i$ is unimprovable against that profile.

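Proof of $\Leftarrow$ ($\Rightarrow$ is trivial: a perfect best response admits no profitable deviation of any kind, in particular no one-shot deviation).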

Consider the contrapositive.

1. If $\sigma_i$ is not a perfect best response, there must be a history after which it is profitable to deviate to some other strategy.

2. Then, because of discounting and boundedness of payoffs, there must exist a profitable deviation that involves defection for only finitely many periods (and conforms to $\sigma_i$ thereafter).

3. Consider a profitable deviation involving defection in the smallest possible number of periods, denoted by $T$.

4. In such a profitable deviation, $\sigma_i$ must be improvable after the player has deviated for $T-1$ periods: otherwise, conforming in the last of the $T$ periods would be at least as good, giving a profitable deviation with fewer than $T$ periods of defection and contradicting the minimality of $T$.

(9)

Repeated Prisoner’s Dilemma (1)

Q The following prisoner's dilemma will be played infinitely many times. Under what conditions on $\delta$ can we sustain cooperation $(C_1, C_2)$ as an SPNE outcome?

| 1 \ 2 | $D_2$ | $C_2$ |
|---|---|---|
| $D_1$ | 1, 1 | 5, 0 |
| $C_1$ | 0, 5 | 4, 4 |

Suppose that player $i$ plays $C_i$ in the first stage. In the $t$-th stage, if the outcome of all $t-1$ preceding stages has been $(C_1, C_2)$, then she plays $C_i$; otherwise, she plays $D_i$.

This strategy is called a trigger strategy, because player $i$ cooperates until someone fails to cooperate, which triggers a switch to noncooperation forever after.

If both players adopt this trigger strategy, then the outcome of the infinitely repeated game will be $(C_1, C_2)$ in every stage.
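The trigger strategy can be written as a function from histories to actions. The sketch below is a minimal illustration: the function names are hypothetical, player indices are dropped from the action labels, and the infinite game is truncated to a finite number of stages purely so that the path of play can be printed.

```python
def trigger(history):
    """Trigger strategy: cooperate iff every past outcome was (C, C)."""
    return "C" if all(outcome == ("C", "C") for outcome in history) else "D"


def always_defect(history):
    """Hypothetical opponent used for comparison: defect after any history."""
    return "D"


def play(strategy1, strategy2, stages):
    """Return the path of play for the first `stages` stages."""
    history = []
    for _ in range(stages):
        # Both players observe the same history and move simultaneously.
        history.append((strategy1(history), strategy2(history)))
    return history


print(play(trigger, trigger, 5))
print(play(trigger, always_defect, 3))
```

When both players use `trigger`, the path is mutual cooperation in every stage. Against `always_defect`, the first-stage defection triggers mutual defection from the second stage onward.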

(10)

Repeated Prisoner’s Dilemma (2)

To show that the trigger strategy profile is an SPNE, we must verify that the trigger strategies constitute a Nash equilibrium in every subgame of the infinitely repeated game.

Rm Since every subgame of an infinitely repeated game is identical to the game as a whole, we have to consider only two types of subgames: (i) subgames in which all the outcomes of earlier stages have been $(C_1, C_2)$, and (ii) subgames in which the outcome of at least one earlier stage differs from $(C_1, C_2)$.

Thanks to the previous theorem, it is sufficient to show that there is no profitable one-shot deviation after any history, that is, in either type of subgame.

Players have no incentive to deviate in (ii), since the trigger strategy there prescribes repeated play of the one-shot Nash equilibrium $(D_1, D_2)$ regardless of what happens.

(11)

Repeated Prisoner’s Dilemma (3)

The following condition guarantees that there will be no (one-shot) profitable deviation in (i).

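$$4 + 4\delta + 4\delta^2 + \cdots \ge 5 + \delta + \delta^2 + \cdots \iff 3(\delta + \delta^2 + \cdots) \ge 1 \iff \frac{3\delta}{1-\delta} \ge 1 \iff \delta \ge \frac{1}{4}.$$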

Rm Mutual cooperation $(C_1, C_2)$ can be sustained as an SPNE outcome by using the trigger strategy when the discount factor is sufficiently large ($\delta \ge 1/4$ in this example).
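The threshold can be checked with exact arithmetic. The sketch below is illustrative only, with hypothetical function names. It compares the discounted payoff from conforming, $4/(1-\delta)$, with the payoff from a one-shot deviation followed by mutual defection forever, $5 + \delta/(1-\delta)$.

```python
from fractions import Fraction


def cooperation_value(delta):
    """Discounted sum of 4 in every stage: 4 / (1 - delta)."""
    return Fraction(4) / (1 - delta)


def deviation_value(delta):
    """Deviate once for 5, then receive 1 in every later stage."""
    return 5 + delta * Fraction(1) / (1 - delta)


def cooperation_sustainable(delta):
    """True if no one-shot deviation from (C1, C2) is profitable."""
    delta = Fraction(delta)
    return cooperation_value(delta) >= deviation_value(delta)


for d in (Fraction(1, 5), Fraction(1, 4), Fraction(1, 2)):
    print(d, cooperation_sustainable(d))
```

Using `Fraction` avoids floating-point rounding exactly at the boundary $\delta = 1/4$, where the two payoffs are equal.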

The next theorem, called the folk theorem, states that a large subset of the feasible payoffs can be sustained in an SPNE.

(12)

Folk Theorem

Definition 7

Payoffs $(x_1, \dots, x_n)$ are called feasible in the stage game $G$ if they are a convex combination of the pure-strategy payoffs of $G$.

Theorem 8 (Folk Theorem)

Let $G$ be a finite, static game. Let $(e_1, \dots, e_n)$ denote the payoffs from a Nash equilibrium of $G$, and let $(x_1, \dots, x_n)$ denote any other feasible payoffs from $G$. If $x_i > e_i$ for every player $i$ and if $\delta$ is sufficiently close to one, then there exists a subgame perfect Nash equilibrium of the infinitely repeated game $G(\infty, \delta)$ that achieves $(x_1, \dots, x_n)$ as the average payoff.

Proof See Appendix 2.3.B (Gibbons, pp.100)
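To see how "sufficiently close to one" can be made concrete, the following is a sketch of the trigger-strategy condition in a special case. It assumes that $(x_1, \dots, x_n)$ is attained by a single pure action profile $a^*$, and it introduces $d_i$ (a symbol not used above) for player $i$'s best one-shot deviation payoff against $a^*$. The general case, where $x$ is only a convex combination of pure-strategy payoffs, needs additional arguments that are not reproduced here.

```latex
% Assumptions for illustration: x_i = u_i(a^*) for a pure action profile a^*,
% d_i = \max_{a_i} u_i(a_i, a^*_{-i}), and d_i > x_i > e_i.
% Trigger strategy: play a^* until someone deviates, then play the Nash
% equilibrium with payoffs (e_1, ..., e_n) forever.
\frac{x_i}{1-\delta} \;\ge\; d_i + \frac{\delta}{1-\delta}\, e_i
\quad\Longleftrightarrow\quad
x_i \;\ge\; (1-\delta)\, d_i + \delta\, e_i
\quad\Longleftrightarrow\quad
\delta \;\ge\; \frac{d_i - x_i}{d_i - e_i}.
```

In the prisoner's dilemma above, $d_i = 5$, $x_i = 4$ and $e_i = 1$, which gives $\delta \ge 1/4$, in agreement with the earlier calculation.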

Rm The name comes from the fact that the statement (relying on NE rather than SPNE) was widely known among game theorists in the 1950s, even though no one had published it.