
6.2 Discrete poker models


with a king, to bet with an ace and to randomize between passing and betting with a queen such that the probability of betting is equal to $\tfrac{1}{3}$. Player 2 should play $\tfrac{2}{3} F_Q F_K C_A + \tfrac{1}{3} F_Q C_K C_A$. So he should fold with a queen, call with an ace and, with a king, call with probability $\tfrac{1}{3}$. In fact, it is not difficult to check that these are the unique optimal strategies in this game.

6.2.3 Sequence form computations

The sequence form is an alternative strategic description of a game. This description can only be given for games with perfect recall: if two nodes are in the same information set for a player, this player's moves needed to arrive at either of these two nodes must be the same. Rather than planning a move for every information set, a player considers for each node in the game tree the choices he needs to make so that that node can be reached. These choices together form a sequence for the player. In the sequence form, $S_i$ is the set of sequences of player $i$ defined by all nodes of the game tree. This set replaces his set of pure strategies in the normal form. A single sequence can be represented by a set: the set of actions a player has to take to reach the node. The sequence needed for a player to reach the root node is then the empty set. Each player has at most as many sequences as the game tree has nodes, so the number of sequences is linear in the size of the tree. Actually, the upper bound on the number of sequences for a player is determined by the number of information sets he faces. After all, the sequences leading to two different nodes in the same information set are the same in a game with perfect recall.

In the normal form, a player can pick a pure strategy. In the sequence form, a player cannot just pick a single sequence. In Figure 6.2, for example, player 1 has to decide between $P_Q$ and $B_Q$, but also between $P_K$ and $B_K$ and between $P_A$ and $B_A$. Choosing $P_Q$, $P_K$ and $P_A$, just like in the pure strategy $P_Q P_K P_A$ of the normal form game, assigns the realization probabilities 1, 1, 0, 1, 0, 1, 0 to the sequences $\emptyset$, $P_Q$, $B_Q$, $P_K$, $B_K$, $P_A$, $B_A$ respectively. These realization probabilities can be ordered in a vector $x$, which we call a realization plan. A player can use randomization in one or more of his choices between sequences, but a realization plan for player 1 should satisfy the following equations:

\[
\begin{aligned}
x(\emptyset) &= 1,\\
x(\emptyset) &= x(P_Q) + x(B_Q),\\
x(\emptyset) &= x(P_K) + x(B_K),\\
x(\emptyset) &= x(P_A) + x(B_A).
\end{aligned}
\tag{6.1}
\]
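The equations in (6.1) are easy to check mechanically. The following is a minimal sketch that stores a realization plan as a dictionary keyed by sequence names. The key names (`""` for the empty sequence, `"PQ"`, `"BQ"` and so on) and the helper `is_realization_plan` are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction

# Information sets of player 1 (one per card) and the two sequences in each.
INFO_SETS = {"Q": ("PQ", "BQ"), "K": ("PK", "BK"), "A": ("PA", "BA")}
EMPTY = ""  # illustrative key for the empty sequence


def is_realization_plan(x):
    """Return True if x satisfies the equations (6.1) and is nonnegative."""
    if x[EMPTY] != 1:
        return False
    if any(value < 0 for value in x.values()):
        return False
    # In every information set the probabilities must add up to x(empty).
    return all(x[p] + x[b] == x[EMPTY] for p, b in INFO_SETS.values())


# The pure choice "always pass" from the text: 1, 1, 0, 1, 0, 1, 0.
always_pass = {EMPTY: 1, "PQ": 1, "BQ": 0, "PK": 1, "BK": 0, "PA": 1, "BA": 0}

# A plan that randomizes with a queen only.
mixed = {EMPTY: 1, "PQ": Fraction(2, 3), "BQ": Fraction(1, 3),
         "PK": 1, "BK": 0, "PA": 0, "BA": 1}

# Violates the second equation: x(PQ) + x(BQ) = 2.
invalid = {EMPTY: 1, "PQ": 1, "BQ": 1, "PK": 1, "BK": 0, "PA": 1, "BA": 0}

print(is_realization_plan(always_pass),
      is_realization_plan(mixed),
      is_realization_plan(invalid))
```

Exact fractions are used so that the equality tests are not affected by rounding.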

For player 2 we can also construct a (column) vector $y$, specifying his realization plan. This realization plan should satisfy the equations below.

\[
\begin{aligned}
y(\emptyset) &= 1,\\
y(\emptyset) &= y(F_Q) + y(C_Q),\\
y(\emptyset) &= y(F_K) + y(C_K),\\
y(\emptyset) &= y(F_A) + y(C_A).
\end{aligned}
\tag{6.2}
\]

We use the notation of Von Stengel (1996) when constructing the optimization problem of which the solution is an equilibrium of the game.

The length of a player's realization plan is equal to the sum over all information sets of this player of the number of actions in the information set, plus one additional element corresponding to the empty sequence. This last element is always equal to one. In minipoker, player 1 has three information sets. In each of them he can choose between two actions, so the length of $x$ is $3 \times 2 + 1 = 7$. The entries of $x$ are real numbers between zero and one.

Similarly, $y \in \mathbb{R}^7$ and $0 \le y_i \le 1$ for all $i \in \{1, \ldots, 7\}$.
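The counting rule can be sketched in a few lines. The dictionary layout and the function names below are hypothetical. The labelling scheme (action letter followed by the card) only works because, in minipoker, every information set of a player is reached directly from the empty sequence.

```python
# Actions available to player 1 in each information set of minipoker
# (pass or bet with a queen, a king, or an ace).
actions_per_info_set = {"Q": ["P", "B"], "K": ["P", "B"], "A": ["P", "B"]}


def realization_plan_length(actions_per_info_set):
    """One entry per action in every information set, plus the empty sequence."""
    return 1 + sum(len(actions) for actions in actions_per_info_set.values())


def sequence_labels(actions_per_info_set):
    """Empty sequence first, then the sequences of each information set."""
    labels = ["∅"]
    for info_set, actions in actions_per_info_set.items():
        labels.extend(action + info_set for action in actions)
    return labels


print(realization_plan_length(actions_per_info_set))
print(sequence_labels(actions_per_info_set))
```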

The payoff matrix A contains the expected payoff for player 1 for each pair of sequences that leads to a terminal node. Player 1’s payoff at the terminal node is multiplied by the probabilities of chance moves on the path from the root to this terminal node to obtain the expected payoff. For all combinations of sequences that do not lead to a terminal node, the corresponding entry in A is zero. For minipoker with three cards, the payoff matrix A is as follows.

The subscripts of the sequences indicate the information sets to which they correspond.
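To illustrate how the entries of such a matrix arise, the sketch below builds a payoff matrix from a set of assumed rules: an ante of 1, a bet of 1, six equally likely deals of two different cards, and an immediate showdown for the ante when player 1 passes. These stakes and rules, and the names `ROWS`, `COLS` and `payoff_matrix`, are assumptions made for the example. They are not taken from this section, and the matrix in the original may be laid out differently.

```python
from fractions import Fraction
from itertools import permutations

CARDS = ["Q", "K", "A"]                           # ordered from low to high
ROWS = ["", "PQ", "BQ", "PK", "BK", "PA", "BA"]   # sequences of player 1
COLS = ["", "FQ", "CQ", "FK", "CK", "FA", "CA"]   # sequences of player 2
ANTE, BET = 1, 1                                  # assumed stakes


def payoff_matrix():
    """Build A: chance-weighted payoff to player 1 per pair of sequences."""
    A = [[Fraction(0)] * len(COLS) for _ in ROWS]
    deals = list(permutations(CARDS, 2))   # (card of player 1, card of player 2)
    chance = Fraction(1, len(deals))       # every deal is equally likely
    for c1, c2 in deals:
        sign = 1 if CARDS.index(c1) > CARDS.index(c2) else -1
        # Pass: assumed immediate showdown for the ante; player 2 has not moved,
        # so the terminal node belongs to the pair (P_c1, empty sequence).
        A[ROWS.index("P" + c1)][COLS.index("")] += chance * sign * ANTE
        # Bet and fold: player 1 wins the ante.
        A[ROWS.index("B" + c1)][COLS.index("F" + c2)] += chance * ANTE
        # Bet and call: showdown for ante plus bet.
        A[ROWS.index("B" + c1)][COLS.index("C" + c2)] += chance * sign * (ANTE + BET)
    return A


A = payoff_matrix()
for label, row in zip(ROWS, A):
    print(f"{label or 'empty':>5}", [str(v) for v in row])
```

Entries for pairs of sequences that do not lead to a terminal node, such as the whole row of the empty sequence of player 1, stay zero.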

Player 1 chooses a realization plan, $x$, for the rows of the matrix. The realization plan $y$, chosen by player 2, indicates the realization probabilities for the columns of $A$. The vector $x$ should satisfy the equalities given in equation (6.1). These equalities can be represented by the expression $Ex = e$, where each row of the matrix $E$ holds the coefficients of one equation in (6.1) and the vector $e$ holds the corresponding right-hand sides.

Similarly, following the equalities in equation (6.2), $y$ should satisfy $Fy = f$, with $F = E$ and $f = e$. Furthermore, for both players the realization plans should be nonnegative.
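One explicit way to write $E$ and $e$ is sketched below. It assumes that the sequences are ordered as $\emptyset, P_Q, B_Q, P_K, B_K, P_A, B_A$ and that each equation of (6.1) after the first is rearranged to the form $-x(\emptyset) + x(P_\cdot) + x(B_\cdot) = 0$. The ordering and the sign convention are assumptions; any rearrangement of (6.1) describes the same set of realization plans.

```latex
E = \begin{pmatrix}
 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
-1 & 1 & 1 & 0 & 0 & 0 & 0 \\
-1 & 0 & 0 & 1 & 1 & 0 & 0 \\
-1 & 0 & 0 & 0 & 0 & 1 & 1
\end{pmatrix},
\qquad
e = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.
```

For example, the plan $x = (1, 1, 0, 1, 0, 1, 0)^\top$ gives $Ex = (1, 0, 0, 0)^\top = e$.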

We follow the derivation of Von Stengel (1996, p. 233–234) to construct the linear program with which we can compute the equilibrium. Let us first consider the problem of finding a best response of player 2 against a given realization plan $x$ of player 1. This is equivalent to solving the following linear program, in which $B = -A$:

\[
\max_{y} \; (x^\top B)\, y
\quad \text{subject to} \quad Fy = f, \; y \ge 0.
\tag{6.3}
\]

The number of variables in the dual of this LP is equal to the number of information sets of player 2 plus one. These variables are unconstrained and are represented by the vector q. The dual LP to (6.3) is

\[
\min_{q} \; q^\top f
\quad \text{subject to} \quad q^\top F \ge x^\top B.
\tag{6.4}
\]

Analogously, finding a best response x of player 1, given that player 2 plays according to y, is equivalent to solving the following problem:

\[
\max_{x} \; x^\top (Ay)
\quad \text{subject to} \quad x^\top E^\top = e^\top, \; x \ge 0.
\tag{6.5}
\]

The dual problem to (6.5) uses the unconstrained vector p of which the length is equal to the number of information sets of player 1 plus one and reads

\[
\min_{p} \; e^\top p
\quad \text{subject to} \quad E^\top p \ge Ay.
\tag{6.6}
\]
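For a tree as flat as minipoker, where every information set follows the empty sequence, an optimal $p$ for (6.6) can be written down without an LP solver: the variable of each information set is the larger of the two entries of $Ay$ in that set, and the variable of the empty sequence adds these up. The sketch below checks this against the constraint $E^\top p \ge Ay$ for a player 2 who always calls. The stakes (ante 1, bet 1, showdown after a pass), the sign convention for $E$ and all function names are assumptions made for the example.

```python
from fractions import Fraction
from itertools import permutations

CARDS = ["Q", "K", "A"]
ROWS = ["", "PQ", "BQ", "PK", "BK", "PA", "BA"]
COLS = ["", "FQ", "CQ", "FK", "CK", "FA", "CA"]

# E and e for (6.1), each equation written as -x(empty) + x(P.) + x(B.) = 0.
E = [[1, 0, 0, 0, 0, 0, 0],
     [-1, 1, 1, 0, 0, 0, 0],
     [-1, 0, 0, 1, 1, 0, 0],
     [-1, 0, 0, 0, 0, 1, 1]]
e = [1, 0, 0, 0]


def payoff_matrix(ante=1, bet=1):
    """Payoff matrix A under assumed stakes (not taken from the text)."""
    A = [[Fraction(0)] * 7 for _ in range(7)]
    for c1, c2 in permutations(CARDS, 2):
        sign = 1 if CARDS.index(c1) > CARDS.index(c2) else -1
        A[ROWS.index("P" + c1)][0] += Fraction(sign * ante, 6)
        A[ROWS.index("B" + c1)][COLS.index("F" + c2)] += Fraction(ante, 6)
        A[ROWS.index("B" + c1)][COLS.index("C" + c2)] += Fraction(sign * (ante + bet), 6)
    return A


def dual_solution(A, y):
    """Optimal p for (6.6) when every information set follows the empty sequence."""
    Ay = [sum(a * yj for a, yj in zip(row, y)) for row in A]
    # One variable per information set: the best payoff among its two sequences.
    p_sets = [max(Ay[1], Ay[2]), max(Ay[3], Ay[4]), max(Ay[5], Ay[6])]
    # The variable of the empty sequence collects everything.
    return [Ay[0] + sum(p_sets)] + p_sets, Ay


A = payoff_matrix()
y_always_call = [1, 0, 1, 0, 1, 0, 1]      # player 2 calls with every card
p, Ay = dual_solution(A, y_always_call)

# Dual feasibility: E^T p >= A y, entry by entry.
ETp = [sum(E[i][j] * p[i] for i in range(4)) for j in range(7)]
assert all(lhs >= rhs for lhs, rhs in zip(ETp, Ay))

print("best-response value e^T p =", sum(ei * pi for ei, pi in zip(e, p)))
```

Under these assumptions the printed value is 1/3: against a player who always calls, player 1 bets with an ace, passes with a queen, and is indifferent with a king.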

In order to find an equilibrium, both $x$ and $y$ have to be treated as variables. In this way, the objective functions in (6.3) and (6.5) are not linear anymore. However, a zero-sum game can still be solved by a linear program. Note that the LP (6.5) and its dual (6.6) have the same optimal value $e^\top p$. If player 2 can vary $y$, he will try to minimize this value: an optimal choice of $y$ will be a min-max strategy. In order to be a well-defined realization plan, $y$ has to satisfy $Fy = f$ and $y \ge 0$. Adding these constraints to (6.6) gives the LP

\[
\min_{y,p} \; e^\top p
\quad \text{subject to} \quad E^\top p \ge Ay, \; Fy = f, \; y \ge 0.
\tag{6.7}
\]

Again, consider the dual of this LP:

\[
\max_{x,q} \; -q^\top f
\quad \text{subject to} \quad q^\top F \ge x^\top B, \; x^\top E^\top = e^\top, \; x \ge 0,
\tag{6.8}
\]

that is, subject to the constraint of (6.4) and the constraints of (6.5). This LP can be interpreted as the problem of finding a min-max strategy for player 1. Von Stengel (1996, Theorem 5.1) proves that the optimal solutions to (6.7) and (6.8) indeed define an equilibrium for the zero-sum game.

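When we solve the linear program given in (6.7), we find that the game value is equal to $\tfrac{1}{18}$ and that optimal realization plans $\tilde{x}$ and $\tilde{y}$ are given by

\[
\tilde{x} = \left(1, \tfrac{2}{3}, \tfrac{1}{3}, 1, 0, 0, 1\right)^\top,
\qquad
\tilde{y} = \left(1, 1, 0, \tfrac{2}{3}, \tfrac{1}{3}, 0, 1\right)^\top,
\]

with the entries ordered as $\emptyset, P_Q, B_Q, P_K, B_K, P_A, B_A$ for $\tilde{x}$ and $\emptyset, F_Q, C_Q, F_K, C_K, F_A, C_A$ for $\tilde{y}$.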

The realization plans are easily interpretable in terms of behavioural strategies. For player 1 it is optimal to pass with a king and to bet with an ace. With a queen, he has to bet with probability $\tfrac{1}{3}$. Player 2 should fold with a queen, call with an ace and, with a king, call with probability $\tfrac{1}{3}$. Of course, the conclusion is the same as in section 6.2.2, since the optimal strategies in this game are unique.
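The stated strategies can be checked numerically without solving the LP. The sketch below turns them into realization plans, computes $x^\top A y$, and compares it with the best-response value of each player. It assumes stakes of ante 1 and bet 1 with a showdown after a pass, which are not given in this section; the function names are hypothetical as well.

```python
from fractions import Fraction
from itertools import permutations

CARDS = ["Q", "K", "A"]
ROWS = ["", "PQ", "BQ", "PK", "BK", "PA", "BA"]
COLS = ["", "FQ", "CQ", "FK", "CK", "FA", "CA"]
third = Fraction(1, 3)

# Realization plans read off from the strategies described in the text.
x = [1, 2 * third, third, 1, 0, 0, 1]   # pass K, bet A, bet Q with prob. 1/3
y = [1, 1, 0, 2 * third, third, 0, 1]   # fold Q, call A, call K with prob. 1/3


def payoff_matrix(ante=1, bet=1):
    """Payoff matrix A under assumed stakes (not taken from the text)."""
    A = [[Fraction(0)] * 7 for _ in range(7)]
    for c1, c2 in permutations(CARDS, 2):
        sign = 1 if CARDS.index(c1) > CARDS.index(c2) else -1
        A[ROWS.index("P" + c1)][0] += Fraction(sign * ante, 6)
        A[ROWS.index("B" + c1)][COLS.index("F" + c2)] += Fraction(ante, 6)
        A[ROWS.index("B" + c1)][COLS.index("C" + c2)] += Fraction(sign * (ante + bet), 6)
    return A


def best_value(payoffs):
    """Best-response value when all information sets follow the empty sequence."""
    return payoffs[0] + sum(max(payoffs[i], payoffs[i + 1]) for i in (1, 3, 5))


A = payoff_matrix()
Ay = [sum(a * yj for a, yj in zip(row, y)) for row in A]
xB = [-sum(A[i][j] * x[i] for i in range(7)) for j in range(7)]   # x^T B, B = -A

value = sum(xi * v for xi, v in zip(x, Ay))   # x^T A y
print(value)               # payoff to player 1 under (x, y)
print(best_value(Ay))      # the most player 1 can get against y
print(-best_value(xB))     # what player 1 keeps when player 2 best-responds to x
```

Under the assumed stakes all three numbers equal 1/18, so neither player can gain by deviating, which agrees with the game value reported above.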

In document Skill and Strategy in Games. (Page 140-145)