Game Theory
Dmitry Potapov CERN
1. Introduction
What is Game Theory?
Game theory is about interactions among agents that are self-interested
I’ll use “agent” and “player” synonymously
Self-interested:
Each agent has its own description of what states are desirable
Generally model this using utility theory
Utility function: maps each state of the world to a real number
• how much an agent likes that state
Example: TCP Users
Internet traffic is governed by the TCP protocol
TCP’s backoff mechanism
If the rate at which you’re sending packets causes congestion, reduce the rate until congestion subsides
Suppose that
You’re trying to finish an important project
• It’s extremely important for you to have a fast connection
Only one other person is using the Internet
• That person wants a fast connection just as much as you do
You each have 2 possible actions:
C (use a correct implementation)
D (use a defective implementation that won’t back off)
Action Profiles and their Payoffs
An action profile is a choice of action for each agent
You both use C => average packet delay is 1 ms
You both use D => average delay is 3 ms (router overhead)
One of you uses D, the other uses C:
• D user’s delay is 0
• C user’s delay is 4 ms
Payoff matrix:
Your options are the rows
The other agent’s options are the columns
Each cell = an action profile
• 1st number in the cell is your payoff or utility (I’ll use those terms synonymously)
› In this case, the negative of your delay
• 2nd number in the cell is the other agent’s payoff
              Other: C    Other: D
You: C        –1, –1      –4, 0
You: D        0, –4       –3, –3
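As a rough illustration of how the payoffs follow from the delays, the sketch below builds the matrix as "payoff = negative of delay" and looks up your best reply to each action of the other user. The names `DELAY_MS`, `PAYOFF` and `best_response` are made up for this example.

```python
# Average delays in ms for each action profile (your action, other's action),
# listed as (your delay, other's delay), taken from the slide.
DELAY_MS = {
    ("C", "C"): (1, 1),
    ("C", "D"): (4, 0),
    ("D", "C"): (0, 4),
    ("D", "D"): (3, 3),
}

# Payoff (utility) is modeled as the negative of the delay.
PAYOFF = {profile: tuple(-d for d in delays) for profile, delays in DELAY_MS.items()}


def best_response(other_action):
    """Return your action with the highest payoff, given the other user's action."""
    return max(("C", "D"), key=lambda mine: PAYOFF[(mine, other_action)][0])


for other in ("C", "D"):
    print("other plays", other, "-> your best response is", best_response(other))
```

With these numbers the best reply does not depend on what the other user does, which is one way to approach the questions on the next slide.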
Some questions
Examples of the kinds of questions game theory attempts to answer:
Which action should you use: C or D?
Does it depend on what you think the other person will do?
What kind of behavior can the network operator expect?
Would any two users behave the same?
Will this change if users can communicate with each other beforehand?
Under what changes to the delays would the users’ decisions still be the same?
How would you behave if you knew you would face this situation repeatedly with the same person?
Some Fields where Game Theory is Used
Economics
Auctions
Markets
Bargaining
Fair division
Social networks
…
Government and Politics
Voting systems
Negotiations
International relations
War
Human rights
[Figure: a trench in World War I]
Evolutionary Biology
Communication
Population ratios
Territoriality
Altruism
Parasitism, symbiosis
Social behavior
Computer Science
Artificial Intelligence
Multi-agent systems
Computer networks
Robotics
Engineering
Communication networks
Control systems
Road networks
Expected utility maximization
                 Rain    No rain
No umbrella      –5      10
Umbrella         0       2
Games against nature
Nature is considered an impartial agent (may be represented by U = const)
$U = \sum_{i=1}^{N} p_i U_i$
Assume that p(rain) = 0.7 and p(no rain) = 0.3. Then:
U(umbrella) = (0)(0.7) + (2)(0.3) = 0.6
U(no umbrella) = (–5)(0.7) + (10)(0.3) = –0.5
It is rational to take an umbrella!
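The calculation above can be sketched in a few lines. The function name `expected_utility` and the dictionary layout are illustrative choices, and the row labels "umbrella" / "no umbrella" are a reading of the slide's matrix.

```python
def expected_utility(utilities, probabilities):
    """U = sum_i p_i * U_i for one action."""
    return sum(p * u for p, u in zip(probabilities, utilities))


# Probabilities of the states of nature: (rain, no rain).
P = (0.7, 0.3)

# Utilities of each action in each state, in the same order as P.
UTILITIES = {
    "umbrella": (0, 2),
    "no umbrella": (-5, 10),
}

expected = {action: expected_utility(u, P) for action, u in UTILITIES.items()}
best_action = max(expected, key=expected.get)
print(expected, "->", best_action)
```

Against nature there is no opponent to outguess, so picking the action with the highest expected utility is all that is needed.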
Zero-sum Games
These games are purely competitive
Constant-sum game:
For every action profile, the sum of the payoffs is the same, i.e., there is a constant c such that for every action profile $(a_1, \ldots, a_n)$,
$u_1(a_1, \ldots, a_n) + \cdots + u_n(a_1, \ldots, a_n) = c$
Any constant-sum game can be transformed into an equivalent game in which the sum of the payoffs is always 0
Just subtract c/n from every payoff
Thus constant-sum games are usually called zero-sum games
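A minimal sketch of the "subtract c/n" transformation, assuming a game is stored as a dictionary from action profiles to payoff tuples. The function `to_zero_sum` and the example game (two agents splitting 10 units) are hypothetical and not taken from the slides.

```python
def to_zero_sum(game):
    """Shift every payoff of a constant-sum game by c/n so that payoffs sum to 0."""
    n = len(next(iter(game.values())))            # number of agents
    sums = {sum(payoffs) for payoffs in game.values()}
    if len(sums) != 1:
        raise ValueError("not a constant-sum game")
    c = sums.pop()
    return {
        profile: tuple(u - c / n for u in payoffs)
        for profile, payoffs in game.items()
    }


# Hypothetical constant-sum game: every action profile splits 10 units.
SPLIT_10 = {
    ("a", "x"): (7, 3),
    ("a", "y"): (4, 6),
    ("b", "x"): (10, 0),
    ("b", "y"): (5, 5),
}

ZERO_SUM = to_zero_sum(SPLIT_10)
print(ZERO_SUM)
```

Every agent's payoffs are shifted by the same amount, so each agent's preferences over outcomes are unchanged; that is the sense in which the two games are equivalent.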
Examples
Matching Pennies
Two agents, each has a penny
Each independently chooses to display Heads or Tails
• If same, agent 1 gets both pennies
• Otherwise agent 2 gets both pennies
           Heads      Tails
Heads      1, –1      –1, 1
Tails      –1, 1      1, –1
Rock, Paper, Scissors (Roshambo)
3-action generalization of matching pennies
• If both choose same, no winner
• Otherwise, paper beats rock, rock beats scissors, scissors beats paper
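The Rock, Paper, Scissors rules can be turned into a payoff table as follows. The slides only say who wins; scoring a win as +1 and a loss as –1 is an assumption made here, as are the names `BEATS` and `rps_payoff`.

```python
ACTIONS = ("rock", "paper", "scissors")

# BEATS[a] is the action that a defeats.
BEATS = {"paper": "rock", "rock": "scissors", "scissors": "paper"}


def rps_payoff(a1, a2):
    """Payoffs (agent 1, agent 2), assuming +1 for a win, -1 for a loss, 0 for a tie."""
    if a1 == a2:
        return (0, 0)
    return (1, -1) if BEATS[a1] == a2 else (-1, 1)


RPS = {(a1, a2): rps_payoff(a1, a2) for a1 in ACTIONS for a2 in ACTIONS}

# Every cell sums to zero, so the game is zero-sum.
print(all(sum(payoffs) == 0 for payoffs in RPS.values()))
```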
Dominant strategies
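Now let’s assume that the weather is a conscious, self-interested agent. Each cell lists my payoff first, then the weather’s:
                 Rain       No rain
No umbrella      –5, 5      10, –10
Umbrella         0, 0       2, –2
Weather: “Column 2 (no rain) always gives me less utility than column 1 (rain). There is no point in choosing it!”
Me: “Surely the weather will choose to rain, so I had better take my umbrella!”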
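The dominance argument can be checked mechanically. This is a sketch: the row and column labels are an inferred reading of the matrix, and the helper names are invented for the example.

```python
# (my payoff, weather's payoff) for each (my action, weather's action).
GAME = {
    ("no umbrella", "rain"): (-5, 5),
    ("no umbrella", "no rain"): (10, -10),
    ("umbrella", "rain"): (0, 0),
    ("umbrella", "no rain"): (2, -2),
}
MY_ACTIONS = ("no umbrella", "umbrella")
WEATHER_ACTIONS = ("rain", "no rain")


def column_strictly_dominates(a, b):
    """True if the weather does strictly better with a than with b, whatever I do."""
    return all(GAME[(row, a)][1] > GAME[(row, b)][1] for row in MY_ACTIONS)


def row_strictly_dominates(a, b):
    """True if I do strictly better with a than with b, whatever the weather does."""
    return all(GAME[(a, col)][0] > GAME[(b, col)][0] for col in WEATHER_ACTIONS)


def my_best_response(weather_action):
    """My payoff-maximizing action against a fixed weather action."""
    return max(MY_ACTIONS, key=lambda row: GAME[(row, weather_action)][0])


print(column_strictly_dominates("rain", "no rain"))   # weather's dominant column
print(my_best_response("rain"))                       # my reply to it
```

Note that I have no dominant row myself; my choice follows only after eliminating the weather's dominated column.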
Nash Equilibrium
          C1         C2
R1        2, –2      0, 0
R2        –1, 1      4, –4
No dominant strategies:
R: “If I choose R1, C is better off with C2; if I choose R2, C is better off with C1. C will consider the worst-case scenario and choose C1. So I must choose R1!”
C: “R will choose R1, so I must choose C2!”
R: “C will choose C2, so I must choose R2!”
C: “R will choose R2, so I must choose C1!”
The reasoning goes around in a circle, so no pair of pure strategies is stable.
Solution: play the mixed-strategy Nash equilibrium, i.e., choose R1 with probability p = 5/7 and R2 with probability 1 – p = 2/7
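One way to arrive at 5/7 is the indifference condition: R mixes so that C gets the same expected payoff from C1 and C2. The sketch below applies the resulting closed form for a 2x2 zero-sum game; it assumes the game has no pure-strategy equilibrium, and `mixed_equilibrium_2x2` is a name chosen for this example.

```python
from fractions import Fraction


def mixed_equilibrium_2x2(a):
    """Mixed equilibrium of a 2x2 zero-sum game without a saddle point.

    a[i][j] is the row player's payoff. Returns (p, q, value), where p is the
    probability of row 1, q the probability of column 1, and value the row
    player's expected payoff.
    """
    (a11, a12), (a21, a22) = a
    denom = a11 - a12 - a21 + a22
    if denom == 0:
        raise ValueError("indifference equations have no unique solution")
    p = Fraction(a22 - a21, denom)   # makes the column player indifferent
    q = Fraction(a22 - a12, denom)   # makes the row player indifferent
    if not (0 < p < 1 and 0 < q < 1):
        raise ValueError("game has a pure-strategy equilibrium; formula does not apply")
    value = Fraction(a11 * a22 - a12 * a21, denom)
    return p, q, value


# Row player's payoffs: rows R1, R2; columns C1, C2.
A = [[2, 0],
     [-1, 4]]
p, q, value = mixed_equilibrium_2x2(A)
print(p, q, value)
```

The same condition applied to the row player gives the column player's mix q and the value of the game.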
Non-zero-sum games
          C             D
C         1, 1          –10, 10
D         10, –10       –100, –100
The game of chicken
“Yank your steering wheel off and throw it out of the window” (H. Kahn)
- Need to make sure that your opponent sees this
- If your opponent uses the same strategy, you have a problem (you are both killed)
What is the “rational” decision?
- Choose the maximin strategy: (C, C). Not an equilibrium
- Choose a pure equilibrium: (C, D) or (D, C). Not symmetric
- Choose the mixed equilibrium: C with p = 10/11, D with p = 1/11. The expected payoff is 0, but the non-equilibrium profile (C, C) gives 1
There is no satisfactory answer as to which strategy is the “most rational”!
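The three candidate answers can be reproduced with a short script. This is a sketch, with function names invented here and with C read as "swerve" and D as "drive straight", which is an interpretation rather than something the slide states.

```python
from fractions import Fraction
from itertools import product

ACTIONS = ("C", "D")

# (row payoff, column payoff) for each (row action, column action).
CHICKEN = {
    ("C", "C"): (1, 1),
    ("C", "D"): (-10, 10),
    ("D", "C"): (10, -10),
    ("D", "D"): (-100, -100),
}


def maximin_action(game):
    """Row player's action with the best worst-case payoff."""
    return max(ACTIONS, key=lambda r: min(game[(r, c)][0] for c in ACTIONS))


def pure_nash_equilibria(game):
    """Profiles where neither player gains by deviating unilaterally."""
    result = []
    for r, c in product(ACTIONS, repeat=2):
        u_row, u_col = game[(r, c)]
        row_ok = all(game[(r2, c)][0] <= u_row for r2 in ACTIONS)
        col_ok = all(game[(r, c2)][1] <= u_col for c2 in ACTIONS)
        if row_ok and col_ok:
            result.append((r, c))
    return result


def symmetric_mixed_equilibrium(game):
    """Probability q of C that leaves the opponent indifferent, and the payoff."""
    cc, cd = game[("C", "C")][0], game[("C", "D")][0]
    dc, dd = game[("D", "C")][0], game[("D", "D")][0]
    # Solve q*cc + (1 - q)*cd = q*dc + (1 - q)*dd for q.
    q = Fraction(dd - cd, cc - cd - dc + dd)
    return q, q * cc + (1 - q) * cd


print(maximin_action(CHICKEN))
print(pure_nash_equilibria(CHICKEN))
print(symmetric_mixed_equilibrium(CHICKEN))
```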
Non-zero-sum games
          S2        T2
S1        1, 0      0, 1
T1        0, 0      –x, –y
The deterrence game
Model: Cuban Missile Crisis
S1: do not attack Cuba
T1: attack Cuba
S2: remove Russian missiles
T2: keep Russian missiles
How to induce Cuba to choose S2?
Send a threat: “If you choose T2, I am going to choose T1”
Assume that Cuba will defy the threat with probability p. Then a “solution” (T.C. Schelling, 1960) is:
Is it a credible threat or a bluff?
It is worthwhile to threaten with some probability π so that:
Real world implementation
Assume π = 1/6. The problem is:
One doesn’t release 1/6 of a nuclear war!
One either launches a full-blown nuclear attack or none at all
The chances of “winning” are the same as in Russian roulette
So the question reduces to how rational it is to play Russian roulette
The Prisoner’s Dilemma
The TCP user’s game is more commonly called the Prisoner’s Dilemma
Scenario: two prisoners are in separate rooms
For each prisoner, the police have enough evidence for a 1 year prison sentence
They want to get enough evidence for a 4 year prison sentence
They tell each prisoner,
• “If you testify against the other prisoner, we’ll reduce your prison sentence by 1 year”
C = Cooperate (with the other prisoner): refuse to testify
D = Defect: testify against the other prisoner
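Both prisoners cooperate => both stay in prison for 1 year
Both prisoners defect => both stay in prison for 4 – 1 = 3 years
One defects, the other cooperates => the defector goes free (1 – 1 = 0 years), the cooperator stays in prison for 4 years
Payoff = negative of the prison sentence in years:
          C           D
C         –1, –1      –4, 0
D         0, –4       –3, –3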
The paradox: (D, D) is a dominant-strategy equilibrium (for example, for every strategy of the column player, the row player prefers D to C). But (C, C) gives both players a bigger payoff.
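Both halves of the paradox can be verified directly from the payoff matrix. The helper names below are illustrative.

```python
ACTIONS = ("C", "D")

# (row payoff, column payoff): the negative of each prisoner's sentence in years.
PD = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-4, 0),
    ("D", "C"): (0, -4),
    ("D", "D"): (-3, -3),
}


def row_strictly_prefers(game, a, b):
    """True if the row player does strictly better with a than with b in every column."""
    return all(game[(a, col)][0] > game[(b, col)][0] for col in ACTIONS)


def pareto_dominates(game, x, y):
    """True if profile x is at least as good as y for everyone and better for someone."""
    pairs = list(zip(game[x], game[y]))
    return all(ux >= uy for ux, uy in pairs) and any(ux > uy for ux, uy in pairs)


print(row_strictly_prefers(PD, "D", "C"))            # D dominates C for the row player
print(pareto_dominates(PD, ("C", "C"), ("D", "D")))  # yet (C, C) is better for both
```

The game is symmetric, so the same dominance check holds for the column player.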
Backward Induction
To find subgame-perfect equilibria, we can use backward induction
Identify the equilibria in the bottom-most nodes
Assume they’ll be played if the game ever reaches these nodes
For each node x, recursively compute a vector $v_x = (v_{x1}, \ldots, v_{xn})$ that gives every agent’s equilibrium utility
At each node x,
• If i is the agent to move, then i’s equilibrium action is to move to a child y of x for which i’s equilibrium utility $v_{yi}$ is highest
• Thus $v_x = v_y$
[Figure: game tree with edges labeled A–H. Backward-induction values: (3,8) at the root, where agent 1 moves; (3,8) and (2,10) at the two nodes where agent 2 moves; (2,10) at the lower node where agent 1 moves. Leaf payoffs: (3,8), (8,3), (5,5), (2,10), (1,0)]
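The recursive procedure can be sketched as follows. The tree encoding (nested dictionaries) and the function names are choices made for this example, and the tree's shape is an assumed reconstruction that matches the edge labels and values in the figure. Ties are broken in favor of the first child examined, which is also an assumption.

```python
def leaf(*payoff):
    """Terminal node holding one payoff per agent."""
    return {"payoff": payoff}


def node(agent, **children):
    """Decision node: `agent` is a 0-based index, children are keyed by move label."""
    return {"agent": agent, "children": children}


def backward_induction(x):
    """Return (v_x, path): the equilibrium utility vector at x and the moves played."""
    if "payoff" in x:
        return x["payoff"], []
    i = x["agent"]
    best = None
    for move, y in x["children"].items():
        v_y, path = backward_induction(y)
        # Agent i moves to the child with the highest equilibrium utility for i.
        if best is None or v_y[i] > best[0][i]:
            best = (v_y, [move] + path)
    return best


# Assumed tree: agent 1 (index 0) moves at the root, agent 2 (index 1) next.
TREE = node(
    0,
    A=node(1, C=leaf(3, 8), D=leaf(8, 3)),
    B=node(1, E=leaf(5, 5), F=node(0, G=leaf(2, 10), H=leaf(1, 0))),
)

value, path = backward_induction(TREE)
print(value, path)
```

Each node's vector is copied from the chosen child, which is exactly the "$v_x = v_y$" step above.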
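A Problem with Backward Induction
The Centipede Game
Two possible moves: C (continue) and S (stop)
Agent 1 makes the first move
At each terminal node, the payoffs are as shown
[Figure: the Centipede Game tree. Agents 1 and 2 alternate moves; the terminal payoffs shown along the chain are (1,1), (0,3), (2,2), …, (99,99), (98,101), and (100,100) at the far end]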
Can extend this game to any length
The payoffs are constructed in such a way that for each agent, the only SPE is always to choose S
This equilibrium isn’t intuitively appealing
Seems unlikely that an agent would choose S near the start of the game
If the agents continue the game for several moves, they’ll both get higher payoffs
In lab experiments, subjects continue to choose C until close to the end of the game
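To see why stopping immediately is the only subgame-perfect outcome, backward induction can be run over the whole chain. In this sketch the payoff pattern between the values visible in the figure, (j, j) when agent 1 stops and (j – 1, j + 2) when agent 2 stops, is an assumption that agrees with the visible ends of the chain; the function names are made up.

```python
def centipede_payoffs(rounds):
    """Stop payoffs in move order, plus the payoff if nobody ever stops."""
    stops = []
    for j in range(1, rounds + 1):
        stops.append((j, j))          # agent 1 stops at its j-th move
        stops.append((j - 1, j + 2))  # agent 2 stops at its j-th move
    return stops, (rounds + 1, rounds + 1)


def solve_centipede(stops, final):
    """Backward induction along the chain; returns (outcome, plan per decision node)."""
    value = final
    plan = []
    for k in range(len(stops) - 1, -1, -1):
        mover = k % 2                 # 0 = agent 1, 1 = agent 2
        if stops[k][mover] > value[mover]:
            value = stops[k]          # stopping beats what continuing would yield
            plan.append("S")
        else:
            plan.append("C")
    plan.reverse()
    return value, plan


stops, final = centipede_payoffs(99)
outcome, plan = solve_centipede(stops, final)
print(outcome, set(plan))
```

Under these payoffs every node's plan is S, so the game ends at the first move even though continuing to the end would give both agents far more.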
Suppose agent 1 chooses C
If you’re agent 2, what do you do?
SPE analysis says you should choose S
But SPE analysis also says you should never have gotten here at all
How to amend your beliefs and course of action based on this event?
Fundamental problem in game theory
Differing accounts of it, depending on
• the probabilistic assumptions made
• what is common knowledge (whether there is common knowledge of rationality)
• how to revise our beliefs in the face of an event with probability 0
Backward Induction in Zero-Sum Games
Backward induction works much better in zero-sum games
No zero-sum version of the Centipede Game, because we can’t have increasing payoffs for both players
Only need one number: agent 1’s payoff (= negative of agent 2’s payoff)
Propagate agent 1’s payoff up to the root
At each node where it’s agent 1’s move,
the value is the maximum of the labels of its children
At each node where it’s agent 2’s move,
the value is the minimum of the labels of its children
The root’s label is the value of the game (from the Minimax Theorem)
In practice, it may not be possible to generate the entire game tree
E.g., extensive-form representation of chess has about $10^{150}$ nodes
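The max/min propagation can be written as a short recursion. This is a sketch that assumes the agents alternate strictly and that the whole tree fits in memory; the nested-list encoding and the example tree are hypothetical.

```python
def minimax(node, agent1_to_move=True):
    """Value of a zero-sum game tree, expressed as agent 1's payoff.

    A leaf is a number (agent 1's payoff); an internal node is a list of children.
    """
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not agent1_to_move) for child in node]
    # Agent 1 maximizes its payoff; agent 2 minimizes it.
    return max(values) if agent1_to_move else min(values)


# Hypothetical tree: agent 1 moves at the root, agent 2 at the next level, and so on.
TREE = [[3, 8], [5, [2, 1]]]
print(minimax(TREE))
```

For trees too large to generate in full, practical programs cut the recursion off at some depth and substitute an estimate for the leaf value, which is outside the scope of this sketch.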
Summary
Basic concepts:
payoffs, pure strategies, mixed strategies
Some classifications of games based on their payoffs
Zero-sum
• Roshambo, Matching Pennies
Non-zero-sum
• Prisoner’s Dilemma, Game of chicken