Academic year: 2022

(1)

Game Theory

Dmitry Potapov CERN

1. Introduction

(2)

What is Game Theory?

Game theory is about interactions among agents that are self-interested

I’ll use “agent” and “player” synonymously

Self-interested:

Each agent has its own description of what states are desirable

Generally model this using utility theory

Utility function: maps each state of the world to a real number

• how much an agent likes that state

(3)

Example: TCP Users

Internet traffic is governed by the TCP protocol

TCP’s backoff mechanism

If the rate at which you’re sending packets causes congestion, reduce the rate until congestion subsides

Suppose that

You’re trying to finish an important project

• It’s extremely important for you to have a fast connection

Only one other person is using the Internet

• That person wants a fast connection just as much as you do

You each have 2 possible actions:

C (use a correct implementation)

D (use a defective implementation that won’t back off)

(4)

Action Profiles and their Payoffs

An action profile is a choice of action for each agent

You both use C => average packet delay is 1 ms

You both use D => average delay is 3 ms (router overhead)

One of you uses D, the other uses C:

• D user’s delay is 0 ms

• C user’s delay is 4 ms

Payoff matrix:

Your options are the rows

The other agent’s options are the columns

Each cell = an action profile

1st number in the cell is your payoff or utility (I’ll use those terms synonymously)

• In this case, the negative of your delay

          C          D
C      –1, –1     –4, 0
D       0, –4     –3, –3
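As a small illustration, the payoff matrix can be derived mechanically from the delays listed on this slide. This is only a sketch: the dictionary layout and the helper name `format_matrix` are choices made here for illustration, not part of the original slides.

```python
# Average packet delays in ms, taken from the slide.
# Keys are (your action, other agent's action); values are (your delay, other's delay).
delays_ms = {
    ("C", "C"): (1, 1),
    ("C", "D"): (4, 0),
    ("D", "C"): (0, 4),
    ("D", "D"): (3, 3),
}

# Payoff (utility) is the negative of the delay.
payoffs = {profile: tuple(-d for d in delays) for profile, delays in delays_ms.items()}


def format_matrix(payoffs, actions=("C", "D")):
    """Render the payoff matrix as text: rows are your actions, columns the other agent's."""
    lines = ["      " + "".join(f"{a:>10}" for a in actions)]
    for r in actions:
        cells = "".join(f"{str(payoffs[(r, c)]):>10}" for c in actions)
        lines.append(f"{r:<6}{cells}")
    return "\n".join(lines)


print(format_matrix(payoffs))
```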

(5)

Some questions

Examples of the kinds of questions game theory attempts to answer:

Which action should you use: C or D?

Does it depend on what you think the other person will do?

What kind of behavior can the network operator expect?

Would any two users behave the same?

Will this change if users can communicate with each other beforehand?

Under what changes to the delays would the users’ decisions still be the same?

How would you behave if you knew you would face this situation repeatedly with the same person?

          C          D
C      –1, –1     –4, 0
D       0, –4     –3, –3

(6)

Some Fields where Game Theory is Used

Economics

Auctions

Markets

Bargaining

Fair division

Social networks

(7)

Some Fields where Game Theory is Used

Government and Politics

Voting systems

Negotiations

International relations

War

Human rights

A trench in World War 1:

(8)

Some Fields where Game Theory is Used

Evolutionary Biology

Communication

Population ratios

Territoriality

Altruism

Parasitism, symbiosis

Social behavior

(9)

Some Fields where Game Theory is Used

Computer Science

Artificial Intelligence

Multi-agent systems

Computer networks

Robotics

(10)

Some Fields where Game Theory is Used

Engineering

Communication networks

Control systems

Road networks

(11)

Expected utility maximization

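Games against nature

Nature is considered an impartial agent (may be represented by U = const)

                 Rain    No rain
Umbrella           0        2
No umbrella       –5       10

Expected utility of an action: $U = \sum_{i=1}^{N} p_i U_i$

Assume that $p_1 = 0.7$ (rain) and $p_2 = 0.3$ (no rain). Then:

U(umbrella) = (0)(0.7) + (2)(0.3) = 0.6

U(no umbrella) = (–5)(0.7) + (10)(0.3) = –0.5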

It is rational to take an umbrella!
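The calculation above can be checked with a few lines of code. This is a minimal sketch; the function name `expected_utility` and the labels "umbrella" / "no umbrella" are assumptions chosen here (the slide only states that taking the umbrella is the rational choice).

```python
def expected_utility(probabilities, utilities):
    """Expected utility U = sum_i p_i * U_i over the states of nature."""
    return sum(p * u for p, u in zip(probabilities, utilities))


# Probabilities of the two states of nature, as given on the slide.
p = [0.7, 0.3]

# Utilities of each action in the two states (labels are assumed, values are from the slide).
actions = {
    "umbrella": [0, 2],
    "no umbrella": [-5, 10],
}

utilities = {name: expected_utility(p, u) for name, u in actions.items()}
best_action = max(utilities, key=utilities.get)

for name, value in utilities.items():
    print(f"{name}: {value:.2f}")
print("rational choice:", best_action)
```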

(12)

Zero-sum Games

These games are purely competitive

Constant-sum game:

For every action profile, the sum of the payoffs is the same, i.e., there is a constant c such that for every action profile $(a_1, \dots, a_n)$, $u_1(a_1, \dots, a_n) + \dots + u_n(a_1, \dots, a_n) = c$

Any constant-sum game can be transformed into an equivalent game in which the sum of the payoffs is always 0

Just subtract c/n from every payoff

Thus constant-sum games are usually called zero-sum games
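The transformation can be sketched as follows. The example game (two agents whose payoffs always sum to 10) is hypothetical and only serves to exercise the function; `to_zero_sum` is a name chosen here.

```python
def to_zero_sum(game):
    """Subtract c/n from every payoff of a constant-sum game.

    `game` maps action profiles to payoff tuples (one entry per agent).
    Raises ValueError if the payoffs do not sum to the same constant everywhere.
    """
    sums = {sum(payoffs) for payoffs in game.values()}
    if len(sums) != 1:
        raise ValueError("not a constant-sum game")
    c = sums.pop()
    n = len(next(iter(game.values())))
    return {
        profile: tuple(u - c / n for u in payoffs)
        for profile, payoffs in game.items()
    }


# Hypothetical 2-agent game in which the payoffs always sum to c = 10.
constant_sum_game = {
    ("a", "x"): (7, 3),
    ("a", "y"): (4, 6),
    ("b", "x"): (10, 0),
    ("b", "y"): (5, 5),
}

zero_sum_game = to_zero_sum(constant_sum_game)
for profile, payoffs in zero_sum_game.items():
    print(profile, payoffs)
```

Subtracting the same amount from all of an agent's payoffs does not change which outcomes that agent prefers, which is why the two games are treated as equivalent.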

(13)

Examples

Matching Pennies

Two agents, each has a penny

Each independently chooses to display Heads or Tails

• If same, agent 1 gets both pennies

• Otherwise agent 2 gets both pennies

Rock, Paper, Scissors (Roshambo)

3-action generalization of matching pennies

• If both choose same, no winner

• Otherwise, paper beats rock, rock beats scissors, scissors beats paper

Matching Pennies payoff matrix (agent 1 chooses the row, agent 2 the column):

           Heads      Tails
Heads     1, –1      –1, 1
Tails    –1, 1        1, –1
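The Rock, Paper, Scissors payoffs can be generated from the "beats" relation and checked to be zero-sum. This is a sketch under one assumption: a win is worth 1 and a loss –1, by analogy with Matching Pennies (the slide does not give numeric payoffs for Roshambo).

```python
# Each key beats the value it maps to (relation taken from the slide).
BEATS = {"paper": "rock", "rock": "scissors", "scissors": "paper"}
ACTIONS = tuple(BEATS)


def rps_payoffs(a1, a2):
    """Payoffs (agent 1, agent 2); assumed values: win = 1, loss = -1, tie = 0."""
    if a1 == a2:
        return (0, 0)
    return (1, -1) if BEATS[a1] == a2 else (-1, 1)


rps_game = {(a1, a2): rps_payoffs(a1, a2) for a1 in ACTIONS for a2 in ACTIONS}

# A game is zero-sum if the payoffs in every action profile add up to 0.
is_zero_sum = all(sum(payoffs) == 0 for payoffs in rps_game.values())
print("zero-sum:", is_zero_sum)
```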

(14)

Dominant strategies

Now let’s assume that the weather is a conscious, self-interested agent

                 Rain      No rain
Umbrella         0, 0       2, –2
No umbrella     –5, 5      10, –10

Weather: “Column 2 always gives me less utility than column 1. There is no use in my choosing it!”

Me: “Surely Weather will choose to rain, so I had better take my umbrella!”
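The dominance argument can be sketched in code. The row and column labels ("umbrella", "no umbrella", "rain", "no rain") are assumptions inferred from the slide's commentary; the payoff values are the ones shown, with my payoff first and Weather's second.

```python
ROWS = ("umbrella", "no umbrella")   # my actions (labels assumed)
COLS = ("rain", "no rain")           # Weather's actions (labels assumed)

# (my payoff, Weather's payoff) for each action profile.
game = {
    ("umbrella", "rain"): (0, 0),
    ("umbrella", "no rain"): (2, -2),
    ("no umbrella", "rain"): (-5, 5),
    ("no umbrella", "no rain"): (10, -10),
}


def column_strictly_dominates(game, col_a, col_b):
    """True if the column player does strictly better with col_a than col_b in every row."""
    return all(game[(r, col_a)][1] > game[(r, col_b)][1] for r in ROWS)


def best_row_against(game, col):
    """Row player's best response to a fixed column."""
    return max(ROWS, key=lambda r: game[(r, col)][0])


rain_dominates = column_strictly_dominates(game, "rain", "no rain")
my_choice = best_row_against(game, "rain")
print("rain dominates no rain:", rain_dominates)
print("my best response to rain:", my_choice)
```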

(15)

Nash Equilibrium

          C1         C2
R1      2, –2      0, 0
R2     –1, 1       4, –4

No dominant strategies:

R: “If I choose R1, C is better off with C2; if I choose R2, with C1. C will consider the worst-case scenario and will choose C1. So I must choose R1!”

C: “R will choose R1, so I must choose C2!”

R: “C will choose C2, so I must choose R2!”

C: “R will choose R2, so I must choose C1!”

Solution: use the Nash equilibrium in mixed strategies: choose R1 with probability p = 5/7 and R2 with probability 1 – p = 2/7
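The probability 5/7 follows from the indifference condition: R mixes so that C gets the same expected payoff from C1 and C2. The sketch below solves that condition for a 2×2 game. It assumes the payoff matrix has R1 = (2, –2), (0, 0) and R2 = (–1, 1), (4, –4), which is the arrangement consistent with the reasoning on the slide, and it assumes an interior mixed equilibrium exists (non-zero denominators). The function name is chosen here for illustration.

```python
from fractions import Fraction


def mixed_equilibrium_2x2(a, b):
    """Solve the indifference conditions of a 2x2 game.

    a[i][j] is the row player's payoff and b[i][j] the column player's payoff
    when the row player picks row i and the column player picks column j.
    Returns (p, q): the probability of row 0 and the probability of column 0.
    Assumes an interior mixed equilibrium exists.
    """
    # p makes the column player indifferent between the two columns.
    p = Fraction(b[1][1] - b[1][0], b[0][0] - b[1][0] - b[0][1] + b[1][1])
    # q makes the row player indifferent between the two rows.
    q = Fraction(a[1][1] - a[0][1], a[0][0] - a[0][1] - a[1][0] + a[1][1])
    return p, q


# Row player's and column player's payoffs (rows R1, R2; columns C1, C2).
a = [[2, 0], [-1, 4]]
b = [[-2, 0], [1, -4]]

p, q = mixed_equilibrium_2x2(a, b)
print("P(R1) =", p, " P(R2) =", 1 - p)
print("P(C1) =", q, " P(C2) =", 1 - q)
```

The same call also returns the column player's equilibrium mix, which the slide does not state.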

(16)

Nash Equilibrium

          C1         C2
R1      2, –2      0, 0

(17)

Non-zero-sum games

The game of chicken

           C             D
C        1, 1        –10, 10
D       10, –10     –100, –100

What is the “rational” decision?

“Yank your steering wheel off and throw it out of the window” (H. Kahn)

- Need to make sure that your opponent sees this

- If your opponent uses the same strategy, you have a problem (you are both killed)

Choose the maximin strategy: (C, C). Not an equilibrium

Choose a pure equilibrium: (C, D) or (D, C). Not symmetric

Choose the mixed equilibrium: C with p = 10/11, D with p = 1/11. Average payoff is 0, but the non-equilibrium (C, C) gives 1

No satisfactory answer as to which is the “most rational” strategy!
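The three candidate answers can be reproduced numerically. The sketch below assumes the payoff matrix (C, C) = (1, 1), (C, D) = (–10, 10), (D, C) = (10, –10), (D, D) = (–100, –100), which matches the numbers quoted on the slide; all function names are chosen here for illustration.

```python
from fractions import Fraction

ACTIONS = ("C", "D")
chicken = {
    ("C", "C"): (1, 1),
    ("C", "D"): (-10, 10),
    ("D", "C"): (10, -10),
    ("D", "D"): (-100, -100),
}


def maximin_action(game):
    """Row player's action with the best worst-case payoff."""
    return max(ACTIONS, key=lambda a: min(game[(a, o)][0] for o in ACTIONS))


def pure_equilibria(game):
    """Action profiles from which neither player gains by deviating alone."""
    result = []
    for r in ACTIONS:
        for c in ACTIONS:
            row_ok = all(game[(r, c)][0] >= game[(r2, c)][0] for r2 in ACTIONS)
            col_ok = all(game[(r, c)][1] >= game[(r, c2)][1] for c2 in ACTIONS)
            if row_ok and col_ok:
                result.append((r, c))
    return result


def mixed_probability_of_c(game):
    """Probability q of C that makes the opponent indifferent between C and D."""
    cc, cd = game[("C", "C")][0], game[("C", "D")][0]
    dc, dd = game[("D", "C")][0], game[("D", "D")][0]
    return Fraction(dd - cd, cc - cd - dc + dd)


q = mixed_probability_of_c(chicken)
# Expected payoff of playing C against an opponent who plays C with probability q.
mixed_payoff = q * chicken[("C", "C")][0] + (1 - q) * chicken[("C", "D")][0]

print("maximin action:", maximin_action(chicken))
print("pure equilibria:", pure_equilibria(chicken))
print("mixed equilibrium: P(C) =", q, ", expected payoff =", mixed_payoff)
```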

(18)

Non-zero-sum games

The deterrence game

          S2         T2
S1       1, 0       0, 1
T1       0, 0      –x, –y

Model: Cuban Missile Crisis

S1: not attack Cuba

T1: attack Cuba

S2: remove Russian missiles

T2: keep Russian missiles

How to induce Cuba to choose S2?

Send a threat: “If you choose T2, I am going to choose T1”

Is it a credible threat or a bluff?

Assume that Cuba will defy the threat with probability p. Then a “solution” (T.C. Schelling, 1960) is:

It is worthwhile to threaten with some probability π so that: [formula not preserved in the extracted slide text]

(19)

The deterrence game

Real world implementation

Assume π = 1/6. The problem is:

One doesn’t release 1/6 of a nuclear war!

One either launches a full-blown nuclear attack or none at all

The chances of “winning” are the same as in Russian roulette

The question reduces to the question of how rational it is to play Russian roulette

(20)

The Prisoner’s Dilemma

The TCP user’s game is more commonly called the Prisoner’s Dilemma

Scenario: two prisoners are in separate rooms

For each prisoner, the police have enough evidence for a 1-year prison sentence

They want to get enough evidence for a 4-year prison sentence

They tell each prisoner,

• “If you testify against the other prisoner, we’ll reduce your prison sentence by 1 year”

C = Cooperate (with the other prisoner): refuse to testify

D = Defect: testify against the other prisoner

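Both prisoners cooperate => both stay in prison for 1 year

Both prisoners defect => both stay in prison for 4 – 1 = 3 years

One defects, the other cooperates => the defector’s sentence is 1 – 1 = 0 years, the cooperator stays in prison for 4 years

          C          D
C      –1, –1     –4, 0
D       0, –4     –3, –3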

(21)

Prisoner’s Dilemma

The paradox: (D, D) is an equilibrium in dominant strategies (for example, for every strategy of the column player, the row player prefers D to C). But (C, C) gives both players a higher payoff.

          C          D
C      –1, –1     –4, 0
D       0, –4     –3, –3
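Both halves of the paradox can be verified directly from the payoff matrix. This is a minimal sketch; the helper names are chosen here, and the payoffs are the Prisoner's Dilemma values used throughout the slides (row player's payoff first).

```python
ACTIONS = ("C", "D")
pd_game = {
    ("C", "C"): (-1, -1),
    ("C", "D"): (-4, 0),
    ("D", "C"): (0, -4),
    ("D", "D"): (-3, -3),
}


def strictly_dominant_row_action(game, actions):
    """Return the row action that beats every other row action against every column, or None."""
    for a in actions:
        if all(
            game[(a, o)][0] > game[(b, o)][0]
            for b in actions if b != a
            for o in actions
        ):
            return a
    return None


def pareto_dominates(game, profile_x, profile_y):
    """True if every player is strictly better off in profile_x than in profile_y."""
    return all(x > y for x, y in zip(game[profile_x], game[profile_y]))


dominant = strictly_dominant_row_action(pd_game, ACTIONS)
cc_better = pareto_dominates(pd_game, ("C", "C"), ("D", "D"))

print("row player's dominant action:", dominant)
print("(C, C) better for both than (D, D):", cc_better)
```

Because the game is symmetric, the same check applies to the column player.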

(22)

Backward Induction

To find subgame-perfect equilibria, we can use backward induction

Identify the equilibria in the bottom-most nodes

Assume they’ll be played if the game ever reaches these nodes

For each node x, recursively compute a vector $v_x = (v_{x1}, \dots, v_{xn})$ that gives every agent’s equilibrium utility

At each node x,

If i is the agent to move, then i’s equilibrium action is to move to a child y of x for which i’s equilibrium utility $v_{yi}$ is highest

Thus $v_x = v_y$

Example game tree (each decision node shows the agent to move and the backed-up utility vector):

Agent 1 (3,8)
  A: Agent 2 (3,8)
    C: (3,8)
    D: (8,3)
  B: Agent 2 (2,10)
    E: (5,5)
    F: Agent 1 (2,10)
      G: (2,10)
      H: (1,0)
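The recursion described on this slide can be sketched as below. The tree encoding (nested dictionaries with `player`, `children`, and `payoff` keys) is a choice made for this example, and the tree itself is an assumed reconstruction of the slide's figure from its node labels and payoffs. Ties are broken in favour of the first child listed, which is also an assumption.

```python
def backward_induction(node):
    """Return (utility vector, list of actions on the equilibrium path) for a game tree.

    A leaf is {"payoff": (u1, u2, ...)}.
    A decision node is {"player": i, "children": {action: subtree}}, with i zero-based.
    """
    if "payoff" in node:
        return node["payoff"], []
    player = node["player"]
    best = None
    for action, child in node["children"].items():
        value, path = backward_induction(child)
        # The agent to move picks the child with the highest own utility.
        if best is None or value[player] > best[0][player]:
            best = (value, [action] + path)
    return best


# Assumed reconstruction of the example tree (player 0 = agent 1, player 1 = agent 2).
tree = {
    "player": 0,
    "children": {
        "A": {"player": 1, "children": {
            "C": {"payoff": (3, 8)},
            "D": {"payoff": (8, 3)},
        }},
        "B": {"player": 1, "children": {
            "E": {"payoff": (5, 5)},
            "F": {"player": 0, "children": {
                "G": {"payoff": (2, 10)},
                "H": {"payoff": (1, 0)},
            }},
        }},
    },
}

value, path = backward_induction(tree)
print("equilibrium utilities:", value)
print("equilibrium path:", path)
```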

(23)

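The Centipede Game

• Two possible moves: C (continue) and S (stop)

• Agent 1 makes the first move

• At each terminal node, the payoffs are as shown

Game tree (figure): agents 1 and 2 alternate moves; the terminal payoffs for S, in order along the chain, are (1, 1), (0, 3), (2, 2), …, (99, 99), (98, 101); if both agents always choose C, the payoffs are (100, 100)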

(24)

A Problem with Backward Induction

The Centipede Game

Can extend this game to any length

The payoffs are constructed in such a way that for each agent, the only SPE is always to choose S

This equilibrium isn’t intuitively appealing

Seems unlikely that an agent would choose S near the start of the game

If the agents continue the game for several moves, they’ll both get higher payoffs

In lab experiments, subjects continue to choose C until close to the end of the game
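The "always choose S" result can be reproduced by running backward induction along the chain. The sketch below assumes a payoff pattern that extends the values visible in the figure: agent 1 stopping at its k-th node yields (k, k), agent 2 stopping at its k-th node yields (k – 1, k + 2), and if nobody ever stops both get `rounds + 1`. The function name and the `rounds` parameter are hypothetical.

```python
def centipede_spe(rounds=99):
    """Backward induction on a centipede game with `rounds` decision nodes per agent.

    Assumed payoffs: agent 1 stopping at its k-th node gives (k, k); agent 2 stopping
    at its k-th node gives (k - 1, k + 2); if nobody stops, the payoff is
    (rounds + 1, rounds + 1).
    Returns (equilibrium payoff, list of (k, agent 1's move, agent 2's move)).
    """
    value = (rounds + 1, rounds + 1)  # payoff if both agents always continue
    plan = []
    for k in range(rounds, 0, -1):
        # Agent 2's k-th node: stop only if that is strictly better for agent 2.
        stop2 = (k - 1, k + 2)
        move2 = "S" if stop2[1] > value[1] else "C"
        if move2 == "S":
            value = stop2
        # Agent 1's k-th node, which comes just before agent 2's k-th node.
        stop1 = (k, k)
        move1 = "S" if stop1[0] > value[0] else "C"
        if move1 == "S":
            value = stop1
        plan.append((k, move1, move2))
    plan.reverse()
    return value, plan


value, plan = centipede_spe()
print("equilibrium payoff:", value)
print("every move is S:", all(m1 == "S" and m2 == "S" for _, m1, m2 in plan))
```

Under these assumed payoffs, each agent gains one unit by stopping just before the other would, so the stopping decision unravels all the way back to the first move.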

(25)

A Problem with Backward Induction

Suppose agent 1 chooses C

If you’re agent 2, what do you do?

SPE analysis says you should choose S

But SPE analysis also says you should never have gotten here at all

How to amend your beliefs and course of action based on this event?

Fundamental problem in game theory

Differing accounts of it, depending on

• the probabilistic assumptions made

• what is common knowledge (whether there is common knowledge of rationality)

• how to revise our beliefs in the face of an event with probability 0

(26)

Backward Induction in Zero-Sum Games

Backward induction works much better in zero-sum games

No zero-sum version of the Centipede Game, because we can’t have increasing payoffs for both players

Only need one number: agent 1’s payoff (= negative of agent 2’s payoff)

Propagate agent 1’s payoff up to the root

At each node where it’s agent 1’s move,

the value is the maximum of the labels of its children

At each node where it’s agent 2’s move,

the value is the minimum of the labels of its children

The root’s label is the value of the game (from the Minimax Theorem)

In practice, it may not be possible to generate the entire game tree

E.g., the extensive-form representation of chess has about 10^150 nodes
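The max/min propagation can be sketched for a small tree. The encoding is an assumption made for this example: a leaf is a number (agent 1's payoff), an internal node is a list of children, and the agents strictly alternate moves starting with agent 1 at the root. The sample tree is hypothetical.

```python
def minimax(node, agent1_to_move=True):
    """Value of a zero-sum game tree from agent 1's point of view.

    A leaf is a number (agent 1's payoff); an internal node is a list of subtrees.
    Assumes the agents alternate moves at every level.
    """
    if isinstance(node, (int, float)):
        return node
    values = [minimax(child, not agent1_to_move) for child in node]
    # Agent 1 maximizes its payoff; agent 2 minimizes it.
    return max(values) if agent1_to_move else min(values)


# Hypothetical two-level tree: agent 1 moves at the root, agent 2 at the next level.
game_tree = [[3, 5], [2, 9]]

game_value = minimax(game_tree)
print("value of the game:", game_value)
```

For trees as large as the one mentioned above, this exhaustive recursion is not feasible, which is the point the slide makes.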

(27)

Summary

Basic concepts:

payoffs, pure strategies, mixed strategies

Some classifications of games based on their payoffs

Zero-sum

• Roshambo, Matching Pennies

Non-zero-sum

• Prisoner’s Dilemma, Game of chicken

Backward induction
