International Journal of Emerging Technology and Advanced Engineering
Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 6, Issue 4, April 2016)
295
The Study of Strategic form of Repeated Game Theory
V.Vinoba
1, P. Hema
21
K.N. Government Arts College, Tamil Nadu, India
2Assistant Professor, RMKCET
Abstract— In this paper, we consider a strategic form behavioural game theory. It is standard multi agent settings to assume that agents will adopt Nash equilibrium strategies. This paper gives a brief overview of game theory. Therefore in the first section we want to outline what game theory generally is and where it is applied. In the next section, we introduce some of the most important terms of Repeated game theory such as strategic form (or) normal form games, semi-extensive form and Nash equilibrium.
Keywords— Game theory, Semi-extensive form strategic form, Nash equilibrium
I. INTRODUCTION
A game is a description of strategic interaction that includes the constraints on the actions that the players can take and the player‟s interests, but does not specify the actions that the players do take. A solution is a systematic description of the outcomes that may emerge in a family of games. Game theory suggests reasonable solutions for classed of games and examines their properties. Nash equilibrium is one of the most basic concepts in game theory.
II. GAME THEORY
A game is made of three basic components: a set of players, a set of actions, and a set of preferences. These are collectively known as the rules of the game, and the modeller‟s objective is to describe a situation in terms of the rules of a game so as to explain what will happen in that situation. Trying to maximize their payoffs, the
player will devise plane known as strategies that pick
actions depending on the information that has arrived at each moment. The combination of strategies chosen by each player is known as the equilibrium. Given an equilibrium, the modeller can see what actions come out of the conjunction of all the players‟ plans, and this tells him the outcome of the game.
The number of players will be denoted by n . Let us
label the players with the integers 1 to n , and denote the
set of players by
N
1,2,...
n
. We assumethroughout that there are atleast two players, that is
n
2
. There are three main mathematical models or formsused in the study of games,(i) the extensive form (ii) the strategic or normal form and (iii) the coalitional form.
In the strategic form, many of the details of the game such as position and move are lost; the main concepts are those of a strategy and a payoff. In the strategic form, each player chooses a strategy from a set of possible strategies
We denote the strategy set or action space of player i
by Each player considers all the other players and their possible strategies, and then chooses a specific strategy from his strategy set. All players make such a choice simultaneously, the choices are revealed and the game ends with each player receiving some payoff. Each player‟s choice may influence the final outcome for all players. We model the payoffs as taking as numerical values. The mathematical and philosophical justification behind the assumption that each player can replace such
payoffs with numerical values is said to be utility theory.
We therefore assume that each player receives a numerical payoff that depends on the actions chosen by all the players.
The strategic form of a game is defined by the three objects:
(i) the set,
N
1,2,...
n
of players.(ii)the sequence A1,A2,…….An of strategy set of players ,and
(iii)the sequence f1(a1,…..an)……..fn(a1……an) of real valued
payoff functions of the players.
A game in strategic form is said to be zero-sum if the sum of the payoffs to the players is zero no matter what actions are by the players.
International Journal of Emerging Technology and Advanced Engineering
Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 6, Issue 4, April 2016)
296
We will develop a useful formalism, the semi-extensive form, for analysing repeated games, i.e. those which are repetitions of the same one-shot game (called the stage game). We will describe strategies for such repeated games as sequences of history-dependent stage-game strategies. The payoffs to the players in this repeated game will be functions of the stage-game payoffs. We will define the concept of Nash equilibrium and—after identifying the sub games in this formalism—the concept of a sub game-perfect equilibrium for a repeated game.
We will prove one result which applies to both finitely and infinitely repeated games, which makes it easy to identify at least some of the Nash equilibrium of a repeated game: any sequence of stage-game Nash equilibrium is a sub game-perfect repeated-game equilibrium.
We will then focus our attention on finitely-repeated games. We will exploit the existence of a final period— which wouldn‟t exist in an infinitely-repeated game—to establish a necessary condition for a repeated-game strategy profile to be a Nash equilibrium: the last period‟s play must be Nash on the equilibrium path.
We then turn to sub game perfection within the finitely-repeated game context and prove two theorems about sub game-perfect equilibrium. The first is a necessity theorem which is much stronger than the one we obtained for Nash equilibrium: the last period‟s play must be Nash even off the equilibrium path. The second pertains to the special case in which the stage game has a unique Nash equilibrium payoff.1 In this case the sub game-perfect equilibrium of the repeated game are any period-by-period repetitions of the stage-game Nash equilibrium.
Consider a game G (which we‟ll call the stage game or
the constituent game). As usual we let the player set be I={1,…,n}. In our present repeated-game context it will be clarifying to refer to a player‟s stage game choices as
actions rather than strategies. (We‟ll reserve “strategy” for
choices in the repeated game.) So each player has a
pure-action space Ai. The space of action profiles is
i i
i
A
X
A
. Each player has a von Neumann-Morgensternutility function defined over the outcomes of G,
R
A
g
i:
. Here we use “U” for the payoff to theentire repeated game.)
Let G be played several times (perhaps an infinite number of times) and award each player a payoff which is the sum of the payoffs she got in each period from playing G. Then this sequence of stage games is itself a game: a
repeated game or a super game.
Two statements are implicit when we say that in each
period we‟re playing the same stage game: a For each
player the set of actions available to her in any period in the game G is the same regardless of which period it is and regardless of what actions have taken place in the past.
The payoffs to the players from the stage game in any period depend only on the action profile for G which was played in that period, and this stage-game payoff to a player for a given action profile for G is independent of which period it is played.
Statements a and b are for our repeated game is
stationary (or, alternatively, independent of time and
history). This does not mean the actions themselves must
be chosen independently of time or history.
Then we interpret a and b above as saying that the payoff matrix is the same in every period.
We make the typical “observable action” or “standard signaling” assumption that the play which occurred in each repetition of the stage game is revealed to all the players before the next repetition. Therefore even if the stage game is one of imperfect information (as it is in simultaneous-move games)—so that during the stage game one of the players doesn‟t know what the others are doing/have done
that period—each player does learn what the others did
before another round is played. This allows subsequent choices to be conditioned on the past actions of other players.
IV. HISTORY AND PERIOD-TSTAGE-GAME STRATEGIES
We let the first period be labeled t=0. The last period, if
one exists, is period T, so we have a total of T+1 periods in
our game. We allow the case where T=
, i.e. we can havean infinitely repeated game.
We‟ll refer to the action of the stage game G which player i executes in period t as
a
itThe action profile played in period t is just the n-tuple of
individuals‟ stage-game actions
t
n t
t
t
a
a
a
a
1,
2,
...
…………. (1)We want to be able to condition the players‟ stage-game action choices in later periods upon actions taken earlier by
other players. To do this we need the concept of a history: a
description of all the actions taken up through the previous
period. We define the history at time t to be
0 1 1
...
,
,
tt
a
a
a
International Journal of Emerging Technology and Advanced Engineering
Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 6, Issue 4, April 2016)
297
In other words, the history at time t specifies which stage-game action profile (i.e., combination of individual stage-game actions) was played in each previous period.
Note that the specification of
h
tincludes within it aspecification of all previous histories
h
0,
h
1,
...
h
t1
.For example, the history
h
t is just the concatenation of1 t
h
with the action profilea
t1; i.e.h
t=(h
t1 ,a
t1 ) The history of the entiregame is
h
T1
a
0,
a
1,
...
a
T
Note also that the set of all possible historiesh
t at time t is justA
X
A
t
t
t 1
0
……… (3)To condition our strategies on past events, then, is to
make them functions of history. So we write player i‟s
period-t stage-game strategy as the function
s
it, wheret i
a
=s
it(h
t) is the stage-game action would play in periodt if the previous play had followed the history
h
t. Aplayer‟s stage-game action in any period and after any history must be drawn from her action space for that period, but because the game is stationary her stage-game
action space Ai does not change with time. Therefore we
write:
it t i t t
A
h
s
A
h
t
I
i
(
)
. Alternatively, wecan write
i
I
t
s
it:A
it
A
iThe period-t stage game strategy profile
s
tis
t
n t t t
s
s
s
s
1,
2...
This profile can be described by
s
it:s
t;
A
t
A
i.e)
(
),
(
)...
.
(
)
)
(
,
t t 1t t 2t t tn tt t
h
s
h
s
h
s
h
s
A
h
V. STRATEGIES IN THE REPEATED GAME
So far we have been referring to stage-game strategies for a particular period. Now we can write, using these stage-game entities as building blocks, a specification for a
player‟s strategy for the repeated game. We write player i’s
strategy for the repeated game as
……….(5)
i.e. a (T+1)-tuple of history-contingent player-i
stage-game strategies. Each
s
it takes a history ash
t
A
t itsargument. The space Si of player-i repeated-game strategies
is the set of all such (T+1)-tuples of player-i stage game strategies
s
it:A
t
A
.We can write a strategy profile s for the repeated game in two ways. We can write it as the n- tuple
Profile of players repeated game strategies
s
s
s
n
s
1, 2,...
…………(6) as defined in (5).Alternatively, we can write the repeated-game strategy profile s as
T
s
s
s
s
0,
1,...
………….(7)i.e., as a collection of stage-game strategy profiles, one for each period, as defined in (4).
VI. PLAYING THE REPEATED GAME
In (6) and (7) we have written a repeated-game strategy profile as a collection of functions of histories. It is interesting to note that a repeated-game strategy profile is not itself a function of histories. The reason is simple: Once every one of the individual players‟ repeated-game strategies
s
i is specified ( or alternatively once each timeperiod‟s history-contingent stage-game strategy profile t
s
is specified), the sequence of actions, and therefore the history of the entire game, is determined. Perhaps this will become clearer when we see below explicitly how the game is played out.Let‟s see how this repeated game is played out once
every player has specified her repeated-game strategy si. It
is more convenient at this point to view this repeated-game strategy profile as expressed in (7), i.e. as a sequence of
T+1 history-dependent stage-game strategy profiles. When
the game starts, there is no past play, so the history
h
0 isdegenerate: Every player executes her
a
i0
s
i0stage-gamestrategy from (5). This zero-th period play generates the history
h
1
(
a
0)
, wherea
0
a
10,a
20,...
a
n0
. This history is then revealed to the players so that they cancondition their period-1 play upon the period-0 play. Each
International Journal of Emerging Technology and Advanced Engineering
Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 6, Issue 4, April 2016)
298
Consequently, in the t=1 stage game the stage-game
strategy profile
(
)
11(
1),...
1(
1)
1 1
h
s
h
s
h
s
a
nt
isplayed. In order to form the updated history this stage-game strategy profile is then concatenated onto the
previous history:
h
2
a
0,
a
1
. This new history isrevealed to all the players and they each then choose their
period-2 stage-game strategy
s
i2(
h
2)
. And so it goes.…We say that
h
T1 is the path generated by therepeated-game strategy profile s.
VII. PAYOFF IN REPEATED GAME
To complete our description of the semi-extensive form we need to define the payoffs for this repeated game.
Consider some repeated-game strategy profile s and some
time period t. As the game is played out according to this
repeated-game strategy profile a history ht of actions up to
this period is compiled. The history-contingent stage-game strategy profile to be played this period is
s
t(
h
t)
Thepayoff to player i from the period-t stage game when each
player executes her component of the stage-game strategy profile
s
t(
h
t)
isg
i(
s
t(
h
t))
We define the payoff toplayer i for the repeated game, when the repeated game
strategy profile s [as expressed in (7)] is played, to be
Tt
t t i t s
i
g
s
h
u
0
)
(
…….(8)Where
is the common discount factor. Note theappearance of the histories ht in the summand of (8).
These are the histories which are generated sequentially as the game is played. I.e
h
t
h
t1,
h
t(
s
t1)
.8. Equilibrium in Repeated Games Nash Equilibrium in the Repeated Game
Now that we know what strategies and payoffs are in the repeated game, we are in position to make clear what a Nash equilibrium is in this semi-extensive form of a repeated game. Now we will use the expression of a repeated-game strategy profile as a collection of players‟ repeated-game strategies, as in (6). As usual, we say that a
strategy profile s for the repeated game is a Nash
equilibrium if each player‟s part of
s
—i.e., the strategyi
s
i, as given by (5)—is a best response given that the otherplayers are playing their parts of
s
.More formally, we say that the repeated-game strategy
profile
s
is a Nash equilibrium if for all players i
i i
i S s
s
s
u
s
i i
arg
max
,
…………..(9) Let T1h
be thehistory generated by the repeated-game strategy profile
s
;i.e. let
h
T1 be the path associated withs
. Ifs
is aNash-equilibrium strategy profile, then
h
T1 is itsassociated equilibrium path.
VIII. CONCLUSION
Note from (8) that
u
i(
s
)
depends only uponstage-game strategy profiles played along the equilibrium path. Therefore in a Nash equilibrium each player‟s repeated-game strategy need only be optimal along the equilibrium path.
REFERENCES
[1] F. Akyldiz, W. Su, Y. Sankarasubramaniam and E. Cayirci, ―Wireless sensor networks: a survey,‖ Computer Networks, vol.38, pp:393-422
[2] I. F. Akyildiz and I. H. Kasimoglu, ―Wireless sensor and actor networks:research challenges,‖ to be published Ad Hoc Networks, 2004.
[3] T. Basar and G. T. Olsder, Dynamic Non cooperative Game Theory, 2nd Ed., Society of Industrial and Applied Mathematics, 1999.
[4] S. Buchegger and J. L. Boudec, ―Performance Analysis of the CONFIDANT Protocol Cooperation Of Nodes-Fairness In Dynamic Ad-hoc NeTworks,‖ MobiHoc, 2002.
[5] N. Bulusu, D. Estrin, L. Girod and J. Heidemann, ―Scalable coordination for wireless sensor networks: self-configuring localization systems,‖International Symposium on Communication Theory and Applications [6] Dan Li, Kerry D. Wong, Yu Hen Hu, Akbar M. Sayeed."Detection,
Classification, and Tracking of Targets,"IEEE Signal Processing Magazine, Volume 19, pp: 17-19, (2002).
[7] M. Felegyhazi, L. Buttyan and J. P. Hubaux, ―Equilibrium Analysis of Packet Forwarding Strategies in Wireless Ad Hoc Networks the Static Case,‖ Proceedings of Personal Wireless Communications (PWC „03),
[8] Wei-Peng Chen, Hou, J.C, Lui Sha. "Dynamicclustering for acoustic target tracking in wireless sensornetworks", IEEE Transactions on Mobile Computing,Volume 3, pp: 258-271, (2004).
[9] L. Buttyan and J.P. Hubaux, ―Nuglets: AVirtual Currency to Stimulate Cooperation in Self organized Mobile Ad-Hoc Networks,‖Technical Report DSC/2001/001,Swiss Fed. Inst. Of Technology, Jan. 2001 [10] S. Marti, T. Giuli, K. Lai and M. Baker, ―Mitigating routing
misbehavior in mobile ad hoc networks,‖ Proceedings of Mobicom 2000, Boston, MA,USA, August 2000.
International Journal of Emerging Technology and Advanced Engineering
Website: www.ijetae.com (ISSN 2250-2459, ISO 9001:2008 Certified Journal, Volume 6, Issue 4, April 2016)
299
[12] A. Perrig, R. Szewczyk, V. Wen, D. Culler and J. D. Tygar, ―SPINS: Security Protocols for Sensor Networks,‖ MobiCom, pp: 189-199, July 2001.
[13] V. Srinivasan, P. Nuggehalli, C. Chiasserini, and R. ao,―Cooperation in Wireless Ad Hoc Networks,‖ Proc. IEEEINFOCOM, vol. 2, pp. 808-817, Apr. 2003
[14] E. Shih, S. Cho, N. Ickes, R. Min, A. Sinha, A. Wang and A.Chandrakasan, ―Physical layer driven protocol and algorithm design for energy-efficient wireless sensor networks,‖ Proceedings of ACMMobiCom, Italy, pp:272-286, July 2001
[15] Nuggehalli, C. F. Chiasserini and R. R. Rao, ―Cooperation in Wireless Ad Hoc Networks,‖ INFOCOM,