Games and Economic Behavior
www.elsevier.com/locate/geb

Starting small to communicate ✩

Alp Atakan a,b, Levent Koçkesen a, Elif Kubilay c,∗

a Koç University, Turkey
b Queen Mary Univ. of London, United Kingdom of Great Britain and Northern Ireland
c University of Essex, United Kingdom of Great Britain and Northern Ireland
Article info

Article history:
Received 26 December 2016
Available online 13 March 2020

JEL classification: D82, D83, D23

Keywords: Communication; Cheap talk; Reputation; Repeated games; Career path; Gradualism; Starting small

Abstract
We analyze a repeated cheap-talk game in which the receiver is privately informed about the conflict of interest between herself and the sender and either the sender or the receiver controls the stakes involved in their relationship. We focus on payoff-dominant equilibria that satisfy a Markovian property and show that if the potential conflict of interest is large, then the stakes increase over time, i.e., “starting small” is the unique equilibrium arrangement. In each period, the receiver plays the sender’s ideal action with positive probability and the sender provides full information as long as he has always observed his ideal actions in the past. We also show that as the potential conflict of interest increases, the extent to which the stakes are back-loaded increases, i.e., stakes are initially smaller but grow faster.
© 2020 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
1. Introduction
We analyze a repeated cheap-talk game in which the receiver is privately informed about the conflict of interest between herself and the sender and either the sender or the receiver controls the stakes involved in their relationship. For example, the sender could be a manufacturer who is better informed about demand and makes retail-price recommendations to a retailer, who in turn decides on the actual retail price and may have different objectives than the manufacturer.1 The retailer might decide on how much to buy from the manufacturer, in which case we would have a situation where the receiver controls the stakes, or the manufacturer might decide on how much to deliver to that particular retailer, in which case it is the sender who controls the stakes.
✩ An earlier version of this paper has been circulated under the title “Optimal Delegation of Sequential Decisions: The Role of Communication and Reputation”. We thank Navin Kartik, Vladyslav Nora, Murat Usman, Okan Yilankaya, an associate editor, and two referees for valuable comments. We also would like to thank the participants at various seminars and conferences for their comments and questions. Atakan’s work on this project was supported by a grant from the European Research Council (ERC 681460, Informative Prices). Koçkesen and Kubilay’s work on this project was supported by TÜBITAK grant no. 112K496.
∗ Corresponding author at: Department of Economics, University of Essex, United Kingdom of Great Britain and Northern Ireland. E-mail address: [email protected] (E. Kubilay).
1 See Buehler and Gärtner (2013) and Kim and Nora (2017), among others, for models in which retail-price recommendations serve as a communication device in vertical supply relations.
https://doi.org/10.1016/j.geb.2020.03.001
0899-8256/© 2020 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Or the receiver could be an investor who first decides how much to save and then how to allocate these funds after receiving advice from an informed financial advisor. The investor is privately informed about her preferences, which might include risk preferences or some behavioral bias. This would be a situation in which the receiver chooses the stakes involved in the relationship. Alternatively, it could be the advisor (sender) who chooses the stakes by controlling which investment opportunities to present to the investor in each period. For example, he may present more important ones (or a big bunch of them) at the beginning and less important ones later, or vice versa.

In situations where the sender is a subordinate, e.g., an advisor, in a hierarchical relationship, we could also think of the problem faced by the receiver as the optimal design of the importance of the decisions about which the sender is consulted over time, i.e., the sender’s career path. At what level of the hierarchy should the receiver start the sender and how should she go about promoting him? Is it best to start him at a very low rank and keep him there for a long time, or should the career of the sender progress at a steady pace? What is the role of potential conflict of interest between the sender and the receiver in the optimal design of the career path of the sender?
In our model, each period a sender observes a payoff-relevant state of the world and communicates this information to the receiver. The receiver observes the message sent by the sender and makes a decision. The state of the world and the decision jointly determine the payoffs, which are revealed at the end of the period. The overall payoff of each player is equal to the weighted sum of period payoffs, where the weight of each period is determined by the size of the stakes (or the importance of the decision) in that period.2 The sender would like the decision to match the state of the world while the receiver might be biased. More crucially, the sender’s preferences are common knowledge while those of the receiver are her private information.
We assume that the information on the state of the world is “soft”, i.e., it cannot be verified, and that the messages are costless. This makes the communication phase in each period a “cheap-talk” game, i.e., the sender may lie and this has no direct costs for him. We also assume that the decisions of the receiver are not contractible. This could be due to legal reasons or because the decisions are impossible to reproduce before courts.3 Our third crucial assumption is that states of the world are independently distributed across periods. This implies that the sender decides how much information to reveal each period without having to worry about its informational implications for the future states. Finally, we assume that the receiver’s preferences are similar for each decision, i.e., she either shares the preferences of the sender or is biased in the same manner for all the decisions. Therefore, our model is more suitable for situations in which the decisions are related, such as a series of investment decisions, or budgetary decisions for different departments, etc. An alternative, and perhaps more plausible, way to interpret the model is that the receiver makes the same decision in each period and the state of the world observed by the sender corresponds to the optimal decision for that period.4
We assume that the receiver is either an unbiased type, who myopically chooses the decision best suited to the state given her beliefs in each period, or a biased type who acts strategically. The unbiased type resembles a commitment type that is common in the reputation literature. But, unlike a standard commitment type who always plays the same action, the unbiased receiver plays a best response to her beliefs in any given period. Furthermore, unlike in standard models of repeated games and reputation, where the discount rates and the stage game payoffs are fixed, in our model the size of the stakes (or the importance of the decision) in each period is chosen strategically by either the sender or the receiver.5 Our aim is to characterize the perfect Bayesian equilibria of the resulting extensive form game with incomplete information.
In order to gain some intuition about the major forces at work in the model, note that the sender would like to receive his favorite decision, i.e., the unbiased decision which matches the state, in each period. Therefore, if he believes that the receiver is going to make the unbiased decision with high enough probability, then he has an incentive to reveal the state of the world truthfully. The biased receiver, on the other hand, would like to make a decision that is best for her, i.e., the biased decision, in any period and for that reason she would like to receive accurate information. However, if she makes a decision that is different from the decision that would be made by the unbiased commitment type, she would be revealed as biased and receive no information in the future. This introduces reputation concerns in the sense that she may masquerade as the unbiased receiver and act against her own interest today, in order to receive better information in the future.
It is clear that the receiver benefits from truthful communication. It turns out that, ex-ante, the sender also benefits from truthful communication, irrespective of whether the receiver is biased or not. In fact, if he could commit to a communication strategy before learning the state in the stage game, he would commit to full revelation.6 Therefore, the stage game exhibits both conflict of interest, because of the possible bias, and common interest, because of the common preference for truthful communication.
These considerations imply that the sender (or the receiver) may choose the stakes in a strategic manner in order to utilize the reputational incentives and facilitate communication. In particular, if relatively larger stakes are left for the future, then the biased receiver may choose to play the unbiased action early on in the game and this may enable truthful communication.

2 From now on we will use the terms “importance of the decision” and “size of the stakes” interchangeably.
3 The assumption that decisions are observable but not contractible follows the “incomplete contracts” perspective (e.g., Grossman and Hart (1986) and Hart and Moore (1990)).
4 We are grateful to a referee for this interpretation of our model.
5 This could also be thought of as choosing a time-varying discount rate in a strategic manner.
6 As in the standard cheap-talk games, we assume that the sender cannot commit to a communication strategy. Even if he could, however, commitment to truth-telling may not be in his interest in the repeated game, because that would eliminate the ability to punish the receiver for playing the biased action.
As is usual in cheap-talk games, our model exhibits multiple equilibria. In order to circumvent this problem, we focus on equilibria that satisfy a Markovian property and yield the highest equilibrium payoffs for all the players, which we call the payoff-dominant equilibria. We show that if the potential bias is large and the initial reputation of the receiver bad enough, then there is a unique payoff-dominant equilibrium outcome in which the size of the stakes increases over time. In other words, the time path of the stakes exhibits “gradualism” or “starting small”, which has been a recurring theme in economics in different contexts.7 Our paper contributes to this literature by providing an alternative rationale for these phenomena, which is based on utilizing reputational incentives to maintain a healthy communication in a relationship.
We show that the time path of the stakes is chosen in such a way that the biased receiver is indifferent between the biased and the unbiased actions in each period. She mixes between these actions in a way that builds her reputation (after each successive unbiased action) at just the right speed in order to facilitate communication in every period. If her reputation were to evolve faster, then truthful communication would fail in the current period, while if it were to evolve slower, it would fail in the future. The sender reveals the state truthfully in every period as long as he has always observed the unbiased action in the past.8
Interestingly, the sender does not prefer a screening equilibrium in which the biased receiver reveals herself at the beginning of the game. This is because once the biased receiver reveals herself, the sender can no longer communicate with her and, as we have mentioned before, ex-ante the sender prefers to communicate even with the biased receiver. The biased receiver does not prefer to be screened either because, if her initial reputation is bad, she will receive information neither in the period she reveals herself nor afterwards. The unbiased receiver could potentially benefit from being revealed but only if that helps her receive more information. In the payoff-dominant equilibrium outcome, she receives complete information in every period except perhaps the first period in which the biased receiver plays a mixed strategy. However, if the biased receiver reveals herself by playing the biased action with probability one in the first period, rather than mixing, then the sender cannot communicate in that period. Thus, screening does not increase the amount of information provided to the unbiased receiver either.
Our results further imply that as the potential conflict of interest between the sender and the receiver increases, initial stakes become smaller but they grow faster. This is due to the fact that as the bias becomes larger, the future must become relatively more important in order to provide sufficient incentives to the biased receiver to play the unbiased action in the current period. We also show that both the sender and the receiver prefer to spread the total stake in their relationship over as many periods as possible. If the potential bias is large, this would mean that the stakes remain very small for a long period of time and then increase quickly towards the end of the relationship.
Finally, we should note that since the sender fully reveals the state in every period as long as he observes the unbiased action, a payoff-dominant equilibrium is also the most informative one on the equilibrium path.9 Therefore, payoff-dominance is potentially a reasonable equilibrium selection in our context.10
2. The model
A sender and a potentially biased receiver play a repeated cheap-talk game for $N$ periods. Since it is more convenient to do so, we will count the periods in reverse, so that the first period is labeled $N$, the second $N-1$, and so on. In each period $i$, the following stage-game is played:

1. The sender (or the receiver) chooses the parameter $\delta_i \in [0,1]$ for period $i$. The parameter $\delta_i$ represents the proportion of the total stakes deferred to subsequent periods and $1-\delta_i$ represents the proportion in period $i$. Since there are no subsequent periods in the last period, we set $\delta_1 = 0$.
2. Nature chooses the state of the world $\theta_i \in \{0,1\}$. We assume that each state is equally likely and that states are independent across periods.
3. The sender observes the state $\theta_i$ and chooses a message $m_i \in \{0,1\}$.11
4. The receiver observes the sender’s message and chooses an action $a_i \in \mathbb{R}$ without observing the state of the world.

We define the parameter $\gamma_i$ as the importance of the decision or the proportion of the total stakes made in period $i$. More precisely, $\gamma_N = 1-\delta_N$, $\gamma_i = \delta_N \cdots \delta_{i+1}(1-\delta_i)$, and $\gamma_1 = \delta_N \cdots \delta_2 (1-\delta_1)$. The sender’s payoff for period $i$ is $v(a_i,\theta_i,\gamma_i) = -\gamma_i (a_i-\theta_i)^2$ and the receiver’s payoff is $u(a_i,\theta_i,\beta,\gamma_i) = -\gamma_i \left(a_i-(\theta_i+\beta)\right)^2$, where $\beta \in \{0,b\}$ and $b>0$. The parameter $b$ measures the divergence of the preferences of the sender and the receiver, or simply the “bias” of the receiver. We assume that nature chooses $\beta = b$ with probability $p \in (0,1)$ before the game begins and privately informs the receiver. The payoff of each player over the $N$ periods is simply the sum of the payoffs from each period.

The state of the world, the messages, and the decisions of the receiver are unverifiable and hence cannot be contracted upon. Furthermore, as the payoff functions imply, the messages have no direct payoff consequence. This implies that the communication between the sender and the receiver is “cheap-talk” and that outcome contingent contracts cannot be written. After a period is over, the sender and receiver observe their payoffs and therefore the receiver learns the state in that period and the sender learns the receiver’s action.

7 See Section 6 for a discussion of the literature on gradualism and starting small.
8 If the initial reputation of the receiver is not bad enough, then the sender-optimal equilibrium is not optimal for the receiver. Still, even in this case, we show that the size of the stakes increases over time in the receiver-optimal equilibrium.
9 As we will show subsequently, this is true except perhaps in the first period.
10 We refer the reader to Chen et al. (2008) and the references cited therein for justification of the most informativeness criterion in cheap-talk games. We should, however, note that the condition developed in that paper, i.e., no incentive to separate, can be directly applied only to a one-period version of our model, in which it would indeed select the fully informative equilibrium when it exists.
11 We can show that any equilibrium with a larger message space can be replicated with two messages and hence restricting the number of messages to two is without loss of generality.
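The mapping from the deferral fractions $\delta_i$ to the period weights $\gamma_i$, together with the two period payoff functions, can be sketched in a few lines of Python. This is only an illustration of the definitions in this section: the function names and the three-period path of deferral fractions are hypothetical and do not come from the paper.

```python
def stake_weights(deltas):
    """Map deferral fractions to per-period stake weights.

    `deltas` lists delta_N, delta_{N-1}, ..., delta_1 in play order
    (periods are counted in reverse, so the first entry is period N).
    Returns gamma_N, ..., gamma_1 in the same order.
    """
    weights = []
    carried = 1.0  # share of the total stakes not yet assigned to a period
    for delta in deltas:
        weights.append(carried * (1.0 - delta))  # gamma_i
        carried *= delta                         # deferred to later periods
    return weights


def sender_payoff(a, theta, gamma):
    """Period payoff v(a, theta, gamma) = -gamma * (a - theta)^2."""
    return -gamma * (a - theta) ** 2


def receiver_payoff(a, theta, beta, gamma):
    """Period payoff u(a, theta, beta, gamma) = -gamma * (a - (theta + beta))^2."""
    return -gamma * (a - (theta + beta)) ** 2


# Hypothetical three-period path (illustrative numbers only); delta_1 = 0.
gammas = stake_weights([0.8, 0.75, 0.0])
```

With $\delta_1 = 0$ the weights sum to one, and in this illustrative path the last period carries the largest weight, which is the back-loaded pattern the paper calls starting small.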
We assume that the unbiased (type $\beta = 0$) receiver is a commitment type who, in each period, plays the action that is perfectly aligned with the sender’s preferences. More precisely, fix a period $i$ and let $\lambda \in [0,1]$ be the probability assigned by the receiver to the event that $\theta_i = 1$. Define the best period action for type $\beta \in \{0,b\}$ as follows:

$$a_\beta(\lambda) = \operatorname*{argmax}_{a_i \in \mathbb{R}} \; E\left[-\left(a_i-(\theta_i+\beta)\right)^2\right] = \lambda + \beta. \tag{2.1}$$

We refer to $a_0(\lambda)$ as the unbiased action and to $a_b(\lambda)$ as the biased action. The unbiased receiver is a commitment type (or an automaton) who plays action $a_0(\lambda)$ in each period, i.e., she picks the myopic best response of a receiver who has zero bias and therefore chooses an action that is perfectly aligned with the sender’s preferences. The biased receiver, in contrast, is rational and chooses her period action strategically.

The main question analyzed in the paper is the equilibrium allocation of the stakes or the importance of decisions over time. We assume that any $\delta_i \in [0,1]$ can be chosen at the beginning of each period $i$ and that at least one period has a positive weight. In other words, the importance of period $i$ decision can be fine-tuned in any desired way. This could be motivated in three different ways: (1) The parameter $\gamma_i$ is the proportion of the total stakes in the relationship that is assigned to period $i$; (2) There is a large set of decisions and each period a subset of these decisions is chosen; (3) There is a large set of decisions with varying importance and each period one decision from this set is chosen.

We will analyze two different versions of the model: (1) Sender chooses the parameter $\delta_i$ (sender-game); (2) Receiver chooses the parameter $\delta_i$ (receiver-game). For the sake of clarity, we will describe and present our results in detail for only the sender-game. In Section 5 we will describe the receiver-game and argue that our results go through for that model as well.

Let $o_i = (\delta_i,\theta_i,m_i,a_i)$ denote a period $i$ outcome and $O_i$ the set of all possible period $i$ outcomes. For any $i<N$, let $H_i$ be the set of all histories before decision $i$ is made, i.e., sequences of the type $(o_N,\ldots,o_{i+1})$. Define $H_N = \{\emptyset\}$. The sender’s belief in period $i$ is a mapping $p_i : H_i \to [0,1]$ where $p_i(h) = \operatorname{prob}(\beta = b \mid h)$ for each $h \in H_i$ and $p_N(\emptyset) = p$. A period $i$ strategy for the sender is comprised of two components: $\tau_i : H_i \to [0,1]$, where $\tau_i(h)$ is the sender’s choice of $\delta_i \in [0,1]$ in period $i$ after history $h \in H_i$, and $\mu_i : H_i \times [0,1] \times \{0,1\} \to [0,1]$, where $\mu_i(h,\delta_i,\theta_i)$ denotes the probability of sending message 1 after history $h$, $\delta_i$, and $\theta_i$.12

The receiver moves after histories of the type $(h,\delta_i,\theta_i,m_i)$ where $h \in H_i$. For any history $h \in H_i$, a period $i$ information set for the receiver is given by $I_i = \{(h,\delta_i,\theta_i,m_i) : \theta_i \in \{0,1\}\}$. In other words, before making a decision in period $i$, the only thing that is not known by the receiver is $\theta_i$. Let the set of all period $i$ information sets be $\mathcal{I}_i$. Receiver’s belief that $\theta_i = 1$ is given by $\lambda_i : \mathcal{I}_i \to [0,1]$. Since the unbiased receiver is a commitment type, we will only describe strategies for the biased receiver. Biased receiver’s (mixed) strategy is given by $\alpha_i : \mathcal{I}_i \to \Delta(\mathbb{R})$, where $\Delta(\mathbb{R})$ denotes the set of all probability distributions with support in $\mathbb{R}$. For ease of exposition we will sometimes write $\lambda_i(h,\delta_i,m_i)$ and $\alpha_i(h,\delta_i,m_i)$ for any $h \in H_i$, $\delta_i \in [0,1]$, and $m_i \in \{0,1\}$. A collection $\sigma = (\tau_i,\mu_i,\alpha_i,p_i,\lambda_i)_{i=1}^{N}$ constitutes an assessment.
We are interested in the perfect Bayesian equilibria (PBE) of the game. For expositional reasons, we will restrict attention to equilibria in which for any $i \in \{N, N-1, \ldots, 1\}$, $h \in H_i$, and $\delta_i \in [0,1]$: (1) The sender sends message $m = 0$ after observing $\theta_i = 0$, i.e., $\mu_i(h,\delta_i,0) = 0$ and (2) The receiver puts positive probability only on the biased and unbiased actions, i.e., $\alpha_i(h,\delta_i,m) \in \Delta\left\{a_0(\lambda_i(h,\delta_i,m)),\, a_b(\lambda_i(h,\delta_i,m))\right\}$.
We should note that restriction (1) is without loss of generality in terms of equilibrium outcomes and restriction (2) is satisfied in any equilibrium.13

Furthermore, we focus our attention on equilibria that satisfy the following two properties:
Property 1. Fix an assessment $\sigma$, a period $i$, a history $h \in H_i$, and an outcome $o_i = (\delta_i,\theta_i,m_i,a_i)$. If $a_i \neq a_0(\lambda_i(h,\delta_i,m_i))$, then $\sigma$ is a PBE only if $p_j(\hat{h}) = 1$ in any period $j = i-1,\ldots,1$ and history $\hat{h} \in H_j$ that follows $o_i$.

This property implies that any action other than the unbiased action is believed to be played by the biased type and once beliefs put probability one on the biased type, they are never altered.14

12 Although we define only the pure strategies in the choice of $\delta_i$, in our analysis we allow for mixed strategies in both the sender and the receiver-game.
13 As we will show in the sequel (see Lemma 2), any equilibrium where type 0 mixes between the two messages is completely uninformative and hence can be replicated with an equilibrium in which both types send message 0. Furthermore, playing any action other than the unbiased action reveals the receiver as biased, i.e., leads to the worst beliefs about her. This implies that if she plays any action other than the unbiased action with positive probability, it has to be her best period action, i.e., the biased action.
Property 2. For any $i = N,\ldots,1$ and $h, h' \in H_i$: (1) $p_i(h) = p_i(h')$ implies $\tau_i(h) = \tau_i(h')$, $\mu_i(h,\delta_i,\theta_i) = \mu_i(h',\delta_i,\theta_i)$, $\alpha_i(h,\delta_i,m_i) = \alpha_i(h',\delta_i,m_i)$; (2) $\alpha_i(h,\delta_i,0) = \alpha_i(h,\delta_i,1)$.

The first part of this property states that past history matters only to the extent that it changes the reputation of the receiver, while the second part states that the receiver plays a symmetric strategy. Together, they imply that strategies do not depend on the past communication behavior of the sender.15 Since the sender’s past communication behavior has no effect on current and future payoffs or the states of the world, this is a Markovian property in the sense that strategies are independent of payoff irrelevant histories.16 In particular, this restriction eliminates punishments in the form of “no information revelation” or “playing the biased action” after histories in which the sender did not communicate truthfully. In addition to its Markovian nature, this property is also implied by “renegotiation-proofness”, because even after histories in which the sender has lied, both parties have an incentive to choose a continuation equilibrium in which there is full communication. In Sections 3.2 and 4 we will comment on how our results change when this restriction is removed.
From this point on, we restrict attention to PBE that satisfy Properties 1 and 2. Hence, when we say that an assessment $\sigma$ constitutes an equilibrium we mean that the assessment is a PBE and the assessment satisfies Properties 1 and 2.

Given the above restrictions, we simplify notation and describe period $i$ strategies by functions $\tau_i : H_i \to [0,1]$, $\mu_i : H_i \times [0,1] \to [0,1]$, and $q_i : H_i \times [0,1] \to [0,1]$, where $\tau_i(h)$ determines the choice of $\delta_i$, $\mu_i(h,\delta_i)$ determines the probability that the sender sends message 1 after $\delta_i$ and $\theta_i = 1$, and the function $q_i(h,\delta_i)$ is a distributional strategy (see Milgrom and Weber (1985)) for the receiver that determines the total probability with which the receiver plays the biased action.

As is usual in cheap-talk games, there are many equilibria of the game defined above, even under the Markovian restriction introduced in Property 2. In this paper, we focus on the payoff-dominant equilibria, i.e., equilibria that yield the highest equilibrium payoffs for the players.
3. Preliminaries
Fix an assessment $\sigma$, a period $i$, a history $h \in H_i$, and $\delta_i \in [0,1]$. Let $\Pr(m)$ be the total probability that the sender sends message $m \in \{0,1\}$ in this assessment in period $i$ after $(h,\delta_i)$.17 Let $q = q_i(h,\delta_i)$ and $\lambda(m) = \lambda_i(h,\delta_i,m)$. We can then write the sender’s and the biased receiver’s ex-ante costs in that period as follows:

$$C_i^{S}(\sigma \mid h,\delta_i) = \underbrace{q\,b^2}_{\text{Cost of bias}} + \underbrace{\sum_{m\in\{0,1\}} \Pr(m)\,\lambda(m)\left(1-\lambda(m)\right)}_{\text{Cost of miscommunication}}$$

$$C_i^{BR}(\sigma \mid h,\delta_i) = \underbrace{\left(1-\frac{q}{p}\right) b^2}_{\text{Cost of bias}} + \underbrace{\sum_{m\in\{0,1\}} \Pr(m)\,\lambda(m)\left(1-\lambda(m)\right)}_{\text{Cost of miscommunication}}$$

These costs are composed of two components: The first component (cost of bias) comes from the fact that the receiver plays the biased action with total probability $q$ after both messages. The second component (cost of miscommunication) comes from the fact that the sender may not provide full information and is equal to the expected conditional variance of the state of the world. If, for example, the sender’s message provides no information on $\theta_i$, then $\lambda(m) = 1/2$ for $m \in \{0,1\}$ and the cost of miscommunication is equal to $1/4$. If it is perfectly informative, then $\lambda(1) = 1$ and $\lambda(0) = 0$ and the cost of miscommunication is equal to zero. The cost of the unbiased receiver is simply the cost of miscommunication.

3.1. One-period model
To fix ideas, we begin our analysis with the simple case of $N = 1$. Let $q$ be the total probability with which the biased action is played and note that sequential rationality implies that the biased receiver plays the biased action with probability one after both messages, i.e., $q = p$. As is the case in cheap-talk games, there is always an equilibrium in which the sender’s message provides no information about the state, the so-called “babbling equilibrium”. Instead, suppose that the sender communicates truthfully in equilibrium and, without loss of generality, assume that type $\theta = 0$ sends message $m = 0$ and type $\theta = 1$ sends message $m = 1$. Therefore, after message $m \in \{0,1\}$, the unbiased action is $m$ and the biased action is $m + b$. Optimality of the strategy of type $\theta = 1$ sender implies that the cost of sending message $m = 1$ is less than the cost of sending message $m = 0$, i.e.,

$$q(1+b-1)^2 + (1-q)(1-1)^2 \le q(0+b-1)^2 + (1-q)(0-1)^2.$$

This is equivalent to $qb \le 1/2$ or $q \le \bar{q}$, where

$$\bar{q} \equiv \frac{1}{2b}.$$

Optimality of the strategy of type $\theta = 0$ implies that

$$q(b-0)^2 + (1-q)(0-0)^2 \le q(1+b-0)^2 + (1-q)(1-0)^2,$$

which is always satisfied. Conversely we can show that if $q \le \bar{q}$, then there is an equilibrium with full revelation. If $\bar{q}/2 < q < \bar{q}$, then there is also an equilibrium in which type $\theta = 1$ completely mixes, but since this plays no major role in our analysis we relegate its proof to Section 8 (see the proof of Lemma 2). Finally, if $q > \bar{q}$ there is no information revelation in equilibrium. Note that truthful communication becomes an issue only when $\bar{q} < 1$, or equivalently $b > 1/2$. We summarize this discussion in the following lemma for easy reference.

Lemma 1. Let $N = 1$ and $p \in [0,1]$ be the probability that the receiver is biased. In any equilibrium, the receiver plays the biased action with probability one, i.e., $q = p$. If $p > \bar{q}$, then the sender’s message provides no information. If $p \le \bar{q}$, then there is an equilibrium where the sender truthfully reports the state and if $\bar{q}/2 < p < \bar{q}$, then there is an equilibrium where the sender’s report is partially truthful (i.e., the sender sends the same message with positive probability in both states). These are the only equilibria.

Note that the equilibria described in Lemma 1 are Pareto ranked: the more informative equilibrium yields a strictly higher expected payoff to both the sender and the receiver. In fact, in the truthful equilibrium, the receiver’s cost is equal to zero whereas the (ex-ante) cost of the sender is $pb^2$. In the partially informative equilibrium, receiver’s cost is $(1/2 - pb) \in (0, 1/4)$, whereas the sender’s is equal to $pb^2 + (1/2 - pb)$. In the babbling equilibrium, expected costs of the receiver and the sender are $1/4$ and $pb^2 + 1/4$, respectively.

14 The first part is quite reasonable because the unbiased action is the unique action that is available to the unbiased type. The second part is the same as the “never dissuaded once convinced” assumption in Rubinstein (1985).
15 Note that the second part, i.e., symmetry, is necessary for this to be true. If the receiver plays differently after different messages, then the sender’s choice of the message may induce different reputation levels, and hence different behavior, in the future. However, symmetry does not imply that the receiver plays the same mixed strategy over the same set of actions after each message. It only implies that the same probabilities are assigned after each message to the biased and unbiased actions, which themselves depend on beliefs and thus on the message.
16 See Maskin and Tirole (2001) for the notion of payoff irrelevant histories and some arguments in favor of focusing on Markov perfect equilibria.
17 More precisely, Pr(1) = 0.5μ

Remark 1. We should note that the observation in Lemma 1 is true for any period in which the communication incentives of the sender depend only on the total probability with which the biased action is played in that period. Property 2 ensures that this is indeed the case and hence we have Lemma 2, whose proof is in Section 8.
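The two truth-telling constraints of the one-period model can be checked numerically. The sketch below is an illustration only: the function names and the parameter values ($b = 1$, so that $\bar{q} = 1/2$) are hypothetical choices, and the code evaluates the sender's expected quadratic loss exactly as written in the two inequalities of Section 3.1 under full revelation.

```python
def sender_cost(message, theta, q, b, unbiased_action):
    """Expected quadratic loss of a type-`theta` sender who sends `message`.

    `unbiased_action[m]` is the receiver's posterior mean after message m;
    the biased action adds b and is played with total probability q.
    """
    a0 = unbiased_action[message]
    return q * (a0 + b - theta) ** 2 + (1 - q) * (a0 - theta) ** 2


def truthful_is_incentive_compatible(q, b):
    """Check both sender types' constraints when messages reveal the state."""
    actions = {0: 0.0, 1: 1.0}  # posterior means under full revelation
    ok_high = sender_cost(1, 1, q, b, actions) <= sender_cost(0, 1, q, b, actions)
    ok_low = sender_cost(0, 0, q, b, actions) <= sender_cost(1, 0, q, b, actions)
    return ok_high and ok_low


b = 1.0                # hypothetical bias (illustrative value)
q_bar = 1 / (2 * b)    # threshold from the text: truth-telling needs q <= q_bar
below = truthful_is_incentive_compatible(0.4, b)   # q below the threshold
above = truthful_is_incentive_compatible(0.6, b)   # q above the threshold
```

In this example only the constraint of type $\theta = 1$ ever binds, which matches the observation in the text that the constraint of type $\theta = 0$ is always satisfied.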
Lemma 2. Fix a perfect Bayesian equilibrium, a period $i \in \{1,\ldots,N\}$, history $h \in H_i$, and $\delta_i \in [0,1]$. Sender’s equilibrium communication strategy can be of three types: (1) fully informative; (2) partially informative; or (3) non-informative. It is completely informative only if $q(h,\delta_i) \le \bar{q}$ and partially informative only if $\bar{q}/2 < q(h,\delta_i) < \bar{q}$.

3.2. Two-period model
In this subsection, we provide some intuition for our general result by analyzing the two-period version of the model. We will assume that the sender chooses the stakes and analyze the more interesting and challenging case of $p > \bar{q}$. We will comment later on what happens when $p \le \bar{q}$ and in Section 5 we will argue that our results go through for the receiver-game.

We will first establish two important facts that are true in any equilibrium, including those that do not satisfy the Markovian property (Property 2). After that we will construct an equilibrium and show that it yields the unique payoff dominant outcome among all equilibria that satisfy Property 2.

As we stated in Lemma 2, the Markovian property implies that truth-telling incentives of the sender in each period depend only on the total probability with which the biased action is played in that period: if the biased action is played with total probability $q_i$ in period $i$, then full revelation is incentive compatible if $q_i \le \bar{q}$ while no information can be revealed if $q_i > \bar{q}$. Note that we are using two aspects of Property 2 here: (1) communication behavior in the next period is independent of the communication behavior in the current period; and (2) symmetry, i.e., total probability of playing the biased action is the same and equal to some $q_i$ after each message.18

18 We should note that the definitions of the biased and unbiased action depend on the receiver’s beliefs (see Equation (2.1)) and hence potentially on the message. Therefore, symmetry does not mean that the receiver plays the same mixed strategy over the same set of actions after each message. For example, if the sender fully reveals the state, then after message $m = 0$, the biased and unbiased actions are $b$ and $0$, respectively, while after $m = 1$, they are $1+b$ and $1$. Symmetry only means that probability assigned to $b$ (resp. $0$) after $m = 0$ is the same as the probability assigned to $1+b$ (resp. $1$) after $m = 1$.
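This allows us to write the total cost of the sender in the following simple form:

$$\begin{aligned} C^{S}(\delta_2,q_2,c_2,c_1 \mid p) &= (1-\delta_2)\underbrace{\left[q_2 b^2 + c_2\right]}_{\text{Period 2 cost}} + \delta_2\Big[\underbrace{q_2\left(b^2+\tfrac{1}{4}\right)}_{\text{Cost after biased action}} + \underbrace{(1-q_2)\left(p_1 b^2 + c_1\right)}_{\text{Cost after unbiased action}}\Big] \\ &= (1-\delta_2)\left[q_2 b^2 + c_2\right] + \delta_2\left[p b^2 + q_2\tfrac{1}{4} + (1-q_2)c_1\right], \end{aligned} \tag{3.1}$$

where $p$ is the prior probability assigned to the biased receiver, the posterior belief in period 1 is $p_1 = (p-q_2)/(1-q_2)$ (by Bayes’ rule), $c_2$ is the cost of miscommunication in period 2, and the cost of miscommunication in period 1 is equal to $\tfrac{1}{4}$ and $c_1$ after the biased and unbiased actions, respectively. Similarly, we can write the total cost of the biased and unbiased receivers as follows:

$$C^{BR}(\delta_2,q_2,c_2,c_1 \mid p) = (1-\delta_2)\left[\left(1-\frac{q_2}{p}\right) b^2 + c_2\right] + \delta_2\left[\frac{q_2}{p}\,\frac{1}{4} + \left(1-\frac{q_2}{p}\right) c_1\right] \tag{3.2}$$

$$C^{UR}(\delta_2,q_2,c_2,c_1 \mid p) = (1-\delta_2)\,c_2 + \delta_2\,c_1 \tag{3.3}$$

Note that, as we have remarked before, the costs of miscommunication are the same for the sender and the receiver and symmetry allows us to use one $q_2$ after both messages.

We now establish two equilibrium facts that will be useful later on. First define,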
$$\delta_2^* = \frac{4b^2}{1+4b^2} \quad\text{and}\quad q_2^*(p) = \frac{p-\bar{q}}{1-\bar{q}}.$$

In the equilibrium that we will construct, the sender chooses $\delta_2^*$ and the receiver plays the biased action with total probability $q_2^*(p)$ in period 2. The first fact states that if the future is less important than $\delta_2^*$, then the biased receiver plays the biased action with probability one in period 2.

Fact 1. If $\delta_2 < \delta_2^*$, then in any equilibrium the biased receiver plays the biased action with probability one in period 2, i.e., $q_2 = p$.

The proof of this fact is simple. If the biased receiver plays the biased action in period 2, i.e., $q_2 = p$, then her cost is $(1-\delta_2)c_2 + \delta_2/4$. If, on the other hand, she plays the unbiased action her cost is at least $(1-\delta_2)\left(c_2 + b^2\right)$. $\delta_2 < \delta_2^*$ implies $(1-\delta_2)c_2 + \delta_2/4 < (1-\delta_2)\left(c_2 + b^2\right)$, i.e., the cost of playing the biased action is smaller than the cost of playing the unbiased action. Intuitively, when the future is not important enough, the future reputational benefit of playing the unbiased action is smaller than its current cost. Therefore, in the current period the biased receiver plays the biased action.

In order to state our second fact, first note that if the total probability of playing the biased action in period 2 is $q_2^*(p)$, then Bayes’ rule implies that beliefs must assign the following probability to the biased receiver in the next period (after the unbiased action):

$$p_1 = \frac{p - q_2^*(p)}{1 - q_2^*(p)}.$$
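The thresholds $\delta_2^*$ and $q_2^*(p)$, the Bayes update for $p_1$, and the cost comparison behind Fact 1 can be verified with a short numerical sketch. The function names and the example values ($b = 1$, $p = 0.75$) are hypothetical, and the unbiased-action cost is evaluated with $c_1 = 0$ by default, which is the case where the lower bound in the proof of Fact 1 is attained.

```python
def q_bar(b):
    """Largest biased-action probability compatible with truth-telling."""
    return 1 / (2 * b)


def delta2_star(b):
    """Deferral fraction at which the biased receiver is indifferent."""
    return 4 * b**2 / (1 + 4 * b**2)


def q2_star(p, b):
    """Total probability of the biased action in period 2."""
    return (p - q_bar(b)) / (1 - q_bar(b))


def posterior_after_unbiased(p, q2):
    """Bayes' rule: belief that the receiver is biased after the unbiased action."""
    return (p - q2) / (1 - q2)


def biased_receiver_costs(delta2, b, c2=0.0, c1=0.0):
    """Return (cost of biased action, cost of unbiased action) in period 2."""
    play_biased = (1 - delta2) * c2 + delta2 / 4
    play_unbiased = (1 - delta2) * (c2 + b**2) + delta2 * c1
    return play_biased, play_unbiased


b, p = 1.0, 0.75  # hypothetical bias and prior with p > q_bar (illustrative)
p1 = posterior_after_unbiased(p, q2_star(p, b))
cost_biased, cost_unbiased = biased_receiver_costs(delta2_star(b), b)
```

For these values the posterior `p1` equals $\bar{q}$, and the two costs coincide at $\delta_2 = \delta_2^*$. Any smaller $\delta_2$ makes the biased action strictly cheaper, which is the content of Fact 1.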
Definition of $q_2^*$ implies that if the total probability of the biased action is $q_2 = q_2^*(p)$ in period 2, then $p_1 = \bar{q}$ and if $q_2 < q_2^*(p)$, then $p_1 > \bar{q}$. In other words, $q_2^*(p)$ is the smallest probability of the biased action in period 2 that makes $p_1 \le \bar{q}$ and hence full revelation in period 1 possible. This leads to the following fact:

Fact 2. In any equilibrium, the total probability of the biased action is at least $q_2^*(p)$ in period 2 and the probability assigned to the biased receiver in period 1 after the unbiased action is at most $\bar{q}$. In other words, $q_2 \ge q_2^*(p) > 0$ and $p_1 \le \bar{q}$ in any equilibrium.

The proof of this fact is also simple. As we have argued above, if $q_2 < q_2^*(p)$, then $p_1 > \bar{q}$. But $p_1 > \bar{q}$ implies that the sender will not provide any information in period 1 after the unbiased action, i.e., there is no benefit of playing the unbiased action for the biased receiver. Therefore, she should play the biased action with probability one, i.e., $q_2 = p \ge q_2^*(p)$, contradicting the hypothesis $q_2 < q_2^*(p)$. Finally, Bayes’ rule and $q_2 \ge q_2^*(p)$ imply that $p_1 \le \bar{q}$ in any equilibrium. The intuition behind Fact 2 is simple: the biased receiver must play the biased action with high enough probability, because, otherwise, her reputation next period after the unbiased action will be so bad that she will not receive any information. We should also note that $q_2 \ge q_2^*(p)$ implies that the (minimum possible) equilibrium cost of miscommunication in period 2 is at least as high as the one in our equilibrium.

We are now ready to construct our equilibrium, which we denote by
$\sigma^*$. We say that a period 2 outcome is cooperative if $\delta_2^*$ is chosen and the unbiased action is played. In period 2, the sender chooses $\delta_2^*$ and the unbiased receiver always plays the unbiased action given her beliefs. The biased receiver mixes after $\delta_2^*$ so that the total probability of the biased action is $q_2^*(p)$, and plays the biased action with probability one (total probability $p$) after any other $\delta_2$. Type $\theta_2 = 0$ sender always sends message $m_2 = 0$. Type $\theta_2 = 1$ sends message $m_2 = 1$, i.e., the sender fully reveals the state, if and only if $\delta_2^*$ has been chosen and it is incentive compatible to reveal the state, i.e., $q_2^*(p) \le \bar q$. If $\delta_2 \ne \delta_2^*$ or $q_2^*(p) > \bar q$, type $\theta_2 = 1$ sender sends message $m_2 = 0$, i.e., the sender reveals no information. In period 1, the biased receiver plays the biased action with probability one and the sender reveals the state truthfully if and only if the period 2 outcome is cooperative and it is incentive compatible to reveal truthfully, i.e., the posterior probability that the receiver is biased after the unbiased action is no larger than $\bar q$, i.e., $p_1 \le \bar q$. If the period 2 outcome is not cooperative or $p_1 > \bar q$, then both sender types send message $m_1 = 0$, i.e., the sender reveals no information. Beliefs are derived from Bayes' rule whenever possible.

Under $\sigma^*$, the sender's cost is equal to
\[ C_S(\sigma^* \mid p) = C_S(\delta_2^*, q_2^*(p), c_2^*, 0 \mid p) = (1-\delta_2^*)\left[q_2^*(p)\, b^2 + c_2^*\right] + \delta_2^*\left[p b^2 + q_2^*(p)\,\frac{1}{4}\right], \tag{3.4} \]
the biased receiver's total cost is
\[ C_{BR}(\sigma^* \mid p) = (1-\delta_2^*)\, c_2^* + \delta_2^*\,\frac{1}{4}, \tag{3.5} \]
and the unbiased receiver's cost is
\[ C_{UR}(\sigma^* \mid p) = (1-\delta_2^*)\, c_2^*, \tag{3.6} \]
where $c_2^* = 0$ if $q_2^*(p) \le \bar q$, and $c_2^* = 1/4$ otherwise.

We first show that
$\sigma^*$ is an equilibrium. Let us start with period 1. The biased receiver plays the biased action with probability one in period 1, which is the unique sequentially rational behavior. The sender reports truthfully if and only if the previous period's outcome is cooperative and $p_1 \le \bar q$, which is also sequentially rational.

Now let us consider period 2 behavior. After histories that start with $\delta_2^*$, the receiver plays the biased action with total probability $q_2^*(p)$, which implies that communicating truthfully is sequentially rational if $q_2^*(p) \le \bar q$. If the receiver plays the unbiased action, then she induces a cooperative history and the posterior belief is $p_1 = \bar q$ (as we have shown before, this follows from the definition of $q_2^*(p)$). Therefore, she learns the state perfectly in the next period and plays her best action in that period. This implies that the total cost of playing the unbiased action is $(1-\delta_2^*)(b^2 + c_2)$, where $c_2 \in [0, 1/4]$ is the cost of miscommunication. If the receiver plays the biased action, then her cost is equal to the same cost of miscommunication $c_2$ in the current period, but she induces a history that is not cooperative and receives no information in the next period, which costs her $1/4$. Definition of $\delta_2^*$ implies that $(1-\delta_2^*)(b^2 + c_2) = (1-\delta_2^*)c_2 + \delta_2^*/4$, i.e., the receiver is indifferent between the biased and the unbiased actions and hence playing the biased action with total probability $q_2^*(p)$ is sequentially rational.

Now consider histories that start with $\delta_2 \ne \delta_2^*$. In this case, the sender provides no information in both periods irrespective of what the receiver does. Therefore, it is sequentially rational for the biased receiver to play the biased action with probability one after any $\delta_2 \ne \delta_2^*$. If the sender chooses $\delta_2 \ne \delta_2^*$, there is no information revelation and the biased receiver plays the biased action with probability one in both periods. Therefore, the cost of the sender is equal to $pb^2 + 1/4$, which is the highest possible cost for him, and hence higher than the cost of choosing $\delta_2^*$, which is at most $(1-\delta_2^*)\left[q_2^*(p)b^2 + 1/4\right] + \delta_2^*\left[pb^2 + q_2^*(p)/4\right]$.

We will now show that
$\sigma^*$ yields the unique payoff-dominant equilibrium outcome. Let us start with the biased receiver. As we have established in Fact 2, $q_2 > 0$ in any equilibrium, i.e., she plays the biased action in period 2 with positive probability. This implies that, in any equilibrium, her cost is equal to the cost of playing the biased action, i.e., $(1-\delta_2)c_2 + \delta_2/4$. Since $c_2 \le 1/4$, she prefers smaller $\delta_2$. But if $\delta_2$ is smaller than $\delta_2^*$, then Fact 1 implies that $q_2 = p > \bar q$ and hence she receives no information in period 2. Note that here we are using the Markovian property, which, as we have argued before, implies that if the total probability of the biased action is greater than $\bar q$ then the sender reveals no information. Therefore, if $\delta_2 < \delta_2^*$, then the cost of miscommunication in period 2 is $c_2 = 1/4$ and her total cost is the highest possible, i.e., $1/4$. Therefore, the best for her is the minimum $\delta_2$ that may possibly allow information revelation in period 2, i.e., $\delta_2^*$.

Now let us consider the unbiased receiver. If $q_2^*(p) \le \bar q$, then $\sigma^*$ is optimal for the unbiased receiver because her cost is zero and hence no other equilibrium can yield a lower cost. If, on the other hand, $q_2^*(p) > \bar q$, then her cost in our equilibrium is equal to $(1-\delta_2^*)/4$. Fact 2, i.e., $q_2 \ge q_2^*(p) > \bar q$, implies that her cost is at least $(1-\delta_2)/4$ in any equilibrium (again an implication of the Markovian property). If $\delta_2 \le \delta_2^*$, then $(1-\delta_2)/4 \ge (1-\delta_2^*)/4$, that is, the unbiased receiver's cost is at least as large as her cost in our equilibrium. If $\delta_2 > \delta_2^*$ and the unbiased receiver's cost is smaller than her cost in $\sigma^*$, then the biased receiver would be better off playing the unbiased action herself, which contradicts $q_2 \ge q_2^*(p) > 0$ (Fact 2).$^{19}$

19 More precisely, if $\delta_2 > \delta_2^*$ and $(1-\delta_2)/4 + \delta_2 c_1 < (1-\delta_2^*)/4$, or equivalently $\delta_2/4 > \delta_2^*/4 + \delta_2 c_1$, then $\delta_2^*/4 = (1-\delta_2^*)b^2$ implies that $\delta_2/4 > (1-\delta_2^*)b^2 + \delta_2 c_1 > (1-\delta_2)b^2 + \delta_2 c_1$. In other words, the cost of playing the biased action is larger than the cost of playing the unbiased action for the biased receiver, which contradicts $q_2 \ge q_2^*(p) > 0$, i.e., she plays the biased action with positive probability in period 2 in any equilibrium.
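The two-period cost expressions (3.4)-(3.6) and the indifference condition that pins down $\delta_2^*$ can be checked numerically. The sketch below assumes $\delta_2^* = 4b^2/(1+4b^2)$, as in equation (4.1), and $q_2^*(p) = 1 - (1-p)/(1-\bar q)$, as reported in Remark 4; the function names and the parameter values ($b = 1$, $p = 0.9$, $\bar q = 1/2$) are illustrative assumptions.

```python
from fractions import Fraction as F

def delta2_star(b):
    """Share of stakes left to period 1: solves (1 - d) * b^2 = d / 4."""
    return 4 * b**2 / (1 + 4 * b**2)

def two_period_costs(b, p, q_bar):
    """Costs under sigma^* in the two-period game, following (3.4)-(3.6)."""
    d = delta2_star(b)
    q = max(F(0), 1 - (1 - p) / (1 - q_bar))      # q_2^*(p)
    c = F(0) if q <= q_bar else F(1, 4)           # c_2^*
    sender = (1 - d) * (q * b**2 + c) + d * (p * b**2 + q / 4)   # (3.4)
    biased = (1 - d) * c + d / 4                                 # (3.5)
    unbiased = (1 - d) * c                                       # (3.6)
    return {"delta": d, "q": q, "c": c,
            "sender": sender, "biased": biased, "unbiased": unbiased}

# Illustrative parameters (assumptions for this sketch only).
b, p, q_bar = F(1), F(9, 10), F(1, 2)
out = two_period_costs(b, p, q_bar)
d, c = out["delta"], out["c"]

# Biased receiver's indifference between the unbiased and the biased action.
indifferent = (1 - d) * (b**2 + c) == (1 - d) * c + d / 4
# Sender's cost when no information is ever revealed.
no_info_cost = p * b**2 + F(1, 4)
print(out, indifferent, no_info_cost)
```

With these assumed values the biased receiver is exactly indifferent at $\delta_2^*$, and the sender's cost under $\sigma^*$ is below his cost when no information is revealed in either period.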
Now let us consider the sender. First note that since the cost of miscommunication in period 1 is at most $1/4$, i.e., $c_1 \le 1/4$, the sender's cost is strictly increasing in the total probability of the biased action in period 2, i.e., $q_2$ (see equation (3.1)). Intuitively, this is because a higher $q_2$ leads to a higher cost of the biased action today and a higher probability of miscommunication tomorrow.

Assume first that $\delta_2 < \delta_2^*$ in equilibrium. In this case, Fact 1 implies that the biased receiver plays the biased action with probability one in period 2, i.e., $q_2 = p > q_2^*(p)$. This implies that, for a fixed $\delta_2$, the sender's cost is strictly higher than his cost if $q_2$ were equal to $q_2^*(p)$: (1) the cost of miscommunication in period 2 is the highest possible, i.e., $1/4$; (2) the cost of the biased action is strictly higher, i.e., $pb^2 > (1-\delta_2)q_2^*(p)b^2 + \delta_2 pb^2$. Furthermore, $(1-\delta_2) > 1 - \delta_2^*$ implies that more weight is assigned to today. Since the cost of miscommunication today ($= 1/4$) is higher than the (expected) cost of miscommunication tomorrow ($= p/4$), smaller $\delta_2$ is costlier. Therefore, the sender's cost is higher than his cost in our equilibrium if $\delta_2 < \delta_2^*$.

Assume now that $\delta_2 > \delta_2^*$ and remember that $q_2 \ge q_2^*(p) > 0$ in any equilibrium. Since the sender's cost is increasing in $q_2$, the cost in any equilibrium where $q_2 > q_2^*(p)$ is higher than the cost if we replace $q_2$ with $q_2^*(p)$. Furthermore, when $q_2 = q_2^*(p)$ the sender's cost is strictly increasing in $\delta_2$: as $\delta_2$ increases, the cost of the biased action increases because in period 1 the biased receiver plays the biased action with a higher probability than in period 2. However, the cost of miscommunication decreases because it is equal to $1/4$ today while its expectation is $q_2^*/4$ tomorrow. A straightforward computation shows that when $b > 1/2$, the increase in the cost of the biased action outweighs the decrease in the cost of miscommunication. Therefore, starting with an arbitrary equilibrium with $q_2 > q_2^*(p)$ and $\delta_2 > \delta_2^*$ and decreasing $q_2$ to $q_2^*(p)$ and $\delta_2$ to $\delta_2^*$ decreases the sender's cost. This shows that the sender's cost is higher than his cost in our equilibrium if $\delta_2 > \delta_2^*$. We conclude that $\delta_2 = \delta_2^*$ and $q_2 = q_2^*(p)$ uniquely minimize the sender's cost.

The important ingredients in our result are the following facts that are true in any equilibrium: (1)
$\delta_2 < \delta_2^*$ implies that the biased receiver plays the biased action with probability one in period 2; (2) $q_2 \ge q_2^*(p) > 0$ in any equilibrium; (3) $q_2^*(p)$ is the smallest probability of the biased action that makes information revelation possible in the next period. The two other properties that we have used come from the Markovian property: (4) the receiver plays the biased action with the same probabilities after each message; (5) if the total probability of the biased action in period 2 is greater than $\bar q$, then there is no information revelation. The rest of the intuition can be summarized as follows: $\delta_2 < \delta_2^*$ is bad for the sender because it leads to a higher probability of the biased action and no information today and a higher probability of no information tomorrow. Similarly, it is bad for both types of the receiver because it leads to no information today. The sender prefers the smallest possible probability of the biased action today that allows information revelation tomorrow, which is $q_2^*(p)$. If this probability is indeed equal to $q_2^*(p)$, then he prefers smaller $\delta_2$ because the biased action is played with higher probability tomorrow, i.e., $p > q_2^*(p)$. The biased receiver also likes smaller $\delta_2$ because she may receive information today but will not receive it tomorrow. So, it is best for her that $\delta_2$ is the smallest that allows information revelation today, i.e., $\delta_2^*$. Large $\delta_2$ may benefit the unbiased receiver, but if it does then the biased receiver would also prefer playing the unbiased action, contradicting that she must play the biased action with positive probability, i.e., ingredient (2) above.

To summarize, if the initial reputation of the receiver is bad, i.e., $p > \bar q$, then in the unique payoff-dominant equilibrium outcome: (1) The relative size of the stakes in period 2 is chosen so that the receiver is indifferent between the biased and the unbiased actions for that period, given that the sender will communicate truthfully after the unbiased action and will provide no information otherwise; (2) The receiver mixes in such a way that her reputation next period is just good enough to make truthful communication possible; (3) A larger share of the stakes is left to the future ($\delta_2^* > 1/2$), i.e., starting small is the unique payoff-dominant equilibrium outcome.

Remark 2. In order to understand the role of reputational concerns, it might be instructive to compare the equilibrium we have presented above with the scenario in which the sender learns the receiver's type. Assuming that $b > 1/2$, once the sender learns that the receiver is biased, he will provide no information and the receiver will play the biased action in each period. Therefore, the cost of the sender will be $b^2 + 1/4$. If, on the other hand, the sender learns that the receiver is unbiased, then the sender's cost will be zero in every period. Therefore, if the sender knows that he will learn the type of the receiver, his ex ante cost is $p\left(b^2 + 1/4\right)$. This cost is higher than his cost in our equilibrium. Intuitively, in our equilibrium, the probability of the biased action is smaller in each period, which directly benefits the sender, and also leads to a higher probability of truthful communication in the future. This is the main reason why the sender does not prefer to screen the receiver early on in the game.

Remark 3. We should also note that, for our results, it is not necessary that the sender and the receiver share the same $\delta_2$. The main role of $\delta_2$ is to provide incentives to the biased receiver to play the unbiased action with positive probability. It is easy to show that if the sender has a fixed $\delta_2$ and the choice variable is the size of the stakes for the receiver, then our main result still goes through, i.e., $\sigma^*$ is still the payoff-dominant equilibrium.

Remark 4. In the analysis above we have restricted the search for payoff-dominant equilibrium to those in which the Markov property, i.e., Property 2, holds.
If, however, we allow the sender to coordinate his communication strategy across periods, then the incentive compatibility constraint for telling the truth is relaxed. When the Markov property holds, the binding constraint for telling the truth in the first period is given by $q_2 b^2 \le q_2 b^2 + 1 - 2q_2 b$, or $q_2 \le \bar q$, where $q_2$ is the total probability with which the agent plays the biased action. Without the Markov restriction, this constraint is given by
\[ (1-\delta_2)\, q_2 b^2 + \delta_2\left[p_2 b^2 + \frac{1}{4} q_2\right] \le (1-\delta_2)\left[q_2 b^2 + 1 - 2q_2 b\right] + \delta_2\left[p_2 b^2 + \frac{1}{4}\right], \]
which is equivalent to
\[ q_2 \le \hat q(\delta) \equiv \frac{\delta_2/4 + (1-\delta_2)}{\delta_2/4 + (1-\delta_2)\,2b}. \]
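The relation between the two truth-telling thresholds in this remark can be explored with a small sketch. It assumes $\bar q = 1/(2b)$, which is what the Markov constraint $q_2 b^2 \le q_2 b^2 + 1 - 2q_2 b$ reduces to; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction as F

def q_bar(b):
    """Markov threshold, assuming q2*b^2 <= q2*b^2 + 1 - 2*q2*b binds at q_bar."""
    return 1 / (2 * b)

def q_hat(delta, b):
    """Non-Markov threshold from the displayed expression for q_hat(delta)."""
    return (delta / 4 + (1 - delta)) / (delta / 4 + (1 - delta) * 2 * b)

def non_markov_slack(q2, delta, b, p2):
    """Cost of lying minus cost of truth-telling in the non-Markov constraint;
    truth-telling is incentive compatible when this is nonnegative."""
    truth = (1 - delta) * q2 * b**2 + delta * (p2 * b**2 + q2 / 4)
    lie = (1 - delta) * (q2 * b**2 + 1 - 2 * q2 * b) + delta * (p2 * b**2 + F(1, 4))
    return lie - truth

# Illustrative parameters (assumptions for this sketch only).
b, delta, p2 = F(1), F(4, 5), F(9, 10)
print(q_bar(b), q_hat(delta, b))
print(non_markov_slack(q_hat(delta, b), delta, b, p2))   # binds at q_hat
print(non_markov_slack(q_bar(b), delta, b, p2))          # slack at q_bar
```

For these assumed values the non-Markov constraint binds exactly at $\hat q(\delta)$ and is slack at $\bar q$, so the non-Markov threshold is the weaker requirement when $b > 1/2$ and $\delta_2 > 0$.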
Otherwise, the non-Markov payoff-dominant equilibrium is exactly the same as the Markov equilibrium. In particular, $q_2 = 1 - (1-p)/(1-\bar q)$ in any payoff-dominant equilibrium, both Markov and non-Markov. Since $\hat q(\delta) > \bar q$ for any $b > 1/2$, truth-telling is optimal for a larger set of prior probabilities in the non-Markov payoff-dominant equilibrium.

Remark 5. It is easy to adapt the arguments in the text and show that, in any payoff-dominant equilibrium, the receiver plays the biased action with the same probability after any message sent with positive probability. In other words, the symmetry part of the property is not binding for our results in the two-period model. However, the analysis rapidly becomes intractable for $N > 2$ without the symmetry assumption. Beyond tractability, there is also an economic justification for the symmetry assumption: the choice of stakes is independent of whether the receiver plays symmetrically because stakes are chosen to leave the receiver indifferent between playing the biased action in the current period and playing the unbiased action in the current period and then switching to the biased action in the subsequent period. Importantly, the probability of playing the biased action in the current period and therefore the communication strategy of the sender does not enter this calculation of the stakes.

Remark 6. If $p \le \bar q$, then the sender's cost in our equilibrium is $\delta_2^* p b^2$. The sender's cost in any equilibrium with $\delta_2 > \delta_2^*$ is at least $\delta_2 p b^2$, which is strictly greater than $\delta_2^* p b^2$. If $\delta_2 < \delta_2^*$, then $q_2 = p$ and the sender's cost is at least $p b^2$, which is also strictly greater than $\delta_2^* p b^2$. Therefore, our equilibrium uniquely minimizes the sender's cost when $p \le \bar q$ as well. However, when $p \le \bar q$, equilibrium $\sigma^*$ is not optimal for the receiver. Still, there is an equilibrium which is receiver-optimal and has the same property of starting small. In this equilibrium $\delta_2 = 1$, the sender provides full information in both periods and the biased receiver plays the unbiased action in the first and the biased action in the last period. In other words, the players effectively play a one-period game with the same prior. The costs of both the biased and the unbiased receivers are equal to zero in this equilibrium, while the cost of the biased receiver is at least $\delta_2^*/4$ in equilibrium $\sigma^*$.

4. The main result

In this section, we will show that the main results we have obtained in the two-period version of the model go through in the general model with $N$ periods: There is a unique sender-optimal equilibrium outcome and this outcome is also optimal for the receiver as long as the receiver has a sufficiently bad initial reputation. In other words, for sufficiently bad initial reputation levels, there is a unique equilibrium outcome that Pareto dominates all other equilibrium outcomes. If the potential conflict of interest is large enough, i.e., $b > 1/2$, this equilibrium outcome is characterized by "starting small", i.e., increasing the stakes over time as long as the receiver does the "right thing". We also show that as the receiver's potential bias $b$ increases, the initial stakes become smaller but they grow faster.

In order to facilitate the definition of the payoff-dominant equilibrium we first need some preliminary definitions. Let
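$\delta_1^* = 0$ and define $\delta_i^*$ recursively as
\[ \delta_2^* = \frac{4b^2}{1 + 4b^2} \tag{4.1} \]
\[ \delta_i^* = \frac{4b^2}{1 + 4b^2 \prod_{j=2}^{i-1} \delta_j^*}, \quad i = 3, \ldots, N \tag{4.2} \]
For any $p \in [0, 1]$, let$^{20}$
\[ q_i^*(p) = \begin{cases} \left[1 - \dfrac{1-p}{(1-\bar q)^{i-1}}\right]^+, & \bar q < 1 \text{ or } i = 1 \\ 0, & \text{otherwise} \end{cases} \tag{4.3} \]
Define the set of period $i$ histories $H_i^*$ as follows. $H_N^* = \{\emptyset\}$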
and, for any $i = N-1, \ldots, 1$, a history $h = (o_N, o_{N-1}, \ldots, o_{i+1}) \in H_i^*$ if

(1) the period $N$ outcome, $o_N = (\delta_N, \theta_N, m_N, a_N)$, is such that $\delta_N = \delta_N^*$ and
\[ a_N = \begin{cases} m_N, & \text{if } q_N^*(p) \le \bar q \\ 1/2, & \text{if } q_N^*(p) > \bar q \end{cases} \]
(2) for all $j = N-1, \ldots, i+1$, the period $j$ outcome, $o_j = (\delta_j, \theta_j, m_j, a_j)$, is such that $\delta_j = \delta_j^*$ and $a_j = m_j$.

In other words, a history belongs to $H_i^*$ if in each previous period $j$, $\delta_j^*$ has been chosen and the receiver played the unbiased action after each message believing that the sender was telling the truth (except in period $N$, where she believes the sender is telling the truth if and only if doing so is sequentially rational for the sender).

We will define the payoff-dominant equilibrium assessment
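$\sigma^* = (\tau_i, \mu_i, q_i, p_i, \lambda_i)$ for the sender-game and point out how it differs for the receiver-game in Section 5. After each history in $H_i^*$, the sender chooses $\delta_i^*$ and after any other history he chooses $\delta_i = 0$, i.e.,
\[ \tau_i(h) = \begin{cases} \delta_i^*, & h \in H_i^* \\ 0, & \text{otherwise} \end{cases} \tag{4.4} \]
If $h \in H_i^*$ and $\delta_i^*$ has been chosen in period $i$, then the receiver's total probability of playing the biased action is equal to $q_i^*(p_i(h))$, where $q_i^*$ is defined in (4.3); otherwise, the biased receiver plays the biased action with probability one, i.e.,$^{21}$
\[ q_i(h, \delta_i) = \begin{cases} q_i^*(p_i(h)), & \text{if } h \in H_i^*,\ \delta_i = \delta_i^*,\ p_i(h) < 1 \\ p_i(h), & \text{otherwise} \end{cases} \tag{4.5} \]
The sender communicates truthfully in period $i$ as long as the total probability of the biased action in that period is less than or equal to $\bar q$, the history belongs to $H_i^*$, and $\delta_i^*$ has been chosen. Since we assumed type $\theta_i = 0$ sender sends message $m_i = 0$, this implies that the probability with which type $\theta_i = 1$ sends message $m_i = 1$ is given by
\[ \mu_i(h, \delta_i) = \begin{cases} 1, & \text{if } h \in H_i^*,\ \delta_i = \delta_i^*,\ q_i(h, \delta_i) \le \bar q,\ p_i(h) < 1 \\ 0, & \text{otherwise} \end{cases} \tag{4.6} \]
which implies that
\[ \mu_N(\delta_N) = \begin{cases} 1, & \text{if } \delta_N = \delta_N^*,\ q_N^*(p) \le \bar q \\ 0, & \text{otherwise} \end{cases} \]
In any period $i$, the unbiased action is given by $a^0(\lambda_i(h, \delta_i, m))$, where
\[ \lambda_i(h, \delta_i, m) = \begin{cases} \dfrac{1 - \mu_i(h, \delta_i)}{2 - \mu_i(h, \delta_i)}, & m = 0 \\ 1, & m = 1 \end{cases} \tag{4.7} \]
Beliefs on the receiver's type are defined as follows: $p_N(\emptyset) = p$ and
\[ p_{i-1}(h, o_i) = 1 - \frac{1 - p_i(h)}{1 - q_i(h, \delta_i)}, \quad \text{for all } h \in H_i,\ o_i \in O_i \tag{4.8} \]
Note that if the players play according to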
$\sigma^*$ up to and including period $i+1$, which implies that $h_i \in H_i^*$, then $h_{i-1} \notin H_{i-1}^*$ if $\delta_i \ne \delta_i^*$ is chosen or if in period $i$ the receiver plays an action that is different from the unbiased action, i.e., plays $a_i \ne m_i$. In that case, the sender assigns probability one to the event that the receiver is biased, provides no information, and terminates the game by choosing $\delta_{i-1} = 0$.

Theorem 1. The assessment $\sigma^*$ is a perfect Bayesian equilibrium and induces the unique sender-optimal equilibrium outcome. If $p > 1 - (1-\bar q)^{N-1}$, then $\sigma^*$ is also a receiver-optimal equilibrium and therefore payoff-dominant.

In the equilibrium $\sigma^*$ described above, the sender leaves a proportion $\delta_i^*$ of the total stakes in the relationship to subsequent periods on the equilibrium path. This proportion leaves the receiver exactly indifferent between the biased and the unbiased actions in period $i$, given that in each subsequent period, the sender communicates truthfully after having observed the unbiased action in all prior periods and provides no information otherwise.

In order to further describe the equilibrium, let us first focus on the case in which $\sigma^*$ is payoff-dominant, i.e., $p > 1 - (1-\bar q)^{N-1}$. In this case the receiver plays the biased action with total probability equal to $1 - (1-p)/(1-\bar q)^{N-1}$ in period $N$ and plays the biased action with total probability $\bar q$ thereafter. The sender reports the state truthfully in every period except possibly the first period, i.e., period $N$ in our notation. In period $N$, the total probability of the biased action may exceed $\bar q$ and if this is the case the sender communicates no information. In other words, if the receiver has a sufficiently bad initial reputation, then informative communication may fail in the first period, i.e., period $N$, but communication is fully informative thereafter.

21 Note that in the last period δ

Fig. 1. Equilibrium behavior for $b = 1$, $p = 0.9$. (For interpretation of the colors in the figure(s), the reader is referred to the web version of this article.)

If $p \le 1 - (1-\bar q)^{N-1}$, then the sender-optimal equilibrium is not receiver-optimal anymore. In the sender-optimal equilibrium $\sigma^*$, the receiver plays the unbiased action with probability one until the game reaches period $k$, where $k$ is the first period (largest integer) such that $p > 1 - (1-\bar q)^{k-1}$. The receiver plays the biased action with total probability equal to $1 - (1-p)/(1-\bar q)^{k-1} \le \bar q$ in period $k$ and plays the biased action with total probability $\bar q$ in all subsequent periods. The receiver's reputation remains constant and equal to $1-p$ until the game reaches period $k$ and then monotonically increases in each period to reach exactly $1 - \bar q$ in the last period of the game. The sender reports the state truthfully in every period afte