Regret Aversion as a Coordination Device
Chris Gee*, Philip R. Neary†, Juan Pablo Rud‡
March 3, 2016
Abstract
Models of regret aversion require that agents can make an ex-post comparison between their choice and a foregone alternative. We drop this, assuming instead that agents learn about the outcome of an alternative choice with a probability that depends on the choices of others. This turns a series of simple one-person decision problems into a multi-person game where players coordinate on information. We use technology adoption as an example to show that our setting has a larger set of equilibria. In particular, universal non-adoption can be an equilibrium for a range of new-technology productivities wherein regret-neutral agents would unequivocally adopt. Non-adoption insures regret-averse agents against regret, as they never learn whether the new technology turned out to be successful or not.
Keywords: regret aversion; coordination; technology adoption; information
JEL codes: D03; D80; D81.
*Deloitte LLP
†Email: [email protected]; Web: https://sites.google.com/site/prneary/; Address: Department of Economics, Royal Holloway, University of London, Egham, Surrey, TW20 0EX.
1 Introduction
Economists model regret as the utility loss experienced from comparing a choice made, one that turned out to be suboptimal, to an alternative foregone.¹ However, implicit in this is that it is always possible for the decision maker to make an ex-post comparison. Yet in many situations, ranging from technology adoption (Ryan and Gross, 1943) to ordering food in a restaurant (Ariely and Levav, 2000), this may not be so. The reason is clear: in each of these settings the decision maker will only be capable of making an ex-post comparison if someone else went for an alternative option. In this paper we introduce a model with precisely this property, whereby an individual finds out about alternative choices with a probability that is a function of the choices of others. This seemingly innocuous assumption turns what was previously considered a series of single-person decision problems into a multi-player game where regret can facilitate coordination on an action that would not be observed if each agent were acting in isolation.
To make things concrete, consider the example of technology adoption. Everybody in the population is currently using the existing risk-free ‘old’ technology, and must decide between sticking with it or adopting a ‘new’ technology. The new technology is risky, but its expected productivity is higher. If individuals were traditional risk-neutral expected utility maximisers, all would adopt the new technology since in expectation it is better. But with regret-averse individuals, non-adoption can be an equilibrium because sticking with the old technology provides insurance against regret, as nobody will ever learn whether the new technology was successful or not. Consequently, the new technology has to be significantly better than previously thought in order to ensure uniform adoption.
Like many of the behavioural biases that have recently been incorporated into economic models, regret has, to our knowledge, only been formally considered for one-person decision problems. With the inclusion of regret as the object of coordination, our model in some sense fuses two standard frameworks for technology adoption: the network effect ‘coordination game’ settings like those in Morris (2000) and Young (2009), and the private information + public information ‘herding models’ of Bikhchandani, Hirshleifer, and Welch (1992) and Banerjee (1992).
2 The Decision Problem
2.1 Standard preferences
A risk neutral investor chooses $a \in \{0,1\}$, where 0 means sticking with an existing ‘old’ technology, and 1 means adopting a ‘new’ technology.² There are two states of the world, $\omega_1$ and $\omega_2$, that occur with probability $p$ and $1-p$ respectively (where $p \in (0,1)$). The new technology costs $c > 0$ to adopt and is successful only in state $\omega_1$, wherein it brings benefit $\theta$ ($> c$). If unsuccessful, the investor returns to the old technology, which yields a payoff normalised to 0.
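Formally, the investor has real-valued utility function $U : \{0,1\} \times \{\omega_1,\omega_2\} \to \mathbb{R}$, with
$$U(0,\omega) = \begin{cases} 0, & \text{if } \omega = \omega_1 \\ 0, & \text{if } \omega = \omega_2 \end{cases} \tag{1}$$
and
$$U(1,\omega) = \begin{cases} \theta - c, & \text{if } \omega = \omega_1 \\ -c, & \text{if } \omega = \omega_2 \end{cases} \tag{2}$$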
Letting $E_\omega$ denote the expectation operator, the investor will adopt the new technology if the expected utility from doing so, $E_\omega[U(1,\omega)]$, exceeds the expected utility of sticking with the old technology, $E_\omega[U(0,\omega)] = 0$. Solving this means the investor follows a threshold rule specifying adoption of the new technology if and only if
$$\theta \ge \theta^{\star} = \frac{c}{p} \tag{3}$$
2.2 Regret averse preferences
We now suppose that the investor experiences regret in the event that an undesirable outcome occurs. That is, if the investor learns that his choice was ex post suboptimal, then he experiences a psychological penalty proportional to the payoff shortfall in the realised state.
While the investor can certainly make a comparison between the outcome and the alternative in the case that he adopts the new technology, it is not so clear that he will be able to do so if he sticks with the old one. In order to experience regret in the case of non-adoption, there must be a chance that the investor will learn whether or not the new technology was ultimately successful. To capture this, let $q \in [0,1]$ be the probability that an investor learns the outcome of the new technology conditional on having stuck with the existing technology. The investor with regret-averse preferences then has utility function $U^R(a,\omega,q)$, with
$$U^R(0,\omega,q) = \begin{cases} -qk(\theta - c), & \text{if } \omega = \omega_1 \\ 0, & \text{if } \omega = \omega_2 \end{cases} \tag{4}$$
and
$$U^R(1,\omega,q) = \begin{cases} \theta - c, & \text{if } \omega = \omega_1 \\ -c - kc, & \text{if } \omega = \omega_2 \end{cases} \tag{5}$$
where $k \ge 0$ is the coefficient of regret aversion that, following Sarver (2008), we assume is state independent.
A regret averse investor will adopt the new technology only if it yields higher expected utility. So, solving $E_\omega[U^R(1,\omega,q)] \ge E_\omega[U^R(0,\omega,q)]$ yields the new threshold rule specifying adoption if and only if
$$\theta \ge \theta^{\star\star}(q) = \frac{c}{p}\left(\frac{1 + k\bigl(1 - p(1-q)\bigr)}{1 + qk}\right) \tag{6}$$
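The algebra behind (6) is not spelled out in the text. A sketch of the steps, using only the payoffs in (4) and (5) and the state probabilities $p$ and $1-p$, might run as follows.

```latex
\begin{align*}
E_\omega[U^R(1,\omega,q)] &= p(\theta - c) + (1-p)(-c - kc) = p\theta - c - (1-p)kc, \\
E_\omega[U^R(0,\omega,q)] &= -pqk(\theta - c).
\end{align*}
Adoption is weakly preferred when
\begin{align*}
p\theta - c - (1-p)kc &\ge -pqk(\theta - c) \\
p\theta(1 + qk) &\ge c\bigl(1 + k(1-p) + pqk\bigr) \\
\theta &\ge \frac{c}{p}\cdot\frac{1 + k\bigl(1 - p(1-q)\bigr)}{1 + qk}.
\end{align*}
```

The last line uses $1 + k(1-p) + pqk = 1 + k\bigl(1 - p(1-q)\bigr)$.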
We highlight how the expression in (6) varies with $q$ for a strictly regret averse investor, i.e. $k > 0$.³ First suppose $q = 1$. This is the case of an investor who will learn of the performance of the new technology no matter what. Here, there is no distortion to the threshold rule relative to standard preferences, in that,
$$\theta^{\star\star}(q)\big|_{q=1} = \theta^{\star} = \frac{c}{p} \tag{7}$$
Second, suppose $q = 0$, so that the investor knows that he will definitely not find out about the performance of the new technology unless he adopts. Then (6) becomes,
$$\theta^{\star\star}(q)\big|_{q=0} = \frac{c}{p}\bigl(1 + k(1-p)\bigr) \tag{8}$$
³ When $k = 0$, so that the investor does not experience regret, the threshold rule is independent of $q$.
Finally, suppose $q \in (0,1)$. Here, $\theta^{\star\star}$ takes values in $\bigl(\theta^{\star\star}(q)|_{q=1},\, \theta^{\star\star}(q)|_{q=0}\bigr)$. Furthermore, it can be checked that $\theta^{\star\star}(q)$ decreases strictly in $q$ over the range $[0,1]$. Intuitively, the threshold productivity for adopting the new technology becomes less demanding as $q$ increases because of the possibility of asymmetry in anticipated regret. As the likelihood of making an ex-post comparison in the case of non-adoption reduces, the investor is increasingly, from an ex-ante perspective, insured against regret. And because of this insurance against potential regret, the threshold on the new technology must increase to tempt the investor away from the old one.
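The comparative statics above can be checked numerically. The following is a minimal sketch that codes the thresholds in (3) and (6) directly; the function names and the parameter values are illustrative assumptions, not taken from the paper.

```python
def standard_threshold(c: float, p: float) -> float:
    """Threshold (3): a regret-neutral investor adopts iff theta >= c / p."""
    return c / p


def regret_threshold(c: float, p: float, k: float, q: float) -> float:
    """Threshold (6): a regret-averse investor adopts iff theta >= this value."""
    return (c / p) * (1 + k * (1 - p * (1 - q))) / (1 + q * k)


if __name__ == "__main__":
    # Illustrative parameter values (assumed, not from the paper).
    c, p, k = 1.0, 0.5, 2.0
    grid = [i / 10 for i in range(11)]
    for q in grid:
        print(f"q = {q:.1f}  threshold = {regret_threshold(c, p, k, q):.4f}")
```

With these assumed values the threshold falls from $(c/p)(1 + k(1-p))$ at $q = 0$ to $c/p$ at $q = 1$, in line with (7) and (8).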
In the next section, we extend the setting to one with lots of investors with the above preferences, but suppose that $q$ is determined by the choices of other investors. More precisely, we will assume that each investor’s $q$ is increasing in the number of other investors who adopt the new technology. From the perspective of each investor, this turns a straightforward single-person decision problem into a multi-player game.
3 The Game
3.1 The set up
The strategic setting is a symmetric $N$-player simultaneous move game given by the $2N$-tuple $G = (A_1, \dots, A_N, U_1^R, \dots, U_N^R)$, where each investor $i$ chooses an action $a_i$ from the set $A_i = \{0,1\}$ and has utility function $U_i^R : A \times \{\omega_1,\omega_2\} \to \mathbb{R}$, where $A := \prod_{j=1}^{N} A_j$, with typical element $a = (a_1, \dots, a_N)$. From player $i$’s perspective a pure action profile $a \in A$ can be viewed as $(a_i, a_{-i})$, so that $(\hat{a}_i, a_{-i})$ will refer to the profile $(a_1, \dots, a_{i-1}, \hat{a}_i, a_{i+1}, \dots, a_N)$, i.e., the action profile $a$ but with $\hat{a}_i$ replacing $a_i$.
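For each state, the utility function of investor $i$ is as defined in (4) and (5) save one difference: $q_i$ depends on the actions of the other investors. More precisely we have that
$$U_i^R\bigl((0,a_{-i}),\omega\bigr) = \begin{cases} -q_i(a)\,k(\theta - c), & \text{if } \omega = \omega_1 \\ 0, & \text{if } \omega = \omega_2 \end{cases} \tag{9}$$
and
$$U_i^R\bigl((1,a_{-i}),\omega\bigr) = \begin{cases} \theta - c, & \text{if } \omega = \omega_1 \\ -c - kc, & \text{if } \omega = \omega_2 \end{cases} \tag{10}$$
where $q_i$ is defined as
$$q_i(a) := \frac{\sum_{j \ne i} a_j}{N-1} \tag{11}$$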
While it seems implausible to us that the function $q_i$ would be decreasing in the fraction of other investors who adopt the new technology, the assumption that it is linearly increasing from 0 to 1 is made solely for reasons of tractability. For example, letting $x$ denote the fraction of other investors who have adopted the new technology, and abusing notation by writing $q$ as a function of $x$, one could easily imagine a convex or concave likelihood function, $q(x) = x^2$ or $q(x) = \sqrt{x}$, or a step-function where an investor is guaranteed to learn of the new technology’s performance once ‘enough’ others adopt it (maybe the news is announced via public radio), $q(x) = \mathbf{1}\{x \ge d\}$, where $\mathbf{1}$ denotes the indicator function and $d$ is the threshold. One might even imagine there is always some chance that a non-adopting investor will learn of the new technology, so that $q(x) = \alpha + \beta x$, where $\alpha > 0$, $\beta \ge 0$ and $\alpha + \beta \le 1$.
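The alternative learning-probability functions mentioned above can be written down directly. This is only an illustrative sketch: the function names are hypothetical, and the affine form assumes the slope symbol lost in the source is a parameter $\beta$ with $\alpha > 0$, $\beta \ge 0$ and $\alpha + \beta \le 1$.

```python
import math


def q_linear(x: float) -> float:
    """Baseline (11): q equals the fraction x of other investors who adopt."""
    return x


def q_convex(x: float) -> float:
    """Convex alternative, q(x) = x^2."""
    return x ** 2


def q_concave(x: float) -> float:
    """Concave alternative, q(x) = sqrt(x)."""
    return math.sqrt(x)


def q_step(x: float, d: float) -> float:
    """Step alternative: learning is certain once at least a fraction d adopt."""
    return 1.0 if x >= d else 0.0


def q_affine(x: float, alpha: float, beta: float) -> float:
    """Affine alternative, q(x) = alpha + beta * x (beta is an assumed symbol)."""
    if not (alpha > 0 and beta >= 0 and alpha + beta <= 1):
        raise ValueError("need alpha > 0, beta >= 0 and alpha + beta <= 1")
    return alpha + beta * x
```

Each function maps a fraction in $[0,1]$ to a probability in $[0,1]$ and is weakly increasing, which is the property the text relies on.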
3.2 Equilibria
For values of $\theta$ below $c/p$ and for values of $\theta$ above $\frac{c}{p}\bigl(1 + k(1-p)\bigr)$, the setting is a dominant strategy game and behaviour is easily classified. For values of $\theta$ in the interval between these two thresholds, however, the game is one of coordination. This brings us to our main result on the pure Nash equilibria, whose straightforward proof is omitted.⁴
Theorem. With $N$ investors, the set of pure Nash equilibria is:
1. $\{(0,\dots,0)\}$, when $\theta < c/p$,
2. $\{(1,\dots,1)\}$, when $\theta > \frac{c}{p}\bigl(1 + k(1-p)\bigr)$,
3. $\{(0,\dots,0),(1,\dots,1)\}$, when $\theta \in \bigl[c/p,\ \frac{c}{p}(1 + k(1-p))\bigr]$.
That the only pure strategy equilibria are symmetric is easily seen by imagining a 2-player variant of the game. Such a game is strategically equivalent to a stag hunt, and so, precisely as occurs with a large population stag hunt, the pure strategy equilibria are large population analogs of those from the 2-player game.
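Since the proof is omitted, the theorem can at least be checked by brute force for a small population. The sketch below enumerates every pure action profile and keeps those with no profitable unilateral deviation, using the expected payoffs implied by (9), (10) and the linear rule (11). Function names, the numerical tolerance and the parameter values are assumptions made for illustration.

```python
from itertools import product


def q_i(profile, i):
    """Equation (11): share of the other investors who adopt."""
    n = len(profile)
    return (sum(profile) - profile[i]) / (n - 1)


def expected_utility(action, q, theta, c, p, k):
    """Expected payoff of one investor, taking expectations over the two states."""
    if action == 1:
        return p * (theta - c) + (1 - p) * (-c - k * c)  # from (10)
    return -p * q * k * (theta - c)  # from (9)


def pure_nash_equilibria(n, theta, c, p, k):
    """Return all pure profiles with no profitable unilateral deviation."""
    equilibria = []
    for profile in product((0, 1), repeat=n):
        stable = True
        for i, a_i in enumerate(profile):
            q = q_i(profile, i)  # depends only on the others' actions
            current = expected_utility(a_i, q, theta, c, p, k)
            deviation = expected_utility(1 - a_i, q, theta, c, p, k)
            if deviation > current + 1e-12:  # small tolerance for float noise
                stable = False
                break
        if stable:
            equilibria.append(profile)
    return equilibria


if __name__ == "__main__":
    # Assumed values: c/p = 2 and (c/p)(1 + k(1 - p)) = 4.
    for theta in (1.5, 3.0, 5.0):
        print(theta, pure_nash_equilibria(4, theta, c=1.0, p=0.5, k=2.0))
```

For these assumed parameters the enumeration returns only the all-zero profile below the lower threshold, only the all-one profile above the upper threshold, and both symmetric profiles in between.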
3.3 Welfare and equilibrium selection
It is interesting to compare the (common) expected utility levels at each equilibrium. When there is uniform non-adoption, from (9) it is clear that each agent has expected utility of 0. With uniform adoption, (10) implies that each agent has expected utility $p\theta - c - (1-p)kc$. Comparing these two, we get that uniform adoption of the new technology is the Pareto dominant equilibrium if and only if $\theta \ge \theta^{\star\star}(q)|_{q=0} = \frac{c}{p}\bigl(1 + k(1-p)\bigr)$. But this is precisely the productivity threshold at which individuals would always adopt the new technology in any case. Thus, for values of $\theta$ where there are two equilibria, uniform non-adoption is always preferred.
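The welfare comparison reduces to the sign of one expression, which a short sketch can make concrete. The function names and parameter values are illustrative assumptions.

```python
def uniform_adoption_utility(theta, c, p, k):
    """Expected utility per investor when everyone adopts, from (10)."""
    return p * theta - c - (1 - p) * k * c


def pareto_preferred(theta, c, p, k):
    """Symmetric profile with the weakly higher common expected utility."""
    # Uniform non-adoption gives every investor 0, since q_i = 0 in (9).
    if uniform_adoption_utility(theta, c, p, k) >= 0:
        return "adoption"
    return "non-adoption"
```

With the assumed values $c = 1$, $p = 0.5$, $k = 2$, the utility from uniform adoption is negative for every $\theta$ strictly below $(c/p)(1 + k(1-p)) = 4$, so non-adoption is preferred throughout the interior of the coordination region.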
In symmetric large population binary action games, existing equilibrium selection techniques, be they evolutionary like stochastic stability (Kandori, Mailath, and Rob, 1993; Young, 1993) or higher-order belief based like global games (Carlsson and Damme, 1993; Morris and Shin, 2003), favour the equilibrium that is most difficult to destabilise. For this reason, it is important to compute the threshold on $q$, that we denote by $q^{\star}$, at which action 1 becomes optimal. Simple algebra yields that
$$q^{\star}(\theta,p,c,k) := \frac{1}{k(\theta - c)}\Bigl(\theta^{\star\star}(q)\big|_{q=0} - \theta\Bigr) \tag{12}$$
When $q^{\star} \le 1/2$, uniform adoption is the selected outcome; conversely, when $q^{\star} > 1/2$, nobody adopting is the predicted outcome.
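The selection rule can be illustrated with a short sketch of (12). It assumes $k > 0$ and $\theta > c$ so that the denominator is positive; the function names and parameter values are hypothetical.

```python
def q_star(theta, p, c, k):
    """Equation (12): the value of q at which an investor is indifferent."""
    # Follows from p*theta - c - (1-p)*k*c = -p*q*k*(theta - c), solved for q.
    upper = (c / p) * (1 + k * (1 - p))  # threshold (8), i.e. theta** at q = 0
    return (upper - theta) / (k * (theta - c))


def selected_outcome(theta, p, c, k):
    """Selection rule stated in the text: adoption iff q* <= 1/2."""
    return "adoption" if q_star(theta, p, c, k) <= 0.5 else "non-adoption"


if __name__ == "__main__":
    # Assumed values: the coordination region is theta in [2, 4].
    for theta in (2.2, 2.5, 3.0):
        print(theta, q_star(theta, 0.5, 1.0, 2.0), selected_outcome(theta, 0.5, 1.0, 2.0))
```

Under these assumed values $q^{\star}$ equals 1 at $\theta = c/p$ and 0 at the upper threshold, so the selected outcome switches from non-adoption to adoption as $\theta$ rises through the coordination region.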
4 Discussion
non-adoption of the new technology becomes an equilibrium outcome because doing so provides everybody with insurance against regret.
References
Ariely, D., and J. Levav (2000): “Sequential Choice in Group Settings: Taking the Road Less Traveled and Less Enjoyed,” Journal of Consumer Research, 27(3), 279–290.
Banerjee, A., A. G. Chandrasekhar, E. Duflo, and M. O. Jackson (2013): “The Diffusion of Microfinance,” Science, 341(6144).
Banerjee, A. V. (1992): “A Simple Model of Herd Behavior,” The Quarterly Journal of Economics, 107(3), 797–817.
Bikhchandani, S., D. Hirshleifer, and I. Welch (1992): “A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades,” Journal of Political Economy, 100(5), 992–1026.
Carlsson, H., and E. v. Damme (1993): “Global Games and Equilibrium Selection,” Econometrica, 61(5), 989–1018.
Diamond, D. W., and P. H. Dybvig (1983): “Bank Runs, Deposit Insurance, and Liquidity,” Journal of Political Economy, 91(3), 401–419.
Kandori, M., G. J. Mailath, and R. Rob (1993): “Learning, Mutation, and Long Run Equilibria in Games,” Econometrica, 61(1), 29–56.
Morris, S. (2000): “Contagion,” Review of Economic Studies, 67(1), 57–78.
Morris, S., and H. S. Shin (2003): “Global Games: Theory and Applications,” in “Advances in Economics and Econometrics, the Eighth World Congress”, Dewatripont, Hansen and Turnovsky, Eds.
Neary, P. R. (2012): “Competing conventions,” Games and Economic Behavior, 76(1), 301–328.
Ryan, B., and N. Gross (1943): “The diffusion of hybrid seed corn in two Iowa communities,” Rural Sociology, 8(1), 15–24.
Sarver, T. (2008): “Anticipating Regret: Why Fewer Options May Be Better,” Econometrica, 76(2), 263–305.
Young, H. P. (1993): “The Evolution of Conventions,” Econometrica, 61(1), 57–84.