Parameter Learning and the Carry Trade


Yichuan Wang∗

April 10, 2016

Abstract

Volatile beliefs generate large and predictable carry trade returns. In a symmetric two country model with power utility, risk aversion of 5, and constant gain learning about mean consumption growth, the carry trade earns a Sharpe ratio of 0.21. I extend this basic model to accommodate correlated belief updates and Epstein-Zin preferences. While certain extensions manage to replicate both large carry trade and equity Sharpe ratios, they are unable to do so simultaneously with smooth exchange rates. My model has the empirical implication that if a group of countries have similar true mean growth rates, then a growth carry of sorting countries by recent consumption growth should earn excess returns on par with the carry trade. I find support for this in a panel of 14 developed world countries.

JEL Classification: D83, G12, G15

Keywords: Carry Trade, Subjective Beliefs, Exchange Rates, Parameter Learning

∗ Wang: University of Michigan, [email protected]. For helpful comments and discussions, I would like to thank Stefan Nagel


1 Introduction

Parameter learning has been successful in explaining many closed economy asset pricing puzzles, such as the equity risk premium, return predictability, and excess volatility (Pastor and Veronesi, 2009; Lewellen and Shanken, 2002; Collin-Dufresne et al., 2015). Whereas agents in traditional rational expectations models know the true parameters governing the economy, agents in a learning model must use past data to infer the values of said parameters. Asset prices reflect agents’ time varying and uncertain beliefs. Learning models are therefore able to match the reality of large, predictable, and highly volatile equity returns.

This paper explores how parameter learning can help resolve two open economy asset pricing puzzles: the carry trade and exchange rate smoothness.

1. The carry trade puzzle refers to the failure of uncovered interest rate parity (UIP) to hold in the cross section.1 Investing in high interest rate currencies by borrowing in low interest rate currencies earns annual Sharpe ratios of around 0.50 after transaction costs (Lustig et al., 2011). Such a strategy is known as the carry trade. Moreover, carry trade returns are predictable using interest rate differentials. Backus et al. (2001) translate these facts into restrictions on stochastic discount factors (SDFs) in each country. They show that under conditionally log normal SDFs, rational expectations, and complete markets, carry trade conditional expected returns are equal to one-half the difference in the conditional variances of SDFs.2 Therefore large and time varying carry trade returns imply highly volatile and heteroskedastic SDFs. But given that observed consumption growth is smooth and nearly homoskedastic, why are carry trade returns so large? Solutions to the equity premium puzzle that increase the volatility of SDFs are not enough to resolve the carry trade puzzle. Instead, one needs an explanation for large time varying differences in SDF volatilities across countries.

2. The exchange rate smoothness puzzle refers to the fact that exchange rates are smooth relative to the level of equity market Sharpe ratios. This fact implies that SDFs across countries must be highly correlated (Brandt et al., 2006). But given that consumption growth is nearly independent across countries, what is the common risk factor that causes SDFs to be correlated? Smooth exchange rates are also at odds with large carry trade returns. Intuitively, the SDFs that are volatile enough to match carry trade returns also cause exchange rates to be too volatile (Lustig and Verdelhan, 2016). Therefore it is important for models of the carry trade to also match smooth exchange rates.

1 UIP fails in the time series as well. However, the asset pricing literature focuses on this cross sectional violation because it is easy to implement without estimating parameters about the time series relationship between forward discounts and currency depreciation.

2 Under more general distributions for the SDFs, the carry trade premium becomes the difference in conditional entropies. The fundamental insight is that the carry trade is about differences in higher order moments.


Rational expectations models resolve these puzzles by considering more sophisticated preferences and economies. Verdelhan (2010) uses a symmetric two country model with habit formation to generate the necessary heteroskedasticity in SDFs. However, his model fails to match exchange rate smoothness. Bansal and Shaliastovich (2012) use a two country model with heteroskedastic, correlated, long run risks to generate both heteroskedastic SDFs as well as smooth exchange rates. Farhi and Gabaix (2014) derive another resolution of the carry trade through disaster risks.

Beliefs open up a third dimension for explaining foreign exchange puzzles. Instead of relying on complicated preferences and consumption economies, I instead take a disciplined departure from rational expectations in the form of constant gain learning about mean consumption growth. There certainly exist rational expectations models that explain the carry trade and smooth exchange rates, but my goal is to show how time varying beliefs can offer a more parsimonious explanation of these puzzles.

I start with a simple two country model with power utility and independent consumption growth to outline the basic mechanisms for how beliefs can generate a carry trade. When agents have to learn about mean consumption growth they will end up perceiving exchange rate drifts that do not show up in the data, thereby leading to carry trade opportunities. Interest rate differentials then track the extent of these belief errors, and thus interest rate differentials end up predicting carry trade returns. Volatile beliefs also introduce a volatile common belief distortion term into the SDFs of different countries, thereby explaining the exchange rate smoothness puzzle.

Ideally, a model of beliefs and the carry trade should also be able to price equity risk. Such a model would demonstrate the essential unity of fluctuating beliefs as a way of explaining asset pricing anomalies. Moreover, the exchange rate smoothness puzzle started based on arguments about what equity Sharpe ratios implied about plausible levels of exchange rate smoothness. A model that can jointly match the equity premium, the carry trade, as well as the smoothness of exchange rates would be a model that more credibly addresses the exchange rate smoothness puzzle as well.

In order to generate large equity Sharpe ratios, I combine Epstein-Zin preferences with priced parameter uncertainty. A power utility model fails to generate large equity Sharpe ratios because the increase in parameter uncertainty caused by learning does not have a substantial effect on valuations. In contrast, Epstein-Zin preferences induce a preference for the early resolution of uncertainty. Uncertainty about mean consumption growth lowers equity valuations and raises wealth returns (Collin-Dufresne et al., 2015). The long run risks literature for international finance has made heavy use of correlated long run risks.3 The analogous mechanism in a model of beliefs is to have correlated belief updates. Intuitively, one might imagine that the developed world shares a common fate when it comes to long run consumption growth, but that the exact fate itself is undetermined. I derive novel tools for modeling multivariate constant gain learning with arbitrary belief correlations and study the asset pricing implications. Although my simple model is unable to simultaneously replicate large carry trade Sharpe ratios, equity Sharpe ratios, and smooth exchange rates, understanding why sheds light on the limitations of models of belief formation in explaining anomalies in international finance.

3 For example, Colacito and Croce (2011) and Bansal and Shaliastovich (2012) use correlated long run risks in their models that generate both smooth exchange rates and large equity premia.

I then evaluate the empirical predictions of my beliefs based story of the carry trade. My model predicts that carry trade returns should be determined by differences in expected growth rates, where expected growth rates come from an exponentially weighted average of past consumption growth. I find support for this hypothesis in a panel of 14 developed world countries. This stylized fact also distinguishes my story of the carry trade from the existing long run risks and disasters explanations.

2 A Simple Model of Parameter Learning and the Carry Trade

A simple two country Lucas model most clearly illustrates the effect of learning. Let lower case letters denote logs. Global consumption growth is iid and distributed

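$$\Delta \mathbf{c}_{t+1} = \begin{pmatrix} \Delta c_{t+1} \\ \Delta c^*_{t+1} \end{pmatrix} \sim MVN\left( \begin{pmatrix} g \\ g \end{pmatrix}, \sigma^2 I \right) \tag{2.1}$$

where stars denote foreign variables. Agents know the variance $\sigma^2$ but not the true mean growth rate $g$. Agents have power utility. This implies log SDFs of

$$m_{t+1} = \log\delta - \gamma \Delta c_{t+1} \tag{2.2}$$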

Because agents do not know the true growth rate $g$, they instead form beliefs $\tilde g_t$ about consumption growth in the domestic country through constant gain learning:

$$\tilde g_t = \alpha \Delta c_t + (1-\alpha)\tilde g_{t-1} = \sum_{j=0}^{\infty} \alpha (1-\alpha)^j \Delta c_{t-j} \tag{2.3}$$

with gain parameter $\alpha$. Symmetric equations hold for foreign preferences and beliefs.
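The equivalence between the recursive and the weighted-sum forms of equation 2.3 can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, the quarterly mean and volatility are assumed to be the annual calibration figures divided by 4 and by $\sqrt{4}$, and the finite sample keeps a decayed prior term $(1-\alpha)^t \tilde g_0$ that vanishes in the infinite sum.

```python
import random

def constant_gain_beliefs(growth, alpha, g0):
    """Recursive form: g_t = alpha * dc_t + (1 - alpha) * g_{t-1}."""
    beliefs, g = [], g0
    for dc in growth:
        g = alpha * dc + (1 - alpha) * g
        beliefs.append(g)
    return beliefs

def weighted_sum_belief(growth, alpha, g0):
    """Finite-sample version of the sum in (2.3), plus the decayed prior."""
    t = len(growth)
    total = (1 - alpha) ** t * g0
    for j in range(t):
        total += alpha * (1 - alpha) ** j * growth[t - 1 - j]
    return total

rng = random.Random(0)
alpha = 0.02
g, sigma = 0.0151 / 4, 0.0167 / 2   # assumed quarterly mean and volatility
growth = [rng.gauss(g, sigma) for _ in range(400)]

recursive = constant_gain_beliefs(growth, alpha, g0=g)[-1]
direct = weighted_sum_belief(growth, alpha, g0=g)
print(recursive, direct)
```

Both numbers agree up to floating point error, which is the sense in which constant gain beliefs are an exponentially weighted average of past growth.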

To see where the name constant gain learning comes from, it is useful to compare equation 2.3 to the full Bayesian benchmark. In that case, the belief at stage $t$ is just the sample mean, and so the updating equation can be written as

$$\tilde g_t = \underbrace{\frac{1}{t}}_{\text{Gain}} \Delta c_t + \left(1 - \frac{1}{t}\right)\tilde g_{t-1} \tag{2.4}$$

In this case, the gain parameter at stage $t$ is $1/t$ and decreasing over time. As $t \to \infty$, $\tilde g_t \to g$ almost surely and learning stops.

In contrast, constant gain learning replaces the decreasing sequence $1/t$ by a small constant $\alpha$. Therefore agents never stop learning about the true growth rate. There are three reasons to focus on this case:

1. It is a sensible rule for learning if the agent is concerned about structural shifts in the true growth rate $g$. In this case the agent will want to overweight recent data when forming expectations.

2. There is empirical support for this model of belief formation. In the macro literature, Milani (2007) finds that constant gain learning is able to dramatically improve the fit of DSGE models to match smooth inflation and consumption growth. Malmendier and Nagel (2015) use survey microdata and find that respondents seem to form expectations in a fully Bayesian fashion but only over the events they see in their own lifetime. Embedding this form of limited Bayesian learning in an overlapping generations framework produces a learning process, when averaged over generations, that closely approximates constant gain learning.

3. It is analytically useful. Beliefs evolve according to an AR(1) and therefore have a stationary, non-degenerate distribution for all time periods $t$. Thus the trading opportunities caused by learning do not fade away with time.
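The contrast between the decreasing gain $1/t$ of equation 2.4 and a constant gain can be illustrated with a short simulation. This is a sketch under stated assumptions: `update_beliefs` is a hypothetical helper, the quarterly parameters are assumed conversions of the annual calibration, and the random seed is arbitrary.

```python
import random
import statistics

def update_beliefs(growth, gain, g0=0.0):
    """Apply g_t = gain(t) * dc_t + (1 - gain(t)) * g_{t-1} for t = 1, 2, ..."""
    g, path = g0, []
    for t, dc in enumerate(growth, start=1):
        k = gain(t)
        g = k * dc + (1 - k) * g
        path.append(g)
    return path

rng = random.Random(1)
g_true, sigma = 0.0151 / 4, 0.0167 / 2   # assumed quarterly values
growth = [rng.gauss(g_true, sigma) for _ in range(4000)]

bayes = update_beliefs(growth, gain=lambda t: 1.0 / t)           # equation (2.4)
const = update_beliefs(growth, gain=lambda t: 0.02, g0=g_true)   # equation (2.3)

# Dispersion of each belief path over the second half of the sample.
late_bayes = statistics.pstdev(bayes[-2000:])
late_const = statistics.pstdev(const[-2000:])
print(late_bayes, late_const)
```

With the $1/t$ gain the final belief coincides with the sample mean and barely moves late in the sample, while the constant gain belief keeps fluctuating, which is what keeps belief errors alive in the model.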

Because consumption growth is independent across countries, the beliefs $\tilde g_t$, $\tilde g^*_t$ are independent. Agents know that they are learning from this process. So even though agents know that the true volatility of consumption growth is $\sigma^2$, they have a slightly wider predictive density because of the uncertainty induced by the mean. In particular,

$$\mathrm{Var}(\tilde g) = \sum_{j=0}^{\infty} \alpha^2 (1-\alpha)^{2j}\, \mathrm{Var}(\Delta c_{t-j}) = \sum_{j=0}^{\infty} \alpha^2 (1-\alpha)^{2j} \sigma^2 = \frac{\alpha^2}{1-(1-\alpha)^2}\,\sigma^2 = \frac{\alpha}{2-\alpha}\,\sigma^2 \tag{2.5}$$
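The geometric sum in equation 2.5 is easy to verify numerically. The following sketch truncates the series at a large number of terms and compares it with the closed form; the function names and the quarterly volatility are illustrative assumptions.

```python
def belief_variance_series(alpha, sigma2, terms=5000):
    """Truncated version of the infinite sum in equation (2.5)."""
    return sum(alpha**2 * (1 - alpha) ** (2 * j) * sigma2 for j in range(terms))

def belief_variance_closed_form(alpha, sigma2):
    """Closed form alpha / (2 - alpha) * sigma^2 from equation (2.5)."""
    return alpha / (2 - alpha) * sigma2

alpha = 0.02
sigma2 = (0.0167 / 2) ** 2   # assumed quarterly consumption variance
series = belief_variance_series(alpha, sigma2)
closed = belief_variance_closed_form(alpha, sigma2)
print(series, closed, closed / sigma2)
```

For $\alpha = 0.02$ the belief variance is roughly one percent of the consumption variance, so the predictive density is only slightly wider than the true one.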

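Therefore under the subjective distribution domestic and foreign consumption growth $\Delta \mathbf{c}_{t+1}$ is distributed

$$\Delta \mathbf{c}_{t+1} \sim MVN\left( \begin{pmatrix} \tilde g_t \\ \tilde g^*_t \end{pmatrix}, \tilde\Sigma \right), \qquad \tilde\Sigma = \frac{\alpha\sigma^2}{2-\alpha} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \sigma^2 \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

Define $\tilde\sigma^2 = \left(1 + \frac{\alpha}{2-\alpha}\right)\sigma^2$ to be the subjective variance of domestic consumption growth. Agents price assets under the subjective Euler equation

$$1 = \tilde E[M_{t+1} R_{t+1}] = \int \phi\left( \frac{\Delta c_{t+1} - \tilde g_t}{\tilde\sigma} \right) \times \delta \left( \frac{C_{t+1}}{C_t} \right)^{-\gamma} R_{t+1}\, d(\Delta c_{t+1}) \tag{2.6}$$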

where $\phi(\cdot)$ is the density of a standard normal distribution. This pricing equation differs from the rational expectations benchmark because agents price assets based on their subjective expectation $\tilde g_t$ for future consumption growth instead of the true growth rate $g$.4 I use $\tilde E$ to denote the expectation under the agents’ subjective beliefs. Based on this Euler equation, the risk free rate reflects the subjectively expected growth rate

$$r^f_{t+1} = -\log\delta + \gamma \tilde g_{t+1} - \frac{1}{2}\gamma^2 \tilde\sigma^2 \tag{2.7}$$

The solutions for the price-consumption ratio also reflect subjective expectations:

\begin{align}
PC_{t+1} &= \frac{k}{1-k} \tag{2.8} \\
k &= \delta \exp\left( (1-\gamma)\tilde g_{t+1} + \frac{1}{2}(1-\gamma)^2 \tilde\sigma^2 \right) \tag{2.9} \\
r^w_{t+1} &= \log\frac{PC_{t+1}+1}{PC_t} + \Delta c_{t+1} \tag{2.10}
\end{align}
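To see how beliefs feed into prices, equations 2.7 to 2.9 can be evaluated directly. The sketch below is not the paper's code: the function names are hypothetical, the values $\delta = 0.995$, $\gamma = 5$, $\alpha = 0.02$ are taken from the calibration in Section 2.1, and treating them as quarterly together with the quarterly conversion of the consumption moments is an assumption.

```python
import math

def subjective_variance(alpha, sigma):
    """Subjective variance (1 + alpha / (2 - alpha)) * sigma^2."""
    return (1 + alpha / (2 - alpha)) * sigma**2

def risk_free_rate(g_belief, delta, gamma, var_subj):
    """Equation (2.7): log risk free rate given the current belief."""
    return -math.log(delta) + gamma * g_belief - 0.5 * gamma**2 * var_subj

def price_consumption_ratio(g_belief, delta, gamma, var_subj):
    """Equations (2.8)-(2.9): PC = k / (1 - k)."""
    k = delta * math.exp((1 - gamma) * g_belief + 0.5 * (1 - gamma) ** 2 * var_subj)
    if k >= 1:
        raise ValueError("k must be below 1 for a finite price-consumption ratio")
    return k / (1 - k)

delta, gamma, alpha = 0.995, 5.0, 0.02
g, sigma = 0.0151 / 4, 0.0167 / 2       # assumed quarterly values
var_subj = subjective_variance(alpha, sigma)

rf = risk_free_rate(g, delta, gamma, var_subj)
rf_optimistic = risk_free_rate(g + 0.001, delta, gamma, var_subj)
pc = price_consumption_ratio(g, delta, gamma, var_subj)
pc_optimistic = price_consumption_ratio(g + 0.001, delta, gamma, var_subj)
print(rf, rf_optimistic, pc, pc_optimistic)
```

A belief that is 0.1 percentage points higher raises the risk free rate by $\gamma$ times that amount and, because $\gamma > 1$, lowers the price-consumption ratio.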

I close by specifying foreign exchange markets. I assume complete markets.5 Therefore exchange rates are determined by ratios of SDFs.6 Let $S_t$ denote the time $t$ spot exchange rate, expressed in units of foreign currency per domestic currency. Then

$$\Delta s_{t+1} = m_{t+1} - m^*_{t+1} \tag{2.11}$$

Note that the foreign currency appreciates in bad times for the foreign investor and depreciates in good times for the foreign investor. Therefore when foreign consumption growth is high relative to domestic consumption growth, the foreign currency depreciates. Covered interest rate parity (CIP) governs the forward rate. Let $F_{t\to t+1}$ denote the forward rate at time $t$ for a contract that expires at time $t+1$. That is, an agent, at time $t$, can lock in a contract to buy $F_{t\to t+1}$ units of foreign currency per domestic currency in period $t+1$. The law of one price implies that

$$f_{t\to t+1} = s_t + r^{f,*}_t - r^f_t \tag{2.12}$$

This condition comes about from analyzing two options that the agent has for investing one unit of domestic currency. In the first case, she can buy a forward contract, save the currency at home to have $R^f$ in the home currency next period, and then convert into $R^f F_{t\to t+1}$ units of foreign currency. Alternatively, she can convert today to get $S_t$ units of foreign currency, and save that abroad at the interest rate of $R^{f,*}$. Since these two quantities can be locked in at period $t$, they must be equal, and so $R^f F_{t\to t+1} = S_t R^{f,*}$. Taking logs yields equation 2.12, and these relationships are illustrated in the diagram below.

$$\begin{array}{lcl} \text{Home: } 1 & \xrightarrow{\;R^f\;} & R^f \\ \text{Foreign: } S_t & \xrightarrow{\;R^{f,*}\;} & R^{f,*} S_t = R^f F_{t\to t+1} \end{array}$$

My endowment model is robust to the introduction of production and trade costs. Assuming agents are optimizing their consumption, then the above equations are still first order conditions for the final consumption choices after agents have produced the optimal amounts of consumption and engaged in international trade. In fact, given that the exchange rate fluctuates, my model implicitly assumes that there are trade costs; otherwise purchasing power parity would hold and the exchange rate would be fixed at 1. But so long as agents optimize their consumption choices and financial markets are complete, consumption growth, interest rates, and exchange rates will be consistent with my model.

My model does assume that there is only one type of consumption produced in the world that enters into both home and foreign utility functions additively. The exchange rate is the relative price of this consumption good at home versus consumption abroad.7 There is no home bias in consumption, nor any special division between tradable and non-tradable goods. Rather, my model is a minimal one in which the real exchange rate is just the relative price of a consumption good that both domestic and foreign agents consume.8

4 In this simple model, agents also ignore the effects of future parameter revisions on price. Ignoring future parameter revisions is a useful approximation in the case of power utility (Collin-Dufresne et al., 2015). Intuitively, because the power utility SDF depends only on one period’s realized consumption growth, parameter uncertainty does not make the SDF more volatile.

5 One could also ask whether relaxing the complete markets assumption is a more productive way to resolve exchange rate puzzles. Lustig and Verdelhan (2016), however, show that incorporating market incompleteness to existing complete markets models is not sufficient to jointly produce smooth exchange rates, large carry trade premia, and exchange rates that are uncorrelated with consumption growth. Therefore it is still necessary to consider various complete market benchmarks and to then figure out which particular cases of market incompleteness can help bridge the remaining facts.

6 For a short proof, note there are two candidate SDFs for pricing domestic returns: $M_{t+1}$ and $\frac{S_{t+1}}{S_t} M^*_{t+1}$. The second SDF arises because $\frac{S_{t+1}}{S_t} M^*_{t+1}$ becomes a candidate SDF for pricing domestic returns. Since SDFs are unique in complete markets, then $M_{t+1} = \frac{S_{t+1}}{S_t} M^*_{t+1}$. Equation 2.11 then follows from taking logs and rearranging terms. See Backus et al. (2001) for a more detailed explanation.
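The covered interest parity argument behind equation 2.12 can be made concrete with a small numerical check. All numbers below are hypothetical and chosen only for illustration; `forward_rate` is an illustrative helper that solves $R^f F_{t\to t+1} = S_t R^{f,*}$ for the forward rate.

```python
import math

def forward_rate(spot, gross_rf_home, gross_rf_foreign):
    """Covered interest parity: R^f * F = S * R^{f,*}, so F = S * R^{f,*} / R^f."""
    return spot * gross_rf_foreign / gross_rf_home

spot = 0.80          # hypothetical spot rate, foreign units per domestic unit
rf_home = 1.010      # hypothetical gross domestic risk free rate
rf_foreign = 1.015   # hypothetical gross foreign risk free rate
fwd = forward_rate(spot, rf_home, rf_foreign)

route_forward = rf_home * fwd     # save at home, convert at the forward rate
route_spot = spot * rf_foreign    # convert today, save abroad

# Log version, equation (2.12): f = s + r^{f,*} - r^f.
log_cip_gap = math.log(fwd) - (math.log(spot) + math.log(rf_foreign) - math.log(rf_home))
print(fwd, route_forward, route_spot, log_cip_gap)
```

Both routes deliver the same amount of foreign currency, and the forward rate exceeds the spot rate exactly when the foreign interest rate is the higher one.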

2.1 Model Results

I simulate this model for 40,000 quarters. I calibrate consumption growth to the post 1980 developed world per capita consumption growth data, with an annualized mean of 1.51% and standard deviation of 1.67%. I set $\delta = 0.995$ and $\gamma = 5$. I calibrate the gain parameter $\alpha = 0.02$ to match the empirical estimates from Malmendier and Nagel (2015) and Milani (2007). Table 1 reports key moments from the simulation. Three themes emerge from the results:

2.1.1 Persistent Excess Returns Do Not Imply Compensation for Risk

One argument for risk based explanations of the carry trade is that the well known historical performance of the carry trade rules out stories of persistently biased agent expectations.9 But constant gain learning means that even though beliefs about the growth rate of the economy are unconditionally unbiased, belief errors and carry trade profits never go away.10 The returns section of Table 1 reports large Sharpe ratios of 0.16 for a simple strategy of just buying the high interest rate currency, and a higher Sharpe ratio of 0.21 when the strategy scales with the size of the interest rate differential.11 One way to conceptualize the magnitude of a 0.21 Sharpe ratio is to compare to Sharpe ratios typically seen in a power utility model with $\gamma = 5$. For example, my carry Sharpe ratio is more than twice the Sharpe ratio of the wealth return (0.08). As such it is important to understand why such large returns exist in equilibrium.

7 Equivalently, one unit of home consumption can be exchanged for $S_t$ units of foreign consumption.

8 Interpreting my model as a story about two countries (the United States and the United Kingdom) can clarify my implicit assumptions about production and trade. Agents in both the US and UK have their own apple trees. Agents are indifferent between consuming American or British apples, and because it is costly to trade between the two countries, the relative price of American and British apples fluctuates.

9 Lustig and Verdelhan (2008) write “The average log excess return after transaction costs has increased from 227 basis points in the first part of the sample to 698 basis points per annum in the second part of the sample. Clearly, this is not a temporary anomaly that is about to be arbitraged away, and understanding carry trade returns is a critical step toward a better understanding of exchange rates.”

10 My result that learning implies carry trade profits would also hold in a fully Bayesian setting. So long as the beliefs are somewhat volatile, carry trade profits will exist. However, in that Bayesian setting profits would eventually go to zero as learning stops. Therefore constant gain learning allows me to more realistically model robust belief formation on the part of the agent as well as allowing profit opportunities to exist over time.

11 This carry trade Sharpe ratio is lower than the 0.40 Sharpe ratio for going long/short portfolios of developed country currencies sorted by interest rates (Lustig et al., 2011). They get larger Sharpe ratios because currency portfolios have lower volatility. But it is hard to compare their results with my result for two countries directly. One way to make a comparison is to suppose I had access to more currency pairs, thereby scaling expected returns by the number of additional countries and the exchange rate volatility at rate $\sqrt{n}$. Then if I go from 2 countries to 14 countries, as I do in my empirical analysis, then, assuming independent consumption, my Sharpe ratio would also be multiplied by $\sqrt{13}/\sqrt{2} \approx 2.54$. My Sharpe ratio would then be more comparable. However, this is a very ad-hoc comparison, and I have a more thorough empirical test of my theory in section 4.

Analytically solving the carry trade premium reveals why the agents leave carry trade profits on the table. Buy the forward contract and sell it at the spot in the next period. The profits from this transaction are:

\begin{align}
f_{t\to t+1} - s_{t+1} &= -\Delta s_{t+1} + r^{f,*}_t - r^f_t \tag{2.13} \\
&= m^*_{t+1} - m_{t+1} - \log \tilde E_t\left[\exp(m^*_{t+1})\right] + \log \tilde E_t\left[\exp(m_{t+1})\right] \tag{2.14} \\
&= m^*_{t+1} - m_{t+1} - \left( \tilde E_t(m^*_{t+1}) + \frac{1}{2}\widetilde{\mathrm{Var}}_t(m^*_{t+1}) \right) + \left( \tilde E_t(m_{t+1}) + \frac{1}{2}\widetilde{\mathrm{Var}}_t(m_{t+1}) \right) \tag{2.15} \\
&= \left( m^*_{t+1} - \tilde E_t(m^*_{t+1}) \right) - \left( m_{t+1} - \tilde E_t(m_{t+1}) \right) \tag{2.16}
\end{align}

Equation 2.13 uses CIP. Equation 2.14 expands the exchange rate using equation 2.11 and the risk free rate using equation 2.7. Variance terms appear in equation 2.15 due to a Jensen inequality correction, and variances are eliminated from the last line because consumption variances, and thus SDF variances, are equal in the two countries.

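To find the carry premium, I need to take expectations of equation 2.16. First, I take expectations under subjective beliefs. In that case

$$\tilde E[f_{t\to t+1} - s_{t+1}] = \tilde E\left[ \left( m^*_{t+1} - \tilde E_t(m^*_{t+1}) \right) - \left( m_{t+1} - \tilde E_t(m_{t+1}) \right) \right] = 0 \tag{2.17}$$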

This equation demonstrates that agents do not perceive any carry trade premium. On the other hand, when the econometrician measures the excess returns from the carry trade, she takes expectations under the objective distribution for consumption growth. In that case

\begin{align}
E_t[f_{t\to t+1} - s_{t+1}] &= \left( E_t m^*_{t+1} - \tilde E_t m^*_{t+1} \right) - \left( E_t m_{t+1} - \tilde E_t m_{t+1} \right) \tag{2.18} \\
&= \gamma\left[ (\tilde g^*_t - g) - (\tilde g_t - g) \right] \tag{2.19} \\
&= \gamma(\tilde g^*_t - \tilde g_t) \tag{2.20} \\
&= r^{f,*}_t - r^f_t \tag{2.21}
\end{align}

Thus the econometrician identifies a carry trade premium equal to the entire interest rate differential. Agents fail to enter the carry trade because uncovered interest rate parity holds under the agents’ subjective beliefs. High interest rate countries are those with high subjectively expected growth rates. But because of the positive relationship between a currency’s strength and the value of the country’s SDF, agents forecast currency depreciations for countries with high subjective growth rates. This forecasted depreciation is precisely equal to the interest rate differential. On the other hand, the econometrician exploits the knowledge that consumption growth in the two countries share the same mean. Under the objective measure, SDFs are symmetric, exchange rates are a random walk, and the econometrician measures a carry trade premium that equals the interest rate differential. The carry trade exists not because agents are too risk averse, but rather because agents do not believe the carry trade will earn excess returns at all.
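The prediction in equations 2.18 to 2.21, that the objective expected payoff equals the full interest rate differential, can be illustrated by regressing simulated payoffs on the differential. This is only a sketch: the function names are hypothetical, the quarterly parameters are assumed conversions of the annual calibration, beliefs are started at the true mean, and the sample is longer than the 40,000 quarters used in the paper to reduce sampling noise.

```python
import random

def simulate_carry(n, alpha, gamma, g, sigma, seed=0):
    """Simulate interest differentials and payoffs f_{t->t+1} - s_{t+1}."""
    rng = random.Random(seed)
    belief_h = belief_f = g
    diffs, payoffs = [], []
    for _ in range(n):
        diff = gamma * (belief_f - belief_h)   # r^{f,*}_t - r^f_t, equations (2.20)-(2.21)
        dc_h = rng.gauss(g, sigma)             # objective growth at home
        dc_f = rng.gauss(g, sigma)             # objective growth abroad
        ds = gamma * (dc_f - dc_h)             # Delta s_{t+1} from (2.11) and (2.2)
        diffs.append(diff)
        payoffs.append(diff - ds)              # equation (2.13)
        belief_h = alpha * dc_h + (1 - alpha) * belief_h
        belief_f = alpha * dc_f + (1 - alpha) * belief_f
    return diffs, payoffs

def ols_slope(x, y):
    """Slope of a univariate least squares regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

diffs, payoffs = simulate_carry(200_000, 0.02, 5.0, 0.0151 / 4, 0.0167 / 2)
slope = ols_slope(diffs, payoffs)
print(slope)
```

A slope near one means that, under the objective distribution, each unit of interest rate differential shows up as a unit of expected carry payoff.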

Another way to understand why agents are skeptical about carry trade profits comes from thinking about the carry trade returns that they must have observed as the two countries’ interest rates $r^f$, $r^{f,*}$ pulled apart. Suppose that $r^{f,*} > r^f$, so that there are large expected returns from investing in the foreign currency. By the Euler equation foreign growth expectations must also be higher than domestic growth rate expectations. For this to have occurred, it must have been that foreign growth consistently came in above the subjective expectation in the recent past. Given the tight link between growth and exchange rate depreciations imposed by equation 2.11, it is also the case that foreign exchange rate depreciations came in larger than the interest rate differential in the recent past. During this time, the carry trade has consistently lost money. Constant gain learning then means this should affect the agents’ future expectations, thereby making the agent unable to perceive the carry trade premium. Formally, we can decompose the current interest rate differential into past carry trade profits and past interest rate differentials:

\begin{align}
r^{f,*}_t - r^f_t &= \gamma(\tilde g^*_t - \tilde g_t) \notag \\
&= \gamma\left[ \alpha(\Delta c^*_t - \Delta c_t) + (1-\alpha)(\tilde g^*_{t-1} - \tilde g_{t-1}) \right] \notag \\
&= -\alpha\left( r^{f,*}_{t-1} - r^f_{t-1} - \Delta s_t \right) + r^{f,*}_{t-1} - r^f_{t-1} \tag{2.22} \\
&= -\alpha \sum_{i=0}^{j-1} \left( r^{f,*}_{t-i-1} - r^f_{t-i-1} - \Delta s_{t-i} \right) + r^{f,*}_{t-j} - r^f_{t-j} \tag{2.23}
\end{align}

Note that the term inside the parentheses is precisely the return from the carry trade. Therefore if the interest rate differential increased from period $t-j$ to $t$, the term $\sum_{i=0}^{j-1} \left( r^{f,*}_{t-i-1} - r^f_{t-i-1} - \Delta s_{t-i} \right)$ must be negative, and so it must be that the carry trade lost money on average in periods $t-j$ to $t$.
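The decomposition of the interest rate differential into past carry trade returns is an accounting identity and can be verified on any simulated path. In the sketch below the timing is an explicit assumption: the carry return realized at date $t-i$ pairs the differential set at $t-i-1$ with the depreciation $\Delta s_{t-i}$, consistent with equation 2.13. Variable names and parameter values are illustrative.

```python
import random

rng = random.Random(2)
alpha, gamma = 0.02, 5.0
g, sigma = 0.0151 / 4, 0.0167 / 2   # assumed quarterly values
T, j = 200, 12

belief_h = belief_f = g
diff = [gamma * (belief_f - belief_h)]   # diff[t] = r^{f,*}_t - r^f_t
ds = [0.0]                               # ds[t] = Delta s_t (ds[0] is unused)
for t in range(1, T + 1):
    dc_h, dc_f = rng.gauss(g, sigma), rng.gauss(g, sigma)
    belief_h = alpha * dc_h + (1 - alpha) * belief_h
    belief_f = alpha * dc_f + (1 - alpha) * belief_f
    ds.append(gamma * (dc_f - dc_h))            # equation (2.11) with power utility
    diff.append(gamma * (belief_f - belief_h))  # equations (2.20)-(2.21)

t = T
# Carry return realized at t - i: differential set at t - i - 1 minus Delta s_{t-i}.
carry_returns = [diff[t - i - 1] - ds[t - i] for i in range(j)]
rebuilt = -alpha * sum(carry_returns) + diff[t - j]
print(diff[t], rebuilt)
```

The rebuilt differential matches the simulated one, so a widening differential over the last $j$ periods necessarily coincides with negative cumulative carry returns over that window.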

My explanation for the carry trade resembles Lewellen and Shanken’s learning based explanation of the value premium (2002). In their model, agents learn about mean dividends in the cross section of stocks with identical mean dividend levels. Value stocks are stocks that have paid low dividends in the recent past, are perceived by agents to have low future dividends, and therefore have a low price. But because actual mean dividends have not decreased, expectations eventually mean revert, thereby making value stocks appreciate in value. The value premium persists because agents cannot foresee this increase in dividend expectations in their subjective predictive distributions, and the book to market ratio tracks the difference between the agent’s subjective belief and objective dynamics. Analogously, agents in my model would earn an excess return by buying currencies of countries with expected growth rates above the true growth rate. But the carry trade premium persists because the agents cannot know what the true growth rates are! Moreover, interest rate differentials track the magnitude of the difference between subjective beliefs and objective dynamics, and therefore forecast expected returns.

True, if agents were fully Bayesian and consumption truly had a constant mean, then learning would eventually stop and therefore no country would have a difference between the subjectively expected growth rate and the true growth rate. But my model has the general implication that so long as the agents’ learning process is not fast enough to pin down the growth rate with absolute certainty, then carry trade premia can exist in the data without being compensation for risk.

2.1.2 Small Belief Distortions Can Look Like Large Risk Aversion

The carry trade Sharpe ratio is independent of risk aversion and linear in the difference in expected growth rates. Dividing equation 2.18 by the volatility of the exchange rate yields a carry trade Sharpe ratio of

$$SR_t[f_{t\to t+1} - s_{t+1}] = \frac{\tilde g^*_t - \tilde g_t}{\sigma\sqrt{2}} \tag{2.24}$$

The law of iterated expectations then gives the unconditional Sharpe ratio.12

$$SR[f_{t\to t+1} - s_{t+1}] = \frac{\sqrt{2\alpha}}{\sqrt{\pi(2-\alpha)}} = \mathrm{Std}(\tilde g_t) \times \frac{1}{\sigma}\sqrt{\frac{2}{\pi}} \tag{2.25}$$

Therefore large Sharpe ratios come from volatile beliefs caused by large gain and consumption volatility, not high risk aversion.
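Equation 2.25 can be evaluated at the calibrated gain and compared with a simulated strategy that always holds the high interest rate currency. The sketch below rests on assumptions that are not stated in this form in the text: one model period is a quarter, Sharpe ratios are annualized by multiplying by $\sqrt{4}$, and the quarterly consumption moments are simple conversions of the annual figures. Function names and the seed are illustrative.

```python
import math
import random

def unconditional_sharpe(alpha):
    """Per-period Sharpe ratio from equation (2.25)."""
    return math.sqrt(2 * alpha) / math.sqrt(math.pi * (2 - alpha))

def simulated_sharpe(n, alpha, gamma, g, sigma, seed=0):
    """Sharpe ratio of going long the currency with the higher interest rate."""
    rng = random.Random(seed)
    belief_h = belief_f = g
    rets = []
    for _ in range(n):
        diff = gamma * (belief_f - belief_h)        # interest rate differential
        dc_h, dc_f = rng.gauss(g, sigma), rng.gauss(g, sigma)
        payoff = diff - gamma * (dc_f - dc_h)       # equation (2.13)
        rets.append(math.copysign(1.0, diff) * payoff)
        belief_h = alpha * dc_h + (1 - alpha) * belief_h
        belief_f = alpha * dc_f + (1 - alpha) * belief_f
    mean = sum(rets) / n
    var = sum((r - mean) ** 2 for r in rets) / n
    return mean / math.sqrt(var)

quarterly = unconditional_sharpe(0.02)
annualized = 2 * quarterly   # sqrt(4) scaling, assuming quarterly periods
simulated = simulated_sharpe(200_000, 0.02, 5.0, 0.0151 / 4, 0.0167 / 2)
print(quarterly, annualized, simulated)
```

Under these assumptions the closed form gives roughly 0.08 per quarter, or about 0.16 annualized, in line with the Sharpe ratio reported above for the simple strategy; the simulated value differs only by sampling noise. Changing `gamma` in the simulation leaves the Sharpe ratio essentially unchanged, since risk aversion scales payoffs and volatility equally.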

The closest rational expectations model that can generate a carry trade premium is an international Lucas model with heteroskedastic consumption growth. At time $t$, let next period consumption volatility at home and abroad equal $\sigma^2_{t+1}$, $\sigma^{*2}_{t+1}$, respectively. In that model the carry trade Sharpe ratio equals

$$SR_t[f_{t\to t+1} - s_{t+1}] = \frac{\gamma\left( \sigma^2_{t+1} - \sigma^{*2}_{t+1} \right)}{2\sqrt{\sigma^2_{t+1} + \sigma^{*2}_{t+1}}} \tag{2.26}$$

A rational expectations econometrician facing a large unconditional carry trade premium, consumption variances that do not differ substantially across countries, and equation 2.26 must conclude that $\gamma$ is very high.

12 The conditional expected premium from going long the high interest rate currency is equal to $|\gamma(\tilde g^*_t - \tilde g_t)|$. Since $\tilde g$, $\tilde g^*$ are independent and both normal with mean $g$ and variance $\frac{\alpha\sigma^2}{2-\alpha}$, then $\gamma(\tilde g^*_t - \tilde g_t) \sim N\left(0, \frac{2\gamma^2\alpha\sigma^2}{2-\alpha}\right)$. By general results on half normals, $E|\gamma(\tilde g^*_t - \tilde g_t)| = \frac{2\gamma\sigma\sqrt{\alpha}}{\sqrt{\pi(2-\alpha)}}$. Then because the volatility of the strategy is always equal to $\gamma\sigma\sqrt{2}$, the unconditional Sharpe ratio becomes the reported result.
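To get a feel for how much risk aversion equation 2.26 demands, one can invert it for $\gamma$. The numbers below are hypothetical: a foreign volatility 10% below the home volatility and a target per-period Sharpe ratio of 0.08 are chosen only for illustration, and the function names are not from the paper.

```python
import math

def re_sharpe(gamma, var_home, var_foreign):
    """Conditional carry Sharpe ratio under rational expectations, equation (2.26)."""
    return gamma * (var_home - var_foreign) / (2 * math.sqrt(var_home + var_foreign))

def implied_gamma(target_sharpe, var_home, var_foreign):
    """Invert equation (2.26) for the risk aversion that delivers target_sharpe."""
    return 2 * target_sharpe * math.sqrt(var_home + var_foreign) / (var_home - var_foreign)

sigma_home = 0.0167 / 2            # assumed quarterly volatility
sigma_foreign = 0.9 * sigma_home   # hypothetical: 10% lower volatility abroad
target = 0.08                      # hypothetical per-quarter Sharpe ratio

gamma_needed = implied_gamma(target, sigma_home**2, sigma_foreign**2)
print(gamma_needed, re_sharpe(gamma_needed, sigma_home**2, sigma_foreign**2))
```

Even with a sizeable volatility gap, the implied risk aversion is well above 100, which illustrates why a rational expectations reading of the carry premium points to very high $\gamma$.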

On the other hand, under subjective beliefs I only need the belief difference $\tilde g^*_t - \tilde g_t$ to exhibit mild volatility. In fact, in my calibration, $\tilde g_t$ is normally distributed with mean 1.51% and standard deviation 0.16%. The full Bayesian solution for beliefs gives a good benchmark for conceptualizing how small belief fluctuations are in my model. First, the ratio between the volatility of beliefs and consumption volatility is $\mathrm{Std}(\tilde g_t)/\sigma = \sqrt{\alpha/(2-\alpha)} \approx 0.10$. Hence for a fully Bayesian agent to have belief volatility comparable to that of the constant gain agent, the fully Bayesian agent would have to observe 25 years = 100 quarters of consumption data. This is a long time for consumption growth parameters to stay stable. Second, one can ask how quickly an agent with fully Bayesian belief formation would reject the constant gain belief. In particular consider the likelihood ratio $L$ defined by
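The comparison with the fully Bayesian benchmark is a two-line calculation. The sketch assumes the Bayesian belief is the sample mean, whose standard deviation after $t$ observations is $\sigma/\sqrt{t}$; the function names are illustrative.

```python
import math

def belief_to_consumption_vol(alpha):
    """Std(g_tilde) / sigma implied by equation (2.5)."""
    return math.sqrt(alpha / (2 - alpha))

def equivalent_bayesian_sample(alpha):
    """Observations t at which sigma / sqrt(t) equals the constant gain belief volatility."""
    return (2 - alpha) / alpha

ratio = belief_to_consumption_vol(0.02)
quarters = equivalent_bayesian_sample(0.02)
print(ratio, quarters, quarters / 4)
```

For $\alpha = 0.02$ the ratio is about 0.10 and the equivalent sample is 99 quarters, which is the roughly 25 years cited in the text.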

$$L = \prod_{t=1}^{T} \frac{p(\Delta c_t \mid \tilde g_t)}{p(\Delta c_t \mid \bar g_t)} = \exp\left\{ -\frac{1}{2\sigma^2} \sum_{t=1}^{T} \left[ (\Delta c_t - \tilde g_t)^2 - (\Delta c_t - \bar g_t)^2 \right] \right\}$$

Assume that the constant gain belief is initialized at $\tilde g_0 = g$, and that beliefs evolve according to equation 2.3. Figure 1 shows the value of $\exp(EL)$, or the expected odds of the growth rate $g$ equaling $\tilde g_t$ instead of equaling the sample mean $\bar g_t$, as the number of observations $t$ increases. This figure shows that the odds decline slowly. Even after 50 years of data, the agent can only say that the likelihood of the constant gain growth rate is 60% of the likelihood of the full Bayesian belief. This is not strong evidence for an agent concerned about robustness to reject the constant gain belief. Nor is there definite evidence preferring the sample mean over the constant gain learning belief even after 100 years, with odds only at 20%.

Subjective beliefs generate premia based on differences in expected first moments, whereas risk based explanations must depend on second moments and therefore the risk aversion parameter. In that sense my work is an extension of Abel (2002), who argued that one potential explanation of the equity risk premium was persistent pessimism on the part of agents. In that case, every percentage point of pessimism about consumption growth would raise the equity premium by an equal amount. Similarly, every percentage point difference in relative expectations about consumption growth in my model raises the conditional Sharpe ratio by an amount independent of $\gamma$. Ultimately, I can exploit mild volatility in beliefs in the place of large volatility in SDFs to generate large carry trade Sharpe ratios.


2.1.3 Subjective Beliefs Can Make Exchange Rates Smooth and Objective SDFs Highly Correlated

Allowing agents to have time varying subjective beliefs can dramatically change the interpretation of standard non-parametric tools used to evaluate rational expectations asset pricing models. One implication for international finance is that subjective beliefs can make measured SDFs based on Hansen and Jagannathan (1991) bounds look far more correlated than what seems to be justified on the basis of consumption growth. Equivalently, beliefs can explain how smooth exchange rates are consistent with large carry trade Sharpe ratios.

Hansen and Jagannathan bounds are non-parametric tools used to connect the levels of risk premia with the variance of marginal utility growth. Intuitively, for assets to earn large excess returns it must mean that agents are highly risk averse. Concretely, what this means is that the maximal attainable Sharpe ratio should be bounded above by the volatility of the agents’ SDFs. Yet my basic model seems to violate this bound. The scaled carry trade earns a Sharpe ratio of 0.21 yet the volatility of the SDF is just $\gamma\sigma \approx 0.08$.

However,Hansen and Jagannathan bounds apply only to an SDF that prices returns under objective

probabilities, whereas agents in my model price returns under subjective beliefs. Define an objective SDF ˆ Mt+1 ˆ Mt+1= ˜ ft+1 f ×Mt+1 (2.27)

Define the experienced marginal utility growth $M_{t+1}$ as the subjective SDF. Rewrite the subjective Euler equation 2.6 under objective probabilities as:

$$1 = \tilde{E}\left[M_{t+1} R_{t+1}\right] = E\left[\frac{\tilde{f}_{t+1}}{f} M_{t+1} R_{t+1}\right] = E\left[\hat{M}_{t+1} R_{t+1}\right] \tag{2.28}$$

where $\tilde{f}_{t+1}/f$ is the ratio of the subjective belief and objective probability density functions. Equation 2.28 shows how the objective SDF earns its name: it prices returns under the objective probability distribution $f$. The objective SDF is the product of the belief distortion and the subjective SDF. Cogley and Sargent (2008), in their paper on the equity risk premium, emphasize that Hansen and Jagannathan bounds apply only to objective SDFs. Table 1 shows that while the Hansen and Jagannathan bound does not apply to the subjective SDF, which has a volatility of 0.08, it does apply to the objective SDF, which has a volatility of 0.30. Therefore high Sharpe ratios in a model with subjective beliefs indicate only that either beliefs or marginal utility growth must be volatile.
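The change of measure behind the objective SDF can be checked numerically. The sketch below is only an illustration under assumed values: a one-period economy with normally distributed log consumption growth, power utility, and a hypothetical gap between the believed and true mean growth rates (none of these numbers come from the paper's calibration). It prices a payoff under subjective beliefs and then confirms that the Euler equation holds under objective probabilities once the belief distortion multiplies the subjective SDF, and fails without it.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def integrate(fn, lo, hi, n=20001):
    """Trapezoid rule on an evenly spaced grid."""
    h = (hi - lo) / (n - 1)
    total = 0.5 * (fn(lo) + fn(hi))
    for k in range(1, n - 1):
        total += fn(lo + k * h)
    return total * h

# Illustrative quarterly values (assumptions, not the paper's calibration).
delta, gamma = 0.995, 5.0
g_true, g_belief, sigma = 0.004, 0.006, 0.008

def sdf(dc):
    """Subjective SDF: experienced marginal utility growth under power utility."""
    return delta * math.exp(-gamma * dc)

def distortion(dc):
    """Belief distortion: subjective density divided by objective density."""
    return normal_pdf(dc, g_belief, sigma) / normal_pdf(dc, g_true, sigma)

lo, hi = g_true - 12 * sigma, g_true + 12 * sigma

# Price the payoff exp(dc) under subjective beliefs and form its gross return.
price = integrate(lambda x: sdf(x) * math.exp(x) * normal_pdf(x, g_belief, sigma), lo, hi)

def gross_return(dc):
    return math.exp(dc) / price

# Objective expectation of (distortion * subjective SDF * return): should be 1.
euler_objective = integrate(
    lambda x: distortion(x) * sdf(x) * gross_return(x) * normal_pdf(x, g_true, sigma), lo, hi)
# Objective expectation using the subjective SDF alone: generally not 1.
euler_naive = integrate(
    lambda x: sdf(x) * gross_return(x) * normal_pdf(x, g_true, sigma), lo, hi)
```

In this lognormal setup the second expectation equals $\exp((1-\gamma)(g - \tilde g))$ rather than one, which is the pricing error an econometrician would attribute to the SDF if she ignored the belief distortion.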

Cogley and Sargent's insight, when applied to international finance, can explain tightly correlated international SDFs and smooth exchange rates. Note that the tight connection between exchange rates and SDFs in equation 2.11 implies

$$\underbrace{\mathrm{Var}(\Delta s)}_{\text{Small}} = \underbrace{\mathrm{Var}(m_{t+1})}_{\text{Large}} + \underbrace{\mathrm{Var}(m^*_{t+1})}_{\text{Large}} - 2\,\underbrace{\mathrm{Cov}(m_{t+1}, m^*_{t+1})}_{\text{Large?}} \tag{2.29}$$

Given annual Sharpe ratios above 0.40 on international stock markets, Hansen and Jagannathan bounds imply that the stochastic discount factors within countries must have standard deviations of around 40%. Yet changes in log exchange rates have an annual standard deviation of only 10%. Therefore log stochastic discount factors must be highly correlated across countries.13 But observable risk factors, such as consumption growth, have low correlations. These facts lead to the exchange rate smoothness puzzle: given plausible levels of correlation between SDFs, exchange rates in the data are too smooth (Brandt et al., 2006). But because equation 2.29 applies to objective SDFs, it only describes moment restrictions on the combination of the log subjective SDF and the belief distortion. Therefore the high covariation in SDFs implied by equation 2.29 can simply be an artefact of a common time varying belief distortion. This belief distortion reflects the common deviation of beliefs shared by domestic and foreign agents from objective dynamics.
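The correlation implied by equation 2.29 follows from simple arithmetic. The sketch below assumes, as in the text, that the change in the log exchange rate equals the difference of the two log SDFs; the function name is hypothetical and the inputs are the round numbers quoted above.

```python
def implied_sdf_correlation(sd_m, sd_m_star, sd_ds):
    """Correlation of log SDFs implied by Var(ds) = Var(m) + Var(m*) - 2 Cov(m, m*)."""
    cov = 0.5 * (sd_m ** 2 + sd_m_star ** 2 - sd_ds ** 2)
    return cov / (sd_m * sd_m_star)

# SDF volatility of 40% in each country, exchange rate volatility of 10%.
corr = implied_sdf_correlation(0.40, 0.40, 0.10)
```

With these inputs the implied correlation is roughly 0.97, far above what consumption growth correlations would suggest.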

Studying the prices of Arrow securities on domestic and foreign outcomes shows how this common belief distortion has nothing to do with the relationship between shocks in the two countries. Let states be discrete, and consider the price of an Arrow security that pays out 1 unit of domestic consumption if a certain pair of domestic and foreign states $(s, s^*)$ occurs. Domestic agents are willing to pay $\tilde{f}_{t+1}(s, s^*)\,M_{t+1}(s)$. Such a security pays out $S_{t+1}$ units of foreign consumption, and therefore foreign agents would be willing to pay $S_{t+1}\,\tilde{f}_{t+1}(s, s^*)\,M^*_{t+1}(s^*)$. When the rational expectations econometrician computes what values the SDF takes on in these states, she will divide by the objective probabilities and payoffs to get that the foreign SDF takes on a value of $\frac{\tilde{f}_{t+1}}{f} M^*_{t+1} = \hat{M}^*_{t+1}$ and that the domestic SDF takes on a value of $\frac{\tilde{f}_{t+1}}{f} M_{t+1} = \hat{M}_{t+1}$. Both contain the common belief distortion $\tilde{f}_{t+1}/f$. This proof relied on complete markets and on domestic and foreign agents agreeing on the international distribution of consumption growth. Even if agents believe that foreign and domestic states are independent, the rational expectations econometrician will still recover SDFs that share a common belief distortion $\tilde{f}_{t+1}/f$.
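The Arrow security argument can be illustrated with a two-state example per country. The sketch below is a stylized assumption, not the paper's model: states, growth rates, preference parameters, and the degree of optimism are all hypothetical, and prices are quoted per unit of each agent's own consumption. It backs out the SDFs an econometrician would recover by dividing Arrow prices by objective probabilities, and compares correlations.

```python
import math
from itertools import product

states = (0, 1)                      # 0 = low growth, 1 = high growth
growth = {0: -0.008, 1: 0.008}       # hypothetical quarterly log growth
delta, gamma = 0.995, 5.0            # illustrative preference parameters

# Objective probabilities: independent, equally likely states in each country.
f_obj = {(s, z): 0.25 for s, z in product(states, states)}
# Shared subjective beliefs: optimistic about both countries, still independent.
p_subj = {0: 0.3, 1: 0.7}
f_subj = {(s, z): p_subj[s] * p_subj[z] for s, z in product(states, states)}

def subjective_sdf(state):
    """Power utility marginal utility growth in an own-country state."""
    return delta * math.exp(-gamma * growth[state])

# Subjective log SDFs depend only on the own-country state.
log_m = {k: math.log(subjective_sdf(k[0])) for k in f_obj}
log_m_star = {k: math.log(subjective_sdf(k[1])) for k in f_obj}
# Arrow price = subjective probability * subjective SDF; the econometrician
# divides by the objective probability to recover the objective SDF.
log_m_hat = {k: math.log(f_subj[k] * subjective_sdf(k[0]) / f_obj[k]) for k in f_obj}
log_m_hat_star = {k: math.log(f_subj[k] * subjective_sdf(k[1]) / f_obj[k]) for k in f_obj}

def correlation(x, y, prob):
    """Correlation of two state-contingent variables under probabilities prob."""
    mean_x = sum(prob[k] * x[k] for k in prob)
    mean_y = sum(prob[k] * y[k] for k in prob)
    cov = sum(prob[k] * (x[k] - mean_x) * (y[k] - mean_y) for k in prob)
    var_x = sum(prob[k] * (x[k] - mean_x) ** 2 for k in prob)
    var_y = sum(prob[k] * (y[k] - mean_y) ** 2 for k in prob)
    return cov / math.sqrt(var_x * var_y)

corr_subjective = correlation(log_m, log_m_star, f_obj)
corr_objective = correlation(log_m_hat, log_m_hat_star, f_obj)
```

In this example the subjective SDFs are uncorrelated across countries, while the recovered objective SDFs are almost perfectly correlated because both inherit the same distortion term.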

Fortunately, the foreign belief distortion does not affect the price of all assets. In the case of independent shocks, the foreign belief distortion does not play a role in pricing assets based purely off of domestic shocks; Terran assets do not depend on belief distortions about Martian consumption growth. When pricing domestic payoffs under the objective measure, the foreign densities in the objective SDF decomposition 2.27 will factor out and have expectation 1 by construction. In the general case of correlated shocks, I can instead replace the ratio of foreign densities with the ratio of conditional distributions of foreign shocks, in which case they will again factor out and have no effect on assets based on purely domestic shocks.14 For purely domestic assets, foreign belief distortions only matter to the extent that they convey information for domestic consumption shocks.

13 Let $m, m^*$ denote the domestic and foreign log SDFs. Then $\mathrm{Cov}(m, m^*) = \frac{1}{2}\left[\mathrm{Var}(m) + \mathrm{Var}(m^*) - \mathrm{Var}(m - m^*)\right] = \frac{1}{2}\left[0.40^2 \times 2 - 0.10^2\right] = 0.155$. Therefore $\mathrm{Corr}(m, m^*) = 0.155 / 0.40^2 \approx 0.97$.

The common belief distortion term arises when one tries to find a pair of SDFs $(M, M^*)$ that successfully prices international assets under the objective joint distribution represented by $f_{t+1}$. In this case the econometrician will instead pick up a common belief distortion about the joint distribution of shocks, thereby making objective SDFs highly correlated even when subjective SDFs are independent. This analysis is borne out in my simulations. The SDF moments section of table 1 shows that while the subjective SDFs are uncorrelated, the objective SDFs with the belief distortion term have a correlation of 0.92. Thus subjective beliefs help explain why exchange rates can be so smooth by changing what it means for objective SDFs to be correlated. Correlated objective SDFs need not reflect shared risk, but rather shared fluctuating beliefs.

3 Correlated Learning and Subjective Long Run Risks

I extend my model to include Epstein-Zin (EZ) preferences and correlated learning to find a way to match high equity Sharpe ratios as well. Agents in my model price a "subjective" long run risk: namely the risk that their beliefs about average consumption growth fluctuate over time. This extension to Epstein-Zin preferences helps to illustrate the limits of models of subjective beliefs in solving open economy asset pricing puzzles simultaneously with domestic asset pricing puzzles.

14 Let $s, s^*$ denote the shock variables for home and foreign. Let $g(s)$ denote the marginal distribution of the home shock and $g^*(s^* \mid s)$ be the density of the foreign distribution conditional on the domestic shock. Therefore the joint distribution is $g(s) \cdot g^*(s^* \mid s)$. Let tildes denote subjective belief distributions. Then if the payoff $X_{t+1}$ depends only on $s$, then

3.1 A Model of Correlated Learning

I first derive formulas for correlated multivariate constant gain learning. My strategy is to show how the one dimensional version of constant gain learning can be derived from Bayesian preliminaries, and to then adapt this strategy for the multivariate case.

Let the agent have a normal prior for the mean growth rate $g$ with a mean of $\mu$ and variance $\tau^{-2}$. She samples consumption growth $\Delta c_t$ from a normal distribution with unknown mean but known variance $\sigma^2$. The agent then forms a sequence of posterior means $\{\tilde{g}_t\}$ for the mean growth rate that can be defined recursively:

$$\tilde{g}_t = \frac{1}{t + \sigma^2\tau^2}\,\Delta c_t + \left(1 - \frac{1}{t + \sigma^2\tau^2}\right)\tilde{g}_{t-1} \tag{3.1}$$

$$\tilde{g}_0 = \mu \tag{3.2}$$

To arrive at the constant gain learning formula 2.3, I first change the sample size $t$ to the effective sample size $\alpha^{-1}$ implied by constant gain learning, and then let the precision $\tau \to 0$. This establishes a model of constant gain learning that is independent of the prior.
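The two steps above can be sketched in a few lines. The code below is an illustration with made-up data and parameter values; `tau2` stands for the prior precision $\tau^2$, and the function names are hypothetical. It runs the Bayesian recursion 3.1, compares it with the closed-form conjugate posterior mean, and then applies the constant gain update obtained by replacing $t$ with $\alpha^{-1}$ and sending $\tau \to 0$.

```python
def bayes_posterior_means(data, mu, sigma2, tau2):
    """Recursive posterior means from equations 3.1 and 3.2."""
    g, path = mu, []
    for t, dc in enumerate(data, start=1):
        gain = 1.0 / (t + sigma2 * tau2)   # gain shrinks as the sample grows
        g = gain * dc + (1.0 - gain) * g
        path.append(g)
    return path

def constant_gain_means(data, g0, alpha):
    """Constant gain learning: the gain stays at alpha in every period."""
    g, path = g0, []
    for dc in data:
        g = alpha * dc + (1.0 - alpha) * g
        path.append(g)
    return path

# Hypothetical quarterly growth observations and prior parameters.
data = [0.010, 0.002, 0.007, -0.001, 0.005]
mu, sigma2, tau2 = 0.004, 0.008 ** 2, 1.0 / 0.002 ** 2

recursive = bayes_posterior_means(data, mu, sigma2, tau2)
# Conjugate normal posterior mean computed in one shot.
closed_form = (sigma2 * tau2 * mu + sum(data)) / (sigma2 * tau2 + len(data))
constant_gain = constant_gain_means(data, mu, alpha=0.02)
```

The Bayesian gain falls toward zero as observations accumulate, whereas the constant gain learner keeps discounting old data at rate $1 - \alpha$, which is what keeps beliefs volatile.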

I use the same strategy in the multivariate case. Impose a prior on the vector of mean growth rates $\mathbf{g}$ with a mean of $\boldsymbol{\mu}$ but with prior variance $\Upsilon = \tau^{-2}\begin{pmatrix} 1 & \phi \\ \phi & 1 \end{pmatrix}$. In the multivariate case, there is a new parameter to specify: the correlation $\phi$ of beliefs in the prior. A value close to 1 communicates the idea that the agent strongly believes the true means for consumption growth in the two countries are close together and therefore induces correlated belief updates. The agent samples consumption growth $\Delta\mathbf{c}_t$ drawn from the multivariate normal distribution with unknown mean but known variance $\Sigma = \sigma^2 I$. The full Bayesian solution implies that the posterior means $\{\tilde{\mathbf{g}}_t\}$ for the mean growth rate evolve as follows:

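$$\tilde{\mathbf{g}}_t = M\,\Delta\mathbf{c}_t + (I - M)\,\tilde{\mathbf{g}}_{t-1} \tag{3.3}$$

$$M = \left(\Sigma\Upsilon^{-1} + tI\right)^{-1} \tag{3.4}$$

$$\tilde{\mathbf{g}}_0 = \boldsymbol{\mu} \tag{3.5}$$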

Replace $t$ with $\alpha^{-1}$ and let the precision go to zero. Appendix C shows that for fixed values of $\phi < 1$, the updating equation reduces to independent learning.15 As precision $\tau \to 0$, the agent puts no weight on the prior. Therefore $\phi$ does not matter, as the known independence in realized consumption overwhelms any motivation to update beliefs in a correlated fashion.

15 For $\phi = 1$, the prior corresponds to a dogmatic belief that mean growth rates in the two countries are equal, and is thereby successful at replicating perfectly correlated beliefs. However, since I want to be able to model beliefs with arbitrary correlation in (0, 1), I develop further methods.

I resolve this problem by instead modeling the belief correlation $\phi$ with local to unity asymptotics $\phi \sim 1 - c\tau^2 + o(\tau^2)$ for some $c > 0$, and then letting $\tau \to 0$. Intuitively, this allows me to impose a strong prior that growth rates are correlated without also determining what each of the growth rates should be. The particular value of the hyperparameter $c$ can be chosen to match any arbitrary belief correlation between 0 and 1. Appendix C outlines the derivations of the updating formulas and the theoretical variances and correlations of beliefs. The resulting update equations take the form

$$\tilde{\mathbf{g}}_t = M\,\Delta\mathbf{c}_t + (I - M)\,\tilde{\mathbf{g}}_{t-1} \tag{3.6}$$

$$M = \left(\frac{\sigma^2}{2c}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} + \alpha^{-1} I\right)^{-1} \tag{3.7}$$

$$\tilde{\mathbf{g}}_0 = \boldsymbol{\mu} \tag{3.8}$$
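To see how the hyperparameter $c$ shapes the update, the sketch below builds the gain matrix in equation 3.7 using the analytic inverse of a symmetric 2x2 matrix and applies one update step from equation 3.6. The parameter values are illustrative assumptions and the function names are hypothetical.

```python
def gain_matrix(sigma2, c, alpha):
    """Equation 3.7 via the analytic inverse of [[k + b, -k], [-k, k + b]]."""
    k, b = sigma2 / (2.0 * c), 1.0 / alpha
    det = b * (2.0 * k + b)
    return [[(k + b) / det, k / det], [k / det, (k + b) / det]]

def update_beliefs(M, beliefs, realized_growth):
    """Equation 3.6 rewritten as old beliefs plus M times the growth surprise."""
    surprise = [realized_growth[j] - beliefs[j] for j in range(2)]
    return [beliefs[i] + sum(M[i][j] * surprise[j] for j in range(2))
            for i in range(2)]

sigma2, alpha = 0.008 ** 2, 0.02          # illustrative quarterly values
M_loose = gain_matrix(sigma2, 1e-3, alpha)  # large c: close to independent learning
M_tight = gain_matrix(sigma2, 1e-9, alpha)  # small c: close to pooled learning

# A positive foreign surprise with no domestic surprise.
after = update_beliefs(M_tight, [0.004, 0.004], [0.004, 0.010])
```

Each row of the gain matrix sums to $\alpha$ for any $c$, so the hyperparameter only controls how the total gain is split between own-country and foreign surprises: large $c$ pushes $M$ toward $\alpha I$, and small $c$ pushes it toward equal weights of $\alpha/2$.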

Although equation 3.6 indicates that the domestic agent will use realizations of foreign consumption growth to update beliefs, she nonetheless perceives that her subjective beliefs will evolve as a martingale. An important corollary is that future beliefs about domestic consumption, $\tilde{g}_{t+1}$, are independent of current beliefs about foreign consumption, $\tilde{g}^*_t$. Therefore the pricing of the domestic consumption claim does not depend on beliefs about foreign consumption growth.

To show that $\tilde{g}_{t+1} \perp \tilde{g}^*_t$, observe that the future belief is a normal random variable. The variance is already pinned down by known parameters, so to show that the two variables are independent, it suffices to show that $E_t[\tilde{g}_{t+1}] \perp \tilde{g}^*_t$. There are two ways to arrive at this result. First, the evolution of future beliefs under current beliefs must be a martingale. Otherwise the agent would have revised her current period belief. Therefore $E_t[\tilde{g}_{t+1}] = \tilde{g}_t$, which is already determined and therefore does not depend on $\tilde{g}^*_t$. I can also verify directly with algebra. Write $\tilde{\mathbf{g}}_t = \begin{pmatrix} \tilde{g}_t & \tilde{g}^*_t \end{pmatrix}^T$, where the star denotes the posterior mean for the foreign consumption distribution. Let $M_{ij}$ denote the $ij$ element of matrix $M$. Expanding equation 3.6 shows that the expected value of $\tilde{g}_{t+1}$ under subjective beliefs is:

$$\tilde{E}_t\left[M_{11}\Delta c_{t+1} + M_{12}\Delta c^*_{t+1}\right] + (1 - M_{11})\,\tilde{g}_t - M_{12}\,\tilde{g}^*_t \tag{3.9}$$

$$= \tilde{g}_t + M_{11}\left(\tilde{E}_t[\Delta c_{t+1}] - \tilde{g}_t\right) + M_{12}\left(\tilde{E}_t[\Delta c^*_{t+1}] - \tilde{g}^*_t\right) \tag{3.10}$$

$$= \tilde{g}_t \tag{3.11}$$

which does not depend on $\tilde{g}^*_t$.

3.2 Model Description

I now apply the new learning process to a model of priced parameter uncertainty. Consumption is again independent across countries and time. Agents now have Epstein-Zin preferences, and so the log SDF is:

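$$\log m_{t+1} = \theta\log\delta - \gamma\,\Delta c_{t+1} + (\theta - 1)\log\frac{PC_{t+1} + 1}{PC_t} \tag{3.12}$$

where $\psi$ is the elasticity of intertemporal substitution, $\gamma$ is relative risk aversion over consumption growth, $\theta = \frac{1 - \gamma}{1 - \psi^{-1}}$, and $PC_t$ is the price to consumption ratio at time $t$.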
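Equation 3.12 is straightforward to evaluate once the price to consumption ratios are known. The sketch below is a direct transcription with hypothetical function names and illustrative inputs; solving for the price to consumption ratio itself is the job of the numerical method in Appendix D and is not attempted here.

```python
import math

def ez_theta(gamma, psi):
    """theta = (1 - gamma) / (1 - 1/psi)."""
    return (1.0 - gamma) / (1.0 - 1.0 / psi)

def ez_log_sdf(dc_next, pc_now, pc_next, delta, gamma, psi):
    """Log Epstein-Zin SDF from equation 3.12, given price to consumption ratios."""
    theta = ez_theta(gamma, psi)
    wealth_term = math.log((pc_next + 1.0) / pc_now)
    return theta * math.log(delta) - gamma * dc_next + (theta - 1.0) * wealth_term

# Parameter values from the text: delta = 0.995, psi = 1.5, gamma = 10.
theta_high_ra = ez_theta(10.0, 1.5)
# Illustrative (made-up) growth and price to consumption ratios.
example = ez_log_sdf(0.005, 80.0, 81.0, 0.995, 10.0, 1.5)
# With gamma = 1/psi, theta = 1 and the SDF collapses to power utility.
power_limit = ez_log_sdf(0.005, 80.0, 81.0, 0.995, 5.0, 0.2)
```

Because $\theta - 1$ is large in magnitude when $\gamma = 10$ and $\psi = 1.5$, even small belief-driven movements in the price to consumption ratio translate into large movements in the SDF.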

Again, agents price assets using the subjective Euler equation. Beliefs evolve according to the update equations 3.6. Agents incorporate parameter uncertainty into prices. They are aware that their belief about consumption growth may change in the future, and price this risk. Collin-Dufresne et al. (2015) show that incorporating priced consumption risk can substantially raise Sharpe ratios when agents have Epstein-Zin preferences. Exchange rates and forward contracts are determined by complete markets and covered interest parity, just as in the power utility case. Appendix D details the numerical solution methodology.

3.3 Discussion of Results

I simulate the model for 40,000 quarters. I set time preference $\delta = 0.995$ and $\alpha = 0.02$ just as in the power utility case. I set $\psi = 1.5$ to match parameterizations seen in the long run risks literature. I consider two cases for risk aversion, $\gamma = 5$ and $\gamma = 10$, as well as three possibilities for the correlation between beliefs. I choose to model zero correlation to match the power utility model, a perfect correlation in order to see how much traction the model can get on exchange rate smoothness, as well as an intermediate correlation of 0.77 to match the cross-country correlation in log dividend price ratios reported in Colacito and Croce (2011).

Table 2 shows that Epstein-Zin preferences raise Sharpe ratios on the wealth claim. But the table also illustrates the tensions that arise from trying to match other moments.

First, observe that carry trade Sharpe ratios are always highest in the case of uncorrelated beliefs and lowest in the case of perfectly correlated beliefs. This is to be expected as the carry trade premium arises in my model because of differences in beliefs about consumption growth. When belief updates become correlated these differences become smaller, and so Sharpe ratios fall.

But focusing on the case of uncorrelated belief updates then makes exchange rates highly volatile. In the Epstein-Zin model, both the subjective and objective SDFs are driven by volatile price consumption ratios, which are ultimately tied to fluctuating beliefs. Since exchange rates are determined by differences in subjective SDFs, exchange rates are inevitably volatile. In contrast, the power utility model had a smooth subjective SDF and generated a volatile objective SDF through a belief distortion term. Since that belief distortion term does not enter into the exchange rate, the power utility model was able to have large carry trade Sharpe ratios alongside smooth exchange rates.

There is also a tension between exchange rate smoothness and large equity Sharpe ratios. To get close to the large equity Sharpe ratios in the data, one would need to consider a risk aversion $\gamma$ of at least 10. But in those cases, the exchange rate is volatile even when the two countries have perfectly correlated beliefs. The exchange rate volatility arises entirely from the consumption growth portion of the Epstein-Zin SDF, and is sufficient to result in exchange rate volatility twice that seen in the data.

Thus the exchange rate smoothness puzzle for models of subjective beliefs can be restated as:

Large equity Sharpe ratios tell us that beliefs must be highly volatile. But given the low correlation in beliefs implied by large carry trade Sharpe ratios, exchange rates are puzzlingly smooth.

The logic behind this statement can best be captured by re-interpreting the exchange rate smoothness bound 2.29 as a set of moment restrictions on the subjective SDFs. This is possible because the relationship between the exchange rate and SDFs applies to both subjective and objective SDFs. If I allow beliefs to be uncorrelated, then the subjective SDFs are independent. In that case the variance of the exchange rate is equal to the sum of the subjective SDF variances. If I insist that beliefs are unconditionally unbiased, then to get large wealth Sharpe ratios the subjective SDF must be highly volatile. But that means exchange rates are no longer smooth.
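As a rough worked check of this logic, apply equation 2.29 to two symmetric countries whose log subjective SDFs are exactly uncorrelated. The numbers substituted below are the rounded entries of tables 1 and 2, so the comparison is only approximate, and the implied Epstein-Zin SDF volatility is an inference from the reported exchange rate volatility rather than a figure reported in the paper.

```latex
\begin{align*}
\mathrm{Var}(\Delta s) &= \mathrm{Var}(m) + \mathrm{Var}(m^*) - 2\,\mathrm{Cov}(m, m^*)
  = 2\,\mathrm{Var}(m)
  \quad\Longrightarrow\quad \sigma(\Delta s) = \sqrt{2}\,\sigma(m), \\
\text{Power utility (table 1):}\quad
  & \sigma(m) = 0.08 \;\Rightarrow\; \sigma(\Delta s) \approx \sqrt{2} \times 0.08 \approx 0.11
  \quad (\text{reported: } 0.12), \\
\text{Epstein-Zin, } \gamma = 5,\ \rho = 0 \text{ (table 2):}\quad
  & \sigma(\Delta s) = 0.44 \;\Rightarrow\; \sigma(m) \approx 0.44 / \sqrt{2} \approx 0.31 .
\end{align*}
```

Under these assumptions, any subjective SDF volatile enough to support a large wealth Sharpe ratio mechanically produces exchange rate volatility about 1.4 times as large.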

4 Empirical Test: Growth Carry

My models predict that sorting currencies by exponential moving averages of past consumption growth rates should deliver comparable returns to the standard carry trade that sorts based on interest rates. I investigate this prediction for a panel of 14 developed world countries. I focus on developed world countries because my theory only applies to countries with relatively similar growth rates that do not vary substantially over time. Appendix F contains further details of the data used for this analysis. I initialize each country's growth rate expectation at 2.32%, the pre-1980 average per capita consumption growth across all the developed countries in my sample.16 Then for each country, I recursively generate beliefs according to the constant gain learning formula 2.3. This generates a sequence of growth rate beliefs $\tilde{g}_{i,t}$ for country $i = 1, \dots, N$ at time $t$. I also compute forward discounts $fd_{i,t} = f_{i,t \to t+1} - s_{i,t}$ that correspond to interest rate differentials.

I define two kinds of carry trades. First is the standard interest rate carry, which enters into positions $fd_{i,t} - \frac{1}{N}\sum_{j=1}^{N} fd_{j,t}$ in each currency $i$ at time $t$. This strategy takes larger positions in currencies that have high forward discounts relative to other currencies in the cross section, and corresponds to a continuous version of the portfolio strategy specified in Lustig and Verdelhan (2007). Second, I define a growth carry, which enters into positions $\tilde{g}_{i,t} - \frac{1}{N}\sum_{j=1}^{N} \tilde{g}_{j,t}$ in each currency $i$ at time $t$. This strategy takes larger positions in currencies of countries that have high expected growth rates relative to other currencies in the cross section. If true growth rates were constant and growth rate expectations evolved exactly according to my constant gain specification, then these sorts would be identical.
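The position rules above can be sketched as follows. The country labels, growth histories, and forward discounts are made up for illustration; only the initial belief of 2.32% and the gain of 0.02 are taken from the text, and the helper names are hypothetical.

```python
def constant_gain_belief(growth_history, initial_belief, alpha):
    """Run the constant gain recursion through a history of growth observations."""
    belief = initial_belief
    for dc in growth_history:
        belief = alpha * dc + (1.0 - alpha) * belief
    return belief

def demeaned_positions(signal_by_currency):
    """Position in each currency: its signal minus the cross-sectional mean."""
    mean_signal = sum(signal_by_currency.values()) / len(signal_by_currency)
    return {name: value - mean_signal for name, value in signal_by_currency.items()}

# Hypothetical annualized growth rates (%) and forward discounts (%).
growth_pct = {"AAA": [3.0, 3.5, 2.8], "BBB": [1.0, 0.5, 1.2], "CCC": [2.3, 2.4, 2.2]}
forward_discount_pct = {"AAA": 1.5, "BBB": -0.5, "CCC": 0.4}

beliefs = {name: constant_gain_belief(path, 2.32, 0.02)
           for name, path in growth_pct.items()}
growth_carry = demeaned_positions(beliefs)
interest_carry = demeaned_positions(forward_discount_pct)
```

Both strategies are zero-investment by construction, since the demeaned positions sum to zero in every period; they differ only in the signal used to rank currencies.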

Figure 2 shows that these two strategies share similar periods of gains and losses. This suggests that high interest rate currencies correspond to those with high expected growth rates. While the growth carry earns a Sharpe ratio of 0.26, the interest rate carry earns a Sharpe ratio of 0.40. Given that estimates based on macro data are inevitably noisier, and given that my constant gain learning process inevitably suffers from misspecification, these comparable Sharpe ratios indicate that differences in growth rate expectations do a good job of explaining the dispersion of forward discounts and carry trade returns.

This growth carry is a novel empirical test that helps to distinguish competing stories of the carry trade. For example, the long run risks model of Bansal and Shaliastovich (2012) and the rare disasters model of Farhi and Gabaix (2014) do not feature expected returns that depend on long moving averages of past consumption growth. In fact, in the domestic asset pricing literature, it is an advantage of the long run risks model that past consumption growth does not forecast the state variable (the price to consumption ratio) that forecasts expected returns. But this strength of the long run risks model in the domestic realm is a weakness in resolving the carry trade. In the case of the carry trade, the state variable that forecasts expected returns, the interest rate differential, is predictable on the basis of past consumption growth. Therefore the growth carry is support for my beliefs based story as well as the habits based story of Verdelhan (2010).

In table 3, I provide a summary of the empirical performance of rational expectations solutions to the carry trade as well as a subset of my belief based stories. Overall, the beliefs model with power utility is the only model to match all of the foreign exchange facts: large carry trade returns, smooth exchange rates, and a growth carry. However, it fails to match a large equity Sharpe ratio. When one imposes the requirement that models must also match a large equity Sharpe ratio, then the model of beliefs that matches the most facts is the Epstein-Zin model with high risk aversion $\gamma = 10$. However, in that case exchange rates are no longer smooth, and my model with Epstein-Zin preferences and priced parameter uncertainty matches the same set of facts as the habits model.

16 The exact level of this belief is not very important because the carry trades that I study are all based on the cross section of growth rates.

5 Conclusion

I extend the modern asset pricing literature on parameter learning to the open economy and show how errors in beliefs about mean consumption growth can provide a plausible account of persistent and large carry trade premia. In my model, the carry trade is profitable not because of differences in conditional SDF variances, but rather because of differences in beliefs about mean consumption growth in different countries. I also show how incorporating subjective beliefs in open economy asset pricing models can explain how smooth exchange rates are consistent with large carry trade Sharpe ratios. I then extend the model to the case of Epstein-Zin preferences, and find that my simple model of international learning cannot jointly match high equity Sharpe ratios, high carry trade Sharpe ratios, and smooth exchange rates. My beliefs based stories predict that a growth carry that sorts countries based on expected growth rates instead of interest rates should earn returns comparable to the interest rate carry, and I find support for this in the data. I show how this test distinguishes my model relative to existing rational expectations models.

A key extension of my model would be to incorporate how beliefs may interact with incomplete markets. Given that learning seems to have traction even in simple consumption environments, one could extend the rational expectations analysis of Lustig and Verdelhan (2016) to see how incomplete market wedges combined with beliefs can help match key moments of international asset returns. Another direction would be to find ways to model the non-linearities in carry trade returns. While existing explanations of the skewness of currency returns focus on intermediary constraints (Brunnermeier et al., 2008), it is an open question if and how learning might generate similar results.


References

Abel, Andrew, “An exploration of the effects of pessimism and doubt on asset returns,” Journal of Economic Dynamics and Control, 2002, 26.

Backus, David K, Silverio Foresi, and Chris I Telmer, “Affine Term Structure Models and the Forward Premium Anomaly,” Journal of Finance, February 2001, 56 (1).

Bansal, Ravi and Ivan Shaliastovich, “A Long-Run Risks Explanation of Predictability Puzzles in Bond and Currency Markets,” Review of Financial Studies, 2012, 26 (1), 1–33.

Brandt, Michael W., John H. Cochrane, and Pedro Santa-Clara, “International risk sharing is better than you think, or exchange rates are too smooth,” Journal of Monetary Economics, 2006, 53, 671–698.

Brunnermeier, Markus K., Lasse Pedersen, and Stefan Nagel, Carry Trades and Currency Crashes, Vol. 23 of NBER Macroeconomics Annual, University of Chicago Press,

Cogley, Timothy and Thomas J Sargent, “Market price of risk and the equity premium: A legacy of the Great Depression,” Journal of Monetary Economics, 2008, 55, 454–476.

Colacito, Riccardo and Mariano M. Croce, “Risks for the Long Run and the Real Exchange Rate,” Journal of Political Economy, 2011, 119 (1), 153–182.

Collin-Dufresne, Pierre, Michael Johannes, and Lars A. Lochstoer, “Parameter Learning in General Equilibrium: The Asset Pricing Implications,” Working Paper, 2015.

Farhi, Emmanuel and Xavier Gabaix, “Rare Disasters and Exchange Rates,” Working Paper, October 2014.

Hansen, Lars Peter and Ravi Jagannathan, “Implications of Security Market Data for Models of Dynamic Economies,” Journal of Political Economy, April 1991, 99 (2), 225–262.

Lewellen, Jonathan and Jay Shanken, “Learning, Asset-Pricing Tests, and Market Efficiency,” Journal of Finance, June 2002, 57 (3), 1113–1145.

Lustig, Hanno and Adrien Verdelhan, “The Cross Section of Foreign Currency Risk Premia and Consumption Growth Risk,” American Economic Review, March 2007.

Lustig, Hanno and Adrien Verdelhan, “Comment on "Carry Trades and Currency Crashes",” in Daron Acemoglu, Kenneth Rogoff, and Michael Woodford, eds., NBER Macroeconomics Annual 2008, Vol. 23, 2008, pp. 361–384.

Lustig, Hanno and Adrien Verdelhan, “Does Incomplete Spanning in International Financial Markets Help to Explain Exchange Rates?,” NBER Working Paper, February 2016.

Lustig, Hanno, Nikolai Roussanov, and Adrien Verdelhan, “Common Risk Factors in Currency Markets,” The Review of Financial Studies, November 2011, 24 (11), 3731–3777.

Malmendier, Ulrike and Stefan Nagel, “Learning from Inflation Experiences,” Quarterly Journal of Economics, October 2015.

Milani, Fabio, “Expectations, learning and macroeconomic persistence,” Journal of Monetary Economics, January 2007, 54, 2065–2082.

Pastor, Lubos and Pietro Veronesi, “Learning in Financial Markets,” Annual Review of Financial Economics, 2009.

Verdelhan, Adrien, “A Habit-Based Explanation of the Exchange Rate Premium,” Journal of Finance, February 2010, 65 (1), 123–146.


A Tables

Table 1: Moments from model simulations of the power utility model. I run quarterly simulations but annualize all moments. The subjective SDF refers to the experienced marginal utility growth of the agent. The objective SDF refers to the subjective SDF plus the common belief distortion. The carry trade refers to the strategy of buying the high interest rate currency and shorting the low interest rate currency. The scaled carry trade scales the carry trade exposure according to the size of the forward discount $r^{f,*}_t - r^f_t$, and therefore takes advantage of the higher conditional Sharpe ratio when the forward discount is large.

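| Moment | Symbol | Value |
|---|---|---|
| **Consumption** | | |
| Consumption Growth (%) | $g$ | 1.51 |
| Consumption Volatility (%) | $\sigma(\Delta c)$ | 1.67 |
| **Returns + Beliefs** | | |
| Carry Trade SR | $SR\left([f_{t \to t+1} - s_{t+1}] \times \mathrm{sign}[r^{f,*}_t - r^f_t]\right)$ | 0.16 |
| Scaled Carry Trade SR | $SR\left([f_{t \to t+1} - s_{t+1}] \times [r^{f,*}_t - r^f_t]\right)$ | 0.21 |
| Wealth SR | $SR(r^W)$ | 0.05 |
| Volatility of Belief (%) | $\sigma(\tilde{g}_t)$ | 0.16 |
| **SDF Moments** | | |
| Subjective SDF Volatility | $\sigma(m)$ | 0.08 |
| Objective SDF Volatility | $\sigma(\hat{m})$ | 0.30 |
| Subjective SDF Correlation | $\mathrm{Corr}(m, m^*)$ | 0.00 |
| Objective SDF Correlation | $\mathrm{Corr}(\hat{m}, \hat{m}^*)$ | 0.92 |
| Exchange Rate Volatility | $\sigma(\Delta s)$ | 0.12 |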


Table 2: Moments from model simulations of the Epstein-Zin model with priced parameter uncertainty. I run quarterly simulations but annualize all moments. The carry trade refers to the strategy of buying the high interest rate currency and shorting the low interest rate currency. The scaled carry trade scales the carry trade exposure according to the size of the forward discount $r^{f,*}_t - r^f_t$, and therefore takes advantage of the higher conditional Sharpe ratio when the forward discount is large.

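| Risk Aversion | $\gamma=5$ | $\gamma=5$ | $\gamma=5$ | $\gamma=10$ | $\gamma=10$ | $\gamma=10$ |
|---|---|---|---|---|---|---|
| Belief Correlation | $\rho=0$ | $\rho=0.77$ | $\rho=1$ | $\rho=0$ | $\rho=0.77$ | $\rho=1$ |
| Carry Trade SR | 0.14 | 0.04 | 0.00 | 0.13 | 0.06 | 0.00 |
| Scaled Carry SR | 0.18 | 0.06 | 0.00 | 0.17 | 0.07 | 0.00 |
| Wealth SR | 0.19 | 0.13 | 0.13 | 0.36 | 0.28 | 0.27 |
| Exchange Rate Volatility | 0.44 | 0.17 | 0.12 | 0.76 | 0.32 | 0.24 |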


Table 3: Summary of how models perform on replicating different features of exchange rates and the carry trade in the data. Carry SR refers to the ability of a model to generate large carry trade SR. Smooth FX refers to the ability of the model to limit the standard deviation of the real exchange rate to around 0.10. Growth carry refers to the ability of the model to link the cross sectional dispersion in interest rates to recent levels of consumption growth. Wealth SR refers to the ability of a model to generate equity Sharpe ratios near 0.40. The X's mark the dimensions on which each model succeeds.

| Data Feature | Habits (Verdelhan, 2010) | Long Run Risks (Bansal and Shaliastovich, 2012) | Rare Disasters (Farhi and Gabaix, 2014) | Beliefs + Power Utility, $\gamma=5$ | Beliefs + EZ + Independent Beliefs, $\gamma=10$ |
|---|---|---|---|---|---|
| Carry SR | X | X | X | X | X |
| Smooth FX | | X | X | X | |
| Growth Carry | X | | | X | X |
| Wealth SR | X | X | X | | X |


B Figures

[Figure: Relative Odds that Data is from Constant Gain Belief. Horizontal axis: Quarters (0 to 400). Vertical axis: P(g = Constant Gain) / P(g = Sample Mean) (0 to 1).]

Figure 1: Average relative odds of data being generated from the constant gain belief $\tilde{g}_t$ instead of the sample mean $\bar{g}_t$, for $t = 1, \dots, T = 400$ quarters. In each simulation trial $i = 1, \dots, N = 10{,}000$, I simulate a sequence of consumption growth $\Delta c_{i,t}$. Constant gain beliefs $\tilde{g}_{t,i}$ are initialized with $\tilde{g}_{0,i} = g$ and evolve according to $\tilde{g}_{t,i} = \alpha\Delta c_{t,i} + (1 - \alpha)\tilde{g}_{t-1,i}$. The full Bayesian belief is the sample mean $\bar{g}_{t,i} = \frac{1}{t}\sum_{j=1}^{t}\Delta c_{j,i}$. In each simulation trial, I compute the log likelihood ratio $\ell_{t,i} = -\frac{1}{2\sigma^2}\sum_{j=1}^{t}\left[\left(\Delta c_{j,i} - \tilde{g}_{t,i}\right)^2 - \left(\Delta c_{j,i} - \bar{g}_{t,i}\right)^2\right]$. The curve above, at quarter $t$, is then $\exp\left(\frac{1}{N}\sum_{i=1}^{N}\ell_{t,i}\right)$. The quantity represents the relative likelihood of the data being generated from a process of consumption growth with the mean $\tilde{g}_t$ versus being generated from $\bar{g}_t$. Given that the sample mean is the maximum likelihood estimate, this relative likelihood is always less than one, although it does decrease only slowly.
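The calculation described in the caption can be sketched with the standard library alone. The version below uses far fewer trials and quarters than the figure, and the growth mean, volatility, and seed are assumptions chosen for illustration, so it shows the mechanics rather than reproducing the plotted curve.

```python
import math
import random

def average_relative_odds(n_quarters, n_trials, g, sigma, alpha, seed=0):
    """exp of the trial-averaged log likelihood ratio, constant gain vs sample mean."""
    rng = random.Random(seed)
    mean_loglik = [0.0] * n_quarters
    for _ in range(n_trials):
        data, g_tilde, running_sum = [], g, 0.0
        for t in range(1, n_quarters + 1):
            dc = rng.gauss(g, sigma)
            data.append(dc)
            running_sum += dc
            g_tilde = alpha * dc + (1.0 - alpha) * g_tilde   # constant gain belief
            g_bar = running_sum / t                          # sample mean
            # Difference in squared errors over the whole history up to t.
            gap = sum((x - g_tilde) ** 2 - (x - g_bar) ** 2 for x in data)
            mean_loglik[t - 1] += -gap / (2.0 * sigma ** 2) / n_trials
    return [math.exp(v) for v in mean_loglik]

odds = average_relative_odds(n_quarters=40, n_trials=200,
                             g=0.004, sigma=0.008, alpha=0.02)
```

Because the sample mean minimizes the sum of squared errors, the squared-error gap is non-negative in every trial, which is why each value of `odds` lies at or below one.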


[Figure: Cumulative Returns. Horizontal axis: Date (1990 to 2010). Vertical axis: Cumulative Return (1.0 to 1.3). Series: Growth Carry, Interest Rate Carry.]

Figure 2: Comparison of carry trade strategies on developed world countries. The growth carry refers to the long short strategy sorting on model implied consumption growth beliefs, entering in positions $\tilde{g}_{i,t} - \frac{1}{N}\sum_{j=1}^{N}\tilde{g}_{j,t}$ in currency $i$ at time $t$, when the growth rate belief for country $i$ at time $t$ is $\tilde{g}_{i,t}$. The interest rate carry refers to the long short strategy sorting on forward discounts, entering in positions $fd_{i,t} - \frac{1}{N}\sum_{j=1}^{N} fd_{j,t}$ in currency $i$ at time $t$, when the forward discount is defined as the log difference between the forward contract and the spot, $fd_{i,t} = f_{i,t \to t+1} - s_{i,t}$. Log returns to each strategy are scaled to have a standard deviation of 1%.


C Proofs of Multivariate Constant Gain Learning

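Let consumption growth follow

$$\Delta\mathbf{c}_{t+1} = \begin{pmatrix} \Delta c_{t+1} \\ \Delta c^*_{t+1} \end{pmatrix} \sim MVN(\mathbf{g}, \Sigma), \qquad \mathbf{g} = \begin{pmatrix} g \\ g^* \end{pmatrix}, \qquad \Sigma = \sigma^2 I$$

Impose the prior

$$p(g, g^*) \sim MVN(\boldsymbol{\mu}, \Upsilon), \qquad \Upsilon = \tau^{-2}\begin{pmatrix} 1 & \phi \\ \phi & 1 \end{pmatrix}$$

where $\boldsymbol{\mu}$ is the prior mean vector, as in equation 3.5. I start with the full Bayesian solution. After $t$ observations, the posterior for the means is a multivariate normal with mean $\tilde{\mathbf{g}}_t$:

$$\tilde{\mathbf{g}}_t = \left(\Upsilon^{-1} + t\Sigma^{-1}\right)^{-1}\left(\Upsilon^{-1}\boldsymbol{\mu} + \Sigma^{-1}\sum_{s=1}^{t}\Delta\mathbf{c}_s\right)$$

This can be written recursively as:

$$\tilde{\mathbf{g}}_t = \left(\Sigma\Upsilon^{-1} + tI\right)^{-1}\Delta\mathbf{c}_t + \left(I - \left(\Sigma\Upsilon^{-1} + tI\right)^{-1}\right)\tilde{\mathbf{g}}_{t-1}, \qquad \tilde{\mathbf{g}}_0 = \boldsymbol{\mu}$$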

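The first step to get to constant gain learning is to replace the true sample size $t$ with the effective sample size $\alpha^{-1}$. After doing so, expand out the multiplier matrix on consumption growth:

$$\left(\Sigma\Upsilon^{-1} + \alpha^{-1}I\right)^{-1} = \begin{pmatrix} \alpha^{-1}(1 - \phi^2) + \sigma^2\tau^2 & \phi\sigma^2\tau^2 \\ \phi\sigma^2\tau^2 & \alpha^{-1}(1 - \phi^2) + \sigma^2\tau^2 \end{pmatrix} \times \frac{1}{\alpha^{-2}(1 - \phi^2) + 2\alpha^{-1}\sigma^2\tau^2 + \sigma^4\tau^4}$$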

The general strategy involves sending $\tau \to 0$ and seeing what the resulting matrix implies for learning. First consider the special case of $\phi = 1$. In that case I get the symmetric matrix

$$\frac{\sigma^2\tau^2}{2\alpha^{-1}\sigma^2\tau^2 + \sigma^4\tau^4}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$$

Let the precision $\tau \to 0$ to reflect a diffuse prior. The matrix then becomes

$$\frac{\alpha}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}$$

Concretely, what this means is that $\tilde{g}_t$ evolves according to

$$\tilde{g}_t = \alpha \cdot \frac{\Delta c_t + \Delta c^*_t}{2} + (1 - \alpha)\,\tilde{g}_{t-1}$$

Therefore the two countries' beliefs are perfectly correlated because the new information incorporated at each stage $t$ is just the average of the two countries' consumption growth at time $t$.

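The case of $\phi = 0$ also accords with the intuition that independent priors imply independent learning. In that case,

$$\left(\Sigma\Upsilon^{-1} + \alpha^{-1}I\right)^{-1} = \begin{pmatrix} \alpha^{-1} + \sigma^2\tau^2 & 0 \\ 0 & \alpha^{-1} + \sigma^2\tau^2 \end{pmatrix} \times \frac{1}{\alpha^{-2} + 2\alpha^{-1}\sigma^2\tau^2 + \sigma^4\tau^4} = I \times \frac{\alpha^{-1} + \sigma^2\tau^2}{\alpha^{-2} + 2\alpha^{-1}\sigma^2\tau^2 + \sigma^4\tau^4} \to I \times \frac{\alpha^{-1}}{\alpha^{-2}} = \alpha I$$

This corresponds exactly to the intuition of just using each country's past history of consumption growth.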

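However, letting $\tau \to 0$ for intermediate levels of $\phi$ does not lead to intermediate solutions. Intuitively, as $\tau \to 0$, the effect of $\Upsilon$ in the updating equation goes to zero, and so the influence of the independent consumption growth overwhelms any effect of the prior in inducing correlations between beliefs. Formally, one can take limits of the updating matrix $\left(\Sigma\Upsilon^{-1} + \alpha^{-1}I\right)^{-1}$ as $\tau \to 0$ to get:

$$\left(\Sigma\Upsilon^{-1} + \alpha^{-1}I\right)^{-1} \to \begin{pmatrix} \alpha^{-1}(1 - \phi^2) & 0 \\ 0 & \alpha^{-1}(1 - \phi^2) \end{pmatrix} \times \frac{1}{\alpha^{-2}(1 - \phi^2)} = \alpha I$$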

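asymptotics. Let $\phi \sim 1 - c\tau^2 + o(\tau^2)$, and therefore $\phi^2 \sim 1 - 2c\tau^2 + o(\tau^2)$. In that case as $\tau \to 0$

$$\begin{aligned}
\left(\Sigma\Upsilon^{-1} + \alpha^{-1}I\right)^{-1}
&= \begin{pmatrix} \alpha^{-1}(1 - \phi^2) + \sigma^2\tau^2 & \phi\sigma^2\tau^2 \\ \phi\sigma^2\tau^2 & \alpha^{-1}(1 - \phi^2) + \sigma^2\tau^2 \end{pmatrix} \times \frac{1}{\alpha^{-2}(1 - \phi^2) + 2\alpha^{-1}\sigma^2\tau^2 + \sigma^4\tau^4} \\
&\sim \begin{pmatrix} \alpha^{-1}2c\tau^2 + \sigma^2\tau^2 & (1 - c\tau^2)\sigma^2\tau^2 \\ (1 - c\tau^2)\sigma^2\tau^2 & \alpha^{-1}2c\tau^2 + \sigma^2\tau^2 \end{pmatrix} \times \frac{1}{\alpha^{-2}2c\tau^2 + 2\alpha^{-1}\sigma^2\tau^2 + \sigma^4\tau^4} \\
&\to \begin{pmatrix} 2\alpha^{-1}c + \sigma^2 & \sigma^2 \\ \sigma^2 & 2\alpha^{-1}c + \sigma^2 \end{pmatrix} \times \frac{1}{2c\alpha^{-2} + 2\alpha^{-1}\sigma^2} \\
&= \alpha\begin{pmatrix} 2\alpha^{-1}c + \sigma^2 & \sigma^2 \\ \sigma^2 & 2\alpha^{-1}c + \sigma^2 \end{pmatrix} \times \frac{1}{2c\alpha^{-1} + 2\sigma^2} \\
&= \begin{pmatrix} \frac{\sigma^2}{2c} + \alpha^{-1} & -\frac{\sigma^2}{2c} \\ -\frac{\sigma^2}{2c} & \frac{\sigma^2}{2c} + \alpha^{-1} \end{pmatrix}^{-1} \\
&= \left(\frac{\sigma^2}{2c}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} + \alpha^{-1}I\right)^{-1}
\end{aligned}$$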
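The two limits can be checked numerically. The sketch below is an illustration with assumed parameter values and hypothetical function names; it drops the $o(\tau^2)$ term and sets $\phi = 1 - c\tau^2$ exactly, computing $1 - \phi^2$ in factored form to avoid floating point cancellation.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (p, q), (r, s) = m
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def gain_from_prior(sigma2, tau2, phi, one_minus_phi2, alpha):
    """(Sigma Upsilon^{-1} + I/alpha)^{-1} for Upsilon = tau^{-2} [[1, phi], [phi, 1]]."""
    scale = sigma2 * tau2 / one_minus_phi2
    return inverse_2x2([[scale + 1.0 / alpha, -scale * phi],
                        [-scale * phi, scale + 1.0 / alpha]])

def fixed_phi_gain(sigma2, tau2, phi, alpha):
    return gain_from_prior(sigma2, tau2, phi, 1.0 - phi * phi, alpha)

def local_to_unity_gain(sigma2, tau2, c, alpha):
    x = c * tau2                      # phi = 1 - x, so 1 - phi^2 = x (2 - x)
    return gain_from_prior(sigma2, tau2, 1.0 - x, x * (2.0 - x), alpha)

def limiting_gain(sigma2, c, alpha):
    """The limit matrix derived above."""
    k = sigma2 / (2.0 * c)
    return inverse_2x2([[k + 1.0 / alpha, -k], [-k, k + 1.0 / alpha]])

sigma2, alpha, c = 0.008 ** 2, 0.02, 1e-6    # illustrative values
near_diffuse_fixed = fixed_phi_gain(sigma2, 1e-10, 0.5, alpha)
near_diffuse_local = local_to_unity_gain(sigma2, 1e-8, c, alpha)
limit = limiting_gain(sigma2, c, alpha)
```

With a fixed $\phi$ the off-diagonal gain vanishes as the prior becomes diffuse, whereas under the local to unity scaling the off-diagonal gain stays bounded away from zero and matches the limiting matrix.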

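In deriving this learning model I imposed a hyperparameter $c$. One way to calibrate $c$ is to think about the implied correlation between the two beliefs $\tilde{g}_t, \tilde{g}^*_t$. Let

$$M = \left(\frac{\sigma^2}{2c}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix} + \alpha^{-1}I\right)^{-1}$$

Because $M$ is the inverse of a symmetric matrix, it too is symmetric. Iterating the recursion for $\tilde{\mathbf{g}}_t$ backwards yields:

$$\tilde{\mathbf{g}}_t = M\Delta\mathbf{c}_t + (I - M)\,\tilde{\mathbf{g}}_{t-1} = M\Delta\mathbf{c}_t + (I - M)M\Delta\mathbf{c}_{t-1} + (I - M)^2\,\tilde{\mathbf{g}}_{t-2} = \sum_{j=0}^{\infty}(I - M)^j M\,\Delta\mathbf{c}_{t-j}$$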

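Note that $M$ and $I - M$ commute, and that $M = M^T$. Therefore

$$\mathrm{Var}(\tilde{\mathbf{g}}_t) = \sum_{j=0}^{\infty}(I - M)^j M\,\Sigma\,M^T\left(I - M^T\right)^j = \sigma^2\sum_{j=0}^{\infty}(I - M)^j M^2 (I - M)^j = \sigma^2 M^2\sum_{j=0}^{\infty}(I - M)^{2j}$$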

This provides both the parameter uncertainty matrix and the correlation between beliefs. Figure 3 shows the relationship between the hyperparameter and the belief correlation, with the belief correlation both computed analytically and checked by simulation.
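The matrix sum can be evaluated directly. The sketch below truncates the infinite sum and, as a cross-check that is not part of the paper's derivation, compares it with a closed form obtained from the eigenvectors $(1, 1)$ and $(1, -1)$ of $M$, whose eigenvalues are $\alpha$ and $(\sigma^2/c + \alpha^{-1})^{-1}$. Parameter values are illustrative and function names are hypothetical.

```python
import math

def gain_matrix(sigma2, c, alpha):
    """M from the text, via the analytic inverse of [[k + b, -k], [-k, k + b]]."""
    k, b = sigma2 / (2.0 * c), 1.0 / alpha
    det = b * (2.0 * k + b)
    return [[(k + b) / det, k / det], [k / det, (k + b) / det]]

def mat_mul(x, y):
    return [[sum(x[i][n] * y[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]

def belief_correlation_by_sum(sigma2, c, alpha, n_terms=5000):
    """Correlation from sigma^2 M^2 sum_j (I - M)^(2j), truncated at n_terms."""
    M = gain_matrix(sigma2, c, alpha)
    A = [[1.0 - M[0][0], -M[0][1]], [-M[1][0], 1.0 - M[1][1]]]   # I - M
    M2, A2 = mat_mul(M, M), mat_mul(A, A)
    power = [[1.0, 0.0], [0.0, 1.0]]       # (I - M)^(2j), starting at j = 0
    total = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(n_terms):
        term = mat_mul(M2, power)
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
        power = mat_mul(power, A2)
    var = [[sigma2 * total[i][j] for j in range(2)] for i in range(2)]
    return var[0][1] / math.sqrt(var[0][0] * var[1][1])

def belief_correlation_closed_form(sigma2, c, alpha):
    """Eigenvalue shortcut: each direction contributes lambda / (2 - lambda)."""
    lam_common = alpha
    lam_diff = 1.0 / (sigma2 / c + 1.0 / alpha)
    v_common = lam_common / (2.0 - lam_common)
    v_diff = lam_diff / (2.0 - lam_diff)
    return (v_common - v_diff) / (v_common + v_diff)

sigma2, alpha = 0.008 ** 2, 0.02
corr_sum = belief_correlation_by_sum(sigma2, 1e-6, alpha)
corr_closed = belief_correlation_closed_form(sigma2, 1e-6, alpha)
```

The truncated sum converges slowly when $c$ is very small, because the gain on the difference between the two countries' growth rates then approaches zero; in that region the closed form is the more reliable of the two calculations.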

[Figure: Relation of Hyperparameter c to Belief Correlation. Horizontal axis: log(c) (−20 to −8). Vertical axis: Belief Correlation (−0.2 to 1.2). Series: Monte Carlo (T = 40000), Matrix Sum.]

Figure 3: Relationship of hyperparameter $c$ to the correlation of beliefs. Monte Carlo refers to the correlation as computed from a Monte Carlo simulation of how the beliefs evolve. Matrix sum refers to explicitly computing the infinite sum in the matrix expression.