Rational macroeconomic learning in linear expectational models

Holden, Tom

Department of Economics, University of Oxford

1 May 2008

Online at

https://mpra.ub.uni-muenchen.de/10872/

An analysis of the convergence properties of macroeconomic models under partial information rational expectations and Bayesian learning

Abstract: The partial information rational expectations solution to a general linear multivariate expectational macro-model is found when agents are uncertain about the true values of the model’s parameters. Necessary and sufficient conditions for convergence to the full information rational expectations solution are given, and the core of an algorithm for the Bayesian updating of beliefs is provided. In the course of this a new class of full information rational expectations equilibria is described and some of its desirable properties proven.

Keywords: Rational Expectations, Partial information, Bayesian learning, Generalized Schur decomposition, Sunspots, Indeterminacy, Feasible Rational Expectations Equilibria

JEL Classification: C11, C60, E00

Word count: Actual: 19898 words. Official: 365 words per page × 81 pages = 29565 words.

Post: Tom Holden, Balliol College, Oxford, OX1 3BJ

Phone: +44 7815 067305

E-mail:[email protected]

Acknowledgements: The author would particularly like to thank his primary supervisor David Vines for steering him towards this topic in its current form and his secondary supervisor Martin Ellison for his advice. Additional thanks for helpful comments are due to Simon


1. Introduction

In this thesis, we solve the problem of forming macroeconomic rational expectations under partial information about a model’s parameters. We find necessary and sufficient conditions for convergence to the full information solution and we develop the core of an algorithm for the updating of beliefs. This provides a fully rational alternative to the statistical learning literature, popularized by Evans and Honkapohja (2001), which has been influential in recent years. We begin with the motivation for this project.

1.1. Expectations in macroeconomics

Expectations are inextricably tied up with the optimising agent framework that underlies almost all modern economics. In choosing whether to invest in a stock, we consider whether the dividends we expect to get from it are more than adequate compensation for the price asked. More generally, whenever an agent is making a decision that will potentially deliver costs or rewards in the future, they must form expectations of what that reward might be. Consumers choose current consumption to maximise their expectations of lifetime utility. Firms make pricing and investment decisions to maximise the expected value of the stream of profits that will result. Central banks choose the interest rate to minimise the expected future deviation of inflation and output from their targets. Indeed, almost all economic decisions have a forward-looking aspect to them, and so require the formation of expectations.

What makes expectations particularly interesting to macroeconomists are the many macroeconomic variables that are affected by their own expectations. If, when a firm chooses a price for its product, it knows it may be constrained to stick to that price for several periods, then it will optimally choose its price taking into account not only its current marginal costs but also its expectations about the marginal costs it may face in the future. With price a mark-up over marginal costs, such a set-up leads to current inflation depending on current expectations of future inflation (Calvo 1983; Walsh 2003: 234-40). Similarly, the optimization decisions of households lead current output to depend on households’ expectations of future output (Walsh 2003: 232-34). Many contemporary macroeconomic models take a dynamic stochastic general equilibrium (DSGE) approach in which the optimisation decisions of

macroeconomic variable having consequences for the path of virtually every other variable considered. Clearly then, precisely how these expectations are formed will have significant consequences for the path the economy actually takes.

Traditionally the literature has been divided between full information “rational expectations” on the one hand and various partial information, boundedly rational schemes on the other. Neither is entirely satisfactory. On the one hand, the knowledge and mental capacities ascribed to agents under rational expectations are surely infeasible in general; on the other hand, though, there are at least some agents in the economy, often those with most influence, who really could not be sensibly modelled as anything other than fully rational. Most boundedly rational schemes also suffer from exceptionally poor performance in certain specific settings, meaning that in some circumstances even the least rational agents in the economy may realise the flaws in the way they form expectations. It is also hard to interpret the predictions of partial information boundedly rational models, as until now there has been no partial information full rationality benchmark to compare them against. Finally, since there are so many ways in which an agent can fail to be fully rational, any boundedly rational scheme will always seem somewhat arbitrary unless sound reasons can be given for one form rather than another.

1.2. “Rational expectations”

1.2.1. Calculating rational expectations

If we have a model of some part of the economy and values for all the model’s parameters, and we take that model to be true, how should we rationally form expectations of the model’s variables? This is the question to which “rational expectations” were the answer, an answer first formulated by Muth (1961) and later popularized by Lucas (1972) and Sargent et al. (1973). Broadly, rational expectations are just mathematical expectations; complications arise, though, when these expectations directly affect the model’s variables.

Consider as a first example an industry in which supply decisions must be taken a period prior to the realisation of demand, due to the time taken by production. If markets clear and we take a locally linear approximation to demand and supply, then:

\[ c_D - m_D p_t + \nu_{D,t} = c_S + m_S \mathbb{E}_{t-1} p_t + \nu_{S,t} \]

where $D$ and $S$ subscripts denote demand and supply side parameters respectively, $p_t$ is the price level and $\nu_{\cdot,t}$ are unpredictable shocks (i.e. $\mathbb{E}_{t-1}\nu_{\cdot,t} = 0$)¹. To find the rational expectations solution, we take expectations of both sides conditional on the $t-1$ information set, giving:

\[ c_D - m_D \mathbb{E}_{t-1} p_t = c_S + m_S \mathbb{E}_{t-1} p_t \;\Rightarrow\; \mathbb{E}_{t-1} p_t = \frac{c_D - c_S}{m_D + m_S} \]

Substituting this back into the original equation gives us that:

\[ p_t = \frac{c_D - c_S}{m_D + m_S} + \frac{\nu_{D,t} - \nu_{S,t}}{m_D} \]

This then is the value $p_t$ would take if all agents in the economy had formed rational expectations with knowledge of the values of the parameters $c_D$, $c_S$, $m_D$ and $m_S$. Because “rational expectations” are only rational when everyone in the economy knows that everyone else is rational, it is important to note that strictly construed “rational expectations” are an equilibrium concept. Were it the case that almost everyone in the economy (irrationally) expected next period’s price to be zero, then the rational expectation of the next period price would instead approximately equal $(c_D - c_S)/m_D$. In light of this, we shall term a solution to a model under rational expectations a rational expectations equilibrium or REE².
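The Cobweb calculation above can be checked numerically. The following is a minimal sketch only: the parameter values and function names are hypothetical choices made for illustration and do not come from the thesis. It verifies that the REE expectation is a fixed point of the market-clearing condition, and shows how the realised price differs when everyone else expects a price of zero.

```python
import random


def cobweb_ree_expectation(c_d, c_s, m_d, m_s):
    """E_{t-1} p_t under rational expectations: (c_D - c_S) / (m_D + m_S)."""
    return (c_d - c_s) / (m_d + m_s)


def cobweb_price(c_d, c_s, m_d, m_s, expected_price, nu_d=0.0, nu_s=0.0):
    """Market-clearing price given the expectation held last period.

    Solves c_D - m_D p_t + nu_D = c_S + m_S * expected_price + nu_S for p_t.
    """
    return (c_d - c_s - m_s * expected_price + nu_d - nu_s) / m_d


# Hypothetical parameter values, chosen only for illustration.
C_D, C_S, M_D, M_S = 10.0, 2.0, 1.5, 0.5

ree = cobweb_ree_expectation(C_D, C_S, M_D, M_S)

# With zero shocks the realised price reproduces the expectation (fixed point).
fixed_point_price = cobweb_price(C_D, C_S, M_D, M_S, expected_price=ree)

# If everyone else (irrationally) expects zero, the mean price is (c_D - c_S) / m_D.
naive_price = cobweb_price(C_D, C_S, M_D, M_S, expected_price=0.0)

# Averaging over many shock draws recovers the REE expectation.
rng = random.Random(0)
draws = [
    cobweb_price(C_D, C_S, M_D, M_S, ree, rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))
    for _ in range(20000)
]
mean_price = sum(draws) / len(draws)
```

The fixed-point check is the sense in which the REE is an equilibrium concept: the expectation is only correct on average if it is the one actually used by suppliers.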

The models we will chiefly be concerned with in this thesis will not admit such simple REE as the one just given for the Cobweb model. In particular, we will focus on models in which current expectations of future values influence the current value of those variables, rather than those in which only past expectations matter. Most DSGE and New Keynesian models take this “$t$-dated” form. The canonical example is asset pricing under risk neutrality, with a constant, non-stochastic real interest rate. It is straightforward to see that in this situation, $p_t = (1+r)^{-1}\mathbb{E}_t p_{t+1} + d_t$, where $p_t$ is the period $t$ asset price and $d_t$ is the dividend paid at the start of that period (so in particular $d_t$ is in the period $t$ information set). In general, this has many REE. For example, let $\eta_t$ be any white noise process, then we can impose $p_t = \mathbb{E}_{t-1} p_t + \eta_t$ and still get a solution, since stacking these equations we have:

\[ \begin{bmatrix} 1 & -(1+r)^{-1} \\ 1 & 0 \end{bmatrix} \begin{bmatrix} p_t \\ \mathbb{E}_t p_{t+1} \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} p_{t-1} \\ \mathbb{E}_{t-1} p_t \end{bmatrix} + \begin{bmatrix} d_t \\ \eta_t \end{bmatrix} \]

¹ This is the Cobweb model considered by Muth (1961).

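i.e.

\[ \begin{bmatrix} p_t \\ \mathbb{E}_t p_{t+1} \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 1+r \end{bmatrix} \begin{bmatrix} p_{t-1} \\ \mathbb{E}_{t-1} p_t \end{bmatrix} + \begin{bmatrix} 0 & 1 \\ -(1+r) & 1+r \end{bmatrix} \begin{bmatrix} d_t \\ \eta_t \end{bmatrix} \]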

It is common when considering rational expectations solutions to such problems to restrict attention to those satisfying some stationarity condition. These are often justified by the transversality conditions of the optimization problem from which the equations arose, or by an appeal to agents’ assumption that the future is not radically different from the present. In this model, it turns out that if $d_t \sim \mathrm{NIID}(\mu, \sigma^2)$, for sensible values of $r$ there is always a stationary solution taking the form $p_t = c + d_t$ for some unknown parameter $c$. When this holds we must have $\mathbb{E}_t p_{t+1} = c + \mu$, so identifying coefficients $c = (1+r)^{-1}(c + \mu)$, i.e. $c = \mu / r$. This method of guessing solutions based on the state variables of the problem is due to McCallum (1983; 1999) and is known as the minimal state variables (MSV) solution. Unfortunately, for more complex models finding MSV solutions is numerically cumbersome (Binder and Pesaran 1996) and it will not in any case find all solutions of the original model. Instead the solution method we shall use in this paper owes its intellectual debt to that of Blanchard and Kahn (1980).
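As a rough numerical illustration of the asset pricing example, the sketch below checks that $c = \mu/r$ solves the pricing equation and that the stacked recursion reproduces $p_t = c + d_t$ when the expectational error is chosen as $\eta_t = d_t - \mu$. The values of $\mu$, $r$ and $\sigma$, and all function names, are assumptions made for illustration rather than anything taken from the thesis.

```python
import random


def msv_constant(mu, r):
    """Constant c in the stationary solution p_t = c + d_t, i.e. c = mu / r."""
    return mu / r


def pricing_rhs(c, mu, r, d_t):
    """(1 + r)^{-1} E_t p_{t+1} + d_t when E_t p_{t+1} = c + mu."""
    return (c + mu) / (1.0 + r) + d_t


def stacked_step(expected_price, d_t, eta_t, r):
    """One step of the stacked system: returns (p_t, E_t p_{t+1})."""
    p_t = expected_price + eta_t
    next_expectation = (1.0 + r) * (p_t - d_t)
    return p_t, next_expectation


# Hypothetical values, for illustration only.
MU, R, SIGMA = 2.0, 0.05, 0.3
c = msv_constant(MU, R)

# The guess p_t = c + d_t satisfies the pricing equation for any dividend.
d_example = 1.7
guess_gap = (c + d_example) - pricing_rhs(c, MU, R, d_example)

# Choosing eta_t = d_t - mu keeps the stacked recursion on the MSV path.
rng = random.Random(0)
expectation = c + MU  # E_{t-1} p_t on the MSV path
max_path_gap = 0.0
for _ in range(50):
    d_t = rng.gauss(MU, SIGMA)
    p_t, expectation = stacked_step(expectation, d_t, d_t - MU, R)
    max_path_gap = max(max_path_gap, abs(p_t - (c + d_t)))
```

Any other white noise choice of `eta_t` also satisfies the stacked equations, but the resulting path is generally explosive, which is why a stationarity condition is needed to pin down the solution.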

1.2.2. Indeterminacy

General linear expectational models often have many REE. Although the early DSGE literature confined itself to models in which there was a unique solution, recently models exhibiting indeterminacy have been given more serious consideration. Indeterminacy may arise from increasing returns to scale (Benhabib and Farmer 1994), market imperfections (Benhabib and Nishimura 1998), search externalities (Howitt and McAfee 1988), variable mark-ups (Woodford 1987), collusion (Rotemberg and Woodford 1992), the interaction of monetary policy and cash in advance constraints (Woodford 1994), policy feedback (Blanchard and Summers 1987; Taylor 1998), sticky prices (Benhabib et al. 1998), endogenous growth (Benhabib and Gali 1995) and several other sources³. Indeed, the theoretical evidence at least is almost overwhelming in support of some level of indeterminacy.

Indeterminacy can also potentially explain many macroeconomic puzzles. Benhabib and Farmer (1999) suggest it may have a role to play in explaining price stickiness, Auray and Fève (2007) suggest it may explain the price puzzle and Benhabib and Farmer (2000) suggest it may help explain the real effects of an increase in the money supply. All of this suggests that indeterminacy is empirically important as well.

Our interest in indeterminacy stems from two facts. Firstly, the traditional macroeconomic learning literature has had most problems with learning under indeterminacy (which is something we will discuss later), and secondly, intuitively rational learning should perform best under indeterminacy, since under indeterminacy the set of expectations consistent with stability will be much larger, and thus it will be easier to end up within it. In light of the previous remarks, we assert that these problems with traditional learning under indeterminacy should be taken seriously and not dismissed as being the result of poor modelling choice, and we can be optimistic for the performance of rational learning, even if it turns out to perform badly under determinacy.

1.2.3. Problems with “rational expectations”

We have already hinted at many of the problems with the REE concept. It is objected firstly that agents do not have the information to form rational expectations and secondly that they lack the mental capabilities to act on that information in the required way.

The first objection is uncontroversial. Even professional macroeconomists still have a great deal of uncertainty as to the precise impact of a monetary policy shock, for example. Finding out the parameters of a macro-model invariably requires undertaking at least some econometrics, a procedure that will never produce certainty, only posterior probability distributions over the values those parameters might take. It really does then seem hard to justify assuming that all agents in the economy actually form expectations under full information.

The second objection leaves more room for debate. It might be argued that it only takes a few agents in the economy forming expectations rationally for the whole economy soon to acquire rational expectations⁴. For example, given sufficient liquidity it only takes a single risk-neutral agent with rational expectations participating in futures markets for all futures prices to correspond to their prices under rational expectations. Indeed even non-futures markets reveal significant amounts about market expectations of the future paths of output and the interest rate. The media then notice such signals and broadcast them back to the wider population, in effect giving every agent in the economy free access to a set of almost rational forecasts for major macroeconomic variables. Of course, agents may well ignore this information or act on it in irrational ways, but this is not an argument against ascribing them rational expectations so much as one against modelling their micro-behaviour as fully rational.

⁴ Precisely this is shown within the context of a simple model in Blume and Easley (1993: 38). In particular, they show that if all traders in a simple economy have logarithmic preferences and some traders are Bayesian learners

The validity of the second criticism then depends on both the strength of the transmission mechanism of expectations and the extent to which forming fully rational expectations is computationally feasible for those working at investment banks. We will be better placed to answer the latter of these two questions once we have analysed what rational expectations look like under partial information. In any case, though, it seems the full information assumption implicit in the classical REE framework is sufficiently dubious to warrant a search for alternatives.

1.3. Bounded rationality

1.3.1. Adaptive expectations

The earliest models of macroeconomic expectations formation (e.g. Cagan 1954) took the form:

\[ \mathcal{E}_t x_{t+1} = \lambda x_t + (1 - \lambda)\,\mathcal{E}_{t-1} x_t \]

where $\mathcal{E}_t$ is a period $t$ non-rational expectation operator, $\lambda$ is an arbitrary parameter and $x_t$ is the process of interest. With $\lambda = 1$ the variable is not expected to change from its current value and with $\lambda = 0$ expectations can take any constant value, independent of time. With $\lambda \in (0,1)$, expectations adjust sluggishly to changes in the level of $x_t$, which can be thought of as something like a learning process. This form of learning seems reasonable when the REE solution for $x_t$ takes the form $x_t = \mu + \varepsilon_t$ (a form we saw was taken in the Cobweb model when $x_t$ is the price level) and where there is some constant probability in each period of a structural break that changes the value of $\mu$. When $\mu$ is constant over time the learning procedure will soon settle down to satisfying $\mathcal{E}_t x_{t+1} \approx \mathbb{E}_t x_{t+1}$ providing both $x_t$ and $\mathcal{E}_t x_{t+1}$ are

process than there would be in the REE) (G. W. Evans and Honkapohja 2001: 49), but the learning procedure is nonetheless also capable of responding to changes in $\mu$.

It is worthwhile comparing these models’ properties to those in which we instead have:

\[ \mathcal{E}_t x_{t+1} = t^{-1} x_t + (1 - t^{-1})\,\mathcal{E}_{t-1} x_t \;\Rightarrow\; \mathcal{E}_t x_{t+1} = \frac{1}{t} \sum_{s=1}^{t} x_s \]

i.e. $\mathcal{E}_t x_{t+1}$ is the sample mean of $x_1, \ldots, x_t$. If it were genuinely the case that for all $t$, $x_t = \mu + \varepsilon_t$, then this would be the unique fully rational way of forming expectations. Unfortunately, if everyone else is learning at the same time then in models containing expectations it will not in general be the case that $x_t = \mu + \varepsilon_t$, though this may be approximately true for large $t$ if the REE solution takes this form. Consideration of these decreasing-gain learning procedures gives an alternative interpretation of the constant gain case: if we consider a large population of agents all of differing ages each of whom is undertaking decreasing-gain learning, then, providing agents’ life-spans are not changing through time, constant gains may, in the aggregate, be a reasonable approximation⁵.

However, crude learning procedures such as these are utterly unsuited to modelling any situation in which the REE solution is not of the form $x_t = \mu + \varepsilon_t$, since then $\mathbb{E}_t x_{t+1}$ would not be constant and so, even in the best possible case in which everyone else in the economy has rational expectations, there would still be no possible way in which $\mathcal{E}_t x_{t+1}$ could be even approximately asymptotically rational.
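The difference between the two gain sequences discussed in this subsection can be seen in a small simulation. This is a sketch under stated assumptions: the data are generated exogenously as $x_t = \mu + \varepsilon_t$ with a single structural break in $\mu$ (so there is no feedback from expectations to outcomes), and the break date, gain and variance are hypothetical values chosen for illustration.

```python
import random


def constant_gain(xs, lam, initial=0.0):
    """Adaptive expectations: E_t x_{t+1} = lam * x_t + (1 - lam) * E_{t-1} x_t."""
    forecast = initial
    path = []
    for x in xs:
        forecast = lam * x + (1.0 - lam) * forecast
        path.append(forecast)
    return path


def decreasing_gain(xs, initial=0.0):
    """Gain 1/t in period t, so the forecast after x_1..x_t is their sample mean."""
    forecast = initial
    path = []
    for t, x in enumerate(xs, start=1):
        forecast = x / t + (1.0 - 1.0 / t) * forecast
        path.append(forecast)
    return path


# Hypothetical data: the mean jumps from 1 to 3 half way through the sample.
rng = random.Random(1)
MU_BEFORE, MU_AFTER, BREAK_AT, T = 1.0, 3.0, 500, 1000
xs = [
    (MU_BEFORE if t < BREAK_AT else MU_AFTER) + rng.gauss(0.0, 0.5)
    for t in range(T)
]

cg = constant_gain(xs, lam=0.1)
dg = decreasing_gain(xs)
sample_mean = sum(xs) / len(xs)
```

In this setting the final decreasing-gain forecast `dg[-1]` coincides with `sample_mean`, which averages over both regimes, whereas the constant-gain forecast `cg[-1]` stays close to the post-break mean at the cost of never settling down exactly.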

1.3.2. Statistical learning à la Evans and Honkapohja

Evans and Honkapohja’s work (henceforth E&H)⁶ (e.g. G. W. Evans and Honkapohja 2001) is designed to address this criticism. They assume agents estimate the parameters of the REE solution by usual econometric techniques such as ordinary least squares (OLS). Due to the “online” nature of the learning, it is usually convenient to express this in recursive least squares (RLS) form. For example, if the REE solution has the AR(1) form $x_t = \bar{\omega} x_{t-1} + \bar{\mu} + \bar{\varepsilon}_t$, then the estimates $\hat{\mu}_t$ and $\hat{\omega}_t$ of $\bar{\mu}$ and $\bar{\omega}$ would be updated by:

\[ \begin{bmatrix} \hat{\mu}_t \\ \hat{\omega}_t \end{bmatrix} = \begin{bmatrix} \hat{\mu}_{t-1} \\ \hat{\omega}_{t-1} \end{bmatrix} + t^{-1} R_{t-1}^{-1} \begin{bmatrix} 1 \\ x_{t-1} \end{bmatrix} \left( x_t - \hat{\mu}_{t-1} - \hat{\omega}_{t-1} x_{t-1} \right) \]

where $R_t$ is the estimated second-moment matrix of the regressors $(1, x_{t-1})'$ (with $\bar{\varepsilon}_t$ assumed IID), which is updated according to:

\[ R_t = R_{t-1} + t^{-1} \left( \begin{bmatrix} 1 & x_{t-1} \\ x_{t-1} & x_{t-1}^2 \end{bmatrix} - R_{t-1} \right) \]

This is fully rational learning if and only if it is actually the case that for all $t$, $x_t = \bar{\omega} x_{t-1} + \bar{\mu} + \bar{\varepsilon}_t$. Again, this will not be true in general if the economy is affected by expectations and everyone is learning at the same time. For example, if $x_t = a\,\mathbb{E}_t x_{t+1} + b x_{t-1} + \mu + \varepsilon_t$ (so $\bar{\omega} = \frac{1 \pm \sqrt{1 - 4ab}}{2a}$, $\bar{\mu} = \frac{\mu}{1 - a - a\bar{\omega}}$ and $\bar{\varepsilon}_t = \frac{\varepsilon_t}{1 - a\bar{\omega}}$) then, if expectations are formed according to the learning procedure given above, it will actually be the case that:

\[ x_t = a\left( \hat{\omega}_{t-1} x_{t-1} + \hat{\mu}_{t-1} \right) + b x_{t-1} + \mu + \varepsilon_t = \left( a\hat{\omega}_{t-1} + b \right) x_{t-1} + \left( a\hat{\mu}_{t-1} + \mu \right) + \varepsilon_t \]

This means agents are estimating evolving parameters as being in fact constant, so their learning procedure is misspecified and consequently cannot be fully rational.

⁵ This result is highly dependent on the age structure of the population, and the value of $\lambda$ for which this comes closest to holding will be a function of the population’s structure. We will discuss this issue in more detail in § 1.3.3.

⁶ The origins of this literature go back at least as far as Bray (1982), but most of the ideas later used and popularised
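For concreteness, the AR(1) example can be sketched in code. The block below computes both candidate roots for $\bar{\omega}$, checks that the implied reduced form is consistent with the structural equation when expectations are rational, and implements a single RLS update with gain $t^{-1}$. The parameter values, function names and the use of an explicit $2 \times 2$ inverse are illustrative assumptions, not the implementation used by E&H or in this thesis.

```python
import math


def msv_roots(a, b):
    """Both roots of a*w**2 - w + b = 0, i.e. w = (1 -/+ sqrt(1 - 4ab)) / (2a)."""
    disc = 1.0 - 4.0 * a * b
    if disc < 0.0:
        raise ValueError("complex roots: no real AR(1) solution of this form")
    root = math.sqrt(disc)
    return (1.0 - root) / (2.0 * a), (1.0 + root) / (2.0 * a)


def reduced_form(a, mu, omega):
    """Intercept and shock scaling of x_t = omega x_{t-1} + mu_bar + eps_bar_t."""
    mu_bar = mu / (1.0 - a - a * omega)
    shock_scale = 1.0 / (1.0 - a * omega)
    return mu_bar, shock_scale


def rls_step(theta, R, x_prev, x_now, t):
    """One RLS update of theta = (mu_hat, omega_hat) and of the moment matrix R."""
    mu_hat, omega_hat = theta
    z = (1.0, x_prev)  # regressors: constant and lagged x
    error = x_now - mu_hat - omega_hat * x_prev
    det = R[0][0] * R[1][1] - R[0][1] * R[1][0]
    # R^{-1} z, using the explicit inverse of a 2x2 matrix.
    g0 = (R[1][1] * z[0] - R[0][1] * z[1]) / det
    g1 = (-R[1][0] * z[0] + R[0][0] * z[1]) / det
    new_theta = (mu_hat + g0 * error / t, omega_hat + g1 * error / t)
    new_R = [[R[i][j] + (z[i] * z[j] - R[i][j]) / t for j in range(2)]
             for i in range(2)]
    return new_theta, new_R


# Hypothetical structural parameters, for illustration only.
A_COEF, B_COEF, MU = 0.2, 0.5, 1.0
omega_low, omega_high = msv_roots(A_COEF, B_COEF)
mu_bar, shock_scale = reduced_form(A_COEF, MU, omega_low)

# Under rational expectations, E_t x_{t+1} = omega x_t + mu_bar, and the
# structural equation reproduces the reduced form.
x_prev, eps = 2.0, 0.3
x_now = omega_low * x_prev + mu_bar + shock_scale * eps
structural_rhs = (A_COEF * (omega_low * x_now + mu_bar)
                  + B_COEF * x_prev + MU + eps)

# A single RLS update starting from zero beliefs and an identity moment matrix.
theta1, R_next = rls_step((0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]],
                          x_prev=2.0, x_now=3.0, t=1)
```

Iterating `rls_step` on data generated by the actual law of motion above would make the misspecification visible, since the coefficients the agents are estimating then move with their own beliefs.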

E&H derive some general convergence conditions for this type of learning. The current model under consideration serves as a good illustration of its performance⁷. When the REE is fully stable, so one solution for $\bar{\omega}$ is in the unit circle and one is outside it⁸, locally at least, RLS learning will always converge to the unique stable REE. However, under indeterminacy, at most one of the two MSV solutions is locally stable under RLS learning and, indeed, in one non-null region of indeterminacy there is a zero probability of convergence to either of these two MSV solutions under RLS learning. This demonstrates that the learning method posited by E&H may fail catastrophically in certain circumstances and illustrates our claim above that statistical learning performs particularly badly under indeterminacy.

⁷ See Figure 8.7 of “Learning and Expectations in Macroeconomics” (G. W. Evans and Honkapohja 2001: 203).

When applying their work to real world data, E&H tend to switch from decreasing to constant gain, both to allow for structural breaks and because in the real world agents die, taking their accumulated knowledge with them. The convergence properties of constant gain learning are more complicated, as even in the limit the estimated parameters will be stochastic, which in certain circumstances can cause periodic jumps from one basin of attraction (i.e. an REE solution) to another. Nevertheless, they prove that in certain circumstances even constant gain learning will converge in the mean to an REE solution.

1.3.3. Problems with Evans and Honkapohja’s work

The chief problem with E&H’s approach to learning lies in its fundamental misspecification. They attempt to justify this by noting that “the misspecification may not even be statistically detectable during the transition [to a steady state]” (G. W. Evans and Honkapohja 2001: 32), but this will certainly fail to hold in situations in which RLS learning does not even converge. In these circumstances, surely even the least rational agents would realise their misspecification. Worse still, this criticism applies not just to regions in which RLS fails to converge to anything, but also to those in which some, but not all, stationary REE have a basin of attraction under RLS, such as those described above. To see this, suppose that we are in an economy of this AR(1) form with parameters in an indeterminate region in which the lower solution is uniquely stable under RLS, and suppose that until period $t$, all the agents had full information and were forming expectations in line with the higher of the two REE solutions. If from period $t$ onwards these fully informed agents started slowly dying and being replaced with uninformed agents of infinite lifespan, then we would expect the economy still to remain near its original REE, as the uninformed agents should be able to learn the equilibria the informed agents had been playing until that point. However, if the uninformed agents were learning by RLS, then their probability of convergence to the larger solution would still be zero, providing the informed agents all died off in a finite period.

E&H wish to use RLS convergence as a justification for picking one REE rather than another. However, given that even boundedly rational agents would realise RLS was failing in such circumstances, at best, they have shown criteria for RLS being an acceptable approximation to learning.

Additional problems are caused by E&H’s reliance on constant gain learning in order to get empirical pr

reasonable model of aggregated expectations. For example, if we take the continuous time version of the model described in § 1.3.1, then if $p_a$ is the density of people of age $a$ in the population, for (continuous time) RLS to aggregate to (continuous time) constant gain learning, it is easy to see that we require $\int_k^\infty \frac{p_a}{a}\,da = \lambda e^{-\lambda k}$, since these are the contributions of the $x_{t-k}$ data point to aggregated RLS and constant gain learning respectively. This can only hold if $p_a = \lambda^2 a e^{-\lambda a}$, which our numerical calibrations have shown to be a poor model of actual data: in particular, it requires there to be far too many over 80s as this distribution has relatively fat tails. Therefore, in general we expect the dynamics in a population of agents, all of whom are learning by RLS, to differ substantially from the dynamics under constant gain learning.
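The aggregation condition can be checked numerically. The sketch below integrates $p_a / a$ with a simple midpoint rule and compares it with the constant gain weight $\lambda e^{-\lambda k}$; it also evaluates the share of the population older than a given age under the implied density. The value of $\lambda$, the truncation point of the integral and the function names are assumptions for illustration, and this is not the calibration exercise referred to in the text.

```python
import math


def age_density(a, lam):
    """Required age density p_a = lam^2 * a * exp(-lam * a)."""
    return lam ** 2 * a * math.exp(-lam * a)


def aggregated_rls_weight(k, lam, upper=400.0, steps=20000):
    """Midpoint-rule approximation to the integral of p_a / a over [k, upper]."""
    h = (upper - k) / steps
    total = 0.0
    for i in range(steps):
        a = k + (i + 0.5) * h
        total += age_density(a, lam) / a
    return total * h


def constant_gain_weight(k, lam):
    """Weight lam * exp(-lam * k) placed on the x_{t-k} data point."""
    return lam * math.exp(-lam * k)


def share_older_than(age, lam):
    """Population share above `age`: the closed-form tail of the density."""
    return (1.0 + lam * age) * math.exp(-lam * age)


LAM = 0.05  # hypothetical gain, implying a mean age of 2 / LAM = 40
gaps = [
    abs(aggregated_rls_weight(k, LAM) - constant_gain_weight(k, LAM))
    for k in (0.0, 5.0, 20.0)
]
over_80_share = share_older_than(80.0, LAM)
```

Comparing `over_80_share` with the corresponding share in actual demographic data, for whatever value of $\lambda$ is of interest, is one way to gauge how restrictive the required age distribution is.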

Both our claim that stability under RLS learning cannot be validly used as an equilibrium selection device and our claim that it is invalid to use constant gain learning as an approximation to aggregate learning are fundamental criticisms of the E&H approach. A perhaps yet more damning one, though, comes from our suggestion that the only reasonable model may be that expectations are rational in aggregate, given the expectational transmission mechanisms present in the economy, and given the many agents who have strong financial incentives for rationality. This approach of full rationality but partial information is what we pursue in this thesis.

1.4. Full rationality, limited information

That economic agents may be fully rational and yet not have full information is certainly not a new idea. There have been substantial tranches of literature devoted to learning in general equilibrium and learning in games. Two fairly comprehensive surveys are Blume and Easley (1993) and Blume et al. (1982). The “rational” part enters from the use of Bayes’ Law for the updating of beliefs. If one accepts the Savage axioms (Savage 1954) as defining rationality, then Bayesian learning is the only rational kind of learning there is. Though far from uncontroversial, for the duration of this thesis we will suppose the Savage axioms are a given, so “rational learning” and “Bayesian learning” are synonymous.

The first thing to note is that much of the existing literature has been concerned with estimating unobserved variables rather than estimating the model’s parameters. This covers estimating current values of

estimating the permanent component of variables subject to transitory shocks. The fully general solution to this under homogeneous beliefs in a macroeconomic linear REE context was given in Pearlman et al. (1986), and is (broadly) based on Kalman filter methods (Kalman 1960). Since we are attempting to answer the same question as E&H, ours is an entirely different problem to this and Kalman filtering techniques will not be applicable. That said, future work could examine learning under uncertainty both about unobserved variables and about the model’s parameters.

Another thing to note is that a good deal of the literature deals with heterogeneity in beliefs and hence in expectations, the most famous example of which is Townsend (1983), which deals with this in an unobserved variables context. In assuming homogeneity, we will escape many issues connected with this.

Another source of apparent complication in the existing literature is the placing of learning within the context of a very specific general equilibrium model that has not gone through the usual macroeconomic “mashing” process of log-linearization, assumed certainty equivalence etc. to get it into a standard linear expectational reduced form. This means that learning is very closely tied in to the particular agent doing the learning and that inter-temporal optimization needs to take into account how beliefs might be revised in future. Townsend (1978) and the subsequent literature it spawned all fall into this category.

A significant explanation for the success of the E&H approach to learning is that it is entirely generic and plugs straight into the linear expectational reduced form, which would normally be calculated anyway in order to find the full information REE in an analytically tractable way. Admittedly, there are some very good reasons, when one is concerned with modelling strict rationality, for not log-linearizing and assuming certainty equivalence, since at best the reduced form that results is a local approximation to the true behaviour described by the model. However, many of these reasons are just as valid under full information as they are under partial, and yet few quibble with the ascription of “rationality” to the full information REE solution that results from solving the reduced form. In light of this, we will be solely concerned

descriptions of the micro-founded models from which they arose⁹. This means that, much as in E&H, learning will be performed by a representative agent and will be unrelated to utility.

To the best of our knowledge, the problem of forming partial information rational expectations (in the macroeconomic sense) has never been addressed. In particular, it is the combination of parameter learning and having to choose expectations in order to (attempt to) stay on the stable path that is novel. There has been some literature on the related problem of optimal control under parameter uncertainty, including Prescott (1972), Easley and Kiefer (1988) and Kiefer and Nyarko (1989), but the complications present in these papers (chiefly coming from trade-offs between learning speed and the control target) do not give any great insights into the problems we will encounter below, which is unsurprising since our learning is utility independent and our “control target” is binary (“end up on the stable path” or “don’t”). Our task is made particularly difficult by the fact that if agents are far enough off the stable path then they may never be able to return to it, even if they later know better where it is, since expectational errors must be unpredictable from the period in which the expectations were formed.

1.5. The model

1.5.1. Core details

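We will be solely concerned with models with the standard $t$-dated expectations form:

\[ R_1 y_t = S\,\mathbb{E}_t y_{t+1} + T_1 y_{t-1} + W z_t + \lambda_y + \gamma_y t + \nu_{y,t} \]

\[ R_2 z_t = T_2 z_{t-1} + \lambda_z + \gamma_z t + \nu_{z,t} \]

\[ \nu_t = \begin{bmatrix} \nu_{y,t} \\ \nu_{z,t} \end{bmatrix} \sim \mathrm{NIID}(0, \Xi) \]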

⁹ The approximation implicit in this is close to what Cogley and Sargent (2008) call an “anticipated-utility” model, after Kreps (1998). In these models, agents treat parameters as uncertain when learning, but as constants when forming decisions. They show that at least in their model the anticipated utility approximation is close to the fully rational solution. Our agents are slightly more sophisticated than this, though, because they only treat expectations as constants when forming decisions. The formation of the actual expectation each period will fully account for

where $y_t$ is a vector of endogenous variables (in the sense that they can be influenced by expectations) and $z_t$ is a vector of exogenous variables (in the sense that they are not affected by expectations). A large proportion of DSGE models take this form, which justifies our focus on it, and as in the standard REE literature, we shall assume agents have homogeneous beliefs. However, unlike this literature, we shall not assume that agents are aware of the entire past history of the economy before their “birth”¹⁰, or that they know $R_1, R_2, S, T_1, T_2, W, \Xi, \lambda, \gamma$ with certainty; in fact we will not even assume agents know which variables are exogenous. We do however assume that all agents ascribe probability 1 to all variables asymptotically growing at a sub-exponential rate, i.e. that for all $s \in \mathbb{Z}$, there is some polynomial $p_s(t)$ such that as $t \to \infty$, $\mathbb{E}_s x_t - p_s(t) \to 0$. This could be justified by assuming that agents are reluctant to assign probability to the future being significantly different from the past. We have included a linear time trend in this core model to allow for growth, as even removing a linear trend is not a trivial operation in small samples when there is uncertainty about other parameters as well.

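This model can be simplified if we let $x_t := \begin{bmatrix} y_t \\ z_t \end{bmatrix}$ and assume $R_2$ is invertible, as then:

\[ C x_t = A\,\mathbb{E}_t x_{t+1} + B x_{t-1} + \mu + \delta t + \varepsilon_t \tag{1.1} \]

where

\[ A = \begin{bmatrix} S & 0 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} T_1 & W R_2^{-1} T_2 \\ 0 & R_2^{-1} T_2 \end{bmatrix}, \quad C = \begin{bmatrix} R_1 & 0 \\ 0 & I \end{bmatrix}, \quad \mu = \begin{bmatrix} \lambda_y + W R_2^{-1} \lambda_z \\ R_2^{-1} \lambda_z \end{bmatrix}, \quad \delta = \begin{bmatrix} \gamma_y + W R_2^{-1} \gamma_z \\ R_2^{-1} \gamma_z \end{bmatrix} \]

and where

\[ \varepsilon_t = \begin{bmatrix} \nu_{y,t} + W R_2^{-1} \nu_{z,t} \\ R_2^{-1} \nu_{z,t} \end{bmatrix} \sim \mathrm{NIID}(0, \Sigma), \quad \text{where } \Sigma = \begin{bmatrix} I & W R_2^{-1} \\ 0 & R_2^{-1} \end{bmatrix} \Xi \begin{bmatrix} I & 0 \\ R_2^{-1\prime} W' & R_2^{-1\prime} \end{bmatrix}. \]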

We will take this equation as our general form from here on. This is valid as in general agents are uncertain which variables are exogenous, so there are no restrictions they can place with certainty on the structure of this equation’s parameters.
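The substitution that leads to (1.1) can be verified for the special case in which $y_t$ and $z_t$ are both scalars. The sketch below builds $A$, $B$, $C$, $\mu$ and $\delta$ from the structural parameters and checks that values of $y_t$ and $z_t$ satisfying the two structural equations also satisfy (1.1) row by row. All numerical values and names are hypothetical, and the expectation terms are treated as given numbers rather than being solved for.

```python
def reduce_scalar_model(R1, R2, S, T1, T2, W, lam_y, lam_z, gam_y, gam_z):
    """Matrices of (1.1) when y_t and z_t are scalars (R2 must be non-zero)."""
    k = W / R2  # W R2^{-1} in the scalar case
    return {
        "A": [[S, 0.0], [0.0, 0.0]],
        "B": [[T1, k * T2], [0.0, T2 / R2]],
        "C": [[R1, 0.0], [0.0, 1.0]],
        "mu": [lam_y + k * lam_z, lam_z / R2],
        "delta": [gam_y + k * gam_z, gam_z / R2],
    }


def matvec(M, v):
    """Product of a 2x2 matrix (nested lists) and a length-2 vector."""
    return [M[i][0] * v[0] + M[i][1] * v[1] for i in range(2)]


# Hypothetical structural parameters, for illustration only.
P = dict(R1=1.2, R2=0.8, S=0.6, T1=0.3, T2=0.5, W=0.4,
         lam_y=0.1, lam_z=0.2, gam_y=0.01, gam_z=0.02)
m = reduce_scalar_model(**P)

# Arbitrary state, expectations and shocks; E_t z_{t+1} is irrelevant because
# the second column of A is zero.
y_prev, z_prev, t = 1.0, -0.5, 7
Ey_next, Ez_next = 0.9, 123.0
nu_y, nu_z = 0.05, -0.03

# Solve the two structural equations for z_t and then y_t.
z_now = (P["T2"] * z_prev + P["lam_z"] + P["gam_z"] * t + nu_z) / P["R2"]
y_now = (P["S"] * Ey_next + P["T1"] * y_prev + P["W"] * z_now
         + P["lam_y"] + P["gam_y"] * t + nu_y) / P["R1"]

# Reduced-form shock and both sides of (1.1).
eps = [nu_y + P["W"] * nu_z / P["R2"], nu_z / P["R2"]]
lhs = matvec(m["C"], [y_now, z_now])
Ax = matvec(m["A"], [Ey_next, Ez_next])
Bx = matvec(m["B"], [y_prev, z_prev])
rhs = [Ax[i] + Bx[i] + m["mu"][i] + m["delta"][i] * t + eps[i] for i in range(2)]
residuals = [lhs[i] - rhs[i] for i in range(2)]
```

Both entries of `residuals` should be zero up to rounding error, which confirms that nothing is lost in moving from the structural pair of equations to the single stacked equation.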

¹⁰ This can better be thought of as a model of a major structural change to the economy in period 𝒷 − 1, after which everyone has to start their learning again from scratch. A major change in political institutions or central bank monetary policy regime is the usual example. In future work we will give “birth” its more literal meaning and assess

1.5.2. Canonical form

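Let us now define the innovation process by $\eta_t := x_t - \mathbb{E}_{t-1} x_t$ for all $t \in \mathbb{Z}$. We can stack this definition together with (1.1) to get the canonical form:

\[ \begin{bmatrix} C & -A \\ I & 0 \end{bmatrix} \begin{bmatrix} x_t \\ \mathbb{E}_t x_{t+1} \end{bmatrix} = \begin{bmatrix} B & 0 \\ 0 & I \end{bmatrix} \begin{bmatrix} x_{t-1} \\ \mathbb{E}_{t-1} x_t \end{bmatrix} + \begin{bmatrix} \mu \\ 0 \end{bmatrix} + \begin{bmatrix} \delta \\ 0 \end{bmatrix} t + \begin{bmatrix} I \\ 0 \end{bmatrix} \varepsilon_t + \begin{bmatrix} 0 \\ I \end{bmatrix} \eta_t \]

So defining $v_t = \begin{bmatrix} x_t \\ \mathbb{E}_t x_{t+1} \end{bmatrix}$, $\Gamma_0 = \begin{bmatrix} C & -A \\ I & 0 \end{bmatrix}$, $\Gamma_1 = \begin{bmatrix} B & 0 \\ 0 & I \end{bmatrix}$, $\tilde{\mu} = \begin{bmatrix} \mu \\ 0 \end{bmatrix}$, $\tilde{\delta} = \begin{bmatrix} \delta \\ 0 \end{bmatrix}$, $\Psi = \begin{bmatrix} I \\ 0 \end{bmatrix}$ and $\Pi = \begin{bmatrix} 0 \\ I \end{bmatrix}$ we have:

\[ \Gamma_0 v_t = \Gamma_1 v_{t-1} + \tilde{\mu} + \tilde{\delta} t + \Psi \varepsilon_t + \Pi \eta_t \tag{1.2} \]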

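Beyond requiring that $v_t = \begin{bmatrix} x_t \\ \mathbb{E}_t x_{t+1} \end{bmatrix}$, our solution method will not depend at all on the precise internal block structure of $\Gamma_0, \Gamma_1, \tilde{\mu}, \tilde{\delta}, \Psi$ and $\Pi$. However, it is worth noting that if $A$ is invertible then we can pre-multiply by $\Gamma_0^{-1} = \begin{bmatrix} 0 & I \\ -A^{-1} & A^{-1} C \end{bmatrix}$ giving:

\[
v_t = \begin{bmatrix} 0 & I \\ -A^{-1} B & A^{-1} C \end{bmatrix} v_{t-1}
+ \begin{bmatrix} 0 \\ -A^{-1} \mu \end{bmatrix}
+ \begin{bmatrix} 0 \\ -A^{-1} \delta \end{bmatrix} t
+ \begin{bmatrix} 0 \\ -A^{-1} \end{bmatrix} \varepsilon_t
+ \begin{bmatrix} I \\ A^{-1} C \end{bmatrix} \eta_t \tag{1.3}
\]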
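The block structure of the canonical form can be checked numerically. The following is a minimal sketch, assuming $A$, $B$, $C$ are square NumPy arrays of equal size; the helper names `canonical_form` and `gamma0_inverse` are hypothetical and not part of the original text.

```python
import numpy as np

def canonical_form(A, B, C, mu, delta):
    """Build the stacked matrices of the canonical form (1.2) from those of (1.1)."""
    n = A.shape[0]
    I, O = np.eye(n), np.zeros((n, n))
    Gamma0 = np.block([[C, -A], [I, O]])
    Gamma1 = np.block([[B, O], [O, I]])
    mu_stacked = np.concatenate([mu, np.zeros(n)])
    delta_stacked = np.concatenate([delta, np.zeros(n)])
    Psi = np.vstack([I, O])      # loads the structural shock into the first block
    Pi = np.vstack([O, I])       # loads the expectational error into the second block
    return Gamma0, Gamma1, mu_stacked, delta_stacked, Psi, Pi

def gamma0_inverse(A, C):
    """Closed-form inverse of Gamma0; only valid when A is invertible."""
    n = A.shape[0]
    Ainv = np.linalg.inv(A)
    return np.block([[np.zeros((n, n)), np.eye(n)], [-Ainv, Ainv @ C]])
```

Multiplying `gamma0_inverse(A, C)` by `Gamma0` should return the identity up to rounding, which confirms the block inverse used to obtain (1.3).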

If $\eta_t$ is taken to be an arbitrary white noise process, then this is the full set of solutions including explosive ones. The challenge in both the full and partial information cases is to restrict $\eta_t$ in order to guarantee


2. Full information solution

We begin by solving the canonical form under full information. We do this both to introduce the mathematical machinery and because we wish eventually to find necessary and sufficient conditions for the expectational errors under partial information to converge to those under full information, which, unsurprisingly, requires a solution for these errors in both circumstances. We will also introduce the concept of a “Feasible Rational Expectations Equilibrium” in this chapter, without which finding the partial information REE would be incredibly difficult, if not impossible.

2.1. Information sets

In what follows, we will mark all variables that are different under full information by a superscript $*$. This is necessary to make it perfectly clear that $x_t$ (the economy’s state when everyone has limited information) is not the same random variable as $x_t^*$ (the economy’s state under full information). We will also denote expectations taken under this information set at $t$ by $\mathbb{E}_t^*$. So we replace $v_t$ by $v_t^* = \begin{bmatrix} x_t^* \\ \mathbb{E}_t^* x_{t+1}^* \end{bmatrix}$.

We suppose that everyone was born at time $-\infty$ and so knows the complete history of the economy (including contemporaneous values of $x_t^*$¹¹) and that they also know the values of $A, B, C, \Sigma, \mu, \delta$ with certainty. We suppose they know the data generating process for $\varepsilon_t$ and that $\Sigma$ is of full rank. Furthermore, we suppose that at $t$ agents know the value of $\zeta_t$, a vector of all the sunspot shocks that may possibly affect the economy. Additionally, we suppose that agents know arbitrary matrices $M_\varepsilon$ and $M_\zeta$ of size $(\dim x_t^* - q) \times \dim x_t^*$ and $(\dim x_t^* - q) \times \dim \zeta_t$ respectively (where $q$ is a known constant whose value will be defined later in terms of $A, B, C, \Sigma, \mu, \delta$), which determine the aggregation of sunspots

11

Allowing $x_t^*$ to be in the time $t$ information set is not completely uncontroversial, since in the real world data often takes a while to arrive. However this is not the level on which to incorporate such insights, since the micro-foundations of these models invariably use information sets in which $x_t$ is either observable or at least in equilibrium perfectly predictable at $t$. (For example in Calvo pricing models (Calvo 1983), firms set prices equal to a constant mark-up over nominal marginal cost, which itself depends on the actual aggregate price level that period.) We trust that micro-founded model builders would have written $\mathbb{E}_{t-1}$ instead of $\mathbb{E}_t$ if they did not think the agent in


variables into a combined sunspot term. We will require that $\mathbb{E}_{t-1}^* \zeta_t = 0$, which is to say that sunspots are unpredictable. We also assume that $\zeta_t$ is independent of all other random variables (so in particular $\mathbb{E}_{t-1}^* [\zeta_t \varepsilon_t'] = 0$). This assumption is harmless, as the actual sunspot term will be given by $M_\varepsilon \varepsilon_t + M_\zeta \zeta_t$.

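More precisely then, the time $t$ information set for all agents is given by:

\[
\begin{aligned}
\mathcal{I}_t^* := {} & \{x_s^*, \zeta_s\}_{s=-\infty}^{t} \cup \{A, B, C, \Sigma, \mu, \delta, M_\varepsilon, M_\zeta\} \cup \{\varepsilon_s \sim \mathrm{NIID}(0, \Sigma)\}_{s=-\infty}^{\infty} \cup \{\Sigma \text{ is of full rank}\} \\
& \cup \{\mathbb{E}\zeta_s = 0 \text{ and } \zeta_s \text{ is independent of } A, B, C, \Sigma, \mu, \delta, \varepsilon_t, \varepsilon_{t-1}, \dots, \varepsilon_{t+1}, \varepsilon_{t+2}, \dots\}_{s=-\infty}^{\infty} \\
& \cup \{\text{the economy's law of motion is of the form of (1.1)}\} \\
& \cup \{\text{the economy is asymptotically growing at a sub-exponential rate}\}
\end{aligned}
\]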

Note that we have not assumed that $\varepsilon_t, \varepsilon_{t-1}, \dots$ is in the $\mathcal{I}_t^*$ information set. This is because in the partial information case (where there is some uncertainty over $A, B, C, \Sigma, \mu, \delta$) it is very hard to justify assuming that $\varepsilon_t, \varepsilon_{t-1}, \dots$ is known at $t$; econometric data sources do not have series of shock values, rather econometricians estimate a theoretically justified model from output, inflation etc. and then infer estimates of the shock series. In addition, were $\varepsilon_t$ known at $t$, then after at most $3 \dim x_t + 2$ observations of $x_t$, $\mathbb{E}_t x_{t+1}$, $x_{t-1}$ and $\varepsilon_t$ the parameters $A, B, C, \mu$ and $\delta$ would be known with certainty (since $\Sigma$ is of full rank), which would be a rather poor model of “learning”, particularly as it would lead to all shocks being fully identified, something certainly not true in most macroeconomic contexts.

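Now despite $\varepsilon_t$ not being in $\mathcal{I}_t^*$, if we take expectations of (1.1), then we have:

\[ \mathbb{E}_t^* \varepsilon_t = C x_t^* - A\,\mathbb{E}_t^* x_{t+1}^* - B x_{t-1}^* - \mu - \delta t = \varepsilon_t \]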

Thus under the $\mathcal{I}_t^*$ information set agents will know $\varepsilon_t$ anyway. However, this result clearly relies on the inclusion of $A, B, C, \mu, \delta$ in $\mathcal{I}_t^*$; if there is any uncertainty at all as to their values then agents will not be able to work out $\varepsilon_t$ with certainty. In light of this, and since we are chiefly concerned with learning in this thesis, we will be particularly interested in REE in which $\mathbb{E}_t^* x_{t+1}^*$ is expressible as linear in


“Feasible Rational Expectations Equilibria” or FREE¹². It is worth pointing out that trivially the MSV solution is always feasible in this sense, since it will only include contemporaneous shocks.

2.2. The univariate special case

We commence with an analysis of the univariate case. This provides a gentle introduction to the mathematical methods and the procedure for finding FREE solutions, and gives a convenient way of checking our algebra in the harder cases. It also makes clear the limitations of the MSV solution method.

2.2.1. Stability analysis

Suppose temporarily that $x_t$ is one dimensional, so $A = a$, $B = b$ and $C = c$ for some scalars $a$, $b$ and $c$. If $a = 0$, then the model is in AR(1) form and so there is a non-explosive solution if and only if $c = 0$ (in which case $\mathbb{E}_t^* x_{t+1}^* = 0$) or $\left|\frac{b}{c}\right| \le 1$ (in which case $\mathbb{E}_t^* x_{t+1}^* = \frac{b x_t^* + \mu + \delta (t+1)}{c}$).

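If $a \ne 0$, then from (1.3):

\[
\begin{bmatrix} x_t^* \\ \mathbb{E}_t^* x_{t+1}^* \end{bmatrix}
=
\begin{bmatrix} 0 & 1 \\ -\frac{b}{a} & \frac{c}{a} \end{bmatrix}
\begin{bmatrix} x_{t-1}^* \\ \mathbb{E}_{t-1}^* x_t^* \end{bmatrix}
+ \begin{bmatrix} 0 \\ -\frac{\mu}{a} \end{bmatrix}
+ \begin{bmatrix} 0 \\ -\frac{\delta}{a} \end{bmatrix} t
+ \begin{bmatrix} 0 \\ -\frac{1}{a} \end{bmatrix} \varepsilon_t
+ \begin{bmatrix} 1 \\ \frac{c}{a} \end{bmatrix} \eta_t^* \tag{2.1}
\]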

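The eigenvalues $\omega_1$, $\omega_2$ of $\begin{bmatrix} 0 & 1 \\ -\frac{b}{a} & \frac{c}{a} \end{bmatrix}$ satisfy $\omega^2 - \frac{c}{a}\omega + \frac{b}{a} = 0$, so:

\[ \omega_1 = \frac{c - \sqrt{c^2 - 4ab}}{2a}, \quad \omega_2 = \frac{c + \sqrt{c^2 - 4ab}}{2a} \]

If $|\omega_1| \le 1$ and $|\omega_2| \le 1$ then the system is stable¹³, so expectations are indeterminate. If precisely one eigenvalue satisfies $|\omega| \le 1$, then the system is saddle path stable and expectations will be determinate. If $|\omega_1| > 1$ and $|\omega_2| > 1$ then the system is unstable independent of expectations.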
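The three cases can be told apart by counting roots with modulus at most one. The following is a minimal sketch, assuming $a \ne 0$; the function name `classify_univariate` and the label strings are hypothetical. Roots lying exactly on the unit circle are sensitive to floating point rounding, so borderline cases deserve a tolerance in practice.

```python
import numpy as np

def classify_univariate(a, b, c):
    """Classify the univariate model (a != 0) by the moduli of the two roots."""
    if a == 0:
        raise ValueError("a must be non-zero; the a = 0 case is treated separately")
    root = np.sqrt(complex(c * c - 4 * a * b))   # complex sqrt handles a negative discriminant
    w1, w2 = (c - root) / (2 * a), (c + root) / (2 * a)
    n_stable = int(abs(w1) <= 1) + int(abs(w2) <= 1)
    label = {2: "stable (indeterminate)",
             1: "saddle-path stable (determinate)",
             0: "explosive"}[n_stable]
    return w1, w2, label
```

For instance, `classify_univariate(1.0, 0.5, 2.0)` has one root inside and one outside the unit circle, so it falls in the saddle-path case.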

12

It may be objected that for an REE to be feasible, in fact $\mathbb{E}_t^* x_{t+1}^*$ should not even depend on $\zeta_t, \zeta_{t-1}, \dots$. There is certainly some validity to this objection, but the direct observability of $\zeta_t$ may be justified by noting that the source of $\zeta_t$’s variance is in some sense a choice variable, since expectations are. We may think of agents as calculating the determinate parts of their expectations and then choosing to use e.g. the deviation between the expected and


Note that when $c^2 - 4ab < 0$, both eigenvalues are complex and $|\omega_1|^2 = |\omega_2|^2 = \frac{b}{a}$. Thus, in this case, the system will be stable and indeterminate if $\frac{b}{a} \le 1$ and explosive otherwise.

When $0 \le c^2 - 4ab$, both eigenvalues are real. In this case $|\omega_1| = 1$ if and only if $|\omega_2| = 1$ if and only if $c = a + b$ or $c = -a - b$. Now $\frac{\partial |\omega_1|^2}{\partial c} \le 0$ and $\frac{\partial |\omega_2|^2}{\partial c} \ge 0$. Thus $|\omega_1| \le 1$ if and only if $c \ge -|a + b|$ and $|\omega_2| \le 1$ if and only if $c \le |a + b|$.

2.2.2. Fully stable cases

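In the fully stable cases either $c^2 - 4ab < 0$ and $\frac{b}{a} \le 1$ or $0 \le c^2 - 4ab$ and $-|a + b| \le c \le |a + b|$. In these cases rational expectations impose no restrictions on $\eta_t^*$, so the full set of solutions satisfies $\eta_t^* = m_\varepsilon \varepsilon_t + m_\zeta' \zeta_t$, where $m_\varepsilon = M_\varepsilon$ is a scalar and $m_\zeta' = M_\zeta$ is a row vector (i.e. in this case, $q = 1$). We are particularly interested in FREE solutions in which $\mathbb{E}_t x_{t+1}$ does not depend on $\varepsilon_t, \varepsilon_{t-1}, \dots$. We can accomplish this if we are prepared to further restrict $m_\varepsilon$. In particular, if we assume $m_\varepsilon \ne 0$ then $\varepsilon_t = \frac{\eta_t^* - m_\zeta' \zeta_t}{m_\varepsilon}$ so from the bottom row of (2.1) and the definition of $\eta_t^*$, the FREE solutions satisfy:

\[
\begin{aligned}
\mathbb{E}_t^* x_{t+1}^* &= -\frac{b}{a} x_{t-1}^* + \frac{c}{a} \mathbb{E}_{t-1}^* x_t^* - \frac{\mu}{a} - \frac{\delta}{a} t - \frac{1}{a} \frac{\eta_t^* - m_\zeta' \zeta_t}{m_\varepsilon} + \frac{c}{a} \eta_t^* \\
&= \frac{1}{a}\left(c - \frac{1}{m_\varepsilon}\right) x_t^* - \frac{b}{a} x_{t-1}^* + \frac{1}{a m_\varepsilon} \mathbb{E}_{t-1}^* x_t^* - \frac{\mu}{a} - \frac{\delta}{a} t + \frac{1}{a} \frac{m_\zeta'}{m_\varepsilon} \zeta_t
\end{aligned}
\]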
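The equivalence of the two expressions for the FREE expectation can be confirmed by simulation. The following is a minimal sketch, assuming a scalar sunspot and standard normal shocks; the function name `max_free_gap`, the starting values and the shock distributions are illustrative assumptions only.

```python
import numpy as np

def max_free_gap(a, b, c, mu, delta, m_eps, m_zeta, T=50, seed=0):
    """Largest gap between the bottom row of (2.1) and its FREE rewriting."""
    rng = np.random.default_rng(seed)
    x_prev, ex_prev = 0.0, 0.0          # x_{t-1} and E_{t-1} x_t, arbitrary starting values
    worst = 0.0
    for t in range(T):
        eps, zeta = rng.normal(), rng.normal()
        eta = m_eps * eps + m_zeta * zeta           # expectational error
        x = ex_prev + eta                           # top row of (2.1)
        # bottom row of (2.1): depends on the unobserved shock eps
        ex = (-b * x_prev + c * ex_prev - mu - delta * t - eps + c * eta) / a
        # FREE form: uses only x_t, x_{t-1}, E_{t-1} x_t and the sunspot zeta_t
        ex_free = ((c - 1 / m_eps) * x - b * x_prev + ex_prev / m_eps
                   - mu - delta * t + (m_zeta / m_eps) * zeta) / a
        worst = max(worst, abs(ex - ex_free))
        x_prev, ex_prev = x, ex
    return worst
```

With a fully stable parameterisation such as `a=1, b=0.5, c=0.5` and any non-zero `m_eps`, the returned gap should be at the level of rounding error.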

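The condition that $m_\varepsilon \ne 0$ is also necessary for the existence of a FREE. To see this suppose for a contradiction that $m_\varepsilon = 0$ but that:

\[ \mathbb{E}_t^* x_{t+1}^* = 𝓇 x_t^* + 𝓈 \zeta_t + \text{other terms known at } t - 1 \]

Then $0 = \mathrm{Cov}_{t-1}(\eta_t^*, \varepsilon_t) = \mathrm{Cov}_{t-1}(x_t^*, \varepsilon_t)$, so we also have: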

13

In the sense of exhibiting polynomially bounded, i.e. non-explosive, growth. We are thus treating unit roots as stable. This is valid given our particular definition of explosiveness since expectations of a unit root process, though time dependent, are nonetheless polynomial. For example if $x_t = x_{t-1} + 1 + t + \varepsilon_t$ then $\mathbb{E}_t x_{t+k} = x_t + k + kt + \frac{1}{2} k (k+1)$


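\[
\begin{aligned}
0 &= \mathrm{Cov}_{t-1}(c x_t^*, \varepsilon_t) = \mathrm{Cov}_{t-1}(a \mathbb{E}_t^* x_{t+1}^* + b x_{t-1}^* + \mu + \delta t + \varepsilon_t, \varepsilon_t) = \mathrm{Cov}_{t-1}(a \mathbb{E}_t^* x_{t+1}^* + \varepsilon_t, \varepsilon_t) \\
&= a 𝓇\, \mathrm{Cov}_{t-1}(x_t^*, \varepsilon_t) + a 𝓈\, \mathrm{Cov}_{t-1}(\zeta_t, \varepsilon_t) + \mathrm{Var}_{t-1} \varepsilon_t = \Sigma
\end{aligned}
\]

However $\Sigma$ is of full rank, so we have a contradiction from $0 = \Sigma \ne 0$.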

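To obtain the general solution for $x_t^*$, we instead use the definition of $\eta_t^*$ to replace the expectational terms in the bottom row of (2.1), which implies:

\[ x_{t+1}^* = \frac{c}{a} x_t^* - \frac{b}{a} x_{t-1}^* - \frac{\mu}{a} - \frac{\delta}{a} t + m_\varepsilon \varepsilon_{t+1} - \frac{1}{a} \varepsilon_t + m_\zeta' \zeta_{t+1} \tag{2.2} \]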

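This is an ARMAX(2,1,1) process and thus is more general than the usual “MSV” AR(1) one. To show that generically these two forms are not equivalent we suppose there exist $\mathcal{A}, \mathcal{C}, \mathcal{D}, \mathcal{M}_\varepsilon, \mathcal{M}_\zeta$ such that:

\[ x_{t+1}^* = \mathcal{A} x_t^* + \mathcal{C} + \mathcal{D} t + \mathcal{M}_\varepsilon \varepsilon_{t+1} + \mathcal{M}_\zeta \zeta_{t+1} \]

(This is the sunspot augmented MSV form.) So for any $\mathcal{B}$:

\[
\begin{aligned}
x_{t+1}^* = {} & (\mathcal{A} - \mathcal{B}) x_t^* + \mathcal{B}\mathcal{A} x_{t-1}^* + \mathcal{C} + \mathcal{B}\mathcal{C} - \mathcal{B}\mathcal{D} + \mathcal{D}(1 + \mathcal{B}) t + \mathcal{M}_\varepsilon \varepsilon_{t+1} + \mathcal{B}\mathcal{M}_\varepsilon \varepsilon_t \\
& + \mathcal{M}_\zeta \zeta_{t+1} + \mathcal{B}\mathcal{M}_\zeta \zeta_t
\end{aligned} \tag{2.3}
\]

For this to be equivalent to (2.2) we must be able to equate terms, which at least requires that $\mathcal{B}\mathcal{M}_\zeta = 0$. If $\mathcal{B} = 0$, then the $\mathcal{B}\mathcal{M}_\varepsilon \varepsilon_t$ term disappears, whereas an $\varepsilon_t$ term is always present in (2.2); thus in fact we must have $\mathcal{M}_\zeta = 0$, which can only possibly hold if $m_\zeta' = 0$ too. When this is the case, equating terms we have:

\[
\mathcal{A} - \mathcal{B} = \frac{c}{a}, \quad \mathcal{B}\mathcal{A} = -\frac{b}{a}, \quad \mathcal{C} + \mathcal{B}\mathcal{C} - \mathcal{B}\mathcal{D} = -\frac{\mu}{a}, \quad \mathcal{D}(1 + \mathcal{B}) = -\frac{\delta}{a}, \quad \mathcal{M}_\varepsilon = m_\varepsilon, \quad \mathcal{B}\mathcal{M}_\varepsilon = -\frac{1}{a}
\]

\[ \Rightarrow \quad \mathcal{B} = -\frac{1}{a m_\varepsilon} \quad \text{and} \quad \mathcal{A} = b m_\varepsilon \]

But then from the first equation $b m_\varepsilon + \frac{1}{a m_\varepsilon} = \frac{c}{a}$, so this can only hold if we are also prepared to restrict $m_\varepsilon$, illustrating how many solutions are ruled out by the imposition of the MSV form.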
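To see how restrictive this is, the condition can be multiplied through by $a m_\varepsilon$ to give the quadratic $a b m_\varepsilon^2 - c m_\varepsilon + 1 = 0$, which has at most two solutions. The following is a minimal sketch, assuming $a \ne 0$ and $b \ne 0$; the function name `msv_compatible_m_eps` is hypothetical.

```python
import numpy as np

def msv_compatible_m_eps(a, b, c):
    """Values of m_eps satisfying b*m + 1/(a*m) = c/a, i.e. a*b*m**2 - c*m + 1 = 0."""
    if a == 0 or b == 0:
        raise ValueError("the quadratic requires a != 0 and b != 0")
    return np.roots([a * b, -c, 1.0])
```

Each root $m_\varepsilon$ gives an AR coefficient $b m_\varepsilon$ that coincides with one of the two eigenvalues of the transition matrix in (2.1), so of the continuum of admissible $m_\varepsilon$ only these isolated values survive the MSV restriction.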

2.2.3. Saddle-path stable cases

In the saddle-path stable cases $0 \le c^2 - 4ab$ and either $c < -|a + b|$ (for $|\omega_1| > 1$) or $c > |a + b|$ (for


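Schur decomposition¹⁴ (Horn and Johnson 1985: 79) of $\begin{bmatrix} 0 & 1 \\ -\frac{b}{a} & \frac{c}{a} \end{bmatrix}$ there exist possibly complex matrices $Z$ and $\Omega$, where $Z$ is unitary¹⁵ and $\Omega$ is upper triangular with $\omega_1$ and $\omega_2$ on its diagonal such that:

\[
\begin{bmatrix} 0 & 1 \\ -\frac{b}{a} & \frac{c}{a} \end{bmatrix} = Z \Omega Z^H =
\begin{bmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \end{bmatrix}
\begin{bmatrix} \omega_1 & \omega_{12} \\ 0 & \omega_2 \end{bmatrix}
\begin{bmatrix} z_{11}^H & z_{21}^H \\ z_{12}^H & z_{22}^H \end{bmatrix}
\]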

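where $Z^H$ denotes the Hermitian or conjugate transpose of $Z$. We note the following implied identities that will prove useful below:

\[
\begin{bmatrix} -z_{21}^H \frac{b}{a} & z_{11}^H + z_{21}^H \frac{c}{a} \\ -z_{22}^H \frac{b}{a} & z_{12}^H + z_{22}^H \frac{c}{a} \end{bmatrix}
= Z^H \begin{bmatrix} 0 & 1 \\ -\frac{b}{a} & \frac{c}{a} \end{bmatrix} = \Omega Z^H =
\begin{bmatrix} \omega_1 z_{11}^H + \omega_{12} z_{12}^H & \omega_1 z_{21}^H + \omega_{12} z_{22}^H \\ \omega_2 z_{12}^H & \omega_2 z_{22}^H \end{bmatrix} \tag{2.4}
\]

\[
\begin{bmatrix} z_{21} & z_{22} \\ -z_{11} \frac{b}{a} + z_{21} \frac{c}{a} & -z_{12} \frac{b}{a} + z_{22} \frac{c}{a} \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ -\frac{b}{a} & \frac{c}{a} \end{bmatrix} Z = Z \Omega =
\begin{bmatrix} z_{11} \omega_1 & z_{11} \omega_{12} + z_{12} \omega_2 \\ z_{21} \omega_1 & z_{21} \omega_{12} + z_{22} \omega_2 \end{bmatrix} \tag{2.5}
\]

A third identity follows from $Z$’s unitarity, namely:

\[
\frac{1}{|Z|} \begin{bmatrix} z_{22} & -z_{12} \\ -z_{21} & z_{11} \end{bmatrix} = Z^{-1} = Z^H = \begin{bmatrix} z_{11}^H & z_{21}^H \\ z_{12}^H & z_{22}^H \end{bmatrix} \tag{2.6}
\]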
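The three identities are easy to confirm numerically. The following is a minimal sketch, assuming SciPy's `schur` routine with `output="complex"` and `sort="iuc"` (which places eigenvalues inside the unit circle first, so that the stable root plays the role of $\omega_1$); the function name `schur_univariate` is hypothetical.

```python
import numpy as np
from scipy.linalg import schur

def schur_univariate(a, b, c):
    """Complex Schur form M = Z @ Omega @ Z^H of the transition matrix in (2.1)."""
    M = np.array([[0.0, 1.0], [-b / a, c / a]])
    # sdim counts the eigenvalues selected by the sort criterion (inside the unit circle)
    Omega, Z, sdim = schur(M, output="complex", sort="iuc")
    return M, Omega, Z, sdim
```

With `Z` in hand, (2.4) and (2.5) amount to `Z.conj().T @ M == Omega @ Z.conj().T` and `M @ Z == Z @ Omega`, while (2.6) says the adjugate of `Z` divided by its determinant equals `Z.conj().T`.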

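Now if we let $w_t^* := Z^H \begin{bmatrix} x_t^* \\ \mathbb{E}_t^* x_{t+1}^* \end{bmatrix}$ and we pre-multiply (2.1) by $Z^H$ then we have:

\[
w_t^* = \begin{bmatrix} \omega_1 & \omega_{12} \\ 0 & \omega_2 \end{bmatrix} w_{t-1}^*
+ Z^H \begin{bmatrix} 0 \\ -\frac{\mu}{a} \end{bmatrix}
+ Z^H \begin{bmatrix} 0 \\ -\frac{\delta}{a} \end{bmatrix} t
+ Z^H \begin{bmatrix} 0 \\ -\frac{1}{a} \end{bmatrix} \varepsilon_t
+ Z^H \begin{bmatrix} 1 \\ \frac{c}{a} \end{bmatrix} \eta_t^* \tag{2.7}
\]

The bottom row of this is given by:

\[ w_{2,t}^* = \omega_2 w_{2,t-1}^* - z_{22}^H \frac{\mu}{a} - z_{22}^H \frac{\delta}{a} t - z_{22}^H \frac{1}{a} \varepsilon_t + \left( z_{12}^H + z_{22}^H \frac{c}{a} \right) \eta_t^* \]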

Since $|\omega_2| > 1$, this equation is explosive, so we solve forward following Sims (2002: 9), giving $\forall k \in \mathbb{N}$:

14

We could as well have just diagonalized $\begin{bmatrix} 0 & 1 \\ -\frac{b}{a} & \frac{c}{a} \end{bmatrix}$ in the usual way, but by using the Schur decomposition here we hope to make the comparison between the univariate and non-univariate cases clearer.

15


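\[
w_{2,t}^* = \omega_2^{-k} w_{2,t+k}^* - \sum_{s=1}^{k} \omega_2^{-s} \left[ z_{12}^H \eta_{t+s}^* - z_{22}^H \left( \frac{\mu}{a} + \frac{\delta}{a} (t+s) + \frac{1}{a} \varepsilon_{t+s} - \frac{c}{a} \eta_{t+s}^* \right) \right]
\]

Taking $t$ dated expectations then gives:

\[
w_{2,t}^* = \mathbb{E}_t^* w_{2,t}^* = \omega_2^{-k} \mathbb{E}_t^* w_{2,t+k}^* + \sum_{s=1}^{k} \omega_2^{-s} z_{22}^H \left( \frac{\mu}{a} + \frac{\delta}{a} (t+s) \right)
\]

By assumption $\mathbb{E}_t^* w_{2,t+k}^*$ grows at an asymptotically polynomial rate in $k$ and thus is dominated by the geometric decay of $\omega_2^{-k}$. This means that in the limit as $k \to \infty$:

\[
w_{2,t}^* = \sum_{s=1}^{\infty} \omega_2^{-s} z_{22}^H \left( \frac{\mu}{a} + \frac{\delta}{a} (t+s) \right) = \frac{z_{22}^H}{a} \left[ \frac{\mu + \delta (t+1)}{\omega_2 - 1} + \frac{\delta}{(\omega_2 - 1)^2} \right]
\]

(where we have used standard formulae for geometric series, proved in the matrix case in appendix A, § 5). If we let $\phi_\mu := \frac{z_{22}^H}{a} \frac{\omega_2 (\mu + \delta) - \mu}{(\omega_2 - 1)^2}$, then we can write:

\[ w_{2,t}^* = \phi_\mu + \frac{z_{22}^H \delta t}{a (\omega_2 - 1)} \tag{2.8} \]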
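The closed form rests on the scalar identities $\sum_{s \ge 1} \omega^{-s} = \frac{1}{\omega - 1}$ and $\sum_{s \ge 1} s\,\omega^{-s} = \frac{\omega}{(\omega - 1)^2}$ for $|\omega| > 1$. The following is a minimal sketch that compares a truncated version of the infinite sum with (2.8); the function names and the truncation length `n_terms` are illustrative assumptions.

```python
import numpy as np

def w2_truncated_sum(mu, delta, t, omega2, a, z22H, n_terms=400):
    """Partial sum of omega2**(-s) * z22H * (mu/a + delta/a * (t + s)) for s = 1..n_terms."""
    s = np.arange(1, n_terms + 1).astype(float)
    return np.sum(omega2 ** (-s) * z22H * (mu + delta * (t + s)) / a)

def w2_closed_form(mu, delta, t, omega2, a, z22H):
    """Right-hand side of (2.8)."""
    phi_mu = z22H / a * (omega2 * (mu + delta) - mu) / (omega2 - 1) ** 2
    return phi_mu + z22H * delta * t / (a * (omega2 - 1))
```

The two functions should agree to rounding error whenever $|\omega_2|$ is comfortably above one, including for complex $\omega_2$.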

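Now conveniently¹⁶:

\[
\begin{bmatrix} 1 & -\frac{z_{11}^H a + z_{21}^H c}{z_{12}^H a + z_{22}^H c} \end{bmatrix} Z^H \begin{bmatrix} 1 \\ \frac{c}{a} \end{bmatrix}
= z_{11}^H + z_{21}^H \frac{c}{a} - \left( z_{12}^H + z_{22}^H \frac{c}{a} \right) \frac{z_{11}^H a + z_{21}^H c}{z_{12}^H a + z_{22}^H c} = 0
\]

Thus if we pre-multiply (2.7) by

\[
\begin{bmatrix} 1 & -\frac{z_{11}^H a + z_{21}^H c}{z_{12}^H a + z_{22}^H c} \end{bmatrix}
= \begin{bmatrix} 1 & -\frac{\omega_1 z_{21}^H + \omega_{12} z_{22}^H}{\omega_2 z_{22}^H} \end{bmatrix}
= \begin{bmatrix} 1 & \frac{\omega_1 z_{12} - \omega_{12} z_{11}}{\omega_2 z_{11}} \end{bmatrix}
\]

(this is valid assuming $z_{11} \ne 0$), by (2.4) and (2.6) we will obtain an expression for the linear combination of $x_t^*$ and $\mathbb{E}_t^* x_{t+1}^*$ that is pre-determined, namely:

\[
\begin{aligned}
\begin{bmatrix} 1 & \frac{\omega_1 z_{12} - \omega_{12} z_{11}}{\omega_2 z_{11}} \end{bmatrix} w_t^*
&= \begin{bmatrix} \omega_1 & \frac{\omega_1 z_{12}}{z_{11}} \end{bmatrix} w_{t-1}^*
+ \frac{\mu}{a \omega_2 |Z|} (z_{22} - \omega_1 z_{12})
+ \frac{\delta}{a \omega_2 |Z|} (z_{22} - \omega_1 z_{12})\, t
+ \frac{1}{a \omega_2 |Z|} (z_{22} - \omega_1 z_{12})\, \varepsilon_t \\
&= \begin{bmatrix} \omega_1 & \frac{\omega_1 z_{12}}{z_{11}} \end{bmatrix} w_{t-1}^*
+ \frac{\mu}{a \omega_2 z_{11}} + \frac{\delta}{a \omega_2 z_{11}} t + \frac{1}{a \omega_2 z_{11}} \varepsilon_t
\end{aligned}
\]

(where we have used (2.5) and (2.6) to simplify). Stacking this equation with (2.8) gives: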

16


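\[
\begin{bmatrix} 1 & \frac{\omega_1 z_{12} - \omega_{12} z_{11}}{\omega_2 z_{11}} \\ 0 & 1 \end{bmatrix} w_t^*
= \begin{bmatrix} \omega_1 & \frac{\omega_1 z_{12}}{z_{11}} \\ 0 & 0 \end{bmatrix} w_{t-1}^*
+ \begin{bmatrix} \frac{\mu}{a \omega_2 z_{11}} \\ \frac{z_{22}^H}{a} \frac{\omega_2 (\mu + \delta) - \mu}{(\omega_2 - 1)^2} \end{bmatrix}
+ \begin{bmatrix} \frac{1}{a \omega_2 z_{11}} \\ \frac{z_{22}^H}{a (\omega_2 - 1)} \end{bmatrix} \delta t
+ \begin{bmatrix} \frac{1}{a \omega_2 z_{11}} \\ 0 \end{bmatrix} \varepsilon_t
\]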

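Finally pre-multiplying by $Z \begin{bmatrix} 1 & \frac{\omega_1 z_{12} - \omega_{12} z_{11}}{\omega_2 z_{11}} \\ 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} z_{\cdot 1} & z_{\cdot 1} \frac{\omega_{12} z_{11} - \omega_1 z_{12}}{\omega_2 z_{11}} + z_{\cdot 2} \end{bmatrix}$ and again simplifying using (2.5) and (2.6) gives the solution:

\[
\begin{aligned}
\begin{bmatrix} x_t^* \\ \mathbb{E}_t^* x_{t+1}^* \end{bmatrix} = {} &
\omega_1 z_{\cdot 1} \begin{bmatrix} z_{11}^H + \frac{z_{12}}{z_{11}} z_{12}^H & 0 \end{bmatrix}
\begin{bmatrix} x_{t-1}^* \\ \mathbb{E}_{t-1}^* x_t^* \end{bmatrix} \\
& + \frac{1}{a} \left[ z_{\cdot 1} \frac{1}{\omega_2} \left( \frac{\mu}{z_{11}} + \frac{\omega_{12} z_{11} - \omega_1 z_{12}}{|Z|} \cdot \frac{\omega_2 (\mu + \delta) - \mu}{(\omega_2 - 1)^2} \right) + z_{\cdot 2} z_{22}^H \frac{\omega_2 (\mu + \delta) - \mu}{(\omega_2 - 1)^2} \right] \\
& + \frac{1}{a (\omega_2 - 1) |Z|} \left[ z_{\cdot 1} \left( \frac{|Z|}{z_{11}} - z_{12} \right) + z_{\cdot 2} z_{11} \right] \delta t
+ \frac{z_{\cdot 1}}{a \omega_2 z_{11}} \varepsilon_t
\end{aligned}
\]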

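To obtain the general solution for $x_t^*$ we take the top row of this equation and simplify, which gives:

\[
\begin{aligned}
x_t^* = {} & \omega_1 \left( \frac{z_{11} z_{22}}{|Z|} - \frac{z_{12} z_{21}}{|Z|} \right) x_{t-1}^* + \frac{\mu}{a \omega_2}
+ \frac{1}{a} \left[ \frac{1}{\omega_2} \frac{|Z| - \omega_2 z_{12} z_{11}}{|Z|} + \frac{z_{12} z_{11}}{|Z|} \right] \frac{\omega_2 (\mu + \delta) - \mu}{(\omega_2 - 1)^2} \\
& + \frac{1}{a (\omega_2 - 1) |Z|} \left( |Z| - z_{11} z_{12} + z_{12} z_{11} \right) \delta t + \frac{1}{a \omega_2} \varepsilon_t \\
= {} & \omega_1 x_{t-1}^* + \frac{\mu}{a \omega_2} + \frac{1}{a \omega_2} \cdot \frac{\omega_2 (\mu + \delta) - \mu}{(\omega_2 - 1)^2} + \frac{1}{a (\omega_2 - 1)} \delta t + \frac{1}{a \omega_2} \varepsilon_t
\end{aligned}
\]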

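Thus we have shown that:

\[
x_t^* = \omega_1 x_{t-1}^* + \frac{\mu}{a (\omega_2 - 1)} + \frac{\delta}{a (\omega_2 - 1)^2} + \frac{\delta}{a (\omega_2 - 1)} t + \frac{\varepsilon_t}{a \omega_2}
\]

which straightforward calculation shows to agree with the usual AR(1) “MSV” solution. Pushing this forward one period and taking expectations we have the following FREE form expectation:

\[
\mathbb{E}_t^* x_{t+1}^* = \omega_1 x_t^* + \frac{\mu + \delta}{a (\omega_2 - 1)} + \frac{\delta}{a (\omega_2 - 1)^2} + \frac{\delta}{a (\omega_2 - 1)} t
\]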
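The saddle-path solution can be checked by substituting it back into the univariate version of (1.1), $c x_t^* = a \mathbb{E}_t^* x_{t+1}^* + b x_{t-1}^* + \mu + \delta t + \varepsilon_t$. The following is a minimal sketch, assuming real roots with exactly one inside the unit circle; the function names, the starting value and the shock distribution are illustrative assumptions.

```python
import numpy as np

def msv_coefficients(a, b, c, mu, delta):
    """AR(1) coefficients of the saddle-path solution (real roots assumed)."""
    root = np.sqrt(c * c - 4 * a * b)
    w1, w2 = (c - root) / (2 * a), (c + root) / (2 * a)
    if abs(w1) > abs(w2):                 # label the explosive root omega_2
        w1, w2 = w2, w1
    const = mu / (a * (w2 - 1)) + delta / (a * (w2 - 1) ** 2)
    trend = delta / (a * (w2 - 1))
    shock = 1 / (a * w2)
    return w1, const, trend, shock

def max_model_residual(a, b, c, mu, delta, T=30, seed=1):
    """Largest violation of the univariate model equation along a simulated path."""
    w1, const, trend, shock = msv_coefficients(a, b, c, mu, delta)
    rng = np.random.default_rng(seed)
    x_prev, worst = 0.3, 0.0
    for t in range(T):
        eps = rng.normal()
        x = w1 * x_prev + const + trend * t + shock * eps
        ex_next = w1 * x + const + trend * (t + 1)      # FREE form expectation
        worst = max(worst, abs(c * x - a * ex_next - b * x_prev - mu - delta * t - eps))
        x_prev = x
    return worst
```

A residual at the level of rounding error indicates that the constant, trend and shock coefficients are mutually consistent with the model equation.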

2.2.4. Proposition 1

The previous sections have shown that in the univariate case under stability ($c^2 - 4ab < 0$ and $\frac{b}{a} \le 1$ or


and only if $M_\varepsilon$ is of full rank, and that in the univariate case under saddle-path stability ($0 \le c^2 - 4ab$ and either $c < -|a + b|$ or $c > |a + b|$), there is an AR(1) form REE, providing $z_{11} \ne 0$, which in fact is always a FREE.

2.3. Solution to the general canonical form

We now turn to solving the generalized canonical form (1.2) in full generality. To do this we broadly follow Lubik and Schorfheide’s (2003) extension to the irregular case of Sims’s (2002) method for solving rational expectations models, which is itself more general than that of Blanchard and Kahn (1980) since it avoids some invertibility assumptions and enables linear combinations of variables to be jointly predetermined. This method is particularly convenient for our purposes since it proceeds by first solving for the expectational error, which, to assess the convergence of the partial information case, is what we shall be interested in.

Our chief innovations are the inclusion of the drift and linear terms, which are important as being able to accurately remove a linear trend is non-trivial in the partial information case; the derivation of a simpler condition for existence of REEs for a large class of models; the addition of FREE restrictions, which will play a role in the partial information case; and the explicit derivation of VARMAX form solutions for $x_t^*$.

2.3.1. Set-up

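By the generalized complex Schur decomposition (also known as the QZ decomposition) (Quarteroni et al. 2000: 225) of the matrices $\Gamma_0$ and $\Gamma_1$ defined in § 1.5.2, there always exist possibly complex matrices $Q$, $Z$, $\Lambda = (\lambda_{i,j})_{i,j}$ and $\Omega = (\omega_{i,j})_{i,j}$ such that $Q^H \Lambda Z^H = \Gamma_0$, $Q^H \Omega Z^H = \Gamma_1$, $Q$ and $Z$ are unitary and $\Lambda$ and $\Omega$ are upper triangular.

Now let $w_t^* = Z^H v_t^*$ for all $t \in \mathbb{Z}$, then if we pre-multiply (1.2) by $Q$ we have:

\[ \Lambda w_t^* = \Omega w_{t-1}^* + Q \left( \tilde{\mu} + \tilde{\delta} t + \Psi \varepsilon_t + \Pi \eta_t^* \right) \]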

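Providing $\Gamma_0$ and $\Gamma_1$ do not have zero eigenvalues corresponding to the same eigenvector¹⁷ the QZ decomposition always exists and the set $\left\{ \left| \frac{\omega_{ii}}{\lambda_{ii}} \right| \,\middle|\, i \in \{1, \dots, \dim v_t\} \right\} \subseteq \mathbb{R} \cup \{\infty\}$ is unique even though the decomposition itself is not (Sims 2002: 9, 20). Thus, without loss of generality we may assume that for $i < j$, $\left| \frac{\omega_{ii}}{\lambda_{ii}} \right| < \left| \frac{\omega_{jj}}{\lambda_{jj}} \right|$. Let $u$ be the number of $i$ for which $\left| \frac{\omega_{ii}}{\lambda_{ii}} \right| \le 1$ and consider a partition of the matrices under consideration in which in each case the top left block is of dimension $u \times u$¹⁸. We then write:

\[
\begin{bmatrix} \Lambda_{11} & \Lambda_{12} \\ 0 & \Lambda_{22} \end{bmatrix}
\begin{bmatrix} w_{1,t}^* \\ w_{2,t}^* \end{bmatrix}
= \begin{bmatrix} \Omega_{11} & \Omega_{12} \\ 0 & \Omega_{22} \end{bmatrix}
\begin{bmatrix} w_{1,t-1}^* \\ w_{2,t-1}^* \end{bmatrix}
+ \begin{bmatrix} Q_{1\cdot} \\ Q_{2\cdot} \end{bmatrix} \left( \tilde{\mu} + \tilde{\delta} t + \Psi \varepsilon_t + \Pi \eta_t^* \right) \tag{2.9}
\]

Note that this decomposition means that only $\Lambda_{11}$ and $\Omega_{22}$ are guaranteed to be invertible.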
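In practice the ordered decomposition can be obtained from a numerical library. The following is a minimal sketch using SciPy's `ordqz`, which is a substitute for whatever routine the author used. SciPy factorises its first two arguments as $Q_s \,\cdot\, Z^H$, so passing $(\Gamma_1, \Gamma_0)$ yields $\Omega$ and $\Lambda$ directly, with the $Q$ of the text equal to $Q_s^H$. The function name `ordered_qz` is hypothetical, and the sort only separates the two blocks; it does not order the ratios by modulus within each block.

```python
import numpy as np
from scipy.linalg import ordqz

def ordered_qz(gamma0, gamma1):
    """Return (Lambda, Omega, Q, Z, u) with Q^H Lambda Z^H = Gamma0, Q^H Omega Z^H = Gamma1."""
    omega, lam, alpha, beta, q_s, z = ordqz(
        gamma1, gamma0,
        # select ratios |omega_ii / lambda_ii| <= 1 for the top-left block
        sort=lambda al, be: np.abs(al) <= np.abs(be),
        output="complex",
    )
    u = int(np.sum(np.abs(alpha) <= np.abs(beta)))   # size of the non-explosive block
    return lam, omega, q_s.conj().T, z, u
```

For the univariate example with $a = 1$, $b = 0.5$, $c = 2$ this returns $u = 1$, matching the saddle-path classification of § 2.2.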

2.3.2. Derivation of restrictions

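The second block of (2.9) is purely explosive by construction; thus we solve it forward following Sims (2002: 9). From this block we have that for all $k \in \mathbb{N}_+$:

\[
w_{2,t}^* = \left( \Omega_{22}^{-1} \Lambda_{22} \right)^k w_{2,t+k}^* - \sum_{s=1}^{k} \left( \Omega_{22}^{-1} \Lambda_{22} \right)^{s-1} \Omega_{22}^{-1} Q_{2\cdot} \left( \tilde{\mu} + \tilde{\delta} (t+s) + \Psi \varepsilon_{t+s} + \Pi \eta_{t+s}^* \right)
\]

So if we take $t$ dated expectations and then take the limit as $k \to \infty$, since the components of $\mathbb{E}_t^* w_{2,t+k}^*$ are asymptotically polynomial by assumption and thus dominated by the decay of $\left( \Omega_{22}^{-1} \Lambda_{22} \right)^k$, we have that:

\[
\begin{aligned}
w_{2,t}^* = \mathbb{E}_t^* w_{2,t}^* &= -\mathbb{E}_t^* \sum_{s=1}^{\infty} \left( \Omega_{22}^{-1} \Lambda_{22} \right)^{s-1} \Omega_{22}^{-1} Q_{2\cdot} \left( \tilde{\mu} + \tilde{\delta} (t+s) + \Psi \varepsilon_{t+s} + \Pi \eta_{t+s}^* \right) \\
&= -\left[ \sum_{s=0}^{\infty} \left( \Omega_{22}^{-1} \Lambda_{22} \right)^{s} \right] \Omega_{22}^{-1} Q_{2\cdot} \left( \tilde{\mu} + \tilde{\delta} (t+1) \right) - \left[ \sum_{s=0}^{\infty} s \left( \Omega_{22}^{-1} \Lambda_{22} \right)^{s-1} \right] \left( \Omega_{22}^{-1} \Lambda_{22} \right) \Omega_{22}^{-1} Q_{2\cdot} \tilde{\delta}
\end{aligned}
\]

where all sums are well defined since the eigenvalues of $\Omega_{22}^{-1} \Lambda_{22}$ are strictly inside the unit circle by construction, which is shown to be a necessary and sufficient condition for convergence in appendix A, § 5. In fact by the formulae derived in that appendix: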

17

This means that there are one or more equations that place no restrictions on either $v_t$ or $v_{t-1}$. This will create an additional source of indeterminacy in $v_t$ and may also imply that one or more components of $\varepsilon_t$ and $\eta_t^*$ are linear combinations of the others. We, like both Sims and Lubik & Schorfheide, will not pursue this avenue.

18


Contents

1. Introduction
1.1. Expectations in macroeconomics
1.2. “Rational expectations”
1.3. Bounded rationality
1.4. Full rationality, limited information
1.5. The model
2. Full information solution
2.1. Information sets
2.2. The univariate special case
2.3. Solution to the general canonical form