The Economic and Social Review, Vol. 21, No. 1, October, 1989, pp. 1-25
Economic Theory and Econometric Models
CHRISTOPHER L. GILBERT*
Queen Mary and Westfield College, University of London and CEPR
The constitution of the Econometric Society states as the main objective of the society "the unification of the theoretical-qualitative and the empirical-quantitative approach to economic problems" (Ragnar Frisch, 1933, p. 1). Explaining this objective, Frisch warned that "so long as we content ourselves to statements in general terms about one economic factor having an 'effect' on some other factor, almost any sort of relationship may be selected, postulated as a law, and 'explained' by a plausible argument". Precise, realistic, but at the same time complex theory was required to "help us out in this situation" (ibid, p. 2).
Over fifty years after the foundation of the Econometric Society Hashem Pesaran stated, in an editorial which appeared in the initial issue of the Journal of Applied Econometrics, that "Frisch's call for unification of the research in economics has been left largely unanswered" (Pesaran, 1986, p. 1). This is despite the fact that propositions that theory should relate to applied models, and that models should be based upon theory, are not enormously controversial. The simple reason is that economic theory and econometric models are relatively awkward bedfellows – much as they cannot do without each other, they also find it very difficult to live in the same house. I shall suggest that this is in part because the proposed union has been over-ambitious, and that, partly for that reason, neither partner has been sufficiently accommodating to the other. What is needed is the intellectual analogue of a modern marriage.
*Dublin Economics Workshop Guest Lecture, Third Annual Conference of the Irish Economics Association, Carrickmacross, May 20th 1989.
One could look at this issue either in terms of making theory more applicable, or in terms of making modelling more theoretically-based. I shall start from the latter approach. What I have to say will therefore relate to controversies about the relative merits of different econometric methodologies, and it is apparent that increased attention has been given by econometricians to questions of methodology over the past decade. In particular, the rival methodologies associated respectively with David Hendry, Edward Leamer and Christopher Sims, and recently surveyed by Adrian Pagan (1987), generate showpiece discussions at international conferences. I do not propose here either to repeat Pagan's survey, or to advise on the "best buy" (on this see also Aris Spanos, 1988). A major issue, which underlies much of Leamer's criticism of conventional econometrics, is the status of inference in equations the specification of which has been chosen at least in part on the basis of preliminary regressions. I do not propose to pursue those questions here. I also note the obvious point that the tools we use may be in part dictated by the job to hand – hypothesis testing, forecasting and policy evaluation are different functions and it is not axiomatic that one particular approach to modelling will dominate in all three functions. This point will recur, but I wish to focus on two specific questions – how should economic theory determine the structure of the models we estimate and how should we interpret rejection of theories? I will argue that in the main we should see ourselves as using theory to structure our models, rather than using models to test theories.
1935) – Robbins does not wish to disagree with Blank on the form of the economic law, but only on the possibility of its quantification.
This all changed with the publication in 1944 of Trygve Haavelmo's "Probability Approach in Econometrics" (Haavelmo, 1944). Haavelmo insisted, first, that "no tool developed in the theory of statistics has any meaning – except, perhaps, for descriptive purposes – without being referred to some stochastic scheme" (ibid, p. iii); and second, that theoretical economic models should be formulated as "a restriction upon the joint variations of a system of variable quantities (or, more generally, 'objects') which otherwise might have any value or property" (ibid, p. 8). For Haavelmo, the role of theory was to offer a non-trivial, and therefore in principle rejectable, structure for the variance-covariance matrix of the data. In this may be recognised the seeds of Cowles Commission econometrics.
Haavelmo was unspecific with respect to the genesis of the theoretical restrictions, but over the post-war period economic theory has been increasingly dominated by the paradigm of atomistic agents maximising subject to a constraint set. It is true that game theory has provided a new set of models, although these game theoretic models typically imply fewer restrictions than the atomistic optimisation models which remain the core of the dominant neo-classical or neo-Walrasian "research programme". This research programme has been reinforced by the so-called "rational expectations revolution", and in Lakatosian terms this provides evidence that the programme remains "progressive".
I shall briefly illustrate this programme from the theory of consumer behaviour on the grounds that it is in this area that the econometric approach has been most successful; and also because it is an area with which all readers will be very familiar. The systems approach to demand modelling was initiated by Richard Stone's derivation and estimation of the Linear Expenditure System (the LES; Stone, 1954). The achievement of the LES was the derivation of a complete set of demand equations which could describe the outcomes of the optimising decisions of a utility maximising consumer. We now know that the functional specification adopted by Stone, in which expenditure is a linear function of prices and money income, is highly restrictive and entails additive separability of the utility function; but this is in no way to devalue Stone's contribution.
Peter Hansen and Kenneth Singleton (1983), is to estimate the parameters of the model from the Euler equations¹ which link the first order conditions from the optimisation problem in successive periods. The combination exhibited both in these two recent papers and in the original Stone LES paper of theoretical derivations of a set of restrictions on a system of stochastic equations, and the subsequent testing of these restrictions using the procedures of classical statistical inference, is exactly that anticipated both in the constitution of the Econometric Society and in Haavelmo's manifesto. It therefore appears somewhat churlish that the profession, having duly arrived at this destination, should now question that this is where we wish to go. However, in terms of the constitutional objective of unifying theoretical and empirical modelling, the Haavelmo-Cowles programme forces this unification too much on the terms of the theory party.
This argument can be made in a number of respects, but almost invariably will revolve around aggregation. I shall consider two approaches to aggregation – one deriving from theoretical and the other from statistical considerations.
The three consumption-demand papers to which I have referred share the characteristic that they use aggregate data to examine the implications of theories that are developed in terms of individual optimising agents. This requires that we assume either that all individuals are identical and have the same income, or that individuals have the same preferences (although they may differ in the intercepts of their Engel curves) which, moreover, belong to the PIGL class.² In the latter and marginally more plausible case, the market demand curves may be regarded as the demand curves of a hypothetical representative agent maximising utility subject to the aggregate budget constraint. This allows interpretation of the parameters of the aggregate functions in terms of the parameters of the representative agent's utility function.
The representative agent hypothesis therefore performs a "reduction" of the parameters of the aggregate function to a set of more fundamental micro parameters. In this sense, the aggregate equations are "explained" in terms of these more fundamental parameters, apparently in the same way that one might explain the properties of the hydrogen atom in terms of the quantum mechanics equations of an electron-proton pair. This reductionist approach to explanation was discussed by Haavelmo in an important but neglected section of his manifesto entitled "The Autonomy of an Economic Relation", which anticipates the Lucas critique (Robert Lucas, 1976).³ A relationship is autonomous to the extent that it remains constant if other relationships in the system (e.g., the money supply rule) are changed. Haavelmo explains "In scientific research – in the field of economics as well as in other fields – our search for 'explanations' consists of digging down to more fundamental relations than those which stand before us when we merely 'stand and look'" (ibid, p. 38).
1. Or orthogonality conditions implied by the Euler equations. A difficulty with these "tests" is that they may have very low power against interesting alternatives – this argument was made by James Davidson and David Hendry (1981) in relation to the tests reported in Hall (1978).
There is however abundant evidence that attempts to make inferences about individual tastes from the tastes of a "representative agent" on the basis of aggregate time series data can be highly misleading. Thomas Stoker (1986) has emphasised the importance of distributional considerations in a micro-based application of the Linear Expenditure System to US data and Richard Blundell (1988) has reiterated these concerns. Richard Blundell, Panos Pashardes and Guglielmo Weber (1988) estimate the Deaton and Muellbauer (Deaton and Muellbauer, 1980) Almost Ideal Demand System on both a panel of micro data and on the aggregated macro data. They find substantial biases in the estimated coefficients from the aggregate relationships in comparison with the microeconometric estimates. Furthermore, and, they suggest, as a consequence of these biases, the aggregate equations exhibit residual serial correlation and reject the homogeneity restrictions. They suggest, as major causes of this aggregation failure, differences between households with and without children and the prevalence of zero purchases in the micro data.⁴ This study in particular suggests that it is difficult to claim that aggregate elasticities correspond in any very straightforward way to underlying micro-parameters.
How does this leave the reductionist interpretation of aggregate equations? Pursuing further the hydrogen analogy, the demand theoretic "reduction" is flawed by the fact that we have in general no independent knowledge of the parameters of individual utility functions which would allow us to predict elasticities prior to estimation. It is therefore better to see utility theory in traditional studies as "structuring" demand estimates. In that case, the representative agent assumption is a convenient and almost Friedmanite simplifying assumption,⁵ implying that the role of theory is primarily instrumental.
In the Stoker (1986) and Blundell et al. (1988) studies, by contrast, the presence of good micro data allows a genuine test of the compatibility of the macro and micro estimates, and this allows those authors to investigate the circumstances in which a reductionist interpretation of the macro estimates is possible. Both papers conclude that aggregate estimates with no corrections for distributional effects tend to suffer from misspecified dynamics and this does imply that they will be less useful in forecasting and policy analysis. There is also a suggestion that they may vary significantly, in a Lucas (1976) manner, because government policies will affect income distribution more than they will affect household characteristics and taste parameters. Although neither set of authors makes an explicit argument, the implication appears to be that one is better off confining oneself to microeconometric data. This view seems to me to be radically mistaken.
4. This problem is even more severe in the recent study by Vanessa Fry and Panos Pashardes (1988) which adopts the same procedure in looking at tobacco purchases. Clearly, the prevalence of non-smokers implies that there is no representative consumer. But it is also the case that elasticities estimated from aggregate data may fail to reflect the elasticities of any representative smoker. This is because it is not possible to distinguish between the effect of a (perhaps tax-induced) price rise in inducing smokers to smoke less and the effect, if any, of inducing smokers to give up the habit.
It has been clear ever since Lawrence Klein (1946a, b) and André Nataf (1948) first discussed aggregation issues in the context of production functions that the requirements for an exact correspondence between aggregate and micro relationships are enormously strong. Through the work of Terence Gorman (1953, 1959, 1968), Muellbauer (1975, 1976) and others these conditions have been somewhat weakened but remain heroic. In this light it might appear somewhat surprising that aggregate relationships do seem to be relatively constant over time and are broadly interpretable in terms of economic theory.
An interesting clue as to why this might be was provided by Yehuda Grunfeld and Zvi Griliches (1960) who asked "Is aggregation necessarily bad?" In his Ph.D. thesis, Grunfeld had obtained the surprising result that the investment expenditures of an aggregate of eight major US corporations were better explained by a two variable regression on the aggregate market value of these corporations and their aggregate stock of plant and equipment at the start of each period, than by a set of eight micro regressions in which each corporation's investment was related to its own market value and stock of plant and equipment (Grunfeld, 1958). In forecasting the aggregate, one would do better using an aggregate equation than by disaggregating and forecasting each component of the aggregate separately. Grunfeld and Griliches suggest that this may be explained by misspecification of the micro relationships. If there is even a small dependence of the micro variables on aggregate variables, this can result in better explanation by the aggregated equation than by the slightly misspecified micro equations.
has just been formed (new purchase). The role of the interest rate will be secondary. However, the individual factors are unlikely to be fully observed. If one is obliged to regress simply on the common factors – in this case the interest rate – the micro R² will be tiny (with a million households, Granger obtains a value of 0.001), but the aggregate equation may have a very high R² (Granger obtains 0.999) because the individual effects average out across the population.⁶
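The averaging argument can be illustrated numerically. The Python sketch below is not Granger's own calculation. It assumes a hypothetical data-generating process in which each household's purchase is a small common-factor effect plus an independent, unobserved household effect of unit variance; the names `population_r2` and `simulate` are illustrative. Under that assumption the micro R² is s/(1 + s) and the aggregate R² is ns/(1 + ns), where s is the ratio of common-factor variance to household variance and n is the number of households.

```python
import numpy as np

def population_r2(n_households, s):
    """Population R-squared at micro and aggregate level.

    s is the (assumed) ratio of common-factor variance to the variance
    of the unobserved household effect.
    """
    micro = s / (1.0 + s)
    aggregate = n_households * s / (1.0 + n_households * s)
    return micro, aggregate

def r2(x, y):
    """R-squared from a simple regression of y on x with an intercept."""
    return float(np.corrcoef(x, y)[0, 1] ** 2)

def simulate(n_households=2000, n_periods=200, b=0.1, seed=0):
    """Simulate the assumed process and return (micro R2, aggregate R2)."""
    rng = np.random.default_rng(seed)
    common = rng.standard_normal(n_periods)                  # e.g. the interest rate
    individual = rng.standard_normal((n_periods, n_households))  # unobserved
    y = b * common[:, None] + individual
    micro = r2(np.repeat(common, n_households), y.ravel())   # pooled micro regression
    aggregate = r2(common, y.sum(axis=1))                    # regression on the total
    return micro, aggregate
```

With s = 0.001 and a million households the formulas give values of roughly 0.001 and 0.999, the orders of magnitude quoted above. The simulation uses a much smaller population so that it runs quickly, and shows the same qualitative contrast.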
So long as the micro relationships are correctly specified and all variables (common and individual) are fully observed there is no gain through aggregation. However, once we allow parsimonious simplification strategies, the effects of these simplifications will be to result in quite different micro and aggregate relationships. Furthermore, it is not clear a priori which of these approximated relationships will more closely reflect theory. Blundell (1988) implied that when micro and aggregate relationships differ this must entail aggregation bias in the aggregate relationships. Granger's results show that theoretical effects may be swamped at the micro level by individual factors which are of little interest to the economist, and which in any case are likely to be incompletely observed, resulting in omitted variable bias in the micro equations. Microeconometrics is important, but it does not invalidate traditional aggregate time series analysis.⁷
My concern here is with the methodology of aggregate time series econometrics so I shall not dwell on the problems of doing microeconometrics. The question I have posed is how economic theory should be incorporated in aggregate models. The naive answer to this question is the reductionist route, in which the parameters of aggregate relationships are interpreted in terms of the decisions of a representative optimising agent. However, there is absolutely no reason to suppose that the aggregation assumptions required by this reduction will hold. There is little point, therefore, in using these estimated aggregate relationships to "test" theories based on the optimising behaviour of a representative agent – if we fail to reject the theory it is only because we have insufficient data.⁸
6. Strictly, the variance of the household effects is of order n, where n is the number of households, and the variance of the common factors is of order n². Hence, as the number of households becomes large, the contribution of the individual effects becomes negligible. In the converse case in which we observe the individual factors but not the common factors the R²s are reversed. See also Granger (1988).
7. Werner Hildenbrand (1983) arrives at a similar conclusion in a more specialised context. He remarks (ibid, p. 998) "There is a qualitative difference in market and individual demand functions. This observation shows that the concept of the 'representative consumer', which is often used in the literature, does not really simplify the analysis; on the contrary, it might be misleading". I am grateful to Jose Carbajo for bringing this reference to my attention.
It is now nearly ten years since Sims argued in his "Macroeconomics and Reality" (Sims, 1980) that the Haavelmo-Cowles programme is misconceived. Theoretically-inspired identifying restrictions are, he argued, simply "incredible". This is partly because many sets of restrictions amount to no more than normalisations together with "shrewd aggregations and exclusion restrictions" based on an "intuitive econometrician's view of psychological and sociological theory" (Sims, 1980, pp. 2-3); partly because the use of lagged dependent variables for identification requires prior knowledge of exact lag lengths and orders of serial correlation (Michio Hatanaka, 1975); and partly because rational expectations imply that any variable entering a particular equation may, in principle, enter all other equations containing expectational variables. In Sims' example, a demand for meat equation is "identified" by normalisation of the coefficient on the quantity (or value share) of meat variable to -1; by exclusion of all other quantity (or value share) variables; and by exclusion of the prices of goods considered by the econometrician to be distant substitutes for meat, or replacement of these prices by the prices of one or more suitably defined aggregates.
The VAR methodology, elaborated in a series of papers by Thomas Doan, Robert Litterman and Sims, is to estimate unrestricted distributed lags of each non-deterministic variable on the complete set of variables (Doan, Litterman and Sims, 1984; Litterman, 1986a; Sims, 1982, 1987). Thus given a set of k variables one models
$$x_{it} = \sum_{j=1}^{k} \sum_{r=1}^{n} \theta_{ijr}\, x_{j,t-r} + u_{it} \qquad (i = 1, \ldots, k) \qquad (1)$$
The objective is to allow the data to structure the dynamic responses of each variable. Obviously, however, one may wish to consider a relatively large number of variables and relatively long lag lengths, and this could result in shortage of degrees of freedom and in poorly determined coefficient estimates. Some of the early VAR articles impose "incredible" marginalisation (i.e., variable exclusion) and lag length restrictions – for example, Sims (1980) uses a six variable VAR on quarterly data with lag length restricted to four. But these restrictions are hardly more palatable than those Sims argued against in "Macroeconomics and Reality" and at least implicit recognition of this has pushed the VAR school into an adoption of a Bayesian framework. The crucial element of Bayesian VAR (BVAR) modelling is a "shrinkage" procedure in which a loose Bayesian prior distribution structures the otherwise unrestricted distributed lag estimates (Doan et al., 1984; Sims, 1987).
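Equation (1) can be estimated by ordinary least squares, one equation at a time. The Python sketch below is an illustration under assumed names: `build_lagged`, `fit_var`, `lag_coefficient_count` and `simulate_var1` are hypothetical helpers, the intercept is an added assumption, and the two variable system used to check the estimator is made up. It also counts the lag coefficients, which makes the degrees-of-freedom concern concrete.

```python
import numpy as np

def build_lagged(x, n_lags):
    """Return regressors [1, x_{t-1}, ..., x_{t-n}] and targets x_t."""
    n_obs = x.shape[0]
    blocks = [np.ones((n_obs - n_lags, 1))]       # assumed intercept
    for r in range(1, n_lags + 1):
        blocks.append(x[n_lags - r:n_obs - r])    # all k variables at lag r
    return np.hstack(blocks), x[n_lags:]

def fit_var(x, n_lags):
    """OLS estimate of an unrestricted VAR; column i holds equation i."""
    regressors, targets = build_lagged(x, n_lags)
    coef, *_ = np.linalg.lstsq(regressors, targets, rcond=None)
    return coef

def lag_coefficient_count(k, n_lags):
    """Number of lag coefficients in a k variable VAR with n_lags lags."""
    return k * k * n_lags

def simulate_var1(theta, n_obs, seed=0):
    """Simulate x_t = theta @ x_{t-1} + u_t with standard normal u_t."""
    rng = np.random.default_rng(seed)
    k = theta.shape[0]
    x = np.zeros((n_obs, k))
    for t in range(1, n_obs):
        x[t] = theta @ x[t - 1] + rng.standard_normal(k)
    return x

theta_true = np.array([[0.5, 0.1], [0.0, 0.3]])   # made-up stable system
coef = fit_var(simulate_var1(theta_true, 20000), n_lags=1)
theta_hat = coef[1:].T                            # drop intercept; row i = equation i
```

A six variable system with four lags has 6 × 6 × 4 = 144 lag coefficients, or 24 per equation before any intercept, which is large relative to the length of a typical quarterly sample.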
lagged coefficients, one will intuitively feel that many of them, particularly those at high lag lengths, should be small.⁹ Doan et al. (1984) formalise this intuition by specifying the prior for each modelled variable as a random walk with drift.¹⁰ This prior can be justified on the argument that, under (perhaps incredibly) strict assumptions, random walk models appear as the outcomes of the decisions of atomistic agents optimising under uncertainty (most notably, Hall, 1978); or on the heuristic argument that "no change" forecasts provide a sensible "naive" base against which any other forecasts should be compared. More formally, one can argue that collinearity is clearly a major problem in the estimation of unrestricted distributed lag models and that severe collinearity may give models which "produce erratic, poor forecasts and imply explosive behavior of the data" (Doan et al., 1984). A standard remedy for collinearity, implemented in ridge regression (Arthur Hoerl and Robert Kennard, 1970a, b) is to "shrink" these coefficients towards zero by adding a small constant (the "ridge constant") to the diagonal elements of the data cross-product matrix.¹¹
Specification o f the prior variance involves the investigator q u a n t i f y i n g his/ her u n c e r t a i n t y about the p r i o r mean. The p r i o r variance m a t r i x w i l l t y p i c a l l y contain a large number o f parameters, and this therefore appears a d a u n t i n g task. M u c h o f the o r i g i n a l i t y o f the V A R shrinkage procedure arises f r o m the economy i n specification o f this m a t r i x w h i c h , i n the most simple case, is characterised i n terms o f o n l y three parameters ( D o a n , L i t t e r m a n and Sims, 1986). These are the overall tightness o f the p r i o r d i s t r i b u t i o n , the rate at w h i c h the p r i o r standard deviations decay, and the relative w e i g h t o f variables other than the lagged dependent variable i n a particular autoregression ( w i t h p r i o r covariances set to zero). A tighter p r i o r d i s t r i b u t i o n implies a larger ridge constant and this results i n a greater shrinkage towards the r a n d o m w a l k m o d e l . The i m p o r t a n t feature o f the D o a n et al. (1984) procedure is t h a t the tightness o f the p r i o r is increased as lag length increases. Degrees o f freedom considerations are no longer p a r a m o u n t since coefficients associated w i t h long
9 . B u t note t h a t t h i s i n t u i t i o n m a y be i n c o r r e c t if o n e uses seasonally u n a d j u s t e d d a t a . H o w e v e r , K e n n e t h Wallis ( 1 9 7 4 ) has s h o w n t h a t use of seasonally a d j u s t e d d a t a c a n d i s t o r t the d y n a m i c s i n the e s t i m a t e d r e l a t i o n s h i p s .
10. T h e e x p o s i t i o n in D o a n et al. ( 1 9 4 8 ) is c o m p l i c a t e d a n d n o t e n t i r e l y c o n s i s t e n t . See J o h n G e w e k e ( 1 9 8 4 ) for a c o n c i s e s u m m a r y .
11. I n the s t a n d a r d l i n e a r m o d e l
y = X)3 + u
w h e r e y a n d X are b o t h m e a s u r e d as deviations f r o m their s a m p l e m e a n s , the ridge regression e s t i m a t o r of P is
b = ( X ' x + k l ) ^ ' y
lag lengths, and w i t h less i m p o r t a n t explanatory variables, are forced t o be close t o zero.
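The ridge estimator of footnote 11, and the sense in which a tighter prior means more shrinkage, can be sketched in Python as follows. The `prior_mean` argument is an addition made here for illustration and is not part of the Hoerl and Kennard estimator: it uses the standard result that, with a normal prior centred on `prior_mean` and precision proportional to k, the posterior mean is (X'X + kI)⁻¹(X'y + k·prior_mean). This is not the Doan et al. procedure, which lets the tightness differ across coefficients; the data are simulated and the variable names are hypothetical.

```python
import numpy as np

def ridge(X, y, k, prior_mean=None):
    """Ridge estimator b = (X'X + kI)^{-1} (X'y + k * prior_mean).

    X and y are converted to deviations from their sample means.
    With prior_mean=None the coefficients are shrunk towards zero.
    """
    X = X - X.mean(axis=0)
    y = y - y.mean()
    p = X.shape[1]
    b0 = np.zeros(p) if prior_mean is None else np.asarray(prior_mean, dtype=float)
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y + k * b0)

rng = np.random.default_rng(0)
x1 = rng.standard_normal(200)
x2 = x1 + 0.01 * rng.standard_normal(200)    # nearly collinear with x1
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.standard_normal(200)

b_ols = ridge(X, y, k=0.0)                   # k = 0 reproduces least squares
b_ridge = ridge(X, y, k=10.0)                # shrunk towards zero
b_tight = ridge(X, y, k=1e8, prior_mean=[1.0, 0.0])  # very tight prior
```

Setting the prior mean to one on a variable's own first lag and zero elsewhere is the sense in which a large ridge constant pulls an autoregression towards a random walk.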
It is often suggested that the VAR approach is completely atheoretical (see, e.g., Thomas Cooley and Stephen LeRoy, 1985). This view is given support by those VAR modellers whose activities are primarily related to forecasting and who argue that relevant economic theory is so incredible that one will forecast better with an unrestricted reduced form model (Litterman, 1986a, b; Stephen McNees, 1986).¹² However, this position is too extreme. Most simply, theory may be tested to a limited extent by examination of block exclusion (Granger causality) tests, although I would agree with Sims that, interpreted strictly, such restrictions are not in general credible. It is therefore more interesting to examine the use of VAR models in policy analysis since in this activity theory is indispensable.
Suppose one is interested in evaluating the policy impact of a shock to the money supply. One will typically look for a set of dynamic multipliers showing the impact of that shock on all the variables of interest. An initial difficulty is that in VAR models all variables are jointly determined by their common history and a set of current disturbances. This implies that it does not make sense to talk of a shock to the money supply unless additional structure is imposed on the VAR. To see this, note that the autoregressive representation (1) may be transformed into the moving average representation
$$x_{it} = \sum_{j=1}^{k} \sum_{r=0}^{\infty} \alpha_{ijr}\, u_{j,t-r} \qquad (i = 1, \ldots, k) \qquad (2)$$
where each variable depends on the history of shocks to all the variables in the model. There are two possibilities. Take the money supply to be variable 1. If none of the other k - 1 variables in the model Granger-causes the money supply (so that $\theta_{1jr} = 0$ for all $j > 1$ and $r$) we may identify monetary policy with the innovation $u_{1t}$ on the money supply equation and trace out the effects of these innovations on the other variables in the system. It is more likely, however, particularly given Sims' views, that all variables are interdependent at least over time. In that case analysis of the effects of monetary policy requires the identifying assumption that the monetary authorities choose $x_{1t}$ independently of the current period disturbances on the other equations. In an older terminology, this defines the first link in a Wold causal chain with money causally prior to the other variables (Herman Wold and Ragnar Bentzel, 1946; Wold and Lars Juréen, 1953). It can be implemented by renormalisation of (2) such that $x_{1t}$ depends only on the policy innovations $v_{1t}$ while the remaining variables depend on $v_{1t}$ and also a set of innovations $v_{2t}, \ldots, v_{kt}$ which are orthogonal to $v_{1t}$. In the limiting case in which all the innovations are mutually orthogonal, we may rewrite (2) as
$$x_{it} = \sum_{j=1}^{k} \sum_{r=0}^{\infty} \gamma_{ijr}\, v_{j,t-r} \qquad (i = 1, \ldots, k) \qquad (3)$$
This expression is unique given the ordering of the variables, but as Pagan (1987) notes, it is not clear a priori how the innovations $v_{2t}, \ldots, v_{kt}$ should be interpreted. The policy multipliers will depend on the causal ordering adopted, and the ordering of variables 2, ..., k may in practice be somewhat arbitrary. We find therefore that, although in estimation VAR modellers can avoid making strong identifying assumptions, policy interpretation of their models, including the calculation of policy multipliers, requires that one make exactly the same sort of identifying assumption that Sims criticised in the Haavelmo-Cowles programme. This is the basis of Cooley and LeRoy's (1985) critique of atheoretical macroeconometrics.
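The dependence on ordering can be made concrete with a small numerical sketch in Python. It assumes, for illustration only, a three variable system with a made-up covariance matrix `sigma` for the reduced form disturbances, and it implements the orthogonalisation by a Cholesky factorisation, which is one common way of imposing a recursive (Wold causal chain) ordering. `impact_matrix` is a hypothetical helper name.

```python
import numpy as np

def impact_matrix(sigma, order):
    """Contemporaneous impact of orthogonal innovations for a given ordering.

    Returns B with B @ B.T == sigma. Entry (i, j) is the impact on variable i
    of the unit-variance innovation attached to variable j, with rows and
    columns labelled as in the original (unpermuted) system.
    """
    order = list(order)
    lower = np.linalg.cholesky(sigma[np.ix_(order, order)])  # lower triangular
    inverse = np.argsort(order)                              # undo the permutation
    return lower[np.ix_(inverse, inverse)]

# Made-up disturbance covariance matrix; variable 0 plays the role of money.
sigma = np.array([[1.0, 0.5, 0.3],
                  [0.5, 1.0, 0.4],
                  [0.3, 0.4, 1.0]])

money_first_a = impact_matrix(sigma, (0, 1, 2))
money_first_b = impact_matrix(sigma, (0, 2, 1))   # other variables reordered
money_second = impact_matrix(sigma, (1, 0, 2))    # money no longer causally prior
```

With money ordered first, its impact column is the same whichever way the remaining variables are ordered, but the columns attached to the other innovations change. Moving money out of first place changes the money column as well, so the implied policy multipliers differ even though all three factorisations reproduce the same `sigma`.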
As a criticism of Sims, this is too strong. Note first that in his applied work, Sims does not restrict himself to orthogonalisation assumptions as in (3), but is willing to explore a wider class of identifying restrictions which are not dissimilar to those made by structural modellers (see Sims, 1986). Moreover, he allows himself to search over different sets of identifying assumptions in order to obtain plausible policy multipliers. However, the sets of assumptions he explores all generate just identified models, with the implication that they are all compatible with the same reduced form. This permits a two stage procedure in which at the first stage the autoregressive representation (1) is estimated, and at the second stage this representation is interpreted into economic theory by the imposition of identifying assumptions on the moving average representation. The identifying assumptions may be controversial, but they do not contaminate estimation.
Although it is not true that VAR modelling is completely atheoretical, the philosophy of the VAR approach may be caricatured as attempting to limit the role of theory in order to obtain results which are as objective as possible and as near as possible independent of the investigator's theoretical beliefs or prejudices. An alternative approach, associated with what I have called elsewhere (Gilbert, 1989) the LSE (London School of Economics) methodology, is to use theory to structure models in a more or less loose way so as to obtain a model whose general interpretation is in line with theory but whose detail is determined by the data. The instrument for ensuring coherence with the data is classical testing methodology.
This immediately prompts the question of what constitutes a test of a theory which we regard as at best approximate. I have noted that it does not usually make much sense to suppose that we can sensibly use classical testing procedures to attempt to reject theories based on the behaviour of atomistic optimising agents on aggregate economic data, since there is no reason to suppose that those theories apply precisely on aggregate data.¹³ There are in practice two interesting questions. The first is whether a given theory is or is not too simple relative both to the data and for the purposes to hand. The second question is whether one theory-based model explains a given dataset better than another theory-based model.
The issue of simplification almost invariably prompts the map analogy. For example, Leamer (1978, p. 205) writes "Each map is a greatly simplified version of the theory of the world; each is designed for some class of decisions and works relatively poorly for others". Simplification is forced upon us by the fact that we have limited comprehension, and, more acutely in time series studies, by limited numbers of observations. As the amount of data available increases, we are able to entertain more complicated models, but this is not necessarily a benefit if we are interested in investigating relatively simple theories since the additional complexity may then largely take the form of nuisance parameters. Frequently, the increased model complexity will take the form of inclusion of more variables¹⁴ (i.e., revision of the marginalisation decision), and this can be tested using conventional classical nested techniques. The important question is whether omission of these factors results in biased coefficient values and incorrect inference in relation to the purposes of the investigation. The tourist and the geologist will typically use different maps, but the tourist may wish to know if there are steep gradients on his/her route, and questions of access are not totally irrelevant to the geologist.
The obvious trade-off in the sort of samples we frequently find ourselves analysing in time series macroeconometrics is between reduction in bias through the inclusion of additional regressors and reduced precision through the reduction in degrees of freedom and increase in collinearity. Short samples of aggregate data can only relate to simple theories since they only contain a limited amount of information. Macroeconometric models will therefore be simpler than the world they purport to represent. This does not particularly matter, but it does imply that we must always be aware that previously neglected factors may become important; an obvious example is provided by the role of inflation in the consumption function.
13. Heterogeneity may imply that these theories also fail to hold on micro data.
Two strategies are currently available for controlling for structural non-constancy. VAR modellers advise use of random coefficient vector autoregressions in which the model coefficients all evolve as random walks (Doan et al., 1984). In principle, this leads to very high dimensional models, but imposition of a tight Bayesian prior distribution heavily constrains the coefficient evolution and permits estimation. This procedure automates control for structural non-constancy, since the modeller's role is reduced to choice of the tightness parameters of the prior. A disadvantage is that it cannot ever prompt reassessment of the marginalisation decision, i.e., inclusion of previously excluded or unconsidered regressor variables.
An alternative approach which is gaining increasing support is the use of recursive regression methods to check for structural constancy. In recursive regression one uses updating formulae, first worked out by Timo Terasvirta (1970), to compute the regression of interest for each subsample $[1, t]$ for $t = T_1, \ldots, T$, where $T$ is the final observation available and $T_1$ is of the order of three times the number of regressor variables (see Hendry, 1989, pp. 20-21). This produces a large volume of output which is difficult to interpret except by graphical methods. Use of recursive methods had therefore to wait until PC technology allowed easy and costless preparation of graphs. It is now computationally trivial to graph Chow tests (Gregory Chow, 1960) for all possible structural breaks, or for one period ahead predictions for all periods within the $[T_1, T-1]$ interval. Also one can plot coefficient estimates against sample size. Although these graphical methods do not imply any precise statistical tests, they show up structural non-constancy of either the break or evolution form in an easily recognisable form, and prompt the investigator to ask why a particular coefficient is moving through the sample, or why a particular observation is exerting leverage on the coefficient estimates. These questions should then prompt appropriate model respecification. I am not aware that Leamer has ever advised use of recursive methods, but they do appear to be very much in the spirit of his concern with fragility in regression estimates (see Leamer and Hermann Leonard, 1983), even if the proposed data-based "solution" is not one he would favour.
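The mechanics of recursive regression can be sketched as follows. This is a minimal illustration on simulated data with an invented slope break; it re-estimates each subsample directly rather than using updating formulae (the estimates are the same, only the computation is less economical), and the function name `recursive_ols` is a hypothetical label.

```python
import numpy as np

def recursive_ols(y, X, t_start):
    """OLS coefficients on each subsample [1, t] for t = t_start, ..., T.

    Every subsample is re-estimated from scratch; updating formulae would
    give the same numbers with less computation.
    """
    T = len(y)
    paths = []
    for t in range(t_start, T + 1):
        beta, *_ = np.linalg.lstsq(X[:t], y[:t], rcond=None)
        paths.append(beta)
    return np.array(paths)  # row s holds the estimates for sample size t_start + s

# Simulated data with a slope break halfway through the sample (illustrative only).
rng = np.random.default_rng(0)
T = 200
x = rng.normal(size=T)
slope = np.where(np.arange(T) < 100, 1.0, 2.0)
y = 0.5 + slope * x + 0.1 * rng.normal(size=T)
X = np.column_stack([np.ones(T), x])

# Start at roughly three times the number of regressors.
paths = recursive_ols(y, X, t_start=6)
```

Plotting `paths[:, 1]` against sample size would show the slope estimate settling near its pre-break value and then drifting once post-break observations enter, which is the kind of pattern the graphical methods are intended to reveal.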
required to make a choice. There is now a considerable body of both econometric theory and of experience in non-nested hypothesis testing. Suppose we have two alternative and apparently congruent models A and B. Suppose initially model A (say a regression of y on X) gives the "correct" representation of the economic process under consideration. This implies that the estimates obtained by incorrectly supposing model B (regression of y on Z) to be true will suffer from misspecification bias. Knowledge of the covariance of the X and Z variables allows this bias to be calculated. Thus if A is true, it allows the econometrician to predict how B will perform; but if A does not give a good representation of the economic process, it will not be able to "explain" the model B coefficients. Furthermore, we can reverse the entire procedure and attempt to use model B to predict how A will perform.
These non-nested hypothesis tests, or encompassing tests as they are sometimes called, turn out to be very simple to perform. One forms the composite but quite possibly economically uninterpretable hypothesis $A \cup B$, which in the case discussed above is the regression of y on both X and Z (deleting one occurrence of any variable included in both X and Z), and then performs the standard F tests of A and B in turn against $A \cup B$. Four outcomes are possible. If one can accept the absence of the Z variables in the presence of the X variables, but not vice versa (i.e., $E[y \mid X, Z] = X\alpha$), model A is said to encompass model B; equally, model B may encompass model A ($E[y \mid X, Z] = Z\beta$). But two other outcomes are possible. If one cannot accept either $E[y \mid X, Z] = X\alpha$ or $E[y \mid X, Z] = Z\beta$, neither hypothesis may be maintained. Finally, one might be able to accept both $E[y \mid X, Z] = X\alpha$ and $E[y \mid X, Z] = Z\beta$, in which case the data are indecisive.¹⁵ This relates to Thomas Kuhn's view that a scientific theory will not be rejected simply because of anomalies, but rather because some of these anomalies can be explained by a rival theory (Kuhn, 1962).
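The pair of F tests against the composite regression can be sketched in a few lines. The data below are simulated so that model A (y on X) is the generating process and Z is merely correlated with X; the helper names `rss` and `f_against_union` and all parameter values are illustrative assumptions, and for brevity neither model contains a constant.

```python
import numpy as np

def rss(y, W):
    """Residual sum of squares from an OLS regression of y on W."""
    beta, *_ = np.linalg.lstsq(W, y, rcond=None)
    resid = y - W @ beta
    return float(resid @ resid)

def f_against_union(y, own, rival):
    """F statistic for excluding the rival's regressors from the composite regression."""
    union = np.column_stack([own, rival])
    q = rival.shape[1]                 # number of exclusion restrictions
    dof = len(y) - union.shape[1]      # residual degrees of freedom in the composite model
    rss_restricted, rss_union = rss(y, own), rss(y, union)
    return ((rss_restricted - rss_union) / q) / (rss_union / dof)

# Simulated example in which model A is the generating process (illustrative only).
rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
z = 0.6 * x + 0.8 * rng.normal(size=n)   # Z is correlated with X but plays no direct role
y = 1.5 * x + rng.normal(size=n)
X, Z = x[:, None], z[:, None]

f_a = f_against_union(y, X, Z)   # test of A: can Z be excluded given X?
f_b = f_against_union(y, Z, X)   # test of B: can X be excluded given Z?
```

Each statistic would be compared with the appropriate F critical value. In this design one would expect the test of B to reject decisively, while the test of A should reject only at about the nominal rate.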
The LSE procedure may be summarised as an attempt to obtain a parsimonious representation of a general unrestricted equation.¹⁶ This representation should simultaneously satisfy a number of criteria (Hendry and Jean-Francois Richard, 1982, 1983). First, it must be an acceptable simplification of the unrestricted equation either on the basis of a single F test against the unrestricted equation, or on the basis of a sequence of such tests.¹⁷ Second, it should have serially independent errors. Third, it must be structurally constant. I will return to the error correction specification shortly. The model discovery activity takes place in part in the parsimonious simplification activity, which typically involves the imposition of zero or equality restrictions on sets of coefficients, and also importantly in reviewing the marginalisation (variable exclusion) decisions. Parsimonious simplification may be regarded as in large measure a tidying up operation which does little to affect equation fit, controls for collinearity and thereby improves forecasting performance, and at worst results in exaggerated estimates of coefficient precision (since coefficients which are approximately zero or equal are set to be exactly zero or equal).¹⁸ Pravin Trivedi (1984) has coined the term "testimation" to describe the "empirical specification search involving a blend of estimation and significance tests". Importantly, parsimonious simplification conserves degrees of freedom and in this respect it is not dissimilar to the shrinkage procedure adopted in VAR modelling, the difference being mainly whether one imposes strong restrictions on a set of near zero coefficients (LSE), or weaker restrictions on the entire set of coefficients (VAR). It does not seem to me that there is any strong basis for suggesting that one method has statistical properties superior to the other. VAR modellers argue that their models have superior forecasting properties, but LSE modellers would reply that their methods tend to be more robust with respect to structural change. This is not an argument that can be settled on an a priori basis.
15. This is "coefficient encompassing". A more limited question ("variance encompassing") is whether we can explain the residual variances. Coefficient encompassing implies variance encompassing, but not vice versa (Mizon and Richard, 1986).
16. Usually this will involve OLS estimation of single equations, but the same procedures may be adopted in simultaneous models using appropriate estimators.
Opening up the marginalisation question is of greater importance. If a variable which is of practical importance is omitted from the model, perhaps because its presence is not indicated by the available theory, this omission is likely to cause biased coefficient estimates and either serially correlated residuals or over-complicated estimated dynamics. In the former case, one might be tempted to estimate using an appropriate autoregressive estimator, which is tantamount to regarding the autoregressive coefficients as nuisance parameters; while in the latter one will obtain the same result via unrestricted estimates of the autoregressive equation. The alternative, which is familiar to all of us, is to take the residual serial correlation as prompting the question of whether the model is well-specified, and in particular, whether important variables have been omitted. Subsequent discovery that this is indeed the case may either indicate a need to extend or revise the underlying theory, or more simply suggest the observation that the theory offers only a partial explanation of the data. In the latter case, the additional variables introduced into the model may perhaps be legitimately regarded as nuisance variables, but in the former case the two way interaction between theory and data will have a clear positive value.
The feature of the LSE approach on which I wish to concentrate is the role of cointegration and the prevalence of the error correction specification. The error correction specification is an attempt to combine the flexibility of time series (Box-Jenkins)¹⁹ models in accounting for short term dynamics with the theory-compatibility of traditional structural econometric models (see Gilbert, 1989). In this specification both the dependent variable and the explanatory variables appear as current and lagged differences (sometimes as second differences or multi-period differences), as in Box-Jenkins models, but unlike those models, the specification also includes a single lagged level of the dependent variable and a subset of the explanatory variables. For example, a stylised version of the Davidson et al. (1978) consumption function may be written as
$$\Delta_4 \ln c_t = \beta_0 + \beta_1 \Delta_4 \ln y_t + \beta_2 \Delta_1 \Delta_4 \ln y_t - \beta_3 (\ln c_{t-4} - \ln y_{t-4}) \tag{4}$$
where, on quarterly data, annual changes in consumption are related to annual changes in income and a four quarter lagged discrepancy between income and consumption. It is these lagged levels variables which determine the steady state solution of the model.²⁰ It will frequently be found that augmentation of pure difference equations by lagged levels terms in this way has a dramatic effect on forecasts and on estimated policy responses.
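As a worked illustration of how the lagged levels term pins down the long-run solution, write the coefficients of (4) as $\beta_0, \ldots, \beta_3$ and assume $\beta_3 \neq 0$. The symbol $g$ for a constant annual growth rate of income and consumption is introduced here for illustration only.

```latex
% Static steady state: all differenced terms in (4) are set to zero.
0 = \beta_0 - \beta_3 (\ln c - \ln y)
\quad\Longrightarrow\quad
\ln c = \ln y + \frac{\beta_0}{\beta_3}

% Steady state growth: \Delta_4 \ln c = \Delta_4 \ln y = g and \Delta_1 \Delta_4 \ln y = 0.
g = \beta_0 + \beta_1 g - \beta_3 (\ln c - \ln y)
\quad\Longrightarrow\quad
\ln c - \ln y = \frac{\beta_0 - (1 - \beta_1)\, g}{\beta_3}
```

In both cases the long-run income elasticity of consumption is unity. In the growth case the ratio of consumption to income also depends on $g$ unless $\beta_1 = 1$.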
It is always possible to reparameterise any unrestricted distributed lag equation specified in levels (e.g., a VAR) into the error correction form, so it may appear odd to claim any special status for this way of writing distributed lag relationships. Note however that the LSE procedure implicitly prohibits parsimonious simplification of the unrestricted equation into a Box-Jenkins model in which the lagged level of the dependent variable is excluded, even if this exclusion would result in negligible loss in fit. In this sense, the specification is non-trivial. That it is an interesting non-trivial specification depends on the claim that economic theory implies a set of comparative static results which are reflected in long-run constancies, and is reinforced by the logically independent but incorrect claim that economic theory tells us little about short-term adjustment processes.²¹
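The reparameterisation point can be checked numerically. The sketch below assumes a hypothetical first-order distributed lag equation in levels with invented coefficients, fits it by OLS, and then fits the algebraically equivalent error correction form; the two regressions give identical residuals, and the coefficients map into one another exactly.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 300
x = np.cumsum(rng.normal(size=T))   # a random-walk regressor (illustrative)
y = np.zeros(T)
for t in range(1, T):
    # Hypothetical levels equation: y_t = a + b0*x_t + b1*x_{t-1} + c*y_{t-1} + e_t
    y[t] = 0.2 + 0.5 * x[t] + 0.1 * x[t - 1] + 0.6 * y[t - 1] + 0.1 * rng.normal()

def ols(dep, regressors):
    """Return OLS coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(regressors, dep, rcond=None)
    return beta, dep - regressors @ beta

ones = np.ones(T - 1)

# Levels form: y_t on 1, x_t, x_{t-1}, y_{t-1}
levels_beta, levels_resid = ols(
    y[1:], np.column_stack([ones, x[1:], x[:-1], y[:-1]]))

# Error correction form: dy_t on 1, dx_t, (y - x)_{t-1}, x_{t-1}
ecm_beta, ecm_resid = ols(
    np.diff(y), np.column_stack([ones, np.diff(x), y[:-1] - x[:-1], x[:-1]]))

# Implied mapping: ECM coefficient on (y - x)_{t-1} equals c - 1,
# and the coefficient on x_{t-1} equals b0 + b1 + c - 1.
```

The restriction that distinguishes an error correction model from a pure difference (Box-Jenkins type) equation is therefore not the form itself but the refusal to drop the lagged levels terms.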
The earliest error correction specification was Denis Sargan's (1964) wage model in which the rate of increase in wage rates was related to the difference between the lagged real wage and a notional target real wage. Here there is a straightforward structural interpretation of the error correction term. More recently, however, the generality of the specification has received support from the Granger representation theorem (Robert Engle and Granger, 1987) which states that if there exists a stationary linear combination of a set of non-stationary variables (i.e., if the variables are "cointegrated") then these variables must be linked by at least one relationship which can be written in the error correction form. (If this were not the case, the variables would increasingly diverge over time.) Since most macroeconomic aggregates are non-stationary (typically they grow over time) any persisting (autonomous) relationship between aggregates over time is likely to be of the error correction form.
19. George Box and Gwilym Jenkins (1970).
20. In the steady state solution all the differenced variables are set to zero. The steady state growth solution, in which all the differenced variables are set to appropriate constants, is often more informative; see James Davidson, David Hendry, Frank Srba and Stephen Yeo (1978), and Gilbert (1986, 1989).
Cointegration therefore provides a powerful reason for supposing that there will exist structurally constant relationships between macroeconomic aggregates. If economic time series are non-stationary but cointegrated there are strong arguments for imposing the error correction structure on our models, and it is an advantage of the LSE methodology over the VAR methodology that it adopts this approach. A major role for economic theory in the LSE methodology is to aid the specification of the cointegrating term. Unsurprisingly, short samples of relatively high frequency (quarterly or monthly) data are often relatively uninformative about the long-run relationship between the variables, so that theoretically unmotivated specification of these terms gives little precision or discrimination between alternative specifications. One possibility, suggested by Engle and Granger (1987), is a two stage procedure where at the first stage one estimates the static ("cointegrating") regression ignoring the short-term dynamics, and at the second stage one imposes these long-run coefficients on the dynamic error correction model. However, Monte Carlo investigation suggests that this procedure has poor properties (Anindya Banerjee, Juan Dolado, David Hendry and Gregor Smith, 1986) and that it is preferable to attempt to estimate the long-run solution from the dynamic adjustment equation as in the initial Sargan (1964) wage model and the Davidson et al. (1978) consumption function model. Nevertheless, the long-run solution may still be poorly determined, implying that theoretical restrictions are unlikely to be rejected.
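The mechanics of the two stage procedure can be sketched on simulated data. The data generating process below (a random-walk regressor, a stationary first-order autoregressive deviation from equilibrium, and a very long sample) is an invented assumption chosen to make the steps easy to follow; it says nothing about how the procedure behaves in the short samples discussed above, which is where the Monte Carlo evidence finds it wanting.

```python
import numpy as np

rng = np.random.default_rng(3)
T = 2000
x = np.cumsum(rng.normal(size=T))           # non-stationary driving variable
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.normal()    # stationary deviation from equilibrium
y = 1.0 + 1.0 * x + u                       # y and x are cointegrated by construction

def ols(dep, regressors):
    """Return OLS coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(regressors, dep, rcond=None)
    return beta, dep - regressors @ beta

# Stage 1: static ("cointegrating") regression, ignoring the short-term dynamics.
stage1_beta, equilibrium_error = ols(y, np.column_stack([np.ones(T), x]))

# Stage 2: dynamic equation with the lagged stage 1 residual imposed
# as the error correction term.
stage2_X = np.column_stack([np.ones(T - 1), np.diff(x), equilibrium_error[:-1]])
stage2_beta, _ = ols(np.diff(y), stage2_X)

long_run_slope = stage1_beta[1]   # estimate of the long-run coefficient
adjustment = stage2_beta[2]       # error correction coefficient
```

The alternative favoured in the text would instead estimate the dynamic equation with the lagged levels entered freely and solve for the long-run coefficient from those estimates.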
Actual DGPs, they suggest, will be very complicated, but the combination of marginalisation (exclusion of variables that do not much matter), conditioning (regarding certain variables as exogenous²²) and simplification which together make up the activity of modelling can give rise to simple and structurally constant representations of the DGP.
The DGP concept derives from Monte Carlo analysis where the investigator specifies the process which will generate the data to be used in the subsequent estimation experiments. This suggests an analogy in which we suppose a fictional statistician choosing the data that we analyse in applied economics. In a pioneering contribution to the Artificial Intelligence literature, Alan Turing (1950) asked whether an investigator could infallibly distinguish which of two terminals is connected to a machine and which is operated by a human. Hendry dares us to claim that we can distinguish between Monte Carlo and real world economic data. If we cannot, the DGP analogy carries over, and we can hope to discover structural short-term dynamics.
This argument appears to me to be flawed. If macroeconomic data do exhibit constant short-term dynamics then one might expect any structural interpretation to relate to the parameters of the adjustment processes of the optimising agents. But we have seen that the aggregation conditions required for the aggregate parameters to be straightforwardly interpretable in terms of the microeconomic parameters are heroic. With Monte Carlo data, by contrast, we can be confident that there does exist a simple structure since the structure has been imposed by a single simple investigator. There are no aggregation issues, and the question of reduction does not arise.
The most promising route for rationalising the dynamics of LSE equations is in terms of the backward representation of forward looking optimising models. In simple models, optimising behaviour in the presence of adjustment costs will give rise to a second order difference equation which can be solved to give a lagged (partial) adjustment term and a forward lead on the expected values of the exogenous variables. But these future expected values may always be solved out in terms of past values of the exogenous variables, giving a backward looking representation. Stephen Nickell (1985) showed that in a number of cases of interest, this backward representation will have the error correction form, and this suggests that it may in general be possible to rationalise error correction models in these terms (Keith Cuthbertson, 1988). An implication of this view, via the Lucas (1976) critique, is that if the process followed by any of the exogenous variables changes, the backward looking relationship will be structurally non-constant while the forward looking representation will remain constant. Current experience, however, is that in these circumstances it is the forward looking equation that is non-constant (Hendry, 1988; Carlo Favero, 1989).
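A textbook instance of the adjustment cost argument may help fix ideas. The sketch assumes a quadratic objective that is not spelled out in the text: the agent chooses $y_t$ to minimise the expected discounted sum of $(y_{t+s} - y^*_{t+s})^2 + a\,(y_{t+s} - y_{t+s-1})^2$ with discount factor $\delta$, where $y^*$ is the target. The parameter values and variable names are hypothetical.

```python
import numpy as np

# Hypothetical parameters: adjustment cost weight and discount factor.
a, delta = 2.0, 0.95

# The first order condition of the assumed quadratic problem is the
# second order difference equation
#   a*delta*E[y_{t+1}] - (1 + a + a*delta)*y_t + a*y_{t-1} = -ystar_t,
# whose characteristic polynomial has one stable and one unstable root.
roots = np.roots([a * delta, -(1.0 + a + a * delta), a])
mu = float(min(roots.real))        # stable root: coefficient on y_{t-1}
unstable = float(max(roots.real))  # unstable root, solved forwards

# Solving the unstable root forwards gives the decision rule
#   y_t = mu*y_{t-1} + (1 - mu)*(1 - delta*mu) * sum_s (delta*mu)**s * E_t[ystar_{t+s}],
# i.e. a lagged (partial) adjustment term plus a lead on expected targets.
forward_weight = (1.0 - mu) * (1.0 - delta * mu)

# Under the further assumption that the target follows a random walk,
# E_t[ystar_{t+s}] = ystar_t, the lead collapses to a weight on ystar_t alone.
target_weight = forward_weight / (1.0 - delta * mu)
```

With a random-walk target the rule reduces to $\Delta y_t = (1 - \mu)(y^*_t - y_{t-1})$, a simple backward looking adjustment toward the target. Other processes for the target would change the backward looking coefficients but not $\mu$ or the forward looking rule, which is the sense in which the Lucas critique applies.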
An alternative approach is to regard the short-term dynamics in LSE relationships as nuisance terms. Direct estimation of the cointegrating relationships is inefficient because of the residual serial correlation resulting from the omitted dynamics and may be inconsistent because of simultaneity. It is possible that in part these omitted dynamics arise from aggregation across heterogeneous agents (Marco Lippi, 1988). In principle, one could estimate using a systems maximum likelihood (ML) estimator taking into account the serial correlation (Soren Johansen, 1988; Soren Johansen and Katarina Juselius, 1989), but there is advantage in using a single equations estimator since this localises any misspecification error. The single equations estimator must correct both for the simultaneity and for the serial correlation. In recent work Phillips (1988) has argued that LSE dynamic equations often come very close to and sometimes achieve optimal estimation of the cointegrating relationship. On this interpretation, the short-run dynamic terms in those equations are simply the simultaneity and Generalised Least Squares (GLS) adjustments, in the same way that one can rewrite the familiar Cochrane-Orcutt autoregressive estimator (Donald Cochrane and Guy Orcutt, 1949) in terms of a restricted OLS estimation of an equation containing lagged values of the dependent variable and the regressor variables (Hendry and Mizon, 1978). An implication is that we have come full circle back to pre-Haavelmo econometrics where the concern was the measurement of constants in lawlike relationships which in modern terminology are simply the cointegrating relationships.
However, this is to miss much of the point of the methods generated by Sargan, Hendry and their colleagues. Routine forecasting and policy analysis in econometrics is concerned at least as much with short-term movements in key variables as with their long-term equilibria. Furthermore, short-term (derivative) responses are generally very much better determined than long-term relationships. I argued in Gilbert (1989) that a substantial part of the motivation of the LSE tradition in econometrics was the perceived challenge to "white box" econometric models from "black box" time series (Box-Jenkins) models (Richard Cooper, 1972; Charles Nelson, 1972). The same points are true in relation to the development of VAR methodology. Practitioners of the LSE approach are unlikely, therefore, to recognise themselves in Phillips' description.
At the start of his famous 1972 survey "Lags in Economic Behavior", Marc Nerlove quoted Schultz (1938) as saying "Although a theory of dynamic economics is still a thing of the future, we must not be satisfied with the status
is still, in large part, a thing of the future" (Nerlove, 1972, p. 222). The rational expectations optimising models, examples of which I have already discussed, have constituted a major attempt to provide that dynamic theory. They have not been wholly successful, for the reasons I have indicated. Neither have they been wholly unsuccessful. A possible criticism of both the VAR and LSE approaches to modelling aggregate macro-dynamics is that they do not make any attempt to accommodate these theories. An alternative possibility is to argue that the problem is on the theorists' side, and that the rational expectations atomistic optimising models deliver models which are too simple even to be taken as reasonable approximations. The problem is, nevertheless, that however much the anomalies multiply, we are unlikely to abandon these theories until an alternative paradigm becomes available. Sadly, I do not see any indication that such a development is imminent.
I started this lecture by recalling a commitment to unify theory and empirical modelling. That programme has recorded a measure of success, but to a large extent that success has been in the modelling of long-term equilibrium relationships. When Nerlove surveyed the methods of dynamic economics, the contribution of theory was relatively new and relatively slight. We now have much better developed and more securely based theories of dynamic adjustment but these theories have been too simple to inform practical modelling. It is obviously possible to argue that this is the fault of the econometricians, and the level of discord among the econometricians might be held as evidence for this view. My suspicion is, however, that the current disarray in the econometric camp is the consequence of the lack of applicable theory. Where we have informative and detailed theories, as for example in demand analysis or the theory of financial asset prices, methodological debates are muted. If the theorist can develop realistic but securely based dynamic theories, then the competing approaches to econometric methodology could coexist quite happily throughout macroeconometrics.
I have made a number of different arguments in the course of this paper, so a brief summary may be useful.
1. I agree w i t h the c u r r e n t l y w i d e l y held view that i t is n o t possible in general t o estimate parameters o f m i c r o functions f r o m aggregate data. 2. I disagree w i t h the i m p l i e d view t h a t aggregate relationships cannot be interpreted i n terms o f m i c r o e c o n o m i c t h e o r y . The appropriate level o f aggregation w i l l depend b o t h on the purpose o f the m o d e l l i n g exer cise and o n the questions being asked.
3. T h e o r e t i c a l restrictions should n o t be expected t o h o l d precisely on aggregate data. This implies that classical rejections cannot per se be taken t o i m p l y rejection o f the theories i n question.
for d i s c r i m i n a t i n g between alternative imprecise theories.
5. I t is d i f f i c u l t t o argue a priori t h a t Bayesian shrinkage procedures have either superior or inferior statistical properties t o the pseudo-structural methods associated w i t h the B r i t i s h approach t o d y n a m i c m o d e l l i n g . A n advantage o f the latter approach is however that i t gives a central role t o m o d e l discovery, w h i c h may a l l o w a beneficial feedback f r o m data t o t h e o r y .
6. C o i n t e g r a t i o n provides a p o w e r f u l reason for believing t h a t macro-economic aggregates w i l l be l i n k e d b y s t r u c t u r a l l y stable relationships, and i t is an i m p o r t a n t advantage o f the B r i t i s h approach that i t embodies this feature or economic t i m e series t h r o u g h error c o r r e c t i o n . However, the argument that the B r i t i s h approach t o d y n a m i c m o d e l l i n g should be seen as simply a m e t h o d o f efficiently estimating these e q u i l i b r i u m relationships is misconceived.
7. The progress in estimating equilibrium relationships has not been matched by comparable progress in estimating dynamic adjustment processes, where theory and data appear to be quite starkly at odds. A possible response is that the existing optimising theories are just too simple.
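
The distinction drawn in points 6 and 7 between the equilibrium relationship and the adjustment towards it may be easier to see in a small numerical sketch. The following is an illustration only, not a procedure taken from this paper: it assumes a hypothetical data-generating process (a random-walk regressor, a long-run coefficient of 2 and a stationary AR(1) disequilibrium term), and the function names `simulate_cointegrated` and `engle_granger_two_step` are invented for the example. It applies the two-step idea of Engle and Granger (1987): a static regression for the long-run relationship, followed by an error-correction regression for the short-run dynamics.

```python
import numpy as np


def simulate_cointegrated(T=500, beta=2.0, rho=0.5, seed=0):
    """Hypothetical DGP (an assumption for illustration only):
    x is a random walk and y = beta * x + u, with u a stationary AR(1)."""
    rng = np.random.default_rng(seed)
    x = np.cumsum(rng.standard_normal(T))      # I(1) regressor
    eps = rng.standard_normal(T)
    u = np.zeros(T)
    for t in range(1, T):
        u[t] = rho * u[t - 1] + eps[t]         # stationary disequilibrium
    y = beta * x + u
    return x, y


def engle_granger_two_step(x, y):
    """Two-step estimation of an error-correction model.

    Step 1: static OLS of y on a constant and x (long-run relationship).
    Step 2: OLS of dy on a constant, dx and the lagged step-1 residual.
    Returns (beta_hat, gamma_hat, alpha_hat): the long-run coefficient,
    the impact coefficient on dx, and the error-correction coefficient.
    """
    X1 = np.column_stack([np.ones_like(x), x])
    a_hat, beta_hat = np.linalg.lstsq(X1, y, rcond=None)[0]
    resid = y - a_hat - beta_hat * x           # estimated disequilibrium

    dy = np.diff(y)                            # dy[t] = y[t+1] - y[t]
    dx = np.diff(x)
    X2 = np.column_stack([np.ones_like(dx), dx, resid[:-1]])
    _, gamma_hat, alpha_hat = np.linalg.lstsq(X2, dy, rcond=None)[0]
    return beta_hat, gamma_hat, alpha_hat


if __name__ == "__main__":
    x, y = simulate_cointegrated()
    beta_hat, gamma_hat, alpha_hat = engle_granger_two_step(x, y)
    print(f"long-run coefficient:         {beta_hat:.3f}")
    print(f"impact coefficient on dx:     {gamma_hat:.3f}")
    print(f"error-correction coefficient: {alpha_hat:.3f}")
```

Under these assumed parameters the error-correction coefficient should come out negative (close to rho - 1 = -0.5), so that deviations from the long-run relationship are gradually eliminated. The first step recovers only the equilibrium relationship; the adjustment dynamics are a separate modelling problem, which is the point at issue in 6 and 7.
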
REFERENCES
ALDRICH, JOHN, 1989. "Autonomy", Oxford Economic Papers, Vol. 41, pp. 15-34.
BANERJEE, ANINDYA, JUAN J. DOLADO, DAVID F. HENDRY, and GREGOR W. SMITH, 1986. "Exploring Equilibrium Relationships in Econometrics through Static Models: Some Monte Carlo Evidence", Oxford Bulletin of Economics and Statistics, Vol. 48, pp. 253-277.
BLUNDELL, RICHARD, 1988. "Consumer Behaviour: Theory and Empirical Evidence", Economic Journal, Vol. 98, pp. 16-65.
BLUNDELL, RICHARD, PANOS PASHARDES, and GUGLIELMO WEBER, 1988. "What do we Learn about Consumer Demand Patterns from Micro-Data?", Institute for Fiscal Studies, Working Paper No. W88/10.
BOX, GEORGE E.P., and GWILYM M. JENKINS, 1970. Time Series Analysis: Forecasting and Control, San Francisco: Holden-Day.
CHOW, GREGORY C., 1960. "Tests of Equality Between Sets of Coefficients in Two Linear Regressions", Econometrica, Vol. 28, pp. 591-605.
COCHRANE, DONALD, and GUY H. ORCUTT, 1949. "An Application of Least Squares Regression to Relationships Containing Auto-Correlated Error Terms", Journal of the American Statistical Association, Vol. 44, pp. 32-61.
COOLEY, THOMAS F., and STEPHEN F. LEROY, 1985. "Atheoretical Macroeconometrics: A Critique", Journal of Monetary Economics, Vol. 16, pp. 283-308.
COOPER, RICHARD L., 1972. "The Predictive Performance of Quarterly Econometric Models of the United States", in Bert Hickman (ed.), Econometric Models of Cyclical Behavior, NBER Studies in Income and Wealth, Vol. 36, pp. 813-926, New York:
CUTHBERTSON, KEITH, 1988. "The Demand for M1: A Forward-Looking Buffer Stock Model", Oxford Economic Papers, Vol. 40, pp. 110-131.
DAVIDSON, JAMES E.H., and DAVID F. HENDRY, 1981. "Interpreting Econometric Evidence: The Behaviour of Consumers' Expenditure in the UK", European Economic Review, Vol. 16, pp. 177-192.
DAVIDSON, JAMES E.H., DAVID F. HENDRY, FRANK SRBA, and STEPHEN YEO, 1978. "Econometric Modelling of the Aggregate Time Series Relationship Between Consumers' Expenditure and Income in the United Kingdom", Economic Journal, Vol. 88, pp. 661-692.
DEATON, ANGUS S., 1974. "A Reconsideration of the Empirical Implications of Additive Preferences", Economic Journal, Vol. 84, pp. 338-348.
DEATON, ANGUS S., and JOHN MUELLBAUER, 1980. "An Almost Ideal Demand System", American Economic Review, Vol. 70, pp. 312-332.
DOAN, THOMAS, ROBERT B. LITTERMAN, and CHRISTOPHER A. SIMS, 1984. "Forecasting and Conditional Projection Using Realistic Prior Distributions", Econometric Reviews, Vol. 3, pp. 1-100.
ENGLE, ROBERT F., and CLIVE W.J. GRANGER, 1987. "Co-Integration and Error Correction: Representation, Estimation and Testing", Econometrica, Vol. 55, pp. 251-276.
ENGLE, ROBERT F., DAVID F. HENDRY, and JEAN-FRANCOIS RICHARD, 1983. "Exogeneity", Econometrica, Vol. 51, pp. 277-304.
FAVERO, CARLO, 1989. "Testing for the Lucas Critique: An Application to Consumers' Expenditure", University of Oxford, Applied Economics Discussion Paper #73.
FRIEDMAN, MILTON, 1953. "The Methodology of Positive Economics", in Milton Friedman, Essays in Positive Economics, Chicago: University of Chicago Press, pp. 3-43.
FRISCH, RAGNAR, 1933. "Editorial", Econometrica, Vol. 1, pp. 1-4.
FRY, VANESSA C., and PANOS PASHARDES, 1988. "Non-Smoking and the Estimation of Household and Aggregate Demands for Tobacco", Institute for Fiscal Studies, processed.
GEWEKE, JOHN, 1984. "Comment", Econometric Reviews, Vol. 3, pp. 105-112.
GILBERT, CHRISTOPHER L., 1986. "Professor Hendry's Econometric Methodology", Oxford Bulletin of Economics and Statistics, Vol. 48, pp. 283-307.
GILBERT, CHRISTOPHER L., 1989. "LSE and the British Approach to Time Series Econometrics", Oxford Economic Papers, Vol. 41, pp. 108-128.
GORMAN, W.M. (TERRENCE), 1953. "Community Preference Fields", Econometrica, Vol. 19, pp. 63-80.
GORMAN, W.M. (TERRENCE), 1959. "Separable Utility and Aggregation", Econometrica, Vol. 27, pp. 469-481.
GORMAN, W.M. (TERRENCE), 1968. "The Structure of Utility Functions", Review of Economic Studies, Vol. 21, pp. 1-17.
GRANGER, CLIVE W.J., 1987. "Implications of Aggregation with Common Factors", Econometric Theory, Vol. 3, pp. 208-222.
GRANGER, CLIVE W.J., 1988. "Aggregation of Time Series - A Survey", Federal Reserve Bank of Minneapolis, Institute for Empirical Macroeconomics, Discussion Paper #1.
GRUNFELD, YEHUDA, 1958. "The Determinants of Corporate Investment", unpublished Ph.D. Thesis, University of Chicago.
GRUNFELD, YEHUDA, and ZVI GRILICHES, 1960. "Is Aggregation Necessarily Bad?",