Institute of Statistics Mimeograph Series No. 505
Raleigh
1966
COMBINING TIME SERIES AND CROSS-SECTION DATA IN SIMULTANEOUS LINEAR EQUATIONS
By
ASHIQ HUSSAIN
HUSSAIN, ASHIQ. Combining Time Series and Cross-Section Data in
Simultaneous Linear Equations. (Under the direction of THOMAS DUDLEY WALLACE.)
This thesis is concerned with the estimation of parameters in a
system of simultaneous linear equations, by the combined use of time
series and cross-sectional data. An error component disturbance model
is postulated: the disturbance term in each equation of the system is
assumed to have three mutually independent random components - one
associated with time, another associated with cross-sectional units, and
a third one representing measurement error. Usual distributional
properties are ascribed to these error components, and methods of
estimation appropriate for the model are developed.
A two-stage estimation procedure is given for the reduced-form
parameters. In the first stage, covariance matrices $\Sigma^{*}_{\mu\mu'}$
($\mu, \mu' = 1, 2, 3, \ldots, M$) of the reduced-form disturbances are estimated from
the ordinary least squares residuals. In the second stage, two sets
of estimators for the reduced-form parameters are derived: (i) single
equation estimators which result when Aitken's two-stage method is
applied to each reduced-form equation separately; and (ii) generalized
estimators obtained from the entire set of reduced-form equations.
Both sets of estimators compare favorably with the ordinary least squares estimators.
For the estimation of structural parameters, two methods designated
as the "Two-Stage Generalized Least Squares Method" and the "Three-Stage
Generalized Least Squares Method" are given for systems of
simultaneous linear equations. For a system of exactly identified
equations, a third method called the "Indirect Generalized Least Squares
Method" is also available. All three methods are based upon the
covariance matrices of reduced-form disturbances; the estimation of
covariance matrices of structural disturbances is avoided. The estimators
of structural parameters obtained by these methods are found to have
some optimal large sample properties.
Finally, some special cases, including the dummy variable model, are
considered. In this last case, it is found that, although BLU estimators
can be obtained for reduced-form parameters, the estimation of structural
parameters is made difficult by the fact that the number of
COMBINING TIME SERIES AND CROSS-SECTION DATA
IN SIMULTANEOUS LINEAR EQUATIONS
by
ASHIQ HUSSAIN
A thesis submitted to the Graduate Faculty of North Carolina State University
at Raleigh
in partial fulfillment of the requirements for the Degree of
Doctor of Philosophy
DEPARTMENT OF EXPERIMENTAL STATISTICS
RALEIGH
1966
APPROVED BY:
BIOGRAPHY
Born: Sargodha, West Pakistan, January 1, 1924
Married: Fahmida, June 27, 1954
Previous Work:
Undergraduate: B.A. (Mathematics) 1947, University of Punjab, Lahore, Pakistan
Graduate: M.A. (Mathematics) 1950, University of Punjab, Lahore, Pakistan
M.A. (Economics) 1962, University of Peshawar, Peshawar, Pakistan
M.E.S. 1965, North Carolina State University, Raleigh
Employment: Lecturer in Mathematics, Murray College, Sialkot, Pakistan,
and Pakistan Military Academy, Kakul, Pakistan, 1950 - 1954
Senior Lecturer in Mathematics
ACKNOWLEDGMENTS
I want to acknowledge my indebtedness to Dr. Thomas Dudley Wallace,
Chairman of my committee, who inspired my interest in the subject of
simultaneous linear equations and from whom I received unfailing
guidance and encouragement throughout my stay at North Carolina State
University.
I am also thankful to the other members of my committee for their
advice and helpful criticism.
I must thank the Agency for International Development, Department
of State, United States Government for financing my education here and
the faculty and staff of the Department of Experimental Statistics whose
cooperation and courtesy made my stay a pleasant one.
A special word of thanks should go to Mrs. Jo Ann Beauchaine for her
excellent typing of this thesis.
Lastly, I owe a debt of gratitude to my wife, Fahmida, who had to
shoulder the burden of supporting our children for three years that I
was in the United States, earning nothing, but who never failed to send
a word of encouragement from across the seas, and to my father, who
TABLE OF CONTENTS
1.0 INTRODUCTION
2.0 ORDINARY SIMULTANEOUS LINEAR EQUATION MODELS
2.1 Stochastic Specifications
2.2 Identification
2.3 Reduced-Form Estimation
2.4 Structural Estimation
2.4.1 Indirect Least Squares (ILS)
2.4.2 Two-Stage Least Squares (2-SLS)
2.4.3 Three-Stage Least Squares (3-SLS)
2.4.4 Limited Information Single Equation (LISE) or Least Variance Ratio Method
2.4.5 K-Class Estimators
2.4.6 Full Information Least Generalized Residual Variance Method
3.0 SOME THEOREMS ON STOCHASTIC CONVERGENCE
3.1 Useful Inequalities
3.2 Stochastic Convergence, Univariate Case
3.3 Convergence of Sequences of Random Vectors
4.0 ERROR COMPONENT MODEL: REDUCED-FORM ESTIMATION
4.1 Description of the System
4.2 Estimation of Covariance Matrices of Reduced-Form Disturbances
4.3 Some Useful Results
4.4 Estimators of Reduced-Form Parameters
5.0 ERROR COMPONENT MODEL: STRUCTURAL ESTIMATION OF AN EXACTLY IDENTIFIED SYSTEM
5.1 Notation
5.2 Indirect Generalized Least Squares
5.3 Two-Stage Generalized Least Squares
5.4 Three-Stage Generalized Least Squares
6.0 ERROR COMPONENT MODEL: STRUCTURAL ESTIMATION OF OVERIDENTIFIED SYSTEMS
6.1 Inadequacy of Indirect Generalized Least Squares Method
6.2 Two Lemmas
6.3 Two-Stage Generalized Least Squares Estimators
6.4 Three-Stage Generalized Least Squares Estimators
7.0 ERROR COMPONENT MODEL: SOME SPECIAL CASES
7.1 Fixed Cross-Sectional Effects
7.1.1 Estimators of Reduced-Form Parameters
7.1.2 Structural Estimation
7.2 Cross-Sectional and Time Effects Random with Finite Nonzero Expectations
7.3 Dummy Variable Specifications
8.0 SUMMARY AND CONCLUSIONS
LIST OF REFERENCES
1.0 INTRODUCTION
This thesis is concerned with the estimation of parameters in a
system of simultaneous linear equations by the combined use of time
series and cross-section data. The problem was suggested by Hildreth
in 1950; but so far as the present writer knows, it has received little
attention. Most of the work done thus far on simultaneous equations
is based on time series data. The model which is ordinarily used is:
$$y_t' A + x_t' B + u_t' = 0', \qquad (1.1)$$
where $y_t' = (y_{1t}, y_{2t}, \ldots, y_{Mt})$ is the vector of observed values of M
endogenous variables for the $t$th time period; $x_t' = (x_{1t}, x_{2t}, \ldots, x_{Kt})$
is the vector of observed values of K exogenous variables for the $t$th
time period; $u_t' = (u_{1t}, u_{2t}, \ldots, u_{Mt})$ is the vector of values of M
unobservable random variables in the system, specified by the relations
$$E(u_{\mu t}) = 0 \quad \text{for all } t = 1, 2, \ldots,$$
$$E(u_{\mu t} u_{\mu' t'}) = \begin{cases} \sigma_{\mu\mu'} \ (\text{finite}), & \text{if } t = t', \\ 0, & \text{if } t \neq t', \end{cases} \qquad (\mu, \mu' = 1, 2, \ldots, M).$$
A is an M×M nonsingular matrix of constants with diagonal elements
equal to -1; and B is a K×M matrix of constants.
A few cross-sectional studies have been made using the same general
model (model (1.1)) but data consisting of observations for different
cross-sectional units.
But no attempt has been made to pool time series and cross-section
data for the estimation of simultaneous equations.
In the classical single-equation problem two different models
have been suggested for combining time series and cross-section data
- (i) the "dummy-variable" model, and (ii) the error-component model.
In the "dummy-variable" model, time and cross-sectional effects are
assumed constant. A "dummy-variable" version of the simultaneous linear
equation system will be the following:
$$y_{it}' A + x_{it}' B + \lambda_i' + \tau_t' + u_{it}' = 0', \qquad (1.2)$$
where A and B are as defined in (1.1);
$y_{it}$ is an M×1 vector of observations on the endogenous variables for
the $i$th cross-section and $t$th time period;
$x_{it}$ is a K×1 vector of observations on the exogenous variables for the
$i$th cross-section and $t$th time period;
$\lambda_i$ is an M×1 vector of constant effects associated with the $i$th
cross-section;
$\tau_t$ is an M×1 vector of constant effects associated with the $t$th time
period; and
$u_{it}$ is an M×1 vector of values of unobservable random variables
specified by the relations
$$E(u_{\mu i t}) = 0,$$
$$E(u_{\mu i t} u_{\mu' i' t'}) = \begin{cases} \sigma_{\mu\mu'} \ (\text{finite}), & \text{if } i = i' \text{ and } t = t', \\ 0, & \text{otherwise}, \end{cases} \qquad \text{for } \mu, \mu' = 1, 2, \ldots, M.$$
The model is unsuitable for two reasons. First, it does not take
cognizance of random effects which may cause variations from time to
time and from one cross-sectional unit to another. Second, the number
of cross-sectional units as well as the number of time intervals must be
But this very assumption stands in the way of structural estimation. For
a subset of explanatory variables in each structural equation consists of
endogenous variables which are correlated with the error term. Consequently
the principle of least squares cannot be applied.
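In the error-component model, it is assumed that the error term has
three mutually independent random components - a component associated with
time, another associated with cross-sectional units, and a third one
representing measurement error. We adopt this model. We assume that the
disturbance term of the $\mu$th equation of the system is of the form
$$\epsilon_{\mu i t} = u_{\mu i} + v_{\mu t} + w_{\mu i t}, \qquad (1.3)$$
where $u_{\mu i}$, $v_{\mu t}$ and $w_{\mu i t}$ are mutually independent random variables
with zero means and
$$E(u_{\mu i} u_{\mu' i'}) = \begin{cases} \sigma^{u}_{\mu\mu'} \ (\text{finite}), & \text{if } i = i', \\ 0, & \text{if } i \neq i', \end{cases} \qquad \text{for } \mu, \mu' = 1, 2, 3, \ldots, M;$$
$$E(v_{\mu t} v_{\mu' t'}) = \begin{cases} \sigma^{v}_{\mu\mu'} \ (\text{finite}), & \text{if } t = t', \\ 0, & \text{if } t \neq t', \end{cases} \qquad \text{for } \mu, \mu' = 1, 2, 3, \ldots, M;$$
$$E(w_{\mu i t} w_{\mu' i' t'}) = \begin{cases} \sigma^{w}_{\mu\mu'} \ (\text{finite}), & \text{if } i = i' \text{ and } t = t', \\ 0, & \text{otherwise}, \end{cases} \qquad \text{for } \mu, \mu' = 1, 2, 3, \ldots, M.$$
It is further assumed that the components of the error terms are
normally distributed.
Our model is, therefore,
$$y_{it}' A + x_{it}' B + \epsilon_{it}' = 0', \qquad (1.4)$$
with disturbances $\epsilon_{it}' = (\epsilon_{1it}, \epsilon_{2it}, \ldots, \epsilon_{Mit})$ as specified in (1.3).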
A good rationalization of this type of error structure is given by
estimating the parameters of the system of equations - that is, the
elements of matrices A and B - which are appropriate to this error model.
It will be shown in the sequel that our methods yield estimators which
have some optimal properties.
The plan of this thesis is as follows: In Chapter 2.0 we present a
review of the existing material on the subject of simultaneous linear
equations, discussing briefly some of the well-known methods of
structural and reduced-form estimation. All these methods are based on
model (1.1). This is followed by a chapter on stochastic convergence;
a number of simple results on the convergence of sequences of random
variables are given, which are helpful in the derivation of large-sample
properties of estimators. The remaining chapters are devoted to the main
topic - estimation of the reduced-form and structural parameters in the
2.0 ORDINARY SIMULTANEOUS LINEAR EQUATION MODELS
2.1 Stochastic Specifications
By far the most important contributions to the subject of
simultaneous linear equations are those which use the model (1.1). We have a
rich and growing body of literature in this area, and a variety of methods
for structural and reduced-form estimation are available. Excellent
treatment of these methods is given by Theil [17], Goldberger [8] and
Johnston [9]. We review some of these methods briefly here merely to
provide a frame of reference for what is to follow in succeeding chapters.
Let T be the number of observations made on the observable variables
of the system (1.1). We can then write the system of equations compactly
as
$$YA + XB + U = 0, \qquad (2.1)$$
where
Y is a T×M matrix of observations on M endogenous variables;
X is a T×K matrix of observations on K exogenous variables; and
U is a T×M matrix of values of M unobservable random variables
$(u_1, u_2, \ldots, u_M)$.
The equation (2.1) will be referred to as the structure, while the
equations
$$Y = X\Pi + V, \qquad (2.2)$$
where
$$\Pi = -BA^{-1}, \qquad V = -UA^{-1},$$
will be called the reduced form of the system.
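The passage from the structure (2.1) to the reduced form (2.2) can be checked numerically. The following is a minimal sketch, not part of the thesis: the 2×2 matrices `A` and `B` and the helper names `matmul`, `inverse_2x2` and `reduced_form` are hypothetical, chosen only so that A is nonsingular with diagonal elements equal to -1. Exact rational arithmetic is used so that the identity holds without rounding.

```python
from fractions import Fraction as F

def matmul(P, Q):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def inverse_2x2(A):
    """Invert a 2x2 matrix; raise if it is singular."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("A must be nonsingular")
    return [[d / det, -b / det], [-c / det, a / det]]

def reduced_form(A, B):
    """Return Pi = -B A^{-1} for a system with M = 2 equations."""
    return [[-entry for entry in row] for row in matmul(B, inverse_2x2(A))]

# Hypothetical structural coefficients (illustration only).
A = [[F(-1), F(1, 2)], [F(1, 3), F(-1)]]   # M x M, diagonal elements -1
B = [[F(2), F(0)], [F(0), F(3)]]           # K x M
Pi = reduced_form(A, B)

# With the disturbances set to zero, y' = x' Pi must satisfy y'A + x'B = 0.
x = [[F(1), F(4)]]
y = matmul(x, Pi)
yA, xB = matmul(y, A), matmul(x, B)
residual = [[p + q for p, q in zip(yA[0], xB[0])]]
print(Pi, residual)
```

The same check, `Pi A = -B`, is the relation used later for identification.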
The exogenous variables $x_t' = (x_{1t}, x_{2t}, \ldots, x_{Kt})$ are determined
outside the system. The observation matrix $X = (x_1, x_2, \ldots, x_T)'$ is assumed
to be generated by some mechanism independently of the disturbances so that
$$E(U \mid X) = E(U) = 0, \qquad E(V \mid X) = E(V).$$
It is further assumed that $E(x_t x_t') = \Sigma_{xx}$ is nonsingular so that
$\operatorname{Plim}_{T\to\infty} (X'X/T)$ exists and is positive definite.
It has been shown that the specifications given above give the same
asymptotic results as would be given by the specification that $(x_1, \ldots, x_T)$
are nonstochastic but subject to the condition that the ordinary limit,
$$\lim_{T\to\infty} \frac{X'X}{T}, \qquad (2.3)$$
exists and is nonsingular.
In what follows we shall, for the sake of convenience, ignore the process
generating observations on the exogenous variables, regarding them as
nonstochastic but subject to the condition (2.3).
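It is clear that
$$E(u_t u_t') = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1M} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2M} \\ \vdots & \vdots & & \vdots \\ \sigma_{M1} & \sigma_{M2} & \cdots & \sigma_{MM} \end{bmatrix} = \Sigma_{uu}, \qquad (2.4)$$
and that $u_\mu$, the disturbance vector of the $\mu$th structural equation, has
covariance matrix
$$E(u_\mu u_\mu') = \sigma_{\mu\mu} I_T.$$
Continuing, we note that $v_t' = -u_t' A^{-1}$ so that $E(v_t) = 0$ and
$$E(v_t v_t') = (A^{-1})' \Sigma_{uu} A^{-1}.$$
Let $v_\mu$ be the disturbance vector of the $\mu$th equation in the reduced form
(2.2). Clearly
$$v_\mu = -\sum_{r=1}^{M} a^{\mu r} u_r,$$
where $a^{\mu r}$ ($r = 1, 2, \ldots, M$) are elements of the $\mu$th column of $A^{-1}$.
Or
$$v_{\mu t} = -\sum_{r=1}^{M} a^{\mu r} u_{rt} \qquad (t = 1, 2, \ldots, T),$$
so that
$$E(v_{\mu t} v_{\mu' t'}) = \begin{cases} \sum_{r=1}^{M} \sum_{r'=1}^{M} a^{\mu r} a^{\mu' r'} \sigma_{rr'}, & \text{if } t = t', \\ 0, & \text{if } t \neq t', \end{cases} \qquad (2.6)$$
($\mu, \mu' = 1, 2, \ldots, M$), which shows that the reduced-form disturbances are
temporally uncorrelated.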
2.2 Identification
We assume that the system (1.1) is identified - all underidentified
equations are deleted from the system. This means that given $\Pi$ there is
at least one solution of
$$\Pi A = -B. \qquad (2.7)$$
There must be a sufficient number of a priori restrictions on the
elements of A and B if (2.7) is to have at least one solution. We have
already imposed one restriction, namely, that diagonal elements of A are
equal to -1. Let the remaining restrictions take the form of
zero-restrictions - that is, some of the elements of A and B are zeros.
Let $\alpha^*_\mu$ and $\beta^*_\mu$ be the $\mu$th column vectors of A and B, respectively.
Let us assume that $\ell_\mu$ elements of $\alpha^*_\mu$ are nonzero.
One of these nonzero elements (the $\mu$th element) is -1. Therefore,
rearranging these elements we can write
$$\alpha^*_\mu = \begin{bmatrix} \alpha^0_\mu \\ 0 \end{bmatrix},$$
where $\alpha^0_\mu$ is $\ell_\mu \times 1$, having one element equal to -1
and all other elements nonzero.
Similarly, assuming that $K_\mu$ elements of $\beta^*_\mu$ are nonzero, while all
the remaining elements are zeros, we can write
$$\beta^*_\mu = \begin{bmatrix} \beta_\mu \\ 0 \end{bmatrix},$$
where $\beta_\mu$ is $K_\mu \times 1$.
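From (2.7) it follows that
$$-\beta^*_\mu = \Pi \alpha^*_\mu. \qquad (2.8)$$
Partitioning $\Pi$ as
$$\Pi = \begin{bmatrix} \Pi^{\mu}_{11} & \Pi^{\mu}_{12} \\ \Pi^{\mu}_{21} & \Pi^{\mu}_{22} \end{bmatrix},$$
where $\Pi^{\mu}_{11}$ is $K_\mu \times \ell_\mu$, $\Pi^{\mu}_{21}$ is $(K - K_\mu) \times \ell_\mu$,
$\Pi^{\mu}_{12}$ is $K_\mu \times (M - \ell_\mu)$ and $\Pi^{\mu}_{22}$ is $(K - K_\mu) \times (M - \ell_\mu)$,
and using (2.8), we have
$$\Pi^{\mu}_{11} \alpha^0_\mu = -\beta_\mu, \qquad \Pi^{\mu}_{21} \alpha^0_\mu = 0. \qquad (2.9)$$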
The $\mu$th equation in the structure is
$$Y \alpha^*_\mu + X \beta^*_\mu + u_\mu = 0.$$
Or
$$(y_\mu, Y^{(\mu)}) \alpha^0_\mu + X_\mu \beta_\mu + u_\mu = 0.$$
Here $Y^{(\mu)}$ is the matrix of $(\ell_\mu - 1)$ endogenous variables other than $y_\mu$
included in the $\mu$th structural equation, and $X_\mu$ is the matrix of
observations on $K_\mu$ exogenous variables included in this equation.
Or
$$y_\mu = Y^{(\mu)} \alpha_\mu + X_\mu \beta_\mu + u_\mu. \qquad (2.10)$$
Suppose that we know $\Pi$. If the equations (2.9) are solvable for $\alpha^0_\mu$
and $\beta_\mu$, we can immediately find the parameters of the $\mu$th structural
equation (2.10).
Now, by a theorem of matrix algebra (Goldberger [8], p. 23), a
necessary condition for the equation
$$\Pi^{\mu}_{21} \alpha^0_\mu = 0$$
to have a solution is that
$$K - K_\mu \geq \ell_\mu - 1$$
or
$$K \geq K_\mu + \ell_\mu - 1.$$
The solution of the equation
$$\Pi^{\mu}_{21} \alpha^0_\mu = 0,$$
when substituted for $\alpha^0_\mu$ in the first equation of (2.9), will yield a value
of $\beta_\mu$. Thus, given a knowledge of reduced-form parameters $\Pi$, we can at
once determine the $\mu$th structural equation.
Three situations arise:
When $K = K_\mu + \ell_\mu - 1$, the nullity of the $(K - K_\mu) \times \ell_\mu$ matrix $\Pi^{\mu}_{21}$
is exactly equal to 1 so that there is a unique solution $(\alpha^0_\mu, \beta_\mu)$ of
equation (2.9), and the $\mu$th structural equation is exactly identified.
When $K > K_\mu + \ell_\mu - 1$, the nullity of $\Pi^{\mu}_{21}$ exceeds 1. In this
case, the equations (2.9) admit more than one solution and the $\mu$th
structural equation is overidentified.
Finally, when $K < \ell_\mu + K_\mu - 1$, equations (2.9) have no solution
and the $\mu$th structural equation is said to be underidentified. The
underidentified relations are indeterminate so that only the first two
cases are relevant to the estimation problem.
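The counting rule behind these three cases can be written as a small helper. This is only a sketch of the order condition as stated in the text, comparing the number of excluded exogenous variables, K - K_mu, with l_mu - 1. It does not examine the rank of the relevant block of the reduced-form matrix, and the function name `classify_by_order_condition` and its argument names are hypothetical.

```python
def classify_by_order_condition(K, K_mu, l_mu):
    """Classify one structural equation by the order (counting) condition.

    K    : number of exogenous variables in the whole system
    K_mu : number of exogenous variables included in the equation
    l_mu : number of endogenous variables included in the equation
    """
    if not (0 <= K_mu <= K) or l_mu < 1:
        raise ValueError("need 0 <= K_mu <= K and l_mu >= 1")
    excluded_exogenous = K - K_mu
    required = l_mu - 1
    if excluded_exogenous == required:
        return "exactly identified"
    if excluded_exogenous > required:
        return "overidentified"
    return "underidentified"

# Example: 5 exogenous variables in the system, 3 included, 3 endogenous included.
print(classify_by_order_condition(K=5, K_mu=3, l_mu=3))
```

For instance, with K = 5, K_mu = 3 and l_mu = 3 the two excluded exogenous variables match l_mu - 1 = 2, so the equation is classified as exactly identified.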
2.3 Reduced-Form Estimation
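The reduced-form equations are
$$y_\mu = X \Pi_\mu + v_\mu \qquad (\mu = 1, 2, \ldots, M),$$
where $\Pi_\mu$ is the $\mu$th column vector of $\Pi$.
We have seen that
$$E(v_{\mu t} v_{\mu t'}) = 0 \quad \text{for } t \neq t' \qquad (\mu = 1, 2, \ldots, M)$$
and
$$E(v_{\mu t} v_{\mu' t}) = \omega_{\mu\mu'} = \sum_{r'=1}^{M} \sum_{r=1}^{M} a^{\mu r} \sigma_{rr'} a^{\mu' r'} \quad \text{for all } t,$$
so that
$$E(v_\mu v_\mu') = \omega_{\mu\mu} I_T.$$
Hence the ordinary least squares method applied to each reduced-form
equation yields unbiased estimates.
We have, therefore,
$$\hat{\Pi}_\mu = (X'X)^{-1} X' y_\mu = \Pi_\mu + (X'X)^{-1} X' v_\mu, \qquad (2.13)$$
which have covariance matrix
$$E(\hat{\Pi}_\mu - \Pi_\mu)(\hat{\Pi}_\mu - \Pi_\mu)' = (X'X)^{-1} \omega_{\mu\mu} \qquad (\mu = 1, 2, \ldots, M). \qquad (2.14)$$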
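It will be shown that if we take all the reduced-form equations and
simultaneously estimate $\Pi_1, \Pi_2, \ldots, \Pi_M$ we obtain the same estimates as
are given by the equation-by-equation ordinary least squares method.
Write the reduced-form equations in the form
$$y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix} = \begin{bmatrix} X\Pi_1 \\ X\Pi_2 \\ \vdots \\ X\Pi_M \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_M \end{bmatrix}. \qquad (2.15)$$
If we write
$$\bar{X} = \begin{bmatrix} X & 0 & \cdots & 0 \\ 0 & X & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & X \end{bmatrix}, \quad \text{which is } MT \times MK,$$
then the equations (2.15) take the form
$$y = \bar{X} \Pi + v,$$
where $\Pi = (\Pi_1', \Pi_2', \ldots, \Pi_M')'$ is $MK \times 1$, $v = (v_1', v_2', \ldots, v_M')'$ is $TM \times 1$, and
$$E(v v') = \begin{bmatrix} \omega_{11} I & \omega_{12} I & \cdots & \omega_{1M} I \\ \omega_{21} I & \omega_{22} I & \cdots & \omega_{2M} I \\ \vdots & \vdots & & \vdots \\ \omega_{M1} I & \omega_{M2} I & \cdots & \omega_{MM} I \end{bmatrix} = \begin{bmatrix} \omega_{11} & \omega_{12} & \cdots & \omega_{1M} \\ \omega_{21} & \omega_{22} & \cdots & \omega_{2M} \\ \vdots & \vdots & & \vdots \\ \omega_{M1} & \omega_{M2} & \cdots & \omega_{MM} \end{bmatrix} \otimes I = \Omega \otimes I. \qquad (2.17)$$
By Aitken's generalized least squares method we have
$$\hat{\Pi} = [\bar{X}' (\Omega^{-1} \otimes I) \bar{X}]^{-1} \bar{X}' (\Omega^{-1} \otimes I) y.$$
These estimates are B.L.U. Denoting the elements of $\Omega^{-1}$ by $\omega^{jk}$, we have
$$\hat{\Pi} = \begin{bmatrix} \omega^{11}(X'X) & \omega^{12}(X'X) & \cdots & \omega^{1M}(X'X) \\ \omega^{21}(X'X) & \omega^{22}(X'X) & \cdots & \omega^{2M}(X'X) \\ \vdots & \vdots & & \vdots \\ \omega^{M1}(X'X) & \omega^{M2}(X'X) & \cdots & \omega^{MM}(X'X) \end{bmatrix}^{-1} \begin{bmatrix} \sum_{j=1}^{M} \omega^{1j} X' y_j \\ \sum_{j=1}^{M} \omega^{2j} X' y_j \\ \vdots \\ \sum_{j=1}^{M} \omega^{Mj} X' y_j \end{bmatrix}$$
$$= \begin{bmatrix} \omega_{11}(X'X)^{-1} & \omega_{12}(X'X)^{-1} & \cdots & \omega_{1M}(X'X)^{-1} \\ \omega_{21}(X'X)^{-1} & \omega_{22}(X'X)^{-1} & \cdots & \omega_{2M}(X'X)^{-1} \\ \vdots & \vdots & & \vdots \\ \omega_{M1}(X'X)^{-1} & \omega_{M2}(X'X)^{-1} & \cdots & \omega_{MM}(X'X)^{-1} \end{bmatrix} \begin{bmatrix} \sum_{j=1}^{M} \omega^{1j} X' y_j \\ \sum_{j=1}^{M} \omega^{2j} X' y_j \\ \vdots \\ \sum_{j=1}^{M} \omega^{Mj} X' y_j \end{bmatrix}. \qquad (2.18)$$
Since
$$\sum_{j=1}^{M} \omega_{\mu j} \omega^{j\ell} = \begin{cases} 1, & \text{if } \mu = \ell, \\ 0, & \text{if } \mu \neq \ell, \end{cases}$$
we have from (2.18)
$$\hat{\Pi}_\mu = (X'X)^{-1} X' y_\mu, \qquad (2.19)$$
which shows the classical least squares method applied to each equation
separately yields B.L.U. estimators.
Further, the covariance matrix of $\hat{\Pi}_\mu$ is
$$(X'X)^{-1} \omega_{\mu\mu} = \frac{\omega_{\mu\mu}}{T} \left( \frac{X'X}{T} \right)^{-1}.$$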
2.4 Structural Estimation
2.4.1 Indirect Least Squares (ILS)
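Suppose that we have estimated $\Pi$ by $\hat{\Pi}$. $\hat{\Pi}$ is clearly consistent.
Let us recall the equations (2.9):
$$\Pi^{\mu}_{11} \alpha^0_\mu = -\beta_\mu, \qquad \Pi^{\mu}_{21} \alpha^0_\mu = 0.$$
If the equations are exactly identified,
$$K = K_\mu + \ell_\mu - 1 \quad \text{and} \quad K - K_\mu = \ell_\mu - 1.$$
Therefore the nullity of $\hat{\Pi}^{\mu}_{21}$, which is a $(K - K_\mu) \times \ell_\mu$ matrix, is exactly
equal to 1, and there is a unique solution of
$$\hat{\Pi}^{\mu}_{21} \alpha^0_\mu = 0.$$
Denote it by $\hat{\alpha}^0_\mu$. Then $\hat{\alpha}^0_\mu$ is a consistent estimate of $\alpha^0_\mu$. Therefore
$$\hat{\beta}_\mu = -\hat{\Pi}^{\mu}_{11} \hat{\alpha}^0_\mu$$
is a consistent estimate of $\beta_\mu$.
This holds for all $\mu = 1, 2, \ldots, M$.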
Thus in order to find consistent estimates of structural parameters
Now if $K > K_\mu + \ell_\mu - 1$, the nullity of $\hat{\Pi}^{\mu}_{21}$ exceeds one, and there
are more than one consistent estimates of $\alpha^0_\mu$ corresponding to a
consistent estimate $\hat{\Pi}^{\mu}_{21}$ of $\Pi^{\mu}_{21}$. Each of these estimates substituted in
$$\hat{\beta}_\mu = -\hat{\Pi}^{\mu}_{11} \hat{\alpha}^0_\mu$$
will give a consistent estimate of $\beta_\mu$. One has to arbitrarily discard
all but one of the estimates of $\alpha^0_\mu$ in order to obtain a consistent
estimate of the $\mu$th structural equation in the overidentified case.
2.4.2 Two-Stage Least Squares (2-SLS)
The two-stage least squares method was developed by Theil [17] and,
independently, by Basmann [3, 4]. The method avoids the arbitrariness
and loss of efficiency involved in the ILS method when the system is
over-identified.
Suppose we estimate the reduced-form parameters by ordinary least
squares method applied to each reduced-form equation separately. These
estimates have been shown to be B.L.U. and consistent. Therefore
$$\hat{y}_m = X(X'X)^{-1}X' y_m$$
is a consistent estimate of $X\Pi_m$, for $m = 1, 2, \ldots, M$, and
$$\hat{v}_m = y_m - X(X'X)^{-1}X' y_m = (I_T - X(X'X)^{-1}X') y_m \qquad (2.20)$$
is a consistent estimate of $v_m$ ($m = 1, 2, \ldots, M$).
In the reduced-form estimation we obtain the estimator $\hat{Y}^{(\mu)}$ and the
estimator $\hat{V}^{(\mu)}$ of the corresponding matrix of reduced-form errors so that
$$Y^{(\mu)} = \hat{Y}^{(\mu)} + \hat{V}^{(\mu)}$$
and (2.10) can be written in the form
$$y_\mu = \hat{Y}^{(\mu)} \alpha_\mu + X_\mu \beta_\mu + (u_\mu + \hat{V}^{(\mu)} \alpha_\mu). \qquad (2.22)$$
The two-stage least squares estimators $\hat{\delta}_\mu$ of structural
parameters are
$$\hat{\delta}_\mu = \begin{bmatrix} \hat{\alpha}_\mu \\ \hat{\beta}_\mu \end{bmatrix} = \begin{bmatrix} \hat{Y}^{(\mu)\prime} \hat{Y}^{(\mu)} & \hat{Y}^{(\mu)\prime} X_\mu \\ X_\mu' \hat{Y}^{(\mu)} & X_\mu' X_\mu \end{bmatrix}^{-1} \begin{bmatrix} \hat{Y}^{(\mu)\prime} y_\mu \\ X_\mu' y_\mu \end{bmatrix}. \qquad (2.23)$$
It has been shown that these estimators are consistent and that
their asymptotic covariance matrix is
$$\frac{\sigma_{\mu\mu}}{T} \left[ \operatorname{Plim}_{T\to\infty} \frac{1}{T} \begin{bmatrix} \hat{Y}^{(\mu)\prime} \hat{Y}^{(\mu)} & \hat{Y}^{(\mu)\prime} X_\mu \\ X_\mu' \hat{Y}^{(\mu)} & X_\mu' X_\mu \end{bmatrix} \right]^{-1}, \qquad (2.24)$$
which is consistently estimated by the matrix
$$s_{\mu\mu} \begin{bmatrix} \hat{Y}^{(\mu)\prime} \hat{Y}^{(\mu)} & \hat{Y}^{(\mu)\prime} X_\mu \\ X_\mu' \hat{Y}^{(\mu)} & X_\mu' X_\mu \end{bmatrix}^{-1}, \qquad (2.25)$$
where $s_{\mu\mu}$ is given by (2.26).
An alternative derivation is the following:
Writing $H_\mu = (Y^{(\mu)}, X_\mu)$ and premultiplying (2.10) by $X'$, we obtain
$$X' y_\mu = X' H_\mu \delta_\mu + X' u_\mu.$$
The covariance matrix of the transformed error vector is $\sigma_{\mu\mu}(X'X)$,
which is not diagonal. Ignoring the correlation between $X'Y^{(\mu)}$ and $X'u_\mu$
(which is weak) and applying Aitken's method, we obtain
$$\hat{\delta}_\mu = [H_\mu' X(X'X)^{-1}X' H_\mu]^{-1} H_\mu' X(X'X)^{-1}X' y_\mu. \qquad (2.27)$$
If the equation is exactly identified, $X'H_\mu$ is a square matrix. If
it is also nonsingular, (2.27) simplifies to
$$\hat{\delta}_\mu = (X'H_\mu)^{-1} X' y_\mu. \qquad (2.28)$$
2.4.3 Three-Stage Least Squares (3-SLS)
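The three-stage least squares method was developed by Zellner and
Theil [20]. Zellner [19] has shown that if the error terms in a set of
regression relations were contemporaneously correlated and if the
regressors in different relations were different, Aitken's method applied
to the entire set of relations would yield better estimates than the
ordinary least squares method applied to each equation separately. The
three-stage least squares method utilizes the same principle.
Writing $H_\mu = (Y^{(\mu)}, X_\mu)$,
$$H = \begin{bmatrix} H_1 & 0 & \cdots & 0 \\ 0 & H_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & H_M \end{bmatrix}, \quad \bar{X} = \begin{bmatrix} X & 0 & \cdots & 0 \\ 0 & X & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & X \end{bmatrix}, \quad y = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_M \end{bmatrix}, \quad u = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_M \end{bmatrix} \quad \text{and} \quad \delta = \begin{bmatrix} \delta_1 \\ \delta_2 \\ \vdots \\ \delta_M \end{bmatrix},$$
and premultiplying each structural equation by $X'$, we obtain
$$\bar{X}' y = \bar{X}' H \delta + \bar{X}' u.$$
Since $E(\bar{X}' u u' \bar{X}) = \bar{X}' (\Sigma_{uu} \otimes I_T) \bar{X}$, Aitken's generalized least squares method
is appropriate. Consequently, we have
$$\hat{\delta} = [H' \bar{X} (\bar{X}' (\Sigma_{uu} \otimes I_T) \bar{X})^{-1} \bar{X}' H]^{-1} [H' \bar{X} (\bar{X}' (\Sigma_{uu} \otimes I_T) \bar{X})^{-1} \bar{X}' y],$$
which reduces to
$$\hat{\delta} = \begin{bmatrix} \sigma^{11} H_1' X(X'X)^{-1}X' H_1 & \sigma^{12} H_1' X(X'X)^{-1}X' H_2 & \cdots & \sigma^{1M} H_1' X(X'X)^{-1}X' H_M \\ \sigma^{21} H_2' X(X'X)^{-1}X' H_1 & \sigma^{22} H_2' X(X'X)^{-1}X' H_2 & \cdots & \sigma^{2M} H_2' X(X'X)^{-1}X' H_M \\ \vdots & \vdots & & \vdots \\ \sigma^{M1} H_M' X(X'X)^{-1}X' H_1 & \sigma^{M2} H_M' X(X'X)^{-1}X' H_2 & \cdots & \sigma^{MM} H_M' X(X'X)^{-1}X' H_M \end{bmatrix}^{-1} \begin{bmatrix} \sum_{j=1}^{M} \sigma^{1j} H_1' X(X'X)^{-1}X' y_j \\ \sum_{j=1}^{M} \sigma^{2j} H_2' X(X'X)^{-1}X' y_j \\ \vdots \\ \sum_{j=1}^{M} \sigma^{Mj} H_M' X(X'X)^{-1}X' y_j \end{bmatrix}, \qquad (2.29)$$
where $\sigma^{jk}$ denotes the elements of $\Sigma_{uu}^{-1}$.
Replacing $\sigma^{jk}$ by their estimates $s^{jk}$ obtained in 2-SLS method, we obtain
the three-stage least squares estimators
$$\hat{\delta} = \begin{bmatrix} s^{11} H_1' X(X'X)^{-1}X' H_1 & s^{12} H_1' X(X'X)^{-1}X' H_2 & \cdots & s^{1M} H_1' X(X'X)^{-1}X' H_M \\ s^{21} H_2' X(X'X)^{-1}X' H_1 & s^{22} H_2' X(X'X)^{-1}X' H_2 & \cdots & s^{2M} H_2' X(X'X)^{-1}X' H_M \\ \vdots & \vdots & & \vdots \\ s^{M1} H_M' X(X'X)^{-1}X' H_1 & s^{M2} H_M' X(X'X)^{-1}X' H_2 & \cdots & s^{MM} H_M' X(X'X)^{-1}X' H_M \end{bmatrix}^{-1} \begin{bmatrix} \sum_{j=1}^{M} s^{1j} H_1' X(X'X)^{-1}X' y_j \\ \sum_{j=1}^{M} s^{2j} H_2' X(X'X)^{-1}X' y_j \\ \vdots \\ \sum_{j=1}^{M} s^{Mj} H_M' X(X'X)^{-1}X' y_j \end{bmatrix}. \qquad (2.30)$$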
Zellner and Theil have shown that these estimators are consistent and
that their asymptotic covariance matrix is
$$\lim_{T\to\infty} E[T(\hat{\delta} - \delta)(\hat{\delta} - \delta)'] = \operatorname{Plim}_{T\to\infty} T \begin{bmatrix} \sigma^{11} H_1' X(X'X)^{-1}X' H_1 & \cdots & \sigma^{1M} H_1' X(X'X)^{-1}X' H_M \\ \sigma^{21} H_2' X(X'X)^{-1}X' H_1 & \cdots & \sigma^{2M} H_2' X(X'X)^{-1}X' H_M \\ \vdots & & \vdots \\ \sigma^{M1} H_M' X(X'X)^{-1}X' H_1 & \cdots & \sigma^{MM} H_M' X(X'X)^{-1}X' H_M \end{bmatrix}^{-1}, \qquad (2.31)$$
which is consistently estimated by the matrix on the right-hand side of
(2.31). It has been, further, shown that these estimators are
asymptotically more efficient than the two-stage least squares estimators.
It is, of course, easy to see that if all the equations are exactly
identified, the 2-SLS and 3-SLS give identical estimators.
2.4.4 Limited Information Single Equation (LISE) or Least Variance Ratio Method
The two methods are essentially the same, though these were
developed under different assumptions - the former by Anderson and
Rubin [1, 2] under the assumption of normality of structural disturbances,
and the latter by Koopmans and Hood [10] without normality assumption.
Consider the first structural equation
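$$y_1 = Y^{(1)} \alpha_1 + X_1 \beta_1 + u_1,$$
so that
$$Y^*_1 \alpha^0_1 + X_1 \beta_1 + u_1 = 0, \qquad Y^*_1 = (y_1, Y^{(1)}).$$
Partition X - the matrix of observations on all exogenous variables
in the complete system of equations - as $X = (X_1, X_2)$. The vector $\alpha^0_1$ is
chosen so that the ratio of the residual variance when $Y^*_1 \alpha^0_1$
is regressed on $X_1$ to the residual variance when $Y^*_1 \alpha^0_1$ is regressed on X is
as small as possible - that is, the addition of excluded exogenous
variables $X_2$ should make a minimum improvement to the explained sums of
squares. Let $\ell$ denote the ratio of the two unexplained sums of squares.
We have
$$\ell = \frac{\alpha^{0\prime}_1 Y^{*\prime}_1 [I - X_1(X_1'X_1)^{-1}X_1'] Y^*_1 \alpha^0_1}{\alpha^{0\prime}_1 Y^{*\prime}_1 [I - X(X'X)^{-1}X'] Y^*_1 \alpha^0_1} = \frac{\alpha^{0\prime}_1 W^*_1 \alpha^0_1}{\alpha^{0\prime}_1 W_1 \alpha^0_1} \quad \text{(say)}. \qquad (2.32)$$
Minimizing with respect to $\alpha^0_1$, we obtain
$$(W^*_1 - \ell W_1) \alpha^0_1 = 0. \qquad (2.33)$$
This equation has a nontrivial solution if and only if the nullity
of the matrix $W^*_1 - \ell W_1$ is at least equal to 1 - that is, if and only if
$$|W^*_1 - \ell W_1| = 0. \qquad (2.34)$$
This is a polynomial of degree $\ell_1$ in $\ell$. If $\hat{\ell}$ is the smallest of
the roots of (2.34), we can solve
$$(W^*_1 - \hat{\ell} W_1) \alpha^0_1 = 0 \qquad (2.35)$$
for $\alpha^0_1$, obtaining an estimate $\hat{\alpha}^0_1$ of $\alpha^0_1$.
The next problem is to find an estimator $\hat{\beta}_1$ of $\beta_1$. Regressing
$-Y^*_1 \hat{\alpha}^0_1$ on $X_1$, $\beta_1$ is estimated by
$$\hat{\beta}_1 = -(X_1'X_1)^{-1} X_1' Y^*_1 \hat{\alpha}^0_1, \qquad (2.36)$$
where
$$\hat{\alpha}^0_1 = \begin{bmatrix} -1 \\ \hat{\alpha}_1 \end{bmatrix}.$$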
2.4.5 K-Class Estimators
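Theil [24] defines the k-class estimators by the relations
$$\begin{bmatrix} Y^{(\mu)\prime} Y^{(\mu)} - k \hat{V}^{(\mu)\prime} \hat{V}^{(\mu)} & Y^{(\mu)\prime} X_\mu \\ X_\mu' Y^{(\mu)} & X_\mu' X_\mu \end{bmatrix} \begin{bmatrix} \hat{\alpha}_\mu \\ \hat{\beta}_\mu \end{bmatrix} = \begin{bmatrix} (Y^{(\mu)} - k \hat{V}^{(\mu)})' y_\mu \\ X_\mu' y_\mu \end{bmatrix} \qquad (2.37)$$
($\mu = 1, 2, \ldots, M$), where $Y^{(\mu)}$ and $\hat{V}^{(\mu)}$ have been defined before, and k is a
scalar which may be a constant. Or it may be a stochastic or
non-stochastic variable.
When $k = 0$, we have the ordinary least squares estimators; when
$k = \hat{\ell}$, we have the LVR estimators; and when $k = 1$, we obtain the 2-SLS estimators.
It has been shown by Nagar [14] that if $\operatorname{Plim}_{T\to\infty} (k - 1) = 0$, the k-class
estimators are consistent; further, if $\sqrt{T}(k - 1) \xrightarrow{P} 0$, they have
asymptotic covariance matrix
$$\frac{\sigma_{\mu\mu}}{T} \left[ \operatorname{Plim}_{T\to\infty} \frac{1}{T} \begin{bmatrix} Y^{(\mu)\prime} Y^{(\mu)} - \hat{V}^{(\mu)\prime} \hat{V}^{(\mu)} & Y^{(\mu)\prime} X_\mu \\ X_\mu' Y^{(\mu)} & X_\mu' X_\mu \end{bmatrix} \right]^{-1}. \qquad (2.38)$$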
2.4.6 Full Information Least Generalized Residual Variance Method
This method was developed by Koopmans, Rubin and Leipnik [11] under
the specification that the structural disturbances are normally
distributed. While in the limited information least variance ratio method
only the restrictions on a single structural equation are utilized, this
method uses the restrictions on the entire system of structural equations.
It is a method for simultaneous estimation of all structural parameters.
In order to estimate the reduced-form system
$$Y = X\Pi + V,$$
we minimize
$$\left| \frac{1}{T} V'V \right| = \left| \frac{1}{T} (Y - X\Pi)'(Y - X\Pi) \right|.$$
We may as well minimize $\frac{1}{2} \log |V'V|$. But
$$\frac{1}{2} \log |V'V| = -\log |A| + \frac{1}{2} \log |U'U|. \qquad (2.39)$$
From the structural equations,
$$(Y, X) \begin{bmatrix} A \\ B \end{bmatrix} + U = 0 \quad \text{or} \quad Z \bar{A} + U = 0,$$
we have
$$\frac{1}{2} \log |V'V| = -\log |A| + \frac{1}{2} \log |\bar{A}' Z' Z \bar{A}|. \qquad (2.40)$$
This quantity is minimized subject to the constraints on $\bar{A} = \begin{bmatrix} A \\ B \end{bmatrix}$
to obtain estimates of structural parameters. The method has not been
extensively used, because on equating to zero the partial derivatives of
$\log |V'V|$ with respect to the elements of $\bar{A}$, we are confronted with a complex
3.0 SOME THEOREMS ON STOCHASTIC CONVERGENCE
3.1 Useful Inequalities
This chapter concerns the convergence of sequences of random
variables. Using the standard notation, we shall give some theorems
on stochastic convergence which will be useful in the sequel.
First, we state a few inequalities:
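(i) Let $x_1, x_2, \ldots, x_n$; $y_1, y_2, \ldots, y_n$ be nonnegative real
numbers. Then
$$\sum_{i=1}^{n} x_i y_i \leq \left( \sum_{i=1}^{n} x_i^{p} \right)^{1/p} \left( \sum_{i=1}^{n} y_i^{q} \right)^{1/q} \qquad \text{(Hölder)} \qquad (3.1)$$
where $p > 1$ and $\frac{1}{p} + \frac{1}{q} = 1$.
(ii) For arbitrary real $x_i, y_i$ ($i = 1, 2, \ldots, n$)
$$\left( \sum_{i=1}^{n} x_i y_i \right)^2 \leq \left( \sum_{i=1}^{n} x_i^2 \right) \left( \sum_{i=1}^{n} y_i^2 \right) \qquad \text{(Cauchy-Schwarz)} \qquad (3.2)$$
(iii) If X and Y are random variables,
$$E|XY| \leq [E|X|^p]^{1/p} [E|Y|^q]^{1/q} \qquad \text{(Hölder)} \qquad (3.3)$$
for $p > 1$ and $\frac{1}{p} + \frac{1}{q} = 1$, provided that the expectations exist.
An extension of (3.3) is the following:
$$E|XYZ \cdots| \leq [E|X|^p]^{1/p} [E|Y|^q]^{1/q} [E|Z|^r]^{1/r} \cdots, \qquad \frac{1}{p} + \frac{1}{q} + \frac{1}{r} + \cdots = 1. \qquad (3.4)$$
(iv) Let X be a random variable and c a positive constant. Then
$$P[|X| \geq c] \leq \frac{E|X|^r}{c^r} \quad \text{for all } r > 0. \qquad \text{(Markov)} \qquad (3.5)$$
(v) Let X and Y be two random variables and $\epsilon > 0$ an arbitrary
number. Then
$$P[|X - Y| > \epsilon] \leq \frac{E[X - Y]^2}{\epsilon^2}. \qquad (3.6)$$
(vi) For any random variable X,
$$|E(X)| \leq E|X| \quad \text{provided } EX \text{ exists.} \qquad (3.7)$$
(For proof of these inequalities see Rao [15] and Loève [12].)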
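As a quick numerical illustration of the Markov inequality (iv), the sketch below evaluates both sides exactly for a small discrete distribution. The distribution, the threshold c = 2 and the helper name `markov_sides` are hypothetical choices made for illustration only; integer values of r are used so that the arithmetic stays exact.

```python
from fractions import Fraction as F

def markov_sides(values, probs, c, r):
    """Return (P[|X| >= c], E|X|^r / c^r) for a finite discrete X."""
    tail = sum(p for x, p in zip(values, probs) if abs(x) >= c)
    moment = sum(p * abs(x) ** r for x, p in zip(values, probs))
    return tail, moment / F(c) ** r

# Hypothetical distribution: values of X and their probabilities (sum to 1).
values = [-2, -1, 0, 1, 3]
probs = [F(1, 10), F(2, 10), F(4, 10), F(2, 10), F(1, 10)]

results = {r: markov_sides(values, probs, c=2, r=r) for r in (1, 2, 3)}
for r, (tail, bound) in results.items():
    print(r, tail, bound, tail <= bound)
```

In each case the tail probability on the left does not exceed the moment bound on the right, as the inequality requires; which r gives the tightest bound depends on the distribution.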
3.2 Stochastic Convergence, Univariate Case
Let $X_1, X_2, \ldots, X_n, \ldots$ be a sequence of random variables. A
particular member, $X_n$, of this sequence has distribution function
$F_n(x)$ which depends upon integer n and moments (if existing) which also
depend upon n. The various modes of convergence of sequences of random
variables seek an answer to the question: What happens to the random
variable $X_n$, its distribution function $F_n(x)$ and its moments when n gets
large?
Definition 1. A sequence $\{X_n\}$ converges in distribution to the random
variable X, if the distribution function $F_n(x)$ of $X_n$ converges
to the distribution function $F(x)$ of X at all points of continuity of
$F(x)$. We write $X_n \xrightarrow{\text{dist.}} X$.
Definition 2. A sequence $\{X_n\}$ of random variables converges in
probability to a constant c if for every $\epsilon > 0$
$$\lim_{n\to\infty} P[|X_n - c| > \epsilon] = 0.$$
We write $X_n \xrightarrow{P} c$, or $\operatorname{Plim}_{n\to\infty} X_n = c$.
Definition 3. A sequence of random variables $\{X_n\}$ is said to
converge in probability to a random variable X if for every $\epsilon > 0$
$$\lim_{n\to\infty} P[|X_n - X| > \epsilon] = 0.$$
We write $X_n \xrightarrow{P} X$.
Convergence in probability implies convergence in distribution.
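A small worked example may help fix Definition 2. The sequence used here is hypothetical and does not appear in the thesis: $X_n$ takes the value n with probability 1/n and the value 0 otherwise. The sketch computes the tail probability and the first two moments exactly. The tail probability shrinks to zero, so $X_n$ converges in probability to 0, yet the mean stays at 1 and the second moment grows, which shows that convergence in probability by itself does not make the moments converge.

```python
from fractions import Fraction as F

def tail_probability(n, eps):
    """P[|X_n - 0| > eps] when X_n = n with probability 1/n, else 0."""
    return F(1, n) if n > eps else F(0)

def mean(n):
    """E(X_n) = n * (1/n)."""
    return F(1, n) * n

def second_moment(n):
    """E(X_n^2) = n^2 * (1/n)."""
    return F(1, n) * n * n

for n in (10, 100, 1000):
    print(n, tail_probability(n, F(1, 2)), mean(n), second_moment(n))
```

This is why results on the convergence of means and variances need a stronger hypothesis, such as convergence in mean square, than convergence in probability alone.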
Theorem (3.1). Let $\{X_n\}$ be a sequence of random variables and X
a random variable such that $EX^2 < \infty$, $EX_n^2 < \infty$ for all n, and
$$\lim_{n\to\infty} E(X_n - X)^2 = 0.$$
Then (i) $X_n \xrightarrow{P} X$;
(ii) $\lim_{n\to\infty} EX_n = EX$;
(iii) $\lim_{n\to\infty} E(X_n - EX_n)^2 = E(X - EX)^2$.
Proof: (i) See Wilks [18], p. 100.
(ii)
$$|EX_n - EX| \leq E|X_n - X| \leq \sqrt{E(X_n - X)^2}, \quad \text{by (3.7) and (3.3)}.$$
Therefore,
$$0 \leq \lim_{n\to\infty} |EX_n - EX| \leq \lim_{n\to\infty} \sqrt{E(X_n - X)^2} = 0,$$
so that $\lim_{n\to\infty} EX_n = EX$, which clearly exists because $E(X^2) < \infty$.
(iii) Let $EX_n = a_n$, $EX = a$ so that by (ii) $\lim_{n\to\infty} a_n = a$.
Now
$$E(X_n - EX_n)^2 = E(X_n - X + X - a + a - a_n)^2$$
$$= E(X_n - X)^2 + E(X - a)^2 + (a - a_n)^2 + 2E(X_n - X)(X - a) + 2(a - a_n)E(X_n - X) + 2(a - a_n)E(X - a)$$
$$= E(X_n - X)^2 + E(X - a)^2 + (a - a_n)^2 + 2E[(X_n - X)(X - a)] + 2(a - a_n)E(X_n - X).$$
Since $\lim_{n\to\infty} E(X_n - X)^2 = 0$, $E(X_n - X)$ is bounded for all n.
Moreover,
$$\lim_{n\to\infty} |E(X_n - X)(X - a)| \leq \lim_{n\to\infty} \sqrt{E(X_n - X)^2 \, E(X - a)^2} = 0.$$
Therefore, $\lim_{n\to\infty} E(X_n - EX_n)^2 = E(X - a)^2 = E(X - EX)^2$.
Corollary (3.1.1). Let X be a degenerate random variable, $X = c$
with probability 1. Then $\lim_{n\to\infty} E(X_n - c)^2 = 0$ implies that
$$\lim_{n\to\infty} EX_n = c = \operatorname{Plim}_{n\to\infty} X_n.$$
Corollary (3.1.2). Let $\{X_n\}$ and $\{Y_n\}$ be two sequences of random
variables such that $\lim_{n\to\infty} EY_n^2 < \infty$, and $\lim_{n\to\infty} EY_n$ exists. If
$\lim_{n\to\infty} E(X_n - Y_n)^2 = 0$, then
(i) $\lim_{n\to\infty} E(X_n) = \lim_{n\to\infty} EY_n$,
(ii) $\lim_{n\to\infty} E(X_n - EX_n)^2 = \lim_{n\to\infty} E(Y_n - EY_n)^2$.
Proof: Since $\lim_{n\to\infty} E(Y_n^2) < \infty$, it follows that $EY_n^2$ and $E(Y_n - EY_n)^2$
both exist and are bounded for all n.
Now
$$|E(X_n - Y_n)| \leq E|X_n - Y_n| \leq \sqrt{E(X_n - Y_n)^2},$$
which tends to zero as $n \to \infty$, so that $\lim_{n\to\infty} E(X_n - Y_n) = 0$. Therefore,
$$\lim_{n\to\infty} EX_n = \lim_{n\to\infty} [E(X_n - Y_n + Y_n)] = \lim_{n\to\infty} E(X_n - Y_n) + \lim_{n\to\infty} EY_n = \lim_{n\to\infty} EY_n.$$
This proves (i) above. Further,
$$E(X_n - EX_n)^2 = E(X_n - Y_n + Y_n - EY_n + EY_n - EX_n)^2$$
$$= E(X_n - Y_n)^2 + E(Y_n - EY_n)^2 + (EY_n - EX_n)^2 + 2E[(X_n - Y_n)(Y_n - EY_n)] + 2(EY_n - EX_n)E(Y_n - EY_n) + 2(EY_n - EX_n)E(X_n - Y_n)$$
$$= E(X_n - Y_n)^2 + E(Y_n - EY_n)^2 - (EX_n - EY_n)^2 + 2E[(X_n - Y_n)(Y_n - EY_n)].$$
On taking limit and using the given condition, we obtain
$$\lim_{n\to\infty} E(X_n - EX_n)^2 = \lim_{n\to\infty} E(Y_n - EY_n)^2 + 2 \lim_{n\to\infty} E(X_n - Y_n)(Y_n - EY_n).$$
But
$$\lim_{n\to\infty} |E[(X_n - Y_n)(Y_n - EY_n)]| \leq \lim_{n\to\infty} \sqrt{E(X_n - Y_n)^2 \, E(Y_n - EY_n)^2} = 0.$$
Therefore,
$$\lim_{n\to\infty} E(X_n - EX_n)^2 = \lim_{n\to\infty} E(Y_n - EY_n)^2.$$
The corollary is proved.
Theorem (3.2). Let S be-a statistic and
B
a finite constant suchn
that for a given integer ~ > 3
lim E
Is -
13I
~ = 0.n n-+oo
Let X be normal (0, 0 2 ) where 0 2 < 00 and lim 0 2
=
0 2 •n n n n-+oo n
dist.
Then S X ~ normal (d, 132 0 2 ), and the mean and variance of n n
a X converge respectively to the mean and variance of the limiting
n n
distribution.
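Proof: Using Hölder's inequality, we have
$$E(\tilde\beta_n - \beta)^2 \le [E|\tilde\beta_n - \beta|^{\ell}]^{2/\ell},$$
which tends to zero as $n \to \infty$. Hence, by theorem (3.1), $\tilde\beta_n \xrightarrow{P} \beta$. Let X be normal $(0, \sigma^2)$. Then $X_n \xrightarrow{\text{dist.}} X$, and, by Slutsky's theorem (Cramér [7], p. 254), $\tilde\beta_n X_n \xrightarrow{\text{dist.}} \beta X$, which is normal $(0, \beta^2\sigma^2)$. Further,
$$E(\tilde\beta_n X_n) = E(\tilde\beta_n - \beta)X_n + \beta E X_n = E(\tilde\beta_n - \beta)X_n ,$$
and
$$|E(\tilde\beta_n - \beta)X_n| \le \sqrt{E(\tilde\beta_n - \beta)^2\, E X_n^2} = \sigma_n \sqrt{E(\tilde\beta_n - \beta)^2},$$
which tends to zero as n goes to infinity.
Therefore, $\lim_{n\to\infty} E(\tilde\beta_n - \beta)X_n = 0$, which implies that $\lim_{n\to\infty} E \tilde\beta_n X_n = 0$.
Moreover, using Hölder's inequality, we have
$$E X_n^2(\tilde\beta_n - \beta)^2 \le [E X_n^6]^{1/3}\,[E|\tilde\beta_n - \beta|^3]^{2/3} = (15\sigma_n^6)^{1/3}\,[E|\tilde\beta_n - \beta|^3]^{2/3},$$
which tends to zero as $n \to \infty$, and
$$|E X_n^2(\tilde\beta_n - \beta)| \le [E X_n^4]^{1/2}\,[E(\tilde\beta_n - \beta)^2]^{1/2} = (3\sigma_n^4)^{1/2}\,[E(\tilde\beta_n - \beta)^2]^{1/2},$$
which also tends to zero as $n \to \infty$.
Therefore,
$$\lim_{n\to\infty} E X_n^2\tilde\beta_n^2 = \lim_{n\to\infty} E X_n^2(\tilde\beta_n - \beta)^2 + \beta^2 \lim_{n\to\infty} E X_n^2 + 2\beta \lim_{n\to\infty} E X_n^2(\tilde\beta_n - \beta) = \beta^2\sigma^2 .$$
The theorem is proved.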
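The moment convergence asserted in theorem (3.2) can be illustrated by simulation. The following is only a sketch under assumptions that are not in the text: a toy statistic $\tilde\beta_n = \beta + e/\sqrt{n}$ with $e$ standard normal and independent of $X_n$, a constant $\sigma_n = \sigma$, and arbitrary values for $\beta$, $\sigma$, the seed and the number of replications.

```python
import numpy as np

rng = np.random.default_rng(1)   # arbitrary seed (assumption)
beta, sigma = 2.0, 1.5           # hypothetical constant and limiting std. dev.
reps = 200_000                   # Monte Carlo replications

def product_moments(n):
    """Sample mean and variance of beta_tilde_n * X_n for the toy statistic
    beta_tilde_n = beta + e / sqrt(n), e ~ N(0, 1) independent of X_n."""
    beta_tilde = beta + rng.normal(size=reps) / np.sqrt(n)
    x = rng.normal(scale=sigma, size=reps)   # sigma_n = sigma for every n here
    prod = beta_tilde * x
    return float(prod.mean()), float(prod.var())

mean_small, var_small = product_moments(n=4)
mean_large, var_large = product_moments(n=10_000)
limit_var = beta ** 2 * sigma ** 2           # variance of the limiting normal law
print(mean_small, var_small, mean_large, var_large, limit_var)
```

For this toy statistic $E|\tilde\beta_n - \beta|^{\ell}$ is proportional to $n^{-\ell/2}$, so the hypothesis of the theorem holds for every $\ell$; the sample variance for large n should lie close to `limit_var`.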
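Corollary (3.2.1). Let $X_n$ and $\tilde\beta_n$ be specified as in theorem (3.2). Let P(x) be a polynomial of degree p ($3p \le \ell$). Then
$$\lim_{n\to\infty} E|P(\tilde\beta_n) - P(\beta)|^r = 0 \quad \text{for } 0 < r \le \ell/p .$$
Further, $Y_n = X_n P(\tilde\beta_n) \xrightarrow{\text{dist.}}$ normal $(0, \sigma^2 P^2(\beta))$ and the first two moments of $Y_n$ converge to the corresponding moments of the limiting distribution.
Proof: By Taylor's theorem,
$$P(\tilde\beta_n) = P(\beta) + \sum_{v=1}^{p} \frac{P^{(v)}(\beta)}{v!}(\tilde\beta_n - \beta)^v, \quad \text{where } P^{(v)}(\beta) = \left.\frac{d^v P(x)}{dx^v}\right|_{x=\beta}.$$
Therefore,
$$|P(\tilde\beta_n) - P(\beta)|^r = \left|\sum_{v=1}^{p} \frac{P^{(v)}(\beta)}{v!}(\tilde\beta_n - \beta)^v\right|^r ,$$
and the given conditions ensure that
$$\lim_{n\to\infty} E|P(\tilde\beta_n) - P(\beta)|^r = 0 \quad \text{for } 0 < r \le \ell/p .$$
The rest of the corollary follows from theorem (3.2), because $P(\tilde\beta_n)$ plays the same role as $\tilde\beta_n$.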
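Corollary (3.2.2). Let $X_n$, $\tilde\beta_n$ and P(x) be specified as in corollary (3.2.1). Let R(x) be a rational function of the form $R(x) = \dfrac{P(x)}{1 + Q(x)}$, Q(x) being any polynomial such that Q(x) > 0 for all real x. Then
$$\lim_{n\to\infty} E|R(\tilde\beta_n) - R(\beta)|^r = 0 \quad \text{for } 0 < r \le \ell/p .$$
Further, $Z_n = X_n R(\tilde\beta_n) \xrightarrow{\text{dist.}}$ normal $[0, \sigma^2 R^2(\beta)]$, and the first two moments of $Z_n$ converge to the corresponding moments of the limiting distribution.
Proof: Evidently $\dfrac{Q'(x)}{[1 + Q(x)]^2}$ is bounded for all real x, so that there exists a positive constant $M < \infty$ such that
$$\frac{|Q'(x)|}{[1 + Q(x)]^2} \le M .$$
By Taylor's theorem,
$$\frac{1}{1 + Q(\tilde\beta_n)} = \frac{1}{1 + Q(\beta)} - \frac{Q'(\gamma_n)}{[1 + Q(\gamma_n)]^2}(\tilde\beta_n - \beta),$$
where $\gamma_n = \beta + \theta(\tilde\beta_n - \beta)$, $(0 \le \theta \le 1)$. Therefore,
$$\begin{aligned}
E|R(\tilde\beta_n) - R(\beta)|^r &= E\left|\frac{P(\tilde\beta_n) - P(\beta)}{1 + Q(\tilde\beta_n)} - P(\beta)\frac{Q'(\gamma_n)}{[1 + Q(\gamma_n)]^2}(\tilde\beta_n - \beta)\right|^r \\
&\le c_r\, E\left|\frac{P(\tilde\beta_n) - P(\beta)}{1 + Q(\tilde\beta_n)}\right|^r + c_r\, |P(\beta)|^r\, E\left|\frac{Q'(\gamma_n)}{[1 + Q(\gamma_n)]^2}(\tilde\beta_n - \beta)\right|^r
\end{aligned}$$
by the $c_r$-inequality, where $c_r = 1$ or $2^{r-1}$ according as $r \le 1$ or $r \ge 1$ (Loève [12], p. 155).
Using the results of corollary (3.2.1), we have
$$\lim_{n\to\infty} E|R(\tilde\beta_n) - R(\beta)|^r = 0 \quad \text{for } 0 < r \le \ell/p .$$
The remaining part of the corollary now follows immediately from the fact that $R(\tilde\beta_n)$ satisfies the conditions stated in theorem (3.2).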
3.3 Convergence of Sequences of Random Vectors
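Let $y = (y_1, y_2, \ldots, y_K)'$ be a K-dimensional random vector defined on the basic probability space $(\Omega, \mathcal{A}, P)$, the symbols having the usual meanings. We define
Definition 4.
$$E\,y = \int_\Omega y\,dP = \left(\int_\Omega y_1\,dP,\ \int_\Omega y_2\,dP,\ \ldots,\ \int_\Omega y_K\,dP\right)' .$$
The integral $\int_\Omega y\,dP$ is said to be finite (or existing), if each of the integrals on the right is finite.
In usual notation, we have
$$|y| \le \sum_{i=1}^{K} |y_i|, \qquad \left|\int_\Omega y\,dP\right| \le \int_\Omega |y|\,dP \le \sum_{i=1}^{K} \int_\Omega |y_i|\,dP . \tag{3.8}$$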
The following very useful result is probably not new, but a simple
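Theorem (3.3). Let $y = (y_1, y_2, \ldots, y_K)'$ be a K-dimensional random vector. Then for every $\epsilon > 0$,
$$P\left[\max_{1\le i\le K} |y_i| > \epsilon\right] \le P[\,|y| > \epsilon\,] \le \frac{E(y'y)}{\epsilon^2} .$$
Proof: Since $|y| \le \epsilon$ is a sphere in K-dimensional Euclidean space,
$$|y| \le \epsilon \implies \max_{1\le i\le K} |y_i| \le \epsilon .$$
Or
$$\max_{1\le i\le K} |y_i| > \epsilon \implies |y| > \epsilon .$$
Therefore,
$$P\left[\max_{1\le i\le K} |y_i| > \epsilon\right] \le P[\,|y| > \epsilon\,] \le \frac{E(|y|^2)}{\epsilon^2} = \frac{E\,y'y}{\epsilon^2} .$$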
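Theorem (3.4).
(i) Let $y_n$ be a sequence of K-dimensional random vectors and $y$ a K-dimensional random vector such that
$$E(y'y) < \infty, \quad \text{and} \quad \lim_{n\to\infty} E(y_n - y)'(y_n - y) = 0.$$
Then $y_n \xrightarrow{P} y$, $\lim_{n\to\infty} E(y_n) = E\,y$, and
$$\lim_{n\to\infty} E(y_n - E y_n)(y_n - E y_n)' = E(y - E y)(y - E y)' .$$
(ii) Let $c$ be a K×1 vector of constants such that
$$\lim_{n\to\infty} E(y_n - c)'(y_n - c) = 0.$$
Then $\lim_{n\to\infty} E\,y_n = \operatorname{Plim}_{n\to\infty} y_n = c$.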
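Proof: By Hölder's inequality, we have
$$|E\,y| \le E|y| \le \sqrt{E\,y'y},$$
which is finite by assumption. Hence, $E\,y$ exists. Moreover, it is evident that $E(y - E y)(y - E y)'$ exists and is finite.
Now let $y_{in}$ and $y_i$ denote the corresponding elements of $y_n$ and $y$ respectively. By the stated conditions, it follows that $(y_{in} - y_i)$ is always well-defined.
By theorem (3.3), we have
$$P\left[\max_{1\le i\le K} |y_{in} - y_i| > \epsilon\right] \le \frac{E(y_n - y)'(y_n - y)}{\epsilon^2}, \quad \text{for every } \epsilon > 0.$$
But
$$\lim_{n\to\infty} E(y_n - y)'(y_n - y) = \lim_{n\to\infty} \operatorname{tr} E(y_n - y)(y_n - y)' = 0,$$
since K is fixed.
Therefore, $\lim_{n\to\infty} P[\max_{1\le i\le K} |y_{in} - y_i| > \epsilon] = 0$, which implies that $y_n \xrightarrow{P} y$.
Further, $\lim_{n\to\infty} E(y_n) = \lim_{n\to\infty} E(y_n - y) + E(y)$. Since
$$\lim_{n\to\infty} |E(y_n - y)| \le \lim_{n\to\infty} E|y_n - y| \le \lim_{n\to\infty} \sqrt{E(y_n - y)'(y_n - y)} = 0,$$
it follows that $\lim_{n\to\infty} E(y_n - y) = 0$, and $\lim_{n\to\infty} E\,y_n = E\,y$.
Next, writing $y_n - E y = (y_n - y) + (y - E y)$ and using $\lim_{n\to\infty} E\,y_n = E\,y$, we have
$$\begin{aligned}
\lim_{n\to\infty} E(y_n - E y_n)(y_n - E y_n)' &= \lim_{n\to\infty} E(y_n - y)(y_n - y)' + \lim_{n\to\infty} E(y_n - y)(y - E y)' \\
&\quad + \lim_{n\to\infty} E(y - E y)(y_n - y)' + E(y - E y)(y - E y)' .
\end{aligned}$$
The first term on the right vanishes because $\lim_{n\to\infty} E(y_n - y)'(y_n - y) = 0$.
A typical element of $E(y - E y)(y_n - y)'$ is $E(y_i - E y_i)(y_{jn} - y_j)$.
Using Hölder's inequality, we obtain
$$\lim_{n\to\infty} |E(y_i - E y_i)(y_{jn} - y_j)| \le \lim_{n\to\infty} \sqrt{E(y_i - E y_i)^2\, E(y_{jn} - y_j)^2} = 0.$$
This implies that
$$\lim_{n\to\infty} E(y - E y)(y_n - y)' = 0 = \lim_{n\to\infty} E(y_n - y)(y - E y)' .$$
Hence, $\lim_{n\to\infty} E(y_n - E y_n)(y_n - E y_n)' = E(y - E y)(y - E y)'$.
Q.E.D.
Replacing $y$ by $c$, we see that $\lim_{n\to\infty} E(y_n - c)'(y_n - c) = 0$ implies $\operatorname{Plim}_{n\to\infty} y_n = \lim_{n\to\infty} E\,y_n = c$. This proves (ii).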
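Theorem (3.5). Let $y_n = (y_{1n}, y_{2n}, \ldots, y_{Kn})'$ and $z_n = (z_{1n}, z_{2n}, \ldots, z_{Kn})'$ be two sequences of K-dimensional random vectors such that $\lim_{n\to\infty} E\,y_n$ exists, $\lim_{n\to\infty} E(y_n'y_n) < \infty$, and
$$\lim_{n\to\infty} E(z_n - y_n)'(z_n - y_n) = 0.$$
Then (i) $\lim_{n\to\infty} E\,z_n = \lim_{n\to\infty} E\,y_n$,
(ii) $\lim_{n\to\infty} E(z_n - E z_n)(z_n - E z_n)' = \lim_{n\to\infty} E(y_n - E y_n)(y_n - E y_n)'$.
Proof: Since $\lim_{n\to\infty} E(z_n - y_n)'(z_n - y_n) = 0$,
$$\lim_{n\to\infty} |E(z_n - y_n)| \le \lim_{n\to\infty} \sqrt{\operatorname{tr} E(z_n - y_n)(z_n - y_n)'} = 0,$$
which implies that $\lim_{n\to\infty} E(z_n - y_n) = 0$. Therefore,
$$\lim_{n\to\infty} E(z_n) = \lim_{n\to\infty} E(z_n - y_n) + \lim_{n\to\infty} E\,y_n = \lim_{n\to\infty} E\,y_n ,$$
which is finite by assumption.
Further, $\lim_{n\to\infty} E(y_n'y_n) < \infty$ implies that $y_n$ has existing covariance matrix for all n and, moreover, $\lim_{n\to\infty} E(y_n - E y_n)(y_n - E y_n)'$ exists. Denote this matrix by $V = [v_{ij}]$. Then in the manner of corollary (3.1.2), it follows, from the given conditions, that
$$\lim_{n\to\infty} E(z_{in} - E z_{in})(z_{jn} - E z_{jn}) = \lim_{n\to\infty} E(y_{in} - E y_{in})(y_{jn} - E y_{jn}) = v_{ij} \quad (i, j = 1, 2, \ldots, K).$$
This implies that
$$\lim_{n\to\infty} E(z_n - E z_n)(z_n - E z_n)' = \lim_{n\to\infty} E(y_n - E y_n)(y_n - E y_n)' .$$
The theorem is proved.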
Theorem (3.6). Let $\tilde\Pi = [\tilde\Pi_{rs}]$ be a K×K matrix of statistics depending upon the integer n, and $\Pi = (\Pi_{rs})$ a K×K matrix with real nonstochastic elements also depending on n such that
$$\lim_{n\to\infty} \Pi = \bar\Pi \ \text{(positive definite)},$$
and
$$\lim_{n\to\infty} E(\tilde\Pi_{rs} - \Pi_{rs})^4 = 0, \quad (r, s = 1, 2, \ldots, K).$$
Further, let $y_n$ be K-variate normal $[0, V = V(n)]$, where
$$\lim_{n\to\infty} V = \bar V \ \text{(positive definite)}.$$
Then (i) $\tilde\Pi y_n$ converges in distribution to normal $(0, \bar\Pi \bar V \bar\Pi')$,
(ii) $\lim_{n\to\infty} E\,\tilde\Pi y_n = 0$, and
(iii) $\lim_{n\to\infty} E\,\tilde\Pi y_n y_n' \tilde\Pi' = \bar\Pi \bar V \bar\Pi'$.
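Proof: Let $\Pi_{rs}$, $\bar\Pi_{rs}$, $v_{rs}$ and $\bar v_{rs}$ denote the typical elements of $\Pi$, $\bar\Pi$, $V$ and $\bar V$ respectively. Using Hölder's inequality, we have, for $t < 4$,
$$E|\tilde\Pi_{rs} - \Pi_{rs}|^t \le [E(\tilde\Pi_{rs} - \Pi_{rs})^4]^{t/4} .$$
Therefore, $\lim_{n\to\infty} E|\tilde\Pi_{rs} - \Pi_{rs}|^t = 0$.
Further, the $c_r$-inequality (see Loève [12], p. 155) gives
$$E|\tilde\Pi_{rs} - \bar\Pi_{rs}|^{\ell} = E|(\tilde\Pi_{rs} - \Pi_{rs}) + (\Pi_{rs} - \bar\Pi_{rs})|^{\ell} \le 2^{\ell-1}\left[E|\tilde\Pi_{rs} - \Pi_{rs}|^{\ell} + |\Pi_{rs} - \bar\Pi_{rs}|^{\ell}\right].$$
Since $\lim_{n\to\infty} \Pi_{rs} = \bar\Pi_{rs}$, we have
$$\lim_{n\to\infty} E|\tilde\Pi_{rs} - \bar\Pi_{rs}|^{\ell} = 0, \quad \ell = 1, 2, 3, 4.$$
This implies that $\tilde\Pi \xrightarrow{P} \bar\Pi$ (elementwise).
Since $y_n \xrightarrow{\text{dist.}}$ normal $(0, \bar V)$, it follows by Slutsky's theorem that $\tilde\Pi y_n$ approaches in distribution to normal $(0, \bar\Pi \bar V \bar\Pi')$.
Now $E(\tilde\Pi y_n) = E(\tilde\Pi - \bar\Pi) y_n + \bar\Pi E\,y_n = E(\tilde\Pi - \bar\Pi) y_n$. A typical element of $(\tilde\Pi - \bar\Pi) y_n$ is
$$\delta_r = \sum_{i=1}^{K} (\tilde\Pi_{ri} - \bar\Pi_{ri})\, y_{in},$$
and
$$|E(\delta_r)| \le \sum_{i=1}^{K} E|(\tilde\Pi_{ri} - \bar\Pi_{ri})\, y_{in}| \le \sum_{i=1}^{K} \sqrt{v_{ii}\, E(\tilde\Pi_{ri} - \bar\Pi_{ri})^2} .$$
Hence,
$$\lim_{n\to\infty} |E(\delta_r)| \le \sum_{i=1}^{K} \lim_{n\to\infty} \sqrt{v_{ii}\, E(\tilde\Pi_{ri} - \bar\Pi_{ri})^2} = \sum_{i=1}^{K} \sqrt{\bar v_{ii} \lim_{n\to\infty} E(\tilde\Pi_{ri} - \bar\Pi_{ri})^2} = 0,$$
which implies that
$$\lim_{n\to\infty} E(\tilde\Pi y_n) = \lim_{n\to\infty} E(\tilde\Pi - \bar\Pi) y_n = 0.$$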
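Finally,
$$\begin{aligned}
E(\tilde\Pi y_n y_n' \tilde\Pi') &= E[(\tilde\Pi - \bar\Pi) y_n y_n' (\tilde\Pi - \bar\Pi)'] + \bar\Pi\, E[y_n y_n' (\tilde\Pi - \bar\Pi)'] \\
&\quad + E[(\tilde\Pi - \bar\Pi) y_n y_n']\, \bar\Pi' + \bar\Pi\, E(y_n y_n')\, \bar\Pi' .
\end{aligned}$$
Consider the matrix $(\tilde\Pi - \bar\Pi) y_n y_n' (\tilde\Pi - \bar\Pi)'$. A typical element of this matrix is given by
$$\varphi_{rs} = \left\{\sum_{i=1}^{K} (\tilde\Pi_{ri} - \bar\Pi_{ri})\, y_{in}\right\} \left\{\sum_{i=1}^{K} (\tilde\Pi_{si} - \bar\Pi_{si})\, y_{in}\right\}.$$
Using Hölder's inequality, we obtain
$$\begin{aligned}
\lim_{n\to\infty} |E\,\varphi_{rs}| &\le \sum_{i=1}^{K} \lim_{n\to\infty} \left[\{E(\tilde\Pi_{ri} - \bar\Pi_{ri})^4\}^{1/4} \{E(\tilde\Pi_{si} - \bar\Pi_{si})^4\}^{1/4} \{E\,y_{in}^4\}^{1/2}\right] \\
&\quad + \sum_{i=1}^{K} \sum_{\substack{j=1 \\ j \ne i}}^{K} \lim_{n\to\infty} \left[\{E(\tilde\Pi_{ri} - \bar\Pi_{ri})^4\}^{1/4} \{E(\tilde\Pi_{sj} - \bar\Pi_{sj})^4\}^{1/4} \{E\,y_{in}^2 y_{jn}^2\}^{1/2}\right] \\
&= 0,
\end{aligned}$$
because $E\,y_{in}^4 = 3 v_{ii}^2$ and $E\,y_{in}^2 y_{jn}^2 \le 3 v_{ii} v_{jj}$ remain bounded. This implies that
$$\lim_{n\to\infty} E[(\tilde\Pi - \bar\Pi) y_n y_n' (\tilde\Pi - \bar\Pi)'] = 0.$$
Similarly, $\lim_{n\to\infty} E[(\tilde\Pi - \bar\Pi) y_n y_n'] = 0 = \lim_{n\to\infty} E[y_n y_n' (\tilde\Pi - \bar\Pi)']$.
Moreover, $\lim_{n\to\infty} (\bar\Pi\, E\,y_n y_n'\, \bar\Pi') = \lim_{n\to\infty} (\bar\Pi V \bar\Pi') = \bar\Pi \bar V \bar\Pi'$.
Therefore, $\lim_{n\to\infty} E(\tilde\Pi y_n y_n' \tilde\Pi') = \bar\Pi \bar V \bar\Pi'$.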
Corollary (3.6.1). Let $\tilde\Pi$, $\bar\Pi$ and $y_n$ be specified as in theorem (3.6). Let $z_n$ be a sequence of K-dimensional random vectors such that
$$\lim_{n\to\infty} E(z_{in} - y_{in})^4 = 0, \quad (i = 1, 2, \ldots, K).$$
Then (i) $\tilde\Pi z_n \xrightarrow{\text{dist.}}$ normal $(0, \bar\Pi \bar V \bar\Pi')$,
(ii) $\lim_{n\to\infty} E(\tilde\Pi z_n) = 0$, and
(iii) $\lim_{n\to\infty} E(\tilde\Pi z_n z_n' \tilde\Pi') = \bar\Pi \bar V \bar\Pi'$.
Proof: Using Hölder's inequality, we have
$$\lim_{n\to\infty} E(z_{in} - y_{in})^2 \le \lim_{n\to\infty} [E(z_{in} - y_{in})^4]^{1/2} = 0, \quad (i = 1, 2, \ldots, K),$$
so that
$$\lim_{n\to\infty} E(z_n - y_n)'(z_n - y_n) = \sum_{i=1}^{K} \lim_{n\to\infty} E(z_{in} - y_{in})^2 = 0.$$
Therefore, by theorem (3.3),
$$\operatorname{Plim}_{n\to\infty} (z_n - y_n) = 0.$$
Let $y$ be normal $(0, \bar V)$. Then $y_n \xrightarrow{P} y$, and
$$\operatorname{Plim}_{n\to\infty} (\tilde\Pi z_n - \bar\Pi y) = \operatorname{Plim}_{n\to\infty} \tilde\Pi (z_n - y_n) + \operatorname{Plim}_{n\to\infty} (\tilde\Pi - \bar\Pi) y_n + \operatorname{Plim}_{n\to\infty} \bar\Pi (y_n - y) = 0.$$
Thus $\tilde\Pi z_n \xrightarrow{\text{dist.}} \bar\Pi y$, which is normal $(0, \bar\Pi \bar V \bar\Pi')$.
Since $y_{in}$ $(i = 1, 2, \ldots, K)$ have existing moments of all orders which converge to the corresponding moments of $y_i$ $(i = 1, 2, \ldots, K)$, the given condition ensures that $z_{in}$ has existing moments of the 4th order, and moreover,
$$\lim_{n\to\infty} E(z_{in}^{\ell}) = \lim_{n\to\infty} E(y_{in}^{\ell}) .$$
The remaining part of the corollary, therefore, follows from the theorem if we replace $y_{in}$ by $z_{in}$ and use these results.
Note: In many practical problems, we have to deal with random
variables whose distribution depends upon two or more integers. The
results established in sections (3.2) and (3.3) can be reworded to take
care of such a situation. Suppose that in theorem (3.2) we have the following hypothesis: $\tilde\beta$ is a statistic depending upon integers n and T and $\beta$ a finite constant such that $E|\tilde\beta - \beta|^{\ell}$ $(\ell \ge 3)$ goes to zero as n and T tend to infinity independently. If X is normal $[0, \sigma^2_{(n,T)}]$, such that $\lim_{n\to\infty} \lim_{T\to\infty} \sigma^2_{(n,T)} = \sigma^2$, it is straightforward to show that $\tilde\beta X$ converges in distribution to normal $(0, \beta^2\sigma^2)$, and that the mean and the variance of $\tilde\beta X$ converge to those of the limiting distribution.

4.0 ERROR COMPONENT MODEL: REDUCED-FORM ESTIMATION

4.1 Description of the System
This and the following three chapters are intended to provide
methods for the estimation of structural and reduced-form parameters in
a system of simultaneous linear equations with error structure specified
in (1.3). Following the common practice, we first consider reduced-form
estimation.
To begin with, let us restate in greater detail the assumptions
underlying the error component model. The structural equations are
$$y_{it}' A + x_{it}' B + \epsilon_{it}' = 0, \tag{4.1}$$
where $y_{it}' = (y_{1it}, y_{2it}, \ldots, y_{Mit})$, $x_{it}' = (x_{1it}, x_{2it}, \ldots, x_{Kit})$, A is an M×M nonsingular matrix of constants with diagonal elements equal to -1, B is a K×M matrix of constants, and $\epsilon_{it}' = (\epsilon_{1it}, \epsilon_{2it}, \ldots, \epsilon_{Mit})$ is a vector of unobservable errors with $\epsilon_{\mu it} = u_{\mu i} + v_{\mu t} + w_{\mu it}$. We make the following assumptions:
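(i) For each i, t, $u_i = (u_{1i}, u_{2i}, \ldots, u_{Mi})'$, $v_t = (v_{1t}, v_{2t}, \ldots, v_{Mt})'$ and $w_{it} = (w_{1it}, w_{2it}, \ldots, w_{Mit})'$ are mutually independent M-dimensional normally distributed random vectors with zero means and covariance matrices
$$\Omega_u = E(u_i u_i') = \begin{pmatrix} \sigma^u_{11} & \sigma^u_{12} & \cdots & \sigma^u_{1M} \\ \sigma^u_{21} & \sigma^u_{22} & \cdots & \sigma^u_{2M} \\ \vdots & \vdots & & \vdots \\ \sigma^u_{M1} & \sigma^u_{M2} & \cdots & \sigma^u_{MM} \end{pmatrix},$$
$$\Omega_v = E(v_t v_t') = \begin{pmatrix} \sigma^v_{11} & \sigma^v_{12} & \cdots & \sigma^v_{1M} \\ \sigma^v_{21} & \sigma^v_{22} & \cdots & \sigma^v_{2M} \\ \vdots & \vdots & & \vdots \\ \sigma^v_{M1} & \sigma^v_{M2} & \cdots & \sigma^v_{MM} \end{pmatrix}, \quad \text{and}$$
$$\Omega_w = E(w_{it} w_{it}') = \begin{pmatrix} \sigma^w_{11} & \sigma^w_{12} & \cdots & \sigma^w_{1M} \\ \sigma^w_{21} & \sigma^w_{22} & \cdots & \sigma^w_{2M} \\ \vdots & \vdots & & \vdots \\ \sigma^w_{M1} & \sigma^w_{M2} & \cdots & \sigma^w_{MM} \end{pmatrix},$$
the elements of these matrices being all finite.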
(ii) For each $\mu$ (= 1, 2, ..., M), $u_{\mu 1}, u_{\mu 2}, \ldots$ represent independent drawings from the marginal distribution of $u_\mu$ which is normal $(0, \sigma^u_{\mu\mu})$; $v_{\mu 1}, v_{\mu 2}, \ldots$ represent independent drawings from the marginal distribution of $v_\mu$ which is normal $(0, \sigma^v_{\mu\mu})$; and $w_{\mu 11}, w_{\mu 12}, \ldots$ represent independent drawings from the marginal distribution of $w_\mu$ which is normal $(0, \sigma^w_{\mu\mu})$.
(iii) The exogenous variables $x_{jit}$ (j = 1, 2, ..., K) are independent of the error terms $\epsilon_{\mu it}$ ($\mu$ = 1, 2, ..., M).
Suppose that we have a sample of n T observations for n
cross-sectional units and T time intervals. To avoid complications, we shall
assume that the nT×K matrix X is nonstochastic and subject to the conditions that
(a) $\lim_{n\to\infty,\, T\to\infty} \dfrac{X'X}{nT}$ exists and is positive definite, and further that
(b) $$\sum_{i=1}^{n} \sum_{t=1}^{T} (x_{jit} - \bar x_{ji\cdot} - \bar x_{j\cdot t} + \bar x_{j\cdot\cdot})^2 > 0 \quad (j = 1, 2, \ldots, K). \tag{4.2}$$
Nonstochastic X implies the assumption (iii) above, while (4.2) ensures that X does not include a column vector of ones.
In terms of nT observations, we can write the $\mu$th structural equation in the compact form
$$y_\mu = Y_\mu \alpha_\mu + X_\mu \beta_\mu + \epsilon_\mu , \tag{4.3}$$
where $y_\mu$ is an nT×1 vector of observations on the $\mu$th endogenous variable, $Y_\mu$ is an $nT \times (\ell_\mu - 1)$ matrix of observations on $(\ell_\mu - 1)$ endogenous variables (other than $y_\mu$) included in the $\mu$th equation, $X_\mu$ is an $nT \times K_\mu$ matrix of observations on $K_\mu$ exogenous variables included in the $\mu$th equation, $\epsilon_\mu$ is an nT×1 vector of errors, $\alpha_\mu$ is an $(\ell_\mu - 1) \times 1$ vector of unknown parameters, and $\beta_\mu$ is a $K_\mu \times 1$ vector of unknown parameters.
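If we write
$$A = \underbrace{\begin{pmatrix} \mathbf{1} & 0 & \cdots & 0 \\ 0 & \mathbf{1} & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \mathbf{1} \end{pmatrix}}_{nT \times n}, \quad B = \underbrace{\begin{pmatrix} I \\ I \\ \vdots \\ I \end{pmatrix}}_{nT \times T}, \quad u_\mu = \underbrace{\begin{pmatrix} u_{\mu 1} \\ u_{\mu 2} \\ \vdots \\ u_{\mu n} \end{pmatrix}}_{n \times 1}, \quad v_\mu = \underbrace{\begin{pmatrix} v_{\mu 1} \\ v_{\mu 2} \\ \vdots \\ v_{\mu T} \end{pmatrix}}_{T \times 1} \quad \text{and} \quad w_\mu = \underbrace{\begin{pmatrix} w_{\mu 11} \\ w_{\mu 12} \\ \vdots \\ w_{\mu nT} \end{pmatrix}}_{nT \times 1} \tag{4.4}$$
where $\mathbf{1}$ is a T×1 vector of ones and I is a T×T identity matrix, we have
$$\epsilon_\mu = A u_\mu + B v_\mu + w_\mu .$$
Using the assumptions listed above, we obtain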
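$$\begin{aligned}
\Sigma_{\mu\mu} = E(\epsilon_\mu \epsilon_\mu') &= \sigma^u_{\mu\mu} AA' + \sigma^v_{\mu\mu} BB' + \sigma^w_{\mu\mu} I_{nT \times nT} \\
&= \sigma^u_{\mu\mu} \begin{pmatrix} J & 0 & \cdots & 0 \\ 0 & J & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & J \end{pmatrix} + \sigma^v_{\mu\mu} \begin{pmatrix} I & I & \cdots & I \\ I & I & \cdots & I \\ \vdots & \vdots & & \vdots \\ I & I & \cdots & I \end{pmatrix} + \sigma^w_{\mu\mu} \begin{pmatrix} I & 0 & \cdots & 0 \\ 0 & I & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & I \end{pmatrix} \\
&= \begin{pmatrix} \sigma^u_{\mu\mu} J + (\sigma^v_{\mu\mu} + \sigma^w_{\mu\mu}) I & \sigma^v_{\mu\mu} I & \cdots & \sigma^v_{\mu\mu} I \\ \sigma^v_{\mu\mu} I & \sigma^u_{\mu\mu} J + (\sigma^v_{\mu\mu} + \sigma^w_{\mu\mu}) I & \cdots & \sigma^v_{\mu\mu} I \\ \vdots & \vdots & & \vdots \\ \sigma^v_{\mu\mu} I & \sigma^v_{\mu\mu} I & \cdots & \sigma^u_{\mu\mu} J + (\sigma^v_{\mu\mu} + \sigma^w_{\mu\mu}) I \end{pmatrix}
\end{aligned} \tag{4.5}$$
Here J is a T×T matrix with 1 everywhere. There are n rows and n columns of T×T block matrices.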
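The block structure in (4.5) can be reproduced numerically from the design matrices in (4.4). The sketch below assumes observations ordered by cross-sectional unit and then by time period within each unit; the function name and the numerical values of n, T and the three variance components are illustrative only.

```python
import numpy as np

def error_component_cov(n, T, s_u, s_v, s_w):
    """Covariance matrix (4.5) for one equation, with observations ordered
    by unit i and then by period t within each unit."""
    A = np.kron(np.eye(n), np.ones((T, 1)))   # nT x n: block diagonal of ones vectors
    B = np.kron(np.ones((n, 1)), np.eye(T))   # nT x T: n stacked identity matrices
    return s_u * A @ A.T + s_v * B @ B.T + s_w * np.eye(n * T)

n, T = 3, 2                                   # illustrative sizes
s_u, s_v, s_w = 0.5, 0.3, 1.0                 # illustrative variance components
Sigma = error_component_cov(n, T, s_u, s_v, s_w)

I, J = np.eye(T), np.ones((T, T))
diag_block = s_u * J + (s_v + s_w) * I        # block (i, i) of (4.5)
off_block = s_v * I                           # block (i, i') with i != i'
print(Sigma)
```

Written with Kronecker products, $AA' = I_n \otimes J$ and $BB' = J_n \otimes I$, where $J_n$ is the n×n matrix of ones; this is what produces J only on the diagonal blocks and $\sigma^v_{\mu\mu} I$ in every block.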
Evidently, $\Sigma_{\mu\mu'} = E(\epsilon_\mu \epsilon_{\mu'}')$, where $\mu \ne \mu'$, can be obtained from (4.5) by replacing one of the $\mu$'s by $\mu'$.
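By trial and error and generalization we obtain
$$\Sigma_{\mu\mu}^{-1} = \begin{pmatrix} a_1 I + a_2 J & a_3 I + a_4 J & \cdots & a_3 I + a_4 J \\ a_3 I + a_4 J & a_1 I + a_2 J & \cdots & a_3 I + a_4 J \\ \vdots & \vdots & & \vdots \\ a_3 I + a_4 J & a_3 I + a_4 J & \cdots & a_1 I + a_2 J \end{pmatrix} \tag{4.6}$$
where
$$\begin{aligned}
a_1 &= \frac{1}{\sigma^w_{\mu\mu}} - \frac{\sigma^v_{\mu\mu}}{\sigma^w_{\mu\mu}(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu})}, \\
a_2 &= -\frac{\sigma^u_{\mu\mu}}{\sigma^w_{\mu\mu}(\sigma^w_{\mu\mu} + T\sigma^u_{\mu\mu})} \left[1 - \frac{\sigma^v_{\mu\mu}(2\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu} + T\sigma^u_{\mu\mu})}{(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu})(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu} + T\sigma^u_{\mu\mu})}\right], \\
a_3 &= -\frac{\sigma^v_{\mu\mu}}{\sigma^w_{\mu\mu}(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu})}, \\
a_4 &= \frac{\sigma^u_{\mu\mu}\sigma^v_{\mu\mu}(2\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu} + T\sigma^u_{\mu\mu})}{\sigma^w_{\mu\mu}(\sigma^w_{\mu\mu} + T\sigma^u_{\mu\mu})(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu})(\sigma^w_{\mu\mu} + n\sigma^v_{\mu\mu} + T\sigma^u_{\mu\mu})} .
\end{aligned} \tag{4.7}$$
The reduced form of the system written in (4.1) is
$$y_{it}' = x_{it}' \Pi + \epsilon^{*\prime}_{it}, \quad \text{where } \epsilon^{*\prime}_{it} = -\epsilon_{it}' A^{-1} \text{ and } \Pi = -B A^{-1} . \tag{4.8}$$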
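Returning to the inverse in (4.6): a closed form of this kind is easy to mistranscribe, so a direct numerical check is worthwhile. The sketch below assumes the coefficients $a_1, \ldots, a_4$ implied by the spectral decomposition of (4.5), whose characteristic roots are $\sigma^w$, $\sigma^w + T\sigma^u$, $\sigma^w + n\sigma^v$ and $\sigma^w + T\sigma^u + n\sigma^v$ (equation subscripts dropped); the helper name and the numerical values are hypothetical.

```python
import numpy as np

def inverse_coefficients(n, T, s_u, s_v, s_w):
    """Coefficients a1..a4 of the block form (4.6), derived here from the
    four characteristic roots of the covariance matrix (4.5)."""
    a3 = -s_v / (s_w * (s_w + n * s_v))
    a1 = 1.0 / s_w + a3
    a4 = (s_u * s_v * (2 * s_w + n * s_v + T * s_u)
          / (s_w * (s_w + T * s_u) * (s_w + n * s_v)
             * (s_w + n * s_v + T * s_u)))
    a2 = a4 - s_u / (s_w * (s_w + T * s_u))
    return a1, a2, a3, a4

n, T = 4, 3                                   # illustrative sizes
s_u, s_v, s_w = 0.7, 0.4, 1.2                 # illustrative variance components
I, J = np.eye(T), np.ones((T, T))

# Covariance matrix (4.5) in Kronecker form.
Sigma = (s_u * np.kron(np.eye(n), J)
         + s_v * np.kron(np.ones((n, n)), I)
         + s_w * np.eye(n * T))

a1, a2, a3, a4 = inverse_coefficients(n, T, s_u, s_v, s_w)
off = a3 * I + a4 * J                          # off-diagonal block of (4.6)
extra = (a1 - a3) * I + (a2 - a4) * J          # diagonal block minus off-diagonal block
Sigma_inv = np.kron(np.ones((n, n)), off) + np.kron(np.eye(n), extra)

max_error = np.abs(Sigma @ Sigma_inv - np.eye(n * T)).max()
print(max_error)
```

The practical value of (4.6) is that the nT×nT inverse needs only four scalars, so generalized least squares can be computed without inverting a large matrix numerically.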
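Let $a^{\mu}$ ($\mu$ = 1, 2, ..., M) denote the column vectors of $A^{-1}$, with $a^{\mu r}$ the $r$th element of $a^{\mu}$. Then $\epsilon^{*\prime}_{it} = (\epsilon^*_{1it}, \epsilon^*_{2it}, \ldots, \epsilon^*_{Mit})$, with
$$\epsilon^*_{\mu it} = -\sum_{r=1}^{M} a^{\mu r} \epsilon_{rit} = -\sum_{r=1}^{M} a^{\mu r} u_{ri} - \sum_{r=1}^{M} a^{\mu r} v_{rt} - \sum_{r=1}^{M} a^{\mu r} w_{rit} = u^*_{\mu i} + v^*_{\mu t} + w^*_{\mu it}, \tag{4.9}$$
where $u^*_{\mu i} = -\sum_{r=1}^{M} a^{\mu r} u_{ri}$, etc.
Recalling the assumptions about $u_i$, $v_t$ and $w_{it}$, we see that
$$\begin{aligned}
E(u^*_{\mu i} u^*_{\mu' i'}) &= \begin{cases} \sum_{r=1}^{M} \sum_{s=1}^{M} a^{\mu r} a^{\mu' s} \sigma^u_{rs} = \sigma^{u*}_{\mu\mu'}, & \text{if } i = i', \\ 0, & \text{if } i \ne i', \end{cases} \quad \text{for all } \mu, \mu' = 1, 2, \ldots, M; \\
E(v^*_{\mu t} v^*_{\mu' t'}) &= \begin{cases} \sum_{r=1}^{M} \sum_{s=1}^{M} a^{\mu r} a^{\mu' s} \sigma^v_{rs} = \sigma^{v*}_{\mu\mu'}, & \text{if } t = t', \\ 0, & \text{if } t \ne t', \end{cases} \quad \text{for } \mu, \mu' = 1, 2, \ldots, M; \\
\text{and} \quad E(w^*_{\mu it} w^*_{\mu' i't'}) &= \begin{cases} \sum_{r=1}^{M} \sum_{s=1}^{M} a^{\mu r} a^{\mu' s} \sigma^w_{rs} = \sigma^{w*}_{\mu\mu'}, & \text{if } i = i' \text{ and } t = t', \\ 0, & \text{if } i \ne i' \text{ and/or } t \ne t'. \end{cases}
\end{aligned} \tag{4.10}$$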
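The double sums in (4.10) are the elements of a single matrix product. The sketch below assumes that $a^{\mu r}$ is the $r$th element of the $\mu$th column of $A^{-1}$, in which case the reduced-form covariance matrix of the unit component equals $(A^{-1})' \Omega_u A^{-1}$; the 2-equation matrices are hypothetical, and the same function applies to the time and remainder components.

```python
import numpy as np

# Hypothetical 2-equation system: A is nonsingular with -1 on its diagonal.
A = np.array([[-1.0, 0.4],
              [0.3, -1.0]])
Omega_u = np.array([[1.0, 0.2],
                    [0.2, 0.5]])          # illustrative covariance matrix of u_i

A_inv = np.linalg.inv(A)

def reduced_form_cov(Omega, A_inv):
    """Element-wise double sum of (4.10):
    sum_r sum_s a^{mu r} a^{mu' s} sigma_rs, where a^{mu r} is taken to be
    element r of column mu of A^{-1} (an assumption about the notation)."""
    M = Omega.shape[0]
    out = np.zeros((M, M))
    for mu in range(M):
        for nu in range(M):
            out[mu, nu] = sum(A_inv[r, mu] * A_inv[s, nu] * Omega[r, s]
                              for r in range(M) for s in range(M))
    return out

Omega_u_star = reduced_form_cov(Omega_u, A_inv)
matrix_form = A_inv.T @ Omega_u @ A_inv    # the same quantity in matrix notation
print(Omega_u_star)
```

Because $A^{-1}$ is nonsingular, the transformed matrix stays symmetric and positive definite whenever $\Omega_u$ is, which is why the starred components inherit the distributional properties of the structural ones.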
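Thus we can assert that
(i) $u^*_i$, $v^*_t$ and $w^*_{it}$ are mutually independent M-variate normal vectors with zero means and covariance matrices
$$\Omega_{u*} = \begin{pmatrix} \sigma^{u*}_{11} & \sigma^{u*}_{12} & \cdots & \sigma^{u*}_{1M} \\ \sigma^{u*}_{21} & \sigma^{u*}_{22} & \cdots & \sigma^{u*}_{2M} \\ \vdots & \vdots & & \vdots \\ \sigma^{u*}_{M1} & \sigma^{u*}_{M2} & \cdots & \sigma^{u*}_{MM} \end{pmatrix}, \quad \Omega_{v*} = \begin{pmatrix} \sigma^{v*}_{11} & \sigma^{v*}_{12} & \cdots & \sigma^{v*}_{1M} \\ \sigma^{v*}_{21} & \sigma^{v*}_{22} & \cdots & \sigma^{v*}_{2M} \\ \vdots & \vdots & & \vdots \\ \sigma^{v*}_{M1} & \sigma^{v*}_{M2} & \cdots & \sigma^{v*}_{MM} \end{pmatrix},$$
$$\Omega_{w*} = \begin{pmatrix} \sigma^{w*}_{11} & \sigma^{w*}_{12} & \cdots & \sigma^{w*}_{1M} \\ \sigma^{w*}_{21} & \sigma^{w*}_{22} & \cdots & \sigma^{w*}_{2M} \\ \vdots & \vdots & & \vdots \\ \sigma^{w*}_{M1} & \sigma^{w*}_{M2} & \cdots & \sigma^{w*}_{MM} \end{pmatrix} \quad \text{respectively;}$$
(ii) For each $\mu$, $u^*_{\mu 1}, u^*_{\mu 2}, \ldots$ are independent drawings from the marginal distribution of $u^*_\mu$ which is normal $(0, \sigma^{u*}_{\mu\mu})$; $v^*_{\mu 1}, v^*_{\mu 2}, \ldots$ are independent drawings from the marginal distribution of $v^*_\mu$ which is normal $(0, \sigma^{v*}_{\mu\mu})$; and $w^*_{\mu 11}, w^*_{\mu 12}, \ldots$ are independent drawings from the marginal distribution of $w^*_\mu$ which is normal $(0, \sigma^{w*}_{\mu\mu})$.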
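$$\Sigma^*_{\mu\mu} = \sigma^{u*}_{\mu\mu} AA' + \sigma^{v*}_{\mu\mu} BB' + \sigma^{w*}_{\mu\mu} I_{nT \times nT} \tag{4.12}$$
and
$$\Sigma^{*-1}_{\mu\mu} = \begin{pmatrix} a_1^* I + a_2^* J & a_3^* I + a_4^* J & \cdots & a_3^* I + a_4^* J \\ a_3^* I + a_4^* J & a_1^* I + a_2^* J & \cdots & a_3^* I + a_4^* J \\ \vdots & \vdots & & \vdots \\ a_3^* I + a_4^* J & a_3^* I + a_4^* J & \cdots & a_1^* I + a_2^* J \end{pmatrix} \tag{4.13}$$
where $a_1^*$, $a_2^*$, $a_3^*$, and $a_4^*$ have the same expressions as those of $a_1$, $a_2$, $a_3$, and $a_4$, with $\sigma^u_{\mu\mu}$, $\sigma^v_{\mu\mu}$, $\sigma^w_{\mu\mu}$ replaced by $\sigma^{u*}_{\mu\mu}$, $\sigma^{v*}_{\mu\mu}$, $\sigma^{w*}_{\mu\mu}$ respectively.
The formula for $\Sigma^*_{\mu\mu'}$ ($\mu \ne \mu'$) can be derived from (4.12) by changing one of the subscripts $\mu$ into $\mu'$.
The distinctive feature of the error component model is that the
covariance matrices of disturbance vectors of any one of the structural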
or reduced-form equations are not diagonal. Consequently, many of the
techniques which are ordinarily employed for structural and reduced-form