Estimation of Higher-Order Spatial Autoregressive Panel Data Error Component Models

(1)

Estimation of Higher-Order Spatial

Autoregressive Panel Data Error

Component Models

H

ARALD

B

ADINGER

P

ETER

E

GGER

CES

IFO

W

ORKING

P

APER

N

O

.

2556

C

ATEGORY

10:

E

MPIRICAL AND

T

HEORETICAL METHODS

F

EBRUARY

2009

An electronic version of the paper may be downloaded • from the SSRN website: www.SSRN.com • from the RePEc website: www.RePEc.org • from the CESifo website: Twww.CESifo-group.org/wpT

(2)

CESifo Working Paper No. 2556

Estimation of Higher-Order Spatial

Autoregressive Panel Data Error

Component Models

Abstract

This paper develops an estimator for higher-order spatial autoregressive panel data error component models with spatial autoregressive disturbances, SARAR(R,S). We derive the moment conditions and optimal weighting matrix without distributional assumptions for a generalized moments (GM) estimation procedure of the spatial autoregressive parameters of the disturbance process and define a generalized two-stages least squares estimator for the regression parameters of the model. We prove consistency of the proposed estimators, derive their joint asymptotic distribution, and provide Monte Carlo evidence on their small sample performance.

JEL Code: C13, C21, C23.

Keywords: higher-order spatial dependence, generalized moments estimation, two-stages least squares, asymptotic statistics.

Harald Badinger Department of Economics Vienna University of Economics and

Business Administration Althanstrasse 39 - 45 1090 Vienna Austria [email protected] Peter Egger

Ifo Institute for Economic Research and Center for Economic Studies at

the University of Munich Poschingerstrasse 5

81679 Munich Germany [email protected]

(3)

I. Introduction

This paper considers the estimation of panel data models with higher-order spatially autocorrelated error components and spatially autocorrelated dependent variables. Spatial interactions in data may originate from various sources such as strategic interaction between jurisdictions (to attract firms or other mobile agents) and firms (in their price, quantity, or quality setting) or general equilibrium effects which disseminate with spatial decay due to their transmission through trade flows, migration, or input-output relationships.1 Data sets used in empirical studies often share three features: first, they are available in the form of panel data, with a large cross-sectional and a small time series dimension; second, spatial interactions of various kinds co-exist – such as geography-related, trade-related, migration-related interactions – or the decay function of a single spatial interaction is unknown; third, it is unclear whether spatial interactions are local – and affect only immediate neighbors – or global – and affect second third and other neighbors with repercussions. The estimator proposed here addresses the mentioned three features in a unified framework. It allows for panel data with a fixed but arbitrary number of channels or decay segments of spatial interaction in both the error components and the dependent variable, referred to as SARAR(R,S).

In developing the estimator, we build on the SARAR(1,1) generalized moments (GM) framework in Kelejian and Prucha (1999) for a single cross-section and the SARAR(0,1) model in Kapoor, Kelejian, and Prucha (2007) for panel data error components models. Obvious advantages of the GM framework is that it does not rely on distributional assumptions and that it can be applied to large data-sets without imposing any restrictions on the matrices of spatial interdependence. We derive GM estimators for the spatial regressive parameters of the disturbance process based on alternative weighting schemes for the moments. We then define a feasible generalized two-stages least squares (FGTSLS) estimator for the model’s regression parameters. We determine asymptotic properties of the estimators for the case where the time dimension of the panel and the number of spatial interactions is fixed while the number of cross-sectional units approaches infinity. In particular, we prove that the proposed GM and FGTSLS procedures obtain consistent estimators of the model parameters and we derive their joint asymptotic distribution.

1

See Cliff and Ord (1973, 1981), Anselin (1988), and Cressie (1993) for classic references about spatial econometric models in general. Recent theoretical contributions of spatial panel data models include Baltagi, Song, and Koh (2003), Baltagi and Li (2004), Baltagi, Song, Jung, and Koh (2007), Kapoor, Kelejian, and Prucha (2007), Korniotis (2008), Baltagi, Egger, and Pfaffermayr (2008), and Lee and Yu (2008). Recent applications of spatial panel data models include Druska and Horrace (2004), Arbia, Basile, and Piras (2005), Egger, Pfaffermayr, and Winner (2005), Baltagi, Egger, and Pfaffermayr (2007), and Badinger and Egger (2008a).

(4)

The remainder of the paper is organized as follows. Section II introduces the basic model specification and some notation. Section III proposes GM estimators of the parameters of spatial dependence in the error components based on alternative weighting schemes of the moments. Section IV derives a two-stages least-squares routine to estimate the regression parameters of the model and derives the asymptotic variance-covariance matrix of all model parameters. The latter enables Wald tests on the structure and decay of spatial interactions in the SARAR(R,S) model. Section V presents the results of a Monte Carlo simulation exercise. Section VI summarizes our main findings and concludes. The appendix sketches the proofs of consistency and the asymptotic distribution of the model parameters, whereas the full details of the proofs are relegated to a technical appendix.

II. Basic Model Specification and Notation

The basic set-up of the error components model with spatially correlated error terms represents a generalization of Kapoor, Kelejian, and Prucha (2007), who consider a panel data error components model with nonstochastic explanatory variables and first-order spatial autoregressive disturbances, i.e., a SAR(1) model. The present paper delivers the following contributions. First, we allow for an R-th order spatial autoregressive process in the dependent variable and an S-th order spatial process in the disturbances cum error components, i.e., we consider a SARAR(R,S) panel data error components model. As we show below, this also covers the case of endogenous explanatory variables other than spatial lags of the dependent variable.2 Second, we prove consistency of proposed generalized moments (GM) estimators of the model parameters and derive their joint asymptotic distribution. In particular, we also relax the normality assumption used, for instance, in Kapoor, Kelejian, and Prucha (2007) to obtain a simplified version of the optimal weighting matrix for the moment conditions. Third, we provide some Monte Carlo evidence with a special emphasis on the spatial model parameter point estimates and the rejection probabilities of Wald tests of the SARAR(R,S) model against interesting alternatives such as the SARAR(1,1), SARAR(0,S), SARAR(R,0), and the non-spatial model.

The basic model comprises i=1,...,N cross-sectional units and t=1,...,T time periods. For time period t, the model reads

) ( ) ( ) ( ) ( 1 , , t t t t R _N r rN rN N N N N X β W y u y = +

∑

+ =

λ

, (1a) or 2

See Lee and Liu (2008) and Badinger and Egger (2008b) for cross-sectional spatial models with a SARAR(R,S) process.

(5)

) ( ) ( ) (t _N t _N _N t N Z δ u y = + , (1b)

where y_N(t) is an N×1 vector with cross-sectional observations of the dependent variable in year t, )X_N(t is an N×K matrix of observations on K non-stochastic explanatory variables, i.e., X_N(t)=[x₁_,_N(t),...,x_K_,_N(t)] with each N×1 vector x_k_,_N(t) denoting the observations on the k-th explanatory variable. The structure of spatial dependence in y_N(t) is determined by the time-invariant N×N matrices W_r,_N, r=1,...,R, whose elements w_ij,_r,_N are assumed to

be known and will often (but need not) be specified as a decreasing function of geographical distance between the cross-sectional units i and j. The expression y_r_,_N(t)=W_r_,_Ny_N(t) is referred to as the r-th spatial lag of y_N. The specification of a higher-order process allows the strength of spatial interdependence in the dependent variable (reflected in the spatial autoregressive parameters

λ

_r,_N, r=1,...,R) to vary across a fixed number of R subsets of relations between cross-sectional units.

In equation (1b), the N×(K+R) design matrix is given by Z_N(t)=[X_N(t),Y_N(t)], with )] ( ),..., ( [ ) (t 1,N t R,N t N y y

Y = , and δ_N =(β′_N,λ′_N)′, where the K×1 parameter vector of the exogenous variables is given by β_N =(β₁_,_N,...,β_K_,_N)′ and the R×1 vector of spatial autoregressive parameters of y_N is defined as λ_N =(

λ

₁_,_N,...,

λ

_R_,_N)′.

The N×1 vector of error terms uN(t)=[u1,N(t),...,uN,N(t)]′ is assumed to follow a spatial autoregressive process given by

) ( ) ( ) ( 1 , , t t t S _N m N N m N m N M u ε u =

∑

+ =

ρ

, (1c) ) ( ) (t _N _N t N μ v ε = + , (1d)

where

ρ

_m,_N and M_m,_N denote the time-invariant, unknown parameters and the known N×N matrix of spatial interdependence, respectively. The structure of spatial correlation in the disturbances is determined by the S different, time-invariant N×N matrices M_m,_N. As in equation (1a), the specification of a higher-order process allows the strength of spatial interdependence in the disturbances (reflected in the parameters

ρ

_m,_N, m=1,...,S) to vary across a fixed number of S subsets of relations between cross-sectional units. The expression

) ( ) ( _, ,N t mN N t m M u

u = is referred to as the m-th spatial lag of u_N. The S×1 vector of the spatial autoregressive parameters of u_N(t) is defined as ρN =(

ρ

1,N,...,

ρ

S,N)′.

(6)

Finally, the N×1 vector of error terms ε_N(t) consists of two components, μ_N and v_N(t). As indicated by the notation, μ_N is time-invariant while v_N(t) is not. The typical elements of

) (t

N

ε and v_N(t) are the scalars

ε

_it,_N and v_it_,_N, respectively, and the N×1 vector of unit-specific error components is given by μN =(μ1,N,...,μN,N)′.

Stacking observations for all time periods such that t is the slow index and i is the fast index with all vectors and matrices, the model reads

N N N N N N X β Y λ u y = + + , (2a) or N N N N Z δ u y = + , (2b)

with the NT×K regressor matrix X_N =[X′_N(1),...,X′_N(T)]′, and YN =(y1,N,...,yR,N), where ] ) ( ),..., 1 ( [ _, _, ,N = ′rN ′rN T ′ r y y

y is the NT×1 vector of observations on the r-th spatial lag of the

dependent variable y_r,_N. The NT×1 vector of disturbances u_N =[u′_N(1),...,u′_N(T)] for the spatial autoregressive process of order S is given by

N S m mN T mN N N I M u ε u =

∑

⊗ + =1 , , ( )

ρ

, (2c)

where I_T is an identity matrix of dimension T×T. The NT×1 vector ε_N =[ε′_N(1),...,ε′_N(T)]′ is specified as N N N T N e I μ v ε =( ⊗ ) + , (3a)

where e_T is a unit vector of dimension T×1 and I_N is an identity matrix of dimension N

N× . In light of (2c), the error term can also be written as

∑

= = − ⊗ = ⊗ − = S m N N m N m N T S m N N m T N m N N 1 , , 1 , , (I M )u I (I M )u u ε

ρ

. (3b) It follows that

∑

= − − ⊗ = S m N N m N m N T N 1 1 , , ) ] ( [I I M ε u

ρ

, (4a)

(7)

and N R r N r N r N T N N R r N r N r N T N I I W X t β I I W u y [ ( ) ] ( ) [ ( ) 1] 1 , , 1 1 , , − = − =

∑

+ ⊗ − − ⊗ =

λ

, (4b)

The following assumptions are maintained throughout this paper.

Assumption 1.

Let T be a fixed positive integer. (a) For all 1≤t≤T and 1≤i≤N,N ≥1, the error components v_it_,_N are identically and (mutually) independently distributed with E(v_it_,_N)=0,

2 2

, )

(v_it_N _v

E =σ , where 0<

σ

_v2 <b_v <∞, and Ev_it_,N4+η <∞ for some

η

>0. (b) For all 1

,

1≤i≤N N ≥ , the unit-specific error components

μ

_i,_N are identically and (mutually) independently distributed with E(

μ

_i_,_N)=0, E(μ_i2_,_N)=σ_μ2, where 0<σ_μ2 <b_μ <∞, and

∞ < +η

μ

4 ,N i

E for some

η

>0. (c) The processes {v_it_,N} and {

μ

_i_,N} are independent of each other. Assumption 1 as maintained here is slightly stronger than that in Kapoor, Kelejian, and Prucha (2007), since it requires not only the fourth but also the (4+

η

)-th moments of the error components to be finite for some

η

>0. This is required for the central limit theorem of Kelejian and Prucha (2008) to apply, which will be used to derive the asymptotic distribution of the parameter estimates in Section III.

Assumption 1 implies that

2 2 ,

, )

( _it_N _js_N _v

E

ε

=

σ

_μ +

σ

for i= jand t =s, (5a)

2 , , ) (

ε

_it_N

ε

_js_N =

σ

_μ E for i= j and t≠s, (5b) 0 ) ( _it_,_N _js_,_N = E

ε

, otherwise. (5c)

As a consequence, the variance-covariance matrix of the stacked error term ε_N reads

NT v N T N N N E ε ε J I I Ω_ε 2 2 , = ( ′ )=

σ

μ( ⊗ )+

σ

, (6a)

where J_T =e_Te′_T is a T×T matrix with unitary elements and I_NT is an identity matrix of dimension NT × NT. Equation (6a) can also be written as

N N v N 1, 2 1 , 0 2 , Q Q Ω_ε =

σ

+

σ

, (6b)

(8)

where

σ

12 =

σ

v2 +T

σ

μ2. The two matrices Q0,N and Q1,N, which are central to the estimation

of error component models and the moment conditions of the GM estimator, are defined as

N T T N T I J I Q₀_, =( − )⊗ , (7) N T N _T I J Q₁_, = ⊗ . (8)

Notice that Q₀_,_N and Q₁_,_N are both of order NT × NT, symmetric, idempotent, orthogonal to each other, and sum up to I_NT.3

Assumption 2.

(a) All diagonal elements of the matrices W_r,_N, r=1,...,R, and M_s,_N, s=1,...,S, are zero. (b) The admissible parameter space is restricted as follows:

) , ( , r r N N N r a a λ λ

λ

∈ − , with <aλ_Nr a_Nλr ≤aλ,r ≤aλ <∞ , 0 , r = 1, …, R, and

∑

< <∞ = λ

λ

A R r 1 r N , .

The first part of Assumption (2b) requires the parameters

λ

_r,_N, r =1,...,R to be finite. We take aλ such that max( , )

,..., 1 r R r a aλ λ =

= holds; the expression aλ will be used to denote an R×1 vector with elements aλ. In the second part of Assumption (2b), the scalar A_λ generally depends on the properties of the weights matrices W_r,_N. For example, with row-normalized

matrices W_r,_N, r=1,...,R, assuming that A_λ =1 ensures that ( )

1 , ,

∑

= − R r N r N r N W I

λ

is invertible,

as required in Assumption (2c). If the matrices W_r,_N are not row-normalized, Assumption

(2c) is implied by 1 , ,..., 1 max − = ⎟⎠ ⎞ ⎜ ⎝ ⎛ = _r_N R r

A_λ W for some matrix norm ⋅ (see Horn and Johnson,

1985, p. 301). Analogous assumptions are made for the parameters of the spatial autoregressive error process:

) , ( , s s N N N s a a ρ ρ

ρ

∈ − , with <a_Nρs a_Nρs ≤aρ,s ≤aρ <∞ , 0 , s = 1, …, S, and

∑

< <∞ = ρ

ρ

A S m N m 1 , .

We take aρ such that max( , )

,..., 1 s S s a aρ ρ =

= holds; the expression aρ will be used to denote an

1 ×

S vector with elements aρ. As above, with row-normalized matrices M_s,_N, s=1,...,S, the

3

Observe that pre-multiplying an NT × 1 vector with Q0,_N transforms its elements into

deviations from cross-section specific sample means taken over time, and that pre-multiplying a vector by Q1,_N transforms its elements into cross-section specific sample means. See

(9)

second part of this assumption ensures invertibility of ( ) 1 , ,

∑

= − S m mN mN N M I

ρ

if A_ρ =1. If the

matrices M_s,_N are not row-normalized, Assumption (2c) is implied by

1 , ,..., 1 max − = ⎟⎠ ⎞ ⎜ ⎝ ⎛ = _s_N S s A_ρ M

for some matrix norm ⋅ .

(c) The matrices ( ) 1 , ,

∑

= − R r N r N r N W I

λ

and ( ) 1 , ,

∑

= − S m N m N m N M

I

ρ

are nonsingular for

) , ( r r N N r a aλ λ

λ

∈ − and ( r, r) N N s a aρ ρ

ρ

∈ − , respectively. This ensures that u_N and y_N are

uniquely identified through equations (4a) and (4b).

Assumption 3.

The row and column sums of the matrices W_r,_N, r=1,...,R, M_s,_N, s=1,...,S,

1 1 , , ) ( − =

∑

− R r rN rN N W I

λ

, and 1 1 , , ) ( − =

∑

− S m mN mN N M

I

ρ

are bounded uniformly in absolute value.

(See Remark A.1 in Appendix A for a definition of row and column sum boundedness.)

In light of Assumptions 1-3 and Remark A.1 in the Appendix, it follows that E(u_N)=0 and the variance-covariance matrix of u_N is given by

∑

= − = − _⊗ ₋ _′ − ⊗ = ′ = S m mN mN N T S m mN mN N N T N N N E 1 1 , , 1 , 1 , , , (u u ) [I (I M ) ]Ω [I (I M ) ] Ω_u

ρ

_ε

ρ

, (9a) and

∑

= − = − ₋ _′ − + = ′ S m mN mN N S m mN mN N v N N t t E 1 1 , , 1 1 , , 2 2 ) ( ) )( ( )] ( ) ( [u u

σ

_μ

σ

I

ρ

M I

ρ

M . (9b)

Note that all variables and parameters except for the variances of the error components are allowed to depend on sample size N. Such a specification is consistent, for example, with models where the weights matrix is row-normalized and the number of neighbours of a given cross-sectional unit depends on sample size (see Kapoor, Kelejian, and Prucha, 2007, p. 102) or where the strength of interdependence (in terms of the spatial autoregressive parameters) changes with the number of neighbours. Note that X_N is allowed to depend on sample size and may thus also contain spatial lags of exogenous variables. As a result, the model specification in equations (1a)-(1c) is fairly general, allowing for higher-order spatial dependence in the dependent variable, the explanatory variables, and the disturbances.

(10)

III. GM Estimation of a SAR(S) Model

In the following, we consider GM estimators for the spatial autoregressive parameters of the disturbance process in equation (1c) and derive the asymptotic joint distribution of all model parameters.

1. Moment Conditions

With an S-th order process (SAR(S), with S >1), the GM estimators of the parameters

N S N , ,

1 ,...,

ρ

,

σ

_v2, and

σ

₁2 can be obtained by recognizing that – under Assumptions 1 and 2 – the moment conditions used by Kapoor, Kelejian, and Prucha (2007) hold for each matrix

N s,

M , s=1,...,S. In particular, define for each M_s,_N, s=1,...,S

] ) ( )[ ( ) ( 1 , , , , ,

∑

= ⊗ − ⊗ = ⊗ = S m N N m T N m N N s T N N s T N s I M ε I M u I M u ε

ρ

. (10)

A word on notation is in order here. In equation (10), subscript s has been introduced together with m to indicate that, with higher-order spatial processes, M_s,_N and M_m,_N meet in ε_s,_N. While we will use index s to refer to the matrix M_s,_N, by which ε_N is pre-multiplied in equation (10), index m is required for the summation over the terms

ρ

_m,_NM_m,_N.

The moment conditions are then given by

Ma 0, 0, ] 2 ) 1 ( 1 [ ] ) 1 ( 1 [ _N _N _N _N _N _N _v T N E T N E ′ =

σ

− = ′ − ε Q ε v Q v , (11) M1,s ( ) 1 ] ) ( ) 1 ( 1 [ ] ) 1 ( 1 [ _s_,_N ₀_,_N _s_,_N _N ₀_,_N _T _s_,_N _s_,_N ₀_,_N _N _v2 tr _s_,_N _s_,_N N T N E T N E ε Q ε v′Q I ⊗M′ M Q v = M′ M − = ′ −

σ

, M2,s ( ) ] 0 ) 1 ( 1 [ ] ) 1 ( 1 [ _, ₀_, ′ ₀_, ⊗ ′_, ₀_, = − = ′ − sN N N N N T sN N N T N E T N E ε Q ε v Q I M Q v , Mb 1, 1, ) 12 1 ( ] ) ( 1 [ ) 1 ( ′_N _N _N = ′_N _T′ _T ⊗ _N _N + ′_N _N _N =

σ

N E N E N E ε Q ε μ e e I μ v Q v , M3,s ( ) ] 1 [ ] ) ( 1 [ ) 1 ( _s_,_N ₁_,_N _s_,_N _N _T _T _s_,_N _s_,_N _N _N ₁_,_N _T _s_,_N _s_,_N ₁_,_N _N N E N E N E ε′ Q ε = μ′ e′e ⊗M′ M μ + v′Q I ⊗M′ M Q v ) ( 1 , , 2 1 tr sN sN N M′ M =

σ

, M4,s (1 ′s,N 1,N N)= [1 ′N( T′ T ⊗ ′s,N) N]+ [1 ′N 1,N( T ⊗ ′s,N) 1,N N]=0 N E N E N E ε Q ε μ e e M μ v Q I M Q v ,

where

σ

₁2 =

σ

_v2+T

σ

_μ2. The moment conditions associated with matrices M_s,_N, s=1,...,S, through (10), are indexed with subscripts 1 to 4. The remaining two moment conditions,

(11)

which do not depend on s, are denoted as Ma and Mb. For an S-th order process as given by equation (2c), we thus have (4S+2) moment conditions.4

Substituting equations (3b), (10), and (1c) into the 4S+2 moment conditions (11) yields a (4S+2) equation system in ( ,..., , , 12)

2 , ,

1

ρ

σ

ρ

_N _S _N _v , which can be written as

0

Γ

γ_N − _Nb_N = , (12)

where b_N is a [2S+S(S−1)/2+2]×1 vector, given by

) , , ,..., ,..., , ,..., , ,..., ( 12 2 , , 1 , , 1 , 2 , 1 2 , 2 , 1 , , 1 ′ =

ρ

_N

ρ

_S_N

ρ

_N

ρ

_S_N

ρ

_N

ρ

_N

ρ

_N

ρ

_S_N

ρ

_S₋ _N

ρ

_S_N

σ

_v

σ

N b ,

i.e., b_N contains S linear terms

ρ

_m,_N, m=1,...,S, S quadratic terms

ρ

m2,N, m=1,...,S,

2 / ) 1 (S−

S cross products

ρ

_m,_N

ρ

_l,_N, m=1,...,S −1 ,l=m+1,...,S, as well as 2 v

σ

and

σ

12. For

later reference, we define the (S+2)×1 vector of all parameters as

) , , ,..., ( ) , , ( 2 1 2 , , 1 2 1 2 ′= ′ ′ = _N

σ

_v

σ

ρ

_N

ρ

_S_N

σ

_v

σ

N ρ θ . N

γ is a (4S+2)×1 vector with elements [

γ

_i_,N], i=1,...,(4S+2), and Γ_N is a )

2 4

( S+ ×[2S+S(S−1)/2+2] matrix with elements [

γ

_i_,_j_,_N], i=1,...,(4S+2), ] 2 2 / ) 1 ( 2 [ ,..., 1 + − + = S S S

j . The elements

γ

_i,_N and

γ

_i_,_j_,_N will be defined below. The

row-index of the elements γ_N and Γ_N will be chosen such that the equation system (12) has the following order. The first four rows correspond to the moment restrictions M1,1 to M4,1 associated with matrix M1,_N through (10); rows five to eight correspond to M1,2 to M4,2

associated with matrix M₂_,_N, and so forth; rows (S−4) to 4 correspond to the MS 1,S to M4,S associated with the matrix M_S_,_N. Finally, rows (4S+1) and (4S+2) correspond to the moment conditions Ma and Mb, respectively, which do not depend on s.

4

Notice that further moment conditions are available through pre- and post-multiplying Q0,_N

and Q₁_,_N with ε_r,′_N and ε_s,_N, r≠s, respectively, in moment conditions M1 and M3. The associated efficiency gain will depend on the properties of the weights matrices. If the two involved weights matrices are orthogonal, the corresponding moment condition is trivially satisfied and does not add any information. For the sake of brevity, we use the moment conditions given in (11), and leave an assessment of the potential efficiency gains from exploiting further moment conditions for future research.

(12)

The sample analogue to equation system (12) is given by ) ( ~ ~ N N N N N Γ θ γ − b =

ϑ

, (13)

where the elements of ~ and γ_N Γ~_N are equal to those of γ_N and Γ_N with the expectations operator suppressed and the disturbances u_N replaced by (consistent) estimates u~ . _N

GM estimates of the parameters

ρ

1,_N,...,

ρ

_S,_N, 2 v

σ

and

σ

12are then obtained as the solution to

)] ( ~ ) ( [ )] ~ ~ ( ~ ) ~ ~ [( min arg 2 1 2 2 1, ,.., , , N N N N N N N N N N N N v S θ Θ θ Γ γ Θ Γ γ ϑ ϑ σ σ ρ ρ ρ ′ = − ′ − b b , (14)

i.e., the parameter estimates can be obtained from a (weighted) non-linear least squares regression of ~ on the columns of γ_N Γ~_N; )ϑ_N(θ_N can then be viewed as a vector of regression residuals. The optimal choice of the (4S+2)×(4S+2) weighting matrix Θ_N and its estimation will be discussed below.

In the following, we define the elements of γ_N and Γ_N, grouped by the corresponding moment conditions. Thereby, we use the following notation:

N N s T N s I M u u_, =( ⊗ _, ) , s=1,...,S, and (15a) N N m N s T N N m T N s T N sm I M I M u I M M u u , =( ⊗ , )( ⊗ , ) =( ⊗ , , ) , s=1,...,S, m=1,...,S. (15b) At this point, we need to introduce an index l=1,...,S for proper definition of the elements of

N

γ and Γ_N.

Moment condition M1,s delivers s=1,...,S rows of equation system (12), appearing in rows

1 ) 1 (

4 s− + with the following elements of γ_N and Γ_N:

) 1 ( 1 , 1 ) 1 ( 4 − = + − T N N s γ E(us′,NQ0,Nus,N), (16a) ) ( ) 1 ( 2 , , 0 , , , 1 ) 1 ( 4 s mN E sN N smN T N − u′ Q u = + − γ , m=1,...,S, ) ( ) 1 ( 1 , , 0 , , , 1 ) 1 ( 4s− + S+mN =−_N _T₋ E usm′ NQ NusmN γ , m=1,...,S, ) ( ) 1 ( 2 , , 0 , , 2 / ) 1 ( ) 1 ( , 1 ) 1 ( 4 s S m mm l mN E smN N slN T N − u′ Q u − = − + − − + + − γ , m=1,...,S−1, l=m+1,...,S,

(13)

) ( 1 , , , 1 2 / ) 1 ( 2 , 1 ) 1 ( 4s S S S N tr sN sN N M′ M = + − + + −

γ

, 0 , 2 2 / ) 1 ( 2 , 1 ) 1 ( 4s− + S+S S− + N =

γ

.

Moment condition M2,s consists of s=1,...,S rows of equation system (12), appearing in rows

2 ) 1 (

4 s− + with the following elements of γ_N and Γ_N:

) ( ) 1 ( 1 , 0 , , 2 ) 1 ( 4 s N E sN N N T N − u′ Q u = + − γ , (16b) ) ( ) 1 ( 1 , , 0 , , 0 , , , 2 ) 1 ( 4s mN E smN N N sN N mN T N − u′ Q u +u′ Q u = + − γ , m=1,...,S, ) ( ) 1 ( 1 , , 0 , , , 2 ) 1 ( 4s− + S+mN =−_N _T ₋ E usm′ NQ NumN γ , m=1,...,S, ) ( ) 1 ( 1 , , 0 , , , 0 , , 2 / ) 1 ( ) 1 ( , 2 ) 1 ( 4s S m mm l mN E slN N mN smN N lN T N − u′ Q u +u′ Q u − = − + − − + + − γ , m=1,...,S−1, S m l= +1,..., , 0 , 1 2 / ) 1 ( 2 , 2 ) 1 ( 4s− + S+S S− + N =

γ

, 0 , 2 2 / ) 1 ( 2 , 2 ) 1 ( 4s− + S+S S− + N =

γ

.

Moment condition M3,s corresponds to s=1,...,S rows of equation system (12), appearing in

rows 4(s−1)+3 with the following elements of γ_N and Γ_N:

N N s 1 , 3 ) 1 ( 4 − + =

γ

E(u_s′_,_NQ₁_,_Nu_s_,_N), (16c) ) ( 2 , , 1 , , , 3 ) 1 ( 4 s mN E sN N smN N u′ Q u = + −

γ

, m=1,...,S, ) ( 1 , , 1 , , , 3 ) 1 ( 4s S mN E smN N smN N u′ Q u − = + + −

γ

, m=1,...,S, ) ( 2 , , 1 , , 2 / ) 1 ( ) 1 ( , 3 ) 1 ( 4s− + S m+ −mm− +l−mN =−_N E usm′ NQ NuslN

γ

, m=1,...,S−1, l=m+1,...,S, 0 , 1 2 / ) 1 ( 2 , 3 ) 1 ( 4s− + S+S S− + N =

γ

, ) ( 1 , , , 2 2 / ) 1 ( 2 , 3 ) 1 ( 4 s S S S N tr sN sN N M′ M = + − + + −

γ

.

Moment condition M4,s represents s=1,...,S rows of equation system (12) appearing in rows

4 ) 1 (

(14)

) ( 1 , 1 , , 4 ) 1 ( 4s N E sN N N N u′ Q u = + −

γ

, (16d) ) ( 1 , , 1 , , 1 , , , 4 ) 1 ( 4s− + mN = _N E usm′ NQ NuN +us′NQ NumN

γ

, m=1,...,S, ) ( 1 , , 1 , , , 4 ) 1 ( 4s S mN E smN N mN N u′ Q u − = + + −

γ

, m=1,...,S, ) ( 1 , , 1 , , , 1 , , 2 / ) 1 ( ) 1 ( , 4 ) 1 ( 4 s S m mm l mN E slN N mN smN N lN N u′ Q u +u′ Q u − = − + − − + + −

γ

, m=1,...,S−1, l=m+1,...,S, 0 , 1 2 / ) 1 ( 2 , 4 ) 1 ( 4s− + S+S S− + N =

γ

, 0 , 2 2 / ) 1 ( 2 , 4 ) 1 ( 4s− + S+S S− + N =

γ

.

Moment condition Ma reflects one equation of the system in (12), appearing in row (4S+1) with the following elements of γ_N and Γ_N:

) ( ) 1 ( 1 , 0 , 1 4S N E N N N T N − u′ Q u = + γ , (16e) ) ( ) 1 ( 2 , 0 , , , 1 4S+ mN = _N _T ₋ E um′ NQ NuN γ , m=1,...,S, ) ( ) 1 ( 1 , , 0 , , , 1 4S S mN E mN N mN T N − u′ Q u − = + + γ , m=1,...,S, ) ( ) 1 ( 2 , , 0 , , 2 / ) 1 ( ) 1 ( , 1 4S S m mm l mN E mN N lN T N − u′ Q u − = − + − − + + γ , m=1,...,S−1, l=m+1,...,S, 1 , 1 2 / ) 1 ( 2 , 1 4S+ S+S S− + N =

γ

, 0 , 2 2 / ) 1 ( 2 , 1 4S+ S+S S− + N =

γ

.

Moment condition Mb is associated with one equation of the system in (12), appearing in row )

2 4

( S+ with the following elements of γ_N and Γ_N:

) ( 1 , 1 , 2 4S N E N N N N u′ Q u = +

γ

, (16f) ) ( 2 , 1 , , , 2 4S mN E mN N N N u′ Q u = +

γ

, m=1,...,S, ) ( 1 , , 1 , , , 2 4S+ S+mN =−_N E u′mNQ NumN

γ

, m=1,...,S, ) ( 2 , , 1 , , 2 / ) 1 ( ) 1 ( , 2 4S S m mm l mN E mN N lN N u′ Q u − = − + − − + +

γ

, m=1,...,S−1, l=m+1,...,S, 0 , 1 2 / ) 1 ( 2 , 2 4S+ S+S S− + N =

γ

, 1 , 2 2 / ) 1 ( 2 , 2 4S+ S+S S− + N =

γ

.

(15)

This completes the specification of the elements of the matrices γ_N and Γ_N. The similarity of the structure between the expressions resulting from the moment conditions Ma, M1,s, and M2,s on the one hand and Mb, M3,s, and M4,s on the other hand is apparent. First, they differ by the normalization factor and the corresponding matrix of quadratic forms, Q0,_N and Q1,N,

respectively. Second, the rows in (12) associated with Ma, M1,s and M2,s,s=1,...,S, do not

depend on σ₁2 whereas the rows associated with Mb, M3,s, and M4,s, s=1,...,S, do not depend

on

σ

_v2. This fact will be used to define an initial GM estimator, which is based on a subset of moment conditions (Ma, M1,s, and M2,s) only, in order to obtain an estimate of the optimal

weighting matrix Θ_N.

For future reference, we define the (2S+1)×1 vector γ0_N as the sub-vector containing rows s and (s+1), s=1,...,S and row (4S+1)of γ_N, corresponding to M1,s, M2,s, and Ma.

Moreover, we define the (2S+1)×[2S+S(S−1)/2+1] matrix Γ0_N as the sub-matrix containing rows s and (s+1), s=1,...,S, and row (4S+1) of Γ_N, corresponding to M1,s, M2,s, and Ma.

Analogously, we define the (2S+1)×1 vector γ1_N as the sub-vector containing rows 2s, )

1 2

( s+ , s=1,...,S, and row (4S+2) of γ_N, corresponding to M3,s, M4,s, and Mb. Finally, we

define the (2S+1)×[2S+S(S−1)/2+1] matrix Γ1_N as the sub-matrix containing rows 2s, )

1 2

( s+ , s=1,...,S, and (4S+2) of Γ_N, corresponding to M3,s, M4,s, and Mb.

2. Definition of GM Estimators

We next define three alternative GM estimators for the spatial autoregressive parameters of the disturbance process given by (1c) and the variances of the error components.5

2.1. Initial GM Estimation

The initial GM estimator is a special case of (14), using the identity matrix as weighting matrix Θ_N and a subset of moment conditions (Ma, M1,s and M2,s) only. It is thus based on the

the vector γ0_N and the matrix Γ0_N. Define θ0_N as the corresponding parameter vector that excludes σ₁2, i.e., θ0 =(ρ′_N,σ_v2)=(ρ₁_,_N,...,ρ_S_,_N,σ_v2), and accordingly

) , ,..., ,..., , ,..., , ,..., ( 2 , , 1 , , 1 , 2 , 1 2 , 2 , 1 , , 1 0 = ′ − N SN v S N S N N N N S N N S N N ρ ρ ρ ρ ρ ρ ρ ρ ρ ρ σ b .

The initial GM estimator is then obtained as the solution to

5

See Kapoor, Kelejian, and Prucha (2007) for analogous conditions under SARAR(0,1) estimation, assuming only nonstochastic regressors in equation (1a).

(16)

} ] , 0 [ , ), ( ) ( min{ arg ) , ,..., ( 2, 0 0 0 0 2 , , 1,N ρSN σvN = ϑN N ′ϑN N − ≤ ≤ σvN∈ bv ρ) ) ) _θ _θ _aρ _ρ _aρ , (17a) with ϑ_N0(θ0)=ϑ_N0(ρ,σ_v2)=(~γ0_N −Γ~0_Nb0).

Using these initial estimates of (

ρ

1,N,...,

ρ

S,N) and

2 v

σ

, the parameter σ12 can be estimated

from moment condition Mb:

∑

= = − ′ − = S m mN mN N N S m mN mN N N _N 1 , , , 1 1 , , 2 , 1 ) ~ ~ ( ) ~ ~ ( 1 u u Q u u

ρ

σ

) ) ) (17b) ... ~ ~ ... ~ ~ 2 , 1 1 , 2 4 , , 2 4 , 1 1 , 2 4 2 4S γ S N γ S S S N γS S N γ ₊ − ₊

ρ

) − − ₊

ρ

) − ₊ ₊

ρ

) = . ~ ... ~ ~ , , 1 2 / ) 1 ( 2 , 2 4 , 2 , 1 1 2 , 2 4 2 , 2 , 2 4S S SN γ S S N N γ S S S S S N SN γ ₊

ρ

) − ₊ ₊

ρ

)

ρ

) − − ₊ ₊ ₋

ρ

) ₋

ρ

) − 2.2. Weighted GM Estimation

While the initial GM estimator as defined in (17) is consistent (as will be shown below), it is inefficient. First, it ignores the information contained in moment conditions (Mb, M3,s and

M4,s).

6

Second, it is well known from the literature on generalized method of moments estimation, that it is optimal to use as weighting matrix the inverse of the (properly normalized) variance-covariance matrix of the moments, evaluated at true parameter values. Denote the optimal weighting matrix, which will be derived in Subsection 3.2, by Ψ−_N1. In general, the optimal weighting matrix is unknown and has to be estimated, e.g., using the results from the initial GM estimation. In Subsection 3.3 we derive a consistent estimator of

1 − N

Ψ , referred to as Ψ~−_N1. The optimally weighted GM estimator is based on all (4S+2) moment conditions and uses Θ~_N =Ψ~−_N1 as the weighting matrix for the moment conditions. It is defined as: } ] , 0 [ ], , 0 [ , ), ( ~ ) ( { min arg ) ~ , ~ , ~ ,..., ~ ( 12 2 2 1, 2 , , 1,N

ρ

SN

σ

vN

σ

N =

ϑ

N ′ N

ϑ

N − ≤ ≤

σ

v∈ bv

σ

∈ c

ρ

_θ _Θ _θ _aρ _ρ _aρ , with c≥b_v+Tb_μ, and

ϑ

_N(θ)=

ϑ

_N(ρ,

σ

_v2,

σ

₁2)= (~γ_N −~Γ_Nb). (18)

As already mentioned, the optimal weighting matrix is derived without distributional assumptions. As a consequence it involves third and fourth moments of the error components

6

This does not mean that any GM estimator using all moment conditions is necessarily superior. In fact, Kapoor, Kelejian, and Prucha (2007) show in a Monte Carlo study on their SAR(1) model that the initial GM estimator performs much better (in terms of bias and RMSE) than the unweighted GM estimator using all moment conditions. Their results suggest that a proper weighting of the moment conditions, in particular the weighting of moment conditions Ma, M1, and M2 relative to moment conditions Mb, M3, and M4 is of crucial importance.

(17)

N it

v _, and

μ

_i,_N. Kapoor, Kelejian, and Prucha (2008) use the assumption that ε_it_,_N is normally distributed to obtain a simplified weighting matrix as an approximation of the true optimal weighting matrix. For comparison, we also consider this simplified optimal weighting matrix, which is shown to be a special case of Ψ−_N1 in the Appendix and referred to as (Ψo_N)−1. The simplified weighted GM estimator is defined as the weighted GM estimator given in (18), using Θ~_N =(Ψ~o_N)−1.

3. Asymptotic Properties of the GM Estimator for θ_N

3.1 Consistency

In order to prove consistency, the following additional assumptions are required:

Assumption 4.

Assume that u~_N −u_N =D_NΔ_N, i.e., u~_i_,_N −u_i_,_N =d_i_.,_NΔ_N, for i=1,...,NT,7 where D_N is an P

NT× matrix, the 1×P vector d_i.,_N denotes the i-th row of D_N and Δ_N is a P×1 vector. Let d_ij,_N be the j-th element of di.,N. For some δ >0, we assume that ≤ <∞

+ d N ij t c d E , ( )2 δ ,

where c_d does not depend on N, and that N1/2 Δ_N =O_p(1).

Assumption 4 will be fulfilled in many settings, e.g., if model (1a) contains endogenous variables (such as spatial lags of y_N) and is estimated using two-stages least squares. In that case, Δ_N denotes the difference between the parameter estimates and the true parameter values and d_i.,_N is the (negative of the) i-th row of the design matrix Z_N (compare Lemma 1 in Subsection 2 of Section IV). Under certain conditions, Assumption 4 will also be satisfied if model (1a) involves a non-linear specification (see Kelejian and Prucha, 2008, p. 12). Finally, observe that Assumption 4 implies that

∑

= + − NT i N i NT 1 2 ., 1 ) ( d δ is O_p(1). Assumption 5.

(a) The smallest eigenvalues of Γ0_N′Γ0_N and 1 1 N NΓ

Γ ′ are bounded away from zero, i.e., 0 ) ( _* min ′ ≥

λ

>

λ

i N i N Γ

Γ for i = 1, 2. (b) Θ~_N −Θ_N =o_p(1), where Θ_N are (4S+2)×(4S+2) nonstochastic, symmetric, positive definite matrices. (c) The largest eigenvalues of Θ_N are bounded uniformly from above, and the smallest eigenvalues of Θ_N are bounded uniformly away from zero.

7

Note that we use single indexation i=1,...,NT to refer to the elements of the vectors that are stacked over time periods. (See the remark on notation in Appendix A.)

(18)

Assumption 5 implies that the smallest eigenvalues of Γ′_NΓ_N and Γ′_NΘ_NΓ_N are bounded uniformly away from zero, ensuring that the true parameter vector θ_N is identifiable unique. Moreover, by the equivalence of matrix norms, it follows from Assumption 5 that Θ_N and

1 − N

Θ are O(1).

Assumptions 1-5 ensure consistency of the GM estimators for ( , , 12)

2

σ

_v N N ρ

θ = ′ . We

summarize these results in the following theorems, which are proved in Appendix B.

Theorem 1a.Consistency of Initial GM Estimator ~0

N

θ

Suppose Assumptions 1-5 hold. Then, provided the optimization space contains the parameter space, the initial GM estimators ( 1, ,..., , 2, )

0 = ′ N v S N N ρ) ρ) σ) )

θ defined by (17a), and σ)12,N, defined

by (17b) are consistent for

ρ

1,_N,...,

ρ

_S,_N, 2 v

σ

, and σ12, i.e., 0 , s, p N s N −

ρ

→

ρ

) , s=1,...,S, 2 2 0 , p v N v −

σ

→

σ

) , and 2 0 1 2 , 1 p N −

σ

→

σ

) as N →∞.

Theorem 1b.Consistency of Weighted GM Estimator θ~_N

Suppose Assumptions 1-5 hold. Then, provided the optimization space contains the parameter space, the weighted GM estimators θ~_N(Θ~_N)=[ρ~₁_,_N(Θ~_N),...,ρ~_S_,_N(Θ~_N),

σ

~_v2_,_N(Θ~_N),

σ

~₁2_,_N(Θ~_N)]′ defined by (18) are consistent for

ρ

_1,_N,...,

ρ

_S_,_N,

σ

_v2, and

σ

₁2, i.e.,

0 ) ~ ( ~ , s, p N s N N −

ρ

→

ρ

Θ , s=1,...,S,

σ

~_v2_,_N(Θ~_N)−

σ

_v2→p 0 , and

σ

~₁2_,_N (Θ~_N)−

σ

₁2→p 0 as N →∞. This result holds for an arbitrary weighting matrix (that satisfies Assumption 5). Hence, it applies to both the optimally weighted GM estimator defined by (18) with Θ~_N =(Ψ~_N)−1and the simplified optimally weighted GM estimator θ~_No with Θ~_N =(Ψ~o_N)−1.

3.2 Asymptotic Distribution of GM Estimator for θ_N

In the following we consider the asymptotic distribution of the optimally weighted GM estimator θ~_N. To establish asymptotic normality of ~ (~ , ~ ,~12, )

2 ,N N v N N ρ

σ

θ = , we need some additional assumptions. Assumption 6.

Let D_N be defined as in Assumption 4, such that u~_N −u_N =D_NΔ_N. For any real N×N matrix A_N, whose row and column sums are bounded uniformly in absolute value, it holds that )N−1D′_NA_Nu_N −N−1E(D′_NA_Nu_N)=o_p(1 .

(19)

A sufficient condition for Assumption 6 is, e.g., that the columns of D_N are of the form

N N N Π ε

π + , where the elements of π_N are bounded uniformly in absolute value and the row and column sums of Π_N are bounded uniformly in absolute value (see Kelejian and Prucha, 2008, Lemma C.2). This will be the case in many applications, e.g., for the model in equation (1a), if D_N equals (the negative of) matrix Z_N (compare Lemma 1 in Section IV).

Assumption 7.

Let Δ_N be defined as in Assumption 4. Then, ) 1 ( ) ( ) (NT 1/2Δ_N = NT −1/2T′_Nξ_N +o_p , with TN =(Tv′,N,Tμ′,N)′, )ξN =(v′N,μ′N ′, i.e., ) 1 ( ) ( ) ( ) (NT 1/2Δ_N = NT −1/2T_v′_,_Nv_N + NT −1/2T_μ′_,_Nμ_N +o_p ,

where T_N is an (NT+N)×P-dimensional real nonstochastic matrix whose elements are bounded uniformly in absolute value; T_v,_N is of dimension (NT×P) and T_μ_,_N is of dimension (N×P). As remarked above, Δ_N typically denotes the difference between the parameter estimates and the true parameter values. Assumption 7 will be satisfied by many estimators. In Section IV, we verify that it holds if the model in equation (1a) is estimated by two-stages least squares (TSLS) or feasible generalized two-stages least squares (FGTSLS).

The limiting distribution of the GM estimator of θ_Nwill be seen to depend on (the inverse of) the matrix J′_NΘ_NJ_N and the variance-covariance matrix of a vector of quadratic forms in v_N and μ_N, denoted as q_N. We consider each of these expressions in the following. The

) 2 ( ) 2 4

( S+ × S+ matrix J_N of derivatives of the (4S+2)×1 vector of moment conditions in (11) is given by θ Γ γ θ J ′ ∂ − ∂ = ( ) ) ( _N N N N N b ) , , ,..., ( _,₁_, _, _, _, _, _, _, 1 N i N i N S i N i j j j j v σ σ = , with (19a) = N s i j, , s N N i N i

ρ

∂ − ∂(γ., Γ., b ) , i=1,...,(4S+2), s=1,...,S, = N i v j_,_σ _, v N N i N i

σ

∂ − ∂(γ_., Γ_., b ) , i=1,...,(4S+2), = N i j,σ₁, 1 ., ., ) (

σ

∂ − ∂ γ_i _N Γ_i _Nb_N , i=1,...,(4S+2),

(20)

Using 0 θ γ ₌ ′ ∂ ∂ _N

and ignoring the negative sign, we have

N N N N N ρ _θ Γ b Γ B J = ′ ∂ ∂ = ) ( , (19b)

where Γ_N is defined above and of dimension (4S+2)×[2S+S(S−1)/2+2] and B_N is a ) 2 ( ] 2 2 / ) 1 ( 2

[ S+S S− + × S+ matrix of the form

) , , , ( ₁ ′₂_, ′₃_, ′₄_, ′ = _N _N _N N B B B B B , (20a)

with B₁ =(I_S,0_S_×₂), )]B₂_,_N =[diag_sS₌₁(2

ρ

_s_,_N),0_S_×₂ , and B₃_,_N =(B′₃_,₁_,_N,...,B₃′_,_S₋₁_,_N)′ is an S

S

S( −1)/2× matrix, consisting of (S−1) vertically arranged blocks B₃_,_m,_N, ) 1 ( ,..., 1 − = S

m , which have the following structure:

) , , , ( , , , ( ) 2 , , 3mN = CmN dmN EmN 0S−m× B , (20b)

where C_m,_N is a (S−m)×(m−1) matrix of zeros,8 d_m,_N is a (S−m)×1 vector, defined as )

,...,

( ₁_, _,

,N m N S N

m =

ρ

+

ρ

d , and E_m_,_N =

ρ

_m_,_NI_S₋_m. Finally, B₄_,_N is a 2×(S +2)matrix, defined as ⎥ ⎦ ⎤ ⎢ ⎣ ⎡ = + × × 1 , 0 , 1 , 1 1 1 , 4 S S N ₀ 0 B . (20c)

For later reference, note that B_N has full column rank (S+2); as a consequence, the )

2 ( ) 2

(S+ × S+ matrix B′_NB_N is positive definite (see, e.g., Greene, 2003, p. 835).

We next consider the vector q_N and its limiting distribution. First, define q_N(θ_N,Δ_N) as the 1

) 2 4

( S+ × vector of sample moments as given by (11) with the expectation operator suppressed, evaluated at the true parameter values, and ignoring the deterministic constants:

8

(21)

= ) , ( _N _N N θ Δ q ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ ⎡ ′ ′ ′ ′ ′ ′ ′ ′ ′ ′ − N N b N N N a N N N S N N N S N N N S N N N S N N N N N N N N N N N N N N u C u u C u u C u u C u u C u u C u u C u u C u u C u u C u ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ ~ . ~ ~ ~ ~ ~ ~ ~ ~ , , , , 4 , , 3 , , 2 , , 1 , 1 , 4 , 1 , 3 , 1 , 2 , 1 , 1 1 , (21) where

∑

= = − ⊗ ′ ⊗ ′ − ⊗ − = S m N m N m N T N N s N s T N S m N m N m N T N s T 1 , , , 0 , , , 0 1 , , , , 1 [ ( )] ( ) [ ( )] ) 1 ( 1 M I I Q M M I Q M I I C

ρ

,

∑

= = − ⊗ ′ + ⊗ ′ − ⊗ − = S m mN mN N T N N s N s T N S m mN mN N T N s _T 1 , , , 0 , , , 0 1 , , , , 2 [ ( )] [ ( )] [ ( )] ) 1 ( 2 1 M I I Q M M I Q M I I C

ρ

,

∑

= = − ⊗ ′ ⊗ ′ − ⊗ = S m mN mN N T N N s N s T N S m mN mN N T N s 1 , , , 1 , , , 1 1 , , , , 3 [I (I M )]Q (I M M )Q [I (I M )] C

ρ

,

∑

= = − ⊗ ′ + ⊗ ′ − ⊗ = S m mN mN N T N N s N s T N S m mN mN N T N s 1 , , , 1 , , , 1 1 , , , , 4 [ ( )] [ ( )] [ ( )] 2 1 _I _I _M _Q _I _M _M _Q _I _I _M C ρ ρ ,

∑

= = − ⊗ ′ − ⊗ − = S m N m N m N T N S m N m N m N T N a _T 1 , , , 0 1 , , , [ ( )] [ ( )] ) 1 ( 1 M I I Q M I I C ρ ρ ,

∑

= = − ⊗ ′ − ⊗ = S m N m N m N T N S m N m N m N T N b 1 , , , 1 1 , , , [I (I M )]Q [I (I M )] C

ρ

. (22)

By Assumption 3 and Remark A.1 in Appendix A, the row and column sums of the symmetric NT×NT matrices C_p,_s,_N, p=1,...,4, s=1,...,S, Ca,N, and Cb,N are bounded

uniformly in absolute value. Also, note that C₃_,_s,_N, C₄_,_s,_N, C_b,_N differ from C₁_,_s,_N, C₂_,_s,_N,

N a,

C only by the normalization and the use of Q₁_,_N versus Q₀_,_N.

In light of (21) and Lemma B.1 (see Appendix B), the elements of N1/2q_N(ρ_N,Δ_N) can be expressed as

(22)

= ) , ( 2 / 1 N N N N q θ Δ (1) . 2 / 1 , , 2 / 1 2 / 1 , , 2 / 1 2 / 1 , , 4 , , 4 2 / 1 2 / 1 , , 3 , , 3 2 / 1 2 / 1 , , 2 , , 2 2 / 1 2 / 1 , , 1 , , 1 2 / 1 2 / 1 , 1 , 4 , 1 , 4 2 / 1 2 / 1 , 1 , 3 , 1 , 3 2 / 1 2 / 1 , 1 , 2 , 1 , 2 2 / 1 2 / 1 , 1 , 1 , 1 , 1 2 / 1 p N N b N N a N N N a N N a N N N S N N S N N N S N N S N N N S N N S N N N S N N S N N N N N N N N N N N N N N N N N N N N N o N N N N N N N N N N N N N N N N N N N N + ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎥ ⎦ ⎤ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎢ ⎣ ⎡ ′ + ′ ′ + ′ ′ + ′ ′ + ′ ′ + ′ ′ + ′ ′ + ′ ′ + ′ ′ + ′ ′ + ′ − − − − − − − − − − Δ α u C u Δ α u C u Δ α u C u Δ α u C u Δ α u C u Δ α u C u Δ α u C u Δ α u C u Δ α u C u Δ α u C u , (23) where 2 ( ,, ) 1 , ,sN N psN N p N E D C u α = − ′ , p=1,...,4, s=1,...,S, 2 ( , ) 1 ,N N aN N a N E D C u α = − ′ , and ) ( 2 , 1 ,N N bN N b N E D C u α = − ′

. By Lemma B.1 the elements of the P×1 vectors α_p,_s,_N,

4 ,..., 1

=

p , s=1,...,S, α_a,_N and α_b,_N are bounded uniformly