A multidimensional spatial lag panel data model with spatial moving average nested random effects errors


https://doi.org/10.1007/s00181-017-1410-7


Bernard Fingleton¹ · Julie Le Gallo² · Alain Pirotte³

Received: 21 March 2017 / Accepted: 4 October 2017

Abstract This paper focuses on a three-dimensional model that combines two different types of spatial interaction effects, i.e. endogenous interaction effects via a spatial lag on the dependent variable and interaction effects among the disturbances via spatial moving average (SMA) nested random effects errors. A three-stage procedure is proposed to estimate the parameters. In a first stage, the spatial lag panel data model is estimated using an instrumental variable (IV) estimator. In a second stage, a generalized moments (GM) approach is developed to estimate the SMA parameter and the variance components of the disturbance process using IV residuals from the first stage. In a third stage, to purge the equation of the specific structure of the disturbances, a Cochrane–Orcutt-type transformation is applied combined with the IV principle. This leads to the GM spatial IV estimator and the regression parameter estimates. Monte Carlo simulations show that our estimators are not very different in terms of root mean square error from those produced by maximum likelihood. The approach is applied to European Union regional employment data for regions nested within countries.


1 Department of Land Economy, University of Cambridge, Cambridge CB3 9EP, UK

2 CESAER UMR1041, AgroSup Dijon, INRA, University Bourgogne Franche-Comté, 26, Boulevard Petitjean, 21000 Dijon, France


Keywords Multidimensional · Spatial moving average nested random effects · Generalized moments · Instrumental variables · Maximum likelihood · Panel data

JEL Classification C13 · C23

1 Introduction

Recently, Fingleton et al. (2016) introduced a generalization of the Kapoor et al. (2007) (hereafter KKP) generalized moments (GM) procedure to multidimensional panel data models assuming that the disturbances follow a first-order spatial autoregressive (SAR) process, which includes a nested random effects structure, namely SAR-NRE. They refer to this specification as a panel data model with spatially nested random effects disturbances. They derive a spatial feasible generalized least squares (S-FGLS) estimator for the model's regression parameters which uses the GM parameter estimates of the SAR parameter and the variance components of the disturbance process, namely GM-S-FGLS. This estimator is based on a spatial counterpart to the Cochrane–Orcutt transformation, as well as on transformations which are used in the estimation of classical error component models.

In this paper, we consider a more general multidimensional panel data model which includes a spatial lag and where the disturbances are assumed to follow a spatial moving average (SMA) process (local spatial spillover effects) in the spirit of Fingleton (2008). This structure constitutes an alternative to incorporating spatial lags on the explanatory variables. In the cross-sectional case, when the model contains a spatially lagged dependent variable, Kelejian and Prucha (1998, 1999) suggest a 2SLS procedure. They propose that the instrument set should be kept to a low order to avoid linear dependence and retain full column rank for the matrix of instruments, and thus recommend that $(X, WX)$ should be used if the number of regressors is large. Inclusion of spatial lags of the explanatory variables could have a major impact on the performance of the estimation procedures if one were to keep to this recommendation. Pace et al. (2012) show that instrumental variable estimation suffers greatly in situations where spatial lags of the explanatory variables ($WX$) are included in the model specification. The reason is that this requires the use of $(W^2X, W^3X, \ldots)$ as instruments, in place of the conventional instruments that rely on $WX$, and this appears to result in a weak instrument problem. Our motivation for the adoption of an SMA specification of the error process, which has been largely neglected in spatial econometrics, is that it mitigates the problem for instrumental variable estimation identified by Pace et al. (2012). Naturally the choice of this specification should be predicated on the applied researcher, at a preliminary stage, examining the nature of local spillovers in order to establish its appropriateness for the empirical application at hand.

We propose a three-stage procedure to estimate the parameters. In a first stage, the spatial lag panel data model is estimated using an instrumental variable (IV) estimator. In a second stage, a GM approach is developed to estimate the SMA parameter and the variance components of the disturbance process using IV residuals from the first stage. In a third stage, to purge the equation of the specific structure of the disturbances, a Cochrane–Orcutt-type transformation combined with the IV principle is applied.


This leads to the GM spatial IV estimator and the regression parameter estimates of the spatial lag model. Monte Carlo simulations show that our estimates are not very different in terms of root mean square error compared to those produced by maximum likelihood (ML).

The outline of the paper is as follows: Sect. 2 presents the spatial lag panel data model with spatial moving average nested random effects errors, and Sect. 3 focuses on estimation methods. This section introduces a spatial GM instrumental variable approach to estimate the parameters of the model. Section 4 presents the Monte Carlo design and describes the Monte Carlo results. Section 5 illustrates our approach using an application to EU regional employment data for regions nested within countries. The last section concludes.

2 The spatial model

Our point of departure is a three-dimensional model that combines two different types of spatial interaction effects, i.e. endogenous interaction effects via a spatial lag on the dependent variable and interaction effects among the disturbances via a spatial moving average (SMA) process on the error term. The notation is as follows: the dependent variable $y_{ijt}$ is observed along three indices, with $i = 1, \ldots, N$, $j = 1, \ldots, M_i$ and $t = 1, \ldots, T$. $N$ denotes the number of groups. $M_i$ denotes the number of individuals in group $i$, so in total there are $S = \sum_{i=1}^{N} M_i$ individuals. Since this model allows for an unequal number of individuals across the $N$ groups, it is unbalanced in the spatial dimension, although it is balanced in the time dimension. Hence, the model describes a hierarchical structure with the index $j$ pertaining to individuals that are nested within the $N$ groups. Assuming that spatial autocorrelation only takes place at the individual level and that the slope coefficients are homogeneous, the model can be written as:

$$y_{ijt} = \rho \sum_{g=1}^{N}\sum_{h=1}^{M_g} w_{ij,gh}\, y_{ght} + x_{ijt}\beta + \varepsilon_{ijt}, \tag{1}$$

where $y_{ijt}$ is the dependent variable; $x_{ijt}$ is a $(1 \times K)$ vector of explanatory exogenous variables; $\beta$ represents a $(K \times 1)$ vector of parameters to be estimated; and $\varepsilon_{ijt}$ is the disturbance, the properties of which will be discussed below.

The weight $w_{ij,gh} = w_{k,l}$ is the $(k = ij;\ l = gh)$ element of the spatial matrix $W_S$ with $ij$ denoting individual $j$ within group $i$, and similarly for $gh$. Thus, $k, l = 1, \ldots, S$ and $W_S$ is an $(S \times S)$ matrix of known spatial weights which has zeros on the leading diagonal and is usually row-normalized so that for row $k$, $\sum_{g=1}^{N}\sum_{h=1}^{M_g} w_{k,gh} = 1$, although as we will illustrate in the empirical example other normalizations are permissible. We maintain the standard assumption concerning the weight matrix, i.e. $W_S$ is assumed non-stochastic, and its row and column sums are required to be uniformly bounded in absolute value. $\rho$ is the spatial lag parameter to be estimated. This coefficient is bounded numerically to ensure spatial stationarity, i.e. $e_{\min}^{-1} < \rho < 1$ where


In this paper, we consider the case of disturbances $\varepsilon_{ijt}$ that are contemporaneously correlated through a moving average process at the individual level:

$$\varepsilon_{ijt} = u_{ijt} - \lambda \sum_{g=1}^{N}\sum_{h=1}^{M_g} m_{ij,gh}\, u_{ght}. \tag{2}$$

The weight $m_{ij,gh}$ is an element of the spatial matrix $M_S$ which satisfies the same assumptions as for $W_S$. For simplicity in the following, we assume that $M_S = W_S$. $\lambda$ is the spatial moving average parameter to be estimated. $u_{ijt}$ is assumed to be i.i.d. $(0, \sigma_u^2)$. Spatial heterogeneity is captured through a random effects structure for the errors $u_{ijt}$: it contains an unobserved permanent unit-specific error component $\alpha_i$, a nested permanent unit-specific error component $\mu_{ij}$ together with a remainder error component $v_{ijt}$. Hence, we envisage a time-invariant group effect applying equally to all individuals nested within a group, time-invariant individual group-specific effects and transient effects that vary at random across groups, individuals and time. More formally:

$$u_{ijt} = \alpha_i + \mu_{ij} + v_{ijt}, \tag{3}$$

in which $\alpha_i$ is the unobservable group-specific time-invariant effect which is assumed to be i.i.d. $N(0, \sigma_\alpha^2)$; $\mu_{ij}$ is the nested effect of individual $j$ within the $i$th group which is assumed to be i.i.d. $N(0, \sigma_\mu^2)$ and $v_{ijt}$ is the remainder term which is also assumed to be i.i.d. $N(0, \sigma_v^2)$. The $\alpha_i$'s, $\mu_{ij}$'s and $v_{ijt}$'s are independent of each other and among themselves.

In contrast to the classical literature on panel data, grouping the data by periods rather than units is more convenient when we consider spatial autocorrelation. For a cross section $t$, Eqs. (1), (2) and (3) can be written as:

$$y_t = \rho W_S y_t + x_t\beta + \varepsilon_t, \tag{4}$$

where $y_t$ is of dimension $(S \times 1)$ and $x_t$ is an $(S \times K)$ matrix of explanatory variables that are assumed to be exogenous and non-stochastic and have elements uniformly bounded in absolute value. The first-order moving average error process $\varepsilon_t$ is given by

$$\varepsilon_t = u_t - \lambda W_S u_t, \tag{5}$$

with

$$u_t = \mathrm{diag}(\iota_{M_i})\,\alpha + \mu + v_t, \tag{6}$$

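where $u_t$ is $(S \times 1)$, $\alpha$ is the vector of group effects of dimension $(N \times 1)$, $\mu' = (\mu_1', \ldots, \mu_N')$ is a vector of dimension $(1 \times S)$, $\mu_i' = (\mu_{i1}, \ldots, \mu_{iM_i})$ is a vector of dimension $(1 \times M_i)$, and $\iota_{M_i}$ is a vector of ones of dimension $(M_i \times 1)$. By $\mathrm{diag}(\iota_{M_i})$, we mean $\mathrm{diag}(\iota_{M_1}, \ldots, \iota_{M_N})$. Finally, $v_t$ is of dimension $(S \times 1)$.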
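As an illustration of Eqs. (5) and (6), the following sketch draws the three error components and builds the SMA disturbance period by period. The function name `draw_sma_nested_errors`, its argument list and the use of NumPy are assumptions made for this example only; they are not taken from the paper.

```python
import numpy as np

def draw_sma_nested_errors(M, T, W_S, lam, s2_alpha, s2_mu, s2_v, rng):
    """Return (u, eps), each of shape (T, S), following Eqs. (5)-(6).

    M is the sequence of group sizes (M_1, ..., M_N) and W_S the (S x S)
    spatial weight matrix. All names here are hypothetical.
    """
    M = np.asarray(M)
    N, S = len(M), int(M.sum())
    group = np.repeat(np.arange(N), M)              # group index i of each individual ij
    alpha = rng.normal(0.0, np.sqrt(s2_alpha), N)   # group effects alpha_i
    mu = rng.normal(0.0, np.sqrt(s2_mu), S)         # nested individual effects mu_ij
    v = rng.normal(0.0, np.sqrt(s2_v), (T, S))      # remainder term v_ijt
    u = alpha[group] + mu + v                       # u_t = diag(iota_Mi) alpha + mu + v_t
    eps = u - lam * (u @ W_S.T)                     # eps_t = u_t - lam * W_S u_t, one row per t
    return u, eps
```

Each row of the returned arrays is one cross section $t$, so the group and individual effects are shared across rows while only $v_t$ changes over time.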

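Stacking the $T$ cross sections gives

$$y = \rho W y + X\beta + \varepsilon = Z\delta + \varepsilon, \tag{7}$$

and

$$\varepsilon = u - \lambda W u, \tag{8}$$

with $Z = [Wy, X]$ and $\delta' = [\rho, \beta']$. $y$ and $X$ are the vector and matrix of the dependent and explanatory variables (covariates), respectively, of size $(TS \times 1)$ and $(TS \times K)$; $\beta$ is the vector of the slope parameters of size $(K \times 1)$; and finally $\varepsilon$ is the vector of the disturbance terms of dimension $(TS \times 1)$. Given that $I_T$ is an identity matrix of dimension $(T \times T)$, then $W = (I_T \otimes W_S)$ is of size $(TS \times TS)$. Finally, for the full $(TS \times 1)$ vector $u$, we have:

$$u = \left(\iota_T \otimes \mathrm{diag}(\iota_{M_i})\right)\alpha + (\iota_T \otimes I_S)\,\mu + v. \tag{9}$$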

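In order to compute the GM-S-IV estimator of $\delta$, which is described in Sect. 3.2, we need to obtain the inverse of the covariance matrix of $u$, which is $\Omega_u^{-1}$. This is achieved by means of the spectral decomposition. Following Baltagi et al. (2014), the covariance matrix of $u_t$ is:

$$E[u_t u_t'] = \sigma_\alpha^2\,\mathrm{diag}(J_{M_i}) + (\sigma_\mu^2 + \sigma_v^2)\, I_S, \tag{10}$$

where $I_S = \mathrm{diag}(I_{M_i})$ is an identity matrix of dimension $S$ and $J_{M_i} = \iota_{M_i}\iota_{M_i}'$ is a matrix of ones of dimension $(M_i \times M_i)$. The covariance matrix of $u$ corresponds to:

$$\Omega_u = \sigma_\alpha^2 Z_\alpha Z_\alpha' + \sigma_\mu^2 Z_\mu Z_\mu' + \sigma_v^2 (I_T \otimes I_S) = \sigma_\alpha^2 \left(J_T \otimes \mathrm{diag}(J_{M_i})\right) + \left(\sigma_\mu^2 J_T + \sigma_v^2 I_T\right) \otimes I_S, \tag{11}$$

where $Z_\alpha = \iota_T \otimes \mathrm{diag}(\iota_{M_i})$, $Z_\mu = \iota_T \otimes I_S$ and $J_T = \iota_T\iota_T'$ is a matrix of ones of dimension $(T \times T)$. Replace $J_T$ by $T\bar{J}_T$ and $J_{M_i}$ by $M_i\bar{J}_{M_i}$, where $\bar{J}_T = J_T/T$ and $\bar{J}_{M_i} = J_{M_i}/M_i$ are their idempotent counterparts. Also, define $E_T = I_T - \bar{J}_T$ and $E_{M_i} = I_{M_i} - \bar{J}_{M_i}$, and replace $I_T$ by $E_T + \bar{J}_T$ and $I_{M_i}$ by $E_{M_i} + \bar{J}_{M_i}$. Collecting terms with the same matrices, one gets the spectral decomposition of $\Omega_u$:

$$\Omega_u = \theta_1 Q_1 + \theta_2 Q_2 + \left(I_T \otimes \mathrm{diag}(\theta_{3i} I_{M_i})\right) Q_3, \tag{12}$$

with

$$\theta_1 = \sigma_v^2, \quad \theta_2 = T\sigma_\mu^2 + \sigma_v^2, \quad \theta_{3i} = M_i T\sigma_\alpha^2 + T\sigma_\mu^2 + \sigma_v^2. \tag{13}$$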

These equalities occur because of the definitions¹ of $Q_1$, $Q_2$ and $Q_3$. It turns out that $Q_1$ relates to the within transformation. $Q_2$ and $Q_3$ relate, respectively, to between and mean transformation matrices. More formally,

$$Q_1 = E_T \otimes I_S, \quad Q_2 = \bar{J}_T \otimes \mathrm{diag}(E_{M_i}), \tag{14}$$

$$Q_3 = \bar{J}_T \otimes \mathrm{diag}(\bar{J}_{M_i}). \tag{15}$$

The operators $Q_1$, $Q_2$ and $Q_3$ are symmetric and idempotent, with their rank equal to their trace. Moreover, they are pairwise orthogonal and sum to the identity matrix. From (12), we can easily obtain $\Omega_u^{-1}$ as:

$$\Omega_u^{-1} = \theta_1^{-1} Q_1 + \theta_2^{-1} Q_2 + \left(I_T \otimes \mathrm{diag}(\theta_{3i}^{-1} I_{M_i})\right) Q_3. \tag{16}$$
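The spectral decomposition in Eqs. (12) to (16) can be checked numerically on a small unbalanced example. The sketch below is only an illustration under stated assumptions: the helper names (`block_diag`, `spectral_pieces`, `omega_u_direct`) are hypothetical, and the full Kronecker products are formed explicitly, which is only sensible for small $T$ and $S$.

```python
import numpy as np

def block_diag(blocks):
    """Stack square blocks along the diagonal (hypothetical helper)."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    pos = 0
    for b in blocks:
        k = b.shape[0]
        out[pos:pos + k, pos:pos + k] = b
        pos += k
    return out

def spectral_pieces(M, T, s2_alpha, s2_mu, s2_v):
    """Build Omega_u and its inverse from Q1, Q2, Q3 as in Eqs. (12)-(16)."""
    M = list(M)
    S = sum(M)
    Jbar_T = np.full((T, T), 1.0 / T)
    E_T = np.eye(T) - Jbar_T
    Jbar = block_diag([np.full((m, m), 1.0 / m) for m in M])  # diag(Jbar_Mi)
    E = np.eye(S) - Jbar                                      # diag(E_Mi)
    Q1 = np.kron(E_T, np.eye(S))
    Q2 = np.kron(Jbar_T, E)
    Q3 = np.kron(Jbar_T, Jbar)
    theta1 = s2_v
    theta2 = T * s2_mu + s2_v
    # theta_3i repeated M_i times, one value per individual
    theta3 = np.repeat([m * T * s2_alpha + T * s2_mu + s2_v for m in M], M)
    D3 = np.kron(np.eye(T), np.diag(theta3))
    D3_inv = np.kron(np.eye(T), np.diag(1.0 / theta3))
    omega = theta1 * Q1 + theta2 * Q2 + D3 @ Q3
    omega_inv = Q1 / theta1 + Q2 / theta2 + D3_inv @ Q3
    return omega, omega_inv, (Q1, Q2, Q3)

def omega_u_direct(M, T, s2_alpha, s2_mu, s2_v):
    """Omega_u written directly from Eq. (11), for comparison."""
    S = sum(M)
    J_T = np.ones((T, T))
    Delta = block_diag([np.ones((m, m)) for m in M])          # diag(J_Mi)
    return (s2_alpha * np.kron(J_T, Delta)
            + s2_mu * np.kron(J_T, np.eye(S))
            + s2_v * np.eye(T * S))
```

Comparing `spectral_pieces` with `omega_u_direct` for, say, group sizes `[2, 3]` and `T = 3` confirms that the two expressions for $\Omega_u$ coincide and that the matrix in (16) is its inverse.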

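For the full $(TS \times 1)$ vector $\varepsilon$, we then have:

$$\varepsilon = u - \lambda (I_T \otimes W_S)\, u, \tag{17}$$

or

$$\varepsilon = \left[I_{TS} - \lambda (I_T \otimes W_S)\right] u = (I_T \otimes G_S)\, u, \tag{18}$$

where $G_S = I_S - \lambda W_S$. The corresponding $(TS \times TS)$ covariance matrix is given by:

$$\Omega_\varepsilon = A\,\Omega_u A', \tag{19}$$

where $A$ is a block diagonal matrix equal to $(I_T \otimes G_S)$. Following the properties of the matrices $\Omega_u$ and $A$, we obtain the inverse covariance matrix of $\varepsilon$ defined as:

$$\Omega_\varepsilon^{-1} = A'^{-1}\,\Omega_u^{-1} A^{-1}. \tag{20}$$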

3 Estimation methods

The estimation methods for multidimensional spatial panel models are direct extensions of those developed for standard spatial panel data econometrics. This means that two main approaches are used to estimate these models: one based on the ML principle and the other linked to method of moments procedures.

3.1 Maximum likelihood estimation

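Upton and Fingleton (1985), Anselin (1988), LeSage and Pace (2009) and Elhorst (2014) provide the general framework for ML estimation of spatial models. Under normality of the disturbances, the log-likelihood function is

$$\ln L = -\frac{TS}{2}\ln(2\pi) - \frac{1}{2}\ln|\Omega_\varepsilon| + T\ln|D_S| - \frac{1}{2}(Dy - X\beta)'\,\Omega_\varepsilon^{-1}(Dy - X\beta), \tag{21}$$

where $D_S = (I_S - \rho W_S)$ and $D = (I_T \otimes D_S)$. For a SMA process for the disturbances:

$$\ln L = -\frac{TS}{2}\ln(2\pi) - \frac{1}{2}\ln|\Omega_u| - T\ln|G_S| + T\ln|D_S| - \frac{1}{2}(Dy - X\beta)'\,\Omega_\varepsilon^{-1}(Dy - X\beta). \tag{22}$$

Let $\gamma_1 = \sigma_\alpha^2/\sigma_v^2$, $\gamma_2 = \sigma_\mu^2/\sigma_v^2$ and $\Omega_\varepsilon = \sigma_v^2\Sigma_\varepsilon$, then the log-likelihood function (22) can be written as²

$$\begin{aligned}
\ln L = {} & -\frac{TS}{2}\ln(2\pi) - \frac{TS}{2}\ln\sigma_v^2 - \frac{1}{2}\sum_{i=1}^{N}\ln\left(T(M_i\gamma_1 + \gamma_2) + 1\right) - \frac{1}{2}\sum_{i=1}^{N}(M_i - 1)\ln(T\gamma_2 + 1) \\
& - T\sum_{i=1}^{N}\sum_{j=1}^{M_i}\ln(1 - \omega_{ij}\lambda) + T\sum_{i=1}^{N}\sum_{j=1}^{M_i}\ln(1 - \eta_{ij}\rho) - \frac{1}{2\sigma_v^2}(Dy - X\beta)'\,\Sigma_\varepsilon^{-1}(Dy - X\beta).
\end{aligned} \tag{23}$$

The first-order conditions for the parameters in (22) and (23) are intertwined, which means that they are nonlinear, i.e. the equations cannot be solved analytically. Therefore, a numerical solution by means of an iterative procedure is needed in the spirit of Anselin (1988).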
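The step from (21) to (22) rests on the identity $\ln|\Omega_\varepsilon| = \ln|\Omega_u| + 2T\ln|G_S|$, which follows from (19), and the determinant terms in (23) come from the characteristic roots $\theta_1$, $\theta_2$ and $\theta_{3i}$ of $\Omega_u$. The following sketch checks both numerically on a small example. The function name `logdet_terms` is hypothetical, and the dense matrices are built explicitly purely for illustration.

```python
import numpy as np

def logdet_terms(M, T, W_S, lam, s2_alpha, s2_mu, s2_v):
    """Return ln|Omega_eps| computed three ways (brute force, Eq. 22 form, Eq. 23 form)."""
    M = list(M)
    S = sum(M)
    # Delta = diag(J_Mi): blocks of ones for each group
    Delta = np.zeros((S, S))
    pos = 0
    for m in M:
        Delta[pos:pos + m, pos:pos + m] = 1.0
        pos += m
    J_T = np.ones((T, T))
    omega_u = (s2_alpha * np.kron(J_T, Delta)
               + s2_mu * np.kron(J_T, np.eye(S))
               + s2_v * np.eye(T * S))
    G_S = np.eye(S) - lam * W_S
    A = np.kron(np.eye(T), G_S)
    omega_eps = A @ omega_u @ A.T
    # slogdet returns (sign, log|det|); only the log part is needed here
    brute = np.linalg.slogdet(omega_eps)[1]
    ln_G = np.linalg.slogdet(G_S)[1]
    via_22 = np.linalg.slogdet(omega_u)[1] + 2 * T * ln_G
    g1, g2 = s2_alpha / s2_v, s2_mu / s2_v
    via_23 = (T * S * np.log(s2_v)
              + sum(np.log(T * (m * g1 + g2) + 1.0) for m in M)
              + sum((m - 1) * np.log(T * g2 + 1.0) for m in M)
              + 2 * T * ln_G)
    return brute, via_22, via_23
```

The three returned values should agree up to rounding error, which is what allows the likelihood to be evaluated without forming or factorizing the full $(TS \times TS)$ covariance matrix.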

3.2 GM and instrumental variables

There are several issues with ML procedures. First, they call for explicit distributional assumptions, which may be difficult to satisfy, although quasi-ML (QML) approaches may to some extent allay this problem. Second, specifying and maximizing likelihood functions appropriate to extensions to more complex models may be problematic, especially if there are endogenous variables other than the spatial lag, as ML estimation is not possible when endogeneity is in implicit form. Finally, they are very computationally intensive. In view of the desirability of estimation approaches that avoid some of these challenges posed by ML, Kelejian and Prucha (1998, 1999) suggested an alternative instrumental variable estimation procedure for the cross-sectional spatial lag model also including a SAR process for the disturbances. This approach is based on a GM estimator of the parameter in the SAR process. The procedures suggested in Kelejian and Prucha (1998, 1999) are computationally feasible even for very large sample sizes. In a panel data context with a spatial error autoregressive process, KKP (2007) derive a GM estimator, which is computationally feasible even for large sample sizes, while Fingleton et al. (2016) extend this procedure to capture spatial autoregressive nested random effects errors. We follow this approach and adapt the moments conditions in order to consider SMA nested random effects errors.

2 See Baltagi et al. (2001) who give the expression of the log-likelihood function for the multidimensional

3.2.1 Moments conditions

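We follow Fingleton et al. (2016) to develop a GM approach leading to estimators of $\lambda$, $\sigma_\alpha^2$, $\sigma_\mu^2$, $\sigma_v^2$, or equivalently of $\lambda$, $\sigma_\alpha^2$, $\theta_2 = T\sigma_\mu^2 + \sigma_v^2$ and $\sigma_v^2$. It relies on moments conditions related to $E[u'Q_iu]$, $E[u'Q_i\bar{u}]$, $E[\bar{u}'Q_i\bar{u}]$, $E[\bar{u}'Q_i\bar{\bar{u}}]$, $E[\bar{\bar{u}}'Q_i\bar{\bar{u}}]$, $E[u'Q_i\bar{\bar{u}}]$, $i = 1, 2, 3$. For notational convenience, we have

$$\bar{\varepsilon} = (I_T \otimes W_S)\,\varepsilon, \tag{24}$$

$$\bar{\bar{\varepsilon}} = (I_T \otimes W_S)\,\bar{\varepsilon}, \tag{25}$$

$$\bar{u} = (I_T \otimes W_S)\,u, \tag{26}$$

$$\bar{\bar{u}} = (I_T \otimes W_S)\,\bar{u}. \tag{27}$$

Following (17), we have

$$\varepsilon = u - \lambda\bar{u}, \tag{28}$$

$$\bar{\varepsilon} = \bar{u} - \lambda\bar{\bar{u}}. \tag{29}$$

First, we compute the quadratic moments with respect to $Q_1$:

$$\varepsilon'Q_1\varepsilon = (u - \lambda\bar{u})'Q_1(u - \lambda\bar{u}) = u'Q_1u + \lambda^2\bar{u}'Q_1\bar{u} - 2\lambda u'Q_1\bar{u}, \tag{30}$$

$$\varepsilon'Q_1\bar{\varepsilon} = (u - \lambda\bar{u})'Q_1(\bar{u} - \lambda\bar{\bar{u}}) = u'Q_1\bar{u} + \lambda^2\bar{u}'Q_1\bar{\bar{u}} - \lambda\left(u'Q_1\bar{\bar{u}} + \bar{u}'Q_1\bar{u}\right), \tag{31}$$

$$\bar{\varepsilon}'Q_1\bar{\varepsilon} = (\bar{u} - \lambda\bar{\bar{u}})'Q_1(\bar{u} - \lambda\bar{\bar{u}}) = \bar{u}'Q_1\bar{u} + \lambda^2\bar{\bar{u}}'Q_1\bar{\bar{u}} - 2\lambda\bar{\bar{u}}'Q_1\bar{u}. \tag{32}$$

Then, the expectations of the quadratic moments (30), (31), (32) depend on the moments $E[u'Q_1u]$, $E[u'Q_1\bar{u}]$, $E[\bar{u}'Q_1\bar{u}]$, $E[\bar{u}'Q_1\bar{\bar{u}}]$, $E[\bar{\bar{u}}'Q_1\bar{\bar{u}}]$, $E[u'Q_1\bar{\bar{u}}]$. After some computations,³ these latter expectations are given by:

$$E[u'Q_1u] = \sigma_v^2 S(T - 1), \tag{33}$$

$$E[u'Q_1\bar{u}] = 0, \tag{34}$$

$$E[\bar{u}'Q_1\bar{u}] = \sigma_v^2 (T - 1)\,\mathrm{tr}(W_S'W_S), \tag{35}$$

$$E[\bar{u}'Q_1\bar{\bar{u}}] = \sigma_v^2 (T - 1)\,\mathrm{tr}(W_S'W_SW_S), \tag{36}$$

$$E[\bar{\bar{u}}'Q_1\bar{\bar{u}}] = \sigma_v^2 (T - 1)\,\mathrm{tr}(W_S'W_S'W_SW_S), \tag{37}$$

$$E[u'Q_1\bar{\bar{u}}] = \sigma_v^2 (T - 1)\,\mathrm{tr}(W_SW_S). \tag{38}$$

Substituting (33) to (38) into (30), (31) and (32) gives:

$$E[\varepsilon'Q_1\varepsilon] = \sigma_v^2 S(T - 1) + \lambda^2\sigma_v^2 (T - 1)\,\mathrm{tr}(W_S'W_S), \tag{39}$$

$$E[\varepsilon'Q_1\bar{\varepsilon}] = \lambda^2\sigma_v^2 (T - 1)\,\mathrm{tr}(W_S'W_SW_S) - \lambda\sigma_v^2 (T - 1)\,\mathrm{tr}(W_S'W_S + W_SW_S), \tag{40}$$

$$E[\bar{\varepsilon}'Q_1\bar{\varepsilon}] = \sigma_v^2 (T - 1)\,\mathrm{tr}(W_S'W_S) + \lambda^2\sigma_v^2 (T - 1)\,\mathrm{tr}(W_S'W_S'W_SW_S) - 2\lambda\sigma_v^2 (T - 1)\,\mathrm{tr}(W_S'W_SW_S). \tag{41}$$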

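We proceed in a similar fashion, replacing $Q_1$ in (30), (31) and (32) by $Q_2$ and by $Q_3$. Let $\Delta = \mathrm{diag}(J_{M_i})$, $W_S^{\bullet} = \mathrm{diag}(E_{M_i})W_S$ and $W_S^{\bullet\bullet} = \mathrm{diag}(E_{M_i})W_SW_S$, and similarly $W_S^{*} = \mathrm{diag}(\bar{J}_{M_i})W_S$ and $W_S^{**} = \mathrm{diag}(\bar{J}_{M_i})W_SW_S$. The moments $E[u'Q_iu]$, $E[u'Q_i\bar{u}]$, $E[\bar{u}'Q_i\bar{u}]$, $E[\bar{u}'Q_i\bar{\bar{u}}]$, $E[\bar{\bar{u}}'Q_i\bar{\bar{u}}]$, $E[u'Q_i\bar{\bar{u}}]$, $i = 2, 3$, are:

$$E[u'Q_2u] = \theta_2 (S - N), \tag{42}$$

$$E[u'Q_2\bar{u}] = \theta_2\,\mathrm{tr}(W_S^{\bullet}), \tag{43}$$

$$E[\bar{u}'Q_2\bar{u}] = \theta_2\,\mathrm{tr}(W_S^{\bullet\prime}W_S^{\bullet}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{\bullet\prime}W_S^{\bullet}\Delta), \tag{44}$$

$$E[\bar{u}'Q_2\bar{\bar{u}}] = \theta_2\,\mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet}\Delta), \tag{45}$$

$$E[\bar{\bar{u}}'Q_2\bar{\bar{u}}] = \theta_2\,\mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet\bullet}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet\bullet}\Delta), \tag{46}$$

$$E[u'Q_2\bar{\bar{u}}] = \theta_2\,\mathrm{tr}(W_S^{\bullet\bullet}), \tag{47}$$

and

$$E[u'Q_3u] = N\theta_2 + ST\sigma_\alpha^2, \tag{48}$$

$$E[u'Q_3\bar{u}] = \theta_2\,\mathrm{tr}(W_S^{*}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S\Delta), \tag{49}$$

$$E[\bar{u}'Q_3\bar{u}] = \theta_2\,\mathrm{tr}(W_S^{*\prime}W_S^{*}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{*\prime}W_S^{*}\Delta), \tag{50}$$

$$E[\bar{u}'Q_3\bar{\bar{u}}] = \theta_2\,\mathrm{tr}(W_S^{**\prime}W_S^{*}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{**\prime}W_S^{*}\Delta), \tag{51}$$

$$E[\bar{\bar{u}}'Q_3\bar{\bar{u}}] = \theta_2\,\mathrm{tr}(W_S^{**\prime}W_S^{**}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{**\prime}W_S^{**}\Delta), \tag{52}$$

$$E[u'Q_3\bar{\bar{u}}] = \theta_2\,\mathrm{tr}(W_S^{**}) + T\sigma_\alpha^2\,\mathrm{tr}(W_SW_S\Delta). \tag{53}$$

Consequently, for $Q_2$:

$$E[\varepsilon'Q_2\varepsilon] = \theta_2 (S - N) + \lambda^2\left[\theta_2\,\mathrm{tr}(W_S^{\bullet\prime}W_S^{\bullet}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{\bullet\prime}W_S^{\bullet}\Delta)\right] - 2\lambda\theta_2\,\mathrm{tr}(W_S^{\bullet}), \tag{54}$$

$$E[\varepsilon'Q_2\bar{\varepsilon}] = \theta_2\,\mathrm{tr}(W_S^{\bullet}) + \lambda^2\left[\theta_2\,\mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet}\Delta)\right] - \lambda\left[\theta_2\,\mathrm{tr}(W_S^{\bullet\prime}W_S^{\bullet} + W_S^{\bullet\bullet}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{\bullet\prime}W_S^{\bullet}\Delta)\right], \tag{55}$$

$$E[\bar{\varepsilon}'Q_2\bar{\varepsilon}] = \theta_2\,\mathrm{tr}(W_S^{\bullet\prime}W_S^{\bullet}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{\bullet\prime}W_S^{\bullet}\Delta) + \lambda^2\left[\theta_2\,\mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet\bullet}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet\bullet}\Delta)\right] - \lambda\left[2\theta_2\,\mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet}) + 2T\sigma_\alpha^2\,\mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet}\Delta)\right], \tag{56}$$

and for $Q_3$:

$$E[\varepsilon'Q_3\varepsilon] = \theta_2 N + \sigma_\alpha^2 ST + \lambda^2\left[\theta_2\,\mathrm{tr}(W_S^{*\prime}W_S^{*}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{*\prime}W_S^{*}\Delta)\right] - \lambda\left[2\theta_2\,\mathrm{tr}(W_S^{*}) + 2T\sigma_\alpha^2\,\mathrm{tr}(W_S\Delta)\right], \tag{57}$$

$$\begin{aligned}
E[\varepsilon'Q_3\bar{\varepsilon}] = {} & \theta_2\,\mathrm{tr}(W_S^{*}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S\Delta) + \lambda^2\left[\theta_2\,\mathrm{tr}(W_S^{**\prime}W_S^{*}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{**\prime}W_S^{*}\Delta)\right] \\
& - \lambda\left[\theta_2\,\mathrm{tr}(W_S^{*\prime}W_S^{*} + W_S^{**}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{*\prime}W_S^{*}\Delta + W_SW_S\Delta)\right],
\end{aligned} \tag{58, 59}$$

$$E[\bar{\varepsilon}'Q_3\bar{\varepsilon}] = \theta_2\,\mathrm{tr}(W_S^{*\prime}W_S^{*}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{*\prime}W_S^{*}\Delta) + \lambda^2\left[\theta_2\,\mathrm{tr}(W_S^{**\prime}W_S^{**}) + T\sigma_\alpha^2\,\mathrm{tr}(W_S^{**\prime}W_S^{**}\Delta)\right] - \lambda\left[2\theta_2\,\mathrm{tr}(W_S^{**\prime}W_S^{*}) + 2T\sigma_\alpha^2\,\mathrm{tr}(W_S^{**\prime}W_S^{*}\Delta)\right]. \tag{60}$$

Overall, we obtain a system of nine equations involving the second moments of $\varepsilon$, $\bar{\varepsilon}$: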

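$$\Gamma\Upsilon - \gamma = 0, \tag{61}$$

where

$$\Gamma = \begin{pmatrix}
\kappa_1\kappa_2 & \kappa_2 t_1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & \kappa_2 t_2 & \kappa_2 t_3 & 0 & 0 & 0 & 0 & 0 & 0 \\
\kappa_2 t_1 & \kappa_2 t_4 & 2\kappa_2 t_2 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & \kappa_3 & t_1^{\bullet} & 2t_6^{\bullet} & 0 & \kappa_4 t_5^{\bullet} & 0 \\
0 & 0 & 0 & t_6^{\bullet} & t_7 & t_8^{\bullet} & 0 & \kappa_4 t_9^{\bullet} & \kappa_4 t_5^{\bullet} \\
0 & 0 & 0 & t_1^{\bullet} & t_1^{\bullet\bullet} & 2t_7 & \kappa_4 t_5^{\bullet} & \kappa_4 t_5^{\bullet\bullet} & 2\kappa_4 t_9^{\bullet} \\
0 & 0 & 0 & \kappa_5 & t_1^{*} & 2t_6^{*} & \kappa_1\kappa_4 & \kappa_4 t_5^{*} & 2\kappa_4 t_{10} \\
0 & 0 & 0 & t_6^{*} & t_{11} & t_8^{*} & \kappa_4 t_{10} & \kappa_4 t_9^{*} & \kappa_4 t_{12} \\
0 & 0 & 0 & t_1^{*} & t_1^{**} & 2t_{11} & \kappa_4 t_5^{*} & \kappa_4 t_5^{**} & 2\kappa_4 t_9^{*}
\end{pmatrix}, \tag{62}$$

and

$$\Upsilon = \begin{pmatrix} \sigma_v^2 \\ \lambda^2\sigma_v^2 \\ -\lambda\sigma_v^2 \\ \theta_2 \\ \lambda^2\theta_2 \\ -\lambda\theta_2 \\ \sigma_\alpha^2 \\ \lambda^2\sigma_\alpha^2 \\ -\lambda\sigma_\alpha^2 \end{pmatrix}, \quad
\gamma = \begin{pmatrix} E[\varepsilon'Q_1\varepsilon] \\ E[\varepsilon'Q_1\bar{\varepsilon}] \\ E[\bar{\varepsilon}'Q_1\bar{\varepsilon}] \\ E[\varepsilon'Q_2\varepsilon] \\ E[\varepsilon'Q_2\bar{\varepsilon}] \\ E[\bar{\varepsilon}'Q_2\bar{\varepsilon}] \\ E[\varepsilon'Q_3\varepsilon] \\ E[\varepsilon'Q_3\bar{\varepsilon}] \\ E[\bar{\varepsilon}'Q_3\bar{\varepsilon}] \end{pmatrix}, \tag{63}$$

with $\kappa_1 = S$, $\kappa_2 = T - 1$, $\kappa_3 = S - N$, $\kappa_4 = T$, $\kappa_5 = N$ and, recalling that $\Delta = \mathrm{diag}(J_{M_i})$,

- $t_1 = \mathrm{tr}(W_S'W_S)$, $t_2 = \mathrm{tr}(W_S'W_SW_S)$, $t_3 = \mathrm{tr}(W_S'W_S + W_SW_S)$, $t_4 = \mathrm{tr}(W_S'W_S'W_SW_S)$;
- $t_1^{\bullet} = \mathrm{tr}(W_S^{\bullet\prime}W_S^{\bullet})$, $t_1^{*} = \mathrm{tr}(W_S^{*\prime}W_S^{*})$, $t_1^{\bullet\bullet} = \mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet\bullet})$, $t_1^{**} = \mathrm{tr}(W_S^{**\prime}W_S^{**})$;
- $t_5^{\bullet} = \mathrm{tr}(W_S^{\bullet\prime}W_S^{\bullet}\Delta)$, $t_5^{*} = \mathrm{tr}(W_S^{*\prime}W_S^{*}\Delta)$, $t_5^{\bullet\bullet} = \mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet\bullet}\Delta)$, $t_5^{**} = \mathrm{tr}(W_S^{**\prime}W_S^{**}\Delta)$;
- $t_6^{\bullet} = \mathrm{tr}(W_S^{\bullet})$, $t_6^{*} = \mathrm{tr}(W_S^{*})$, $t_7 = \mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet})$;
- $t_8^{\bullet} = \mathrm{tr}(W_S^{\bullet\prime}W_S^{\bullet} + W_S^{\bullet\bullet})$, $t_8^{*} = \mathrm{tr}(W_S^{*\prime}W_S^{*} + W_S^{**})$;
- $t_9^{\bullet} = \mathrm{tr}(W_S^{\bullet\bullet\prime}W_S^{\bullet}\Delta)$, $t_9^{*} = \mathrm{tr}(W_S^{**\prime}W_S^{*}\Delta)$;
- $t_{10} = \mathrm{tr}(W_S\Delta)$, $t_{11} = \mathrm{tr}(W_S^{**\prime}W_S^{*})$ and $t_{12} = t_5^{*} + \mathrm{tr}(W_SW_S\Delta)$.

In practice, to obtain the GM estimators of $\lambda$, $\sigma_\alpha^2$, $\theta_2$ and $\sigma_v^2$, we have to use the sample counterparts of the terms in Eq. (61), i.e. $\Gamma$ and $\gamma$. Nevertheless, to estimate $\lambda$ and $\sigma_v^2$, it is possible to use only the moments from (33) to (38). Then, the estimates of $\theta_2$ and $\sigma_\alpha^2$ follow from the moments (42) and (48). This estimator is called the unweighted GM estimator. In other words, the GM estimators of $\lambda$ and $\sigma_v^2$ are obtained from the reduced system

$$\Gamma^{\bullet}\Upsilon^{\bullet} - \gamma^{\bullet} = 0, \tag{64}$$

where

$$\Gamma^{\bullet} = \begin{pmatrix} \kappa_1\kappa_2 & \kappa_2 t_1 & 0 \\ 0 & \kappa_2 t_2 & \kappa_2 t_3 \\ \kappa_2 t_1 & \kappa_2 t_4 & 2\kappa_2 t_2 \end{pmatrix}, \tag{65}$$

and

$$\Upsilon^{\bullet} = \begin{pmatrix} \sigma_v^2 \\ \lambda^2\sigma_v^2 \\ -\lambda\sigma_v^2 \end{pmatrix}, \quad
\gamma^{\bullet} = \begin{pmatrix} E[\varepsilon'Q_1\varepsilon] \\ E[\varepsilon'Q_1\bar{\varepsilon}] \\ E[\bar{\varepsilon}'Q_1\bar{\varepsilon}] \end{pmatrix}. \tag{66}$$

Then, from Eqs. (42) and (48) we obtain, respectively:

$$\hat{\theta}_2 = \frac{1}{S - N}\left(\hat{G}^{-1}\hat{\varepsilon}\right)' Q_2 \left(\hat{G}^{-1}\hat{\varepsilon}\right), \tag{67}$$

and

$$\hat{\sigma}_\alpha^2 = \frac{1}{ST}\left(\hat{G}^{-1}\hat{\varepsilon}\right)' Q_3 \left(\hat{G}^{-1}\hat{\varepsilon}\right) - \frac{N}{ST}\hat{\theta}_2, \tag{68}$$

where $\hat{G}^{-1} = I_T \otimes \hat{G}_S^{-1}$ with $\hat{G}_S = I_S - \hat{\lambda}W_S$.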
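The reduced system (64) to (66) and the follow-up formulas (67) and (68) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, residuals are stored as a `(T, S)` array with one row per period, and a simple grid search over $\lambda$ (with $\sigma_v^2$ concentrated out by least squares) stands in for a general nonlinear least squares routine.

```python
import numpy as np

def reduced_gamma(W_S, T):
    """Gamma-bullet of Eq. (65), built from traces of W_S."""
    S = W_S.shape[0]
    WtW, WW = W_S.T @ W_S, W_S @ W_S
    t1 = np.trace(WtW)
    t2 = np.trace(WtW @ W_S)
    t3 = np.trace(WtW + WW)
    t4 = np.trace(WW.T @ WW)
    k1, k2 = S, T - 1
    return np.array([[k1 * k2, k2 * t1, 0.0],
                     [0.0, k2 * t2, k2 * t3],
                     [k2 * t1, k2 * t4, 2 * k2 * t2]])

def sample_moments(eps, W_S):
    """Sample counterpart of gamma-bullet in Eq. (66); eps has shape (T, S)."""
    eps_bar = eps @ W_S.T                      # (I_T kron W_S) eps, row by row
    e_w = eps - eps.mean(axis=0)               # Q1 eps: deviations from time means
    eb_w = eps_bar - eps_bar.mean(axis=0)      # Q1 eps_bar
    return np.array([np.sum(e_w * e_w), np.sum(e_w * eb_w), np.sum(eb_w * eb_w)])

def solve_reduced(Gamma, g, grid=None):
    """Minimize ||Gamma Upsilon(lam, s2) - g||^2 by grid search over lam (an assumption)."""
    if grid is None:
        grid = np.linspace(-0.99, 0.99, 199)
    best = None
    for lam in grid:
        c = Gamma @ np.array([1.0, lam ** 2, -lam])   # Gamma Upsilon = s2 * c(lam)
        s2 = max(c @ g / (c @ c), 0.0)                # least squares in s2, kept non-negative
        ssr = np.sum((s2 * c - g) ** 2)
        if best is None or ssr < best[0]:
            best = (ssr, lam, s2)
    return best[1], best[2]

def theta2_sigma_alpha(eps, W_S, lam, M):
    """Eqs. (67)-(68): theta_2 and sigma_alpha^2 from u_hat = G^{-1} eps."""
    T, S = eps.shape
    M = np.asarray(M)
    N = len(M)
    u = np.linalg.solve(np.eye(S) - lam * W_S, eps.T).T
    u_bar = u.mean(axis=0)                                     # time means per individual
    group = np.repeat(np.arange(N), M)
    g_mean = (np.bincount(group, weights=u_bar) / M)[group]    # group means of time means
    q2 = T * np.sum((u_bar - g_mean) ** 2)                     # u' Q2 u
    q3 = T * np.sum(g_mean ** 2)                               # u' Q3 u
    theta2 = q2 / (S - N)
    return theta2, q3 / (S * T) - N * theta2 / (S * T)
```

Working with time means and group means avoids forming the $(TS \times TS)$ projection matrices $Q_1$, $Q_2$ and $Q_3$ explicitly.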

The above approach relates to unweighted GM. Nevertheless, the literature on generalized method of moments estimators indicates that it is optimal to use the inverse of the variance–covariance matrix of the sample moments at the true parameter values as a weighting matrix in order to obtain asymptotic efficiency. In the following, our Monte Carlo simulations show that our results are not very different from those produced by ML, especially when our three-stage procedure is iterated. We therefore leave the weighted GM method for further research.

3.2.2 The GM spatial IV estimator

To obtain the GM-S-IV estimator of $\delta' = [\rho, \beta']$, one first calculates the unweighted GM estimates of $\lambda$, $\sigma_v^2$, $\theta_2$ and $\sigma_\alpha^2$, following a three-stage procedure:

• in the first stage, the model (7) is estimated using an IV approach based on the matrix of instruments $H$ which is given by $(X, WX, W^2X, \ldots)$. Thus, the IV estimator of $\delta$ is defined as:

$$\hat{\delta}_{IV} = \left(Z'P_HZ\right)^{-1} Z'P_Hy, \tag{69}$$

where $P_H = H(H'H)^{-1}H'$;

• in the second stage, the parameters $\lambda$, $\sigma_v^2$, $\theta_2$ and $\sigma_\alpha^2$ are estimated using the GM approach from Sect. 3.2 based on IV residuals, i.e. $\hat{\varepsilon} = y - Z\hat{\delta}_{IV}$. The GM estimates are obtained from the sample counterpart of the reduced system (64) which is:

$$\Gamma^{\bullet}\Upsilon^{\bullet} - \gamma^{\bullet} = \xi(\lambda, \sigma_v^2), \tag{70}$$

where $\xi(\lambda, \sigma_v^2)$ is a vector of residuals. The unweighted GM estimators of $\lambda$ and $\sigma_v^2$ are the nonlinear least squares estimators based on (70):

$$(\hat{\lambda}, \hat{\sigma}_v^2) = \arg\min\left\{\xi(\lambda, \sigma_v^2)'\,\xi(\lambda, \sigma_v^2)\right\}. \tag{71}$$

Then, the estimated parameters $\theta_2$ and $\sigma_\alpha^2$ are obtained using, respectively, (67) and (68);

• in the third stage, we need the estimated variance–covariance matrix $\hat{\Omega}_u$ obtained using the second-stage estimates of $\sigma_v^2$, $\sigma_\mu^2$, $\sigma_\alpha^2$. In order to obtain an equation in terms of $u$, from which spatial autocorrelation is absent, rather than in terms of $\varepsilon$, in which it is present, we can purge the equation of spatial dependence by pre-multiplication by $\hat{G}^{-1}$. This can be seen to be a type of Cochrane–Orcutt transformation appropriate to spatially dependent data. Hence, pre-multiplication of the model (7) by $\hat{G}^{-1}$ yields:

$$y^{*} = Z^{*}\delta + u, \tag{72}$$

where $Z^{*} = \hat{G}^{-1}Z$, $y^{*} = \hat{G}^{-1}y$. If we are guided by the classical panel data random effects literature (see Baltagi 2013), and transform the model in (72) by pre-multiplying it by $\hat{\Omega}_u^{-1/2}$, then applying the IV principle gives the GM-S-IV estimator $\hat{\delta}_{GM\text{-}S\text{-}IV}$ which corresponds to:

$$\hat{\delta}_{GM\text{-}S\text{-}IV} = \left(Z^{**\prime}P_{H^{**}}Z^{**}\right)^{-1} Z^{**\prime}P_{H^{**}}\,y^{**}, \tag{73}$$

where $Z^{**} = \hat{\Omega}_u^{-1/2}Z^{*}$, $y^{**} = \hat{\Omega}_u^{-1/2}y^{*}$, $H^{**} = \hat{\Omega}_u^{-1/2}H^{*}$, $H^{*} = \hat{G}^{-1}H$, $P_{H^{**}} = H^{**}\left(H^{**\prime}H^{**}\right)^{-1}H^{**\prime}$.
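The first and third stages above can be sketched as follows, taking $\hat{\lambda}$ and the variance components as given inputs. This is an illustrative sketch under assumptions: the function names are hypothetical, the data are stacked period by period ($T$ blocks of $S$ rows), the instrument set is truncated at $(X, WX, W^2X)$, and $P_H$ is applied through a least squares fit so that collinear instrument columns do not cause problems.

```python
import numpy as np

def lag(A, W_S, T):
    """Apply (I_T kron W_S) to a period-major stacked array with T*S rows."""
    S = W_S.shape[0]
    A3 = A.reshape(T, S, -1)
    return np.einsum('kl,tlc->tkc', W_S, A3).reshape(A.shape)

def iv(y, Z, H):
    """Eq. (69): (Z' P_H Z)^{-1} Z' P_H y, with P_H Z computed by least squares."""
    Z_hat = H @ np.linalg.lstsq(H, Z, rcond=None)[0]
    return np.linalg.solve(Z_hat.T @ Z, Z_hat.T @ y)

def instruments(X, W_S, T):
    """H = (X, WX, W^2 X); the truncation order is an assumption."""
    WX = lag(X, W_S, T)
    return np.hstack([X, WX, lag(WX, W_S, T)])

def omega_inv_sqrt(A, M, T, s2_alpha, s2_mu, s2_v):
    """Pre-multiply by Omega_u^{-1/2} using the spectral decomposition (12)."""
    M = np.asarray(M)
    S = int(M.sum())
    group = np.repeat(np.arange(len(M)), M)
    A3 = A.reshape(T, S, -1)
    tmean = A3.mean(axis=0)                          # time means, shape (S, c)
    gsum = np.zeros((len(M), A3.shape[2]))
    np.add.at(gsum, group, tmean)
    gmean = (gsum / M[:, None])[group]               # group means of the time means
    th1, th2 = s2_v, T * s2_mu + s2_v
    th3 = (M * T * s2_alpha + T * s2_mu + s2_v)[group][:, None]
    out = ((A3 - tmean) / np.sqrt(th1)               # Q1 part
           + (tmean - gmean) / np.sqrt(th2)          # Q2 part
           + gmean / np.sqrt(th3))                   # Q3 part
    return out.reshape(A.shape)

def gm_s_iv(y, X, W_S, M, T, lam, s2_alpha, s2_mu, s2_v):
    """Eq. (73) for given lam and variance components; returns (rho, beta)."""
    S = W_S.shape[0]
    G_inv = np.linalg.inv(np.eye(S) - lam * W_S)
    Z = np.column_stack([lag(y, W_S, T), X])
    H = instruments(X, W_S, T)

    def transform(A):
        # Cochrane-Orcutt-type purge by G^{-1}, then random effects weighting
        return omega_inv_sqrt(lag(A, G_inv, T), M, T, s2_alpha, s2_mu, s2_v)

    return iv(transform(y), transform(Z), transform(H))
```

In a full run, the output of `iv` would supply the first-stage residuals for the GM step, whose estimates are then passed to `gm_s_iv`; repeating this cycle with the new residuals gives the iterated version.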

This three-stage procedure can be iterated. After the first iteration, i.e. the application of the procedure described above, the GM-S-IV residuals are computed. Then, they are used to compute new sets of unweighted GM estimates. Last, the latter are used to obtain new GM-S-IV parameter estimates of the multidimensional spatial lag model and so on.

4 A Monte Carlo study

The idea here is to demonstrate the comparative performance of the various estimators described thus far, namely ML and the GM-S-IV approach. For this purpose, we generate data using a model with known parameters and see how accurately the different estimators recover the true parameter values. Our data generating process is the spatial lag regression model:

$$y_t = D_S^{-1}\left(\beta_0\iota_t + \beta_1 x_t + \varepsilon_t\right), \tag{74}$$

where $y_t$ is of dimension $(S \times 1)$ as is the exogenous variable $x_t$. Likewise, $\iota_t$ is a vector of ones of dimension $(S \times 1)$. $D_S = I_S - \rho W_S$ where $W_S$ is the spatial matrix of size $(S \times S)$. We retain the spatial structure proposed by Kelejian and Prucha (1999), which is referred to as "$J$ ahead and $J$ behind", with the nonzero elements equal to $1/2J$. Note that, as $J$ increases, the value of the nonzero elements $1/2J$ decreases and this in turn may reduce the amount of spatial correlation. Here, we consider $J = 2$, 6 and 10. The error term $\varepsilon_t$ has a SMA structure
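A "$J$ ahead and $J$ behind" weight matrix of this kind can be built as in the sketch below. The function name `j_ahead_j_behind` is hypothetical, and the wrap-around (circular) ordering of units is an assumption of this example.

```python
import numpy as np

def j_ahead_j_behind(S, J):
    """Row-normalized weights: each unit is linked to the J units ahead and the
    J units behind it, with wrap-around at the ends (an assumption)."""
    if 2 * J >= S:
        raise ValueError("need S > 2J so that neighbours do not overlap")
    W = np.zeros((S, S))
    idx = np.arange(S)
    for k in range(1, J + 1):
        W[idx, (idx + k) % S] = 1.0 / (2 * J)   # k steps ahead
        W[idx, (idx - k) % S] = 1.0 / (2 * J)   # k steps behind
    return W
```

For example, `j_ahead_j_behind(100, 2)` gives each unit four neighbours with weight 0.25, while `J = 10` spreads the same unit row sum over twenty neighbours with weight 0.05.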

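$$\varepsilon_t = u_t - \lambda W_S u_t, \tag{75}$$

and $u_t$ has a nested random components structure given by

$$u_t = \mathrm{diag}(\iota_{M_i})\,\alpha + \mu + v_t, \tag{76}$$

where $u_t$ is $(S \times 1)$, $\alpha$ is the vector of group effects of dimension $(N \times 1)$, $\mu' = (\mu_1', \ldots, \mu_N')$ is a vector of dimension $(1 \times S)$, $\mu_i' = (\mu_{i1}, \ldots, \mu_{iM_i})$ is a vector of dimension $(1 \times M_i)$, and $\iota_{M_i}$ is a vector of ones of dimension $(M_i \times 1)$. $v_t$ is of dimension $(S \times 1)$.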

Throughout the experiment, the parameters of (74) and (75) were set at $\beta_0 = 5$, $\beta_1 = 2$, $\rho = 0.3, 0.6$ and $\lambda = -0.2, -0.5, -0.9$, i.e. positive dependence. The explanatory variable $x_{ijt}$ is generated by a similar method to that of Nerlove (1971), Antweiler (2001) and Baltagi et al. (2001). More precisely, we have:

$$x_{ijt} = 0.3t + 0.8x_{ij,t-1} + \omega_{ijt}, \tag{77}$$

where $i = 1, \ldots, N$, $j = 1, \ldots, M_i$, and $\omega_{ijt}$ is a random variable uniformly distributed on the interval $[-0.5, 0.5]$ and $x_{ij0} = 60 + 30\omega_{ij0}$. Observations over the first 10 periods are discarded to minimize the effect of initial values. For the data generating process for the errors, we assume $\alpha_i \sim$ i.i.d. $N(0, \sigma_\alpha^2)$, $\mu_{ij} \sim$ i.i.d. $N(0, \sigma_\mu^2)$ and $v_{ijt} \sim$ i.i.d. $N(0, \sigma_v^2)$. We fix $\sigma_u^2 = \sigma_\alpha^2 + \sigma_\mu^2 + \sigma_v^2 = 2$ and define $\gamma_1 = \sigma_\alpha^2/\sigma_u^2$ and $\gamma_2 = \sigma_\mu^2/\sigma_u^2$. These two ratios vary over the set $(0.2, 0.4, 0.6)$ such that $(1 - \gamma_1 - \gamma_2)$ is always positive. For all experiments, we have 20 groups observed over 5 periods, hence $(N, T) = (20, 5)$, and we have $S = 100$ individuals, so the sample size (i.e. $TS$) is fixed at 500. We consider the 3 patterns proposed by Fingleton et al. (2016) denoted by $P_1$, $P_2$ and $P_3$, with individuals nested within the $N$ groups with differing frequencies $(M_1, \ldots, M_{20})$. More precisely, considering $N = 20$:

- $P_1$: $M_i = 5$, $i = 1, \ldots, 20$;
- $P_2$: $M_i = 3$, $i = 1, \ldots, 12$; $M_i = 4$, $i = 13, \ldots, 16$; and $M_i = 12$, $i = 17, \ldots, 20$;
- $P_3$: $M_i = 2$, $i = 1, \ldots, 8$; $M_i = 3$, $i = 9, \ldots, 12$; $M_i = 4$, $i = 13, \ldots, 18$; and $M_i = 24$, $i = 19, 20$.
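The design above can be written down compactly as in the following sketch. The names `PATTERNS`, `draw_x` and `variance_components` are hypothetical, and one detail is an assumption not settled by the text: the time index $t$ in (77) is taken to keep counting through the 10 discarded start-up periods.

```python
import numpy as np

# Group sizes (M_1, ..., M_20) for the three patterns; each sums to S = 100.
PATTERNS = {
    "P1": [5] * 20,
    "P2": [3] * 12 + [4] * 4 + [12] * 4,
    "P3": [2] * 8 + [3] * 4 + [4] * 6 + [24] * 2,
}

def draw_x(S, T, rng, burn_in=10):
    """Eq. (77): x_ijt = 0.3 t + 0.8 x_ij,t-1 + omega_ijt, omega ~ U[-0.5, 0.5].

    Returns an array of shape (T, S) after discarding the first burn_in periods.
    """
    x = 60.0 + 30.0 * rng.uniform(-0.5, 0.5, S)       # x_ij0
    path = []
    for t in range(1, burn_in + T + 1):               # t runs through the burn-in (assumption)
        x = 0.3 * t + 0.8 * x + rng.uniform(-0.5, 0.5, S)
        path.append(x)
    return np.array(path[burn_in:])

def variance_components(gamma1, gamma2, s2_u=2.0):
    """Map the ratios (gamma_1, gamma_2) to (sigma_alpha^2, sigma_mu^2, sigma_v^2)."""
    return gamma1 * s2_u, gamma2 * s2_u, (1.0 - gamma1 - gamma2) * s2_u
```

With $\gamma_1 = 0.2$ and $\gamma_2 = 0.6$, for instance, the three variances are 0.4, 1.2 and 0.4, which sum to $\sigma_u^2 = 2$.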

For each experiment, we focus on the estimates of the parameters $\rho$, $\beta_0$, $\beta_1$, $\lambda$, $\sigma_\alpha^2$, $\sigma_\mu^2$ and $\sigma_v^2$. Following KKP (2007), we adopt a measure of dispersion which is closely related to the standard measure of RMSE defined as follows:

$$\mathrm{RMSE} = \left[\mathrm{bias}^2 + \left(\frac{IQ}{1.35}\right)^2\right]^{1/2}, \tag{78}$$

where bias corresponds to the difference between the median and the true value of the parameter, while $IQ$ is the interquantile range defined as $q_1 - q_2$ where $q_1$ is the 0.75 quantile and $q_2$ is the 0.25 quantile.
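The measure in (78) is straightforward to compute from a set of Monte Carlo estimates. In the sketch below, the function name `kkp_rmse` is hypothetical and quantiles are computed with NumPy's default linear interpolation, which is an assumption about a detail the text leaves open.

```python
import numpy as np

def kkp_rmse(estimates, true_value):
    """Eq. (78): sqrt(bias^2 + (IQ / 1.35)^2), with median-based bias."""
    estimates = np.asarray(estimates, dtype=float)
    bias = np.median(estimates) - true_value
    q1, q2 = np.quantile(estimates, [0.75, 0.25])   # upper and lower quartiles
    return np.sqrt(bias ** 2 + ((q1 - q2) / 1.35) ** 2)
```

Because it relies on the median and the interquantile range, this measure remains well defined even when a few replications produce extreme estimates.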

In the tables below, three sets of RMSEs are reported. They are the outcomes of the ML, unweighted GM-S-IV and iterated unweighted GM-S-IV estimators. More precisely, $(\hat{\lambda}, \hat{\sigma}_\alpha^2, \hat{\sigma}_\mu^2, \hat{\sigma}_v^2)$ and $(\hat{\lambda}^{(1)}, \hat{\sigma}_\alpha^{2(1)}, \hat{\sigma}_\mu^{2(1)}, \hat{\sigma}_v^{2(1)})$ denote the unweighted and iterated unweighted GM estimates, respectively, whereas $(\hat{\lambda}^{ML}, \hat{\sigma}_\alpha^{2,ML}, \hat{\sigma}_\mu^{2,ML}, \hat{\sigma}_v^{2,ML})$ denote the ML estimates. The GM estimates are based on IV residuals. Subsequently, the GM-S-IV estimates of $\rho$, $\beta_0$ and $\beta_1$ are computed, i.e. $(\hat{\rho}, \hat{\beta}_0, \hat{\beta}_1)$ and $(\hat{\rho}^{(1)}, \hat{\beta}_0^{(1)}, \hat{\beta}_1^{(1)})$, respectively. The ML estimates are denoted $(\hat{\rho}^{ML}, \hat{\beta}_0^{ML}, \hat{\beta}_1^{ML})$. The results of 1,000 replications for $P_1$ (balanced subgroups pattern, $M_i = 5$, $\forall i = 1, \ldots, N$) and $\rho = 0.3$ are given in Table 1, whereas Tables 2 and 3 give results for the unbalanced patterns $P_2$ and $P_3$.

From Table 1, it is apparent that while ML is the most efficient for all parameters, the iterated GM-S-IV is almost as good for almost all parameters. For example, on average, the RMSE of both GM-S-IV estimators of the spatial autoregressive parameter $\rho$ is only approximately 2% larger than that of the ML estimator $\hat{\rho}^{ML}$. The differences for $\beta_0$, $\beta_1$, $\sigma_\alpha^2$, $\sigma_\mu^2$ are also very small between ML and iterated GM-S-IV and never larger than 4%. This is important because the parameters $\beta_0$, $\beta_1$ are of particular interest in applied economics. It also means that the computational benefits associated with the use of the GM approach do not seem to have much cost in terms of efficiency. For $\beta_0$, $\beta_1$, $\sigma_\alpha^2$, $\sigma_\mu^2$, the differences between ML and the simple (i.e. not iterated) GM-S-IV are a bit larger (up to 5% for $\beta_1$, $\sigma_\alpha^2$, $\sigma_\mu^2$ and 28% for $\beta_0$). While iterating the GM-S-IV estimator is likely to achieve marginally more efficient estimates, this is definitely not the case for $\lambda$, especially when $\lambda$ is near the upper end of its range. Indeed, it appears that the RMSE of the iterated GM-S-IV estimator $\hat{\lambda}^{(1)}$ is 32% larger on average than the RMSE of $\hat{\lambda}^{ML}$. Looking in more detail, the difference is especially high for $\lambda = -0.9$ (up to 100%), while it remains acceptable for smaller values (in absolute value) of $\lambda$ (for instance: 17% for $\lambda = -0.2$). Hence, caution is in order as the absolute value of $\lambda$ tends to unity.

Tables 2 and 3 concern two unbalanced patterns $P_2$ and $P_3$. More precisely, the distribution of individuals over the twenty subgroups changes but the sample size remains fixed at $TS = 5 \times 100 = 500$. In Table 2, based on the least unbalanced of the two, the results are qualitatively similar to those of Table 1. In terms of averages, the RMSE of both the simple and iterated GM-S-IV estimator for the spatial autoregressive parameter $\rho$ is approximately 0.73% larger than that produced by the ML estimator. The differences for the regression parameters $\beta_0$, $\beta_1$ are very small (1% for $\beta_0$ and even $-0.58$% for $\beta_1$). Conversely, the differences between ML and iterated GM-S-IV for the variances $\sigma_\alpha^2$, $\sigma_\mu^2$ are a bit larger than in the balanced case and up to 35% for $\sigma_\alpha^2$. These values are even higher for the simple GM-S-IV estimator, highlighting the less efficient estimates of GM compared with ML estimation, and the need to consider this in relation to the advantages provided by GM. With respect to $\lambda$, we find again that the RMSE for the iterated GM-S-IV estimator is larger than under ML, with an average difference of 35% produced by an assumption that the true value of $\lambda$ is $-0.9$. With


Ta b le 1 RMSEs o f the estimators of ρ , β0 , β1 , λ , σ 2,σ_α 2and_μ σ 2for_v p attern P1 considering ( N , T ) = ( 20 , 5 ) , ρ = 0 . 3, 1,000 replications P arameter v alues RMSE λγ 1 γ2 ˆ ρ ML ˆ_β ML 0 ˆ_β ML 1 ˆλ ML ˆ σ 2 , ML α ˆ σ 2 , ML μ ˆ σ 2 , ML v ˆ ρ ˆ_β0 ˆ_β1 J = 2 − 0.2 0 .2 0.6 0 .0122 0.5664 0.0218 0.0699 0.2005 0.1951 0.0263 0.0122 0.5565 0.0217 − 0.2 0 .4 0.4 0 .0117 0.5851 0.0209 0.0688 0.2967 0.1373 0.0267 0.0118 0.5852 0.0207 − 0.2 0 .6 0.2 0 .0113 0.5967 0.0195 0.0671 0.3988 0.0736 0.0271 0.0114 0.6014 0.0196 − 0.5 0 .2 0.6 0 .0129 0.6783 0.0201 0.0681 0.1941 0.1924 0.0266 0.0130 0.6746 0.0204 − 0.5 0 .4 0.4 0 .0129 0.7120 0.0199 0.0692 0.2930 0.1348 0.0264 0.0127 0.7038 0.0195 − 0.5 0 .6 0.2 0 .0124 0.7229 0.0180 0.0679 0.3808 0.0741 0.0265 0.0127 0.7290 0.0183 − 0.9 0 .2 0.6 0 .0145 0.8321 0.0186 0.0548 0.1990 0.1931 0.0281 0.0151 0.8426 0.0185 − 0.9 0 .4 0.4 0 .0146 0.8802 0.0176 0.0563 0.2931 0.1338 0.0277 0.0151 0.8848 0.0178 − 0.9 0 .6 0.2 0 .0140 0.8891 0.0162 0.0553 0.3777 0.0733 0.0279 0.0147 0.8912 0.0166 J = 6 − 0.2 0 .2 0.6 0 .0139 0.6160 0.0235 0.1418 0.2000 0.1990 0.0286 0.0136 0.6090 0.0229 − 0.2 0 .4 0.4 0 .0132 0.6019 0.0220 0.1368 0.2960 0.1374 0.0267 0.0137 0.6077 0.0217 − 0.2 0 .6 0.2 0 .0128 0.6121 0.0205 0.1289 0.3821 0.0735 0.0265 0.0129 0.6102 0.0204 − 0.5 0 .2 0.6 0 .0160 0.7502 0.0222 0.1527 0.1956 0.2007 0.0270 0.0162 0.7560 0.0221 − 0.5 0 .4 0.4 0 .0156 0.7670 0.0215 0.1504 0.2953 0.1364 0.0273 0.0160 0.7519 0.0212

(17)

Ta b le 1 continued P arameter v alues RMSE λγ 1 γ2 ˆ ρ ML ˆ_β ML 0 ˆ_β ML 1 ˆλ ML ˆ σ 2 , ML α ˆ σ 2 , ML μ ˆ σ 2 , ML v ˆ ρ ˆ_β0 ˆ_β1 − 0.5 0 .6 0.2 0 .0151 0.7591 0.0197 0.1470 0.3877 0.0744 0.0274 0.0154 0.7604 0.0197 − 0.9 0 .2 0.6 0 .0184 0.9394 0.0219 0.1437 0.1959 0.2009 0.0272 0.0190 0.9448 0.0217 − 0.9 0 .4 0.4 0 .0181 0.9381 0.0208 0.1438 0.2941 0.1344 0.0271 0.0188 0.9394 0.0213 − 0.9 0 .6 0.2 0 .0177 0.9525 0.0188 0.1364 0.3800 0.0743 0.0274 0.0182 0.9556 0.0186 J = 10 − 0.2 0 .2 0.6 0 .0144 0.6166 0.0241 0.1813 0.1998 0.1964 0.0271 0.0144 0.6318 0.0240 − 0.2 0 .4 0.4 0 .0141 0.6221 0.0228 0.1802 0.3027 0.1360 0.0267 0.0142 0.6313 0.0227 − 0.2 0 .6 0.2 0 .0136 0.6413 0.0214 0.1768 0.3852 0.0751 0.0273 0.0137 0.6210 0.0211 − 0.5 0 .2 0.6 0 .0168 0.7603 0.0236 0.2033 0.1968 0.1955 0.0273 0.0171 0.7779 0.0239 − 0.5 0 .4 0.4 0 .0167 0.7544 0.0226 0.2037 0.3060 0.1386 0.0276 0.0169 0.7721 0.0223 − 0.5 0 .6 0.2 0 .0162 0.7841 0.0208 0.1953 0.3782 0.0755 0.0277 0.0163 0.7728 0.0209 − 0.9 0 .2 0.6 0 .0205 0.9681 0.0229 0.1917 0.1912 0.1981 0.0275 0.0206 0.9733 0.0230 − 0.9 0 .4 0.4 0 .0200 0.9668 0.0218 0.1939 0.3019 0.1368 0.0274 0.0200 0.9666 0.0222 − 0.9 0 .6 0.2 0 .0192 0.9448 0.0207 0.1839 0.3869 0.0749 0.0272 0.0193 0.9628 0.0202 A v erages 0.0151 0.7577 0.0209 0.1322 0.2929 0.1358 0.0272 0.0154 0.7598 0.0209


Table 1 continued

Parameter values ($J$, $\lambda$, $\gamma_1$, $\gamma_2$) followed by RMSEs

| $J$ | $\lambda$ | $\gamma_1$ | $\gamma_2$ | $\hat{\lambda}$ | $\hat{\sigma}_\alpha^2$ | $\hat{\sigma}_\mu^2$ | $\hat{\sigma}_v^2$ | $\hat{\rho}^{(1)}$ | $\hat{\beta}_0^{(1)}$ | $\hat{\beta}_1^{(1)}$ | $\hat{\lambda}^{(1)}$ | $\hat{\sigma}_\alpha^{2(1)}$ | $\hat{\sigma}_\mu^{2(1)}$ | $\hat{\sigma}_v^{2(1)}$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | −0.2 | 0.2 | 0.6 | 0.1086 | 0.2126 | 0.2037 | 0.0314 | 0.0123 | 0.5534 | 0.0219 | 0.0817 | 0.2042 | 0.2007 | 0.0268 |
| 2 | −0.2 | 0.4 | 0.4 | 0.1179 | 0.3192 | 0.1463 | 0.0325 | 0.0119 | 0.5903 | 0.0208 | 0.0811 | 0.3094 | 0.1371 | 0.0266 |
| 2 | −0.2 | 0.6 | 0.2 | 0.1294 | 0.4359 | 0.0830 | 0.0337 | 0.0114 | 0.5946 | 0.0194 | 0.0813 | 0.4032 | 0.0718 | 0.0268 |
| 2 | −0.5 | 0.2 | 0.6 | 0.1270 | 0.2128 | 0.2126 | 0.0323 | 0.0132 | 0.6898 | 0.0203 | 0.0970 | 0.2057 | 0.2064 | 0.0273 |
| 2 | −0.5 | 0.4 | 0.4 | 0.1417 | 0.3153 | 0.1556 | 0.0337 | 0.0130 | 0.7169 | 0.0196 | 0.0961 | 0.3075 | 0.1375 | 0.0274 |
| 2 | −0.5 | 0.6 | 0.2 | 0.1595 | 0.4271 | 0.0953 | 0.0348 | 0.0129 | 0.7327 | 0.0181 | 0.0970 | 0.4048 | 0.0736 | 0.0275 |
| 2 | −0.9 | 0.2 | 0.6 | 0.1167 | 0.2009 | 0.2087 | 0.0448 | 0.0153 | 0.8454 | 0.0185 | 0.1221 | 0.2044 | 0.2092 | 0.0332 |
| 2 | −0.9 | 0.4 | 0.4 | 0.1193 | 0.3106 | 0.1440 | 0.0511 | 0.0152 | 0.8939 | 0.0176 | 0.1213 | 0.2999 | 0.1436 | 0.0332 |
| 2 | −0.9 | 0.6 | 0.2 | 0.1184 | 0.4158 | 0.0774 | 0.0558 | 0.0148 | 0.9092 | 0.0165 | 0.1208 | 0.3981 | 0.0756 | 0.0332 |
| 6 | −0.2 | 0.2 | 0.6 | 0.1974 | 0.2031 | 0.2040 | 0.0294 | 0.0136 | 0.6111 | 0.0233 | 0.1483 | 0.2008 | 0.2026 | 0.0266 |
| 6 | −0.2 | 0.4 | 0.4 | 0.2386 | 0.3038 | 0.1474 | 0.0294 | 0.0137 | 0.6145 | 0.0219 | 0.1479 | 0.3093 | 0.1360 | 0.0266 |
| 6 | −0.2 | 0.6 | 0.2 | 0.2793 | 0.4120 | 0.0818 | 0.0306 | 0.0128 | 0.6034 | 0.0205 | 0.1486 | 0.4026 | 0.0739 | 0.0266 |
| 6 | −0.5 | 0.2 | 0.6 | 0.2407 | 0.2027 | 0.2040 | 0.0300 | 0.0160 | 0.7541 | 0.0223 | 0.1783 | 0.1984 | 0.2051 | 0.0268 |
| 6 | −0.5 | 0.4 | 0.4 | 0.2866 | 0.3002 | 0.1486 | 0.0314 | 0.0159 | 0.7598 | 0.0211 | 0.1787 | 0.3115 | 0.1364 | 0.0268 |
| 6 | −0.5 | 0.6 | 0.2 | 0.3316 | 0.4098 | 0.0812 | 0.0342 | 0.0151 | 0.7547 | 0.0197 | 0.1798 | 0.4074 | 0.0742 | 0.0267 |
| 6 | −0.9 | 0.2 | 0.6 | 0.1656 | 0.1980 | 0.2020 | 0.0397 | 0.0190 | 0.9554 | 0.0217 | 0.2144 | 0.2007 | 0.2009 | 0.0303 |
| 6 | −0.9 | 0.4 | 0.4 | 0.1505 | 0.2997 | 0.1361 | 0.0461 | 0.0187 | 0.9445 | 0.0211 | 0.2142 | 0.3119 | 0.1356 | 0.0303 |
| 6 | −0.9 | 0.6 | 0.2 | 0.1361 | 0.4032 | 0.0752 | 0.0543 | 0.0180 | 0.9688 | 0.0188 | 0.2152 | 0.4001 | 0.0726 | 0.0303 |
| 10 | −0.2 | 0.2 | 0.6 | 0.2311 | 0.1986 | 0.1982 | 0.0286 | 0.0145 | 0.6201 | 0.0244 | 0.1977 | 0.2012 | 0.2050 | 0.0272 |
| 10 | −0.2 | 0.4 | 0.4 | 0.2677 | 0.3118 | 0.1391 | 0.0292 | 0.0144 | 0.6212 | 0.0228 | 0.1979 | 0.3095 | 0.1389 | 0.0272 |
| 10 | −0.2 | 0.6 | 0.2 | 0.2932 | 0.4079 | 0.0765 | 0.0292 | 0.0139 | 0.6339 | 0.0213 | 0.1984 | 0.4102 | 0.0754 | 0.0270 |
| 10 | −0.5 | 0.2 | 0.6 | 0.2779 | 0.1995 | 0.2009 | 0.0290 | 0.0171 | 0.7688 | 0.0239 | 0.2483 | 0.1995 | 0.2032 | 0.0269 |
| 10 | −0.5 | 0.4 | 0.4 | 0.3178 | 0.3075 | 0.1396 | 0.0292 | 0.0171 | 0.7624 | 0.0226 | 0.2474 | 0.3093 | 0.1379 | 0.0268 |
| 10 | −0.5 | 0.6 | 0.2 | 0.3429 | 0.4045 | 0.0787 | 0.0306 | 0.0164 | 0.7823 | 0.0210 | 0.2470 | 0.4054 | 0.0748 | 0.0269 |
| 10 | −0.9 | 0.2 | 0.6 | 0.2010 | 0.1991 | 0.1998 | 0.0340 | 0.0206 | 0.9693 | 0.0230 | 0.2893 | 0.2005 | 0.2007 | 0.0292 |
| 10 | −0.9 | 0.4 | 0.4 | 0.1922 | 0.3092 | 0.1327 | 0.0373 | 0.0200 | 0.9655 | 0.0220 | 0.2889 | 0.3090 | 0.1397 | 0.0291 |
| 10 | −0.9 | 0.6 | 0.2 | 0.1669 | 0.3988 | 0.0727 | 0.0405 | 0.0193 | 0.9669 | 0.0205 | 0.2873 | 0.4000 | 0.0745 | 0.0289 |
| Averages | | | | 0.2021 | 0.3081 | 0.1424 | 0.0357 | 0.0154 | 0.7623 | 0.0209 | 0.1750 | 0.3046 | 0.1386 | 0.0282 |


smaller $\lambda$, the difference is less stark. For example, when $\lambda = -0.2$, the difference is approximately 16%.

In Table 3, the RMSEs are affected differently because of the way we have treated unbalancedness in P3. Focusing especially on the differences between ML and iterated GM-S-IV, we note that the regression parameters are estimated efficiently in both cases, as is $\rho$. As for the variances, the differences between ML and iterated GM-S-IV are higher for P3 than was the case for P2 when one considers $\sigma_\alpha^2$ (28% for P3 compared to 12% for P2). Conversely, the differences are smaller for $\sigma_\mu^2$ (0.74% for P3 compared to 2.86% for P2) and $\sigma_v^2$ (4.38% for P3 compared to 8.49% for P2). The estimates of the spatial error parameter are affected in a similar way under P3 as was the case for P2: the RMSE of $\hat{\lambda}^{(1)}$ is 33% higher than the RMSE of $\hat{\lambda}^{ML}$, with especially high differences for $\lambda = 0.9$.
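The percentages quoted in this comparison are relative RMSE gaps with ML as the benchmark. As a minimal sketch of that arithmetic, the snippet below applies it to the "Averages" row of Table 1 (pattern P1), because the Table 2 and Table 3 values are not reproduced here. The dictionary and function names are hypothetical, and computing the gap from column averages is an assumption rather than a documented step of the paper.

```python
# Average RMSEs taken from the "Averages" row of Table 1 (pattern P1).
rmse_ml = {"lambda": 0.1322, "sigma2_alpha": 0.2929,
           "sigma2_mu": 0.1358, "sigma2_v": 0.0272}
rmse_gm_iter = {"lambda": 0.1750, "sigma2_alpha": 0.3046,
                "sigma2_mu": 0.1386, "sigma2_v": 0.0282}


def pct_difference(candidate: float, benchmark: float) -> float:
    """Percentage by which `candidate` exceeds `benchmark`."""
    return 100.0 * (candidate - benchmark) / benchmark


# Relative gap of iterated GM-S-IV over ML, parameter by parameter.
gaps = {name: pct_difference(rmse_gm_iter[name], rmse_ml[name])
        for name in rmse_ml}

for name, gap in gaps.items():
    print(f"{name}: {gap:+.1f}%")
```

A positive value means that the iterated GM-S-IV estimator has the larger RMSE for that parameter.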

We have also performed the simulations with $\rho = 0.6$ considering the same patterns P1, P2 and P3. The results are provided in an online appendix and the conclusions remain identical to those described above.

5 Empirical application

In this section, we consider the relationship between log employment ($\ln E$), log output ($\ln Q$) and an indicator of (log) capital investment ($\ln K$) across $S = \sum_{i=1}^{N} M_i = 255$ NUTS2 regions nested within $N = 25$ countries of the EU. $\ln Q$ is measured by gross value added, or GVA, and $\ln K$ is gross fixed capital formation, or GFCF. These annual regional data series are based on Cambridge Econometrics' European Regional Economic Data Base. As an illustration, Fig. 1 shows the distribution of log employment in the year 2010 across the 255 regions. Similar maps but with varying regional employment levels covering the period 1999–2010 constitute the dependent variable. Our model endeavours to explain the spatio-temporal variation in $\ln E$ as a function of $\ln Q$ and $\ln K$ organized on the same basis as Fig. 1, and also as an outcome of unobservable region-specific random effects nested within country-specific random effects. Accordingly, the model specification is

$$\ln E_t = \rho W_S \ln E_t + \beta_0 \iota_t + \beta_1 \ln Q_t + \beta_2 \ln K_t + \varepsilon_t, \tag{79}$$

in which $\ln E_t$ is an $(S \times 1)$ vector of levels of (log) employment at time $t$, with exogenous variables $\ln Q_t$ and $\ln K_t$, and $\iota_t$ is a vector of ones of dimension $(S \times 1)$. The compound errors $\varepsilon_t$ are an $(S \times 1)$ vector of spatially dependent unobservables comprising time-invariant national effects, one for each of $N$ countries and denoted by $\alpha_i$, $i = 1, \ldots, N$, together with time-invariant regional effects, with the effect for region $j$, where $j$ is nested within country $i$, denoted by $\mu_{ij} = \mu_k$, $k = 1, \ldots, S$. In addition, there are remainders of dimension $S$ which vary across regions and time, and the remainder effect for region $j$ within country $i$ at time $t$ is denoted by $\nu_{ijt}$. Thus, repeating for convenience Eqs. (5) and (6) in vector and matrix notation, we have

$$\varepsilon_t = u_t - \lambda M_S u_t, \tag{80}$$

and

$$u_t = \operatorname{diag}(\iota_{M_i})\,\alpha + \mu + \nu_t. \tag{81}$$

Fig. 1 Distribution of log employment in the year 2010
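To make the error structure in Eqs. (80) and (81) concrete, the following is a minimal simulation sketch. The design (three countries, a chain-shaped contiguity pattern, the variance values and $\lambda = -0.5$) is entirely hypothetical and chosen only for illustration; normal draws are likewise an assumption. The last lines also check which regions a shock to region 0 reaches under the SMA filter $(I_S - \lambda M_S)$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical small design: N = 3 countries with M_i regions each, T periods.
M_sizes = [2, 3, 4]
S, T = sum(M_sizes), 5
lam = -0.5                                     # SMA parameter (illustrative)
sig2_alpha, sig2_mu, sig2_nu = 0.4, 0.4, 0.2   # illustrative variances

# diag(iota_{M_i}): (S x N) block-diagonal matrix of ones mapping
# each country effect to the regions nested within that country.
D_iota = np.zeros((S, len(M_sizes)))
start = 0
for i, m in enumerate(M_sizes):
    D_iota[start:start + m, i] = 1.0
    start += m

# Row-standardised M_S from a toy chain contiguity
# (region k borders regions k-1 and k+1).
C = np.zeros((S, S))
idx = np.arange(S - 1)
C[idx, idx + 1] = C[idx + 1, idx] = 1.0
M_S = C / C.sum(axis=1, keepdims=True)

alpha = rng.normal(0.0, np.sqrt(sig2_alpha), size=len(M_sizes))  # country effects
mu = rng.normal(0.0, np.sqrt(sig2_mu), size=S)                   # region effects
nu = rng.normal(0.0, np.sqrt(sig2_nu), size=(T, S))              # remainders

u = D_iota @ alpha + mu + nu      # Eq. (81); row t holds u_t
A = np.eye(S) - lam * M_S         # SMA filter
eps = u @ A.T                     # Eq. (80): eps_t = u_t - lam * M_S u_t

# Regions with a non-zero response to a shock in region 0.
affected = np.flatnonzero(A[:, 0])
print(eps.shape, affected)
```

Because the filter is $(I_S - \lambda M_S)$ itself rather than its inverse, the non-zero responses are confined to the shocked region and its immediate neighbours.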

So, the estimation procedure takes into account two different spatial interaction processes: one for the endogenous spatial lag and the other for the errors. For the spatial lag at time $t$, $W_S \ln E_t$, the matrix $W_S$ is based on interregional trade flows between the 255 EU regions in the year 2000. The method of estimating these trade flows has been discussed elsewhere, for example by Polasek et al. (2010), Vidoli and Mazziotta (2010) and Fingleton et al. (2015), so here we simply note that the method employed bases interregional trade on data for international trade using a spatial version of the method for the construction of quarterly time series from annual series introduced by Chow and Lin (1971). The resulting matrix of bilateral interregional trade flows $W_S^*$ is scaled following the approach of Ord (1975), so that

$$W_S = D^{-0.5} W_S^* D^{-0.5}, \tag{82}$$

in which $D$ is a diagonal matrix with each cell on the leading diagonal containing the corresponding row total from $W_S^*$. This normalization means that the most positive real eigenvalue of $W_S$ is equal to max(eig) = 1.0, and the continuous range for which $(I_S - \rho W_S)$ is non-singular is $1/\min(\mathrm{eig}) < \rho < 1$. Thus, we require estimated $\rho$ to


For GM-S-IV, the assumption is that the compound errors are interrelated according to an SMA process (80). In this case, the spatial matrix $M_S$ is based on a contiguity matrix of dimension $(S \times S)$ with 1 in cell $(m, n)$ indicating that regions $m$ and $n$ share a common border and 0 indicating otherwise, although for nine isolated regions it has been necessary to create artificial, contiguous neighbours. The resulting contiguity matrix has been subsequently standardized to give $M_S$, in which the rows sum to 1. This means that the stationary region for $\lambda$ is also given by $1/\min(\mathrm{eig}) < \lambda < 1/\max(\mathrm{eig}) = 1$. The key feature of an SMA process is that shocks to the unobservables have local rather than global effects. Note that all components of the compound errors are assumed to be subject to this same spatial error dependence process.
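The two normalizations used for $W_S$ and $M_S$ both give a largest eigenvalue of 1, which fixes the upper bound of the admissible parameter range. A minimal sketch follows, assuming a small random symmetric matrix as a stand-in for the trade-flow matrix $W_S^*$ (symmetry is imposed only so that the eigenvalues are real) and a toy ring of regions for the contiguity pattern. Neither is the data used in the paper, and the helper name `admissible_range` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
S = 6

# Hypothetical symmetric, positive "flow" matrix standing in for W_S^*
# (zero diagonal). Symmetry is an assumption made for this illustration.
F = rng.uniform(0.1, 1.0, size=(S, S))
W_star = (F + F.T) / 2.0
np.fill_diagonal(W_star, 0.0)

# Eq. (82): W_S = D^{-1/2} W_S^* D^{-1/2}, D = diag(row totals of W_S^*).
d = W_star.sum(axis=1)
W_S = W_star / np.sqrt(np.outer(d, d))

# Row-standardised contiguity matrix M_S for a toy ring of regions.
C = np.zeros((S, S))
k = np.arange(S)
C[k, (k + 1) % S] = C[(k + 1) % S, k] = 1.0
M_S = C / C.sum(axis=1, keepdims=True)


def admissible_range(W):
    """Return (1/min(eig), 1/max(eig)) from the real parts of the eigenvalues."""
    eig = np.linalg.eigvals(W).real
    return 1.0 / eig.min(), 1.0 / eig.max()


for name, W in [("W_S", W_S), ("M_S", M_S)]:
    lower, upper = admissible_range(W)
    print(f"{name}: {lower:.4f} < parameter < {upper:.4f}")
```

The scaled matrix in Eq. (82) is similar to the row-normalized matrix $D^{-1} W_S^*$, so the two share eigenvalues; this is why the upper bound equals 1 in both cases.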

Table 4 gives the resulting non-iterated estimates. The ratios of $\beta_1$ and $\beta_2$ to their respective standard errors indicate that $\ln Q$ and $\ln K$ are significantly positively related to $\ln E$, and there is also a significant positive effect due to the endogenous spatial lag, since $\rho > 0$ with $t$ ratio equal to 34.2374. The significant positive effect due to the endogenous spatial lag means that we should interpret the effects of these variables via the true derivatives, following LeSage and Pace (2009) and Elhorst (2014). These show that, allowing for both the direct and indirect effects of spatial interaction across regions, the total effect of a 1% change in $Q$ is a 0.3795% change in employment. The total effect of a 1% change in $K$ is a 0.0459% change in employment.
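The "true derivatives" referred to here are the elements of $(I_S - \rho W_S)^{-1}\beta$, usually summarized as average direct, indirect and total effects. The sketch below computes these summaries for a toy row-standardized ring of five regions. The values of `rho` and `beta` are illustrative and are not the Table 4 estimates, and the function name is hypothetical.

```python
import numpy as np


def spatial_lag_effects(W: np.ndarray, rho: float, beta: float):
    """Average direct, indirect and total effects of one regressor in
    y = rho * W y + x * beta + e, based on (I - rho W)^{-1} * beta."""
    S = W.shape[0]
    effects = np.linalg.inv(np.eye(S) - rho * W) * beta  # d y_m / d x_n
    direct = np.trace(effects) / S   # mean own-region derivative
    total = effects.sum() / S        # mean row sum
    return direct, total - direct, total


# Toy row-standardised ring of 5 regions (each region has two neighbours).
S = 5
k = np.arange(S)
W = np.zeros((S, S))
W[k, (k + 1) % S] = W[k, (k - 1) % S] = 0.5

# rho and beta are illustrative values, not estimates from the paper.
direct, indirect, total = spatial_lag_effects(W, rho=0.3, beta=0.25)
print(f"direct={direct:.4f} indirect={indirect:.4f} total={total:.4f}")
```

With a row-standardized matrix the average total effect reduces to $\beta/(1-\rho)$; with the scaling of Eq. (82) the row sums of $W_S$ generally differ across regions, so the full matrix inverse is needed.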

We find that the null hypothesis that $\lambda = 0$ is rejected in favour of positive residual spatial dependence (which is indicated by a negative estimate of $\lambda$). The distribution of $\lambda_{\mathrm{null}}$, which is $\lambda$ under a null hypothesis of no spatial dependence among the errors, is based on the residuals from the nested error model assuming no spatial error dependence, but which includes a spatial lag (as described in Baltagi et al. (2014)). We refer to this by the acronym NRE-IV. The residuals on which the null distribution is based have the same moments as the NRE-IV residuals, and are assumed to be normally distributed, but they are randomly assigned to regions in order to eliminate spatial dependence. Given randomly assigned residuals, the same GM estimation method used to obtain $\hat{\lambda}$ is applied to obtain $\lambda_{\mathrm{null}}$, and this estimation is repeated 100 times to obtain 100 estimates of $\lambda_{\mathrm{null}}$. We find that the estimate $\hat{\lambda}$ is not a typical member of this $\lambda_{\mathrm{null}}$ distribution, since

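$$t = \frac{\hat{\lambda} - \bar{\lambda}_{\mathrm{null}}}{\sqrt{\operatorname{var}(\lambda_{\mathrm{null}})}} = \frac{-0.8641 - (-0.0021)}{0.0252} = -34.24, \tag{83}$$

in which $\bar{\lambda}_{\mathrm{null}}$ is the mean of the empirical null distribution, and $\operatorname{var}(\lambda_{\mathrm{null}})$ is the variance.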
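The logic of Eq. (83) can be sketched in code as follows. The GM estimator of $\lambda$ is not reproduced here; `standin_lambda` is a deliberately crude, hypothetical stand-in (minus the pooled OLS slope of the residuals on their spatial lag) used only so that the example runs. The toy residuals, the ring contiguity matrix and the value $\lambda = -0.8$ are likewise assumptions, not the NRE-IV residuals of the application.

```python
import numpy as np


def standin_lambda(resid: np.ndarray, M: np.ndarray) -> float:
    """Crude stand-in for the GM estimator of lambda (NOT the authors'
    estimator): minus the pooled OLS slope of residuals on their spatial lag."""
    lagged = resid @ M.T                      # row t holds M e_t
    return -float((resid * lagged).sum() / (lagged * lagged).sum())


def null_t_ratio(resid, M, estimator, n_draws=100, seed=0):
    """Eq. (83): (estimate - mean of null draws) / std of null draws."""
    rng = np.random.default_rng(seed)
    estimate = estimator(resid, M)
    # Null residuals: normal draws matching the mean and variance of the
    # observed residuals, assigned to regions at random.
    null = np.array([
        estimator(rng.normal(resid.mean(), resid.std(), size=resid.shape), M)
        for _ in range(n_draws)
    ])
    return (estimate - null.mean()) / null.std(ddof=1), estimate, null


# Toy data: ring contiguity and SMA errors with lambda = -0.8 (illustrative).
rng = np.random.default_rng(42)
S, T = 50, 12
k = np.arange(S)
M = np.zeros((S, S))
M[k, (k + 1) % S] = M[k, (k - 1) % S] = 0.5
u = rng.normal(size=(T, S))
resid = u - (-0.8) * (u @ M.T)

t, lam_hat, null = null_t_ratio(resid, M, standin_lambda)
print(f"stand-in estimate={lam_hat:.3f}, t={t:.2f}")
```

Passing the estimator as an argument keeps the null-distribution step separate from the estimation step, so a proper GM routine could be substituted for the stand-in without changing the rest of the procedure.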

The estimated variance $\sigma_\alpha^2 = 0.0577$ of unobserved country effects is larger than the estimated regional effects variance, which is $\sigma_\mu^2 = 0.0483$, and both of these are large relative to the remainder variance $\sigma_\nu^2 = 0.0008$. In the generation of the $\lambda_{\mathrm{null}}$ distribution, we also generate null distributions of $\sigma_{\alpha,\mathrm{null}}^2$, $\sigma_{\mu,\mathrm{null}}^2$ and $\sigma_{\nu,\mathrm{null}}^2$. Because the null distributions are based on a random pattern of errors, this also breaks up any effects due to country or region. This is evident from the means of the resulting distributions, hence $\bar{\sigma}_{\alpha,\mathrm{null}}^2 = 0.0131$ and $\bar{\sigma}_{\mu,\mathrm{null}}^2 = 0.0118$. In contrast, the remainder null variance is comparatively large, hence $\bar{\sigma}_{\nu,\mathrm{null}}^2 = 0.1719$. Using also the standard