Testing for Spatial Correlations with Randomly Missing Observations in the Dependent Variable

(1)

http://dx.doi.org/10.4236/tel.2014.48079

How to cite this paper: Gao, J. and Wang, W. (2014) Testing for Spatial Correlations with Randomly Missing Observations in the Dependent Variable. Theoretical Economics Letters, 4, 623-633. http://dx.doi.org/10.4236/tel.2014.48079

Testing for Spatial Correlations with

Randomly Missing Observations in the

Dependent Variable

Jing Gao1_{, Wei Wang}2

1_{College of Sciences, Shanghai Institute of Technology, Shanghai, China}

2_{Antai College of Economics and Management, Shanghai Jiao Tong University, Shanghai, China}

Email: [email protected], [email protected]

Received 29 June 2014; revised 25 July 2014; accepted 26 August 2014

This work is licensed under the Creative Commons Attribution International License (CC BY). http://creativecommons.org/licenses/by/4.0/

Abstract

We consider LM tests for spatial correlations in the spatial error model (SEM) and spatial autore-gressive model (SAM) with randomly missing data in the dependent variable. We derive the for-mulas of the LM test statistics and provide finite sample performance of the LM tests through Monte Carlo experiments.

Keywords

LM Test, Spatial Correlations, Missing Data, Dependent Variable

1. Introduction

Spatial models have a long history in regional science and geography (see [1], for example). Recently, many economic processes that concern spatial correlations have been drawn more and more attention. Examples in-clude housing decision, technology adoption, tax competition, welfare participation, and price decision. There-fore, spatial correlations are of much interest in the study of urban, environmental, labor, and developmental economics among others. Various spatial econometric models are currently being applied, among which the most popular ones are the spatial error model (SEM) and spatial autoregressive model (SAM). Before setting up a spatial econometric model and doing estimation, people tend to test the existence of the spatial correlations first. The LM tests for spatial correlations have already been developed by [2] and [1] for the SEM and the SAM. However, these tests are designed for models with fully observed data.

(2)

624

term/dependent variable vector (see [3], for example). Therefore, the LM tests proposed by [2] and [1] will be no longer valid when missing data problem occurs. In this paper, we consider a case in which observations are randomly missing only from the dependent variable and study the LM tests for spatial correlations in this situa-tion. This situation could be very common in regional studies, where exogenous variables may be available from different sources rather than from data available on a local government web site, but the dependent variable may have missing data. LeSage and Pace [4] and [3] [5] have considered this situation and study the estimations of the spatial econometric models1. In this study, we focus on the tests of spatial correlations in both SEM and SAM.

The rest of the paper is organized as follows. Section 2 provides the SEM model specification with missing data in the dependent variable and LM test for the spatial correlation. We derive the formula of the LM test sta-tistic, which is asymptotically χ2

( )

1 . In Section 3, we study the SAM model and provide the LM test. Some Monte Carlo experiments are carried out in Section 4, and Section 5 concludes the paper.

2. LM Test for Spatial Correlation in the SEM

The Spatial Error Model is:

0

0 ,

n n n

n n n n

Y X

W u

β ε

ε λ ε

= +

= + (1)

where Yn is an n×1 vector of outcomes of n cross sectional units; Xn is an n k× matrix of exogenous va-riables representing the n units’ exogenous characters; un is an n×1 vector of i.i.d. disturbances with zero mean and a finite variance σ02; Wn is an n n× spatial weights matrix of known constants with a zero di-agonal; and λ0 is the spatial effect coefficient that measures the spatial autocorrelation on εn.

If the data are fully observed, we may test

0: 0 0

H λ =

for spatial autocorrelation. Burridge [2] and [1] derived the LM test statistic as

(

)

2 T

T 1

,

n s

n n

ne W e LM

e e tr W W

 

=  

 

where Wns =Wn+WnT and e is the OLS residual of model (1), i.e., e=M Yn n with

(

)

T T

n n n n n n

M =I −X X X X .

However, if there are missing observations on Yn, the above test statistic could not be computed.

We consider the case where some of the observations in the outcome vector are unavailable. Without loss of generality, we assume that the outcomes of the last n1 units are missing, where 0 < n1 < n. Therefore, we can write

( )

( ) , o n

n _u

n Y Y

Y

 

 

=_ _

 

where Yn( )o is the n2 × 1 subvector of observed outcomes, where n2 = −n n1, and ( )u n

Y is the remaining n1 × 1 subvector of unobserved (missing) outcomes. So the (population) system under consideration is

( )

( ) 0

0 .

n n

n n n

o

n n

u n Y

Y X

W u

β ε

ε λ ε

 

 

 = +

= +

(2)

Note that Yn( )u =J Yn n, where Jn= 0n1×n2,In1×n1 is a selection matrix which picks the unobserved elements

from the whole vector Yn. Similarly,

( )o ( )o

n n n

Y =J Y where J_n = I_n₂_×_n₂, 0_n₂_×_n₁_. To simplify some of the nota-

tions, denote Sn =In−λ0Wn. Then we can write ( )o n Y as

1_{LeSage and Pace}_[4]_{consider an example of housing prices, where the unsold properties have known characteristics. Examples of Wang}

(3)

625

( ) ( ) ( ) ( ) ( )

0 0

1

n n n n n

o o o o

n n

o

n J X J J X Jn S un

Y = β + ε = β + − . (3)

The maximum likelihood (ML) approach can be based on the above equation. Let ( ) 1

n n

o

n n

v =J S−u , then

( )

2 0 ,

n v n

Var v =σ Σ , where Σ =v n, B Bn nT with

( )o 1

n n n

B =J S− . Let θ ₌

(

λ β σ_, T_, 2

)

T

, with θ0 being the true pa-

rameter value. Under normality, the log likelihood function is

( )

( ) ( )

_{( )}

( ) ( ) 2 2 2 , T 1 , 2 1

ln ln 2π ln ln

2 2 2

1

, 2

n v n

o o o o

n n v n n n

n n

Y

n n

L

J X Y J X

θ σ λ

β λ β

σ

− = −

 ₋  ₋



− − Σ



 

− Σ _ _

where Σ =v n, Bn

( ) ( )

λ BnT λ with

( )

( )o

₍

₎

1

n n n n

B λ =J I −λW − . The expressions for the elements of the score

vector are:

( )

_{( )} _{( )}

( )

( ) ( )

( )

( ) ( )

( )

{

( )

( ) ( )

( )

(

)

}

1 , , 1 1

T T T T T

1 T T ( ) ( ) 2 T 2 2 T ln

ln 1 1

,

2 2

1 2

o o

v n v n

n n

n o o

n n n n

n n n n n n n n

n n n n

n n

v B B B G G B B B

L

Y J X

v

B B B B G G

Y J X

tr

λ λ

β β

β λ λ λ λ λ λ λ λ β

λ λ λ λ

θ

λ λ σ λ

σ λ λ σ − − − − ∂ ∂ ∂ _ _

= − Σ − _ − _ Σ _ − _

   +           ∂ ∂ ∂ =     − _ + _ (4)

( )

_{

_T

_{( )}

_{( ) ( )}

_T 1

_{( )}

₂

_}

2

2 4

ln 1

,

2 n n n n

n

v L

B B v n

β λ

θ

σ σ λ β σ

−

  −

= _

∂

∂  (5)

and

( )

_{( )}

_{( ) ( )}

1 ( )

T T

2

ln 1

,

o

n n n n

n

T v B Xn

L

B J

β λ λ

θ

β σ

−

 

∂

∂ =  (6)

where n

( )

( )o ( ) o

n n

n

v β =Y −J X β, and Gn

( )

λ Wn

(

In λWn

)

1 −

= − . The second order derivatives are, for the relevant combinations of parameters, Equations (A.1)-(A.6) in the Appendix. Thus the elements of the information matrix are,

( )

( ) ( )

( )

( ) ( )

{

}

( ) ( ) ( ) ( )

( ) ( )

{

}

{

( ) ( )

( ) ( ) ( ) ( )

}

( ) ( )

( )

( ) ( )

( ) ( ) ( )

{

}

1 1

T T T T T T

1 1

T T T T T T

1 1

T T T T T _,

n n n n n n n n n n n n

n n n n n n n n n n n

tr B G G B B B B G G B B B

tr B G G B B B tr B B B G G B

tr B B B G G B B B B G B

I_λλ λ λ λ λ λ λ λ λ λ λ λ λ

λ λ λ λ λ λ λ λ λ λ λ λ

λ λ λ λ λ λ λ λ λ λ λ

− − − − − −  +     +                − _ _ + _ _       − _ _ _ + _ _ _ = (7)

( ) ( ) ( )

( ) ( )

{

}

2 2 2 2

1

2

2 4

T T

1

, 0, , 0,

2

n n n n n

B G B B B n

I_λσ tr λ λ λ λ λ I_λβ I_{σ σ} I_{σ β}

σ σ

−

= _ _ = = = (8)

( )_T

_{( ) ( )}

1 ( )

T T

2 .

1 o o

n n n n n n

I_ββ X J B λ B λ J X

σ

−

 

 

= (9)

To perform the LM test, expressions (4)-(6) and (7)-(9) need to be evaluated under constrained estimation, i.e., with the parameter values included in the null hypothesis set to zero (namely, ˆλR =0), and with the other parameters set to their ordinary-least-squares estimates, i.e., _ˆ

(

T ( )oT ( )o

)

1 T ( )oT ( )o _,

R X Jn n Jn Xn X Jn n Yn

β = − and

(

)

( )T ( ) ( )

2 2

ˆ_R 1n Y_no M_noY_no ,

σ = with Mn( )o In₂ Jn( )oXn

(

X JnT n( )oTJn( )oXn

)

1X JnT n( )oT −

= − .

(4)

( )

( ) ( )

( )

{

_T _T 1 _T

}

{

( )_T ( ) _T

}

0 0 0 0 0 0 o o 0

n n n n n n n n n n

tr B _B B _− B _G +G _ =tr J J _W +W _ =

because Wn has a zero diagonal. Thus we have the score vector as follows

( )

T ( )

(

T

)

( ) 2 _T ( ) ( ) _T

2

T _T

ˆ 2 ˆ

ln

0 0 ,

0 0

o _o

n n n n R _n _n _n

R

o

n

o

e J W W J e _{n e J W J} _{e e e}

L θ σ

θ

 +   

 

∂ _ _

 

= _{= } _

∂  _{ } _

   

 

where e=Mn( ) ( )oYno is the ordinary-least-squares residual of model (3). And the estimated information matrix is

( )

( ) ( )

(

)

( ) ( )

( ) ( ) T

2 4 T

T T 2 T

0 0

ˆ ₀ ₀ _.

ˆ 2

1

0 0

ˆ

o o

n o

n n n n n n

R

o

n n n n

R o

o

tr J J W W J J W

n I

X J J X

θ

σ

 

  ₊  

 

 

=  

 

 

Therefore, the LM test statistic is

( )

_{( )}

( )

( ) ( ) ( ) ( )

( ) ( )T

T T

T

2 T

1

2 T

ˆ ˆ

ln ln ₁

ˆ _,

ˆ ˆ

o

n R n R _n _n o

o o

n

R _o _s _o

R R n n n n n n

L L _{n e J W J} _e

LM I

e e

tr J J W J J W

θ θ

θ

θ θ

−

_∂  _∂  _ _

     

=_ _ _ _ _ _= _ _

 

∂ ∂ _ __ _

   

with ( ) ( ) ( )

(

( ) ( )

)

( ) ( )

2

1

T T

o o o o o o o

e=M Y =_I −J X X J J X − X J _Y

  and

T s

n n n

W =W +W .

Under the null, we have LM →χ2

( )

1 .

3. LM Test for Spatial Correlation in the SAM

The Spatial Autoregressive Model is:

0 0

n n n n n

Y =λW Y X β +ε (10)

where all the notations have same meanings as those in the previous section, except that εn now is an n×1 vector of i.i.d. disturbances with zero mean and a finite variance σ02. We can see that in this model, the spatial correlations exist among the components of Yn instead of εn, compared with SEM.

We consider testing the spatial lag dependence of the model, namely testing the null hypothesis

0: 0 0.

H λ =

If the data are fully observed, by using the likelihood function [1], derived the LM test statistic explicitly as 2

T

T 1

, n ne W y LM

T e e

 

=  

 

where e is the OLS residual of model (10) under the null, and

(

)

T

(

)

(

)

1 1

T T T T T T T

.

n n n n n n n n n n n n n n n n

T=n W X_ X X − X Y _ M W X X X − X Y +tr_W +W W _e e e e

 

 

We consider the case where some of the observations in the outcome vector are unavailable. By adopting the same notations as those in the previous section, the (population) system under consideration can be written as

( )

0 0 .

o o

n n

u u

n n

Y Y

W X

λ β ε

   

    +

   

  

+ 

(5)

627 The reduced form Equation of (11) for write Yn( )o is

( ) ( ) 1 ( )

0

1 ,

o o o

n n S Xn n Jn n n

Y =J − β + S−ε (12)

and therefore, the ML approach based on this reduced form equation can be applied. Under normality, the log likelihood function is

( )

_{( )}

( )

_{( )}

2 2 2 , T 1 , 2 1

ln ln 2π ln ln

2 2 2

1

. 2

n v n

n n v n n n

o o

n n

Y n L

B Y X

n

X B

θ σ λ

λ β λ λ β

σ

−

= − − − Σ

   

− _ − _ Σ _ − _

The expressions for the elements of the score vector are

( )

_{

_{( )}

_{( ) ( )}

_{( )}

_{( ) ( )}

_{( )}

( )

( ) ( )

( )

(

)

}

( ) ( )

( )

2 2 T 2 1 1

T T T T T

1

T T T

1 T ln 1 2 , 1

n n n n n n n n n n

n n n n n n

n n

n

n n n

L

t

v B B B G G B B B v

B B B B G G

B G X B v

r

B

θ λ λ λ λ λ λ λ λ θ

λ λ λ λ λ λ

λ λ θ λ σ β σ λ σ λ θ − − − −    +             +        + _ ∂ = ∂ −      (13)

( )

_{

_T

_{( )}

_{( ) ( )}

_T 1

_{( )}

₂

_}

2

2 4

ln 1

,

2 n n n n

n

v L

B B v n

θ λ

θ

σ σ λ θ σ

−

  −

= _

∂

∂  (14)

and

( )

_T

_{( )}

_{( ) ( )}

_T 1

_{( )}

T 2 ,

ln 1

n n

n

n n n

v B

L

B B X

θ λ λ

θ

β σ λ

− 

∂



∂ =  (15)

where vn

( )

θ =Yn( )o −Bn

( )

λ Xnβ. The second order derivatives are, for the relevant combinations of parameters, Equations (A.7)-(A.12) in the Appendix. Thus the elements of the information matrix are,

( )

( ) ( )

( )

( ) ( )

{

}

( ) ( ) ( ) ( )

( ) ( )

{

}

{

( ) ( )

( ) ( ) ( ) ( )

}

( ) ( )

( )

( ) ( )

( ) ( ) ( )

{

}

( ) ( )

1 1

T T T T T T

1 1

T T T T T T

1 1

T T T T T

2 1

n n n

tr B G G B B B B G G B B B

tr B G G B B B tr B B B G G B

tr B B B G G B B B B G

B G

I

B

X

λλ λ λ λ λ λ λ λ λ λ λ λ λ

λ λ λ λ λ λ λ λ λ λ λ λ

λ λ λ λ λ λ λ λ λ λ λ

λ λ σ − − − − − −  +     +                − _ _ + _ _       − _ _ _ + _ _ _ + =

( ) ( )

1

( ) ( )

T _T

,

n n n n n

B B B G X

β  λ λ − λ λ β

      (16)

( ) ( ) ( )

( ) ( )

{

}

2 1 T T 2 1 ,

n n n n n

I_λσ tr B λ G λ B λ B λ B

σ λ

−

 

 

= (17)

( ) ( )

T

( ) ( )

_T 1

( )

2 1

,

n n n n n n n

B G X B B B X

I_λβ λ λ β λ λ λ β

σ

−

 

   

= (18)

2 2 2

2

4, 0,

2

n

I_{σ σ} I_{σ β}

σ

= = (19)

( )

( ) ( )

1

( )

T T T

2 .

1

n n n n n n

I_ββ X B λ B λ B λ B λ X

σ

−

 

 

= (20)

To perform the LM test, expressions (13)-(15) and (16)-(20) need to be evaluated under constrained estimation, i.e., with the parameter values included in the null hypothesis set to zero (namely, ˆλR =0), and with the other parameters set to their ordinary-least-squares estimates, i.e., βˆ_R =

(

X J_nT _n( )oTJ_n( )o X_n

)

−1X J_nT _n( )oTY_n( )o,

and 2

(

)

( )T ( ) ( ) 2

ˆ_R 1n Y_no M_noY_no ,

σ = with ( ) ( )

(

( ) ( )

)

( )

2

1

T T

o o o o o

n n n n n n n n n n

(6)

Note that Bn

( )

0 =Jn( )o,Gn

( )

0 =Wn and

( )

( ) ( )

( )

{

_T _T 1 _T

}

{

( )_T ( ) _T

}

0 0 0 0 0 0 o o 0

n n n n n n n n n n

tr B _B B _− B _G +G _ =tr J J _W +W _ =

because Wn has a zero diagonal. Thus we have the score vector as follows

( )

T ( ) ( ) T

2 T ˆ ˆ ln 0 , 0 o

n n n R

n R

o n

n e J W X J e e e

L θ β

θ  ₊    ∂   =   ∂        

where e=Mn( ) ( )oYno is the ordinary-least-squares residual of model (12) with ˆλR =0. And the estimated information matrix is

( )

( ) ( )

(

)

( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) ( ) T T T 2 2 2 4 T T 2 T T T 2 ˆ ˆ ˆ ˆ ˆ ˆ

1 _ˆ _ˆ 1 _ˆ

0

0 0 .

2

1 _ˆ 1

0 R

o

n n n n n n n n n n R n n

o o o o o o o

o o

n R n n n R n n

R R

R

n n n n n R n n n

o n

R R

o I

tr J J W W J J W J W X J W X J W X J X

n

J X J W X X J J X

σ σ θ β σ σ σ β β β  _ _ _ _ _ _  + +  _ _ _ _ _           =            

Using the formula of the inverse of a partitioned matrix, we have

( )

1 ( )T ( )

(

T

)

( )T ( ) ( ) T ( ) ( )

2 ˆ 1

ˆ o o o ˆ o o _{ˆ .}

R n n n n

o o

n n n n n n R n n n

R

n R

I tr J J W W J J W J W X M J W X

λλ σ β

θ − β

   ₊ ₊  

   

=

 

Therefore, the LM test statistic is

( )

(

( ) ( )

)

1 ( ) ( ) ( ) 2

2 T T T

2 1

T T T

T ˆ

ln ₁

ˆ _,

o o o

n R

o o o

R o

n e J W X X J J X X J Y J e

L LM I e e T λλ θ θ λ − −     ₊  _∂  _ _     =_ _{ } =    ∂ _ _          where ( )

(

( ) ( )

)

( ) ( ) ( ) ( )

(

( ) ( )

)

( ) ( )

( )

( ) ( ) ( ) ( ) T 1 1

T T T T

( ) T T T

2

T T T T _,

o o o o o o o o o o o

o T

n n n n n n n n n n n n n n n n n n n n n

o o s o o

n n n n n n

T n J W X X J J X X J Y M J W X X J J X X J Y

tr J J W J J W e e e e

− −      =  _ _ _    + _ _ _     

with ( ) ( ) ( )

(

( ) ( )

)

( ) ( )

2

1

T T

o o o o o o o

e=M Y =_I −J X X J J X − X J _Y

  and

T s

n n n

W =W +W .

Under the null, we have LM →χ2

( )

1 .

4. Monte Carlo Experiments

To investigate the finite sample performance of the LM tests, we conduct Monte Carlo experiments, designed as follows.

4.1. LM Tests in SEM

(7)

629

are β =10 1 and β =20 1. The ui’s are independently drawn from N

( )

0,1 and are independent of xi1 and 2

i

x . xi1 and xi2 are generated from N

( )

0,1 .

For weights matrix Wn, we follow the design of [6] and [3], which is referred to as the “circular world ma-trix.” The weights matrix is designed as follows. For the n n× weights matrix, the first n/3 rows (except for the first row) have zeroes everywhere, except for the elements in positions (i, i + 1) and (i, i ‒ 1). In the first row, the non-zero elements are in positions (1, 2) and (1, n) so that it relates to a circular world. The nonzero elements in the first n/3 rows are all random draws from U

( )

0,1 ; i.e., we allow the neighbors to asymmetrically affect one another2. Then, these rows are row normalized, so that the sum of each row is equal to 1. The next n/3 rows (say,

3 1, , 2 3

j=n +  n ) have zeroes everywhere, except in positions (j, j ± r), where r=1, 2,, 5. The nonzero elements are designed in the same fashion as those in the first n/3 rows. The last n/3 rows are defined in a simi-lar manner to the first n/3 rows. Specifically, the nonzero elements in rows j=2 3 1,n + ,n−1 are in posi-tions (j, j+1) and (j, j‒1); in the last row, the nonzero elements are in positions (n, 1) and (n, n ‒ 1). The nonzero elements in these rows are also designed in the same fashion as those in the first 2n/3 rows. The weights matrix is a sparse weights matrix, with each individual having only several “neighbors.” The number of neighbors dif-fers for each individual, depending on its position.

For sample sizes, we set n from “small”, n = 60 and “moderate”, n = 180, to “large”, n = 540. For missing observations, the yi’s of the first α percent of the n individuals are unobserved for each sample size n, where

α is set as 10, 25, and 50.

For each n and α (percentage of missing) combination, we report the percentages of rejecting the null hy-pothesis in all the 1000 Monte Carlo replications, for different nominal sizes 1%, 5% and 10%. The first row shows the results for the λ =0 0 case, and the second and third row show those for λ =0 0.2 and 0.5, respec-tively.

Tables 1-3below show the finite performance of the LM test in the SEM. The empirical levels (first row) of the LM test are close to the theoretical ones. But for the powers (second and third row), they depend on the sample sizes and the value of λ0. For small value of λ0, the powers are poor, especially for small n. For larger

0

[image:7.595.86.540.411.713.2]

λ , the powers are good, except for small n.

Table 1. SEM: 10% missing data.

n 60 180 540

Nominal sizes 1% 5% 10% 1% 5% 10% 1% 5% 10%

0 0

λ = 0.6 3.7 8.1 1.3 4.9 9.5 1.2 4.6 10.1

0 0.2

λ = 9.8 24.1 33.9 37.5 61.7 73.0 91.5 96.5 97.9

0 0.5

λ = 78.7 91.4 95.9 99.9 100 100 100 100 100

n 60 180 540

Nominal sizes 1% 5% 10% 1% 5% 10% 1% 5% 10%

0 0

λ = 0.8 5.2 9.4 1.6 4.8 10.1 0.8 5.2 9.9

0 0.2

λ = 6.4 18.9 28.0 28.9 49.7 62.0 81.6 94.3 97.4

0 0.5

λ = 64.1 82.1 87.4 99.9 99.9 99.9 100 100 100

n 60 180 540

Nominal sizes 1% 5% 10% 1% 5% 10% 1% 5% 10%

0 0

λ = 1.3 4.5 8.3 1.3 6.4 10.9 0.8 5.0 9.9

0 0.2

λ = 4.3 14.1 21.5 19.2 39.9 54.2 66.6 84.2 90.8

0 0.5

λ = 45.2 68.3 76.8 97.1 99.1 99.6 100 100 100

(8)

[image:8.595.98.538.95.390.2]

Table 4. SAM: 10% missing data.

n 60 180 540

Nominal sizes 1% 5% 10% 1% 5% 10% 1% 5% 10%

0 0

λ = 1.1 4.9 10.3 0.8 5.0 10.0 0.5 5.5 11.4

0 0.2

λ = 26.8 47.7 58.7 77.6 91.3 95.3 100 100 100

0 0.5

λ = 98.2 99.5 99.7 100 100 100 100 100 100

n 60 180 540

Nominal sizes 1% 5% 10% 1% 5% 10% 1% 5% 10%

0 0

λ = 1.8 5.7 10.5 1.6 5.5 10.1 0.6 4.5 10.3

0 0.2

λ = 16.7 33.9 47.5 63.9 81.2 87.3 99.3 99.9 99.9

0 0.5

λ = 93.9 97.9 98.9 100 100 100 100 100 100

n 60 180 540

Nominal sizes 1% 5% 10% 1% 5% 10% 1% 5% 10%

0 0

λ = 1.1 5.8 10.7 0.9 4.5 9.0 1.1 4.7 10.6

0 0.2

λ = 12.1 25.9 38.8 47.7 69.5 79.1 95.1 98.9 99.5

0 0.5

λ = 78.9 91.1 95.6 100 100 100 100 100 100

4.2. LM Tests in SAM

In the SAM, we generate εi’s independently from N

( )

0,1 and independent of xi’s. All other designs are the same as those in the previous subsection. Tables 4-6 show the results of the LM test. The empirical levels are all close to the theoretical ones. But for the powers, they are not good for small value of λ0 when sample sizes are small. For large value of λ0 and larger sample sizes, the powers are good.

5. Conclusion

In this paper, we extend the LM tests for spatial correlations to the case where there are missing data in the de-pendent variable. We considered the spatial error model as well as the spatial autoregressive model and derived the formulas of the LM test statistics in both models. Monte Carlo experiments show good finite sample perfor-mance of the tests. The empirical levels of the LM tests are close to the theoretical ones and the powers are good for large sample sizes.

References

[1] Anselin, L. (1988) Spatial Econometrics: Methods and Models. Kluwer, Dordrecht. http://dx.doi.org/10.1007/978-94-015-7799-1

[2] Burridge, P. (1980) On the Cliff-Ord Test for Spatial Autocorrelation. Journal of the Royal Statistical Society, 42, 107- 108.

[3] Wang, W, and Lee, L. (2013a) Estimation of Spatial Autoregressive Models with Randomly Missing Data in the De-pendent Variable. Econometrics Journal, 16, 73-102. http://dx.doi.org/10.1111/j.1368-423X.2012.00388.x

[4] LeSage, J. and Pace, R.K. (2004) Models for Spatially Dependent Missing Data. Journal of Real Estate Finance and Economics, 29, 233-254. http://dx.doi.org/10.1023/B:REAL.0000035312.82241.e4

[5] Wang, W, and Lee, L. (2013) Estimation of Spatial Panel Data Models with Randomly Missing Data in the Dependent Variable. Regional Science and Urban Economics, 43, 521-538.

(9)

631

(10)

Appendix

The second order derivatives are, for the relevant combinations of parameters of the log likelihood function for the SEM,

( )

_{

_{( )}

_{( ) ( )}

_{( )}

( )

( ) ( )

( )

( ) ( )

( )

( ) ( )

( ) ( ) ( ) ( )

( ) ( )

( )

}

( ) ( )

( )

{

( ) ( )

1

T T T

1 1

T T T T T

1 1

T T T T

2 2 T 1 2 1

T T T T

ln 1

1 2

n n n n n n

n n n n n n n n n n

n

n n n n n n n n n n

n n n n n n n n

L

tr

v B B B G G

B B B B G G B B B v

v B B B G G B B B v

B B B G G B B B

B

β λ λ λ λ λ

λ λ λ λ λ λ λ λ λ β

β λ λ

θ λ

λ λ λ λ λ λ β

λ λ λ λ λ λ λ λ

σ − − − − − − −    +            × _ _ _ + _ _ _     − _ _ _ _       − _ _ _ + −    = ∂ × ∂

( )

_T

( )

_T

( )

}

{

( ) ( )

_T 1

( ) ( ) ( ) ( )

_T _T

}

,

n λ Gn λ Gn λ Bn λ tr Bn λ Bn λ Bn λ Gn λ Gn λ Bn λ −  +  +       (A.1)

( )

_{( )}

_{( ) ( )}

_{( )}

_{( ) ( )}

_{( )}

2 2 4 1 1

T T T T T

ln 1

,

2 n n n n n n n n n

n

v B B B G G B B

L

B v

θ

λ σ σ β λ λ λ λ λ λ λ λ β

− −

∂

− =

∂ ∂    +    (A.2)

( )

_T

_{( )}

_{( ) ( )}

_T 1

_{( )}

_T

_{( )}

_T

_{( )}

_{( ) ( )}

1 ( ) 2 T T 2 ln 1 , o

n

v B B B G B

L

G B B J X

β λ λ λ λ λ λ λ

λ β σ λ

θ − −

   +   

     

∂

− =

∂ ∂ (A.3)

( )

( ) ( )

( )

1

T T 2

2

2 6 4

2 ln , 2 1 n n n n n n

v B B v

L θ

σ σ

σ β λ λ β

−

  −

∂  

∂

− = (A.4)

( )

_T

_{( )}

_{( ) ( )}

1 ( ) 2

2 T 4

T ln

,

1 o

n n n n

n

v B B J

L X β λ θ σ λ σ β −    = _ ∂ −

∂ ∂ (A.5)

and

( )

_T ( )_T

_{( ) ( )}

_T 1 ( ) 2

T 2

ln 1

.

o o

n n n n n

n

X J B B J X

L θ λ σ λ β β − ∂ −

∂ ∂ =   (A.6)

The second order derivatives are, for the relevant combinations of parameters of the log likelihood function for the SAM,

( )

_{

_{( )}

_{( ) ( )}

_{( )}

_{( ) ( )}

( )

( ) ( )

( )

( ) ( )

( ) ( ) ( ) ( )

( ) ( )

( )

}

( )

( ) ( )

( )

( ) ( )

1 1

T T T T T

1 1

T T T T T

1 1

T T T T T

T T 2 2 2 T 2 ln 1 1

n n n n n n n n n

n n n n n n n n n n

n n n n n n

n

v B B B G G B B B

B G G B B B v v B B

B G G

L

B B B

θ λ λ λ λ λ λ λ λ

λ λ λ λ λ λ θ θ λ λ

λ λ λ λ λ λ θ

θ

λ σ

σ θ λ λ

λ λ λ λ λ λ

− − − − − −    +                × _ + _ _ _ − _ _     × _ _ + _ _     × _ + _ ∂ = ∂  −

( ) ( )

( )

{

( ) ( )

( )

}

{

( ) ( )

( ) ( ) ( ) ( )

}

( ) ( )

{

( )

(

)

( ) ( )

( )

( ) ( )

1 1 1

T T T T

1

T T T T T

1 T 1 2 T T _T 1 2 1

n n n

n n n n n n n n

n n n n n n n n n n

n n n n n n n n

n n n n n n n n n n

B G X

B B B G G B B B

B G G B tr B B B G G B

B G X B B B G X

B G G X B B v B G

tr

X

λ λ β

λ λ λ λ λ λ λ λ

λ λ λ λ λ λ λ λ λ λ

λ λ β λ λ λ λ β

λ λ

σ

λ β λ λ θ λ λ β

− − − − − −        − _ _ _ + _ _ _     × _ + _ + _ _   + _ _{ } _   − _ − _ _ _ +_ 

( ) ( )

( )

( ) ( )

( )

}

T 1 1

T T T T _,

n n n n n n n n n

B λ B λ − B λ G λ G λ B λ B λ B λ − v θ



     

×_ _ _ + _ _ _

(11)

633

( )

_{( )}

_{( ) ( )}

_{( )}

_{( ) ( )}

_{( )}

( ) ( )

( )

1 1

T T T T T

2

2 4

T 4

1 T ln

1

, 1

2 n n n n n n n n n n

n n

n

n n n

v B B B G G B B B v

L

X

B G B B v

θ λ λ λ λ λ λ λ λ θ

λ λ λ

θ

λ σ σ

σ λ θ

− −

− ∂

   +   

     



− =

∂



+ _  _

∂

 

(A.8)

( )

_{( )}

_{( ) ( )}

_{( )}

_{( ) ( )}

_{( )}

( ) ( )

( )

( ) ( )

( )

2

1 1

T T T T T

1 1

T T

2

T T

2 ,

ln 1

1 1

n T

n n n n n n n n n n n n n

v B B B G G B B B B X

B G X B B v B G X B

L

B B X

θ λ λ λ

θ

λ β σ λ λ λ λ λ λ

λ λ λ λ λ

σ θ σ λ λ λ λ

− −

   +   

     

  +  

   

 

∂

− =

∂ ∂

− _ _ _ _ _ _

(A.9)

( )

( ) ( )

( )

1

T T 2

2

2 6 4

2 ln

, 2 1

n n

n

n n

n

v B B v

L θ

σ σ

σ θ λ λ θ

−

  −

∂  

∂

− = (A.10)

( )

_T

_{( )}

_{( ) ( )}

1

_{( )}

2

2 T

T

4 ,

ln 1

n n n n

n

v B B B X

L θ _θ _λ

σ β σ λ λ

−

 

 

∂

− =

∂ ∂ (A.11)

and

( )

_{( )}

_{( ) ( )}

1

_{( )}

T T T

2

T 2

ln

. 1

n n n n

n

n n

X B B X

L

B B

θ

β β σ λ λ λ λ

− ∂

− =

∂  

(12)