Unit Root Tests in Panel Data: Weighted Symmetric Estimation and Maximum Likelihood Estimation

(1)

KIM, HYUNJUNG. Unit Root Tests in Panel Data: Weighted Symmetric Es-timation and Maximum Likelihood EsEs-timation. (Under the direction of David A. Dickey.)

There has been much interest in testing nonstationarity of panel data in the econo-metric literature. In the last decade, several tests based on the ordinary least squares and Lagrange multiplier method have been developed. In contrast to a unit root test in the univariate case, test statistics in panel data have Gaussian limiting distribu-tions.

(2)

(3)

To my eternal beloved

and

(4)

Biography

(5)

Acknowledgements

I would like to express my gratitude to Dr. David A. Dickey. He has been my mentor and good example to me as a researcher. Without his kindly and sincere guidance, I could not complete this dissertation. I am also grateful to my committee members, Dr. Battacharyya, Dr. Gumpertz, and Dr. Pantula for their comments on my dissertation. Dr. Pantula has given me encouragement and valuable advice through my academic years. It has been a great pleasure for me to work with Dr. Brownie. Her supervision as well as financial support helped me to finish my grad-uate program without any difficulty. I will not forget the faculties and staff in our department for their enthusiastic teachings and kindness.

I also want to show my appreciation to the faculties in the department of statis-tics at Ewha Womans University. Especially, Dr. Dongwan Shin had provided me invaluable guidance during my master’s program and led me to keep studying for Ph. D.

Recently, I realized that I have known many precious friends. Even though I do not mention their names, they are treasures to me that I have obtained through my life and will be remained in my deepest heart. I really appreciate all the saints who have spent time with me for last 4 years. Their love, fellowship and prayer in the Lord enliven me in every situation.

(6)

2.1 AR(1) process . . . 11 2.2 Higher Order Processes . . . 29 3 Maximum Likelihood Estimation with Fixed Individual Effects 43 3.1 Maximum Likelihood Estimator for the Fixed Individual Effects Model 43 3.2 Limiting Distribution . . . 49 4 Maximum Likelihood Estimation with Random Individual Effects 57 4.1 Maximum Likelihood Estimator for the Random Individual Effects Model 57 4.2 Sequential Limit Distribution . . . 62

5 Power Study 68

6 Example 75

7 Summary 77

References 79

A Finite Sample Moments in Fixed Effects Case 83

B Finite Sample Moments in Random Effects Case 88

(7)

List of Tables

2.1 Regression table for weighted symmetric estimation in AR(1) process 13

2.2 Empirical cumulative distribution of √N T(ˆρ_s−1−b_s) . . . 25

2.3 Empirical cumulative distribution of √N T(ˆρ_ws−1−b_ws) . . . 26

2.4 Empirical cumulative distribution of ˆτ_s . . . 27

2.5 Empirical cumulative distribution of ˆτ_ws . . . 28

2.6 Regression table for weighted symmetric estimation in AR(2) process 31 2.7 Empirical cumulative distribution of ˆτ(θ₁)_s . . . 41

2.8 Empirical cumulative distribution of ˆτ(θ₁)_ws . . . 42

3.1 Empirical cumulative distribution of √NhT(ˆρ_{ml,f ix}−1) +ζ₀(T)i . . . 56

4.1 Empirical cumulative distribution of √NhT(ˆρ_ml,ran−1) + 1i . . . 67

5.1 Size of unit root tests in panel data when ρ= 1 . . . 70

5.2 Power of unit root tests in panel data when ρ= 0.98 . . . 71

(8)

List of Figures

(9)

Chapter 1 Introduction

1.1 Statement of the problem

Time series across states, industries, or randomly sampled individuals can be easily found in the macroeconomic and microeconomic literature. These kinds of data are called panel data, cross-section data, or longitudinal data. Among econometricians, there have been many studies about panel data with the autoregressive covariance structure over time. If it is assumed that there is no correlation across individuals, an autoregressive panel data model of order 1 with individual effects µ_i can be written as

y_it =ρy_it₋₁+ (1−ρ)µ_i+e_it, i= 1,2,· · ·, N;t= 1,2,· · ·, T, (1.1) wheree_it follows a white noise process with mean zero and varianceσ2. Here, y_it can be interpreted as an observation which is obtained from the ith individual at time t. The individual effect µ_i can be assumed to be either a fixed or random effect which is independent of the white noise process e_it.

(10)

many econometricians have developed test statistics and the corresponding limiting distributions. Unlike a univariate time series, we need to consider two dimensions of indices, that is, (N, T) for deriving a limiting distribution.

In this thesis, we will develop unit root test statistics based on weighted sym-metric estimation and maximum likelihood estimation for model (1.1). For weighted symmetric estimation, we also discuss higher order processes. For maximum likeli-hood estimation, we will consider random individual effects as well as fixed individual effects. Concerning white noise, we will assume the Gaussian distribution with mean zero and varianceσ2. Also, we will derive the limiting distributions for the test statis-tics underH₀ :ρ= 1 asN, T → ∞. Through Monte Carlo simulation study, we will compare the power of our test statistics with some of previous studies. As an example, we will consider the real Gross Domestic Product per Capita for 12 countries.

1.2 Literature Review

Since the 1960’s, there has been much literature about panel data models [e.g. Balestra and Nerlove (1966), Mundlak (1978), T.W. Anderson (1978), T.W. Anderson and Hsiao (1982)]. Anderson and Hsiao (1982) present several estimation methods for various regression models with the autoregressive order 1 covariance structure. Specifically, they focus on maximum likelihood estimation for the following stationary autoregressive panel data model.

y_it =ρy_it₋₁+δ

0

z_i+γ

0

x_it+α_i+e_it, i= 1,· · ·, N;t = 1,· · ·, T, (1.2) where

|ρ|<1, E(α_i) =E(e_it) = 0,

(11)

α_i and e_it are independent of each other, z_i is a vector of time-invariant exogenous variables and x_it is a vector of time-varying exogenous variables. For maximum likelihood estimation, the underlying distributions are Gaussian. They assume that the initial value y_i₀ may be fixed or random. For both cases, they allow thaty_i₀ may be related to α_i or independent of α_i. They present consistency results for ˆρ and ˆδ when N → ∞for fixed T or T → ∞for fixed N.

Due to the fact that there exist nonstationary time series in many economic data sets, Quah (1994) introduces a unit root test for panel data. He considers the simplest panel data model,

y_it=ρy_it₋₁ +e_it, (1.3) where y_i₀ is a given random variable with (2 +δ)th finite moment and the e_it’s are independent identically distributed with mean zero, variance σ2, and have bounded (4 +δ)th moment for δ >0. He assumes the cross-section dimension N and the time dimension T to be the same order of magnitude, that is, N = N(T) = O(T). For model (1.3), he considers the ordinary least squares estimator ˆρ_ols of ρ. That is,

ˆ ρ_ols =h

N_X(T) i=1

T

X

t=1

y_it₋₁2i−1h

N_X(T) i=1

T

X

t=1

y_ity_it₋₁i. (1.4) Under H₀ :ρ= 1, he shows

q

N(T)

√

2 T(ˆρols−1−2 µ σ2T

−3/2₎₌D_⇒_N_(0,₁₎ _(1.5)

as T → ∞, where

µ= lim

T→∞E

h

y_i₀³P

T t=1eit

√

T

´i

.

(12)

Levin and Lin (1992) generalize model (1.3) to have individual-specific effects and a time trend in their model. The model with individual-specific effects can be written as

y_it =η_i+ρy_it₋₁+e_it, (1.6) where the e_it are independent and identically distributed with mean 0 and variance σ2. Also, E|e_it|2+δ _<_∞_{, for} _{δ >}_0.

They consider the ordinary least squares estimator of ρ and the corresponding t-statistic in model (1.6). These are defined as

ˆ ρ_ols =h

N

X

i=1 T

X

t=1

(y_it−y¯_i)2i−1h

N

X

i=1 T

X

t=1

(y_it−y¯_i)(y_it₋₁−y¯_i)i, (1.7) t_ρ =hσˆ_ols−2

N

X

i=1 T

X

t=1

(y_it−y¯_i)2i1/2hρˆ_ols−1i, (1.8) where

¯ y_i = 1

T

X

t=1

y_it, and

ˆ

σ_ols2 = 1 N T −N −1

N

X

i=1 T

X

t=1

h

y_it−y¯_i−ρˆ_ols(y_it₋₁−y¯_i)i2.

Under H₀ :ρ= 1, t_ρ has the following limiting distribution, based on the central limit theorem for triangular arrays (Billingsley 1986).

√

1.25ht_ρ−

√

N µ₁_T

√

µ₂_T

i _D

=⇒N(0,1), (1.9)

(13)

where

µ₁_T =Eh 1 T σ2

T

X

t=1

(y_it₋₁−y¯_i)e_iti=−1 2 −

1 2T, and

µ₂_T =Eh 1 T2σ2

T

X

t=1

(y_it₋₁−y¯_i)2i= 1 6 +

5 6T2. If

√

N

T →0, then

√

1.25t_ρ+√1.875N=D⇒N(0,1). (1.10) In their 1993 paper, they allow the degree of persistence to vary across individu-als. In other words, their individual-specific effects model (1.6) is generalized to the following model.

∆y_it =η_i+δ_iy_it₋₁+²_it. (1.11) Here, ²_it is distributed independently across individuals and is a stationary invertible ARMA process as in Said and Dickey (1984). Also, this disturbance has finite fourth moments and bounded variation at frequency zero, that is,

E(²2_it) + 2

∞

X

j=1

E(²_it²_it₋_j)< M,forM >0.

In the model (1.11), testing for a unit root is equivalent to testing δ_i = 0 for all i. To remove time-specific aggregate effects, they subtract the cross-section average

1 N

P_N

i=1yit from the original data at time t. After this procedure, they perform the

(14)

∆y_it=η_i+δ_iy_it₋₁+

pi

X

j=1

θ_ij∆y_it₋_j +e_it (1.12)

Let ˆu_it be the residual from the regression of ∆y_it onη_i and P_jp₌₁i θ_ij∆y_it₋_j. Also let ˆ

v_it₋₁ be the residual from the regression of y_it₋₁ on η_i and Ppi

j=1θij∆yit−j. By using

these residuals, they consider the following regression model for each i.

ˆ

u_it=δ_iˆv_it₋₁+e_it. (1.13) Dividing ˆu_itand ˆv_it₋₁by the regression standard error in the model (1.13), they obtain standardized residuals ˜u_it and ˜v_it₋₁. They compute a combined estimator ˆδ and the corresponding t-test statistic from these standardized residuals. The definitions are

ˆ δ =

P_N

i=1

P_T

t=2+piv˜it−1u˜it

P_N

i=1

P_T

t=2+piv˜2it−1

, (1.14)

and

t_δ = ˆ δ

SE(ˆδ), (1.15)

where

SE(ˆδ) = ˆσ_eh

N

X

i=1 T

X

t=2+pi ˜

v_it2₋₁i−1/2, ˆ

σ_e=h 1 NT˜

N

X

i=1 T

X

t=2+pi

(˜u_it−δ˜ˆv_it₋₁)2i1/2, and

˜

T =T − 1 N

N

X

i=1

(15)

Sincet_δ in (1.15) goes to−∞for the individual-specific effects model, they introduce the following adjusted t-test statistic.

t∗_δ = tδ−N ˜

TSˆσˆ−_e2SE(ˆδ)µ∗_{1 ˜}

T

σ_{1 ˜}∗

T

, (1.16)

where the average standard deviation ratio ˆS = 1 N

P_N

i=1sˆi and ˆsi is the ratio of the

long-run standard deviation to the innovation standard deviation for each i. The mean and variance adjustment factors,µ∗_{1 ˜}

T, σ ∗

1 ˜T can be found in Levin and Lin (1993). These

are presented in a table computed by Monte Carlo simulation for various (N, T). If N

T →0 then,

t∗_δ =D⇒ N(0,1). (1.17)

Im, Pesaran, and Shin (1997) propose the LM-bar test and t-bar test which are based on the averages of the Lagrange multiplier statistics and t-statistics, respec-tively. They consider the models (1.6) and (1.11) in Levin and Lin (1992 and 1993). However, the alternative hypothesis in this paper is that the fraction of the individ-ual processes that are stationary is non-zero, whereas the alternative hypothesis in Levin and Lin (1993) is that all the individual processes are stationary. When errors in regressions are serially uncorrelated, that is, under the model (1.6), the LM-bar statistic is

LM_{N T} = 1 N

N

X

i=1

LM_iT (1.18)

where

LM_iT = T∆y

0

iPi∆yi

∆y_i

0

M_T∆y_i, (1.19)

1_T =h1,1,· · ·,1i

0

(16)

M_T =I_T −1_T(1_T

0

1_T)−11

0

_T,

P_i =M_Ty_i,₋₁(y

0

_i,₋₁M_Ty_i,₋₁)−1y

0

_i,₋₁M_T,

y_i,₋₁ =hy_i₀, y_i₁,· · ·, y_iT₋₁i

0

, and

∆y_i =h∆y_i₁,∆y_i₂,· · ·,∆y_iTi. Then their standardized LM-bar statistic is

Γ_LM =

√

NhLM_{N T} −E(LM_iT)i

q

V ar(LM_iT)

, (1.20)

where E(LM_iT) and V ar(LM_iT) are the mean and the variance of LM_iT under H₀ :ρ = 1.

Their t-bar statistic is

t_{N T} = 1 N

N

X

i=1

t_iT, (1.21)

wheret_iT is the t-statistic which corresponds to the estimator ofδ_ifrom the augmented Dickey-Fuller regression model ∆y_it =η_i+δ_iy_it₋₁+e_it for each i. For testing δ_i = 0 for every i, they present the standardized t-bar statistic which is

Γ_t =

√

Nht_{N T} −E(t_T|δ_i = 0)i

q

V ar(t_T|δ_i = 0)

(17)

They also construct the standardized LM-bar and t-bar statistics under the model (1.11). Under the model assumption of (1.6), they prove that Γ_LM and Γ_t have standard Gaussian limiting distributions asN → ∞ and T is fixed. Also, they show that the standardized LM-bar and t-bar statistics which are constructed under the model (1.11) have standard Gaussian limiting distributions asN and T become large with a restriction of N

T →k, a finite positive constant.

Karlsson and L¨othgren (2000) compare power of Levin and Lin (1992)’s t- tests (hereafter LL tests) in (1.9) and (1.10) with Im, Peseran, and Shin (1997)’s tests (hereafter IPS tests) in (1.20) and (1.22). In their Monte Carlo experiments, they allow the proportion of stationary series in the panel to vary. The results show that IPS tests perform better than LL tests for large T. Also, the power increases as

• N increases,

• T increases,

• and the proportion of stationary series in the panel increases.

Especially, T is a critical factor in determining power.

Harris and Tzavalis (1999) also consider the unit root test statistics based on the pooled ordinary least squares estimator. In their paper, they derive limiting distributions for the normalized least squares estimators when N becomes large and T is finite. Under the model (1.6) and H₀ :ρ= 1,

√

N(ˆρ−1−b)=D⇒N(0, C), (1.23) as N → ∞ with fixed T,

where ˆρ is an ordinary least squares estimator of ρ under the model (1.6),

(18)

C = 3(17T

2₋_20T _{+ 17)}

5(T −1)(T + 1)3 .

(19)

Chapter 2 Simple and Weighted Symmetric

Estimation

2.1 AR(1) process

In this chapter, we consider the simple symmetric and the weighted symmetric estimator in testing for a unit root in the panel data model. Lety_it be the first order autoregressive time series with fixed individual effects.

y_it=η_i+ρy_it₋₁+e_it, i= 1,2,· · ·, N;t = 1,2,· · ·, T, (2.1) where the e_it’s are independent and identically distributed as N(0, σ2) and y_i₀’s are fixed. Under the stationarity condition, that is, |ρ| < 1, we can have a backward representation of the autoregressive time series. That is,

y_it =η_i+ρy_it₊₁+u_it,

wheree_itandu_it have same covariance structure for each i. The discussion follows the logic of the univariate case as laid out by Dickey, Hasza, and Fuller (1984). In their paper, the following univariate seasonal model with seasonal period d is considered.

y_t=

d

X

i=1

(20)

where

x_it = 1 if t=i(mod d) = 0 otherwise,

y₋_d₊₁,y₋_d₊₂,· · ·, y₀ are fixed initial values and thee_t’s are independent and identically distributed with mean 0 and variance σ2.

The seasonal process in (2.2) can be partitioned into d component series such that

y_dk₊_i =θ_i+αy_d₍_k₋₁₎₊_i+e_dk₊_i;i= 0,1,· · ·, d−1,

and each component runs from k = 1 to m where the total number of observations is n =dm. This partition into d component series establishes a useful link between model (2.1) and (2.2). In other words, θ_i in the seasonal process corresponds to an individual intercept η_i in the panel model whereas the seasonal period d corresponds to the number of panels.

The weighted symmetric estimator of ρ in the model (2.1) is the estimator that minimizes

N

X

i=1 T

X

t=1

w_t[y_it−η_i−ρy_it₋₁]2+

N

X

i=1 T_X−1

t=0

(1−w_t₊₁) [y_it−η_i−ρy_it₊₁]2.

If the weightw_t≡1, the estimator is the ordinary least squares estimator. The simple symmetric estimator which is discussed by Dickey, Hasza, and Fuller (1984) can be obtained by setting w_t ≡ 0.5. In weighted symmetric estimation for the first order autoregressive process, the weight is defined as follows.

w_t = 0, t= 1 t−1

T , t = 2,3,· · ·, T.

The weighted symmetric estimator of ρ can be obtained from Table 2.1. Letβ

0

= [η₁, η₂,· · ·η_N, ρ].

(21)

Table 2.1: Regression table for weighted symmetric estimation in AR(1) process Weight Dependent Parameter

Variable η₁ η₂ · · · η_N ρ w₁ y₁₁ 1 0 · · · 0 y₁₀ w₂ y₁₂ 1 0 · · · 0 y₁₁

..

. ... ... ... ... ... ... w_T y₁_T 1 0 · · · 0 y₁_T₋₁ 1−w_T y₁_T₋₁ 1 0 · · · 0 y₁_T 1−w_T₋₁ y₁_T₋₂ 1 0 · · · 0 y₁_T₋₁

..

. ... ... ... ... ... ... 1−w₁ y₁₀ 1 0 · · · 0 y₁₁

..

. ... ... ... ... ... ... ..

. ... ... ... ... ... ... w₁ y_N₁ 0 0 · · · 1 y_N₀ w₂ y_N₂ 0 0 · · · 1 y_N₁

..

. ... ... ... ... ... ... w_T y_{N T} 0 0 · · · 1 y_{N T}₋₁ 1−w_T y_{N T}₋₁ 0 0 · · · 1 y_{N T} 1−w_T₋₁ y_{N T}₋₂ 0 0 · · · 1 y_{N T}₋₁

..

. ... ... ... ... ... ... 1−w₁ y_N₀ 0 0 · · · 1 y_N₁

ˆ

β = (X

0

WX)−1(X

0

WY),

where X is the (2NT)×(N+1) matrix that consists of columns corresponding to pa-rameters, Y is the vector of the dependent variables, and W is the (2NT)×(2NT) diagonal matrix of weights. Then the simple symmetric estimator of ρ is

ˆ

ρ_s =Q−_s1h

N

X

i=1 T

X

t=1

y_ity_it₋₁− 1 T

N

X

i=1

(

T

X

t=2

(22)

Q_s = N X i=1 ( T X t=2

y_it₋₁2+ 0.5y_iT2+ 0.5y_i₀2)− 1 T N X i=1 ( T X t=2

y_it₋₁+ 0.5y_iT + 0.5y_i₀)2. The simple symmetric estimator of an individual intercept η_i is

ˆ

η_is = 1 T Q_s(

T

X

t=2

y_it₋₁+ 0.5y_iT + 0.5y_i₀)h

N X j=1 ( T X t=2

y_jt2₋₁ + 0.5y_jT2 + 0.5y2_j₀)

−XN

j=1 T

X

t=1

y_jty_jt₋₁i.

On the other hand, the weighted symmetric estimator of ρ is

ˆ

ρ_ws = Q−_ws1h

N X i=1 T X t=1

y_ity_it₋₁− 1 T

N

X

i=1

(T

2₋₁

T2 (

T

X

t=2

y_it₋₁)2+T

2₊_T ₋₂

T2 (

T

X

t=2

y_it₋₁)y_iT +T −1

T2 yiT

2₊T + 1

T yi0

T

X

t=2

y_it₋₁+ 1

TyiTyi0)

i

(2.4) where

Q_ws =

N

X

i=1

h_T _{+ 1}

T

X

t=2

y_it₋₁2+ 1 TyiT

2i₋ 1

T

N

X

i=1

h_T _{+ 1}

T

X

t=2

y_it₋₁+ 1 TyiT

i₂

.

Also, the weighted symmetric estimator of an individual intercept η_i is

ˆ

η_iws = T −1 T2 (

T

X

t=2

y_it₋₁+y_iT) + 1 Tyi0+

1 T Q_ws

h_T2₋₁

T3 N X j=1 ( T X t=2

y_jt₋₁)2 +T

2₊_T ₋₂

T3 N X j=1 ( T X t=2

y_jt₋₁)y_jT + T −1 T3

N

X

j=1

y2_jT + T + 1 T2 N X j=1 ( T X t=2

y_jt₋₁)y_j₀ + 1

T2

N

X

j=1

y_jTy_j₀−

N X j=1 T X t=1

y_jty_jt₋₁i(T + 1 T

T

X

t=2

y_it₋₁ + 1 TyiT).

(23)

√

N T(ˆρ_s−1−b_s) = 1

√

N

P_N

i=1[

P_T

t=1(yit−yit−1)yit−1+ ₂1(yi20−yiT2)−bsQsi]/T

1 N

P_N

i=1Qsi/T2

(2.5)

and

√

N T(ˆρ_ws−1−b_ws) = 1

√

N

P_N

i=1[Ai−bwsQwsi]/T

1 N

P_N

i=1Qwsi/T2

, (2.6)

where

b_s=− 6T

2T2+ 1 =− 3

T +O(1/T

2_),

b_ws =−2T

2₊_T _{+ 2}

T3+ 1 =− 2

T +O(1/T

2_),

Q_si =

T

X

t=2

y_it₋₁2+ 0.5(y_iT2+y_i2₀)− 1 T[

T

X

t=2

y_it₋₁ + 0.5(y_iT +y_i₀)]2,

Q_wsi = T + 1 T

T

X

t=2

y_it₋₁2+ 1 TyiT

2₋ 1

T[ T + 1

T

X

t=2

y_it₋₁+ 1 TyiT]

2_,

and

A_i = 2(T + 1) T3 (

T

X

t=2

y_it₋₁)2+−T

2₊_T _{+ 4}

T3 (

T

X

t=2

y_it₋₁)y_iT − T

2₊_T ₋₂

T3 yiT

2

+

T

X

t=1

(y_it−y_it₋₁)y_it₋₁+y_i2₀ − 1 T

T

X

t=2

y_it₋₁2− 1

T2yi0((T + 1)

T

X

t=2

(24)

ˆ

τ_s = ρˆsq−1−bs

ˆ σ2_sQ−_s1

, (2.7)

and

ˆ

τ_ws = ρˆwsq−1−bws ˆ

σ2_wsQ−_ws1 (2.8)

where

ˆ

σ_s2 = 1

N T −N −1[Y

0

WsY−Y

0

WsX(X

0

WsX)

−1_X

0

_W

sY],

and

ˆ

σ2_ws = 1

N T −N −1[Y

0

WwsY−Y

0

WwsX(X

0

WwsX)

−1_X

0

_W

wsY]

Here, W_s and W_ws are the diagonal matrices of weights for simple symmetric and weighted symmetric estimations, respectively.

Now, we discuss the joint limiting distributions of the above test statistics as N, T → ∞. To derive the joint limiting distributions of test statistics underH₀ :ρ= 1 as N, T → ∞, we employ the following joint limit results from Phillips and Moon (1999). In their paper, they present the following theorems for random vectors instead of random variables.

Theorem 2.1: (Joint Probability Limits) Suppose the random variables Y_iT are independent across i for all T and integrable. Let X_N,T =

P_N

i=1YiT

N which is square integrable and µ_x=lim_{N, T} _{→ ∞} E(X_N,T)=lim_{N, T} _{→ ∞}_N1 PN_i₌₁E(Y_iT) be finite. If lim_{N, T} _{→ ∞} _N1₂ PN_i₌₁ V ar(Y_iT) = 0, then

(25)

Proof:

It is sufficient to show that lim_{N, T} _{→ ∞}P{|1

N

P_N

i=1(YiT−E(YiT))|> ²}= 0 ∀² >0.

By Chebyshev’s inequality,

lim

N, T → ∞P{| 1 N

N

X

i=1

(Y_iT −E(Y_iT))|> ²} ≤ 1

²2 N, Tlim→ ∞E| 1 N

N

X

i=1

(Y_iT −E(Y_iT))|2 = 1

²2 N, Tlim→ ∞ 1 N2

N

X

i=1

E|Y_iT −E(Y_iT)|2 = 1

²2 N, Tlim→ ∞ 1 N2

N

X

i=1

V ar(Y_iT) = 0.

The first equality comes from the independence. If Y_iT’s are independent and iden-tically distributed across i for all T, the condition for Theorem 2.1 reduces to the finiteness of lim_T_→∞V ar(Y_iT). The following theorem gives us the condition for the joint limit Central Limit Theorem.

Theorem 2.2: (Joint Limit CLT) Suppose that Y_iT are independent and identically distributed with mean 0 and variance σ_T2 across i for all T. Assume the following conditions hold.

i) Y_iT2 are uniformly integrable in T. ii) lim_T_→∞σ_T2 =σ2_∞>0.

Then,

1

√

N

X

i=1

Y_iT=D⇒N(0, σ2_∞) as N, T → ∞. Proof:

Phillips and Moon (1999) show that Lindeberg’s theorem holds in panel data. In other words, letZ_i,N,T = qP YiT

N

j=1V ar(YjT)

and ∀² >0 ,

lim N, T → ∞

N

X

i=1

(26)

Then asN, T → ∞,

N

X

i=1

Z_i,N,T=D⇒N(0,1)

Therefore we need to prove that ∀² >0,

lim N, T → ∞

N

X

i=1

E[ Y

2

iT

N σ_T21{|YiT|> ²

q

N σ2_T}] = 0. Since Y_iT are identically distributed across i for all T,

N

X

i=1

E[ Y

2

iT

N σ2_T1{|YiT|> ²

q

N σ_T2}] =E[Y

2 1T

σ_T2 1{|Y1T|> ²

q

N σ_T2}]. (2.9)

By the assumption of uniformly integrability of Y₁2_T, (2.9) goes to 0 as N, T → ∞. Since it satisfies the Lindeberg condition,

P_N

i=1YiT

q

N σ_T2

D

=⇒N(0,1).

Since lim_{N, T} _{→ ∞} σ

2

T

σ_∞2 = 1,

P_N

i=1YiT

q

N σ2_∞

D

=⇒N(0,1).

Now we apply these joint limit results to the test statistics in (2.5)-(2.8).

Theorem 2.3: Given model (2.1), under the null hypothesisH₀ :ρ= 1 and assum-ing η_i =y_i₀ = 0 for every i, then

(27)

ii) τˆ_s= ρˆs−1−bs [ˆσ_s2Q−_s1]0.5

D

=⇒N(0,1.2) as N, T → ∞,

iii) √N T(ˆρ_ws−1−b_ws)=D⇒N(0,9) as N, T → ∞,

iv) ˆτ_ws = ρˆws−1−bws [ˆσ_ws2 Q−_ws1]0.5

D

=⇒N(0,1.5) as N, T → ∞.

Proof of i):

Under the assumptions, the normalized bias test statistic for H₀ :ρ= 1 is

√

N T(ˆρ_s−1−b_s) = 1

√

N

P_N

i=1[

P_T

t=2(yit−yit−1)yit−1−0.5yiT2−bsQsi]/T

1

N

P_N

i=1Qsi/T2

(2.10)

where Q_si =PT_t₌₂y_it₋₁2+ 0.5y_iT2− 1 T[

P_T

t=2yit−1+ 0.5yiT]2.

If we apply the joint probability limit result to the denominator and the joint limit CLT result to the numerator in (2.10), we find that the normalized bias test statistic has a Gaussian limiting distribution as N, T → ∞. For convenience, the first and the second finite sample moments of random variables in the test statistics under the null hypothesis are shown in Appendix A.

Now,

E(Q_si/T2) = 2T

2_{+ 1}

12T2 σ

2 T_−→→∞ 1

6σ

2_.

To prove that 1 N

P_N

i=1Qsi/T2−→P

1 6σ

2_{, we need to show lim}

T→∞V ar(Qsi/T2)<∞ .

Since

V ar(Q_si/T2) = 8T

4_{+ 20T}2 _{+ 17}

360T4 σ

4 T_−→→∞ σ4

(28)

1 N

N

X

i=1

Q_si/T2−→P 1 6σ

2

as N, T → ∞.

Now, we will prove the numerator in (2.10) satisfies the conditions for Theorem 2.2. From the first and second moment results in Appendix A, we find that

EhP

T

t=2(yit−yit−1)yit−1−0.5yiT2−bsQsi

T

i

= 0, and

V ar

"P_T

t=2(yit−yit−1)yit−1−0.5yiT2−bsQsi

T

#

= σ

4

10(2T2+ 1)2T

h

8T5

−20T2(T2−T + 1) + 17T −5i. Defining Z_iT as [PT_t₌₂(y_it−y_it₋₁)y_it₋₁−0.5y_iT2−b_sQ_si]/T, we need to show thatZ_iT2 is uniformly integrable in T. If we consider the T-limit distribution ofZ_iT2 ,

Z_iT2 =D⇒h− 1 2+ 3

³ Z 1 0 B

2

i(t)dt−(

Z ₁

0 Bi(t)dt)

2´i2_σ4 def

= Z_i2 as T → ∞,

where B_i(t) is a Wiener process (Phillips,1987). Since

lim

T→∞E(Z

2

iT) = E(Zi2) =

σ4 5 , Z_iT2 are uniformly integrable in T.

Hence, by Theorem 2.2,

1

√

N

X

i=1

[

T

X

t=2

(y_it−y_it₋₁)y_it₋₁−0.5y_iT2−b_sQ_si]/T=D⇒N(0,σ

4

(29)

Therefore,

√

N T(ˆρ_s−1−b_s)=D⇒N(0,7.2) as N, T → ∞ by Slutsky’s Theorem.

Proof of ii):

The studentized test statistic from simple symmetric estimation is

ˆ τ_s =

1

√

N

P_N

i=1[

P_T

t=2(yit−yit−1)yit−1−0.5yiT2−bsQsi]/T

s

ˆ σ2_s 1

N

P_N

i=1Qsi/T2

Since ˆσ2_s−→P σ2 and 1 N

P_N

i=1Qsi/T2−→P

1 6σ

2 _as _{N, T} _{→ ∞}_{, the denominator in the}

studentized statistic converges toσ2q1₆ in probability. Since the numerator converges toN(0,σ₅4) in distribution from the Proof of i),

ˆ

τ_s=D⇒N(0,1.2).

Proof of iii):

This proof is analogous to the proof of i). We apply joint limit results to the normalized bias test statistic in (2.6). If we look at the moments of the denominator in (2.6) first,

E(Q_wsi/T2) = T

4₋_T3 ₊_T ₋₁

6T4 σ

2 _→ σ2

6 , and

V ar(Q_wsi/T2) = 2T

8₋_4T7_{+ 11T}6₋_8T5₋_17T4_{+ 34T}3₋_T2₋_22T _{+ 5}

90T8 σ

(30)

Since lim_T_→∞V ar(Q_wsi/T2) = σ

4

45 <∞,

1 N

N

X

i=1

Q_wsi/T2−→P σ

2

6 . as N, T → ∞.

The first and second moments of the numerator in (2.6) are

E([A_i−b_wsQ_wsi]/T) = 0, and

V ar([A_i−b_wsQ_wsi]/T) = σ

4h_15T8₋_76T7 _{+ 167T}6₋_188T5_{+ 84T}4₋_4T3i

60T4(T2−T + 1)2 +σ

4h_13T2₋_2T ₋₉i

60T4(T2−T + 1)2.

As in the proof of (i), let us define P_iT = [A_i −b_wsQ_wsi]/T. To prove that P_iT2 is uniformly integrable in T, we consider the T-limit distribution of P_iT2 .

P_iT2 =D⇒h1 2(B

2

i(1)−1) +

Z ₁

0 B 2

i(t)dt−(

Z ₁

0 Bi(t)dt)Bi(1)

i₂

σ4 def= P_i2. as T → ∞.

Since

lim

T→∞E(P

2

iT) =E(Pi2) =

σ4 4 ,

P_iT2 is uniformly integrable in T. Since it satisfies the condition of Theorem 2.2, 1

√

N

X

i=1

[A_i−b_wsQ_wsi]/T=D⇒N(0,σ

4

(31)

Hence,

√

N T(ˆρ_ws −1−b_ws)=D⇒N(0,9) as N, T → ∞.

Proof of iv):

By arguments similar to those in (ii), we can prove that

ˆ

τ_ws=D⇒N(0,1.5) as N, T → ∞.

Now we present the empirical percentiles of the test statistics in (2.5)-(2.8) from Monte Carlo simulations. We conduct simulations for the following combinations of N and T.

(N, T) = (10,25),(10,50),(10,100),(25,25),(25,50),(25,100), (50,25),(50,50),and (50,100).

(32)

on √1 N and

1

√

T. For simple symmetric estimation, R

2 _{is 0.97 and the smoothed}

percentiles are given by −1.81− 0.4√1

N + 0.16 1

√

T. For the weighted symmetric regression, R2 is 0.86 and the smoothing function for the empirical percentiles is

−2.02−0.34√1

N + 0.22 1

√

(33)

Table 2.2: Empirical cumulative distribution of√N T(ˆρ_s−1−b_s) Probability of a Smaller Value

(N,T) 0.01 0.05 0.10 0.25 0.50 0.75 0.90 0.95 0.99 (10,25) -8.45 -5.74 -4.31 -2.22 -0.27 1.28 2.55 3.12 4.10 (10,50) -9.02 -5.83 -4.44 -2.30 -0.29 1.34 2.56 3.20 4.15 (10,100) -9.00 -6.01 -4.63 -2.39 -0.38 1.32 2.55 3.19 4.28 (25,25) -7.42 -5.18 -3.94 -2.10 -0.18 1.41 2.75 3.45 4.68 (25,50) -7.87 -5.30 -4.08 -2.10 -0.18 1.47 2.78 3.54 4.74 (25,100) -7.68 -5.36 -4.06 -2.12 -0.21 1.49 2.86 3.56 4.89 (50,25) -7.24 -4.91 -3.73 -1.97 -0.16 1.47 2.81 3.58 5.05 (50,50) -7.41 -4.97 -3.85 -1.99 -0.12 1.56 2.96 3.75 5.20 (50,100) -7.20 -4.94 -3.75 -1.99 -0.12 1.59 2.95 3.76 5.10 (∞,∞) -6.25 -4.43 -3.43 -1.80 0.00 1.80 3.43 4.43 6.25

(34)

Table 2.3: Empirical cumulative distribution of √N T(ˆρ_ws−1−b_ws) Probability of a Smaller Value

(N,T) 0.01 0.05 0.10 0.25 0.50 0.75 0.90 0.95 0.99 (10,25) -9.39 -6.35 -4.78 -2.45 -0.38 1.50 3.06 3.89 5.20 (10,50) -9.67 -6.39 -4.87 -2.59 -0.34 1.52 3.03 3.89 5.20 (10,100) -9.57 -6.50 -4.91 -2.69 -0.37 1.57 3.02 3.87 5.24 (25,25) -8.21 -5.76 -4.52 -2.38 -0.25 1.68 3.26 4.12 5.65 (25,50) -8.54 -5.93 -4.51 -2.37 -0.18 1.68 3.28 4.21 5.69 (25,100) -8.52 -5.79 -4.42 -2.31 -0.20 1.72 3.28 4.17 5.68 (50,25) -8.22 -5.47 -4.26 -2.24 -0.17 1.73 3.34 4.31 5.89 (50,50) -7.87 -5.53 -4.26 -2.24. -0.12 1.80 3.44 4.39 5.97 (50,100) -8.03 -5.39 -4.15 -2.22 -0.12 1.84 3.46 4.37 6.04 (∞,∞) -6.99 -4.95 -3.84 -2.01 0.00 2.01 3.84 4.95 6.99

(35)

Table 2.4: Empirical cumulative distribution of ˆτ_s Probability of a Smaller Value

(36)

Table 2.5: Empirical cumulative distribution of ˆτ_ws Probability of a Smaller Value

(37)

2.2 Higher Order Processes

In this section, we consider the limiting distributions of the simple symmetric and the weighted symmetric estimators for the autoregressive higher order processes. If we generalize model (2.1) to a pth order autoregressive process,

y_it =η_i−

p

X

k=1

α_ky_it₋_k+e_it, i= 1,2,· · ·, N;t= 1,2,· · ·, T, (2.11)

where the e_it’s are independent and identically distributed as N(0, σ2) and y_i,₋_p₊₁,y_i,₋_p₊₂,· · ·,y_i₀ are fixed. We consider the case whenp−1 roots of

mp+

p

X

k=1

α_kmp−k = 0

are less than one and there exists only one unit root. We can rewrite model (2.11) as

y_it=η_i+θ₁y_it₋₁+

p

X

k=2

θ_k∆y_it₋_k₊₁+e_it (2.12) where

θ₁ =−

p

X

k=1

α_k,

θ_l =

p

X

k=l

α_k, l = 2,3,· · ·, p

(38)

y_it =η_i+θ₁y_it₋₁+θ₂∆y_it₋₁+e_it. (2.13) As in the first order autoregressive process, a backward representation can be valid under the stationarity condition, that is,

y_it =η_i+θ₁y_it₊₁−θ₂∆y_it₊₂+u_it,

wheree_it andu_it have the same covariance structure for eachi. For simple symmetric estimation, the regression uses weights w_t = 0.5. In weighted symmetric estimation of a pth order process, the weight is defined as follows.

w_t= 0, t= 1,2,· · ·, p, t−p

T −2p+ 2, t=p+ 1, p+ 2,· · ·, T −p+ 1, 1, t=T −p+ 2, T −p+ 3,· · ·, T.

Fuller (1996) presents the regression table for weighted symmetric estimation in the univariate case. We generalize his table to panel data with fixed individual effects and obtain the weighted symmetric estimators from Table 2.6.

Let β

0

be a vector of parameters, that is, hη₁, η₂,· · ·, η_N, θ₁, θ₂i. Then the weighted symmetric estimator forβ is

ˆ

β = (X

0

WX)−1(X

0

WY),

where X is the 2N(T −2)×(N + 2) matrix that consists of columns corresponding to (N + 2) parameters, Y is the vector of dependent variables, and W is the

(39)

Table 2.6: Regression table for weighted symmetric estimation in AR(2) process

Weight Dependent Parameter

Variable η₁ η₂ · · · η_N θ₁ θ₂ w₃ y₁₃ 1 0 · · · 0 y₁₂ ∆y₁₂ w₄ y₁₄ 1 0 · · · 0 y₁₃ ∆y₁₃

..

. ... ... ... ... ... ... ... w_T y₁_T 1 0 · · · 0 y₁_T₋₁ ∆y₁_T₋₁ 1−w_T₋₁ y₁_T₋₂ 1 0 · · · 0 y₁_T₋₁ −∆y₁_T 1−w_T₋₂ y₁_T₋₃ 1 0 · · · 0 y₁_T₋₂ −∆y₁_T₋₁

..

. ... ... ... ... ... ... ... 1−w₂ y₁₁ 1 0 · · · 0 y₁₂ −∆y₁₃

.. . ... ... ... ... ... ... ... .. . ... ... ... ... ... ... ... .. . ... ... ... ... ... ... ... w₃ y_N₃ 0 0 · · · 1 y_N₂ ∆y_N₂ w₄ y_N₄ 0 0 · · · 1 y_N₃ ∆y_N₃

..

. ... ... ... ... ... ... ... w_T y_{N T} 0 0 · · · 1 y_{N T}₋₁ ∆y_{N T}₋₁ 1−w_T₋₁ y_{N T}₋₂ 0 0 · · · 1 y_{N T}₋₁ −∆y_{N T} 1−w_T₋₂ y_{N T}₋₃ 0 0 · · · 1 y_{N T}₋₂ −∆y_{N T}₋₁

..

. ... ... ... ... ... ... ... 1−w₂ y_N₁ 0 0 · · · 1 y_N₂ −∆y_N₃

ˆ

θ₁_,s = 1 c_s

h

− 1

T −2S22

N X i=1 [( T X t=3

y_it₋₁)2+ 1 2(

T

X

t=3

y_it₋₁)(∆y_iT −∆y_i₂)]

+ 1

2(T −2)S12

N X i=1 [( T X t=3

y_it₋₁)(∆y_i₂−∆y_iT)−1

2(∆yi2−∆yiT)

2_]

+ S₂₂

N X i=1 [ T X t=3

y_ity_it₋₁+1

2(yi1yi2−yiTyiT−1)]

− S₁₂

N X i=1 [ T X t=3

y_it∆y_it₋₁+1

i

(40)

where

S₁₁=

N X i=1 T X t=3

y_it₋₁2− 1 T −2

N X i=1 ( T X t=3

y_it₋₁)2,

S₁₂ =

N X i=1 T X t=3

y_it₋₁∆y_it₋₁+1 2

N

X

i=1

(y_i₂y_i₁−y_iTy_iT₋₁)

− 1

2(T −2)

N X i=1 ( T X t=3

y_it₋₁)(∆y_i₂−∆y_iT),

S₂₂ =

N X i=1 T X t=3

(∆y_it₋₁)2− 1 2

N

X

i=1

[(∆y_i₂)2−(∆y_iT)2]

− 1

4(T −2)

N

X

i=1

(∆y_i₂−∆y_iT)2, and

c_s =S₁₁S₂₂−S₁₂2. Also, the simple symmetric estimator of θ₂ is

ˆ

θ₂_,s = 1 c_s

h ₁

T −2S12

N X i=1 [( T X t=3

y_it₋₁)2+ 1 2(

T

X

t=3

y_it₋₁)(∆y_iT −∆y_i₂)]

− 1

2(T −2)S11

N X i=1 [( T X t=3

y_it₋₁)(∆y_i₂−∆y_iT)−1

2(∆yi2−∆yiT)

2_]

− S₁₂

N X i=1 [ T X t=3

y_ity_it₋₁+1

2(yi1yi2−yiTyiT−1)] + S₁₁

N X i=1 [ T X t=3

y_it∆y_it₋₁+1

i

. (2.15)

(41)

ˆ

θ₁_,ws = 1 c_ws

h

− T −1

(T −2)2W22

N

X

i=1

[T −3 T −2(

T

X

t=3

y_it₋₁)2+ (

T

X

t=3

y_it₋₁)(y_i₁+y_iT)]

− 2

(T −2)2W12

N X i=1 [( T X t=3

y_it₋₁)(y_i₁+y_iT) + T −3 T −2(

T

X

t=3

y_it₋₁)2] + T −1

(T −2)2W12

N

X

i=1

[(∆y_i₂−∆y_iT)(y_i₁+y_iT +T −3 T −2

T

X

t=3

y_it₋₁)]

+ 1

T −2W12

N

X

i=1

[(y_i₁+y_iT)2+ T −3 T −2(

T

X

t=3

y_it₋₁)(y_i₁ +y_iT)] + W₂₂

N

X

i=1

[y_i₁y_i₂+

T

X

t=3

y_ity_it₋₁] + W₁₂

N

X

i=1

[T −1 T −2

T

X

t=3

y_ity_it₋₂−

T

X

t=3

y_ity_it₋₁−y_i₁y_i₂]i, (2.16) where

W₁₁= T −1 T −2

N X i=1 T X t=3

y_it₋₁2−(T −1)

2

(T −2)3

N X i=1 ( T X t=3

y_it₋₁)2, W₁₂ = 1

T −2

N X i=1 [T T X t=3

y_it₋₁∆y_it₋₁−

T

X

t=3

y_it₋₁2−y_iTy_iT₋₁+ (T −1)y_i₂y_i₁] + T −1

(T −2)2

N

X

i=1

[ 2 T −2(

T

X

t=3

y_it₋₁)2−T −1 T −2(

T

X

t=3

y_it₋₁)(∆y_i₂−∆y_iT)]

− T −1

(T −2)2

N X i=1 ( T X t=3

y_it₋₁)(y_i₁+y_iT), W₂₂ = 1

T −2

N

X

i=1

[(∆y_i₂)2+ (∆y_iT)2+T

T_X−1 t=3

(∆y_it)2]

− 1

(T −2)3

N X i=1 [4( T X t=3

y_it₋₁)2+ (T −1)2(∆y_i₂−∆y_iT)2]

− 1

T −2

N

X

i=1

[(y_i₁+y_iT)2− 4(T −1) (T −2)2(

T

X

t=3

y_it₋₁)(∆y_i₂ −∆y_iT)]

+ 1

(T −2)2

N X i=1 [4( T X t=3

(42)

and

c_ws =W₁₁W₂₂−W₁₂2. Also, the weighted symmetric estimator ofθ₂ is

ˆ

θ₂_,ws = 1 c_ws

h _T ₋₁

(T −2)2W12

N

X

i=1

[T −3 T −2(

T

X

t=3

y_it₋₁)2+ (

T

X

t=3

y_it₋₁)(y_i₁+y_iT)]

+ 2

(T −2)2W11

N X i=1 [( T X t=3

y_it₋₁)(y_i₁+y_iT) + T −3 T −2(

T

X

t=3

y_it₋₁)2]

− T −1

(T −2)2W11

N

X

i=1

[(∆y_i₂−∆y_iT)(y_i₁+y_iT +T −3 T −2

T

X

t=3

y_it₋₁)]

− 1

T −2W11

N

X

i=1

[(y_i₁+y_iT)2+ T −3 T −2(

T

X

t=3

y_it₋₁)(y_i₁+y_iT)]

− W₁₂

N

X

i=1

[y_i₁y_i₂+

T

X

t=3

y_ity_it₋₁]

− W₁₁

N

X

i=1

[T −1 T −2

T

X

t=3

y_ity_it₋₂−

T

X

t=3

y_ity_it₋₁ −y_i₁y_i₂]i. (2.17) Based on simple symmetric estimators and weighted symmetric estimators in (2.14)-(2.17), we construct the studentized test statistics for a unit root. In the univariate case, it is known that the studentized test statistics in higher order processes have the same limiting distribution as in the first order autoregressive process. The following theorem shows that a similar invariance holds for the sequential limit distributions of the studentized test statistics in panel data.

Theorem 2.4: Given model (2.13), under the null hypothesis H₀ : θ₁ = 1 and assuming that η_i =y_i₀ =y_i,₋₁ = 0 for each i, then

i) τˆ(θ₁)_s= ˆ

θ₁_,s−1−b(2)_s (1−θˆ₂_,s) [ˆσ_s2S₂₂/c_s]0.5

D

=⇒N(0,1.2)

(43)

ii) τˆ(θ₁)_ws = ˆ

θ₁_,ws−1−b(2)_ws(1−θˆ₂_,ws) [ˆσ_ws2 W₂₂/c_ws]0.5

D

=⇒N(0,1.5) as T → ∞ is followed by N → ∞,

where

b(2)_s =− 6(T −2) 2(T −2)2+ 1, and

b(2)_ws =−2(T −2)

2 _{+ (T} ₋_{2) + 2}

(T −2)3+ 1 .

Here,b(2)_s andb(2)_ws can be obtained by pluggingT−2 instead ofT intob_s andb_wsin the AR(1) process. This idea comes from the fact that we use 2N(T −2) observations in the AR(2) regression table. However, this does not affect our limiting distribu-tion results. Also, ˆσ_s2 and ˆσ_ws2 are mean squared errors in symmetric and weighted symmetric estimations, respectively.

Proof:

Before we derive the sequential limit distributions of test statistics, we consider theT-limit distributions of several random variables which are used in our test statis-tics. Dickey (1976) presents theT-limit distribution results in higher order processes. Whenθ₁ = 1 andy_i₀ =y_i,₋₁ =η_i = 0 in model (2.13), we have the following equation.

(1−θ₂)y_it =

t

X

j=1

e_ij −θ₂∆y_it.

From the above equation and Dickey (1976)’s results, we can obtain the following T-limit results for each i.

P_T

t=3yit−12

T2

T_−→→∞ σ2

R₁

0 Bi2(t)dt

Unit Root Tests in Panel Data: Weighted Symmetric Estimation and Maximum Likelihood Estimation

To my eternal beloved

and

Biography

Acknowledgements

Contents

List of Tables

List of Figures

Chapter 1

Introduction

1.1

Statement of the problem

1.2

Literature Review

0

0

0

0

0

0

0

0

0

0

Chapter 2

Simple and Weighted Symmetric

Estimation

2.1

AR(1) process

0

0

0

0

0

0

0

0

0

0

0

2.2

Higher Order Processes

0

0

0