Dynamic Panel Estimation and Homogeneity Testing Under Cross Section Dependence



By

Peter C.B. Phillips and Donggyu Sul

May 2002

COWLES FOUNDATION DISCUSSION PAPER NO. 1362

COWLES FOUNDATION FOR RESEARCH IN ECONOMICS

YALE UNIVERSITY

Box 208281

New Haven, Connecticut 06520-8281

Dynamic Panel Estimation and Homogeneity Testing Under Cross Section Dependence∗

Peter C.B. Phillips

Cowles Foundation, Yale University

University of Auckland & University of York

Donggyu Sul

Department of Economics

University of Auckland

February 12, 2002

Abstract

This paper deals with cross section dependence, homogeneity restrictions and small sample bias issues in dynamic panel regressions. To address the bias problem we develop a panel approach to median unbiased estimation that takes account of cross section dependence. The new estimators given here considerably reduce the effects of bias and gain precision from estimating cross section error correlation. The paper also develops an asymptotic theory for tests of coefficient homogeneity under cross section dependence, and proposes a modified Hausman test to test for the presence of homogeneous unit roots. An orthogonalization procedure is developed to remove cross section dependence and permit the use of conventional and meta unit root tests with panel data. Some simulations investigating the finite sample performance of the estimation and test procedures are reported.

Keywords: Autoregression, Bias, Cross section dependence, Dynamic factors, Dynamic panel estimation, GLS estimation, Homogeneity tests, Median unbiased estimation, Modified Hausman tests, Median unbiased SUR estimation, Orthogonalization procedure, Panel unit root test.

JEL Classification Numbers: C32 Time Series Models. C33 Panel Data

First Draft: August, 2000

Completed Version: December, 2001

∗ Presented at the Midwest Econometrics Conference, October, 2001. Our thanks go to Feng Zhu for pointing out some errors in the original version. Computational work was performed in GAUSS. Phillips thanks the NSF for support under Grant # SES 0092509.


1 Introduction

This paper suggests some simple and practical methods for treating three important and thorny issues that arise in estimation and testing with dynamic panel models: cross section dependence, homogeneity testing, and small sample bias (hereafter SB) problems. Each of these issues is individually important in dynamic panel regression and has received attention, particularly the SB problem on which there is a large literature. But the problems are not independent and, when they are taken together, they substantially complicate estimation and inference in dynamic panel models. The rapidly growing number of applied panel studies in growth economics, international finance, and empirical labor economics in recent years accentuates the need for these issues to be addressed in a systematic fashion. As yet, however, there have been few attempts to address these issues at the same time, and the present paper is a small step in that direction, offering some new possibilities in estimation and inference. We start by noting the following implications.

First, when there is cross section dependence in panel data, commonly used econometric estimators and tests about parameters of interest generally rely on the nuisance parameters of cross section dependence. As we will show, one of the most striking effects of cross section dependence is that the pooled ordinary least squares (OLS) estimator provides little gain in precision compared with single equation OLS when cross sectional dependence occurs but is ignored in the panel regression. Another effect is that commonly used panel unit root tests are no longer asymptotically similar. These effects are easily demonstrated using a simple but intuitive parametric structure for the cross section dependence.

Second, the well known problem of SB in least squares estimation of the coefficients in dynamic models is much more serious in panel models than it is in univariate autoregressions. We provide extensions of the Nickell (1981) bias formula for cases where there is cross section dependence, error heterogeneity and nonstationarity. In some cases the bias is so marked that the true autoregressive coefficient lies completely outside the empirical distribution of the pooled OLS estimator of the dynamic autoregressive coefficient. To address this problem, the paper introduces some new panel estimation procedures that are based on the idea of median unbiased estimation (Lehmann, 1959; Andrews, 1993).

Third, homogeneity assumptions in dynamic panel models are convenient and commonly employed to take advantage of pooling in panel regression. But these restrictions are sometimes not well supported by the data and they can produce misleading results and invalidate inference, as argued, for example, by Durlauf and Quah (1999) in connection with homogeneity restrictions used in the economic growth and convergence literatures. Of particular importance in applied work is the need to take account of cross section dependence in testing homogeneity restrictions in nonstationary panels, especially in connection with panel unit root testing. This paper shows how to test for panel unit roots in the presence of cross section dependence and proposes two types of test statistic. The first type is based on median unbiased correction after eliminating cross section dependence. The second type involves the use of meta statistics which seek to avoid small sample biases rather than correct for them.

The paper gives precedence initially to the treatment of the SB problem. This is not because the issue is more important than that of cross section dependence or homogeneity, but because the SB problem arises irrespective of homogeneity testing or the presence of cross section dependence. Further, as is already well recognised, SB can make a huge difference in applied work, as the examples of HAC and dynamic response time estimation given in the next section illustrate.¹

To handle the SB problem in dynamic panel estimation and the difficulties that can arise from it, this paper proposes some panel median unbiased estimators (MUEs) that follow the approach taken by Andrews (1993) in the time series case.² Our starting point is a panel version of the MUE of Andrews in which the innovations in the panel are assumed to be free of cross sectional dependence and the autoregressive coefficient is assumed to be homogeneous across cross sectional units. Since both these assumptions are strong and are unlikely to be satisfied in empirical work, we explore the consequences of relaxing them and develop some alternate MUE procedures that are more suitable in that event.

For this purpose, we use a generalized common time effect model to parameterize the structure of cross section dependence (see equation (6) below). This structure has been used in practical work (for example, Barro and Sala-i-Martin (1992)) because of its simplicity and economic interpretability. Also, other authors (e.g., Im, Pesaran, and Shin, 1997) have suggested this parametric structure as a possible model for cross section dependence, without providing analysis but indicating that such formulations can be expected to complicate asymptotics in both stationary and nonstationary cases. Under this structure, we find that pooling GLS (which takes account of the dependence) reduces variance, but the pooled GLS estimator suffers from downward bias. To deal with these effects of cross section dependence, we develop a panel generalized MUE and find that this procedure restores the precision gains from pooling in the panel and largely removes the bias in GLS. Next, we consider the more realistic case in empirical research where there is cross sectional dependence among the innovations and heterogeneity in the autoregressive coefficients. In this case, we provide a seemingly unrelated MUE that deals with heterogeneity and cross section dependence in much the same way as the conventional SUR estimator, while also addressing the SB problem.

In panel applications it is often of interest to test whether the data support homogeneity restrictions on the coefficients, an important example being that of panel unit roots, as mentioned above. In view of the potential gains from pooling and the changes in the limit theory in the nonstationary case, homogeneity of the autoregressive coefficients in a panel is an important restriction in dynamic panel models. In developing tests of such restrictions in dynamic panels it is particularly important in empirical applications to allow for cross section dependence. To this end, the present paper investigates the properties of Wald and Hausman-type tests of homogeneity under cross section dependence and proposes a modified Hausman test procedure that helps to deal with the effects of such dependence in testing for the presence of homogeneous unit roots. An orthogonalization procedure³ is developed, which enables the development of a general class of unit root tests for panel models when there is cross section dependence.

¹ The problem of small sample bias in the least squares estimation of the coefficients in an autoregression has a long history, two important early contributions being Hurwicz (1950) and Orcutt (1948). In simple autoregressions, asymptotic formulae for the small sample bias were worked out by Kendall (1954) and Marriott and Pope (1954). Orcutt (1948) was the first to show that fitting an intercept in an autoregression produced an additional source of bias that can exacerbate the SB problem, and this was confirmed in a later simulation study by Orcutt and Winokur (1969). The point was echoed in Andrews' (1993) more recent study, which provided further simulations that included the case of a fitted linear trend.

² Our work is also related to some recent independent work by Cermeno (1999). Using simulation methods, Cermeno investigates the use of MUE estimation in a dynamic panel regression with fixed effects, a common time effect and homogeneous trends. Our framework extends Cermeno's study by developing a class of panel MUEs that address a more general case of cross section dependence and that enable tests of homogeneity restrictions on the dynamics, including the important case of unit root homogeneity.

The remainder of the paper is organized as follows. The next section shows how even a small time series SB can make a large difference in estimation and testing in the context of panel pooling. Section 3 studies the invariance properties of the panel MUE under the assumption of cross sectional independence. Since invariance breaks down under cross sectional dependence, this section also investigates alternative invariance properties that hold in the presence of cross section dependence and proposes two new estimators for this case: a pooled feasible generalized MUE and a seemingly unrelated MUE. Section 4 considers the asymptotic properties of Wald and Hausman tests for homogeneity under cross section dependence and develops some alternate procedures that offer advantages, especially in the case of unit roots. In Section 5, we report the results of a simulation experiment examining the bias and efficiency of the various panel estimators and the performance of the tests of cross section homogeneity. Section 6 provides an empirical application of the estimators to the growth convergence problem. Section 7 concludes. Derivations and some additional technical results are given in the Appendices: A derives some invariance results; B provides extensions of the Nickell (1981) bias formula to cases where there is cross section dependence, unit root nonstationarity and heterogeneous errors; C develops limit theory for the stationary and unit root nonstationary cases; D provides an algorithm for estimating the cross section dependence coefficients.

2 Dynamic Panel Models and Bias Illustrations

2.1 Model Definitions

Three basic models are considered. These are panel versions of the models in Andrews (1993). As in that work, Gaussianity is assumed in order to construct the median unbiased estimator. Each of the basic models involves a latent panel $\{y^*_{i,t} : t = 0, 1, \ldots, T;\ i = 1, \ldots, N\}$ that is generated over time as an AR(1) with errors that are independent across section. The more complex case of cross section error dependence is taken up in Section 3.2 and allowance for more general time series effects is considered in Section 4.3.

The model for $y^*_{i,t}$ is

$$y^*_{i,t} = \rho\, y^*_{i,t-1} + u_{i,t}, \quad t = 1, \ldots, T, \quad i = 1, \ldots, N, \quad \rho \in (-1, 1], \tag{1}$$

where $u_{i,t} \sim \text{iid } N(0, \sigma_i^2)$ over $t$, $u_{i,t}$ is independent of $u_{j,s}$ for all $i \neq j$ and for all $s, t$, and the initialization is as follows:

$$y^*_{i,0} \sim \begin{cases} N\!\left(0, \dfrac{\sigma_i^2}{1 - \rho^2}\right) & \rho \in (-1, 1), \\[2mm] O_p(1) & \rho = 1. \end{cases}$$

³ After the first draft of this paper was done and it was in the final stages of completion, the authors learnt that Moon and Perron (2001) have independently proposed the same approach to unit root testing in the context of dynamic panels with multiple factors.

When $\rho \in (-1, 1)$, $y^*_{i,t}$ is a zero mean, Gaussian panel that follows an AR(1) structure over time and that is independent over $i$. When $\rho = 1$, $y^*_{i,t}$ is a Gaussian panel random walk starting from a (possibly random) initialization $y^*_{i,0}$ (not necessarily Gaussian) and that is independent over $i$. The observed panel data $\{y_{i,t} : t = 0, 1, \ldots, T;\ i = 1, \ldots, N\}$ are defined in terms of $y^*_{i,t}$ as follows:

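M1: $y_{i,t} = y^*_{i,t}$ for $t = 0, \ldots, T$ and $i = 1, \ldots, N$, and $\rho \in (-1, 1)$

M2: $y_{i,t} = \mu_i + y^*_{i,t}$ for $t = 0, \ldots, T$, $i = 1, \ldots, N$, $\mu_i \in \mathbb{R}$ and $\rho \in (-1, 1]$

M3: $y_{i,t} = \mu_i + \beta_i t + y^*_{i,t}$ for $t = 0, \ldots, T$, $i = 1, \ldots, N$, $\mu_i, \beta_i \in \mathbb{R}$, and $\rho \in (-1, 1]$.

In each case, there is an equivalent dynamic panel representation in terms of $y_{i,t}$:

M1: $y_{i,t} = \rho\, y_{i,t-1} + u_{i,t}$ for $t = 1, \ldots, T$, $i = 1, \ldots, N$, and $\rho \in (-1, 1)$

M2: $y_{i,t} = \underline{\mu}_i + \rho\, y_{i,t-1} + u_{i,t}$ for $t = 1, \ldots, T$, $i = 1, \ldots, N$, with $\underline{\mu}_i = \mu_i (1 - \rho)$ and $\rho \in (-1, 1]$

M3: $y_{i,t} = \underline{\mu}_i + \underline{\beta}_i t + \rho\, y_{i,t-1} + u_{i,t}$ for $t = 1, \ldots, T$, $i = 1, \ldots, N$, with $\underline{\mu}_i = \mu_i (1 - \rho) + \rho \beta_i$, $\underline{\beta}_i = \beta_i (1 - \rho)$, and $\rho \in (-1, 1]$.

In M1-M3, the initialization is $y_{i,0} \sim N(0, \sigma_i^2/(1 - \rho^2))$ when $\rho \in (-1, 1)$ and $y_{i,0} = O_p(1)$ when $\rho = 1$.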
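As a check on the M3 representation, the following short derivation substitutes $y^*_{i,t} = y_{i,t} - \mu_i - \beta_i t$ into (1). It uses only the definitions already given; setting $\beta_i = 0$ gives the M2 case.

```latex
\begin{aligned}
y_{i,t} - \mu_i - \beta_i t
  &= \rho \bigl( y_{i,t-1} - \mu_i - \beta_i (t - 1) \bigr) + u_{i,t} \\
y_{i,t}
  &= \underbrace{\mu_i (1 - \rho) + \rho \beta_i}_{\underline{\mu}_i}
   + \underbrace{\beta_i (1 - \rho)}_{\underline{\beta}_i}\, t
   + \rho\, y_{i,t-1} + u_{i,t}.
\end{aligned}
```

When $\rho = 1$ the trend coefficient $\underline{\beta}_i$ vanishes and the intercept reduces to $\beta_i$, so M3 becomes a random walk with drift.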

2.2 Pooled Estimation and Bias Illustrations

Denote the pooled panel least squares (POLS) estimator of $\rho$ by $\hat\rho_{pols}$ in each of the three models M1, M2 and M3. In M2, for instance, $\hat\rho_{pols}$ has the form

$$\hat\rho_{pols} = \frac{\sum_{i=1}^{N} \sum_{t=1}^{T} (y_{it-1} - y_{i\cdot -1})(y_{it} - y_{i\cdot})}{\sum_{i=1}^{N} \sum_{t=1}^{T} (y_{it-1} - y_{i\cdot -1})^2}, \quad \text{where } y_{i\cdot} = T^{-1} \sum_{t=1}^{T} y_{it} \text{ and } y_{i\cdot -1} = T^{-1} \sum_{t=1}^{T} y_{it-1}. \tag{2}$$

The exact quantiles of $\hat\rho_{pols}$ were computed by simulation using 100,000 replications for a selection of $N$, $T$, and $\rho$ values and for $\sigma_i^2 = 1$. We report some summary statistics here (detailed results are available upon request) and make the following general observations: (i) the median values of the pooled OLS estimators are less than the true values for all models and all cases; (ii) the difference between the median value and the true value (which we call the median bias) is increasing as the true value of $\rho$ increases for all configurations of $(N, T)$.
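The simulation design described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' GAUSS code: the function names `simulate_panel`, `pols` and `pols_quantiles` are hypothetical, the random start $y^*_{i,0} = 0$ for $\rho = 1$ is one admissible choice, the draws of $\mu_i$ and $\beta_i$ are arbitrary, and far fewer than 100,000 replications are used.

```python
import numpy as np

def simulate_panel(N, T, rho, model="M2", rng=None):
    """Draw y[i, t], t = 0..T, from M1, M2 or M3 with sigma_i^2 = 1."""
    rng = np.random.default_rng() if rng is None else rng
    y_star = np.empty((N, T + 1))
    if abs(rho) < 1:
        # Stationary initialization N(0, 1 / (1 - rho^2)).
        y_star[:, 0] = rng.standard_normal(N) / np.sqrt(1.0 - rho**2)
    else:
        y_star[:, 0] = 0.0  # assumption: a fixed O_p(1) start in the unit root case
    for t in range(1, T + 1):
        y_star[:, t] = rho * y_star[:, t - 1] + rng.standard_normal(N)
    # Arbitrary illustrative intercepts and trends (the estimator removes them).
    mu = rng.standard_normal((N, 1)) if model in ("M2", "M3") else 0.0
    beta = rng.standard_normal((N, 1)) if model == "M3" else 0.0
    return mu + beta * np.arange(T + 1) + y_star

def pols(y, model="M2"):
    """Pooled OLS of y_it on y_it-1 after removing fitted intercepts/trends."""
    y_lag, y_cur = y[:, :-1], y[:, 1:]
    T = y_cur.shape[1]
    if model == "M2":
        Z = np.ones((T, 1))
    elif model == "M3":
        Z = np.column_stack([np.ones(T), np.arange(1, T + 1)])
    else:
        Z = None
    if Z is not None:
        # Symmetric annihilator of Z: demeans (M2) or detrends (M3) each series.
        M = np.eye(T) - Z @ np.linalg.solve(Z.T @ Z, Z.T)
        y_lag, y_cur = y_lag @ M, y_cur @ M
    return float(np.sum(y_lag * y_cur) / np.sum(y_lag**2))

def pols_quantiles(N, T, rho, model, reps=2000, seed=0):
    """Monte Carlo 5%, 50% and 95% quantiles of the POLS estimator."""
    rng = np.random.default_rng(seed)
    draws = [pols(simulate_panel(N, T, rho, model, rng), model)
             for _ in range(reps)]
    return np.quantile(draws, [0.05, 0.5, 0.95])

if __name__ == "__main__":
    print(pols_quantiles(N=10, T=50, rho=0.9, model="M2"))
```

For model M2 the `pols` function coincides with formula (2), since the annihilator of a constant simply subtracts the time averages $y_{i\cdot}$ and $y_{i\cdot -1}$.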

Table 1 shows the bias of the POLS estimator for each model when $\rho = 0.9$. For model M1, the bias of the OLS estimator vanishes for moderate sizes of $N$ and $T$. For example, the median values of $\hat\rho_{pols}$ are 0.88 for $N = 1$, $T = 50$, 0.89 for $N = 1$, $T = 100$ and 0.90 for $N = 10$, $T = 50$. Also, the empirical distribution of $\hat\rho_{pols}$ becomes tighter as $N$ increases. In contrast to model M1, $\hat\rho_{pols}$ suffers from substantial SB in model M2 even when $N$ or $T$ is moderately large. But, as in model M1, the distribution of $\hat\rho_{pols}$ concentrates quickly as $N$ increases. In several cases, the bias and concentration of the POLS estimator are such that the true value of $\rho$ lies almost completely outside the empirical distribution for moderate $N$. For example, for $T = 50$, the upper 95% points of $\hat\rho_{pols}$ are 0.94, 0.89, 0.88 and 0.88 for $N = 1, 10, 20$, and 30, respectively, when $\rho = 0.9$. Even for $T = 200$ and $N = 30$, 95% of the distribution of $\hat\rho_{pols}$ is below the true value. This problem becomes more severe for model M3, where the upper 95% points of $\hat\rho_{pols}$ for $T = 50$ are 0.904, 0.843, 0.831 and 0.825 for $N = 1, 10, 20$, and 30.


Table 1: Downward Bias in Dynamic Panel Estimation

Part A: Quantiles of $\hat\rho_{pols}$ for $\rho = 0.9$

Sample               Model M1                Model M2                Model M3
                   5%     50%     95%      5%     50%     95%      5%     50%     95%
N=1, T=50       0.710   0.883   0.962   0.628   0.830   0.937   0.548   0.772   0.904
N=1, T=100      0.787   0.891   0.948   0.749   0.868   0.935   0.713   0.842   0.920
N=1, T=200      0.829   0.896   0.938   0.814   0.885   0.931   0.798   0.874   0.924
N=10, T=50      0.858   0.898   0.928   0.799   0.850   0.889   0.735   0.795   0.843
N=10, T=100     0.874   0.899   0.920   0.847   0.877   0.902   0.820   0.853   0.882
N=10, T=200     0.882   0.900   0.915   0.870   0.890   0.906   0.858   0.879   0.897
N=20, T=50      0.872   0.899   0.921   0.816   0.850   0.880   0.755   0.796   0.831
N=20, T=100     0.882   0.900   0.915   0.857   0.878   0.896   0.830   0.854   0.874
N=20, T=200     0.888   0.900   0.911   0.876   0.890   0.902   0.864   0.878   0.892
N=30, T=50      0.878   0.900   0.917   0.824   0.851   0.875   0.763   0.796   0.825
N=30, T=100     0.885   0.900   0.913   0.861   0.878   0.893   0.835   0.853   0.870
N=30, T=200     0.890   0.900   0.909   0.879   0.890   0.900   0.868   0.879   0.890

Part B: Quantiles of $\hat h$ when $\rho = 0.9$ and $h = 6.579$

Sample               Model M1                Model M2                Model M3
                   5%     50%     95%      5%     50%     95%      5%     50%     95%
N=1, T=50       2.027   5.569  18.036   1.487   3.709  10.730   1.153   2.685   6.905
N=1, T=100      2.890   6.029  13.034   2.403   4.895  10.393   2.051   4.033   8.342
N=1, T=200      3.704   6.303  10.783   3.366   5.670   9.698   3.071   5.130   8.734
N=10, T=50      4.532   6.465   9.244   3.086   4.250   5.897   2.248   3.024   4.071
N=10, T=100     5.130   6.502   8.332   4.184   5.293   6.753   3.487   4.362   5.518
N=10, T=200     5.524   6.549   7.764   4.995   5.921   7.041   4.520   5.352   6.364
N=20, T=50      5.073   6.479   8.454   3.407   4.257   5.422   2.462   3.033   3.745
N=20, T=100     5.530   6.550   7.799   4.477   5.310   6.305   3.717   4.377   5.164
N=20, T=200     5.831   6.557   7.410   5.254   5.922   6.689   4.745   5.348   6.042
N=30, T=50      5.313   6.556   8.019   3.573   4.306   5.171   2.561   3.046   3.614
N=30, T=100     5.698   6.554   7.617   4.645   5.321   6.095   3.847   4.372   4.973
N=30, T=200     5.957   6.573   7.242   5.391   5.934   6.555   4.882   5.360   5.920

Part C: Quantiles of $\widehat{lrv}/lrv$ when $\rho = 0.9$ and $lrv = 100$

Sample               Model M1                Model M2                Model M3
                   5%     50%     95%      5%     50%     95%      5%     50%     95%
N=1, T=50       0.113   0.763   7.047   0.064   0.339   2.580   0.040   0.182   1.091
N=1, T=100      0.206   0.863   3.880   0.147   0.575   2.501   0.109   0.403   1.643
N=1, T=200      0.337   0.918   2.608   0.282   0.753   2.127   0.235   0.616   1.726
N=10, T=50      0.501   0.965   1.933   0.235   0.425   0.810   0.129   0.220   0.390
N=10, T=100     0.620   0.986   1.565   0.420   0.656   1.035   0.292   0.449   0.696
N=10, T=200     0.717   0.994   1.385   0.587   0.810   1.137   0.489   0.669   0.928
N=20, T=50      0.615   0.981   1.596   0.281   0.432   0.670   0.152   0.223   0.331
N=20, T=100     0.711   0.988   1.382   0.478   0.658   0.908   0.333   0.449   0.616
N=20, T=200     0.791   0.996   1.264   0.649   0.815   1.029   0.537   0.670   0.845
N=30, T=50      0.671   0.990   1.479   0.307   0.435   0.626   0.164   0.225   0.309
N=30, T=100     0.759   0.993   1.302   0.510   0.663   0.866   0.352   0.453   0.587
N=30, T=200     0.824   0.993   1.201   0.678   0.814   0.986   0.557   0.670   0.810

The bias and concentration of the pooled estimator $\hat\rho_{pols}$ are pertinent in applications where they influence the distribution of derived statistics such as impulse responses, cumulative impulse response functions, the half-life of a unit shock ($h$) and the long run variance ($lrv$). We provide some brief illustrations of these effects in the case of $h$ and $lrv$. In the panel AR models above, the $h$ and $lrv$ estimates based on $\hat\rho_{pols}$ are $\hat h = \ln 0.5 / \ln \hat\rho_{pols}$ and $\widehat{lrv} = 1/(1 - \hat\rho_{pols})^2$. As is apparent from Tables 1(B) and 1(C), even a small SB can have large effects on these derived functions in the panel case because of the concentration of the estimate $\hat\rho_{pols}$ and the nonlinearity of the functions. As discussed in the last paragraph, the upper 95% point of the distribution of $\hat\rho_{pols}$ is smaller than $\rho$ when $N$ is moderately large, and then 95% of the distribution of $\hat h$ is less than the true half-life $h$. In model M3, for example, when $\rho = 0.9$, $N = 10$ and $T = 100$, 95% of the distribution of $\hat h$ is less than 5.518, whereas the actual half-life is $h = 6.579$. Similarly, for the same model and parameter values, 95% of the distribution of $\widehat{lrv}/lrv$ lies below 0.696. Even for $N = 30$, $T = 200$, 95% of the distribution of $\widehat{lrv}/lrv$ lies below 0.89. Table 1(C) shows how serious the bias in $\widehat{lrv}$ can be. When $T = 50$ and $N = 1$, the median value of $\widehat{lrv}$ for model M1 is about 76% of the true $lrv$. For model M3, it is less than 20% of the true value when $T = 50$ and $N = 1$, and still only about 45% when $T = 100$ and $N = 30$. Thus, when estimation of the $lrv$ is based on panel data with fitted fixed effects or individual trends, the estimated $lrv$ suffers from serious downward bias. We can expect test statistics that rely on these $lrv$ estimates to be correspondingly affected.
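The two derived statistics are simple functions of the autoregressive coefficient, so the amplification of a small bias is easy to see numerically. The sketch below uses hypothetical function names (`half_life`, `long_run_variance`) and takes the median value 0.853 from Table 1(A) (model M3, $N = 10$, $T = 100$) as an input. Because both functions are increasing on $(0, 1)$, the median of the transformed estimator is the transform of the median, up to the rounding and simulation error in the table.

```python
import math

def half_life(rho):
    """Half-life of a unit shock in an AR(1): h = ln 0.5 / ln rho."""
    return math.log(0.5) / math.log(rho)

def long_run_variance(rho):
    """Long run variance of an AR(1) with unit innovation variance."""
    return 1.0 / (1.0 - rho) ** 2

if __name__ == "__main__":
    true_rho = 0.9
    median_rho_m3 = 0.853  # Table 1(A), model M3, N = 10, T = 100
    print("true half-life:       ", half_life(true_rho))
    print("half-life at median:  ", half_life(median_rho_m3))
    print("lrv ratio at median:  ",
          long_run_variance(median_rho_m3) / long_run_variance(true_rho))
```

A median bias of less than 0.05 in $\hat\rho_{pols}$ thus removes roughly a third of the half-life and more than half of the long run variance, in line with the medians reported in Parts B and C of Table 1.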

3 Panel Median Unbiased Estimation

This section proposes three panel median unbiased estimators. The first estimator is a panel exactly median unbiased (PEMU) estimator, constructed under the assumptions of a homogeneous AR(1) parameter and cross sectional independence. This estimator is a panel version of Andrews' exactly median unbiased estimator in the time series case. It is of interest to see how this procedure is affected by panel observations. As mentioned in the introduction, Cermeno (1999) has independently proposed the use of a PEMU estimator for dynamic panel models with a common time effect and homogeneous trends and shows in simulations that the approach can work well in models of this type.

The PEMU estimator is based on the assumption of cross section independence (or the presence of a common time effect) which will often be too strong in practical work, particularly with macroeconomic panels. In such applications, PEMU is likely to be less relevant than our second and third estimators, which are designed to take account of cross section dependence that is more general than a common time effect. We will calibrate the performance of the new median unbiased estimators against that of the conventional POLS estimator in cases where there is cross sectional dependence amongst the regression errors. This comparison will highlight the gains of working with median unbiased estimators in the panel context, especially when there is cross section dependence.

3.1 Panel Exactly Median Unbiased Estimation

As discussed in Andrews (1993), it is useful in the construction of median unbiased estimators for the distribution of the least squares estimator to be invariant to scale and other nuisance parameters. It is well known (e.g. Dickey and Fuller, 1979) that least squares estimates of the autoregressive coefficient in pure time series versions of models M1, M2 and M3 satisfy such distributional invariance properties. These invariance results extend to the pooled panel forms of the least squares estimators in models M1, M2 and M3 under certain conditions, which we now provide. The following property is a panel version of the property given in Andrews (1993) for the time series case. As before, the POLS estimator of $\rho$ is generally denoted by $\hat\rho_{pols}$ for each of the three models M1, M2, and M3; but when there is possible ambiguity, we use an additional subscript and write $\hat\rho_{polsj}$ for the POLS estimator of $\rho$ in model $j$.

Invariance Property IP1: Under the assumption of cross section independence, the distribution of $\hat\rho_{polsj}$ depends only on $\rho$ when model $j$ is correct and the error variance $\sigma_i^2 = \sigma^2$ for all $i$. When $y_{it}$ is stationary, it does not depend on the common variance $\sigma_i^2$ for model M1, or $(\sigma_i^2, \mu_i)$ for model M2, or $(\sigma_i^2, \mu_i, \beta_i)$ for model M3, nor on the value of $y_{i0}$ when $\rho = 1$ and $y_{it}$ is non-stationary.

The common variance condition in IP1 is a strong one and will be inappropriate in many applications. It may be relaxed by allowing the individual error variances $\sigma_i^2$ to be iid draws from a distribution with common scale. For example, if $\sigma_i^2/\sigma^2$ are iid $\chi^2_1$, then $u_{it}/\sigma = (u_{it}/\sigma_i)(\sigma_i/\sigma)$, which is independent of nuisance parameters. The numerator and denominator of $\hat\rho_{pols}$ may then be rescaled by $1/\sigma^2$ and it is apparent that IP1 continues to hold, as shown in the Appendix. For more general cases of variation in $\sigma_i^2$ over $i$, we may use weighted least squares in the construction of the panel estimator $\hat\rho_{pols}$. This extension and other generalizations of $\hat\rho_{pols}$ that are better suited to empirical applications are discussed later. For the time being, we confine our discussion to the estimator $\hat\rho_{pols}$ and those cases where property IP1 holds.

Property IP1 enables the construction of a panel version of the exactly median unbiased (PEMU) estimator in Andrews (1993). We start by noting that $\hat\rho_{pols}$ has a median function $m(\rho) = m_{T,N}(\rho)$ which simulation shows to be strictly increasing⁴ in $\rho$ on the parameter space $\rho \in (-1, 1]$. Using this function (which depends on $T$ and $N$), the panel median-unbiased estimator $\hat\rho_{pemu}$ can be defined as follows:

$$\hat\rho_{pemu} = \begin{cases} 1 & \text{if } \hat\rho_{pols} > m(1), \\ m^{-1}(\hat\rho_{pols}) & \text{if } m(-1) < \hat\rho_{pols} \le m(1), \\ -1 & \text{if } \hat\rho_{pols} \le m(-1), \end{cases} \tag{3}$$

where $m(-1) = \lim_{\rho \to -1} m(\rho)$ and $m^{-1}$ is the inverse function of $m(\cdot) = m_{T,N}(\cdot)$ so that $m^{-1}(m(\rho)) = \rho$. Furthermore, a $100(1-p)\%$ confidence interval for $\rho$ in model $j$ can be constructed as follows. Let $q_L(\cdot)$ and $q_U(\cdot)$ be the lower and upper quantile functions for $\hat\rho_{pols}$. Define

$$\hat c^{\,L}_{PU} = \begin{cases} 1 & \text{if } \hat\rho_{pols} > q_U(1), \\ q_U^{-1}(\hat\rho_{pols}) & \text{if } q_U(-1) < \hat\rho_{pols} \le q_U(1), \\ -1 & \text{if } \hat\rho_{pols} \le q_U(-1), \end{cases} \tag{4}$$

$$\hat c^{\,U}_{PU} = \begin{cases} 1 & \text{if } \hat\rho_{pols} > q_L(1), \\ q_L^{-1}(\hat\rho_{pols}) & \text{if } q_L(-1) < \hat\rho_{pols} \le q_L(1), \\ -1 & \text{if } \hat\rho_{pols} \le q_L(-1). \end{cases} \tag{5}$$

Then $\hat c^{\,U}_{PU}$ and $\hat c^{\,L}_{PU}$ provide upper and lower confidence limits and the $100(1-p)\%$ confidence interval for $\rho$ is $\{\rho : \hat c^{\,L}_{PU} \le \rho \le \hat c^{\,U}_{PU}\}$. This construction follows Andrews (1993). The intervals are obtained in precisely the same way as in that paper, but use tables of the quantiles of the panel estimator $\hat\rho_{pols}$.

⁴ An analytic demonstration of this property would be useful but is not presently available either in the panel […]
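The inversion in (3) can be sketched numerically once the median function has been tabulated on a grid of $\rho$ values. The code below is a minimal sketch under stated assumptions: the function name `median_unbiased` is hypothetical, the grid is assumed to run from a point just above $-1$ (standing in for the limit $m(-1)$) up to $1$, and linear interpolation between grid points is an assumption of the sketch. The toy median function in the usage example, $\rho - (1 + 3\rho)/T$, is only a convenient stand-in and is not the simulated panel median function of the paper.

```python
import numpy as np

def median_unbiased(rho_hat, rho_grid, median_grid):
    """Evaluate the truncated inverse in (3) by interpolation on a grid.

    rho_grid    : increasing grid of rho values, from just above -1 to 1
    median_grid : median of the least squares estimator at each grid point
    """
    rho_grid = np.asarray(rho_grid, dtype=float)
    median_grid = np.asarray(median_grid, dtype=float)
    if np.any(np.diff(median_grid) <= 0):
        raise ValueError("median function must be strictly increasing")
    if rho_hat > median_grid[-1]:      # estimate above m(1)
        return 1.0
    if rho_hat <= median_grid[0]:      # estimate at or below m(-1)
        return -1.0
    # Swap the roles of the two grids to interpolate the inverse function.
    return float(np.interp(rho_hat, median_grid, rho_grid))

if __name__ == "__main__":
    T = 50
    grid = np.linspace(-0.999, 1.0, 2000)
    toy_median = grid - (1.0 + 3.0 * grid) / T   # illustrative stand-in only
    print(median_unbiased(0.83, grid, toy_median))
```

The confidence limits in (4) and (5) can be obtained with the same function by passing tabulated values of $q_U$ (for the lower limit) or $q_L$ (for the upper limit) in place of the median grid.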

3.2 Panel Feasible Generalized Median Unbiased Estimator

The assumption of no cross sectional correlation among the regression residuals is a strong one and is unlikely to hold in many applications. When the structure of cross sectional dependence among the regression errors is completely unknown, it is generally infeasible to deal with the correlations because of degrees of freedom constraints. Hence, it is common to assume some simplifying form of dependence structure. The most conventional way to handle cross section dependence has been to include a common time dummy in the panel regression. The justification for the common time effect is that certain co-movements of multivariate time series may be due to a common factor. For example, in cross country panels it might be argued that the time dummy represents a common international effect (e.g. a global shock or a common business cycle factor), or in a panel study of purchasing power parity it may represent the numeraire currency.

The model we use here allows for a common time effect that can impact individual series differently. Specifically, the model for the regression errors has the form

$$u_{it} = \delta_i \theta_t + \varepsilon_{it}, \quad \theta_t \sim \text{iid } N(0, 1) \text{ over } t, \tag{6}$$

in which $\theta_t$ is a common time effect, whose variance is normalized to be unity for identification purposes and whose coefficients, $\delta_i$, may be regarded as 'idiosyncratic share' parameters that measure the impact of the common time effect on series $i$. The $\delta_i$ are assumed to be non-stochastic and we let $\delta = (\delta_1, \ldots, \delta_N)$. In (6) the general error component $\varepsilon_{it}$ is assumed to satisfy

$$\varepsilon_{i,t} \sim \text{iid } N(0, \sigma_i^2) \text{ over } t, \text{ and } \varepsilon_{i,t} \text{ is independent of } \varepsilon_{j,s} \text{ and } \theta_s \text{ for all } i \neq j \text{ and for all } s, t.$$

⁴ (continued) […] values $T \ge 20$ and $N \ge 5$. There seems to be some evidence from simulations that the property fails for small $T$ when $N = 1$. Andrews (1993, fn. 4) reports that the 0.95 quantile function appears to dip slightly for values of $\rho$ close to unity for small values of $T$.

In this formulation, the source of the cross sectional dependence is generated from the common stochastic series $\theta_t$ and the extent of the dependence is measured by the coefficients $\delta_i$. In particular, the covariance between $u_{it}$ and $u_{jt}$ ($i \neq j$) is given by

$$E(u_{it} u_{jt}) = \delta_i \delta_j. \tag{7}$$

There is no cross sectional correlation when $\delta_i = 0$ for all $i$, and there is identical cross sectional correlation when $\delta_i = \delta_j = \delta_0$ for all $i$ and $j$. Thus, the degree of cross sectional correlation is controlled by the components of $\delta$. Setting $u_t = (u_{1t}, \ldots, u_{Nt})'$ we have the conditional covariance matrix

$$V_u = E\left(u_t u_t' \mid \sigma_1^2, \ldots, \sigma_N^2\right) = \Sigma + \delta\delta', \quad \Sigma = \mathrm{diag}\left(\sigma_1^2, \ldots, \sigma_N^2\right). \tag{8}$$

The model (6) can be regarded as a single factor model in which $\theta_t$ is the common factor and $\delta_i$ is the factor loading for series $i$. It has been used in empirical research in studying growth convergence by Barro and Sala-i-Martin (1992). More general versions of this model that allow for weakly dependent time series effects and multiple factors have been considered in recent work by Bai and Ng (2001) and Moon and Perron (2001) that concentrates on model determination issues relating to the number of factors and panel unit root testing. The models used by these authors are more complex than (6), especially with regard to time series properties. Nonetheless, (6) is general enough to allow for interesting cases of high and low cross sectional dependence and yet simple enough to enable us to develop good procedures for bias removal in dynamic panel regressions where cross section dependence arises. In the panel unit root case, we show later in the paper that time series effects in $\varepsilon_{it}$ can be treated by a simple augmented dynamic panel regression and that time series effects in $\theta_t$ can be treated simply by projecting on the space orthogonal to $\delta$.
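The covariance structure in (7) and (8) can be checked by simulating the single factor model (6). The sketch below uses hypothetical function names (`factor_errors`, `error_covariance`) and arbitrary illustrative values of $\delta$ and $\sigma_i^2$; it compares the sample second moment matrix of simulated errors with $\Sigma + \delta\delta'$.

```python
import numpy as np

def factor_errors(delta, sigma2, T, rng):
    """Simulate u[i, t] = delta_i * theta_t + eps_it as in (6)."""
    delta = np.asarray(delta, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    theta = rng.standard_normal(T)                      # common factor, variance 1
    eps = rng.standard_normal((delta.size, T)) * np.sqrt(sigma2)[:, None]
    return delta[:, None] * theta[None, :] + eps

def error_covariance(delta, sigma2):
    """V_u = Sigma + delta delta' as in (8)."""
    delta = np.asarray(delta, dtype=float)
    return np.diag(np.asarray(sigma2, dtype=float)) + np.outer(delta, delta)

if __name__ == "__main__":
    delta = [1.0, 2.0, 0.0]      # illustrative loadings; series 3 is unaffected
    sigma2 = [1.0, 1.0, 1.0]
    u = factor_errors(delta, sigma2, T=200_000, rng=np.random.default_rng(0))
    print(np.round(u @ u.T / u.shape[1], 2))   # sample second moments
    print(error_covariance(delta, sigma2))     # Sigma + delta delta'
```

A series with $\delta_i = 0$ is uncorrelated with every other series, while the off-diagonal entries for the remaining series equal $\delta_i \delta_j$, as in (7).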

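As in the earlier case with cross sectional independence, it will be convenient in what follows to assume that the individual error variances $\sigma_i^2$ are iid draws from a distribution with common scale. More particularly, we assume that $\tau_i = \sigma_i^2/\sigma^2$ are iid draws from an independent distribution with density $f(\tau)$ that does not involve further nuisance parameters and whose first moment is finite. Then, the standardized error component

$$\frac{u_{it}}{\sigma} = \underline{\delta}_i \theta_t + \frac{\varepsilon_{it}}{\sigma_i} \frac{\sigma_i}{\sigma}, \quad \text{where } \underline{\delta}_i = \delta_i/\sigma,$$

has unconditional variance matrix

$$E\left(\frac{u_t u_t'}{\sigma^2}\right) = \int_0^\infty \left[\tau I + \underline{\delta}\,\underline{\delta}'\right] f(\tau)\, d\tau = E(\tau)\, I_N + \underline{\delta}\,\underline{\delta}', \quad \text{with } \underline{\delta} = \delta/\sigma.$$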


With this formulation for the error variances, the numerator and denominator of $\hat\rho_{pols}$ may be rescaled by $1/\sigma^2$, giving some invariance characteristics to the panel estimator $\hat\rho_{pols}$ and stronger invariance properties to the panel generalized least squares estimator $\hat\rho_{pgls}$ defined by

$$\hat\rho_{pgls} = \frac{\sum_{t=1}^{T} \hat y_{t-1}' V_u^{-1} \hat y_t}{\sum_{t=1}^{T} \hat y_{t-1}' V_u^{-1} \hat y_{t-1}}, \tag{9}$$

where $\hat y_t = (\hat y_{1t}, \ldots, \hat y_{Nt})'$ and where $\hat y_{it}$ denotes $y_{it}$ or demeaned or detrended $y_{it}$, respectively for models M1, M2 and M3. In particular, we have the following property.
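Formula (9) translates directly into array operations. The following is a minimal sketch with a hypothetical function name `pgls`; it assumes the caller has already demeaned or detrended the data as the model requires and supplies the lagged and current observations as two $N \times T$ arrays.

```python
import numpy as np

def pgls(y_lag, y_cur, V_u):
    """Panel GLS estimate of rho as in (9).

    y_lag, y_cur : (N, T) arrays holding y_{i,t-1} and y_{i,t}, t = 1..T,
                   already demeaned or detrended where the model requires it
    V_u          : (N, N) cross section error covariance matrix
    """
    V_inv = np.linalg.inv(V_u)
    # sum_t y_{t-1}' V^{-1} y_t  and  sum_t y_{t-1}' V^{-1} y_{t-1}
    num = np.einsum("it,ij,jt->", y_lag, V_inv, y_cur)
    den = np.einsum("it,ij,jt->", y_lag, V_inv, y_lag)
    return float(num / den)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    lag = rng.standard_normal((3, 40))
    cur = rng.standard_normal((3, 40))
    # With V_u = I the estimator reduces to the pooled least squares ratio.
    print(pgls(lag, cur, np.eye(3)), np.sum(lag * cur) / np.sum(lag**2))
```

With $V_u$ proportional to the identity matrix the weights cancel and the estimator collapses to pooled least squares, which is why the gains from GLS arise only when there is cross section dependence or variance heterogeneity.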

Invariance Property IP2: Under cross sectional dependence of the form (6), the distribution of $\hat\rho_{polsj}$ depends only on $(\rho, \underline{\delta} = \delta/\sigma)$ when model $j$ is correct and the error variance ratios $\tau_i = \sigma_i^2/\sigma^2$ are iid draws from an independent distribution with density $f(\tau)$ that does not involve further nuisance parameters. Further, the distribution of the panel GLS estimator $\hat\rho_{pgls}$ depends only on $\rho$ when model $j$ is correct. When $\rho = 1$ and $y_{it}$ is non-stationary, the distributions of $\hat\rho_{pols}$ and $\hat\rho_{pgls}$ for models M2 and M3 do not depend on the value of $y_{i0}$.

Appendix B analyzes the bias of $\hat\rho_{pols}$ and shows that to first order this is the same as the conventional Nickell (1981) bias under cross section independence and does not depend on the (standardized) cross section parameters $\underline{\delta}_i$ to $O(1/T)$. However, the bias and the distribution of $\hat\rho_{pols}$ do depend on $\underline{\delta}_i$, as is apparent from equation (61) in Appendix B. On the other hand, the distribution of the panel GLS estimator depends only on $\rho$. Accordingly, we now propose an iterative procedure that involves the use of a feasible GLS estimator, $\hat\rho_{pfgls}$, whose form is specified below in (10). Our objective is to reduce the SB problems of these least squares procedures by constructing a feasible generalized version of the PMU estimator of $\rho$.

The first stage in this iteration uses the residuals from a panel regression in which we use our median unbiased estimator $\hat\rho_{pemu}$ rather than OLS to reduce the SB problems in this primary stage. Simulations we have conducted that are reported below (see Fig. 2) indicate that use of the PMU estimator in the first stage helps to remove bias and improve estimates of the error variance matrix even in the presence of cross section dependence. The next stage of the iteration involves the construction of a panel feasible generalized median unbiased (PFGMU) estimator that utilizes this estimated error covariance matrix. In this construction, we use the median function $m(\rho) = m_{T,N}(\rho)$ of the estimator $\hat\rho_{pfgls}$, which simulations show to be strictly increasing in $\rho$ on the parameter space $\rho \in (-1, 1]$. Using this median function (which depends on $T$ and $N$), the panel feasible generalized median-unbiased estimator, $\hat\rho_{pfgmu}$, can be defined as in (3). The process can be continued, revising the estimate of the error covariance matrix in each iteration.

To Þx ideas, the steps in the iteration are laid out as follows:

Step 1: Obtain the estimator ˆρ_pemuand using the residuals from this regression construct the error covariance matrix estimateVbpemu.

Figure 1: Empirical Distributions of Single Equation OLS, POLS and PEMU under No Cross Sectional Dependence (T = 100, N = 20, ρ = 0.9). [Plotted series: PEMU, POLS, Single OLS.]

Step 2: Using $\hat V_{pemu}$, perform panel generalized least squares as in (9) and obtain the PFGLS estimate of $\rho$ defined by

$$\hat\rho_{pfgls} = \frac{\sum_{t=1}^T \hat y_{t-1}' \hat V_{pemu}^{-1} \hat y_t}{\sum_{t=1}^T \hat y_{t-1}' \hat V_{pemu}^{-1} \hat y_{t-1}}. \qquad (10)$$

Step 3: The panel feasible generalized median-unbiased estimator (PFGMU) is now calculated as $\hat\rho_{pfgmu} = m^{-1}(\hat\rho_{pfgls})$ just as in (3) but using the median function $m(\rho) = m_{T,N}(\rho)$ of the estimator $\hat\rho_{pfgls}$.

Step 4: Repeat Steps 1-3 (using updated estimates of $\rho$ in the first stage rather than $\hat\rho_{pemu}$) until $\hat\rho_{pfgmu}$ converges.
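The GLS ratio used in Step 2 is straightforward to compute once a covariance estimate is in hand. The following is a minimal Python sketch, not the authors' code: the function names `residual_covariance` and `pooled_gls_rho` and the array layout (rows are time periods, columns are cross section units) are assumptions made for illustration. The median-function inversion of Step 3 is omitted because it relies on simulated medians that are not reproduced here.

```python
import numpy as np

def residual_covariance(y, rho):
    """Moment estimate of V_u from residuals u_t = y_t - rho * y_{t-1}.

    y is a (T+1) x N array with rows y_0, ..., y_T (already demeaned or
    detrended if the model requires it); this layout is an assumption.
    """
    u = y[1:] - rho * y[:-1]
    return u.T @ u / u.shape[0]

def pooled_gls_rho(y, V):
    """Ratio sum_t y_{t-1}' V^{-1} y_t / sum_t y_{t-1}' V^{-1} y_{t-1}."""
    V_inv = np.linalg.inv(V)
    lag, cur = y[:-1], y[1:]
    num = np.einsum("ti,ij,tj->", lag, V_inv, cur)
    den = np.einsum("ti,ij,tj->", lag, V_inv, lag)
    return num / den
```

With `V` set to the identity matrix the ratio reduces to pooled OLS, which gives a quick sanity check on any implementation.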

Fig. 1 displays a kernel estimate of the distribution of POLS based on 100,000 replications with N = 20, T = 100, ρ = 0.9 when there is no cross sectional dependence. Apparently, the POLS estimator $\hat\rho_{pols}$ is more concentrated than single equation OLS (which does not use the additional cross section data) but is badly biased downwards. The bias is sufficiently serious that almost the entire distribution of $\hat\rho_{pols}$ lies below the true value of $\rho$.

Fig. 2 shows the distributions of the POLS and PMU estimators for the same parameter configuration as Fig. 1 and based on the same number of replications, but with high cross sectional correlation. As shown in Appendix B, the POLS bias in the case of cross section dependence is the same to first order as the bias in the cross section independent case, and this bias equivalence between the two cases is borne out in the simulation results. As is apparent from Fig. 2, the main effect of the cross sectional dependence is to increase the variation of both the POLS and PMU estimators. In fact, in the displayed case (where the average cross section correlation is around 0.82) the POLS and PMU estimators show only a slight gain in concentration over single equation OLS. In other words, if there is high cross sectional correlation, there is not much efficiency gain from pooling in the POLS estimator. Fig. 3 shows the distribution of the POLS estimator in which a common time effect (CTE) has been estimated. While this estimator is obviously inappropriate under the general form of cross section dependence considered in (6), it is a commonly used procedure in practice and is applicable when the elements of δ all take on a common value. As is apparent from Fig. 3, this estimator successfully reduces variance even though the presence of a common time effect in estimation provides only a crude approximation to the error structure (6).

Figure 2: Empirical Distributions of POLS, PFGLS, and PFGMU under High Cross Section Dependence (T = 100, N = 20, ρ = 0.9). [Plotted series: Single OLS, POLS, PMU, PFGLS, PFGMU.]

Figure 3: Same as in Fig. 2 with the addition of POLS with a common time effect (CTE) under High Cross Section Dependence (T = 100, N = 20, ρ = 0.9). [Plotted series: Single OLS, POLS, POLS with CTE, PMU, PFGLS, PFGMU.]

Figure 4: Extended Comparison of PMU with Common Panel IV Estimators under High Cross Section Dependence (T = 100, N = 20, ρ = 0.9). [Plotted series: Single OLS, POLS, PMU, HK, FD-IV, GMM.]

Figs. 2 and 3 show that the PMU estimator is still quite effective in removing the bias of POLS even under cross section dependence. But its high variance makes it a less appealing estimator for applications than our PFGMU estimator, which reduces variance and removes bias, as we now discuss. Figs. 2 and 3 show the distributions of both the feasible GLS procedures, PFGLS and PFGMU. Evidently, the PFGLS estimator $\hat\rho_{pfgls}$ does restore much of the original gains from pooling in terms of variance reduction that were apparent in Fig. 1 for $\hat\rho_{pols}$. But, as is also apparent from Fig. 2, the distribution of $\hat\rho_{pfgls}$ is seriously downward biased. Use of the PFGMU median unbiased procedure corrects for this bias while retaining the concentration gains of the GLS estimator. In particular, the distribution of $\hat\rho_{pfgmu}$ is well centered about the true value and has concentration close to that of the median unbiased estimator $\hat\rho_{pemu}$ under cross sectional independence (Fig. 1).


Fig. 4 shows some comparisons of POLS and PMU in the cross section dependent case against some alternative procedures that have been suggested for dynamic panel regression. The first of these is the crude first difference instrumental variable estimator (FD-IV) which uses $y_{it-2}$ as an instrument in a first differenced form of the model. Apparently, FD-IV has variation substantially in excess of all the other estimators. The commonly used GMM estimator which uses the full set of instruments $\{y_{is} : s = 0, 1, \ldots, t-2\}$ shows downward bias but not as severe as POLS and it seems to have comparable variance. HK is the bias corrected GMM estimator suggested in Hahn and Kuersteiner (2000) and Hahn, Hausman and Kuersteiner (2001) and this estimator apparently has performance closest to that of the PMU estimator. All these procedures clearly show inferior performance to the $\hat\rho_{pfgmu}$ estimator under high cross section dependence.

3.3 Seemingly Unrelated Median Unbiased Estimation

The results above indicate that, if we are to gain from panel estimation by pooling cross section and time series information when there is cross section dependence, we need to take account of the dependence in estimation. In contrast, most empirical studies that utilize dynamic panels in the international finance and the macroeconomic growth literatures tend to ignore issues of cross sectional dependence when pooling. Our results indicate that there is information in cross sectional correlation that is valuable in pooled estimation and that it can be accounted for, at least in situations where the cross section sample size N is not too large. Moreover, one can utilize this information and at the same time deal with SB bias problems in dynamic panel estimation.

Notwithstanding these potential advantages of pooling dependent data and adjusting for bias in dynamic panels, perhaps the most important issue in pooled regressions relates to the justification of the homogeneity restriction on the autoregressive coefficient ρ. In the absence of this restriction, it might be thought that there would be little gain from pooling time series and cross section data. However, because of cross section dependence, there are advantages to pooling panel data even in the estimation of heterogeneous coefficients. The reasoning is the same as that of a conventional seemingly unrelated regression (SUR) system. But in a dynamic panel context there are still SB bias problems that need attention. This section shows that these can be addressed using a SUR version of the panel median unbiased procedure.

An additional advantage to performing heterogeneous coefficient estimation is that it facilitates testing of the homogeneity restriction. Therefore, this section also proposes a test for homogeneity that is based on the seemingly unrelated panel median-unbiased (SUR-MU) estimator.

We start the discussion by combining Models M1, M2 and M3 with the following heterogeneous autoregressive panel model for the latent panel variable $y_{it}^*$:

$$y_{it}^* = \rho_i y_{it-1}^* + u_{it}, \quad \text{for } t = 1, \ldots, T, \text{ and } i = 1, \ldots, N, \qquad (11)$$

in which the regression errors

$$u_t \sim \text{iid } N(0, V_u), \quad \text{for } t = 1, \ldots, T, \qquad (12)$$

where $u_t = (u_{1t}, \ldots, u_{Nt})'$. This formulation allows for a general form of cross section error


permitted for each of the models.

When $|\rho_i| < 1$ for all $i$, the cross section error correlations are higher than the cross section correlations among the regressors $y_{it-1}$. To see this, note that the correlation between $y_{it}$ and $y_{jt}$ is given by

$$\gamma^y_{ij} = \frac{E(y_{it} y_{jt})}{\left\{E(y_{it}^2)\, E(y_{jt}^2)\right\}^{1/2}} = \gamma_{ij}\, \frac{\sqrt{1-\rho_i^2}\,\sqrt{1-\rho_j^2}}{1 - \rho_i\rho_j} < \gamma_{ij}, \qquad (13)$$

where $\gamma_{ij} = E(u_{it}u_{jt})/\{E(u_{it}^2)E(u_{jt}^2)\}^{1/2}$. The inequality is strict when $\rho_i \neq \rho_j$ and $\gamma_{ij} > 0$; the two correlations coincide when $\rho_i = \rho_j$. We might therefore anticipate the potential gains from SUR estimation to be substantial - the regressors are different and less correlated across individual equations in the panel for which the errors are more correlated. In consequence, we propose a SUR-MU estimator based on the following iteration.
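As a quick numerical illustration of (13), the short Python sketch below evaluates the implied regressor correlation for given error correlation and autoregressive coefficients. The function name `regressor_correlation` is an assumption introduced here for illustration only.

```python
import math

def regressor_correlation(gamma_ij, rho_i, rho_j):
    """Correlation between y_it and y_jt implied by (13), for |rho| < 1."""
    shrink = (math.sqrt(1 - rho_i**2) * math.sqrt(1 - rho_j**2)
              / (1 - rho_i * rho_j))
    return gamma_ij * shrink

# Error correlation 0.8 with unequal autoregressive coefficients.
print(regressor_correlation(0.8, 0.9, 0.5))
```

The shrinkage factor equals one when the two coefficients are equal and falls below one as they move apart, which is the source of the anticipated SUR gains in the heterogeneous case.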

Step 1: Obtain the time series panel median unbiased estimates $\hat\rho_{i,emu}$ for each series $i = 1, \ldots, N$ (and the appropriate model) and use the regression residuals to construct the error covariance matrix estimate $\hat V_{EMU}$.

Step 2: Using $\hat V_{EMU}$, perform a conventional seemingly unrelated regression on the panel and obtain the SUR estimates of the $\rho_i$, $\hat\rho_{i,sur}$.

Step 3: The panel seemingly unrelated median unbiased (SUR-MU) estimator is now calculated as $\hat\rho_{i,surmu} = m^{-1}(\hat\rho_{i,sur})$ just as in (3) but using the median function $m(\rho) = m_{T,N}(\rho)$ of the estimator $\hat\rho_{i,sur}$ for each $i$.

Step 4: Repeat Steps 1-3 until $\hat\rho_{i,surmu}$ converges.

4 Testing Homogeneity Restrictions

Using unrestricted estimates of the coefficients $\rho_i$ in the heterogeneous dynamic panel model (11), Wald tests can be constructed to test the homogeneity restriction $H_0: \rho_i = \rho$ for all $i$.

It is well known that in finite samples, Wald tests suffer from size distortion that is sometimes serious even in simple univariate regressions. For the panel regression case here we have found that the size distortion of Wald tests becomes even more serious as the cross section sample size N increases. This section first investigates the asymptotic properties of Wald tests based on the SUR approach in both the stationary and nonstationary cases and shows how cross section dependencies affect the asymptotic theory under nonstationarity. We then propose an alternative Wald procedure for testing homogeneity that utilizes the structure of the cross section dependence in the construction of the Wald statistic.

4.1 The Wald Test and its Asymptotic Properties

The Stationary Case

Using the unrestricted estimates $\hat\rho_{i,surmu}$ of the coefficients $\rho_i$ in the heterogeneous dynamic panel model (11), Wald tests can be constructed to test the homogeneity restriction $H_0: \rho_i = \rho$ for all $i$. More specifically, let $\hat\rho_{surmu} = (\hat\rho_{i,surmu})$ be the SUR-MU estimate of the vector $\rho = (\rho_1, \ldots, \rho_N)'$ and write the restrictions in $H_0$ as $D\rho = 0$ where $D = [i_{N-1}, -I_{N-1}]$ and $i_A$ has $A$ unit elements. Under Gaussianity and in the stationary case where $|\rho_i| < 1$ for all $i$, the SUR-MU estimator $\hat\rho_{surmu}$ is asymptotically ($T \to \infty$, $N$ fixed) equivalent to the unconstrained maximum likelihood estimate$^6$ of $\rho$. In that case, standard stationary asymptotics and some algebraic manipulations (outlined in Appendix C) lead to the limit theory

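$$\sqrt{T}\left(\hat\rho_{surmu} - \rho\right) \to_d N(0, V_{SUR}), \qquad (14)$$

where

$$V_{SUR}^{-1} = \left[\left(v_u^{ij}\, E(y_{it}y_{jt})\right)_{ij}\right] = V_u^{-1} \ast E(y_t y_t'). \qquad (15)$$

In (15) the operator $\ast$ is the Hadamard product, $v_u^{ij}$ is the $ij$'th element of $V_u^{-1}$, where $V_u = E(u_t u_t') = \Sigma + \delta\delta'$ as in (8), and

$$E(y_{it}y_{jt}) = \begin{cases} \dfrac{\delta_i\delta_j}{1-\rho_i\rho_j} & i \neq j \\[2ex] \dfrac{\sigma_i^2 + \delta_i^2}{1-\rho_i^2} & i = j \end{cases},$$

so that

$$E(y_t y_t') = \left(\Sigma + \delta\delta'\right) \ast R, \quad \text{where } R = (r_{ij}) \text{ and } r_{ij} = \frac{1}{1-\rho_i\rho_j}. \qquad (16)$$

From (15) and (16) it is apparent that the covariance matrix $V_{SUR}$ depends on both $\rho$ and $\delta$ as well as $\Sigma$. When $H_0$ holds, $E(y_t y_t') = (\Sigma + \delta\delta')/(1-\rho^2)$ and $V_{SUR}$ has a simpler form in which

$$V_{SUR}^{-1} = \frac{1}{1-\rho^2}\, V_u^{-1} \ast V_u, \qquad (17)$$

which depends on the common $\rho$ and again on the cross section dependence parameter $\delta$. The Wald statistic for testing $H_0$ is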

$$W_{surmu} = \hat\rho_{surmu}' D' \left[D \hat V_{SURMU} D'\right]^{-1} D \hat\rho_{surmu}, \quad \text{where } \hat V_{SURMU} = \left[\sum_{t=1}^T Z_t' \hat V_u^{-1} Z_t\right]^{-1},$$

in which $Z_t = \mathrm{diag}(y_{1t-1}, \ldots, y_{Nt-1})$ and $\hat V_u$ is an estimate of the error covariance matrix $V_u$ computed from the SUR-MU regression residuals. Under $H_0$ and in the stationary case, it is straightforward to show that traditional chi-squared limit theory for $W_{surmu}$ holds, i.e. $W_{surmu} \to \chi^2_N$.

6 Note that the median function $m(\cdot)$ is asymptotically ($T \to \infty$, $N$ fixed) the identity function and the SUR estimator of $\rho$ is the vector of Gaussian maximum likelihood estimators of the autoregressive coefficients in the unconstrained models.
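The quadratic form defining the Wald statistic above can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the function names `wald_homogeneity` and `sur_covariance` and the row-by-time data layout are assumptions. The covariance helper uses the fact that, with $Z_t$ diagonal, $Z_t' V^{-1} Z_t$ equals the Hadamard product of $V^{-1}$ with $y_{t-1} y_{t-1}'$.

```python
import numpy as np

def wald_homogeneity(rho_hat, V_rho):
    """Quadratic form rho' D' (D V D')^{-1} D rho with D = [i_{N-1}, -I_{N-1}]."""
    rho_hat = np.asarray(rho_hat, dtype=float)
    N = rho_hat.size
    D = np.hstack([np.ones((N - 1, 1)), -np.eye(N - 1)])
    d = D @ rho_hat                      # contrasts rho_1 - rho_i, i = 2..N
    return float(d @ np.linalg.solve(D @ V_rho @ D.T, d))

def sur_covariance(y, V_u):
    """[sum_t Z_t' V_u^{-1} Z_t]^{-1} with Z_t = diag(y_{t-1}).

    y is a (T+1) x N array with rows y_0, ..., y_T (assumed layout).
    """
    lag = y[:-1]
    V_inv = np.linalg.inv(V_u)
    # Elementwise product of V^{-1} with sum_t y_{t-1} y_{t-1}'.
    return np.linalg.inv(V_inv * (lag.T @ lag))
```

The statistic is exactly zero when all elements of `rho_hat` coincide, since `D` then annihilates the vector.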


The Unit Root Case

In the nonstationary $\rho = 1$ case, the asymptotic results depend, as might be expected, on whether M1, M2 or M3 is employed in estimation and also on the boundary condition that arises in the transition from the SUR estimator to SUR-MU - cf. (3). In addition, the asymptotic theory for the SUR estimator is more complex than that of a traditional unit root model when there is cross section dependence. For instance, when model M1 is used and the null hypothesis $H_0: \rho_i = 1\ \forall i$ holds, derivations (outlined in Appendix C) using standard unit root limit theory deliver the limit distribution of the SUR estimator $\hat\rho_{sur}$. This estimator is defined as

$$\hat\rho_{sur} = \left(\sum_{t=1}^T Z_t' \hat V_u^{-1} Z_t\right)^{-1}\left(\sum_{t=1}^T Z_t' \hat V_u^{-1} y_t\right),$$

where $\hat V_u$ is an estimate of $V_u$ based on residuals from a first stage regression. We find the following asymptotic distribution for $\hat\rho_{sur}$

$$T\left(\hat\rho_{sur} - i_N\right) \to_d \left[V_u^{-1} \ast \int_0^1 BB'\right]^{-1}\left[\int_0^1 B \ast \left(V_u^{-1} dB\right)\right] = \xi, \qquad (18)$$

where $B$ is vector Brownian motion with covariance matrix $V_u$. It is clear from (18) that the limit distribution of $T(\hat\rho_{SUR} - i_N)$ depends on the cross section dependence parameter $\delta$ even in the homogeneous case where $\rho_i = 1\ \forall i$. Correspondingly, the asymptotic distribution of $\hat\rho_{surmu}$ in the unit root case also depends on cross section dependence and error variance nuisance parameters. The Wald statistic, $W_{sur}$, for testing $H_0$ is given by

$$W_{sur} = \hat\rho_{SUR}' D' \left[D \hat V_{SUR} D'\right]^{-1} D \hat\rho_{SUR} \to_d \xi' D' \left[D \left(V_u^{-1} \ast \int_0^1 BB'\right)^{-1} D'\right]^{-1} D\xi, \qquad (19)$$

where $\hat V_{SUR} = \left(\sum_{t=1}^T Z_t' \hat V_u^{-1} Z_t\right)^{-1}$, and again the limit distribution (19) depends on nuisance parameters.
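The SUR estimator defined above has a compact matrix form because $Z_t$ is diagonal. The Python sketch below is one way to compute it under stated assumptions: the name `sur_rho` and the data layout (rows are $y_0, \ldots, y_T$) are illustrative, and the covariance matrix passed in is taken as given rather than estimated from first stage residuals.

```python
import numpy as np

def sur_rho(y, V_u):
    """Solve (sum_t Z_t' V^{-1} Z_t) rho = sum_t Z_t' V^{-1} y_t, Z_t = diag(y_{t-1})."""
    lag, cur = y[:-1], y[1:]
    V_inv = np.linalg.inv(V_u)
    A = V_inv * (lag.T @ lag)                 # Hadamard form of sum_t Z_t' V^{-1} Z_t
    b = (lag * (cur @ V_inv)).sum(axis=0)     # i-th entry: sum_t y_{i,t-1} (V^{-1} y_t)_i
    return np.linalg.solve(A, b)
```

When `V_u` is diagonal the system decouples and the result coincides with equation-by-equation OLS, which is a convenient check.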

By contrast, in the unit root case where homogeneity of $\rho$ across $i$ is imposed, the pooled GLS estimator of $\rho$ is

$$\hat\rho = \left(\sum_{t=1}^T y_{t-1}' V_u^{-1} y_{t-1}\right)^{-1}\left(\sum_{t=1}^T y_{t-1}' V_u^{-1} y_t\right),$$

with a corresponding feasible SUR version. By straightforward derivation (see Appendix C), we find that

$$T(\hat\rho - 1) \to_d \frac{\int_0^1 W' dW}{\int_0^1 W'W} = \frac{\sum_{i=1}^N \int_0^1 W_i\, dW_i}{\sum_{i=1}^N \int_0^1 W_i^2}, \qquad (20)$$

where $W = (W_i)$ is standard Brownian motion with covariance matrix $I_N$. The limit (20) here


4.2 Hausman and Modified Hausman Tests under Cross Section Dependence

The Stationary Panel Case: H0 : ρi = ρ

The main problem with the conventional Wald test, as mentioned above, is that size distortion can be serious and it typically increases with the number of restrictions. Also, the Wald test based on SUR or SUR-MU estimation requires N < T and is heavily influenced by the nuisance parameters of cross section correlation. This section proposes an alternate procedure for dealing with cross section dependence that takes into account the structure of the dependence.

Start by writing the model M1 (with suitable adjustments for models M2 and M3) in vector form as

$$y_t = Z_t\rho + u_t, \quad Z_t = \mathrm{diag}(y_{1t-1}, \ldots, y_{Nt-1}), \quad \rho = (\rho_1, \ldots, \rho_N)'. \qquad (21)$$

Let $\hat\rho_i$ (respectively $\hat\rho$) be the OLS estimate of $\rho_i$ ($\rho$). Then

$$\hat\rho = \left(\sum_{t=1}^T Z_t'Z_t\right)^{-1}\left(\sum_{t=1}^T Z_t'y_t\right).$$

Let $\hat\rho_{emu}$ be the corresponding vector of median unbiased estimates of $\rho_i$. Under the null hypothesis of homogeneous autoregressive coefficients $\rho_i = \rho\ \forall i$, and as $T \to \infty$, we have $\sqrt{T}(\hat\rho_i - \rho) \to_d N(0, 1-\rho^2)$ for models M1, M2 and M3, with the same result for the median unbiased estimators $\hat\rho_{i,emu}$. Under cross section independence and as $T \to \infty$ for finite $N$, we have

$$\sum_{i=1}^N \frac{\sqrt{T}(\hat\rho_i - \rho)}{\sqrt{1-\rho^2}} \to_d N(0, N).$$

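On the other hand, if there is cross section dependence of the form implied by (6), then in the stationary case for model M1 we have

$$y_{it} = \sum_{j=0}^\infty \rho^j\left(\delta_i\theta_{t-j} + \varepsilon_{it-j}\right) = \delta_i\sum_{j=0}^\infty \rho^j\theta_{t-j} + \sum_{j=0}^\infty \rho^j\varepsilon_{it-j} = \delta_i\mu_t + \eta_{it}, \text{ say.}$$

It follows that the asymptotic covariance between $\hat\rho_i$ and $\hat\rho_j$ is given by

$$\mathrm{acov}\left(\hat\rho_i, \hat\rho_j\right) = \frac{1}{T}\,\frac{(\delta_i\delta_j)^2\left(1-\rho^2\right)}{\left(\delta_i^2 + \sigma_i^2\right)\left(\delta_j^2 + \sigma_j^2\right)} = \frac{1}{T}\,\frac{v_{ij}^2}{v_{ii}v_{jj}}\left(1-\rho^2\right),$$

where $v_{ij}$ is the $ij$'th element of $V_u = \Sigma + \delta\delta'$. Setting $\hat\rho = (\hat\rho_1, \ldots, \hat\rho_N)'$ and letting $i_N$ be an $N$-vector with unit elements, we find that standard derivations lead to the following limit theory

$$\sqrt{T}\left(\hat\rho - \rho i_N\right) = \left(\frac{1}{T}\sum_{t=1}^T Z_t'Z_t\right)^{-1}\left(\frac{1}{\sqrt{T}}\sum_{t=1}^T Z_t'u_t\right) \to_d N\left(0, D_y^{-1}\left[V_u \ast E(y_t y_t')\right]D_y^{-1}\right) = N\left(0, \left(1-\rho^2\right) R_V \ast R_V\right), \qquad (22)$$

where $D_y = \mathrm{diag}(E(y_{1t}^2), \ldots, E(y_{Nt}^2))$ and the matrix $R_V$ has $ij$'th element $v_{ij}/\{v_{ii}v_{jj}\}^{1/2}$. It follows that

$$\sum_{i=1}^N \frac{\sqrt{T}(\hat\rho_i - \rho)}{\sqrt{1-\rho^2}} \to_d N\left(0, i_N'(R_V \ast R_V)\, i_N\right).$$

The same result applies when the median unbiased estimates $\hat\rho_{i,emu}$ are used in place of $\hat\rho_i$. We propose to construct an estimate of the matrix $R_V$ that appears in the asymptotic covariance matrix of (22) and use this estimate to develop an alternate test of $H_0$. The following moment based procedure may be used.$^7$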

Moment Based Estimation of (δ, Σ)

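Step 1: Estimate the $\rho_i$ by using OLS or EMU and obtain the regression residuals $\hat u_{it} = y_{it} - \hat\rho_i y_{it-1}$, which are asymptotically equivalent to OLS residuals and consistent (as $T \to \infty$, $N$ fixed) for $u_{it}$. In particular,

$$\hat u_{it} = u_{it} + (\rho_i - \hat\rho_i)\, y_{it-1} = u_{it} + o_p(1)$$

in both stationary and nonstationary cases.

Step 2: Construct the moment matrix of residuals $M_T = \frac{1}{T}\sum_{t=1}^T \hat u_t \hat u_t'$, which is a consistent (as $T \to \infty$, $N$ fixed) estimate of $V_u$. Let $m_{Tij}$ be the $ij$'th element of $M_T$.

Step 3: Estimate the cross section coefficients $\delta$ and the diagonal elements of $\Sigma$ using the following moment procedure that finds the least squares best fit to the matrix $M_T$, that is

$$\left(\hat\delta, \hat\Sigma\right) = \arg\min_{\delta,\Sigma}\ \mathrm{tr}\left[\left(M_T - \Sigma - \delta\delta'\right)\left(M_T - \Sigma - \delta\delta'\right)'\right]. \qquad (23)$$

The solution of (23) satisfies the system of equations

$$\hat\delta = \left(M_T\hat\delta - \hat\Sigma\hat\delta\right)/\hat\delta'\hat\delta, \qquad \hat\sigma_i^2 = M_{Tii} - \hat\delta_i^2, \quad i = 1, \ldots, N,$$

and this can be solved using the iteration

$$\delta^{(r)} = \left(M_T\delta^{(r-1)} - \Sigma^{(r-1)}\delta^{(r-1)}\right)/\delta^{(r-1)\prime}\delta^{(r-1)}, \qquad \sigma_i^{2(r)} = M_{Tii} - \left(\delta_i^{(r)}\right)^2, \qquad (24)$$

starting from some initialization $\delta^{(0)}$ (such as the eigenvector of $M_T$ associated with its largest eigenvalue) until convergence. Since $M_T \to_p V_u = \Sigma + \delta\delta'$ as $T \to \infty$, it follows that $(\hat\delta, \hat\Sigma) \to_p (\delta, \Sigma)$ as $T \to \infty$, with $N$ fixed. Since $\hat\Sigma \to_p \Sigma > 0$ as $T \to \infty$, $\hat\Sigma$ will be positive definite for large enough $T$.

Step 4: Construct the variance matrix estimate $\hat V_u = \hat\Sigma + \hat\delta\hat\delta'$. Let $\hat v_{ij}$ be the $ij$'th element of $\hat V_u$ and construct the estimate $\hat R_V$ whose $ij$'th element is $\hat v_{ij}/\{\hat v_{ii}\hat v_{jj}\}^{1/2}$.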
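Steps 3 and 4 can be sketched in Python as follows. This is a hedged illustration of the fixed-point iteration (24), not the authors' code: the function names `fit_one_factor` and `correlation_from_fit`, the stopping rule, and the scaling of the starting eigenvector by the square root of its eigenvalue are all assumptions made here. Note that $\delta$ is identified only up to sign, so the sketch may return $-\delta$.

```python
import numpy as np

def fit_one_factor(M, max_iter=1000, tol=1e-12):
    """Iterate delta <- (M - Sigma) delta / (delta' delta), Sigma <- diag(M_ii - delta_i^2)."""
    M = np.asarray(M, dtype=float)
    eigval, eigvec = np.linalg.eigh(M)        # eigenvalues in ascending order
    # Assumed start: leading eigenvector scaled by the root of its eigenvalue.
    delta = eigvec[:, -1] * np.sqrt(eigval[-1])
    sigma2 = np.diag(M) - delta**2
    for _ in range(max_iter):
        new_delta = (M @ delta - sigma2 * delta) / (delta @ delta)
        sigma2 = np.diag(M) - new_delta**2
        converged = np.max(np.abs(new_delta - delta)) < tol
        delta = new_delta
        if converged:
            break
    return delta, sigma2

def correlation_from_fit(delta, sigma2):
    """R_V with ij'th element v_ij / sqrt(v_ii v_jj), where V = Sigma + delta delta'."""
    V = np.diag(sigma2) + np.outer(delta, delta)
    s = np.sqrt(np.diag(V))
    return V / np.outer(s, s)
```

Because the sign of $\delta$ is arbitrary, comparisons across runs are best made on the outer product $\delta\delta'$ or on the fitted correlation matrix rather than on $\delta$ itself.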

7 Appendix D gives an algorithm for Gaussian maximum likelihood estimation of the cross section coefficients. Simulation results indicate that the moment based method described here gave superior results, especially for large N.


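Since $\hat V_u \to_p V_u$, we have $\hat R_V \to_p R_V$ as $T \to \infty$. Now let $\tilde\rho$ be the PFMGU estimate of $\rho$ under the assumption of homogeneity. Under $H_0$, the pooled estimate $\tilde\rho$ is asymptotically equivalent to GLS and then by standard limit theory

$$\sqrt{T}(\tilde\rho - \rho) = \left(\frac{1}{T}\sum_{t=1}^T y_{t-1}'V_u^{-1}y_{t-1}\right)^{-1}\left(\frac{1}{\sqrt{T}}\sum_{t=1}^T y_{t-1}'V_u^{-1}u_t\right) \to_d N\left(0, \left\{\mathrm{trace}\left[V_u^{-1}E(y_t y_t')\right]\right\}^{-1}\right).$$

Since

$$E(y_t y_t') = \left(\Sigma + \sigma^2\delta\delta'\right) \ast R = V_u \ast R = \frac{1}{1-\rho^2}V_u,$$

under $H_0$, we end up with the simple result

$$\sqrt{T}(\tilde\rho - \rho) \to_d N\left(0, \frac{1-\rho^2}{N}\right).$$

Next consider the asymptotic covariance

$$\mathrm{Acov}\left(\frac{1}{\sqrt{T}}\sum_{t=1}^T Z_t'u_t,\ \frac{1}{\sqrt{T}}\sum_{t=1}^T y_{t-1}'V_u^{-1}u_t\right) = \frac{1}{T}\sum_{t=1}^T Z_t'E(u_t u_t')V_u^{-1}y_{t-1} = \frac{1}{T}\sum_{t=1}^T Z_t'y_{t-1} \to \begin{pmatrix} E(y_{1t}^2) \\ E(y_{2t}^2) \\ \vdots \\ E(y_{Nt}^2) \end{pmatrix} = D_y i_N,$$

as $T \to \infty$, from which we deduce that

$$\mathrm{Acov}\left(\sqrt{T}(\hat\rho - \rho i_N),\ \sqrt{T}(\tilde\rho - \rho)\right) = D_y^{-1}\left[D_y i_N\right]\left\{\mathrm{trace}\left[V_u^{-1}E(y_t y_t')\right]\right\}^{-1} = i_N\,\frac{1-\rho^2}{N}. \qquad (25)$$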

Our test statistic for $H_0$ is based on the difference between the estimates

$$\sqrt{T}\left(\hat\rho_{emu} - \tilde\rho\, i_N\right) = \sqrt{T}\left(\hat\rho_{emu} - \rho i_N\right) - \sqrt{T}(\tilde\rho - \rho)\, i_N,$$

and from (22), (25) and joint convergence we find that

$$\frac{\sqrt{T}\left(\hat\rho_{emu} - \tilde\rho\, i_N\right)}{\sqrt{1-\tilde\rho^2}} = \frac{\sqrt{T}\left(\hat\rho_{emu} - \rho i_N\right)}{\sqrt{1-\tilde\rho^2}} - \frac{\sqrt{T}(\tilde\rho - \rho)}{\sqrt{1-\tilde\rho^2}}\, i_N \to_d N\left(0, R_V \ast R_V - \frac{1}{N} i_N i_N'\right). \qquad (26)$$

It follows that we may construct the Hausman-type test statistic

$$G = \frac{T}{1-\tilde\rho^2}\left(\hat\rho_{emu} - \tilde\rho\, i_N\right)'\left\{\left[\hat R_V \ast \hat R_V\right]^{-1} - \frac{1}{N} i_N i_N'\right\}\left(\hat\rho_{emu} - \tilde\rho\, i_N\right), \qquad (27)$$

which is based on the difference between the robust-to-heterogeneity estimate $\hat\rho_{emu}$ of $\rho$ and the efficient estimate $\tilde\rho$ of $\rho$ under the null, and which uses the moment based procedure outlined above to construct estimates of $V_u$ and $R_V$. We use the notation $G_{pfmgu}$ to indicate that the pooled estimate $\tilde\rho$ in (27) is the PFMGU estimate of the (common) $\rho$. Then, in view of (26) and the consistency of $\hat R_V$, we have

$$G_{pfmgu} \to \chi^2_N, \quad \text{as } T \to \infty. \qquad (28)$$

One practical difficulty that can arise with (27) is that the variance matrix $\left[\hat R_V \ast \hat R_V\right]^{-1} - \frac{1}{N} i_N i_N'$ is not necessarily positive definite and, in our simulations, negative values of $G$ have occasionally occurred when $N$ and $T$ are small ($N = 10$, $T = 50$).
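The statistic in (27) can be computed directly from the estimates and the fitted correlation matrix. The Python sketch below follows the formula as printed; the function name `hausman_G` and its argument names are assumptions for illustration, and the inputs are taken as given rather than estimated.

```python
import numpy as np

def hausman_G(rho_emu, rho_pooled, R_V, T):
    """G of (27): T/(1 - rho~^2) * d' {[R*R]^{-1} - (1/N) i i'} d, d = rho_emu - rho~ i_N."""
    rho_emu = np.asarray(rho_emu, dtype=float)
    N = rho_emu.size
    ones = np.ones(N)
    diff = rho_emu - rho_pooled * ones
    # R_V * R_V is the elementwise (Hadamard) product.
    weight = np.linalg.inv(R_V * R_V) - np.outer(ones, ones) / N
    return T / (1.0 - rho_pooled**2) * float(diff @ weight @ diff)
```

The weight matrix in this form need not be positive definite. For example, with two units, a cross section correlation of 0.8 and a difference vector proportional to $i_N$, the quadratic form is negative, which is consistent with the occasional negative values of $G$ noted above.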

The Panel Unit Root Case: $H_0: \rho_i = 1,\ \forall i$

As shown in Appendix C, the Hausman test has a limit distribution in the unit root ($\rho_i = 1,\ \forall i$) case that is dependent on the cross section nuisance parameters. It is therefore unsuitable for testing homogeneity. However, there is a simple way of constructing a modified test that is free of nuisance parameters, which we now describe.

Under the null hypothesis, we have as in (67)

$$\frac{1}{\sqrt{T}}\, y_{[Tr]} = \frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} u_t \to_d B(r) = BM(V_u).$$

Note that we can decompose $B$ into component Brownian motions as follows

$$B(r) = \delta B_\theta(r) + B_\varepsilon(r), \qquad (29)$$

where

$$\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} \theta_t \to_d B_\theta(r) = BM\left(\sigma^2\right), \quad \text{and} \quad \frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} \varepsilon_t \to_d B_\varepsilon(r) = BM(\Sigma).$$

Let $\delta_\perp$ be an $N \times (N-1)$ matrix that spans the orthogonal complement of the vector $\delta$. Then

$$\left[\left(\delta_\perp'\Sigma\delta_\perp\right)^{-1/2}\delta_\perp'\right]\frac{1}{\sqrt{T}}\, y_{[Tr]} \to_d \left(\delta_\perp'\Sigma\delta_\perp\right)^{-1/2}\delta_\perp' B(r) = \left(\delta_\perp'\Sigma\delta_\perp\right)^{-1/2}\delta_\perp' B_\varepsilon(r) = W_\perp(r), \qquad (30)$$

where $W_\perp(r) = BM(I_{N-1})$, or $(N-1)$-vector standard Brownian motion. The transformation matrix that appears in (30) can be estimated by implementing the following modification of our earlier procedure.

Orthogonalization Procedure (OP)

Step 1: Construct the moment matrix of differences (for models M1 and M2) or demeaned differences (for model M3) which we write as $M_T = \frac{1}{T}\sum_{t=1}^T \hat u_t \hat u_t'$. As in the stationary case, $M_T$ is a consistent (as $T \to \infty$, $N$ fixed) estimate of $V_u$. Again, let $m_{Tij}$ be the $ij$'th element of $M_T$.

Step 2: Estimate the cross section coefficients $\delta$ and $\Sigma$ by moment based optimization as in (23) leading to $(\hat\delta, \hat\Sigma)$. As before, $(\hat\delta, \hat\Sigma) \to_p (\delta, \Sigma)$ as $T \to \infty$, with $N$ fixed, and $\hat\Sigma$ is


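Step 4: Using $\hat\Sigma$ and $\hat\delta$, construct$^8$ $\hat\delta_\perp$ and $\hat F_\delta = \left(\hat\delta_\perp'\hat\Sigma\hat\delta_\perp\right)^{-1/2}\hat\delta_\perp'$. Clearly,

$$\hat F_\delta = \left(\hat\delta_\perp'\hat\Sigma\hat\delta_\perp\right)^{-1/2}\hat\delta_\perp' \to_p \left(\delta_\perp'\Sigma\delta_\perp\right)^{-1/2}\delta_\perp', \qquad (31)$$

as $T \to \infty$.

Using $\hat F_\delta$ we transform the data $y_t$ (or demeaned/detrended data in the case of models M2 and M3) giving $y_t^+ = \hat F_\delta y_t$. As is apparent from (30), the transformation $\hat F_\delta$ asymptotically removes cross section dependence in the panel and $y_t^+$ is asymptotically cross section independent as $T \to \infty$. Using $y_t^+$ we may now construct estimates of the autoregressive coefficients.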
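The transformation matrix in (31) can be sketched in Python as below. This is an illustration under stated assumptions rather than the authors' implementation: the function name `orthogonalizer` is hypothetical, the orthogonal complement is taken from the unit-eigenvalue eigenvectors of the projection matrix $I - \delta(\delta'\delta)^{-1}\delta'$, and the inverse square root is formed from a symmetric eigendecomposition.

```python
import numpy as np

def orthogonalizer(delta, sigma2):
    """F = (d_perp' Sigma d_perp)^{-1/2} d_perp' for V = diag(sigma2) + delta delta'."""
    delta = np.asarray(delta, dtype=float)
    N = delta.size
    P = np.eye(N) - np.outer(delta, delta) / (delta @ delta)
    _, eigvec = np.linalg.eigh(P)        # eigenvalues ascend: one zero, then N-1 ones
    d_perp = eigvec[:, 1:]               # basis of the orthogonal complement of delta
    S = d_perp.T @ np.diag(sigma2) @ d_perp
    w, Q = np.linalg.eigh(S)
    S_inv_half = Q @ np.diag(w ** -0.5) @ Q.T
    return S_inv_half @ d_perp.T

# Transforming a T x N data array y (rows are time periods): y_plus = y @ F.T
```

By construction the returned matrix annihilates $\delta$ and maps $\Sigma + \delta\delta'$ to the identity of dimension $N-1$, which is the sense in which the transformed errors are cross sectionally uncorrelated.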

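Let $\hat\rho_i^+$ (respectively $\hat\rho^+$) be the OLS estimate of $\rho_i = 1$ ($\rho = i_{N-1}$). Then, in an obvious notation,

$$\hat\rho^+ = \left(\sum_{t=1}^T Z_t^{+\prime}Z_t^+\right)^{-1}\left(\sum_{t=1}^T Z_t^{+\prime}y_t^+\right).$$

Let $\hat\rho_{emu}^+$ be the corresponding vector of median unbiased estimates of $\rho_i$. Similarly, let $\tilde\rho^+$ be the PFMGU estimate of $\rho$ obtained from the transformed data $y_t^+$ under the assumption of homogeneous unit roots. The modified Hausman statistic is defined as

$$G_H^+ = T^2\left(\hat\rho_{emu}^+ - \tilde\rho^+ i_{N-1}\right)'\left(\hat\rho_{emu}^+ - \tilde\rho^+ i_{N-1}\right). \qquad (32)$$

As shown in Appendix C,

$$G_H^+ \to_d \Xi_{N-1}'\Xi_{N-1}, \qquad (33)$$

where

$$\Xi_{N-1} = \begin{pmatrix} \left[\int_0^1 W_{\perp,1}^2\right]^{-1}\left[\int_0^1 W_{\perp,1}\, dW_{\perp,1}\right] - \left[\int_0^1 W_\perp'W_\perp\right]^{-1}\left[\int_0^1 W_\perp'\, dW_\perp\right] \\ \vdots \\ \left[\int_0^1 W_{\perp,N-1}^2\right]^{-1}\left[\int_0^1 W_{\perp,N-1}\, dW_{\perp,N-1}\right] - \left[\int_0^1 W_\perp'W_\perp\right]^{-1}\left[\int_0^1 W_\perp'\, dW_\perp\right] \end{pmatrix}, \qquad (34)$$

and where $\{W_{\perp,i} : i = 1, \ldots, N-1\}$ are the components of the $N-1$ vector standard Brownian motion $W_\perp$. Clearly, $G_H^+$ is free of nuisance parameters in the limit and is suitable for testing the null $H_0: \rho_i = 1\ \forall i$.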

An alternate approach is to construct panel unit root test statistics directly by taking the sum of the differences between the estimates $\hat\rho_i^+$, $\hat\rho_{i,emu}^+$ and their limits under the null, viz.

$$G_{ols}^+ = \sum_{i=1}^{N-1} \frac{\hat\rho_i^+ - 1}{\hat\sigma_{\hat\rho_i^+}}, \qquad (35)$$

$$G_{emu}^+ = \sum_{i=1}^{N-1} \frac{\hat\rho_{i,emu}^+ - 1}{\hat\sigma_{\hat\rho_{i,emu}^+}}. \qquad (36)$$

8 The orthogonal complement matrix $\hat\delta_\perp$ can be constructed by taking the eigenvectors of the projection matrix $P_{\hat\delta} = I - \hat\delta(\hat\delta'\hat\delta)^{-1}\hat\delta'$


In contrast to (32), the test statistics (35) and (36) do not involve a pooled estimate of the homogeneous unit root parameter. As shown in Appendix C, for fixed $N$ we have the following limit theory for these statistics as $T \to \infty$

$$G_{ols}^+ \to_d \sum_{i=1}^{N-1} \xi_i, \qquad G_{emu}^+ \to_d \sum_{i=1}^{N-1} \xi_i^-, \qquad (37)$$

where $\xi_i = \left(\int_0^1 W_i^2\right)^{-1}\left(\int_0^1 W_i\, dW_i\right)$ and

$$\xi_i^- = \begin{cases} \xi_i & \xi_i < 0 \\ 0 & \xi_i \geq 0 \end{cases}.$$

The limits in (37) depend only on $N$. Both $G_{ols}^+$ and $G_{emu}^+$ are therefore suitable for testing the null $H_0$.

Note that there are only $N-1$ elements in (34) - (36). This is because the panel system has been transformed to dimension $N-1$ in Step 4 above in order to remove the effects of cross section dependence in the limit.

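The tests (35) and (36) have the advantage that they lend themselves to simple large $N$ asymptotics. In particular, the means and variances

$$E(\xi_i) = \mu_\xi, \quad E(\xi_i^-) = \mu_{\xi^-}, \quad \mathrm{Var}(\xi_i) = \sigma_\xi^2, \quad \mathrm{Var}(\xi_i^-) = \sigma_{\xi^-}^2$$

can be computed and, noting that $\xi_i$, $\xi_i^-$ are iid over $i$, we have the large $N$ limit theory

$$\frac{1}{\sqrt{N}}\sum_{i=1}^{N-1}\left(\xi_i - \mu_\xi\right) \to_d N\left(0, \sigma_\xi^2\right), \qquad \frac{1}{\sqrt{N}}\sum_{i=1}^{N-1}\left(\xi_i^- - \mu_{\xi^-}\right) \to_d N\left(0, \sigma_{\xi^-}^2\right).$$

It follows that in sequential asymptotics (see Phillips and Moon, 1999) as $(T, N \to \infty)_{seq}$

$$\left.\begin{aligned} G_{ols}^{++} &= \frac{1}{\sqrt{N}\sigma_\xi}\sum_{i=1}^{N-1}\left[\frac{\hat\rho_i^+ - 1}{\hat\sigma_{\hat\rho_i^+}} - \mu_\xi\right] \\ G_{emu}^{++} &= \frac{1}{\sqrt{N}\sigma_{\xi^-}}\sum_{i=1}^{N-1}\left[\frac{\hat\rho_{i,emu}^+ - 1}{\hat\sigma_{\hat\rho_{i,emu}^+}} - \mu_{\xi^-}\right] \end{aligned}\right\} \to_d N(0, 1).$$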

All of these procedures are easy to implement. Their finite sample performance is assessed in Section 5 below. As shown in the next section, once the OP procedure has been applied to the data, a wide class of panel unit root and stationarity tests become applicable.

4.3 Dynamic AR(p) Panels with Cross Section Dependence

The procedures outlined above for panel unit root testing under cross section dependence may be applied to cases of higher order panel dynamics and cases where the common factor component $\theta_t$ is weakly dependent. Specifically, consider a panel of dynamic panel autoregressions with (possibly) heterogeneous lag orders $\ell_i$ for each $i$ and allow for cross section dependence of the same form as (6) above. The model is written in augmented format as

$$\Delta y_{it} = \mu_i + \beta_i t + (\rho - 1)\, y_{it-1} + \sum_{j=1}^{\ell_i} \cdots$$


The OP procedure leading to (31) above is the same as that laid out above except for the first step. Here, instead of using the moment matrix of differences or demeaned differences, one simply uses the moment matrix of the regression residuals $\hat u_{it}$ obtained under the (null hypothesis) restriction $\rho = 1$ in (38).

Since the transformed data $y_{it}^+$ are asymptotically uncorrelated across $i$, regressions like (38) of $y_{it}^+$ on $y_{it-1}^+$ and the lagged differences $\Delta y_{it-j}^+$ do not suffer (asymptotically) from cross section dependence. Importantly, this will be so even when the common time series factor $\theta_t$ is weakly dependent rather than uncorrelated over time. This is because the transformation procedure leading to (31) continues to eliminate the contribution of the common factor component $\theta_t$ to the limit Brownian motion in (29). It follows that several existing panel unit root tests that were designed to work with data that are independent across section can now be applied to test for panel unit roots when there is cross section dependence. Accordingly, we consider here two broad types of panel unit root tests.

Meta-Analysis Tests for Panel Unit Roots and Stationarity under Cross Section Dependence

The first type of test is based on meta-analysis, wherein the p-values of tests for each cross section individual $i$ are combined to construct a new test. Tests of this type were suggested in Choi (2001a) and Maddala and Wu (1999) for use in testing unit roots with panel data under cross section independence.$^9$ These tests apply here under cross section dependence after our OP orthogonalization procedure has been implemented. Choi (2001a) provides a full discussion of tests of this type and his simulation results suggest use of the three tests that we concentrate on here.

Let $p_i$ be the p-value of a unit root test associated with cross section element $i$. Define

$$P = -2\sum_{i=1}^{N-1} \ln(p_i), \qquad (39)$$

$$P_m = -\frac{1}{\sqrt{N}}\sum_{i=1}^{N-1}\left[\ln(p_i) + 1\right], \qquad (40)$$

$$Z = \frac{1}{\sqrt{N}}\sum_{i=1}^{N-1} \Phi^{-1}(p_i). \qquad (41)$$

The $P$ test is called the inverse chi-square test or Fisher test after Fisher (1932). The $P_m$ test statistic is a centered and normalized version of $P$ that is useful for large $N$. The $Z$ test is called the inverse normal test, following Stouffer et al. (1949). As discussed in Choi (2001), we have the following limit distributions for $P$ and $Z$ as $T \to \infty$

$$P \to_d \chi^2_{2(N-1)}, \qquad Z \to_d N(0, 1) \quad \text{for fixed } N, \qquad (42)$$

leading to the following sequential limit theory as $(T, N \to \infty)_{seq}$

$$P_m,\ Z \to_d N(0, 1). \qquad (43)$$
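The three combination statistics (39)-(41) are simple functions of the individual p-values. The following Python sketch uses only the standard library; the function name `meta_statistics` is an assumption, and the normalization by $\sqrt{N}$ with sums over the $N-1$ transformed series follows the formulas as printed above.

```python
import math
from statistics import NormalDist

def meta_statistics(p_values, N):
    """Return (P, Pm, Z) of (39)-(41); p_values holds the N-1 individual p-values."""
    logs = [math.log(p) for p in p_values]
    P = -2.0 * sum(logs)
    Pm = -sum(value + 1.0 for value in logs) / math.sqrt(N)
    Z = sum(NormalDist().inv_cdf(p) for p in p_values) / math.sqrt(N)
    return P, Pm, Z

# Example: four p-values from a panel with N = 5 before orthogonalization.
P, Pm, Z = meta_statistics([0.04, 0.20, 0.51, 0.09], N=5)
```

Each p-value must lie strictly between 0 and 1, since both the logarithm and the inverse normal distribution function are undefined at the endpoints.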

9 Choi (2001b) considers several statistics based on meta-analysis with random individual and time effects in (1).


Each of these tests and the limit theory applies under the null hypothesis to dynamic panel autoregressions like (38) with cross section dependence after the OP procedure has been implemented.

Other Tests for Panel Unit Roots

In fact, after transforming the data using the OP procedure, we can apply most other methods for testing panel unit roots that are valid under cross section independence. Baltagi (2001) provides a recent discussion and overview of these tests, which generally take the form of cross section averages of time series test statistics and have the generic form

$$G_\tau = \frac{1}{N-1}\sum_{i=1}^{N-1} \tau_i,$$

where $\tau_i$ stands for an individual unit root test statistic. This class of tests can also be extended by using the bias reduction techniques discussed earlier in the present paper. For instance, we could use an ADF-t statistic based not on OLS estimation but instead on EMU estimation as explained earlier (cf. Andrews and Chen, 1994).

Im, Pesaran and Shin (1997, IPS) use two cross-sectional average tests constructed like $G_\tau$ and study their small sample properties using simulations. Without modification, this type of test typically suffers from serious size distortion in small samples due to SB bias. IPS use simulation to calculate the mean and variance of the $G_\tau$ statistics, and they employ bias correction in the implementation of these procedures. However, in the dynamic panel AR(p) case, the means and variances of the $G_\tau$ statistics depend heavily on the nuisance parameters that arise in the augmented dynamic terms. Tanaka (1984) and Shaman and Stine (1988) provide formulae for the mean bias for cases up to an AR(6) for Models 1 and 2. For example, for an AR(2), the OLS estimator of $\rho_i$ in (38) will be biased downward when the true coefficient on $y^{+}_{it-2}$ is negative, while it will be biased upward when the true coefficient on $y^{+}_{it-2}$ is large and positive. IPS also found that the size distortion of their $G_\tau$ tests depends heavily on the sign of the true coefficient on $y^{+}_{it-2}$. Since their Monte Carlo studies are based on an AR(2) process, their size distortion corrections are based on the sign and magnitude of the coefficient on $y^{+}_{it-2}$. For general dynamic panel AR(p) processes, the size of the $G_\tau$ test will depend on all the nuisance parameters arising in the augmented terms and, in the absence of analytic formulae, extensive simulations are needed to make the appropriate corrections in such cases. The finite sample performance of these panel unit root tests and, more generally, tests of homogeneity is considered in the simulation experiments reported in Section 6 below.
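The simulation-based correction described above can be sketched as follows for the simplest case, assuming an AR(1) with a fitted intercept and no augmented terms, so that no nuisance parameters enter. The null moments of the individual statistic are simulated and then used to centre and scale the cross section average, in the familiar form $\sqrt{n}\,(\bar{\tau} - E\tau)/\sqrt{\mathrm{Var}\,\tau}$. The function names, the choice $T = 50$ and the 2,000 replications are illustrative assumptions, not the settings or tabulated values used by IPS.

```python
import numpy as np

def df_t_stat(y):
    """OLS Dickey-Fuller t-ratio with a fitted intercept (no lags)."""
    y = np.asarray(y, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - X.shape[1])
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

def simulate_null_moments(T, reps, rng):
    """Mean and variance of the t-ratio under a driftless random walk."""
    stats = np.empty(reps)
    for r in range(reps):
        y = np.cumsum(rng.standard_normal(T))   # unit root null
        stats[r] = df_t_stat(y)
    return stats.mean(), stats.var(ddof=1)

def standardized_average(taus, mean0, var0):
    """Centre and scale the cross section average with simulated moments."""
    taus = np.asarray(taus, dtype=float)
    n = len(taus)
    return np.sqrt(n) * (taus.mean() - mean0) / np.sqrt(var0)

rng = np.random.default_rng(1)
mean0, var0 = simulate_null_moments(T=50, reps=2000, rng=rng)
```

With augmented terms, the simulated moments would have to be recomputed for each configuration of the nuisance parameters, which is the practical difficulty noted in the text.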

5 Simulation Experiments

This section consists of three parts. First, we report the finite sample performance of the three panel median unbiased estimators. Second, we show the finite sample performance of the Wald statistic $W_{surmu}$ and the $G_{pfmgu}$ statistic. Finally, we examine the small sample performance of the panel unit root tests $G^{++}_{emu}$, $G^{++}_{ols}$, $P_m$, and $Z$, and show how well the orthogonalization


5.1 Design of Data Generating Process

The data generating process for the Þrst two parts is given by

$$y_{it} = \rho_i y_{it-1} + u_{it}, \tag{44}$$

$$u_{it} = \delta_i \theta_t + \varepsilon_{it}, \tag{45}$$

where $\varepsilon_{it} \sim \text{iid } N(0, 1)$ over $i$ and $t$, $\theta_t \sim \text{iid } N(0, 1)$ over $t$, and for $(\rho_i, \delta_i)$ parameter selections that are detailed below. The primary distinction is between the homogeneous case where $\rho_i = \rho$ for all $i$ and the heterogeneous case where the $\rho_i$ differ across individuals $i$. We also distinguish cases of high and low cross section dependence according to the value of $\delta_i$. Estimation is based on the following two regression models that involve a fitted mean and trend:

$$y_{it} = a_i + \rho_i y_{it-1} + u_{it} \quad \text{for Model M2},$$

$$y_{it} = a_i + b_i t + \rho_i y_{it-1} + u_{it} \quad \text{for Model M3}.$$
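A minimal sketch of how a panel could be drawn from (44) and (45) is given below. The function name `simulate_panel`, the zero initial condition and the burn-in of 100 periods are assumptions made for illustration, since the initialization is not specified here, and the parameter values passed in the example are illustrative only.

```python
import numpy as np

def simulate_panel(rho, delta, T, rng, burn=100):
    """Draw one (T, N) panel from y_it = rho_i y_it-1 + u_it,
    u_it = delta_i theta_t + eps_it.

    Assumptions (not from the paper): y starts at zero and the first
    `burn` observations are discarded.
    """
    rho = np.asarray(rho, dtype=float)
    delta = np.asarray(delta, dtype=float)
    N = rho.shape[0]
    theta = rng.standard_normal(T + burn)        # common factor, iid N(0,1) over t
    eps = rng.standard_normal((T + burn, N))     # idiosyncratic errors, iid N(0,1)
    u = theta[:, None] * delta[None, :] + eps    # equation (45)
    y = np.zeros((T + burn + 1, N))
    for t in range(1, T + burn + 1):
        y[t] = rho * y[t - 1] + u[t - 1]         # equation (44)
    return y[-T:], u[-T:]

# Illustrative draw: homogeneous rho with strong factor loadings.
rng = np.random.default_rng(42)
N, T = 10, 100
rho = np.full(N, 0.9)
delta = rng.uniform(1.0, 4.0, size=N)
y, u = simulate_panel(rho, delta, T, rng)
```

Because no intercept or trend enters the recursion, the fitted terms $a_i$ and $b_i t$ in Models M2 and M3 are estimated on data generated without them.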

Panel data are generated under four specifications, which differ according to the degree of cross sectional dependence and whether or not the homogeneity restriction is imposed on $\rho$. These specifications are as follows:

Case I: (Homogeneity and Low Cross-sectional Dependence) The homogeneity restriction is imposed and we set $\rho_1 = \rho_2 = \cdots = \rho_N = 0.9$, and allow low cross sectional dependence by setting $\delta_i \sim U[0, 0.2]$, where $U[a, b]$ represents the uniform distribution over the interval $[a, b]$. In this experiment, the average error ($u_{it}$) cross sectional dependence has correlation coefficient around 0.03.

Case II: (Homogeneity and High Cross-sectional Dependence) Again, we set $\rho_i = 0.9$ for all $i$ and $\delta_i \sim U[1, 4]$. Here, the lowest error ($u_{it}$) cross sectional correlation is around 0.52, the median is around 0.82, and the highest is around 0.94.

Case III: (Heterogeneity and Low Cross-sectional Dependence) Here, $\rho_i \sim U[0.7, 0.9]$, and $\delta_i \sim U[0, 0.2]$.

Case IV: (Heterogeneity and High Cross-sectional Dependence) Here, $\rho_i \sim U[0.7, 0.9]$, and $\delta_i \sim U[1, 4]$.

Case V: (Testing Homogeneity under Stationarity) Under the null hypothesis of homogeneity of $\rho$, we set $\rho_i = 0.8$ for all $i$ to investigate test size. Under the alternative, we set $\rho_i \sim U[0.7, 0.9]$ and consider test power.
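The link between the loadings $\delta_i$ and the degree of error cross section correlation in the cases above can be made explicit. Assuming $\theta_t$ and $\varepsilon_{it}$ are mutually independent with unit variance, (45) gives $\mathrm{Var}(u_{it}) = 1 + \delta_i^2$ and $\mathrm{Cov}(u_{it}, u_{jt}) = \delta_i \delta_j$ for $i \neq j$, so the implied population correlation is $\delta_i \delta_j / \sqrt{(1+\delta_i^2)(1+\delta_j^2)}$. The sketch below evaluates this formula for loadings drawn from the two uniform ranges; the function names and the choice of 20 units are illustrative assumptions.

```python
import numpy as np

def implied_error_correlation(delta):
    """Population correlation matrix of u_it = delta_i theta_t + eps_it,
    assuming theta_t and eps_it are independent with unit variance."""
    delta = np.asarray(delta, dtype=float)
    sd = np.sqrt(1.0 + delta**2)                       # std. dev. of u_it
    corr = np.outer(delta, delta) / np.outer(sd, sd)   # off-diagonal terms
    np.fill_diagonal(corr, 1.0)                        # corr(u_i, u_i) = 1
    return corr

def offdiag(corr):
    """Distinct pairwise correlations (upper triangle, no diagonal)."""
    return corr[np.triu_indices_from(corr, k=1)]

rng = np.random.default_rng(0)
low = offdiag(implied_error_correlation(rng.uniform(0.0, 0.2, size=20)))
high = offdiag(implied_error_correlation(rng.uniform(1.0, 4.0, size=20)))
print(low.mean(), high.min(), np.median(high), high.max())
```

For $\delta_i \in [1, 4]$ the formula bounds the pairwise correlations between $0.5$ and $16/17 \approx 0.94$, in line with the lowest and highest values quoted for Case II.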

Each experiment involves 5,000 replications of panel samples of $(N, T)$ observations. We use $N = 10, 20, 30$ and $T = 50, 100, 200$.

The third part of the simulation has two sections. In the first section the fitted models have intercepts and trends (as in M2 and M3) and the DGP is based on (45) and (46) with the following parameter settings: