DYNAMIC PANEL ESTIMATION AND HOMOGENEITY TESTING UNDER CROSS SECTION DEPENDENCE
By
Peter C.B. Phillips and Donggyu Sul
May 2002
COWLES FOUNDATION DISCUSSION PAPER NO. 1362
COWLES FOUNDATION FOR RESEARCH IN ECONOMICS
YALE UNIVERSITY
Box 208281
New Haven, Connecticut 06520-8281
Dynamic Panel Estimation and Homogeneity Testing Under Cross Section Dependence∗
Peter C.B. Phillips
Cowles Foundation, Yale University
University of Auckland & University of York
Donggyu Sul
Department of Economics
University of Auckland
February 12, 2002
Abstract
This paper deals with cross section dependence, homogeneity restrictions and small sample bias issues in dynamic panel regressions. To address the bias problem we develop a panel approach to median unbiased estimation that takes account of cross section dependence. The new estimators given here considerably reduce the effects of bias and gain precision from estimating cross section error correlation. The paper also develops an asymptotic theory for tests of coefficient homogeneity under cross section dependence, and proposes a modified Hausman test to test for the presence of homogeneous unit roots. An orthogonalization procedure is developed to remove cross section dependence and permit the use of conventional and meta unit root tests with panel data. Some simulations investigating the finite sample performance of the estimation and test procedures are reported.
Keywords: Autoregression, Bias, Cross section dependence, Dynamic factors, Dynamic panel estimation, GLS estimation, Homogeneity tests, Median unbiased estimation, Modified Hausman tests, Median unbiased SUR estimation, Orthogonalization procedure, Panel unit root test.
JEL Classification Numbers: C32 Time Series Models. C33 Panel Data
First Draft: August, 2000
Completed Version: December, 2001
∗Presented at the Midwest Econometrics Conference, October, 2001. Our thanks go to Feng Zhu for pointing
out some errors in the original version. Computational work was performed in GAUSS. Phillips thanks the NSF for support under Grant # SES 0092509.
1 Introduction
This paper suggests some simple and practical methods for treating three important and thorny issues that arise in estimation and testing with dynamic panel models: cross section dependence, homogeneity testing, and small sample bias (hereafter SB) problems. Each of these issues is individually important in dynamic panel regression and has received attention, particularly the SB problem on which there is a large literature. But the problems are not independent and, when they are taken together, they substantially complicate estimation and inference in dynamic panel models. The rapidly growing number of applied panel studies in growth economics, international finance, and empirical labor economics in recent years accentuates the need for these issues to be addressed in a systematic fashion. As yet, however, there have been few attempts to address these issues at the same time and the present paper is a small step in that direction, offering some new possibilities in estimation and inference. We start by noting the following implications.
First, when there is cross section dependence in panel data, commonly used econometric estimators and tests about parameters of interest generally rely on the nuisance parameters of cross section dependence. As we will show, one of the most striking effects of cross section dependence is that the pooled ordinary least squares (OLS) estimator provides little gain in precision compared with single equation OLS when cross sectional dependence occurs but is ignored in the panel regression. Another effect is that commonly used panel unit root tests are no longer asymptotically similar. These effects are easily demonstrated using a simple but intuitive parametric structure for the cross section dependence.
Second, the well known problem of SB in least squares estimation of the coefficients in dynamic models is much more serious in panel models than it is in univariate autoregressions. We provide extensions of the Nickell (1981) bias formula for cases where there is cross section dependence, error heterogeneity and nonstationarity. In some cases the bias is so marked that the true autoregressive coefficient lies completely outside the empirical distribution of the pooled OLS estimator of the dynamic autoregressive coefficient. To address this problem, the paper introduces some new panel estimation procedures that are based on the idea of median unbiased estimation (Lehmann, 1959; Andrews, 1993).
Third, homogeneity assumptions in dynamic panel models are convenient and commonly employed to take advantage of pooling in panel regression. But these restrictions are sometimes not well supported by the data and they can produce misleading results and invalidate inference, as argued, for example, by Durlauf and Quah (1999) in connection with homogeneity restrictions used in the economic growth and convergence literatures. Of particular importance in applied work is the need to take account of cross section dependence in testing homogeneity restrictions in nonstationary panels, especially in connection with panel unit root testing. This paper shows how to test for panel unit roots in the presence of cross section dependence and proposes two types of test statistic. The first type is based on median unbiased correction after eliminating cross section dependence. The second type involves the use of meta statistics which seek to avoid small sample biases rather than correct for them.
The paper gives precedence initially to the treatment of the SB problem. This is not because the issue is more important than that of cross section dependence or homogeneity, but because the SB problem arises irrespective of homogeneity testing or the presence of cross section dependence. Further, as is already well recognised, SB can make a huge difference in applied work, as the examples of HAC and dynamic response time estimation given in the next section illustrate.¹
To handle the SB problem in dynamic panel estimation and the difficulties that can arise from it, this paper proposes some panel median unbiased estimators (MUEs) that follow the approach taken by Andrews (1993) in the time series case.² Our starting point is a panel version of the MUE of Andrews in which the innovations in the panel are assumed to be free of cross sectional dependence and the autoregressive coefficient is assumed to be homogeneous across cross sectional units. Since both these assumptions are strong and are unlikely to be satisfied in empirical work, we explore the consequences of relaxing these assumptions and develop some alternate MUE procedures that are more suitable in that event.
For this purpose, we use a generalized common time effect model to parameterize the structure of cross section dependence (see equation (6) below). This structure has been used in practical work (for example, Barro and Sala-i-Martin (1992)) because of its simplicity and economic interpretability. Also, other authors (e.g., Im, Pesaran, and Shin, 1997) have suggested this parametric structure as a possible model for cross section dependence, without providing analysis but indicating that such formulations can be expected to complicate asymptotics in both stationary and nonstationary cases. Under this structure, we find that pooling GLS (which takes account of the dependence) reduces variance, but the pooled GLS estimator suffers from downward bias. To deal with these effects of cross section dependence, we develop a panel generalized MUE and find that this procedure restores the precision gains from pooling in the panel and largely removes the bias in GLS. Next, we consider the more realistic case in empirical research where there is cross sectional dependence among the innovations and heterogeneity in the autoregressive coefficients. In this case, we provide a seemingly unrelated MUE that deals with heterogeneity and cross section dependence in much the same way as the conventional SUR estimator, while also addressing the SB problem.
In panel applications it is often of interest to test whether the data support homogeneity restrictions on the coefficients, an important example being that of panel unit roots, as mentioned above. In view of the potential gains from pooling and the changes in the limit theory in the nonstationary case, homogeneity of the autoregressive coefficients in a panel is an important restriction in dynamic panel models. In developing tests of such restrictions in dynamic panels it is particularly important in empirical applications to allow for cross section dependence. To this end, the present paper investigates the properties of Wald and Hausman-type tests of homogeneity under cross section dependence and proposes a modified Hausman test procedure that helps to deal with the effects of such dependence in testing for the presence of homogeneous unit roots. An orthogonalization procedure³ is developed, which enables the development of a general class of unit root tests for panel models when there is cross section dependence.

¹ The problem of small sample bias in the least squares estimation of the coefficients in an autoregression has a long history, two important early contributions being Hurwicz (1950) and Orcutt (1948). In simple autoregressions, asymptotic formulae for the small sample bias were worked out by Kendall (1954) and Marriott and Pope (1954). Orcutt (1948) was the first to show that fitting an intercept in an autoregression produced an additional source of bias that can exacerbate the SB problem, and this was confirmed in a later simulation study by Orcutt and Winokur (1969). The point was echoed in Andrews' (1993) more recent study, which provided further simulations that included the case of a fitted linear trend.

² Our work is also related to some recent independent work by Cermeno (1999). Using simulation methods, Cermeno investigates the use of MUE estimation in a dynamic panel regression with fixed effects, a common time effect and homogeneous trends. Our framework extends Cermeno's study by developing a class of panel MUEs that address a more general case of cross section dependence and that enable tests of homogeneity restrictions on the dynamics, including the important case of unit root homogeneity.
The remainder of the paper is organized as follows. The next section shows how even a small time series SB can make a large difference in estimation and testing in the context of panel pooling. Section 3 studies the invariance properties of the panel MUE under the assumption of cross sectional independence. Since invariance breaks down under cross sectional dependence, this section also investigates alternative invariance properties that hold in the presence of cross section dependence and proposes two new estimators for this case: a pooled feasible generalized MUE and a seemingly unrelated MUE. Section 4 considers the asymptotic properties of Wald and Hausman tests for homogeneity under cross section dependence and develops some alternate procedures that offer advantages, especially in the case of unit roots. In Section 5, we report the results of a simulation experiment examining the bias and efficiency of the various panel estimators and the performance of the tests of cross section homogeneity. Section 6 provides an empirical application of the estimators to the growth convergence problem. Section 7 concludes. Derivations and some additional technical results are given in the Appendices: A derives some invariance results; B provides extensions of the Nickell (1981) bias formula to cases where there is cross section dependence, unit root nonstationarity and heterogeneous errors; C develops limit theory for the stationary and unit root nonstationary cases; D provides an algorithm for estimating the cross section dependence coefficients.
2 Dynamic Panel Models and Bias Illustrations
2.1 Model Definitions
Three basic models are considered. These are panel versions of the models in Andrews (1993). As in that work, Gaussianity is assumed in order to construct the median unbiased estimator. Each of the basic models involves a latent panel $\{y^*_{i,t} : t = 0, 1, \ldots, T;\ i = 1, \ldots, N\}$ that is generated over time as an AR(1) with errors that are independent across section. The more complex case of cross section error dependence is taken up in Section 3.2 and allowance for more general time series effects is considered in Section 4.3.

The model for $y^*_{i,t}$ is

$$y^*_{i,t} = \rho\, y^*_{i,t-1} + u_{i,t}, \quad t = 1, \ldots, T, \quad i = 1, \ldots, N, \quad \rho \in (-1, 1], \qquad (1)$$

where $u_{i,t} \sim \text{iid } N(0, \sigma_i^2)$ over $t$, $u_{i,t}$ is independent of $u_{j,s}$ for all $i \neq j$ and for all $s, t$, and the initialization is as follows:

$$y^*_{i,0} \sim \begin{cases} N\left(0, \dfrac{\sigma_i^2}{1-\rho^2}\right) & \rho \in (-1, 1), \\ O_p(1) & \rho = 1. \end{cases}$$

When $\rho \in (-1, 1)$, $y^*_{i,t}$ is a zero mean, Gaussian panel that follows an AR(1) structure over time and that is independent over $i$. When $\rho = 1$, $y^*_{i,t}$ is a Gaussian panel random walk starting from a (possibly random) initialization $y^*_{i,0}$ (not necessarily Gaussian) and that is independent over $i$. The observed panel data $\{y_{i,t} : t = 0, 1, \ldots, T;\ i = 1, \ldots, N\}$ are defined in terms of $y^*_{i,t}$ as follows:

M1: $y_{i,t} = y^*_{i,t}$ for $t = 0, \ldots, T$ and $i = 1, \ldots, N$, and $\rho \in (-1, 1)$.
M2: $y_{i,t} = \mu_i + y^*_{i,t}$ for $t = 0, \ldots, T$, $i = 1, \ldots, N$, $\mu_i \in \mathbb{R}$ and $\rho \in (-1, 1]$.
M3: $y_{i,t} = \mu_i + \beta_i t + y^*_{i,t}$ for $t = 0, \ldots, T$, $i = 1, \ldots, N$, $\mu_i, \beta_i \in \mathbb{R}$, and $\rho \in (-1, 1]$.

In each case, there is an equivalent dynamic panel representation in terms of $y_{i,t}$:

M1: $y_{i,t} = \rho\, y_{i,t-1} + u_{it}$ for $t = 1, \ldots, T$, $i = 1, \ldots, N$, and $\rho \in (-1, 1)$.
M2: $y_{i,t} = \tilde\mu_i + \rho\, y_{i,t-1} + u_{it}$ for $t = 1, \ldots, T$, $i = 1, \ldots, N$, with $\tilde\mu_i = \mu_i(1-\rho)$ and $\rho \in (-1, 1]$.
M3: $y_{i,t} = \tilde\mu_i + \tilde\beta_i t + \rho\, y_{i,t-1} + u_{it}$ for $t = 1, \ldots, T$, $i = 1, \ldots, N$, with $\tilde\mu_i = \mu_i(1-\rho) + \rho\beta_i$, $\tilde\beta_i = \beta_i(1-\rho)$, and $\rho \in (-1, 1]$.

In M1-M3, the initialization $y_{i,0} \sim N(0, \sigma_i^2/(1-\rho^2))$ when $\rho \in (-1, 1)$ and $y_{i,0} = O_p(1)$ when $\rho = 1$.

³ After the first draft of this paper was done and it was in the final stages of completion, the authors learnt that Moon and Perron (2001) have independently proposed the same approach to unit root testing in the context of dynamic panels with multiple factors.
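The coefficient mapping for M3 can be verified in two lines. The following worked step is a sketch only; the tilde notation $\tilde\mu_i$, $\tilde\beta_i$ for the intercept and trend coefficients of the dynamic representation is a labeling assumed here to keep them distinct from $\mu_i$ and $\beta_i$. Substituting $y^*_{i,t} = y_{i,t} - \mu_i - \beta_i t$ into (1) gives:

```latex
\begin{align*}
y_{i,t} - \mu_i - \beta_i t
  &= \rho \left( y_{i,t-1} - \mu_i - \beta_i (t-1) \right) + u_{i,t}, \\
y_{i,t}
  &= \underbrace{\mu_i (1-\rho) + \rho \beta_i}_{\tilde\mu_i}
   + \underbrace{\beta_i (1-\rho)}_{\tilde\beta_i}\, t
   + \rho\, y_{i,t-1} + u_{i,t}.
\end{align*}
```

Setting $\beta_i = 0$ recovers the M2 intercept $\mu_i(1-\rho)$. At $\rho = 1$ the trend coefficient vanishes and the intercept reduces to $\beta_i$, so the unit root case of M3 is a random walk with drift.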
2.2 Pooled Estimation and Bias Illustrations
Denote the pooled panel least squares (POLS) estimator of $\rho$ by $\hat\rho_{pols}$ in each of the three models M1, M2 and M3. In M2, for instance, $\hat\rho_{pols}$ has the form

$$\hat\rho_{pols} = \frac{\sum_{i=1}^{N} \sum_{t=1}^{T} (y_{it-1} - \bar y_{i\cdot -1})(y_{it} - \bar y_{i\cdot})}{\sum_{i=1}^{N} \sum_{t=1}^{T} (y_{it-1} - \bar y_{i\cdot -1})^2}, \qquad (2)$$

where $\bar y_{i\cdot} = T^{-1} \sum_{t=1}^{T} y_{it}$ and $\bar y_{i\cdot -1} = T^{-1} \sum_{t=1}^{T} y_{it-1}$.

The exact quantiles of $\hat\rho_{pols}$ were computed by simulation using 100,000 replications for a selection of $N$, $T$, and $\rho$ values and for $\sigma_i^2 = 1$. We report some summary statistics here (detailed results are available upon request) and make the following general observations: (i) the median values of the pooled OLS estimators are less than the true values for all models and all cases; (ii) the difference between the median value and the true value (which we call the median bias) is increasing as the true value of $\rho$ increases for all configurations of $(N, T)$.
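The simulation design described above can be sketched in a few lines of Python. This is a minimal illustration and not the GAUSS code used for the paper: the function names `simulate_panel`, `pols_m2` and `median_pols` are hypothetical, the sketch covers only model M2 with $\sigma_i^2 = 1$ and stationary $\rho$, and the default number of replications is far smaller than the 100,000 used for the reported quantiles.

```python
import numpy as np

def simulate_panel(N, T, rho, mu=None, rng=None):
    """Simulate model M2: y_it = mu_i + y*_it with y* a stationary Gaussian AR(1), sigma_i = 1."""
    if not -1.0 < rho < 1.0:
        raise ValueError("this sketch covers the stationary case |rho| < 1 only")
    rng = np.random.default_rng() if rng is None else rng
    mu = np.zeros(N) if mu is None else np.asarray(mu, dtype=float)
    y_star = np.empty((N, T + 1))
    # Stationary initialization: y*_{i,0} ~ N(0, 1 / (1 - rho^2)).
    y_star[:, 0] = rng.standard_normal(N) / np.sqrt(1.0 - rho**2)
    u = rng.standard_normal((N, T))
    for t in range(1, T + 1):
        y_star[:, t] = rho * y_star[:, t - 1] + u[:, t - 1]
    return mu[:, None] + y_star

def pols_m2(y):
    """Pooled OLS (within) estimator of rho in M2, as in equation (2); y has shape (N, T+1)."""
    y_lag, y_cur = y[:, :-1], y[:, 1:]
    # Each unit's lagged and current series are demeaned with their own time averages.
    y_lag_dm = y_lag - y_lag.mean(axis=1, keepdims=True)
    y_cur_dm = y_cur - y_cur.mean(axis=1, keepdims=True)
    return (y_lag_dm * y_cur_dm).sum() / (y_lag_dm**2).sum()

def median_pols(N, T, rho, reps=2000, seed=0):
    """Monte Carlo median of the POLS estimator for a given (N, T, rho)."""
    rng = np.random.default_rng(seed)
    draws = [pols_m2(simulate_panel(N, T, rho, rng=rng)) for _ in range(reps)]
    return float(np.median(draws))
```

Calling `median_pols(10, 50, 0.9)` gives a Monte Carlo approximation to the M2 median for that configuration, which should fall visibly below 0.9. Because the estimator demeans each unit, the result does not depend on the values chosen for `mu`.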
Table 1 shows the bias of the POLS estimator for each model when $\rho = 0.9$. For model M1, the bias of the OLS estimator vanishes for moderate sizes of $N$ and $T$. For example, the median values of $\hat\rho_{pols}$ are 0.88 for $N = 1$, $T = 50$, 0.89 for $N = 1$, $T = 100$ and 0.90 for $N = 10$, $T = 50$. Also, the empirical distribution of $\hat\rho_{pols}$ becomes tighter as $N$ increases. In contrast to model M1, $\hat\rho_{pols}$ suffers from substantial SB in model M2 even when $N$ or $T$ are moderately large. But, as in model M1, the distribution of $\hat\rho_{pols}$ concentrates quickly as $N$ increases. In several cases, the bias and concentration of the POLS estimator are such that the true value of $\rho$ lies almost completely outside the empirical distribution for moderate $N$. For example, for $T = 50$, the upper 95% points of $\hat\rho_{pols}$ are 0.937, 0.889, 0.880 and 0.875 for $N$ = 1, 10, 20, and 30, respectively, when $\rho = 0.9$. Even for $T = 200$ and $N = 30$, 95% of the distribution of $\hat\rho_{pols}$ is below the true value. This problem becomes more severe for model M3, where the upper 95% points of $\hat\rho_{pols}$ for $T = 50$ are 0.904, 0.843, 0.831 and 0.825 for $N$ = 1, 10, 20, and 30.
Table 1: Downward Bias in Dynamic Panel Estimation

Part A: Quantiles of $\hat\rho_{pols}$ for $\rho = 0.9$

Sample          Model M1               Model M2               Model M3
                5%     50%    95%      5%     50%    95%      5%     50%    95%
N=1, T=50       0.710  0.883  0.962    0.628  0.830  0.937    0.548  0.772  0.904
N=1, T=100      0.787  0.891  0.948    0.749  0.868  0.935    0.713  0.842  0.920
N=1, T=200      0.829  0.896  0.938    0.814  0.885  0.931    0.798  0.874  0.924
N=10, T=50      0.858  0.898  0.928    0.799  0.850  0.889    0.735  0.795  0.843
N=10, T=100     0.874  0.899  0.920    0.847  0.877  0.902    0.820  0.853  0.882
N=10, T=200     0.882  0.900  0.915    0.870  0.890  0.906    0.858  0.879  0.897
N=20, T=50      0.872  0.899  0.921    0.816  0.850  0.880    0.755  0.796  0.831
N=20, T=100     0.882  0.900  0.915    0.857  0.878  0.896    0.830  0.854  0.874
N=20, T=200     0.888  0.900  0.911    0.876  0.890  0.902    0.864  0.878  0.892
N=30, T=50      0.878  0.900  0.917    0.824  0.851  0.875    0.763  0.796  0.825
N=30, T=100     0.885  0.900  0.913    0.861  0.878  0.893    0.835  0.853  0.870
N=30, T=200     0.890  0.900  0.909    0.879  0.890  0.900    0.868  0.879  0.890

Part B: Quantiles of $\hat h$ when $\rho = 0.9$ and $h = 6.579$

Sample          Model M1                  Model M2                  Model M3
                5%      50%     95%       5%      50%     95%       5%      50%     95%
N=1, T=50       2.027   5.569  18.036     1.487   3.709  10.730     1.153   2.685   6.905
N=1, T=100      2.890   6.029  13.034     2.403   4.895  10.393     2.051   4.033   8.342
N=1, T=200      3.704   6.303  10.783     3.366   5.670   9.698     3.071   5.130   8.734
N=10, T=50      4.532   6.465   9.244     3.086   4.250   5.897     2.248   3.024   4.071
N=10, T=100     5.130   6.502   8.332     4.184   5.293   6.753     3.487   4.362   5.518
N=10, T=200     5.524   6.549   7.764     4.995   5.921   7.041     4.520   5.352   6.364
N=20, T=50      5.073   6.479   8.454     3.407   4.257   5.422     2.462   3.033   3.745
N=20, T=100     5.530   6.550   7.799     4.477   5.310   6.305     3.717   4.377   5.164
N=20, T=200     5.831   6.557   7.410     5.254   5.922   6.689     4.745   5.348   6.042
N=30, T=50      5.313   6.556   8.019     3.573   4.306   5.171     2.561   3.046   3.614
N=30, T=100     5.698   6.554   7.617     4.645   5.321   6.095     3.847   4.372   4.973
N=30, T=200     5.957   6.573   7.242     5.391   5.934   6.555     4.882   5.360   5.920

Part C: Quantiles of $\widehat{lrv}/lrv$ when $\rho = 0.9$ and $lrv = 100$

Sample          Model M1               Model M2               Model M3
                5%     50%    95%      5%     50%    95%      5%     50%    95%
N=1, T=50       0.113  0.763  7.047    0.064  0.339  2.580    0.040  0.182  1.091
N=1, T=100      0.206  0.863  3.880    0.147  0.575  2.501    0.109  0.403  1.643
N=1, T=200      0.337  0.918  2.608    0.282  0.753  2.127    0.235  0.616  1.726
N=10, T=50      0.501  0.965  1.933    0.235  0.425  0.810    0.129  0.220  0.390
N=10, T=100     0.620  0.986  1.565    0.420  0.656  1.035    0.292  0.449  0.696
N=10, T=200     0.717  0.994  1.385    0.587  0.810  1.137    0.489  0.669  0.928
N=20, T=50      0.615  0.981  1.596    0.281  0.432  0.670    0.152  0.223  0.331
N=20, T=100     0.711  0.988  1.382    0.478  0.658  0.908    0.333  0.449  0.616
N=20, T=200     0.791  0.996  1.264    0.649  0.815  1.029    0.537  0.670  0.845
N=30, T=50      0.671  0.990  1.479    0.307  0.435  0.626    0.164  0.225  0.309
N=30, T=100     0.759  0.993  1.302    0.510  0.663  0.866    0.352  0.453  0.587
N=30, T=200     0.824  0.993  1.201    0.678  0.814  0.986    0.557  0.670  0.810
The bias and concentration of the pooled estimator $\hat\rho_{pols}$ are pertinent in applications where they influence the distribution of derived statistics such as impulse responses, cumulative impulse response functions, the half-life of a unit shock ($h$) and the long run variance ($lrv$). We provide some brief illustrations of these effects in the case of $h$ and $lrv$. In the panel AR models above, the $h$ and $lrv$ estimates based on $\hat\rho_{pols}$ are $\hat h = \ln 0.5 / \ln \hat\rho_{pols}$ and $\widehat{lrv} = 1/(1 - \hat\rho_{pols})^2$. As is apparent from Tables 1(B) and 1(C), even a small SB can have large effects on these derived functions in the panel case because of the concentration of the estimate $\hat\rho_{pols}$ and the nonlinearity of the functions. As discussed in the last paragraph, the upper 95% point of the distribution of $\hat\rho_{pols}$ is smaller than $\rho$ when $N$ is moderately large, and then 95% of the distribution of $\hat h$ is less than the true half-life $h$. In model M3, for example, when $\rho = 0.9$, $N = 10$ and $T = 100$, 95% of the distribution of $\hat h$ is less than 5.518, whereas the actual half-life is $h = 6.579$. Similarly, for the same model and parameter values, 95% of the distribution of $\widehat{lrv}/lrv$ lies below 0.696. Even for $N = 30$, $T = 200$, 95% of the distribution of $\widehat{lrv}/lrv$ lies below 0.810. Table 1(C) shows how serious the bias in $\widehat{lrv}$ can be. When $T = 50$ and $N = 1$, the median value of $\widehat{lrv}$ is about 76% of the true $lrv$ for model M1 but only about 34% for model M2. For model M3, it is less than 20% of the true value when $T = 50$ and $N = 1$, and still only about 45% when $T = 100$ and $N = 30$. Thus, when estimation of the $lrv$ is based on panel data with fitted fixed effects or individual trends, the estimated $lrv$ suffers from serious downward bias. We can expect test statistics that rely on these $lrv$ estimates to be correspondingly affected.
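The sensitivity of the half-life and long run variance to small changes in the autoregressive estimate can be seen by evaluating the two formulas directly. The sketch below assumes unit error variance, as in the simulations, and uses hypothetical function names `half_life` and `long_run_variance`.

```python
import math

def half_life(rho):
    """Half-life of a unit shock in an AR(1): h = ln(0.5) / ln(rho), for 0 < rho < 1."""
    if not 0.0 < rho < 1.0:
        raise ValueError("the half-life formula used here requires 0 < rho < 1")
    return math.log(0.5) / math.log(rho)

def long_run_variance(rho, sigma2=1.0):
    """Long run variance of an AR(1) with error variance sigma2: sigma2 / (1 - rho)^2."""
    if not -1.0 < rho < 1.0:
        raise ValueError("the long run variance is finite only for |rho| < 1")
    return sigma2 / (1.0 - rho) ** 2

if __name__ == "__main__":
    true_rho = 0.9
    # Both functions are steep near rho = 0.9, so a small downward bias
    # in the estimate translates into a large proportional error.
    for rho_hat in (0.90, 0.88, 0.85, 0.80):
        h_ratio = half_life(rho_hat) / half_life(true_rho)
        lrv_ratio = long_run_variance(rho_hat) / long_run_variance(true_rho)
        print(f"rho_hat={rho_hat:.2f}  h/h_true={h_ratio:.3f}  lrv/lrv_true={lrv_ratio:.3f}")
```

At the true value, `half_life(0.9)` is about 6.579 and `long_run_variance(0.9)` is 100, the reference values used in Table 1(B) and 1(C).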
3 Panel Median Unbiased Estimation
This section proposes three panel median unbiased estimators. The first estimator is a panel exactly median unbiased (PEMU) estimator, constructed under the assumptions of a homogeneous AR(1) parameter and cross sectional independence. This estimator is a panel version of Andrews' exactly median unbiased estimator in the time series case. It is of interest to see how this procedure is affected by panel observations. As mentioned in the introduction, Cermeno (1999) has independently proposed the use of a PEMU estimator for dynamic panel models with a common time effect and homogeneous trends and shows in simulations that the approach can work well in models of this type.
The PEMU estimator is based on the assumption of cross section independence (or the presence of a common time effect) which will often be too strong in practical work, particularly with macroeconomic panels. In such applications, PEMU is likely to be less relevant than our second and third estimators, which are designed to take account of cross section dependence that is more general than a common time effect. We will calibrate the performance of the new median unbiased estimators against that of the conventional POLS estimator in cases where there is cross sectional dependence amongst the regression errors. This comparison will highlight the gains of working with median unbiased estimators in the panel context, especially when there is cross section dependence.
3.1 Panel Exactly Median Unbiased Estimation
As discussed in Andrews (1993), it is useful in the construction of median unbiased estimators for the distribution of the least squares estimator to be invariant to scale and other nuisance parameters. It is well known (e.g. Dickey and Fuller, 1979) that least squares estimates of the autoregressive coefficient in pure time series versions of models M1, M2 and M3 satisfy such distributional invariance properties. These invariance results extend to the pooled panel forms of the least squares estimators in models M1, M2 and M3 under certain conditions, which we now provide. The following property is a panel version of the property given in Andrews (1993) for the time series case. As before, the POLS estimator of $\rho$ is generally denoted by $\hat\rho_{pols}$ for each of the three models M1, M2, and M3; but when there is possible ambiguity, we use an additional subscript and write $\hat\rho_{pols,j}$ for the POLS estimator of $\rho$ in model $j$.
Invariance Property IP1: Under the assumption of cross section independence, the distribution of $\hat\rho_{pols,j}$ depends only on $\rho$ when model $j$ is correct and the error variance $\sigma_i^2 = \sigma^2$ for all $i$. When $y_{it}$ is stationary, it does not depend on the common variance $\sigma_i^2$ for model M1, or $(\sigma_i^2, \mu_i)$ for model M2, or $(\sigma_i^2, \mu_i, \beta_i)$ for model M3, nor on the value of $y_{i0}$ when $\rho = 1$ and $y_{it}$ is non-stationary.
The common variance condition in IP1 is a strong one and will be inappropriate in many applications. It may be relaxed by allowing the individual error variances $\sigma_i^2$ to be iid draws from a distribution with common scale. For example, if $\sigma_i^2/\sigma^2$ are iid $\chi^2_1$, then $u_{it}/\sigma = (u_{it}/\sigma_i)(\sigma_i/\sigma)$, which is independent of nuisance parameters. The numerator and denominator of $\hat\rho_{pols}$ may then be rescaled by $1/\sigma^2$ and it is apparent that IP1 continues to hold, as shown in the Appendix. For more general cases of variation in $\sigma_i^2$ over $i$, we may use weighted least squares in the construction of the panel estimator $\hat\rho_{pols}$. This extension and other generalizations of $\hat\rho_{pols}$ that are better suited to empirical applications are discussed later. For the time being, we confine our discussion to the estimator $\hat\rho_{pols}$ and those cases where property IP1 holds.
Property IP1 enables the construction of a panel version of the exactly median unbiased estimator (PEMU) in Andrews (1993). We start by noting that $\hat\rho_{pols}$ has a median function $m(\rho) = m_{T,N}(\rho)$ which simulation shows to be strictly increasing⁴ in $\rho$ on the parameter space $\rho \in (-1, 1]$. Using this function (which depends on $T$ and $N$), the panel median-unbiased estimator $\hat\rho_{pemu}$ can be defined as follows:

$$\hat\rho_{pemu} = \begin{cases} 1 & \text{if } \hat\rho_{pols} > m(1), \\ m^{-1}(\hat\rho_{pols}) & \text{if } m(-1) < \hat\rho_{pols} \le m(1), \\ -1 & \text{if } \hat\rho_{pols} \le m(-1), \end{cases} \qquad (3)$$

where $m(-1) = \lim_{\rho \to -1} m(\rho)$ and $m^{-1}$ is the inverse function of $m(\cdot) = m_{T,N}(\cdot)$ so that $m^{-1}(m(\rho)) = \rho$. Furthermore, a $100(1-p)\%$ confidence interval for $\rho$ in model $j$ can be constructed as follows. Let $q_L(\cdot)$ and $q_U(\cdot)$ be the lower and upper quantile functions for $\hat\rho_{pols}$. Define

$$\hat c^{L}_{PU} = \begin{cases} 1 & \text{if } \hat\rho_{pols} > q_U(1), \\ q_U^{-1}(\hat\rho_{pols}) & \text{if } q_U(-1) < \hat\rho_{pols} \le q_U(1), \\ -1 & \text{if } \hat\rho_{pols} \le q_U(-1), \end{cases} \qquad (4)$$

$$\hat c^{U}_{PU} = \begin{cases} 1 & \text{if } \hat\rho_{pols} > q_L(1), \\ q_L^{-1}(\hat\rho_{pols}) & \text{if } q_L(-1) < \hat\rho_{pols} \le q_L(1), \\ -1 & \text{if } \hat\rho_{pols} \le q_L(-1). \end{cases} \qquad (5)$$

Then, $\hat c^{U}_{PU}$ and $\hat c^{L}_{PU}$ provide upper and lower confidence limits and the $100(1-p)\%$ confidence interval for $\rho$ is $\{\rho : \hat c^{L}_{PU} \le \rho \le \hat c^{U}_{PU}\}$. This construction follows Andrews (1993). The intervals are obtained in precisely the same way as in that paper, but use tables of the quantiles of the panel estimator $\hat\rho_{pols}$.

⁴ An analytic demonstration of this property would be useful but is not presently available either in the panel
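The truncated inversion used in (3), (4) and (5) is the same operation applied to different quantile functions. The following is a minimal sketch of that operation, assuming the quantile function has already been tabulated by simulation on a grid of $\rho$ values; the function name `invert_quantile_function`, the grid, and the use of linear interpolation between grid points are all assumptions of this illustration rather than details taken from the paper.

```python
import numpy as np

def invert_quantile_function(rho_hat, rho_grid, q_values):
    """Apply the truncation rule of (3)-(5) to a tabulated quantile function.

    rho_grid : increasing grid of rho values, starting just above -1 and ending at 1.0
    q_values : simulated quantile (for example the median) of the POLS estimator
               at each grid point, assumed strictly increasing in rho
    """
    rho_grid = np.asarray(rho_grid, dtype=float)
    q_values = np.asarray(q_values, dtype=float)
    if np.any(np.diff(q_values) <= 0):
        raise ValueError("the quantile function must be strictly increasing on the grid")
    if rho_hat > q_values[-1]:
        # Estimate lies above q(1): truncate at the unit root.
        return 1.0
    if rho_hat <= q_values[0]:
        # Estimate lies at or below the grid approximation to q(-1).
        return -1.0
    # Inverse function by linear interpolation: q_values plays the role of the x-axis.
    return float(np.interp(rho_hat, q_values, rho_grid))
```

With the tabulated median function as `q_values`, the return value corresponds to the estimator in (3). Passing the upper quantile function yields the lower confidence limit in (4), and passing the lower quantile function yields the upper limit in (5).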
3.2 Panel Feasible Generalized Median Unbiased Estimator
The assumption of no cross sectional correlation among the regression residuals is a strong one and is unlikely to hold in many applications. When the structure of cross sectional dependence among the regression errors is completely unknown, it is generally infeasible to deal with the correlations because of degrees of freedom constraints. Hence, it is common to assume some simplifying form of dependence structure. The most conventional way to handle cross section dependence has been to include a common time dummy in the panel regression. The justification for the common time effect is that certain co-movements of multivariate time series may be due to a common factor. For example, in cross country panels it might be argued that the time dummy represents a common international effect (e.g. a global shock or a common business cycle factor), or in a panel study of purchasing power parity it may represent the numeraire currency.
The model we use here allows for a common time effect that can impact individual series differently. Specifically, the model for the regression errors has the form

$$u_{it} = \delta_i \theta_t + \varepsilon_{it}, \quad \theta_t \sim \text{iid } N(0, 1) \text{ over } t, \qquad (6)$$

in which $\theta_t$ is a common time effect, whose variance is normalized to be unity for identification purposes and whose coefficients, $\delta_i$, may be regarded as 'idiosyncratic share' parameters that measure the impact of the common time effect on series $i$. The $\delta_i$ are assumed to be non-stochastic and we let $\delta = (\delta_1, \ldots, \delta_N)$. In (6) the general error component $\varepsilon_{it}$ is assumed to satisfy

$$\varepsilon_{i,t} \sim \text{iid } N(0, \sigma_i^2) \text{ over } t, \text{ and } \varepsilon_{i,t} \text{ is independent of } \varepsilon_{j,s} \text{ and } \theta_s \text{ for all } i \neq j \text{ and for all } s, t.$$

(Footnote 4, continued) values $T \ge 20$ and $N \ge 5$. There seems to be some evidence from simulations that the property fails for small $T$ when $N = 1$. Andrews (1993, fn. 4) reports that the 0.95 quantile function appears to dip slightly for values of $\rho$ close to unity for small values of $T$.
In this formulation, the source of the cross sectional dependence is generated from the common stochastic series $\theta_t$ and the extent of the dependence is measured by the coefficients $\delta_i$. In particular, the covariance between $u_{it}$ and $u_{jt}$ ($i \neq j$) is given by

$$E(u_{it} u_{jt}) = \delta_i \delta_j. \qquad (7)$$

There is no cross sectional correlation when $\delta_i = 0$ for all $i$, and there is identical cross sectional correlation when $\delta_i = \delta_j = \delta_0$ for all $i$ and $j$. Thus, the degree of cross sectional correlation is controlled by the components of $\delta$. Setting $u_t = (u_{1t}, \ldots, u_{Nt})'$ we have the conditional covariance matrix

$$V_u = E\left(u_t u_t' \mid \sigma_1^2, \ldots, \sigma_N^2\right) = \Sigma + \delta\delta', \quad \Sigma = \mathrm{diag}\left(\sigma_1^2, \ldots, \sigma_N^2\right). \qquad (8)$$

The model (6) can be regarded as a single factor model in which $\theta_t$ is the common factor and $\delta_i$ is the factor loading for series $i$. It has been used in empirical research in studying growth convergence by Barro and Sala-i-Martin (1992). More general versions of this model that allow for weakly dependent time series effects and multiple factors have been considered in recent work by Bai and Ng (2001) and Moon and Perron (2001) that concentrates on model determination issues relating to the number of factors and panel unit root testing. The models used by these authors are more complex than (6), especially with regard to time series properties. Nonetheless, (6) is general enough to allow for interesting cases of high and low cross sectional dependence and yet simple enough to enable us to develop good procedures for bias removal in dynamic panel regressions where cross section dependence arises. In the panel unit root case, we show later in the paper that time series effects in $\varepsilon_{it}$ can be treated by a simple augmented dynamic panel regression and that time series effects in $\theta_t$ can be treated simply by projecting on the space orthogonal to $\delta$.
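The covariance structure in (7) and (8) can be checked numerically by drawing errors from the single factor model (6) and comparing the sample second moment matrix with $\Sigma + \delta\delta'$. The sketch below uses hypothetical helper names `factor_error_covariance` and `simulate_factor_errors`; the loadings and variances in any call are illustrative inputs, not values from the paper.

```python
import numpy as np

def factor_error_covariance(delta, sigma2):
    """V_u = Sigma + delta delta' as in equation (8), with Sigma = diag(sigma2)."""
    delta = np.asarray(delta, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    return np.diag(sigma2) + np.outer(delta, delta)

def simulate_factor_errors(delta, sigma2, T, rng):
    """Draw u_it = delta_i * theta_t + eps_it as in equation (6); returns an (N, T) array."""
    delta = np.asarray(delta, dtype=float)
    sigma2 = np.asarray(sigma2, dtype=float)
    theta = rng.standard_normal(T)  # common time effect with unit variance
    eps = rng.standard_normal((delta.size, T)) * np.sqrt(sigma2)[:, None]
    return delta[:, None] * theta[None, :] + eps

def implied_correlation(delta, sigma2):
    """Cross section correlation matrix implied by V_u."""
    V = factor_error_covariance(delta, sigma2)
    scale = np.sqrt(np.diag(V))
    return V / np.outer(scale, scale)
```

The off-diagonal entries of `implied_correlation` equal $\delta_i\delta_j / \sqrt{(\sigma_i^2 + \delta_i^2)(\sigma_j^2 + \delta_j^2)}$, which makes explicit how larger loadings relative to the idiosyncratic variances produce higher cross sectional correlation.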
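As in the earlier case with cross sectional independence, it will be convenient in what follows to assume that the individual error variances $\sigma_i^2$ are iid draws from a distribution with common scale. More particularly, we assume that $\tau_i = \sigma_i^2/\sigma^2$ are iid draws from an independent distribution with density $f(\tau)$ that does not involve further nuisance parameters and whose first moment is finite. Then, the standardized error component

$$\frac{u_{it}}{\sigma} = \tilde\delta_i \theta_t + \frac{\varepsilon_{it}}{\sigma_i} \frac{\sigma_i}{\sigma}, \quad \text{where } \tilde\delta_i = \delta_i/\sigma,$$

has unconditional variance matrix

$$E\left(\frac{u_t u_t'}{\sigma^2}\right) = \int_0^\infty \left[\tau I + \tilde\delta\tilde\delta'\right] f(\tau)\, d\tau = E(\tau)\, I_N + \tilde\delta\tilde\delta', \quad \text{with } \tilde\delta = \delta/\sigma.$$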
With this formulation for the error variances, the numerator and denominator of $\hat\rho_{pols}$ may be rescaled by $1/\sigma^2$, giving some invariance characteristics to the panel estimator $\hat\rho_{pols}$ and stronger invariance properties to the panel generalized least squares estimator $\hat\rho_{pgls}$ defined by

$$\hat\rho_{pgls} = \frac{\sum_{t=1}^{T} \hat y_{t-1}' V_u^{-1} \hat y_t}{\sum_{t=1}^{T} \hat y_{t-1}' V_u^{-1} \hat y_{t-1}}, \qquad (9)$$

where $\hat y_t = (\hat y_{1t}, \ldots, \hat y_{Nt})'$ and where $\hat y_{it}$ denotes $y_{it}$ or demeaned or detrended $y_{it}$, respectively for models M1, M2 and M3. In particular, we have the following property.
Invariance Property IP2: Under cross sectional dependence of the form (6), the distribution of $\hat\rho_{pols,j}$ depends only on $(\rho, \tilde\delta = \delta/\sigma)$ when model $j$ is correct and the error variance ratios $\tau_i = \sigma_i^2/\sigma^2$ are iid draws from an independent distribution with density $f(\tau)$ that does not involve further nuisance parameters. Further, the distribution of the panel GLS estimator $\hat\rho_{pgls}$ depends only on $\rho$ when model $j$ is correct. When $\rho = 1$ and $y_{it}$ is non-stationary, the distributions of $\hat\rho_{pols}$ and $\hat\rho_{pgls}$ for models M2 and M3 do not depend on the value of $y_{i0}$.
Appendix B analyzes the bias of $\hat\rho_{pols}$ and shows that to first order this is the same as the conventional Nickell (1981) bias under cross section independence and does not depend on the (standardized) cross section parameters $\delta_i$ to $O(1/T)$. However, the bias and the distribution of $\hat\rho_{pols}$ do depend on $\delta_i$ at higher order, as is apparent from equation (61) in Appendix B. On the other hand, the distribution of the panel GLS estimator depends only on $\rho$. Accordingly, we now propose an iterative procedure that involves the use of a feasible GLS estimator, $\hat\rho_{pfgls}$, whose form is specified below in (10). Our objective is to reduce the SB problems of these least squares procedures by constructing a feasible generalized version of the PMU estimator of $\rho$.
The first stage in this iteration uses the residuals from a panel regression in which we use our median unbiased estimator $\hat\rho_{pemu}$ rather than OLS to reduce the SB problems in this primary stage. Simulations we have conducted that are reported below (see Fig. 2) indicate that use of the PMU estimator in the first stage helps to remove bias and improve estimates of the error variance matrix even in the presence of cross section dependence. The next stage of the iteration involves the construction of a panel feasible generalized median unbiased (PFGMU) estimator that utilizes this estimated error covariance matrix. In this construction, we use the median function $m(\rho) = m_{T,N}(\rho)$ of the estimator $\hat\rho_{pfgls}$, which simulations show to be strictly increasing in $\rho$ on the parameter space $\rho \in (-1, 1]$. Using this median function (which depends on $T$ and $N$), the panel feasible generalized median-unbiased estimator, $\hat\rho_{pfgmu}$, can be defined as in (3). The process can be continued, revising the estimate of the error covariance matrix in each iteration.
To fix ideas, the steps in the iteration are laid out as follows:
Step 1: Obtain the estimator $\hat{\rho}_{pemu}$ and, using the residuals from this regression, construct the error covariance matrix estimate $\hat{V}_{pemu}$.
Figure 1: Empirical Distributions of Single Equation OLS, POLS and PEMU under No Cross Sectional Dependence (T = 100, N = 20, ρ = 0.9).
Step 2: Using $\hat{V}_{pemu}$, perform panel generalized least squares as in (9) and obtain the PFGLS estimate of $\rho$ defined by
$$\hat{\rho}_{pfgls} = \frac{\sum_{t=1}^{T} \hat{y}_{t-1}' \hat{V}_{pemu}^{-1} \hat{y}_{t}}{\sum_{t=1}^{T} \hat{y}_{t-1}' \hat{V}_{pemu}^{-1} \hat{y}_{t-1}}. \tag{10}$$
Step 3: The panel feasible generalized median-unbiased estimator (PFGMU) is now calculated as $\hat{\rho}_{pfgmu} = m^{-1}(\hat{\rho}_{pfgls})$ just as in (3) but using the median function $m(\rho) = m_{T,N}(\rho)$ of the estimator $\hat{\rho}_{pfgls}$.
Step 4: Repeat Steps 1-3 (using updated estimates of $\rho$ in the first stage rather than $\hat{\rho}_{pemu}$) until $\hat{\rho}_{pfgmu}$ converges.
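The median-unbiased correction in Step 3 can be illustrated with a minimal sketch. It is not the authors' GAUSS code: it assumes a single AR(1) series with a fitted intercept rather than the panel estimator, simulates the median function on a coarse grid, and inverts it by linear interpolation. The function names, the grid, the zero initial condition and the boundary rule (estimates above the median at the upper grid point are mapped to that point) are all assumptions made for illustration.

```python
import random
import statistics


def ols_ar1_with_intercept(y):
    """OLS slope from regressing y_t on a constant and y_{t-1}."""
    x, z = y[:-1], y[1:]
    xbar, zbar = statistics.fmean(x), statistics.fmean(z)
    num = sum((a - xbar) * (b - zbar) for a, b in zip(x, z))
    den = sum((a - xbar) ** 2 for a in x)
    return num / den


def simulate_median(rho, T, reps, rng):
    """Monte Carlo median of the OLS estimator at a given rho (assumes y_0 = 0)."""
    draws = []
    for _ in range(reps):
        y = [0.0]
        for _ in range(T):
            y.append(rho * y[-1] + rng.gauss(0.0, 1.0))
        draws.append(ols_ar1_with_intercept(y))
    return statistics.median(draws)


def invert_median_function(grid, medians, estimate):
    """Return m^{-1}(estimate) by linear interpolation on an increasing grid.

    Estimates outside the simulated range are mapped to the nearest end of
    the grid (an assumed boundary rule for this sketch).
    """
    if estimate >= medians[-1]:
        return grid[-1]
    if estimate <= medians[0]:
        return grid[0]
    for k in range(1, len(grid)):
        if estimate <= medians[k]:
            w = (estimate - medians[k - 1]) / (medians[k] - medians[k - 1])
            return grid[k - 1] + w * (grid[k] - grid[k - 1])
    return grid[-1]
```

In use, one would tabulate `medians = [simulate_median(r, T, reps, rng) for r in grid]` once for the relevant sample size and then pass each raw estimate through `invert_median_function`. In the panel setting the same inversion would be applied to the simulated median function of the pooled feasible GLS estimator.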
Fig. 1 displays a kernel estimate of the distribution of POLS based on 100,000 replications with N = 20, T = 100, ρ = 0.9 when there is no cross sectional dependence. Apparently, the POLS estimator ˆρpols is more concentrated than single equation OLS (which does not use the additional cross section data) but is badly biased downwards. The bias is sufficiently serious that almost the entire distribution of ˆρpols lies below the true value of ρ.
Fig. 2 shows the distributions of the POLS and PMU estimators for the same parameter configuration as Fig. 1 and based on the same number of replications, but with high cross sectional correlation$^5$. As shown in Appendix B, the POLS bias in the case of cross section dependence is the same to first order as the bias in the cross section independent case, and this bias equivalence between the two cases is borne out in the simulation results. As is apparent from Fig. 2, the main effect of the cross sectional dependence is to increase the variation of both the POLS and PMU estimators. In fact, in the displayed case (where the average cross section correlation is around 0.82) the POLS and PMU estimators show only a slight gain in concentration over single equation OLS. In other words, if there is high cross sectional correlation, there is not much efficiency gain from pooling in the POLS estimator. Fig. 3 shows the distribution of the POLS estimator in which a common time effect (CTE) has been estimated. While this estimator is obviously inappropriate under the general form of cross section dependence considered in (6), it is a commonly used procedure in practice and is applicable when the elements of δ all take on a common value. As is apparent from Fig. 3, this estimator successfully reduces variance even though the presence of a common time effect in estimation provides only a crude approximation to the error structure (6).
Figure 2: Empirical Distributions of POLS, PFGLS, and PFGMU under High Cross Section Dependence (T = 100, N = 20, ρ = 0.9).
Figure 3: Same as in Fig. 2 with the addition of POLS with a common time effect (CTE) under High Cross Section Dependence (T = 100, N = 20, ρ = 0.9).
Figure 4: Extended Comparison of PMU with Common Panel IV Estimators under High Cross Section Dependence (T = 100, N = 20, ρ = 0.9).
Figs. 2 and 3 show that the PMU estimator is still quite effective in removing the bias of POLS even under cross section dependence. But its high variance makes it a less appealing estimator for applications than our PFGMU estimator, which reduces variance and removes bias, as we now discuss. Figs. 2 and 3 show the distributions of both the feasible GLS procedures, PFGLS and PFGMU. Evidently, the PFGLS estimator $\hat{\rho}_{pfgls}$ does restore much of the original gains from pooling in terms of variance reduction that were apparent in Fig. 1 for $\hat{\rho}_{pols}$. But, as is also apparent from Fig. 2, the distribution of $\hat{\rho}_{pfgls}$ is seriously downward biased. Use of the PFGMU median unbiased procedure corrects for this bias while retaining the concentration gains of the GLS estimator. In particular, the distribution of $\hat{\rho}_{pfgmu}$ is well centered about the true value and has concentration close to that of the median unbiased estimator $\hat{\rho}_{pemu}$ under cross sectional independence (Fig. 1).
Fig. 4 shows some comparisons of POLS and PMU in the cross section dependent case against some alternative procedures that have been suggested for dynamic panel regression. The first of these is the crude first difference instrumental variable estimator (FD-IV), which uses $y_{it-2}$ as an instrument in a first differenced form of the model. Apparently, FD-IV has variation substantially in excess of all the other estimators. The commonly used GMM estimator, which uses the full set of instruments $\{y_{is} : s = 0, 1, ..., t-2\}$, shows downward bias but not as severe as POLS and it seems to have comparable variance. HK is the bias corrected GMM estimator suggested in Hahn and Kuersteiner (2000) and Hahn, Hausman and Kuersteiner (2001) and this estimator apparently has performance closest to that of the PMU estimator. All these procedures clearly show inferior performance to the $\hat{\rho}_{pfgmu}$ estimator under high cross section dependence.
3.3 Seemingly Unrelated Median Unbiased Estimation
The results above indicate that, if we are to gain from panel estimation by pooling cross section and time series information when there is cross section dependence, we need to take account of the dependence in estimation. In contrast, most empirical studies that utilize dynamic panels in the international finance and the macroeconomic growth literatures tend to ignore issues of cross sectional dependence when pooling. Our results indicate that there is information in cross sectional correlation that is valuable in pooled estimation and that it can be accounted for, at least in situations where the cross section sample size N is not too large. Moreover, one can utilize this information and at the same time deal with SB bias problems in dynamic panel estimation.
Notwithstanding these potential advantages of pooling dependent data and adjusting for bias in dynamic panels, perhaps the most important issue in pooled regressions relates to the justification of the homogeneity restriction on the autoregressive coefficient ρ. In the absence of this restriction, it might be thought that there would be little gain from pooling time series and cross section data. However, because of cross section dependence, there are advantages to pooling panel data even in the estimation of heterogeneous coefficients. The reasoning is the same as that of a conventional seemingly unrelated regression (SUR) system. But in a dynamic panel context there are still SB bias problems that need attention. This section shows that these can be addressed using a SUR version of the panel median unbiased procedure.
An additional advantage to performing heterogeneous coefficient estimation is that it facilitates testing of the homogeneity restriction. Therefore, this section also proposes a test for homogeneity that is based on the seemingly unrelated panel median-unbiased (SUR-MU) estimator.
We start the discussion by combining Models M1, M2 and M3 with the following heterogeneous autoregressive panel model for the latent panel variable $y_{it}^*$:
$$y_{it}^* = \rho_i y_{it-1}^* + u_{it}, \quad \text{for } t = 1, \dots, T, \text{ and } i = 1, \dots, N, \tag{11}$$
in which the regression errors
$$u_t \sim \text{iid } N(0, V_u), \quad \text{for } t = 1, \dots, T, \tag{12}$$
where $u_t = (u_{1t}, \dots, u_{Nt})'$. This formulation allows for a general form of cross section error
permitted for each of the models.
When $|\rho_i| < 1$ for all i, the cross section error correlations are higher than the cross section correlations among the regressors $y_{it-1}$. To see this, note that the correlation between $y_{it}$ and $y_{jt}$ is given by
$$\gamma_{y,ij} = \frac{E(y_{it}y_{jt})}{\{E(y_{it}^2)E(y_{jt}^2)\}^{1/2}} = \gamma_{ij}\,\frac{\sqrt{1-\rho_i^2}\,\sqrt{1-\rho_j^2}}{1-\rho_i\rho_j} < \gamma_{ij}, \tag{13}$$
where $\gamma_{ij} = E(u_{it}u_{jt})/\{E(u_{it}^2)E(u_{jt}^2)\}^{1/2}$. We might therefore anticipate the potential gains from SUR estimation to be substantial: the regressors are different and less correlated across individual equations in the panel for which the errors are more correlated. In consequence, we propose a SUR-MU estimator based on the following iteration.
Step 1: Obtain the time series panel median unbiased estimates $\hat{\rho}_{i,emu}$ for each series i = 1, ..., N (and the appropriate model) and use the regression residuals to construct the error covariance matrix estimate $\hat{V}_{EMU}$.
Step 2: Using $\hat{V}_{EMU}$, perform a conventional seemingly unrelated regression on the panel and obtain the SUR estimates of the $\rho_i$, $\hat{\rho}_{i,sur}$.
Step 3: The panel seemingly unrelated median unbiased (SUR-MU) estimator is now calculated as $\hat{\rho}_{i,surmu} = m^{-1}(\hat{\rho}_{i,sur})$ just as in (3) but using the median function $m(\rho) = m_{T,N}(\rho)$ of the estimator $\hat{\rho}_{i,sur}$ for each i.
Step 4: Repeat Steps 1-3 until $\hat{\rho}_{i,surmu}$ converges.
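The SUR step of this iteration can be sketched as follows for the no-intercept model M1. The sketch assumes the panel is stored as a (T+1) x N NumPy array and uses the fact that, with a diagonal regressor matrix, the SUR normal equations reduce to Hadamard products of the inverse error covariance with cross-moment matrices. The function names are hypothetical and the median-unbiased correction of Step 3 is not included.

```python
import numpy as np


def ols_ar1_by_unit(y):
    """Equation-by-equation OLS estimates of rho_i (no intercept)."""
    y_lag, y_cur = y[:-1], y[1:]
    return (y_lag * y_cur).sum(axis=0) / (y_lag ** 2).sum(axis=0)


def residual_covariance(y, rho):
    """Moment matrix of residuals u_it = y_it - rho_i * y_{it-1}."""
    u = y[1:] - y[:-1] * rho
    return u.T @ u / u.shape[0]


def sur_ar1(y, v_hat):
    """SUR estimates of heterogeneous AR(1) coefficients given v_hat.

    With Z_t = diag(y_{1,t-1}, ..., y_{N,t-1}):
      sum_t Z_t' V^{-1} Z_t = V^{-1} * (sum_t y_{t-1} y_{t-1}')   (Hadamard)
      [sum_t Z_t' V^{-1} y_t]_i = sum_j v^{ij} sum_t y_{i,t-1} y_{j,t}
    """
    y_lag, y_cur = y[:-1], y[1:]
    v_inv = np.linalg.inv(v_hat)
    a = v_inv * (y_lag.T @ y_lag)
    b = (v_inv * (y_lag.T @ y_cur)).sum(axis=1)
    return np.linalg.solve(a, b)
```

A first pass would call `residual_covariance(y, ols_ar1_by_unit(y))` (or use median unbiased first-stage estimates) and feed the result to `sur_ar1`. When the covariance estimate is diagonal the SUR estimates coincide with equation-by-equation OLS, which provides a simple check.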
4 Testing Homogeneity Restrictions
Using unrestricted estimates of the coefficients $\rho_i$ in the heterogeneous dynamic panel model (11), Wald tests can be constructed to test the homogeneity restriction $H_0: \rho_i = \rho$ for all i. It is well known that in finite samples, Wald tests suffer from size distortion that is sometimes serious even in simple univariate regressions. For the panel regression case here we have found that the size distortion of Wald tests becomes even more serious as the cross section sample size N increases. This section first investigates the asymptotic properties of Wald tests based on the SUR approach in both the stationary and nonstationary cases and shows how cross section dependencies affect the asymptotic theory under nonstationarity. We then propose an alternative Wald procedure for testing homogeneity that utilizes the structure of the cross section dependence in the construction of the Wald statistic.
4.1 The Wald Test and its Asymptotic Properties
The Stationary Case
Using the unrestricted estimates $\hat{\rho}_{i,surmu}$ of the coefficients $\rho_i$ in the heterogeneous dynamic panel model (11), Wald tests can be constructed to test the homogeneity restriction $H_0: \rho_i = \rho$ for all i. More specifically, let $\hat{\rho}_{surmu} = (\hat{\rho}_{i,surmu})$ be the SUR-MU estimate of the vector $\rho = (\rho_1, ..., \rho_N)'$ and write the restrictions in $H_0$ as $D\rho = 0$ where $D = [i_{N-1}, -I_{N-1}]$ and $i_A$ has A unit elements. Under Gaussianity and in the stationary case where $|\rho_i| < 1$ for all i, the SUR-MU estimator $\hat{\rho}_{surmu}$ is asymptotically ($T \to \infty$, N fixed) equivalent to the unconstrained maximum likelihood estimate$^6$ of $\rho$. In that case, standard stationary asymptotics and some algebraic manipulations (outlined in Appendix C) lead to the limit theory
$$\sqrt{T}\left(\hat{\rho}_{surmu} - \rho\right) \to_d N(0, V_{SUR}), \tag{14}$$
where
$$V_{SUR}^{-1} = \left[\left(v_u^{ij} E(y_{it}y_{jt})\right)_{ij}\right] = V_u^{-1} * E(y_t y_t'). \tag{15}$$
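In (15) the operator $*$ is the Hadamard product, $v_u^{ij}$ is the ij'th element of $V_u^{-1}$, where $V_u = E(u_t u_t') = \Sigma + \delta\delta'$ as in (8), and
$$E(y_{it}y_{jt}) = \begin{cases} \dfrac{\delta_i\delta_j}{1-\rho_i\rho_j}, & i \neq j, \\[1ex] \dfrac{\sigma_i^2+\delta_i^2}{1-\rho_i^2}, & i = j, \end{cases}$$
so that
$$E(y_t y_t') = (\Sigma + \delta\delta') * R, \quad \text{where } R = (r_{ij}) \text{ and } r_{ij} = \frac{1}{1-\rho_i\rho_j}. \tag{16}$$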
From (15) and (16) it is apparent that the covariance matrix $V_{SUR}$ depends on both $\rho$ and $\delta$ as well as $\Sigma$. When $H_0$ holds, $E(y_t y_t') = (\Sigma + \delta\delta')/(1-\rho^2)$ and $V_{SUR}$ has a simpler form in which
$$V_{SUR}^{-1} = \frac{1}{1-\rho^2}\, V_u^{-1} * V_u, \tag{17}$$
which depends on the common $\rho$ and again on the cross section dependence parameter $\delta$. The Wald statistic for testing $H_0$ is
$$W_{surmu} = \hat{\rho}_{surmu}' D' \left[D \hat{V}_{SURMU} D'\right]^{-1} D \hat{\rho}_{surmu}, \quad \text{where} \quad \hat{V}_{SURMU} = \left[\sum_{t=1}^{T} Z_t' \hat{V}_u^{-1} Z_t\right]^{-1},$$
in which $Z_t = \mathrm{diag}(y_{1t-1}, ..., y_{Nt-1})$ and $\hat{V}_u$ is an estimate of the error covariance matrix $V_u$ computed from the SUR-MU regression residuals. Under $H_0$ and in the stationary case, it is straightforward to show that traditional chi-squared limit theory for $W_{surmu}$ holds, i.e. $W_{surmu} \to \chi^2_N$.
6 Note that the median function $m(\cdot)$ is asymptotically ($T \to \infty$, N fixed) the identity function and the SUR estimator of $\rho$ is the vector of Gaussian maximum likelihood estimators of the autoregressive coefficients in the unconstrained models.
The Unit Root Case
In the nonstationary $\rho = 1$ case, the asymptotic results depend, as might be expected, on whether M1, M2 or M3 is employed in estimation and also on the boundary condition that arises in the transition from the SUR estimator to SUR-MU (cf. (3)). In addition, the asymptotic theory for the SUR estimator is more complex than that of a traditional unit root model when there is cross section dependence. For instance, when model M1 is used and the null hypothesis $H_0: \rho_i = 1\ \forall i$ holds, derivations (outlined in Appendix C) using standard unit root limit theory deliver the limit distribution of the SUR estimator $\hat{\rho}_{sur}$. This estimator is defined as
$$\hat{\rho}_{sur} = \left(\sum_{t=1}^{T} Z_t' \hat{V}_u^{-1} Z_t\right)^{-1}\left(\sum_{t=1}^{T} Z_t' \hat{V}_u^{-1} y_t\right),$$
where $\hat{V}_u$ is an estimate of $V_u$ based on residuals from a first stage regression. We find the following asymptotic distribution for $\hat{\rho}_{sur}$:
$$T\left(\hat{\rho}_{sur} - i_N\right) \to_d \left[V_u^{-1} * \int_0^1 BB'\right]^{-1}\left[\int_0^1 B * \left(V_u^{-1} dB\right)\right] = \xi, \tag{18}$$
where B is vector Brownian motion with covariance matrix $V_u$. It is clear from (18) that the limit distribution of $T(\hat{\rho}_{sur} - i_N)$ depends on the cross section dependence parameter $\delta$ even in the homogeneous case where $\rho_i = 1\ \forall i$. Correspondingly, the asymptotic distribution of $\hat{\rho}_{surmu}$ in the unit root case also depends on cross section dependence and error variance nuisance parameters. The Wald statistic, $W_{sur}$, for testing $H_0$ is given by
$$W_{sur} = \hat{\rho}_{sur}' D' \left[D \hat{V}_{SUR} D'\right]^{-1} D \hat{\rho}_{sur} \to_d \xi' D' \left[D \left(V_u^{-1} * \int_0^1 BB'\right)^{-1} D'\right]^{-1} D\xi, \tag{19}$$
where $\hat{V}_{SUR} = \left(\sum_{t=1}^{T} Z_t' \hat{V}_u^{-1} Z_t\right)^{-1}$, and again the limit distribution (19) depends on nuisance parameters.
By contrast, in the unit root case where homogeneity of $\rho$ across i is imposed, the pooled GLS estimator of $\rho$ is
$$\hat{\rho} = \left(\sum_{t=1}^{T} y_{t-1}' V_u^{-1} y_{t-1}\right)^{-1}\left(\sum_{t=1}^{T} y_{t-1}' V_u^{-1} y_t\right),$$
with a corresponding feasible SUR version. By straightforward derivation (see Appendix C), we find that
$$T(\hat{\rho} - 1) \to_d \frac{\int_0^1 W' dW}{\int_0^1 W'W} = \frac{\sum_{i=1}^{N} \int_0^1 W_i\, dW_i}{\sum_{i=1}^{N} \int_0^1 W_i^2}, \tag{20}$$
where $W = (W_i)$ is standard Brownian motion with covariance matrix $I_N$. The limit (20) here
4.2 Hausman and Modified Hausman Tests under Cross Section Dependence
The Stationary Panel Case: H0 : ρi = ρ
The main problem with the conventional Wald test, as mentioned above, is that size distortion can be serious and it typically increases with the number of restrictions. Also, the Wald test based on SUR or SUR-MU estimation requires N < T and is heavily influenced by the nuisance parameters of cross section correlation. This section proposes an alternate procedure for dealing with cross section dependence that takes into account the structure of the dependence.
Start by writing the model M1 (with suitable adjustments for models M2 and M3) in vector form as
$$y_t = Z_t\rho + u_t, \quad Z_t = \mathrm{diag}(y_{1t-1}, ..., y_{Nt-1}), \quad \rho = (\rho_1, ..., \rho_N)'. \tag{21}$$
Let $\hat{\rho}_i$ (respectively $\hat{\rho}$) be the OLS estimate of $\rho_i$ ($\rho$). Then
$$\hat{\rho} = \left(\sum_{t=1}^{T} Z_t' Z_t\right)^{-1}\left(\sum_{t=1}^{T} Z_t' y_t\right).$$
Let $\hat{\rho}_{emu}$ be the corresponding vector of median unbiased estimates of $\rho_i$. Under the null hypothesis of homogeneous autoregressive coefficients $\rho_i = \rho\ \forall i$, and as $T \to \infty$, we have $\sqrt{T}(\hat{\rho}_i - \rho) \to_d N(0, 1-\rho^2)$ for models M1, M2 and M3, with the same result for the median unbiased estimators $\hat{\rho}_{i,emu}$. Under cross section independence and as $T \to \infty$ for finite N, we have
$$\sum_{i=1}^{N} \frac{\sqrt{T}(\hat{\rho}_i - \rho)}{\sqrt{1-\rho^2}} \to_d N(0, N).$$
On the other hand, if there is cross section dependence of the form implied by (6), then in the stationary case for model M1 we have
$$y_{it} = \sum_{j=0}^{\infty} \rho^j\left(\delta_i\theta_{t-j} + \varepsilon_{it-j}\right) = \delta_i \sum_{j=0}^{\infty} \rho^j\theta_{t-j} + \sum_{j=0}^{\infty} \rho^j\varepsilon_{it-j} = \delta_i\mu_t + \eta_{it}, \text{ say}.$$
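It follows that the asymptotic covariance between $\hat{\rho}_i$ and $\hat{\rho}_j$ is given by
$$\mathrm{acov}\left(\hat{\rho}_i, \hat{\rho}_j\right) = \frac{1}{T}\,\frac{(\delta_i\delta_j)^2(1-\rho^2)}{(\delta_i^2+\sigma_i^2)(\delta_j^2+\sigma_j^2)} = \frac{1}{T}\,\frac{v_{ij}^2}{v_{ii}v_{jj}}\,(1-\rho^2),$$
where $v_{ij}$ is the ij'th element of $V_u = \Sigma + \delta\delta'$. Setting $\hat{\rho} = (\hat{\rho}_1, ..., \hat{\rho}_N)'$ and letting $i_N$ be an N-vector with unit elements, we find that standard derivations lead to the following limit theory
$$\sqrt{T}\left(\hat{\rho} - \rho\, i_N\right) = \left(\frac{1}{T}\sum_{t=1}^{T} Z_t' Z_t\right)^{-1}\left(\frac{1}{\sqrt{T}}\sum_{t=1}^{T} Z_t' u_t\right) \to_d N\left(0, D_y^{-1}\left[V_u * E(y_t y_t')\right]D_y^{-1}\right) = N\left(0, (1-\rho^2)\, R_V * R_V\right), \tag{22}$$
where $D_y = \mathrm{diag}(E(y_{1t}^2), ..., E(y_{Nt}^2))$ and the matrix $R_V$ has ij'th element $v_{ij}/\{v_{ii}v_{jj}\}^{1/2}$. It follows that
$$\sum_{i=1}^{N} \frac{\sqrt{T}(\hat{\rho}_i - \rho)}{\sqrt{1-\rho^2}} \to_d N\left(0, i_N'(R_V * R_V)\, i_N\right).$$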
The same result applies when the median unbiased estimates $\hat{\rho}_{i,emu}$ are used in place of $\hat{\rho}_i$. We propose to construct an estimate of the matrix $R_V$ that appears in the asymptotic covariance matrix of (22) and use this estimate to develop an alternate test of $H_0$. The following moment based procedure may be used$^7$.
Moment Based Estimation of (δ, Σ)
Step 1: Estimate the $\rho_i$ by using OLS or EMU and obtain the regression residuals $\hat{u}_{it} = y_{it} - \hat{\rho}_i y_{it-1}$, which are asymptotically equivalent to OLS residuals and consistent (as $T \to \infty$, N fixed) for $u_{it}$. In particular,
$$\hat{u}_{it} = u_{it} + (\rho_i - \hat{\rho}_i)\, y_{it-1} = u_{it} + o_p(1)$$
in both stationary and nonstationary cases.
Step 2: Construct the moment matrix of residuals $M_T = \frac{1}{T}\sum_{t=1}^{T} \hat{u}_t\hat{u}_t'$, which is a consistent (as $T \to \infty$, N fixed) estimate of $V_u$. Let $m_{T,ij}$ be the ij'th element of $M_T$.
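Step 3: Estimate the cross section coefficients $\delta$ and the diagonal elements of $\Sigma$ using the following moment procedure that finds the least squares best fit to the matrix $M_T$, that is
$$\left(\hat{\delta}, \hat{\Sigma}\right) = \arg\min_{\delta, \Sigma}\ \mathrm{tr}\left[\left(M_T - \Sigma - \delta\delta'\right)\left(M_T - \Sigma - \delta\delta'\right)'\right]. \tag{23}$$
The solution of (23) satisfies the system of equations
$$\hat{\delta} = \left(M_T\hat{\delta} - \hat{\Sigma}\hat{\delta}\right)/\hat{\delta}'\hat{\delta}, \quad \hat{\sigma}_i^2 = M_{T,ii} - \hat{\delta}_i^2, \quad i = 1, ..., N,$$
and this can be solved using the iteration
$$\delta^{(r)} = \left(M_T\delta^{(r-1)} - \Sigma\,\delta^{(r-1)}\right)/\delta^{(r-1)\prime}\delta^{(r-1)}, \quad \sigma_i^{(r)2} = M_{T,ii} - \left(\delta_i^{(r)}\right)^2, \tag{24}$$
starting from some initialization $\delta^{(0)}$ (such as the eigenvector of $M_T$ associated with its largest eigenvalue) until convergence. Since $M_T \to_p V_u = \Sigma + \delta\delta'$ as $T \to \infty$, it follows that $(\hat{\delta}, \hat{\Sigma}) \to_p (\delta, \Sigma)$ as $T \to \infty$, with N fixed. Since $\hat{\Sigma} \to_p \Sigma > 0$ as $T \to \infty$, $\hat{\Sigma}$ will be positive definite for large enough T.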
Step 4: Construct the variance matrix estimate $\hat{V}_u = \hat{\Sigma} + \hat{\delta}\hat{\delta}'$. Let $\hat{v}_{ij}$ be the ij'th element of $\hat{V}_u$ and construct the estimate $\hat{R}_V$ whose ij'th element is $\hat{v}_{ij}/\{\hat{v}_{ii}\hat{v}_{jj}\}^{1/2}$.
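The moment based steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the diagonal of $\Sigma$ is refreshed from the current $\delta$ before each update in (24), the start value is the leading eigenvector of $M_T$ scaled by the square root of its eigenvalue, and the sign of $\delta$ (which is not identified from $\delta\delta'$) is normalized to have a non-negative sum. The function names are hypothetical.

```python
import numpy as np


def fit_one_factor(m, max_iter=5000, tol=1e-12):
    """Iterate (24) so that m is approximated by diag(sigma2) + delta delta'."""
    eigval, eigvec = np.linalg.eigh(m)           # eigenvalues in ascending order
    # Assumed start value: scaled leading eigenvector of the moment matrix.
    delta = eigvec[:, -1] * np.sqrt(eigval[-1])
    if delta.sum() < 0:                          # sign normalization (assumption)
        delta = -delta
    for _ in range(max_iter):
        sigma2 = np.diag(m) - delta ** 2         # current diagonal of Sigma
        new = (m @ delta - sigma2 * delta) / (delta @ delta)
        converged = np.max(np.abs(new - delta)) < tol
        delta = new
        if converged:
            break
    return delta, np.diag(m) - delta ** 2


def correlation_matrix(delta, sigma2):
    """Build V = Sigma + delta delta' and return R_V with elements v_ij / sqrt(v_ii v_jj)."""
    v = np.diag(sigma2) + np.outer(delta, delta)
    s = np.sqrt(np.diag(v))
    return v / np.outer(s, s)
```

Applied to an exact one-factor matrix the iteration should return the generating parameters, which gives a convenient sanity check before it is used on a sample moment matrix. Convergence speed depends on the configuration of the loadings, so the iteration cap and tolerance here are illustrative only.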
7Appendix D gives an algorithm for Gaussian maximum likelihood estimation of the cross section coefficients.
Simulation results indicate that the moment based method described here gave superior results, especially for large N.
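Since $\hat{V}_u \to_p V_u$, we have $\hat{R}_V \to_p R_V$ as $T \to \infty$. Now let $\tilde{\rho}$ be the PFMGU estimate of $\rho$ under the assumption of homogeneity. Under $H_0$, the pooled estimate $\tilde{\rho}$ is asymptotically equivalent to GLS and then by standard limit theory
$$\sqrt{T}(\tilde{\rho} - \rho) = \left(\frac{1}{T}\sum_{t=1}^{T} y_{t-1}' V_u^{-1} y_{t-1}\right)^{-1}\left(\frac{1}{\sqrt{T}}\sum_{t=1}^{T} y_{t-1}' V_u^{-1} u_t\right) \to_d N\left(0, \left\{\mathrm{trace}\left[V_u^{-1} E(y_t y_t')\right]\right\}^{-1}\right).$$
Since
$$E(y_t y_t') = \left(\Sigma + \sigma^2\delta\delta'\right) * R = V_u * R = \frac{1}{1-\rho^2}\, V_u,$$
under $H_0$, we end up with the simple result
$$\sqrt{T}(\tilde{\rho} - \rho) \to_d N\left(0, \frac{1-\rho^2}{N}\right).$$
Next consider the asymptotic covariance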
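$$\mathrm{Acov}\left(\frac{1}{\sqrt{T}}\sum_{t=1}^{T} Z_t' u_t,\ \frac{1}{\sqrt{T}}\sum_{t=1}^{T} y_{t-1}' V_u^{-1} u_t\right) = \frac{1}{T}\sum_{t=1}^{T} Z_t' E(u_t u_t') V_u^{-1} y_{t-1} = \frac{1}{T}\sum_{t=1}^{T} Z_t' y_{t-1} \to \begin{pmatrix} E(y_{1t}^2) \\ E(y_{2t}^2) \\ \vdots \\ E(y_{Nt}^2) \end{pmatrix} = D_y i_N,$$
as $T \to \infty$, from which we deduce that
$$\mathrm{Acov}\left(\sqrt{T}\left(\hat{\rho} - \rho\, i_N\right),\ \sqrt{T}(\tilde{\rho} - \rho)\right) = D_y^{-1}\left[D_y i_N\right]\left\{\mathrm{trace}\left[V_u^{-1} E(y_t y_t')\right]\right\}^{-1} = \frac{1-\rho^2}{N}\, i_N. \tag{25}$$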
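Our test statistic for $H_0$ is based on the difference between the estimates
$$\sqrt{T}\left(\hat{\rho}_{emu} - \tilde{\rho}\, i_N\right) = \sqrt{T}\left(\hat{\rho}_{emu} - \rho\, i_N\right) - \sqrt{T}(\tilde{\rho} - \rho)\, i_N,$$
and from (22), (25) and joint convergence we find that
$$\frac{\sqrt{T}\left(\hat{\rho}_{emu} - \tilde{\rho}\, i_N\right)}{\sqrt{1-\tilde{\rho}^2}} = \frac{\sqrt{T}\left(\hat{\rho}_{emu} - \rho\, i_N\right)}{\sqrt{1-\tilde{\rho}^2}} - \frac{\sqrt{T}(\tilde{\rho} - \rho)}{\sqrt{1-\tilde{\rho}^2}}\, i_N \to_d N\left(0, R_V * R_V - \frac{1}{N}\, i_N i_N'\right). \tag{26}$$
It follows that we may construct the Hausman-type test statistic
$$G = \frac{T}{1-\tilde{\rho}^2}\left(\hat{\rho}_{emu} - \tilde{\rho}\, i_N\right)'\left\{\left[\hat{R}_V * \hat{R}_V\right]^{-1} - \frac{1}{N}\, i_N i_N'\right\}\left(\hat{\rho}_{emu} - \tilde{\rho}\, i_N\right), \tag{27}$$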
which is based on the difference between the robust-to-heterogeneity estimate $\hat{\rho}_{emu}$ of $\rho$ and the efficient estimate $\tilde{\rho}$ of $\rho$ under the null, and which uses the moment based procedure outlined above to construct estimates of $V_u$ and $R_V$. We use the notation $G_{pfmgu}$ to indicate that the pooled estimate $\tilde{\rho}$ in (27) is the PFMGU estimate of the (common) $\rho$. Then, in view of (26) and the consistency of $\hat{R}_V$, we have
$$G_{pfmgu} \to \chi^2_N, \quad \text{as } T \to \infty. \tag{28}$$
One practical difficulty that can arise with (27) is that the variance matrix $[\hat{R}_V * \hat{R}_V]^{-1} - \frac{1}{N}\, i_N i_N'$ is not necessarily positive definite and, in our simulations, negative values of G have occasionally occurred when N and T are small (N = 10, T = 50).
The Panel Unit Root Case: H0 : ρi = 1, ∀i
As shown in Appendix C, the Hausman test has a limit distribution in the unit root ($\rho_i = 1\ \forall i$) case that is dependent on the cross section nuisance parameters. It is therefore unsuitable for testing homogeneity. However, there is a simple way of constructing a modified test that is free of nuisance parameters, which we now describe.
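Under the null hypothesis, we have as in (67)
$$\frac{1}{\sqrt{T}}\, y_{[Tr]} = \frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} u_t \to_d B(r) = BM(V_u).$$
Note that we can decompose B into component Brownian motions as follows
$$B(r) = \delta B_\theta(r) + B_\varepsilon(r), \tag{29}$$
where
$$\frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} \theta_t \to_d B_\theta(r) = BM(\sigma^2), \quad \text{and} \quad \frac{1}{\sqrt{T}}\sum_{t=1}^{[Tr]} \varepsilon_t \to_d B_\varepsilon(r) = BM(\Sigma).$$
Let $\delta_\perp$ be an $N \times (N-1)$ matrix that spans the orthogonal complement of the vector $\delta$. Then
$$\left[\left(\delta_\perp'\Sigma\delta_\perp\right)^{-1/2}\delta_\perp'\right]\frac{1}{\sqrt{T}}\, y_{[Tr]} \to_d \left(\delta_\perp'\Sigma\delta_\perp\right)^{-1/2}\delta_\perp' B(r) = \left(\delta_\perp'\Sigma\delta_\perp\right)^{-1/2}\delta_\perp' B_\varepsilon(r) = W_\perp(r), \tag{30}$$
where $W_\perp(r) = BM(I_{N-1})$, or (N − 1)-vector standard Brownian motion. The transformation matrix that appears in (30) can be estimated by implementing the following modification of our earlier procedure.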
Orthogonalization Procedure (OP)
Step 1: Construct the moment matrix of differences (for models M1 and M2) or demeaned differences (for model M3), which we write as $M_T = \frac{1}{T}\sum_{t=1}^{T} \hat{u}_t\hat{u}_t'$. As in the stationary case, $M_T$ is a consistent (as $T \to \infty$, N fixed) estimate of $V_u$. Again, let $m_{T,ij}$ be the ij'th element of $M_T$.
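Step 2: Estimate the cross section coefficients $\delta$ and $\Sigma$ by moment based optimization as in (23), leading to $(\hat{\delta}, \hat{\Sigma})$. As before, $(\hat{\delta}, \hat{\Sigma}) \to_p (\delta, \Sigma)$ as $T \to \infty$, with N fixed, and $\hat{\Sigma}$ is positive definite for large enough T.
Step 4: Using $\hat{\Sigma}$ and $\hat{\delta}$, construct$^8$ $\hat{\delta}_\perp$ and $\hat{F}_\delta = (\hat{\delta}_\perp'\hat{\Sigma}\hat{\delta}_\perp)^{-1/2}\hat{\delta}_\perp'$. Clearly,
$$\hat{F}_\delta = \left(\hat{\delta}_\perp'\hat{\Sigma}\hat{\delta}_\perp\right)^{-1/2}\hat{\delta}_\perp' \to_p \left(\delta_\perp'\Sigma\delta_\perp\right)^{-1/2}\delta_\perp', \tag{31}$$
as $T \to \infty$.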
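The construction of the transformation matrix in (31) can be sketched as follows, taking estimates of $\delta$ and of the diagonal of $\Sigma$ as given. The sketch assumes NumPy, obtains the orthogonal complement from the unit-eigenvalue eigenvectors of the projection matrix $I - \delta(\delta'\delta)^{-1}\delta'$, and computes the symmetric inverse square root by eigendecomposition. The function name is hypothetical.

```python
import numpy as np


def orthogonalizer(delta, sigma2):
    """Return F = (d_perp' Sigma d_perp)^{-1/2} d_perp', an (N-1) x N matrix.

    delta  : length-N vector of cross section coefficients
    sigma2 : length-N vector holding the diagonal of Sigma
    """
    n = delta.size
    proj = np.eye(n) - np.outer(delta, delta) / (delta @ delta)
    _, eigvec = np.linalg.eigh(proj)
    # eigh sorts eigenvalues in ascending order: the first is (numerically)
    # zero with eigenvector proportional to delta, the remaining N-1 equal one.
    d_perp = eigvec[:, 1:]
    mid = d_perp.T @ (sigma2[:, None] * d_perp)   # d_perp' Sigma d_perp
    w, q = np.linalg.eigh(mid)
    inv_sqrt = q @ np.diag(w ** -0.5) @ q.T       # symmetric inverse square root
    return inv_sqrt @ d_perp.T
```

Two properties characterize the output and can be checked directly: `F @ delta` is zero, so the common factor is annihilated, and `F @ (Sigma + delta delta') @ F.T` is the identity of order N − 1. The transformed data would then be formed row by row as `y @ F.T` for a T x N data array `y`.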
Using $\hat{F}_\delta$ we transform the data $y_t$ (or demeaned/detrended data in the case of models M2 and M3), giving $y_t^+ = \hat{F}_\delta y_t$. As is apparent from (30), the transformation $\hat{F}_\delta$ asymptotically removes cross section dependence in the panel and $y_t^+$ is asymptotically cross section independent as $T \to \infty$. Using $y_t^+$ we may now construct estimates of the autoregressive coefficients. Let $\hat{\rho}_i^+$ (respectively $\hat{\rho}^+$) be the OLS estimate of $\rho_i = 1$ ($\rho = i_{N-1}$). Then, in an obvious notation,
$$\hat{\rho}^+ = \left(\sum_{t=1}^{T} Z_t^{+\prime} Z_t^+\right)^{-1}\left(\sum_{t=1}^{T} Z_t^{+\prime} y_t^+\right).$$
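Let $\hat{\rho}^+_{emu}$ be the corresponding vector of median unbiased estimates of $\rho_i$. Similarly, let $\tilde{\rho}^+$ be the PFMGU estimate of $\rho$ obtained from the transformed data $y_t^+$ under the assumption of homogeneous unit roots. The modified Hausman statistic is defined as
$$G_H^+ = T^2\left(\hat{\rho}^+_{emu} - \tilde{\rho}^+ i_{N-1}\right)'\left(\hat{\rho}^+_{emu} - \tilde{\rho}^+ i_{N-1}\right). \tag{32}$$
As shown in Appendix C,
$$G_H^+ \to_d \Xi_{N-1}'\Xi_{N-1}, \tag{33}$$
where
$$\Xi_{N-1} = \begin{pmatrix} \left[\int_0^1 W_{\perp,1}^2\right]^{-1}\left[\int_0^1 W_{\perp,1}\, dW_{\perp,1}\right] - \left[\int_0^1 W_\perp' W_\perp\right]^{-1}\left[\int_0^1 W_\perp'\, dW_\perp\right] \\ \vdots \\ \left[\int_0^1 W_{\perp,N-1}^2\right]^{-1}\left[\int_0^1 W_{\perp,N-1}\, dW_{\perp,N-1}\right] - \left[\int_0^1 W_\perp' W_\perp\right]^{-1}\left[\int_0^1 W_\perp'\, dW_\perp\right] \end{pmatrix}, \tag{34}$$
and where $\{W_{\perp,i} : i = 1, ..., N-1\}$ are the components of the (N − 1)-vector standard Brownian motion $W_\perp$. Clearly, $G_H^+$ is free of nuisance parameters in the limit and is suitable for testing the null $H_0: \rho_i = 1\ \forall i$.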
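An alternate approach is to construct panel unit root test statistics directly by taking the sum of the differences between the estimates $\hat{\rho}_i^+$, $\hat{\rho}^+_{i,emu}$ and their limits under the null, viz.
$$G^+_{ols} = \sum_{i=1}^{N-1} \frac{\hat{\rho}_i^+ - 1}{\hat{\sigma}_{\hat{\rho}_i^+}}, \tag{35}$$
$$G^+_{emu} = \sum_{i=1}^{N-1} \frac{\hat{\rho}^+_{i,emu} - 1}{\hat{\sigma}_{\hat{\rho}^+_{i,emu}}}. \tag{36}$$
8 The orthogonal complement matrix $\hat{\delta}_\perp$ can be constructed by taking the eigenvectors of the projection matrix $P_{\hat{\delta}} = I - \hat{\delta}(\hat{\delta}'\hat{\delta})^{-1}\hat{\delta}'$ associated with its unit eigenvalues.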
In contrast to (32), the test statistics (35) and (36) do not involve a pooled estimate of the homogeneous unit root parameter. As shown in Appendix C, for fixed N we have the following limit theory for these statistics as $T \to \infty$:
$$G^+_{ols} \to_d \sum_{i=1}^{N-1} \xi_i, \quad G^+_{emu} \to_d \sum_{i=1}^{N-1} \xi_i^-, \tag{37}$$
where $\xi_i = \left(\int_0^1 W_i^2\right)^{-1}\left(\int_0^1 W_i\, dW_i\right)$ and
$$\xi_i^- = \begin{cases} \xi_i, & \xi_i < 0, \\ 0, & \xi_i \geq 0. \end{cases}$$
The limits in (37) depend only on N. Both $G^+_{ols}$ and $G^+_{emu}$ are therefore suitable for testing the null $H_0$.
Note that there are only N − 1 elements in (34) - (36). This is because the panel system has been transformed to dimension N − 1 in Step 4 above in order to remove the effects of cross section dependence in the limit.
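The tests (35) and (36) have the advantage that they lend themselves to simple large N asymptotics. In particular, the means and variances
$$E(\xi_i) = \mu_\xi, \quad E(\xi_i^-) = \mu_{\xi^-}, \quad \mathrm{Var}(\xi_i) = \sigma_\xi^2, \quad \mathrm{Var}(\xi_i^-) = \sigma_{\xi^-}^2$$
can be computed and, noting that $\xi_i$, $\xi_i^-$ are iid over i, we have the large N limit theory
$$\frac{1}{\sqrt{N}}\sum_{i=1}^{N-1}\left(\xi_i - \mu_\xi\right) \to_d N\left(0, \sigma_\xi^2\right), \quad \frac{1}{\sqrt{N}}\sum_{i=1}^{N-1}\left(\xi_i^- - \mu_{\xi^-}\right) \to_d N\left(0, \sigma_{\xi^-}^2\right).$$
It follows that in sequential asymptotics (see Phillips and Moon, 1999) as $(T, N \to \infty)_{seq}$
$$G^{++}_{ols} = \frac{1}{\sqrt{N}\,\sigma_\xi}\sum_{i=1}^{N-1}\left[\frac{\hat{\rho}_i^+ - 1}{\hat{\sigma}_{\hat{\rho}_i^+}} - \mu_\xi\right], \quad G^{++}_{emu} = \frac{1}{\sqrt{N}\,\sigma_{\xi^-}}\sum_{i=1}^{N-1}\left[\frac{\hat{\rho}^+_{i,emu} - 1}{\hat{\sigma}_{\hat{\rho}^+_{i,emu}}} - \mu_{\xi^-}\right] \to_d N(0, 1).$$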
All of these procedures are easy to implement. Their finite sample performance is assessed in Section 6 below. As shown in the next section, once the OP procedure has been applied to the data, a wide class of panel unit root and stationarity tests become applicable.
4.3 Dynamic AR(p) Panels with Cross Section Dependence
The procedures outlined above for panel unit root testing under cross section dependence may be applied to cases of higher order panel dynamics and cases where the common factor component $\theta_t$ is weakly dependent. Specifically, consider a panel of dynamic autoregressions with (possibly) heterogeneous lag orders $\ell_i$ for each i and allow for cross section dependence of the same form as (6) above. The model is written in augmented format as
∆yit= µi+ βit + (ρ − 1)yit−1+ `i X j=1
The OP procedure leading to (31) is the same as that laid out above except for the first step. Here, instead of using the moment matrix of differences or demeaned differences, one simply uses the moment matrix of the regression residuals $\hat{u}_{it}$ obtained under the (null hypothesis) restriction $\rho = 1$ in (38).
Since the transformed data $y_{it}^+$ are asymptotically uncorrelated across i, regressions like (38) of $y_{it}^+$ on $y_{it-1}^+$ and the lagged differences $\Delta y_{it-j}^+$ do not suffer (asymptotically) from cross section dependence. Importantly, this will be so even when the common time series factor $\theta_t$ is weakly dependent rather than uncorrelated over time. This is because the transformation procedure leading to (31) continues to eliminate the contribution of the common factor component $\theta_t$ to the limit Brownian motion in (29). It follows that several existing panel unit root tests that were designed to work with data that are independent across section can now be applied to test for panel unit roots when there is cross section dependence. Accordingly, we consider here two broad types of panel unit root tests.
Meta-Analysis Tests for Panel Unit Roots and Stationarity under Cross Section Dependence
The Þrst type of test is based on meta-analysis, wherein the p-values of tests for each cross section individual i are combined to construct a new test. Tests of this type were suggested in Choi (2001a) and Maddala and Wu (1999) for use in testing unit roots with panel data under cross section independence.9 These tests apply here under cross section dependence after our OP orthogonalization procedure has been implemented. Choi (2001a) provides a full discussion of tests of this type and his simulation results suggest use of the three tests that we concentrate on here.
Let $p_i$ be the p-value of a unit root test associated with cross section element i. Define
$$P = -2\sum_{i=1}^{N-1} \ln(p_i), \tag{39}$$
$$P_m = -\frac{1}{\sqrt{N}}\sum_{i=1}^{N-1}\left[\ln(p_i) + 1\right], \tag{40}$$
$$Z = \frac{1}{\sqrt{N}}\sum_{i=1}^{N-1} \Phi^{-1}(p_i). \tag{41}$$
The P test is called the inverse chi-square test or Fisher test after Fisher (1932). The $P_m$ test statistic is a centered and normalized version of P that is useful for large N. The Z test is called the inverse normal test, following Stouffer et al. (1949). As discussed in Choi (2001), we have the following limit distributions for P and Z as $T \to \infty$
$$P \to_d \chi^2_{2(N-1)}, \quad Z \to_d N(0, 1) \quad \text{for fixed } N, \tag{42}$$
leading to the following sequential limit theory as $(T, N \to \infty)_{seq}$
$$P_m, Z \to_d N(0, 1). \tag{43}$$
9 Choi (2001b) considers several statistics based on meta-analysis with random individual and time effects in (1).
These tests and their limit theory apply under the null hypothesis to dynamic panel autoregressions like (38) with cross section dependence after the OP procedure has been implemented.
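The three combination statistics (39)-(41) are simple to compute once the individual p-values are available. The sketch below uses only the Python standard library and assumes the p-values come from unit root tests on the N − 1 transformed series, so that by default the normalization uses N = (number of p-values) + 1 as in the displayed formulas. The function name and the `n_units` argument are hypothetical.

```python
import math
from statistics import NormalDist


def meta_statistics(pvalues, n_units=None):
    """Return (P, Pm, Z) as defined in (39)-(41).

    pvalues : p-values of the individual unit root tests, each in (0, 1)
    n_units : the N used in the 1/sqrt(N) normalization; defaults to
              len(pvalues) + 1 because the sums run over N - 1 series
    """
    if n_units is None:
        n_units = len(pvalues) + 1
    root_n = math.sqrt(n_units)
    p_stat = -2.0 * sum(math.log(p) for p in pvalues)
    pm_stat = -sum(math.log(p) + 1.0 for p in pvalues) / root_n
    z_stat = sum(NormalDist().inv_cdf(p) for p in pvalues) / root_n
    return p_stat, pm_stat, z_stat
```

For example, `meta_statistics([0.5, 0.5])` gives Z equal to zero and P equal to 4 ln 2, since each p-value sits at the center of its null distribution. P-values of exactly 0 or 1 are not handled here and would need to be truncated before use.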
Other Tests for Panel Unit Roots
In fact, after transforming the data using the OP procedure, we can apply most other methods for testing panel unit roots that are valid under cross section independence. Baltagi (2001) provides a recent discussion and overview of these tests, which generally take the form of cross section averages of time series test statistics and have the generic form
$$G_\tau = \frac{1}{N-1}\sum_{i=1}^{N-1} \tau_i,$$
where $\tau_i$ stands for an individual unit root test statistic. This class of tests can also be extended by using the bias reduction techniques discussed earlier in the present paper. For instance, we could use an ADF-t statistic based not on OLS estimation but instead on EMU estimation as explained earlier (cf. Andrews and Chen, 1994).
Im, Pesaran and Shin (1997, IPS) use two cross-sectional average tests constructed like $G_\tau$ and study their small sample properties using simulations. Without modification, this type of test typically suffers from serious size distortion in small samples due to SB bias. IPS use simulation to calculate the mean and variance of the $G_\tau$ statistics and they employ bias correction in the implementation of these procedures. However, in the dynamic panel AR(p) case, the means and variances of the $G_\tau$ statistics depend heavily on the nuisance parameters that arise in the augmented dynamic terms. Tanaka (1984) and Shaman and Stine (1988) provide formulae for the mean bias for cases up to an AR(6) for Models 1 and 2. For example, for an AR(2), the OLS estimator of $\rho_i$ in (38) will be biased downward when the true coefficient on $y_{it-2}^+$ is negative, while it will be biased upward when the true coefficient on $y_{it-2}^+$ is large and positive. IPS also found that the size distortion of their $G_\tau$ tests depends heavily on the sign of the true coefficient on $y_{it-2}^+$. Since their Monte Carlo studies are based on an AR(2) process, their size distortion corrections are based on the sign and magnitude of the coefficient on $y_{it-2}^+$. For general dynamic panel AR(p) processes, the size of the $G_\tau$ test will depend on all the nuisance parameters arising in the augmented terms and, in the absence of analytic formulae, extensive simulations are needed to make the appropriate corrections in such cases. The finite sample performance of these panel unit root tests and, more generally, tests of homogeneity is considered in the simulation experiments reported in Section 6 below.
5 Simulation Experiments
This section consists of three parts. First, we report the finite sample performance of the three panel median unbiased estimators. Second, we show the finite sample performance of the Wald statistic $W_{surmu}$ and the $G_{pfmgu}$ statistic. Finally, we examine the small sample performance of the panel unit root tests $G^{++}_{emu}$, $G^{++}_{ols}$, $P_m$, and Z, and show how well the orthogonalization
5.1 Design of Data Generating Process
The data generating process for the first two parts is given by
$$y_{it} = \rho_i y_{it-1} + u_{it}, \tag{44}$$
$$u_{it} = \delta_i\theta_t + \varepsilon_{it}, \tag{45}$$
where $\varepsilon_{it} \sim$ iid N(0, 1) over i and t, $\theta_t \sim$ iid N(0, 1) over t, and with $(\rho_i, \delta_i)$ parameter selections that are detailed below. The primary distinction is between the homogeneous case where $\rho_i = \rho$ for all i and the heterogeneous case where the $\rho_i$ differ across individuals i. We also distinguish cases of high and low cross section dependence according to the value of $\delta_i$. Estimation is based on the following two regression models that involve a fitted mean and trend:
$y_{it} = a_i + \rho_i y_{it-1} + u_{it}$ for Model M2
$y_{it} = a_i + b_i t + \rho_i y_{it-1} + u_{it}$ for Model M3
Panel data are generated under four specifications which differ according to their degree of the cross sectional dependence and whether or not the homogeneity restriction is imposed on ρ. These specifications are as follows:
Case I: (Homogeneity and Low Cross-sectional Dependence) The homogeneity restriction is imposed and we set $\rho_1 = \rho_2 = \dots = \rho_N = 0.9$, and allow low cross sectional dependence by setting $\delta_i \sim U[0, 0.2]$, where U[a, b] represents the uniform distribution over the interval [a, b]. In this experiment, the average error ($u_{it}$) cross sectional dependence has correlation coefficient around 0.03.
Case II: (Homogeneity and High Cross-sectional Dependence) Again, we set ρi = 0.9 for all i and δi∼ U[1, 4]. Here, the lowest error (uit) cross sectional correlation is around 0.52,
the median is around 0.82, and the highest is around 0.94.
Case III: (Heterogeneity and Low Cross-sectional Dependence) Here, ρi ∼ U[0.7, 0.9], and δi ∼ U[0, 0.2].
Case IV: (Heterogeneity and High Cross-sectional Dependence) Here ρi ∼ U[0.7, 0.9], and δi ∼ U[1, 4].
Case V: (Testing Homogeneity under Stationarity) Under the null hypothesis of homogeneity of ρ, we set ρi = 0.8 for all i to investigate test size. Under the alternative, we set ρi∼ U[0.7, 0.9] and consider test power.
Each experiment involves 5,000 replications of panel samples of (N, T ) observations. We use N = 10, 20, 30 and T = 50, 100, 200.
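The design in (44)-(45) can be sketched as below. The initialization of the process is not stated in this section, so the burn-in period, the zero starting value and the function names are assumptions made for illustration. The helper `error_correlation` gives the pairwise correlation of the errors implied by (45) with unit variances for $\theta_t$ and $\varepsilon_{it}$, which is the quantity referred to in the descriptions of Cases I and II.

```python
import numpy as np


def error_correlation(delta_i, delta_j):
    """Correlation of u_it and u_jt (i != j) implied by (45) with unit variances."""
    return delta_i * delta_j / np.sqrt((1.0 + delta_i ** 2) * (1.0 + delta_j ** 2))


def simulate_panel(rho, delta, T, rng, burn=100):
    """Generate a (T+1) x N panel from (44)-(45).

    rho, delta : length-N arrays of autoregressive and factor coefficients
    burn       : number of start-up draws discarded (an assumed choice)
    """
    n = len(rho)
    y = np.zeros(n)
    out = np.empty((T + 1, n))
    for t in range(burn + T + 1):
        u = delta * rng.standard_normal() + rng.standard_normal(n)  # common + idiosyncratic
        y = rho * y + u
        if t >= burn:
            out[t - burn] = y
    return out
```

With $\delta_i$ drawn from U[1, 4] the implied error correlations lie between 1/2 (both coefficients equal to 1) and 16/17 (both equal to 4), in line with the range reported for Case II. A Case II sample could be drawn with `rng = np.random.default_rng(0)`, `rho = np.full(10, 0.9)` and `delta = rng.uniform(1, 4, 10)`.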
The third part of the simulation has two sections. In the first section the fitted models have intercepts and trends (as in M2 and M3) and the DGP is based on (45) and (46) with the following parameter settings: