Chapter 5 Locational Determinants of Chinese OFDI in OECD Countries
5.1 Introduction
5.4.1 Panel Data Analysis
Baltagi and Griffin (1997) point out that pure cross-sectional studies cannot control for behavioural changes occurring over time, while pure time-series studies cannot control for unobservable country effects. Panel data, which can be described as time-series cross-sectional data, are generally believed to be able to widen the database in order to ensure better and more reliable estimates of the parameters of the model. Therefore, panel datasets generally possess several major advantages compared to conventional cross-sectional and time-series datasets (Hsiao, 2003).
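[Figure: hypothesised determinants of the location of Chinese OFDI to OECD countries. H1. Human Mobility; H2. Strategic Assets; H3. Cultural Distance; H4. Market Size; H5. Inward FDI; H6. Bilateral Trade. The figure marks five expected effects as positive (+) and one as negative (-).]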
5.4.1.1 Poolability
Before running any regression with the dataset, it is necessary to test whether the regression parameters take values common to all cross-sectional units for all time periods, in order to satisfy the overall assumption of pooling the dataset, which is the homogeneity of the slope coefficients (Hsiao, 2003).
A Chow test examines whether the coefficients estimated over one group of the data are equal to those estimated over another. With the Chow test, the question of whether ‘to pool or not to pool’ reduces to testing the null hypothesis H0: θi = θ for all i (Baltagi et al., 2008).
Under H0, the following test statistic:

$$F_{obs} = \frac{\left(ess - \sum_{i=1}^{N} ess_i\right) / \left((N-1)k\right)}{\sum_{i=1}^{N} ess_i / \left(N(T-k)\right)}$$

Equation 5.1

is distributed as F((N-1)k, N(T-k)), where ess is the error sum of squares from the pooled regression, ess_i is the error sum of squares from the separate regression of cross-sectional individual i, N is the number of cross-sectional individuals, T is the length of the time series, and k is the number of estimated parameters.
Rejection of the null hypothesis means the existence of heterogeneity across the data units, which breaks the panel data assumption of pooling observations. The existence of heterogeneity leads to inaccurate estimates and even wrong signs for the coefficients (Maddala et al., 1997). The panel data estimation should not be applied when the hypothesis of homogeneity of the coefficients is rejected. Therefore, testing for the homogeneity of the dataset should be the first step of panel data analysis. Until the homogeneity assumption is confirmed, panel data analysis could be meaningless.
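The statistic in Equation 5.1 can be illustrated with a minimal sketch. It assumes a balanced panel and a single regressor plus a constant (so k = 2); the function names `ols_ess` and `poolability_f` are hypothetical and are not taken from the study.

```python
def ols_ess(x, y):
    """Error sum of squares from regressing y on a constant and x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))


def poolability_f(panel, k=2):
    """Chow-type poolability statistic for a balanced panel.

    panel: list of (x, y) pairs, one pair per cross-sectional unit,
           each series of length T.
    k:     number of estimated parameters per regression
           (2 here: constant and slope; an assumption of this sketch).
    Returns (F_obs, df1, df2) with df1 = (N-1)k and df2 = N(T-k).
    """
    n_units = len(panel)
    t_len = len(panel[0][0])
    pooled_x = [xi for x, _ in panel for xi in x]
    pooled_y = [yi for _, y in panel for yi in y]
    ess_pooled = ols_ess(pooled_x, pooled_y)               # ess
    ess_separate = sum(ols_ess(x, y) for x, y in panel)    # sum of ess_i
    df1 = (n_units - 1) * k
    df2 = n_units * (t_len - k)
    f_obs = ((ess_pooled - ess_separate) / df1) / (ess_separate / df2)
    return f_obs, df1, df2


# Made-up data: two units with clearly different intercepts and slopes.
x = [1.0, 2.0, 3.0, 4.0]
unit_a = (x, [1.1, 1.9, 3.2, 3.8])
unit_b = (x, [5.0, 7.1, 8.9, 11.0])
f_het, df1, df2 = poolability_f([unit_a, unit_b])
```

A large value of `f_het` relative to the F(df1, df2) critical value would lead to rejection of H0, that is, to the conclusion that the units should not be pooled.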
5.4.1.2 Fixed or random effects estimator
In panel data analysis, one assumes that the effects of all omitted variables are driven by three types of variables (Hsiao, 2003):
individual time-invariant: same for a given cross-sectional individual through time but variant across all cross-sectional individuals;
period individual-invariant: same for all cross-sectional individuals at a specific point in time but variant through time;
individual time-variant: variant across cross-sectional individuals at a given point in time and through time.
There are two common ways to deal with the unobserved effects: fixed effects estimation, which treats the unobserved effects as parameters to be estimated, and random effects estimation, which treats them as random variables. Which of the two models is appropriate depends on the context of the data, the manner in which they were collected, and the environment from which they came (Hsiao, 2003). For this study, a fixed effects model cannot be used because the equation includes a time-invariant variable (cultural distance), whose effect cannot be separated from the country-specific fixed effects.
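The within (fixed effects) transformation subtracts each unit's time mean from its observations, so a regressor that is constant over time within a unit is wiped out. The sketch below shows this under the assumption that cultural distance takes a single value per host country over the sample period; the country labels and all numbers are made up for illustration.

```python
def within_transform(series_by_unit):
    """Subtract each unit's time mean from its own observations."""
    demeaned = {}
    for unit, values in series_by_unit.items():
        mean = sum(values) / len(values)
        demeaned[unit] = [v - mean for v in values]
    return demeaned


def within_variation(series_by_unit):
    """Sum of squared within-unit deviations across all units."""
    demeaned = within_transform(series_by_unit)
    return sum(v * v for values in demeaned.values() for v in values)


# Hypothetical panel: two host countries observed over three years.
cultural_distance = {"A": [2.0, 2.0, 2.0], "B": [0.5, 0.5, 0.5]}  # constant over time
log_gdp = {"A": [7.0, 7.2, 7.5], "B": [5.1, 5.0, 5.4]}            # varies over time

cd_variation = within_variation(cultural_distance)   # (numerically) zero
gdp_variation = within_variation(log_gdp)            # strictly positive
```

With no within-unit variation left in the cultural distance series, a fixed effects regression has nothing from which to estimate its coefficient, whereas the time-varying regressor remains usable.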
Since no lagged dependent variables are used as regressors in this study, the model is a static single-equation panel model, written as:

$$y_{it} = X_{it}'\beta + u_{it}$$

Equation 5.2

where $y_{it}$ is the measure of Chinese OFDI flow, $X_{it}$ represents the vector of explanatory variables, and $u_{it}$ is the error term, which contains the unobserved heterogeneity components. Equation 5.2 can be transformed into the estimating Equation 5.3, which provides an explicit specification of the linkage between the OFDI flow from China to OECD countries and the related country-specific variables:

$$OFDI_{it} = \beta_1 FDI_{it} + \beta_2 GDP_{it} + \beta_3 TRADE_{it} + \beta_4 R\&D_{it} + \beta_5 HC_{it} + \beta_6 CD_{i} + u_{it}$$

Equation 5.3

In most applications, the error component structure can be specified as a case of the following representation:

$$u_{it} = \alpha_i + \varepsilon_{it}$$

Equation 5.4

where $\alpha_i$ is the unobserved heterogeneity term and $\varepsilon_{it}$ is i.i.d. across individuals and time periods. As discussed above, in this study the unobserved heterogeneity $\alpha_i$ is treated as a random effect.

The two explicit assumptions for applying a random effects analysis are strict exogeneity conditional on the unobserved effect $\alpha_i$, which means that once $X_{it}$ and $\alpha_i$ are controlled for, $X_{is}$ has no partial effect on $y_{it}$ for $s \neq t$, and orthogonality between $\alpha_i$ and $X_{it}$ (Wooldridge, 2003).

However, the fundamental exogeneity assumption for the regressors may not be supported by the data. The consistency of the parameter estimates is in doubt when some explanatory variables are correlated with the model residuals. Endogenous regressors arise when some of the regressors in one equation are the dependent variables in others and are consequently correlated with the disturbances of the equation under consideration. The existence of endogeneity indicates that an alternative, consistent estimator is needed. In situations like this, instrumental variable procedures are indispensable (Wooldridge, 2003).
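The consequence of an endogenous regressor, and the way an instrument addresses it, can be seen in a small simulation. This is a stylised sketch and not the study's model: it assumes one regressor, one instrument, and a data-generating process invented for illustration, and every name in it is hypothetical.

```python
import random


def cov(a, b):
    """Sample covariance (divisor n) of two equally long sequences."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / n


def ols_slope(x, y):
    """OLS slope of y on x (with a constant)."""
    return cov(x, y) / cov(x, x)


def iv_slope(z, x, y):
    """Simple instrumental variable slope using z as the instrument for x."""
    return cov(z, y) / cov(z, x)


def simulate_endogenous(n, beta, seed):
    """Generate (z, x, y) where x is correlated with the disturbance e."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)          # instrument: independent of e
        ei = rng.gauss(0, 1)          # structural disturbance
        xi = zi + 0.8 * ei            # regressor depends on e, hence endogenous
        z.append(zi)
        x.append(xi)
        y.append(beta * xi + ei)
    return z, x, y


z, x, y = simulate_endogenous(n=5000, beta=2.0, seed=1)
b_ols = ols_slope(x, y)      # inconsistent: picks up the correlation of x with e
b_iv = iv_slope(z, x, y)     # consistent: uses only the variation in x driven by z
```

In this design the OLS slope settles well above the true value of 2.0, while the instrumental variable slope stays close to it, because the instrument is correlated with the regressor but not with the disturbance.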
5.4.1.3 Instrumental variable methods
The modern approach to system instrumental variables estimation is based on the principle of GMM (Wooldridge, 2003). ‘With GMM, we can consider different exogeneity assumptions related to $\alpha_i$ or $\varepsilon_{it}$, producing different orthogonality conditions. Apart from the difference between random and fixed effect specifications (instruments correlated or not with $\alpha_i$), we can also consider strictly or weakly exogenous instruments if explanatory instruments are correlated with $\varepsilon_{it}$’ (Boumahdi and Thomas, 2008, p. 107).

Besides addressing the endogeneity problem, GMM can also provide consistent and efficient estimation in the presence of heteroscedasticity. Heteroscedasticity leaves the least squares estimators unbiased but inefficient, and it also invalidates the tests of significance, since the estimates of the variances are biased (Maddala and Lahiri, 2009). When there is no conditional heteroscedasticity, GMM with the same instrument set is as efficient as the traditional 2SLS or 3SLS estimators and yields an equally consistent variance-covariance matrix; when that assumption does not hold, GMM is more efficient (Boumahdi and Thomas, 2008).
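The relationship between 2SLS and GMM can be sketched for the simplest over-identified case: one endogenous regressor, two instruments, and no constant. This is an illustrative two-step GMM under those assumptions, with simulated heteroscedastic data; it is not the estimator or the data used in the study, and all function names are hypothetical.

```python
import random


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def linear_gmm(z_rows, x, y, weight):
    """GMM estimate of beta in y = beta*x + u with moments E[z*u] = 0.

    Closed form: beta = (s_zx' W s_zy) / (s_zx' W s_zx), W symmetric.
    """
    n = len(y)
    s_zx = [sum(z[j] * xi for z, xi in zip(z_rows, x)) / n for j in range(2)]
    s_zy = [sum(z[j] * yi for z, yi in zip(z_rows, y)) / n for j in range(2)]
    w_zx = [weight[j][0] * s_zx[0] + weight[j][1] * s_zx[1] for j in range(2)]
    num = s_zy[0] * w_zx[0] + s_zy[1] * w_zx[1]
    den = s_zx[0] * w_zx[0] + s_zx[1] * w_zx[1]
    return num / den


def two_step_gmm(z_rows, x, y):
    """Return (2SLS estimate, two-step efficient GMM estimate)."""
    n = len(y)
    # Step 1: W = (Z'Z/n)^(-1), which reproduces 2SLS.
    zz = [[sum(z[j] * z[k] for z in z_rows) / n for k in range(2)]
          for j in range(2)]
    b_2sls = linear_gmm(z_rows, x, y, inv2(zz))
    # Step 2: W = S^(-1), with S built from squared first-step residuals,
    # so the weighting is robust to heteroscedasticity.
    resid = [yi - b_2sls * xi for xi, yi in zip(x, y)]
    s = [[sum(e * e * z[j] * z[k] for z, e in zip(z_rows, resid)) / n
          for k in range(2)] for j in range(2)]
    b_gmm = linear_gmm(z_rows, x, y, inv2(s))
    return b_2sls, b_gmm


def simulate_heteroscedastic(n, beta, seed):
    """Endogenous x, two valid instruments, error variance depending on z1."""
    rng = random.Random(seed)
    z_rows, x, y = [], [], []
    for _ in range(n):
        z1, z2, c = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        u = c * (0.5 + abs(z1))       # heteroscedastic disturbance
        xi = z1 + 0.5 * z2 + c        # correlated with u through c
        z_rows.append((z1, z2))
        x.append(xi)
        y.append(beta * xi + u)
    return z_rows, x, y


z_rows, x, y = simulate_heteroscedastic(n=4000, beta=1.5, seed=0)
b_2sls, b_gmm = two_step_gmm(z_rows, x, y)
```

Both estimates are consistent for the true coefficient in this design. The two differ only in the weighting matrix, which is where the efficiency gain of GMM under heteroscedasticity comes from; a single simulated sample shows the mechanics, not the gain itself.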