Chapter 4  Domestic Determinants of Chinese OFDI

4.4  Research Design

4.4.1  Time-series Data Analysis

Following conventional IDP studies (e.g. Barry et al., 2003; Kalotay, 2004; Liu et al., 2005), this study also adopts an aggregative approach by using macro-level time-series data. The author follows the standard procedure for time-series analysis, comprising the unit root test, the cointegration test, and the system exogeneity test, in order to detect characteristics of the economic data that may influence the validity and reliability of the final regression. These pre-tests helped the author to choose the appropriate estimation method.

4.4.1.1 Unit root test

For time-series data, the first step is to test the stationarity of the variables. The existence of a unit root is often a theoretical implication which questions the rational use of information available to researchers (Phillips and Perron, 1988). In particular, the presence of a unit root indicates that the trend is stochastic (variation is systematic but hardly predictable), rather than deterministic (variation is completely predictable) through the presence of a polynomial time trend (Maddala and Kim, 1998; Phillips and Perron, 1988). Unfortunately, it is well known that macroeconomic data are often non-stationary in levels, and standard regression techniques, such as OLS, will produce spurious results if the variables under consideration contain unit roots and are non-stationary (Johansen and Juselius, 1990; Johansen, 1992; Phillips and Perron, 1988). In spurious regressions, the statistically significant relations among the variables that appear in the results merely reflect a contemporaneous correlation rather than a causal relationship.

The augmented Dickey-Fuller (ADF) test (Dickey and Fuller, 1981) is applied to detect the possible existence of unit roots of the variables. The null hypothesis of this test is that the variable contains a unit root, and the alternative is that the variable is generated by a stationary process.
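The logic of the test can be sketched in a few lines of Python. This is a simplified illustration rather than the procedure used in this study: it runs the basic Dickey-Fuller regression with a constant and no lagged differences (the "augmented" part is omitted), the simulated series and the function names are hypothetical, and the critical value of -2.86 is the commonly tabulated asymptotic 5% value for the constant-only case.

```python
import math
import random


def dickey_fuller_t(series):
    """t-statistic on rho in: diff(y)_t = a + rho * y_{t-1} + e_t.

    Simplified Dickey-Fuller regression (no lagged differences).
    """
    y_lag = series[:-1]
    dy = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(dy)
    mean_x = sum(y_lag) / n
    mean_y = sum(dy) / n
    sxx = sum((x - mean_x) ** 2 for x in y_lag)
    sxy = sum((x - mean_x) * (d - mean_y) for x, d in zip(y_lag, dy))
    rho = sxy / sxx
    intercept = mean_y - rho * mean_x
    rss = sum((d - intercept - rho * x) ** 2 for x, d in zip(y_lag, dy))
    se_rho = math.sqrt(rss / (n - 2) / sxx)
    return rho / se_rho


def simulate_ar1(phi, n, seed):
    """Simulate y_t = phi * y_{t-1} + e_t; phi = 1 gives a random walk."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y


# Assumed asymptotic 5% critical value (constant, no trend).
CRITICAL_5PCT = -2.86

t_random_walk = dickey_fuller_t(simulate_ar1(1.0, 500, seed=1))
t_stationary = dickey_fuller_t(simulate_ar1(0.5, 500, seed=1))

# The unit-root null is rejected only when the statistic lies below the
# critical value (i.e. is more negative).
print("random walk:", round(t_random_walk, 2))
print("stationary :", round(t_stationary, 2))
print("reject unit root for stationary series:", t_stationary < CRITICAL_5PCT)
```

Note that the statistic is compared with Dickey-Fuller critical values rather than with the usual Student-t table, because its distribution under the null is non-standard.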

4.4.1.2 Cointegration

Cointegration is an econometric property of time-series variables: if two or more series are individually non-stationary, but a linear combination of them is stationary, then the series are said to be cointegrated (Maddala and Kim, 1998). Cointegration may arise when two variables cannot drift too far apart because of the market, or when two series are the input and the output of a black box of limited capacity, or of finite memory (Granger, 1981). The existence of cointegration indicates that the two series ‘move in a similar way, ignoring lags, over the long swings of the economy and in trend...although the two series may be unequal in the short term, they are tied together in the long run’ (Granger, 1981).
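A minimal worked example may help fix the idea. The series $x_t$ and $y_t$, the white-noise terms $u_t$ and $v_t$, and the coefficient 2 are hypothetical and are not taken from the data used in this study:

```latex
\begin{aligned}
x_t &= x_{t-1} + u_t, \\
y_t &= 2x_t + v_t, \\
y_t - 2x_t &= v_t .
\end{aligned}
```

Both $x_t$ and $y_t$ contain a unit root, yet the linear combination $y_t - 2x_t$ equals the white-noise term $v_t$ and is therefore stationary. In this illustration the two series are cointegrated with cointegration vector $(1, -2)'$.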

A regression equation only makes sense when the dependent variable and the independent variable(s) do not drift too far apart from each other over time, which means there is a long-run equilibrium relation between them (Maddala and Kim, 1998). If the dependent variable and its explanatory variables drifted apart from each other over time, the relation obtained by regression would be spurious (Enders, 2004; Maddala and Kim, 1998). Therefore, it is important to test whether the variables are cointegrated and have a long-run relation.

In the case of multi-cointegration, the vector error correction model (VECM) should be applied, as the deviation of the equilibrium from its long-run relation will be fed into its short-run dynamics in the VECM (Burke and Hunter, 2005; Enders, 2004). The error correction term (ECT) represents the long-run relation between the variables. The VECM(p-1) can be written in the following form:

$$\Delta X_t = c + \sum_{i=1}^{p-1} A_i \Delta X_{t-i} + \Pi X_{t-1} + \varepsilon_t$$    Equation 4.2

where $X_t$ is an n×1 vector of non-stationary I(1) variables, $c$ is an n×1 vector of constants, $\varepsilon_t$ is an n×1 vector of white noise, $\Pi$ is $-(I_n - \sum_{i=1}^{p} A_i)$ ($I_n$ being the n×n identity matrix) and $A_i$ is an n×n coefficient matrix (Burke and Hunter, 2005). $\Pi X_{t-1}$ in Equation 4.2 is sometimes called the long-term part and represents the long-term relation between the variables in the equation (Lutkepohl and Kratzig, 2004). If the coefficient matrix $\Pi$ has reduced rank r < n, $\Pi$ can be written as $\Pi = \alpha\beta'$ where α (loading matrix) and β (cointegration matrix) are n×r matrices with rank r. r is the number of cointegration relations and each column of β is the cointegration vector (Lutkepohl and Kratzig, 2004).
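The reduced-rank factorisation of the long-run coefficient matrix into a loading matrix α and a cointegration matrix β can be illustrated numerically. The sketch below assumes n = 2 variables and r = 1 cointegration relation; all numbers and the names `alpha`, `beta`, `pi` and `x_prev` are made up for illustration and are not estimates from this study.

```python
def outer(alpha, beta):
    """Build the n x n long-run matrix alpha * beta' for r = 1."""
    return [[a * b for b in beta] for a in alpha]


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


alpha = [-0.5, 0.2]   # hypothetical loadings (adjustment speeds)
beta = [1.0, -2.0]    # hypothetical cointegration vector

pi = outer(alpha, beta)

# A 2 x 2 matrix of rank 1 has a zero determinant.
det = pi[0][0] * pi[1][1] - pi[0][1] * pi[1][0]

# The long-term part pi * X_{t-1} equals alpha times the scalar
# disequilibrium beta' * X_{t-1}.
x_prev = [3.0, 1.0]
disequilibrium = sum(b * x for b, x in zip(beta, x_prev))
long_term_part = mat_vec(pi, x_prev)

print("pi =", pi)
print("det(pi) =", det)
print("beta' x =", disequilibrium)
print("pi x =", long_term_part)
```

The point of the example is that every row of the long-run matrix is a multiple of the same cointegration vector, so the n variables respond to a single equilibrium error, each with its own loading.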

Cointegration can be tested by applying the Johansen and Juselius (1990) procedure, in which two tests are conducted: the trace test and the maximal eigenvalue test. The method is to estimate the long-run coefficient matrix from an unrestricted vector autoregressive model and to test whether the restrictions implied by its reduced rank can be rejected. Writing r0 for the hypothesised number of cointegration vectors (to distinguish it from n, the number of variables), the null hypothesis of the trace test is r ≤ r0 against the alternative r > r0, and the null hypothesis of the maximal eigenvalue test is r = r0 against the alternative r = r0 + 1. Testing proceeds sequentially, starting from r0 = 0. Non-rejection of the null hypothesis indicates that there are r0 cointegration relations in the dataset. Rejection of the null hypothesis indicates acceptance of the alternative and the need to continue to the next stage, in which the null r ≤ r0 + 1 is tested.
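The sequential decision rule of the trace test can be sketched as follows. The function name `select_rank` is hypothetical, and the trace statistics and critical values are invented for illustration; in practice both would come from the estimation output and the appropriate published tables.

```python
def select_rank(trace_stats, critical_values):
    """Return the cointegration rank chosen by sequential trace testing.

    trace_stats[k] is the trace statistic for the null "at most k
    cointegration relations"; critical_values[k] is the matching
    critical value. Testing starts at k = 0 and stops at the first
    null that is not rejected.
    """
    for k, (stat, crit) in enumerate(zip(trace_stats, critical_values)):
        if stat < crit:          # null "rank <= k" not rejected
            return k
    return len(trace_stats)      # every null rejected: full rank


# Illustrative numbers only (three variables).
trace_stats = [35.2, 12.1, 2.3]
critical_values = [29.8, 15.5, 3.8]

rank = select_rank(trace_stats, critical_values)
print("selected cointegration rank:", rank)
```

With these made-up numbers the null of no cointegration is rejected (35.2 exceeds 29.8), whereas the null of at most one relation is not (12.1 is below 15.5), so the procedure stops at a rank of one.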

4.4.1.3 System exogeneity

Including endogenous variables in the regression model may be problematic for inference and analysis purposes. The existence of endogeneity implies two-way causal relations between the dependent variable and its explanatory variables (Liu et al., 2005), in which case the traditional OLS regression method cannot be utilised because its coefficient estimates will be biased. Therefore, the system exogeneity of the data needs to be tested in order to choose the appropriate regression model.

A variable is said to be weakly exogenous for the estimated parameter vector if estimating the parameter vector within a conditional model (conditional on the variable) does not entail a loss of information compared to estimating the vector in a full model without conditioning on the variable (Lutkepohl and Kratzig, 2004). Weak exogeneity can be tested against data by defining the stochastic properties of the conditioning variables in the VECM, which has the advantage that one can formulate a partial system as a conditional model and discuss its properties (Johansen, 1992). Restrictions can be imposed on the loading matrix, α, in order to detect the existence of weak exogeneity. If the i-th row of the α matrix is all zero, then the i-th variable is said to be weakly exogenous with respect to the cointegration matrix β.

The null hypothesis of the weak exogeneity test is that the variable is weakly exogenous to the system. Therefore, rejection of the null hypothesis indicates that the variable is endogenous, which must be taken into account when selecting the regression method.
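In terms of the loading matrix α with r cointegration relations, the hypotheses for the i-th variable can be written out explicitly. The element notation $\alpha_{ij}$ is introduced here for illustration only:

```latex
H_0:\ \alpha_{i1} = \alpha_{i2} = \dots = \alpha_{ir} = 0
\quad\text{against}\quad
H_1:\ \alpha_{ij} \neq 0 \ \text{for at least one } j \in \{1, \dots, r\}.
```

In the Johansen framework a restriction of this kind is commonly evaluated with a likelihood ratio statistic which, when a single variable is tested, is asymptotically chi-squared distributed with r degrees of freedom.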

4.4.1.4 Generalized method of moments (GMM)

Equation 4.1 estimated in this study can be re-written in a simplified version of Equation 4.2, as in Equation 4.3:

$$\Delta OFDI_t = c + \sum_{i=1}^{n} \gamma_i' \Delta X_{t-i} + \lambda \xi_{t-1} + \varepsilon_t$$    Equation 4.3

where $\Delta$ means the differences, $OFDI_t$ is the dependent variable, $X_t$ is a vector of the explanatory variables and $\xi_{t-1}$ is the disequilibrium of the last period, with $\xi_t = OFDI_t - \beta' X_t$ (Patterson, 2000). $\xi_{t-1}$ in Equation 4.3 is the ECT and represents the long-term relationship between the variables in the equation.
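How the ECT enters the equation can be sketched with a few lines of Python. The disequilibrium is computed as the dependent variable minus the cointegration vector times the explanatory variables, and its one-period lag is aligned with the first difference of the dependent variable. All data values, the cointegration vector `beta` and the function name are hypothetical and serve only to show the bookkeeping.

```python
def error_correction_terms(ofdi, x_rows, beta):
    """Disequilibrium in each period: ofdi_t - beta' x_t."""
    return [
        y - sum(b * x for b, x in zip(beta, row))
        for y, row in zip(ofdi, x_rows)
    ]


# Hypothetical levels data: four periods, two explanatory variables.
ofdi = [1.0, 1.5, 2.1, 2.4]
x_rows = [[0.5, 1.0], [0.8, 1.1], [1.0, 1.3], [1.2, 1.4]]
beta = [1.0, 0.5]   # assumed cointegration vector

ect = error_correction_terms(ofdi, x_rows, beta)

# First differences of the dependent variable for t = 1, 2, 3 ...
d_ofdi = [b - a for a, b in zip(ofdi[:-1], ofdi[1:])]
# ... are paired with the disequilibrium of the previous period.
ect_lagged = ect[:-1]

for change, lagged in zip(d_ofdi, ect_lagged):
    print(round(change, 2), round(lagged, 2))
```

Each change in the dependent variable is thus explained partly by how far the system was from its long-run relation one period earlier.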

Where cointegration and endogeneity exist, Equation 4.3 should be estimated by using the GMM (Greene, 2000). In the twenty years since it was first introduced, GMM has become a very popular tool among empirical researchers. Many standard estimators, including OLS, can be seen as special cases of GMM estimators (Baum et al., 2003; Roodman, 2006). Unlike OLS regression, GMM can take into account reverse causation between variables. Liu et al. (2005) applied, for the first time, a GMM estimation technique that takes into account time trends and co-movements between variables in order to increase the reliability of the conclusions; the same technique is applied in this study.
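The idea behind moment-based estimation can be illustrated with the simplest possible case: one regressor and one instrument, where the GMM estimator coincides with the instrumental variables estimator. The sketch below is not the specification used by Liu et al. (2005) or in this study; the data-generating process, the true coefficient of 2 and the function names are assumptions made for illustration.

```python
import random


def simulate(n, beta, seed):
    """Simulate y = beta * x + error, with x correlated with the error."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0.0, 1.0)      # instrument, unrelated to the error
        shared = rng.gauss(0.0, 1.0)  # shock entering both x and the error
        noise = rng.gauss(0.0, 1.0)
        xi = zi + shared
        yi = beta * xi + shared + noise
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


def moment_estimator(instrument, x, y):
    """Solve the sample moment condition sum(w_i * (y_i - b * x_i)) = 0."""
    num = sum(w * yi for w, yi in zip(instrument, y))
    den = sum(w * xi for w, xi in zip(instrument, x))
    return num / den


z, x, y = simulate(5000, beta=2.0, seed=0)

# Using x as its own instrument reproduces OLS (without intercept),
# which is biased here because x is endogenous.
b_ols = moment_estimator(x, x, y)
# Using z imposes a moment condition that holds in this simulation.
b_iv = moment_estimator(z, x, y)

print("OLS-type estimate:", round(b_ols, 3))
print("IV/GMM estimate  :", round(b_iv, 3))
```

The same function yields OLS when the regressor serves as its own instrument, which shows concretely in what sense OLS is a special case of GMM. With more instruments than parameters the moment conditions can no longer all be set exactly to zero, and GMM instead minimises a weighted quadratic form in them.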