Chapter 4 Research Methodology and Data
4.1 Research Model Assessing the Scale, Technique and Composition Effects of Trade
4.1.2 Empirical Model Estimation Methods
This study applies the autoregressive distributed lag (ARDL) model to test for the existence of a long-run relationship among the time series variables.
One of the main advantages of the ARDL technique is that it can be applied irrespective of whether the variables are I(0), I(1) or fractionally co-integrated (Pesaran & Pesaran, 1997). The ARDL model takes a sufficient number of lags to capture the dynamic impacts of all dependent and independent variables as well as of the error term. Furthermore, the error correction model (ECM) is derived from the ARDL model through a simple linear transformation. The ECM integrates short-run adjustments with the long-run equilibrium without losing long-run information.
Pesaran and Shin (1999) also demonstrated that the simultaneous estimation of long-run and short-run components, together with an appropriate choice of lags in the ARDL framework, removes the problems associated with serial correlation and endogeneity. Another important advantage of the ARDL procedure is that estimation is possible even when the explanatory variables are endogenous (Pesaran, Shin, & Smith, 2001). Finally, the ARDL model has proved to be suitable for small-sample studies (Farhani et al., 2014).
The estimation procedures are described next.
(i) Data Stationarity
When time series data are used in regression analysis, it is important to determine whether each series is stationary or non-stationary in order to avoid the spurious regression problem. Although the ARDL technique can be applied when the data are I(0) or I(1), it is not applicable if any series is I(2), because the critical values for the F-statistic provided by Pesaran et al. (2001) are not valid for I(2) data. We apply three different tests to check the stationarity of the data, namely: (i) the Augmented Dickey-Fuller (ADF) test; (ii) the Dickey-Fuller GLS unit root test; and (iii) the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test.
The Dickey-Fuller GLS test is a modified version of the conventional ADF test that de-trends the series prior to the estimation of the ADF test regression (Begum et al., 2015). The critical values of the Dickey-Fuller GLS unit root test were calculated for 50 observations by Elliott, Rothenberg, and Stock (1996). The KPSS test, whose null hypothesis is stationarity, is conducted to complement the ADF and Dickey-Fuller GLS tests, based on the argument that tests built on the null hypothesis that a series is I(1) have low power to reject that null (Ang, 2008).
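The way the tests complement one another can be sketched as a small decision rule. The sketch assumes that the p-values have already been obtained from a statistics package; the function names `classify_series` and `integration_order`, the 5% level and the verdict labels are illustrative assumptions, not part of the procedure described in this chapter.

```python
def classify_series(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Combine an ADF-type test (null: unit root) with KPSS (null: stationarity)."""
    rejects_unit_root = adf_pvalue < alpha
    rejects_stationarity = kpss_pvalue < alpha
    if rejects_unit_root and not rejects_stationarity:
        return "stationary"
    if not rejects_unit_root and rejects_stationarity:
        return "unit root"
    # The two tests disagree with each other, so no verdict is given.
    return "inconclusive"


def integration_order(level_verdict, diff_verdict):
    """Return 0 or 1 when the verdicts are clear-cut, otherwise None.

    None covers inconclusive results and possible I(2) series, for which
    the bounds-test critical values would not be valid.
    """
    if level_verdict == "stationary":
        return 0
    if level_verdict == "unit root" and diff_verdict == "stationary":
        return 1
    return None


# Hypothetical p-values for one series in levels and in first differences.
level = classify_series(adf_pvalue=0.40, kpss_pvalue=0.01)
diff = classify_series(adf_pvalue=0.01, kpss_pvalue=0.30)
print(level, diff, integration_order(level, diff))
```

A series for which `integration_order` returns `None` would need closer inspection before the ARDL approach is applied.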
In addition, we conduct the Johansen co-integration test to check for the existence of a long-run equilibrium among the variables in levels: trade openness, GDP per capita, energy consumption and CO2 emissions. The Johansen test indicates whether co-integrating relationships exist among the variables, but it does not measure the magnitude of the possible long-run impact. Therefore, we proceed to the Bounds test (or F-test) and the associated ARDL estimates of the short-run and long-run relationships among the variables.
(ii) Optimal Lag Length of Each Variable
In order to choose the optimal lag length for each variable, the ARDL method estimates $(p+1)^k$ regressions, where $p$ is the maximum number of lags and $k$ is the number of variables in the equation. The model can be selected based on the Schwarz criterion (SC), also known as the Bayes information criterion (BIC), or on Akaike's information criterion (AIC). The BIC is the more parsimonious criterion and tends to select the smallest possible lag length, whereas the AIC tends to select the maximum relevant lag length.
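The size of this search grid can be checked by direct enumeration. The sketch below lists every combination of lag orders from 0 to p for k variables; the function name `candidate_lag_orders` and the values p = 2 and k = 5 are illustrative assumptions.

```python
from itertools import product


def candidate_lag_orders(max_lag, n_variables):
    """Every combination of lag orders 0..max_lag, one order per variable."""
    return list(product(range(max_lag + 1), repeat=n_variables))


# Hypothetical setting: at most 2 lags for each of 5 variables.
orders = candidate_lag_orders(max_lag=2, n_variables=5)
print(len(orders))  # 243, i.e. (2 + 1) ** 5
```

Each tuple in `orders` corresponds to one candidate regression, which is then ranked by the chosen information criterion.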
The Akaike information criterion is given as (Hill, Griffiths, & Lim, 2011):
$$AIC = \ln\left(\frac{SSE}{T}\right) + \frac{2K}{T}$$
where SSE is the sum of squared errors, K is the number of coefficients that are estimated, and T is the sample size.
The Schwarz criterion, or the Bayes information criterion (Hill et al., 2011), is given as:
$$BIC = \ln\left(\frac{SSE}{T}\right) + \frac{K \ln(T)}{T} \tag{4.5}$$
where K is the number of coefficients that are estimated; T is the sample size.
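The two criteria can be compared numerically with a short sketch. The functions follow the formulas above directly; the helper `select_model`, the model labels and the SSE values are hypothetical and were chosen only to show a case in which the heavier BIC penalty leads to a smaller model than the AIC.

```python
import math


def aic(sse, n_obs, n_coef):
    """Akaike information criterion: ln(SSE/T) + 2K/T."""
    return math.log(sse / n_obs) + 2 * n_coef / n_obs


def bic(sse, n_obs, n_coef):
    """Schwarz (Bayes) information criterion: ln(SSE/T) + K ln(T)/T."""
    return math.log(sse / n_obs) + n_coef * math.log(n_obs) / n_obs


def select_model(candidates, n_obs, criterion):
    """Return the label of the candidate with the smallest criterion value.

    candidates maps a label to a (SSE, number of coefficients) pair; all
    candidates are assumed to be estimated on the same n_obs observations.
    """
    return min(candidates, key=lambda label: criterion(candidates[label][0],
                                                       n_obs,
                                                       candidates[label][1]))


# Hypothetical fits on T = 40 observations (values are made up).
candidates = {
    "ARDL(1,0)": (12.0, 3),  # intercept, one lag of y, current x
    "ARDL(2,1)": (10.5, 5),  # intercept, two lags of y, current and lagged x
}
print("AIC selects:", select_model(candidates, 40, aic))
print("BIC selects:", select_model(candidates, 40, bic))
```

With these made-up numbers the AIC prefers the larger specification and the BIC the smaller one, because ln(T) exceeds 2 once T is larger than about 7.4.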
(iii) Co-integration among Variables
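The ARDL framework for equation (4.2) is given as follows:
$$\Delta C_t = \alpha_0 + \sum_{i=1}^{p_1} \delta_i \Delta C_{t-i} + \sum_{i=0}^{p_2} \varphi_i \Delta E_{t-i} + \sum_{i=0}^{p_3} \omega_i \Delta Tr_{t-i} + \sum_{i=0}^{p_4} \gamma_i \Delta Y_{t-i} + \sum_{i=0}^{p_5} \theta_i \Delta Y_{t-i}^{2} + \lambda_1 C_{t-1} + \lambda_2 E_{t-1} + \lambda_3 Tr_{t-1} + \lambda_4 Y_{t-1} + \lambda_5 Y_{t-1}^{2} + U_t \tag{4.6}$$
where $\alpha_0$ is the drift component and $U_t$ is white noise. The terms with summation signs represent the error correction dynamics. The coefficients $\lambda_j$ ($j = 1, \ldots, 5$) on the lagged level terms correspond to the long-run relationship, and $p_j$ ($j = 1, \ldots, 5$) are the maximum lag orders of each variable.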
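The ARDL framework for equation (4.3) is given as follows:
$$\Delta C_t = \beta_0 + \sum_{i=1}^{p_1} \delta_i \Delta C_{t-i} + \sum_{i=0}^{p_2} \varphi_i \Delta E_{t-i} + \sum_{i=0}^{p_3} \omega_i \Delta Tr_{t-i} + \sum_{i=0}^{p_4} \gamma_i \Delta Y_{t-i} + \lambda_1 C_{t-1} + \lambda_2 E_{t-1} + \lambda_3 Tr_{t-1} + \lambda_4 Y_{t-1} + \mu_t \tag{4.7}$$
where $\beta_0$ is the drift component and $\mu_t$ is white noise. The terms with summation signs represent the error correction dynamics. The coefficients $\lambda_j$ ($j = 1, \ldots, 4$) on the lagged level terms correspond to the long-run relationship, and $p_j$ ($j = 1, \ldots, 4$) are the maximum lag orders of each variable.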
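To make the structure of these equations concrete, the following sketch assembles the regressand and the regressors of such a regression from level series: lagged differences of the dependent variable, current and lagged differences of the regressors, and one-period lagged levels of all variables. The function name `build_uecm`, the toy numbers and the lag orders are illustrative assumptions; estimation itself (for example by OLS in a statistics package) is left out.

```python
def build_uecm(data, dependent, lags):
    """Build (y, X, names) for an unrestricted error correction regression.

    data: dict mapping a variable name to its list of level observations.
    lags: dict mapping a variable name to the lag order of its differences.
    """
    n = len(data[dependent])
    max_lag = max(lags.values())
    # First differences; index 0 has no predecessor and is never used.
    d = {k: [None] + [v[t] - v[t - 1] for t in range(1, n)]
         for k, v in data.items()}
    regressors = [k for k in data if k != dependent]

    names = ["const"]
    names += [f"d{dependent}(-{i})" for i in range(1, lags[dependent] + 1)]
    for k in regressors:
        names += [f"d{k}(-{i})" for i in range(0, lags[k] + 1)]
    names += [f"{k}(-1)" for k in data]

    y, X = [], []
    for t in range(max_lag + 1, n):
        row = [1.0]
        row += [d[dependent][t - i] for i in range(1, lags[dependent] + 1)]
        for k in regressors:
            row += [d[k][t - i] for i in range(0, lags[k] + 1)]
        row += [data[k][t - 1] for k in data]  # lagged levels
        y.append(d[dependent][t])
        X.append(row)
    return y, X, names


# Toy level series (made-up numbers, two variables only).
toy = {"C": [1, 2, 4, 7, 11, 16], "E": [2, 3, 5, 8, 12, 17]}
y, X, names = build_uecm(toy, dependent="C", lags={"C": 1, "E": 1})
print(names)
print(y[0], X[0])
```

The coefficients on the `(-1)` level columns are the ones restricted to zero under the null hypothesis of the Bounds test.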
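The F-test (or the Bounds test) tests for the existence of a long-run relationship among the variables.
For equation (4.6), the null hypothesis is $H_0: \lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = \lambda_5 = 0$ (no long-run relationship), where $\lambda_1, \ldots, \lambda_5$ are the coefficients on the lagged level terms; the alternative hypothesis is $H_1$: at least one $\lambda_j \neq 0$.
For equation (4.7), $H_0: \lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = 0$; $H_1$: at least one $\lambda_j \neq 0$.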
The calculated F-statistic is compared with two sets of critical values provided by Pesaran et al. (2001). One set assumes that all variables are I(0) and the other assumes that they are all I(1). If the calculated F-statistic exceeds the upper critical value, the null hypothesis of no co-integration is rejected irrespective of whether the variables are I(0) or I(1). If it falls below the lower critical value, the null hypothesis of no co-integration cannot be rejected. If it falls within the band between the two critical values, the test is inconclusive.
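The three-way decision rule can be written out as a short sketch. The function name `bounds_test_decision` is an assumption, and the bounds used in the example are placeholders rather than tabulated critical values from Pesaran et al. (2001); in practice they would be looked up for the relevant number of regressors, deterministic terms and significance level.

```python
def bounds_test_decision(f_stat, lower_bound, upper_bound):
    """Compare an F-statistic with the I(0) (lower) and I(1) (upper) bounds."""
    if lower_bound > upper_bound:
        raise ValueError("lower bound must not exceed upper bound")
    if f_stat > upper_bound:
        return "reject H0: evidence of a long-run relationship"
    if f_stat < lower_bound:
        return "cannot reject H0: no evidence of a long-run relationship"
    return "inconclusive"


# Placeholder bounds (NOT tabulated critical values) and made-up statistics.
for f in (5.2, 3.4, 2.1):
    print(f, "->", bounds_test_decision(f, lower_bound=3.0, upper_bound=4.0))
```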
(iv) Estimation of the Long-run and Short-run Elasticities
It has been proven that if a set of series is co-integrated, then there exists an error correction mechanism. The error correction mechanism keeps the variables moving closely together over time while allowing for a wide range of short-run dynamics (Engle & Granger, 1987, as cited in Baek & Kim, 2013, p. 746). This dynamic relationship is described by the error correction model, which contains both the short-run and the long-run adjustment parameters. The results of the ECM allow us to measure the speed at which the system returns to long-run equilibrium after a short-run shock. The coefficient of the ECM term is expected to be negative and statistically significant.
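The speed of adjustment can be given a rough interpretation in time units. The sketch below assumes, as a simplification not made in this chapter, that a deviation from equilibrium shrinks by the factor (1 + coefficient) each period and that short-run dynamics are ignored; the function names and the coefficient value of -0.25 are hypothetical.

```python
import math


def remaining_gap(ecm_coef, periods):
    """Share of an initial deviation left after the given number of periods."""
    return (1.0 + ecm_coef) ** periods


def half_life(ecm_coef):
    """Periods needed to close half of a deviation from equilibrium.

    Assumes monotone convergence, i.e. -1 < ecm_coef < 0.
    """
    if not -1.0 < ecm_coef < 0.0:
        raise ValueError("monotone convergence requires -1 < coefficient < 0")
    return math.log(0.5) / math.log(1.0 + ecm_coef)


# Hypothetical coefficient: 25% of the gap is corrected each period.
coef = -0.25
print(round(half_life(coef), 2), "periods to close half of the gap")
print(remaining_gap(coef, 2), "of the gap remains after two periods")
```

A positive coefficient would mean that deviations grow rather than shrink, which is why a negative sign is expected.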
Following the selection of the ARDL model by the AIC or BIC, the long-run relationship among the variables can be estimated by the ordinary least squares (OLS) method for equations (4.6) and (4.7). The ECM frameworks for equations (4.6) and (4.7) are then estimated using equations (4.8) and (4.9), respectively.
$$\Delta C_t = \alpha_0 + \sum_{i=1}^{p} \delta_i \Delta C_{t-i} + \sum_{i=0}^{p} \varphi_i \Delta E_{t-i} + \sum_{i=0}^{p} \omega_i \Delta Tr_{t-i} + \sum_{i=0}^{p} \gamma_i \Delta Y_{t-i} + \sum_{i=0}^{p} \theta_i \Delta Y_{t-i}^{2} + \alpha\, ECM_{t-1} + U_t \tag{4.8}$$
$$\Delta C_t = \beta_0 + \sum_{i=1}^{p} \delta_i \Delta C_{t-i} + \sum_{i=0}^{p} \varphi_i \Delta E_{t-i} + \sum_{i=0}^{p} \omega_i \Delta Tr_{t-i} + \sum_{i=0}^{p} \gamma_i \Delta Y_{t-i} + \alpha\, ECM_{t-1} + \mu_t \tag{4.9}$$
where $C_t$ is CO2 emissions per capita, $E_t$ is commercial energy use per capita, $Y_t$ is real per capita GDP, $Y_t^2$ is the square of real per capita GDP, $Tr_t$ is the openness ratio, which is used as a proxy for foreign trade, and $U_t$ and $\mu_t$ are the regression error terms. $\Delta$ denotes the first difference of a variable. $ECM_{t-1}$ is the lagged error correction term derived from the ARDL model, and its coefficient $\alpha$ measures the speed of adjustment.
All variables in equations (4.8) and (4.9) are in their natural logarithmic form.
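Because the variables are in logarithms, long-run coefficients can be read as elasticities. One common normalisation, which this chapter does not spell out and which is therefore an assumption here, divides the coefficient on each lagged level regressor by the negative of the coefficient on the lagged level of the dependent variable. The function name and all coefficient values below are hypothetical.

```python
def long_run_elasticities(level_coefs, dependent="C"):
    """Normalise lagged-level coefficients on the dependent variable's own.

    level_coefs maps a variable name to the estimated coefficient on its
    one-period lagged level; the dependent variable's own coefficient is
    expected to be negative.
    """
    own = level_coefs[dependent]
    if own >= 0:
        raise ValueError("own lagged-level coefficient should be negative")
    return {name: -coef / own
            for name, coef in level_coefs.items()
            if name != dependent}


# Hypothetical lagged-level coefficients (not estimates from this study).
coefs = {"C": -0.5, "E": 0.4, "Tr": 0.1, "Y": 0.3}
print(long_run_elasticities(coefs))
```

Under this reading, a value of 0.8 for energy use would mean that a 1% increase in energy use is associated with a 0.8% increase in emissions in the long run.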
(v) Diagnostic Tests and Stability of the Estimated Model
Lastly, the selected ARDL specification is checked for robustness using diagnostic tests for normality, heteroscedasticity and serial correlation of the residuals, and for the stability of the estimated model.
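As an illustration of one such diagnostic, the sketch below computes the Jarque-Bera statistic for residual normality. The chapter does not name the specific tests used, so the choice of Jarque-Bera is an assumption, and the residuals are made-up numbers. Under the null hypothesis of normality the statistic is asymptotically chi-square distributed with 2 degrees of freedom (5% critical value of about 5.99), so it should be read cautiously in small samples.

```python
def jarque_bera(residuals):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4)."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [e - mean for e in residuals]
    m2 = sum(c ** 2 for c in centered) / n  # variance (population form)
    m3 = sum(c ** 3 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Made-up residuals, for illustration only.
residuals = [-2.0, -1.0, 0.0, 1.0, 2.0]
stat = jarque_bera(residuals)
print(round(stat, 4), "reject normality" if stat > 5.99 else "cannot reject normality")
```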
Figure 4-1: Estimation procedures for empirical models