Chapter 4 Research Methodology and Data

4.1 Research Model Assessing the Scale, Technique and Composition Effects of Trade

4.1.2 Empirical Model Estimation Methods

This study applies the autoregressive distributed lag (ARDL) model to test for the existence of a long-run relationship among the time series variables.

One of the main advantages of the ARDL technique is that it can be applied irrespective of whether the variables are I(0), I(1) or fractionally co-integrated (Pesaran & Pesaran, 1997). The ARDL model takes a sufficient number of lags to capture the dynamic impacts of all dependent and independent variables, as well as of the error term. Furthermore, the error correction model (ECM) is derived from the ARDL model through a simple linear transformation. The ECM integrates short-run adjustments with the long-run equilibrium without losing long-run information.

Pesaran and Shin (1999) also demonstrated that the simultaneous estimation of the long-run and short-run components, with appropriate lags in the ARDL framework, removes the problems associated with serial correlation and endogeneity. Another important advantage of the ARDL procedure is that estimation is possible even when the explanatory variables are endogenous (Pesaran, Shin, & Smith, 2001). Finally, the ARDL model has proved to be suitable for studies with a small sample size (Farhani et al., 2014).
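As a minimal illustration of the linear transformation mentioned above, consider a generic ARDL(1,1) model with a single regressor. The symbols $y_t$, $x_t$, $a_0$, $a_1$, $b_0$, $b_1$ and $\varepsilon_t$ are illustrative assumptions and are not the variables or coefficients of this study. Subtracting $y_{t-1}$ from both sides, and adding and subtracting $b_0 x_{t-1}$, gives the error correction form:

```latex
\begin{align*}
y_t &= a_0 + a_1 y_{t-1} + b_0 x_t + b_1 x_{t-1} + \varepsilon_t \\
\Delta y_t &= a_0 + b_0 \Delta x_t - (1 - a_1)\, y_{t-1} + (b_0 + b_1)\, x_{t-1} + \varepsilon_t \\
           &= b_0 \Delta x_t - (1 - a_1)\left[ y_{t-1} - \frac{a_0}{1 - a_1} - \frac{b_0 + b_1}{1 - a_1}\, x_{t-1} \right] + \varepsilon_t
\end{align*}
```

Assuming $|a_1| < 1$, the term in square brackets is the deviation from the long-run equilibrium, $(b_0 + b_1)/(1 - a_1)$ is the long-run coefficient, and $-(1 - a_1)$ is the speed of adjustment, which is negative in this simple case.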

The estimation procedures are described next.

(i) Data Stationarity

When time series data are used in regression analysis, it is important to determine whether each series is stationary or non-stationary in order to avoid the spurious regression problem. Although the ARDL technique can be applied when the data are I(0) or I(1), it is not applicable if the data are I(2), because the critical values for the F-statistic provided by Pesaran et al. (2001) are not valid for I(2) data. We apply three different unit root tests to check the stationarity of the data, namely: (i) the Augmented Dickey-Fuller (ADF) test; (ii) the Dickey-Fuller GLS unit root test; and (iii) the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test.

The Dickey-Fuller GLS test is a modified version of the conventional ADF test that de-trends the series before the ADF test regression is estimated (Begum et al., 2015). The critical values of the Dickey-Fuller GLS unit root test are calculated for 50 observations by Elliott, Rothenberg, and Stock (1996). The KPSS test, whose null hypothesis is stationarity, is conducted to complement the ADF and Dickey-Fuller GLS tests, based on the argument that tests designed on the basis of the null hypothesis that a series is I(1) have low power to reject that null (Ang, 2008).
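The logic of checking the order of integration can be sketched with a bare Dickey-Fuller regression. This is only a simplified illustration under stated assumptions: the series is simulated, the helper `df_tstat` is a hypothetical name, no lagged differences are included (so it is not the full ADF test), and the DF-GLS and KPSS tests used in this study are not reproduced. In applied work a tested statistical package would normally be used instead.

```python
import numpy as np

def df_tstat(series):
    """t-statistic on rho in: dy_t = c + rho * y_{t-1} + e_t (no lagged differences)."""
    y = np.asarray(series, dtype=float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])   # constant and lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)      # OLS estimates
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])    # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)              # OLS covariance matrix
    return beta[1] / np.sqrt(cov[1, 1])

# Simulated random walk: I(1) by construction (illustrative data, not the study's).
rng = np.random.default_rng(0)
level = np.cumsum(rng.normal(size=500))

stat_level = df_tstat(level)           # unit root should not be clearly rejected
stat_diff = df_tstat(np.diff(level))   # first difference should be clearly stationary

# The statistic follows the Dickey-Fuller distribution, not Student's t; the
# large-sample 5% critical value for the constant-only case is roughly -2.86.
print(f"level: {stat_level:.2f}, first difference: {stat_diff:.2f}")
```

A series whose level does not reject the unit root but whose first difference does is treated as I(1). If the first difference also failed to reject, the series might be I(2) and the ARDL bounds approach would not be appropriate.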

In addition, we conduct the Johansen co-integration test to check for the existence of a long-run equilibrium among the variables in levels: trade openness, GDP per capita, energy consumption and CO2 emissions. The Johansen test indicates whether co-integrating relationships exist among the variables, but it does not measure the magnitude of the possible long-run impact. Therefore, we proceed to the Bounds test (or the F-test) to examine the short-run and long-run relationships among the variables.

(ii) Optimal Lag Length of Each Variable

In order to choose the optimal lag length for each variable, the ARDL method estimates $(p+1)^k$ regressions, where p is the maximum number of lags and k is the number of variables in the equation. The model can be selected using the Schwarz criterion (SC), also known as the Bayes information criterion (BIC), or Akaike's information criterion (AIC). The BIC favours a parsimonious model and selects the smallest possible lag length, whereas the AIC selects the maximum relevant lag length.
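To make the size of the search concrete, the candidate lag orders can be enumerated directly. The sketch below follows the counting rule stated above; the values `max_lag=2` and `n_variables=4` are illustrative assumptions, not the settings used in this study, and `candidate_lag_orders` is a hypothetical helper name.

```python
from itertools import product

def candidate_lag_orders(max_lag, n_variables):
    """All combinations of lag lengths 0..max_lag, one entry per variable."""
    return list(product(range(max_lag + 1), repeat=n_variables))

orders = candidate_lag_orders(max_lag=2, n_variables=4)
print(len(orders))   # 81, i.e. (2 + 1) ** 4
print(orders[:3])    # [(0, 0, 0, 0), (0, 0, 0, 1), (0, 0, 0, 2)]
```

Each tuple corresponds to one regression to be estimated and ranked by an information criterion.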

The Akaike information criterion is given as (Hill, Griffiths, & Lim, 2011):

$$AIC = \ln\left(\frac{SSE}{T}\right) + \frac{2K}{T} \tag{4.4}$$

where SSE is the sum of squared errors; K is the number of coefficients that are estimated; T is the sample size.

The Schwarz criterion, or the Bayes information criterion (Hill et al., 2011), is given as:

$$BIC = \ln\left(\frac{SSE}{T}\right) + \frac{K\ln(T)}{T} \tag{4.5}$$

where K is the number of coefficients that are estimated; T is the sample size.
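The two criteria can be computed directly from the formulas above. In the sketch below, the sample size and the (K, SSE) pairs are hypothetical numbers chosen only to show how the heavier BIC penalty can lead to a smaller model than the AIC; they are not results from this study.

```python
import math

def aic(sse, T, K):
    """Akaike information criterion: ln(SSE/T) + 2K/T."""
    return math.log(sse / T) + 2 * K / T

def bic(sse, T, K):
    """Schwarz (Bayes) information criterion: ln(SSE/T) + K ln(T)/T."""
    return math.log(sse / T) + K * math.log(T) / T

T = 50                                  # hypothetical sample size
candidates = {3: 2.40, 5: 2.10, 8: 1.95}  # hypothetical {K: SSE} for three models

# The preferred model minimises the criterion.
best_aic = min(candidates, key=lambda K: aic(candidates[K], T, K))
best_bic = min(candidates, key=lambda K: bic(candidates[K], T, K))
print(best_aic, best_bic)   # 5 3
```

Because ln(T) exceeds 2 once T is at least 8, the BIC penalises each additional coefficient more heavily than the AIC, which is why it tends to select shorter lag lengths.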

(iii) Co-integration among Variables

The ARDL framework for equation (4.2) is given as follows:

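$$
\begin{aligned}
\Delta C_t ={}& \lambda_0 + \sum_{i=1}^{p_1} \delta_i \Delta C_{t-i} + \sum_{i=0}^{p_2} \varphi_i \Delta E_{t-i} + \sum_{i=0}^{p_3} \omega_i \Delta Tr_{t-i} + \sum_{i=0}^{p_4} \gamma_i \Delta Y_{t-i} + \sum_{i=0}^{p_5} \theta_i \Delta Y^2_{t-i} \\
& + \lambda_1 C_{t-1} + \lambda_2 E_{t-1} + \lambda_3 Tr_{t-1} + \lambda_4 Y_{t-1} + \lambda_5 Y^2_{t-1} + U_t
\end{aligned}
\tag{4.6}
$$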

where $\lambda_0$ is the drift component and $U_t$ is white noise. The terms with summation signs represent the error correction dynamics. The coefficients $\lambda_i$ (i = 1, ..., 5) correspond to the long-run relationship, and $p_i$ (i = 1, ..., 5) is the maximum lag length of each variable.

The ARDL framework for equation (4.3) is given as follows:

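$$
\begin{aligned}
\Delta C_t ={}& \beta_0 + \sum_{i=1}^{p_1} \delta_i \Delta C_{t-i} + \sum_{i=0}^{p_2} \varphi_i \Delta E_{t-i} + \sum_{i=0}^{p_3} \omega_i \Delta Tr_{t-i} + \sum_{i=0}^{p_4} \gamma_i \Delta Y_{t-i} \\
& + \lambda_1 C_{t-1} + \lambda_2 E_{t-1} + \lambda_3 Tr_{t-1} + \lambda_4 Y_{t-1} + \mu_t
\end{aligned}
\tag{4.7}
$$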

where $\beta_0$ is the drift component and $\mu_t$ is white noise. The terms with summation signs represent the error correction dynamics. The coefficients $\lambda_i$ (i = 1, ..., 4) correspond to the long-run relationship, and $p_i$ (i = 1, ..., 4) is the maximum lag length of each variable.

The F-test (or the Bounds test) tests for the existence of a long-run relationship among the variables.

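For equation (4.6): the null hypothesis is $H_0: \lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = \lambda_5 = 0$ (the non-existence of a long-run relationship); the alternative hypothesis $H_1$ is that at least one of $\lambda_1, \ldots, \lambda_5$ is different from zero.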

For equation (4.7): $H_0: \lambda_1 = \lambda_2 = \lambda_3 = \lambda_4 = 0$; $H_1$: at least one of $\lambda_1, \ldots, \lambda_4$ is different from zero.

The calculated F-statistic is compared with two sets of critical values provided by Pesaran et al. (2001). One set assumes that all variables are I(0) and the other assumes that they are all I(1). If the calculated F-statistic exceeds the upper critical value, the null hypothesis of no co-integration is rejected irrespective of whether the variables are I(0) or I(1). If it is below the lower critical value, the null hypothesis of no co-integration cannot be rejected. If it falls inside the critical value band, the test is inconclusive.
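The decision rule above can be written as a small function. The sketch assumes the F-statistic is obtained in the usual way from the restricted regression (long-run level terms excluded) and the unrestricted regression; the function names are hypothetical, and all numbers, including the lower and upper bounds, are placeholders for illustration and are not taken from the tables of Pesaran et al. (2001).

```python
def f_statistic(sse_restricted, sse_unrestricted, n_restrictions, T, K):
    """Standard F-statistic for joint zero restrictions on n_restrictions coefficients.

    T is the number of observations and K the number of coefficients in the
    unrestricted regression.
    """
    numerator = (sse_restricted - sse_unrestricted) / n_restrictions
    denominator = sse_unrestricted / (T - K)
    return numerator / denominator

def bounds_decision(f_stat, lower, upper):
    """Compare the F-statistic with the I(0) (lower) and I(1) (upper) bounds."""
    if f_stat > upper:
        return "reject H0: evidence of a long-run relationship"
    if f_stat < lower:
        return "cannot reject H0: no evidence of a long-run relationship"
    return "inconclusive"

# Placeholder inputs: four level terms restricted to zero, as in equation (4.7).
f_stat = f_statistic(sse_restricted=3.0, sse_unrestricted=2.0,
                     n_restrictions=4, T=50, K=12)
print(f_stat)                                         # 4.75
print(bounds_decision(f_stat, lower=3.0, upper=4.0))  # placeholder bounds
```

In practice the bounds depend on the number of regressors, the deterministic terms in the model and the significance level, so they must be read from the appropriate table.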

(iv) Estimation of the Long-run and Short-run Elasticities

It has been proven that if a set of series is co-integrated, then there exists an error correction mechanism. The error correction mechanism helps the variables move closely together over time, while allowing for a wide range of short-run dynamics (Engle & Granger, 1987, as cited in Baek & Kim, 2013, p. 746). This dynamic relationship is described by the error correction model, which contains both the short-run and the long-run adjustment parameters. The results of the ECM allow us to measure the speed at which the system returns to its long-run equilibrium after a short-term shock. The coefficient of the ECM term is expected to be negative and statistically significant.

Following the selection of the ARDL model by the AIC or the BIC, the long-run relationship among the variables is estimated by the ordinary least squares (OLS) method for equations (4.6) and (4.7). The corresponding ECMs are then estimated using equations (4.8) and (4.9), respectively.

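$$
\begin{aligned}
\Delta C_t ={}& \alpha_0 + \sum_{i=1}^{p} \delta_i \Delta C_{t-i} + \sum_{i=0}^{p} \varphi_i \Delta E_{t-i} + \sum_{i=0}^{p} \omega_i \Delta Tr_{t-i} + \sum_{i=0}^{p} \gamma_i \Delta Y_{t-i} + \sum_{i=0}^{p} \theta_i \Delta Y^2_{t-i} \\
& + \alpha\, ECM_{t-1} + U_t
\end{aligned}
\tag{4.8}
$$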

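$$
\Delta C_t = \beta_0 + \sum_{i=1}^{p} \delta_i \Delta C_{t-i} + \sum_{i=0}^{p} \varphi_i \Delta E_{t-i} + \sum_{i=0}^{p} \omega_i \Delta Tr_{t-i} + \sum_{i=0}^{p} \gamma_i \Delta Y_{t-i} + \alpha\, ECM_{t-1} + \mu_t
\tag{4.9}
$$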

where $C_t$ is CO2 emissions per capita, $E_t$ is commercial energy use per capita, $Y_t$ is real per capita GDP, $Y_t^2$ is the square of real per capita GDP, $Tr_t$ is the openness ratio, which is used as a proxy for foreign trade, and $U_t$ and $\mu_t$ are the regression error terms. $\Delta$ denotes the first difference of a variable. $ECM_{t-1}$ is the lagged error correction term derived from the ARDL model, and its coefficient $\alpha$ measures the speed of adjustment.

All variables in equations (4.8) and (4.9) are in their natural logarithmic form.
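The estimation logic of this subsection can be sketched on simulated data. The example below is a simplified illustration under stated assumptions: it uses one hypothetical regressor `x` and one dependent variable `y` rather than the variables of equations (4.6) to (4.9), no lagged differences, and a data-generating process with a known long-run coefficient of 2.0 and a known adjustment speed of -0.4, so that the OLS estimates can be compared with the true values.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 500

# Simulated co-integrated pair (assumed process, not the study's data).
x = np.cumsum(rng.normal(size=T))           # I(1) regressor
y = np.zeros(T)
y[0] = 2.0 * x[0]
for t in range(1, T):
    dx_t = x[t] - x[t - 1]
    disequilibrium = y[t - 1] - 2.0 * x[t - 1]
    y[t] = y[t - 1] + 0.5 * dx_t - 0.4 * disequilibrium + rng.normal(scale=0.5)

dy, dx = np.diff(y), np.diff(x)

# Step 1: unrestricted regression with lagged levels (analogue of 4.6 / 4.7).
X_levels = np.column_stack([np.ones(T - 1), dx, y[:-1], x[:-1]])
coef, *_ = np.linalg.lstsq(X_levels, dy, rcond=None)
_, _, lam_y, lam_x = coef
long_run = -lam_x / lam_y                   # long-run coefficient of x

# Step 2: ECM with the lagged error correction term (analogue of 4.8 / 4.9).
ecm_lag = y[:-1] - long_run * x[:-1]
X_ecm = np.column_stack([np.ones(T - 1), dx, ecm_lag])
ecm_coef, *_ = np.linalg.lstsq(X_ecm, dy, rcond=None)
short_run, speed = ecm_coef[1], ecm_coef[2]

print(f"long-run: {long_run:.2f}, short-run: {short_run:.2f}, speed: {speed:.2f}")
```

With the variables in natural logarithms, as in this study, the long-run and short-run coefficients would be read as elasticities, and a negative adjustment coefficient indicates convergence towards the long-run equilibrium.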

(v) Diagnostic Tests and Stability of the Estimated Model

Lastly, the selected ARDL specification is checked for robustness using diagnostic tests for normality, heteroscedasticity and serial correlation of the residuals, and for the stability of the estimated model.
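Two of these residual checks can be illustrated with short functions. The text does not name the specific tests used, so the Jarque-Bera statistic (normality) and the Durbin-Watson statistic (first-order serial correlation) are assumed here as common choices; heteroscedasticity and stability tests are not covered by this sketch, and the residuals are simulated.

```python
import numpy as np

def jarque_bera(resid):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4), S = skewness, K = kurtosis."""
    e = np.asarray(resid, dtype=float)
    e = e - e.mean()
    n = e.size
    s2 = np.mean(e ** 2)
    skew = np.mean(e ** 3) / s2 ** 1.5
    kurt = np.mean(e ** 4) / s2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)

def durbin_watson(resid):
    """Durbin-Watson statistic: values near 2 suggest no first-order autocorrelation."""
    e = np.asarray(resid, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(2)
white = rng.normal(size=1000)                 # well-behaved simulated residuals

dw_white = durbin_watson(white)               # expected to be close to 2
dw_trending = durbin_watson(np.cumsum(white)) # strongly autocorrelated: close to 0
jb_skewed = jarque_bera(rng.exponential(size=1000))  # clearly non-normal

print(f"DW white: {dw_white:.2f}, DW autocorrelated: {dw_trending:.3f}, JB skewed: {jb_skewed:.0f}")
```

Under normality the Jarque-Bera statistic is asymptotically chi-squared with two degrees of freedom, so values far above about 5.99 point to non-normal residuals at the 5% level.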

Figure 4-1: Estimation procedures for empirical models

In document Impact of trade openness on the environment: An assessment of CO2 emissions in Vietnam (Page 68-72)