THE ECONOMETRIC MODEL AND ESTIMATION METHODS

Earlier large econometric models of GDP growth produced large errors when forecasting business cycle turning points (Khomo and Aziakpono, 2007:203). As a result, Estrella and Hardouvelis (1991) developed a new method of predicting business cycle turning points: a non-linear probit model that estimates the probability of a recession in a particular time period. This method has been widely adopted by academic researchers to forecast recession probabilities (see Bernard and Gerlach (1996), Estrella and Mishkin (1998), Karunaratne (1999), Moolman (2002 and 2003), Khomo and Aziakpono (2007) and Nyberg (2010)), and it is the method adopted in this study.

In their study, Estrella and Hardouvelis (1991:562) indicate that the yield curve is a forecaster of a binary variable, say $y_t$, which signals whether an economy is in a recession (i.e. $y_t = 1$) or not in a recession (i.e. $y_t = 0$). The implication is that a model able to link this binary variable to the slope of the yield curve can be used to investigate whether the term structure of interest rates forecasts business cycle turning points, and hence economic recessions. This binary variable can be represented as follows:

$$y_t = \begin{cases} 1, & \text{if the economy is in a recession at time } t \\ 0, & \text{otherwise} \end{cases}$$

Thus, conditional on the information set available $k$ quarters earlier, that is $\Omega_{t-k}$, $y_t$ follows a Bernoulli distribution (Hao and Ng, 2011:1303) and is represented as:

$$y_t \mid \Omega_{t-k} \sim B(p_t)$$

Define $E_{t-k}(\cdot)$ and $P_{t-k}(\cdot)$ as the conditional expectation and the conditional probability, in that order, given the information set $\Omega_{t-k}$ at time $t-k$. If it is assumed that $k = 1$, for instance, then the conditional expectation of $y_t$ may be represented as follows (Kauppi and Saikkonen, 2008: 778):

$$E_{t-1}(y_t) = P_{t-1}(y_t = 1) = p_t$$

Since the binary variable indicates whether an economy is in recession or not, the standard linear regression that links the slope of the yield curve to the binary variable can be specified as follows (Moolman, 2002: 46):

$$y_t^{*} = \alpha + \beta x_{t-k} + \varepsilon_t$$

Where $y_t^{*}$ is an unobservable variable that determines whether an economic recession occurs at time $t$, $x_{t-k}$ is the explanatory variable observed at time $t-k$, $\varepsilon_t$ is the error term of the model, and $\alpha$ and $\beta$ are the model coefficients to be estimated. Since the dependent variable can assume only two values, 1 representing a recession and 0 otherwise, a binary dependent variable model ought to be used (Khomo and Aziakpono, 2007: 203). According to Gujarati (2003:582), one way of developing a binary dependent variable model is through a probit model, which is the model used in this study so that the results can also be compared with those of previous studies.
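The step from the latent regression to a recession probability can be made explicit. The derivation below is a sketch under two standard assumptions that the text does not spell out: the recession indicator equals 1 exactly when the latent variable is positive, and the error term is standard normal. The symbols $y_t^{*}$, $x_{t-k}$, $\alpha$, $\beta$ and $\Phi$ are notation chosen for this illustration.

```latex
% Assumed: y_t = 1 if and only if y_t^* > 0, and \varepsilon_t \sim N(0, 1)
\begin{align*}
P(y_t = 1 \mid x_{t-k})
  &= P(y_t^{*} > 0 \mid x_{t-k}) \\
  &= P\big(\varepsilon_t > -(\alpha + \beta x_{t-k})\big) \\
  &= 1 - \Phi\big(-(\alpha + \beta x_{t-k})\big) \\
  &= \Phi(\alpha + \beta x_{t-k})
\end{align*}
```

The last line uses the symmetry of the normal distribution, and it shows why the fitted value is always a valid probability even though the latent regression itself is linear.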

The probit model is nonlinear, and its focus is to relate the probability of an economic recession in the current quarter $t$ to the slope of the yield curve observed in an earlier quarter such as $t-k$ (Estrella and Hardouvelis, 1991:562). In accordance with the theory of probability, the nonlinearity of the probit model ensures that the probability it produces falls within the interval [0, 1]. The probability of a recession at time $t$ is thus based on information observed at time $t-k$, with $k$ being an integer representing the number of quarters by which the explanatory variable is lagged. This can be given by the following traditional probit model (Nyberg, 2010:217):

$$P_{t-k}(y_t = 1) = \Phi(\alpha + \beta x_{t-k})$$

Where $P_{t-k}(y_t = 1)$ signals the probability of recession (i.e. the probability that $y_t = 1$), based on the information provided by the explanatory variable ($x_{t-k}$) at time $t-k$, and $\Phi$ represents the cumulative standard normal distribution function. The coefficients $\alpha$ and $\beta$ are estimated using the maximum likelihood (ML) method (Hao and Ng, 2011:1304). The ML estimates of these coefficients are computed by maximizing the following log-likelihood function (Estrella and Hardouvelis, 1991:562):

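$$\ln L = \sum_{t} \Big[ y_t \ln \Phi(\alpha + \beta x_{t-k}) + (1 - y_t) \ln\big(1 - \Phi(\alpha + \beta x_{t-k})\big) \Big] \qquad (4.10)$$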
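As an illustration of how the probit log-likelihood is evaluated, the following minimal Python sketch computes it for given coefficient values. It is not the estimation code used in this study: the function names, the data values and the clipping constant are assumptions made for the example, and in practice a numerical optimiser would search for the coefficients that maximise this function.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_log_likelihood(alpha, beta, y, x_lagged):
    """Log-likelihood of a static probit for 0/1 outcomes and one lagged regressor."""
    total = 0.0
    for y_t, x_tk in zip(y, x_lagged):
        p = norm_cdf(alpha + beta * x_tk)
        # Clip to avoid log(0); the bound 1e-12 is an arbitrary choice.
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total += y_t * math.log(p) + (1 - y_t) * math.log(1.0 - p)
    return total


# Made-up data for illustration only: a recession indicator and the
# yield spread observed k quarters earlier (already aligned).
y = [0, 0, 1, 1, 0, 1]
spread_lagged = [1.5, 0.8, -0.6, -1.2, 0.4, -0.3]

ll_null = probit_log_likelihood(0.0, 0.0, y, spread_lagged)
ll_negative_slope = probit_log_likelihood(0.0, -1.0, y, spread_lagged)
```

With both coefficients at zero every quarter receives a recession probability of 0.5. On this made-up sample a negative slope coefficient raises the log-likelihood, which is the pattern expected when a low or inverted spread precedes recessions.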

According to Dueker (1997:45), one of the problems with the traditional probit model (sometimes referred to as the static probit model) is that it has no dynamic structure in the dependent variable when it is applied to time series data. The probit model generally assumes that the random shocks in the model are independent and identically distributed with a mean of zero (Dueker, 1997:45). However, this assumption does not hold for many time series models. Estrella and Mishkin (1998:47) indicate that the probit model is generally constrained by overlapping data, so that the forecast errors are likely to be serially correlated. The implication is that significance tests of the model variables based on the standard test statistics are likely to yield meaningless results.

Due to the limitations of the traditional probit model, another approach was proposed by Dueker (1997). This method aims to remove the serial correlation present in the error term by adding a lag of the dependent variable as an explanatory variable, thereby including its dynamic structure in the model; this model is termed the dynamic probit model. According to Dueker (1997:45), including a lag of the dependent variable as an explanatory variable addresses the serial correlation of the error term by making the assumption that the error terms have a mean of zero over time more likely to be valid. An error term with a mean of zero over time, in turn, significantly reduces the chances of the error terms being correlated. Furthermore, Nyberg (2010:217) suggests that adding a lag of the dependent variable improves the forecasting performance of the probit model by taking into consideration the previous state of the economy (i.e. $y_{t-k}$) in predicting the future state (i.e. $y_t$). This dynamic probit model is specified as follows:

$$P_{t-k}(y_t = 1) = \Phi(\alpha + \beta x_{t-k} + \delta y_{t-k})$$

Where $y_{t-k}$ is the added lag of the dependent variable, and $\alpha$, $\beta$ and $\delta$ are coefficients which are estimated using the same maximum likelihood function presented in (4.10).
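The dynamic specification changes only the index inside the normal distribution function, so the main practical step is aligning each quarter's outcome with the lagged spread and the lagged recession indicator. The sketch below illustrates that alignment under stated assumptions: the helper names, the coefficient symbol delta, the choice of the same lag k for both regressors and the data values are all hypothetical and not taken from this study.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def build_dynamic_rows(y, x, k):
    """Pair each y[t] with x[t-k] and y[t-k]; the first k quarters are lost."""
    return [(y[t], x[t - k], y[t - k]) for t in range(k, len(y))]


def dynamic_probit_prob(alpha, beta, delta, x_lag, y_lag):
    """Recession probability with a lagged dependent variable in the index."""
    return norm_cdf(alpha + beta * x_lag + delta * y_lag)


# Made-up quarterly data for illustration only.
y = [0, 0, 1, 1, 0, 0]
spread = [1.0, -0.5, -1.0, 0.2, 0.9, 1.1]

rows = build_dynamic_rows(y, spread, k=2)

# Same spread, different previous state of the economy (delta assumed positive).
p_after_expansion = dynamic_probit_prob(-0.5, -1.0, 1.2, x_lag=0.2, y_lag=0)
p_after_recession = dynamic_probit_prob(-0.5, -1.0, 1.2, x_lag=0.2, y_lag=1)
```

With a positive coefficient on the lagged indicator, the same spread implies a higher recession probability when the economy was already in recession k quarters earlier, which is the persistence that the static model cannot capture.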

The other issue in probit model estimation is, according to Khomo and Aziakpono (2007: 204), the measure of goodness of fit for the explanatory variables used in the model. In his review, Estrella (1995) suggested a measure of the forecasting power of a given explanatory variable in a probit estimation over a given time horizon. This measure is known as the pseudo-$R^2$, and it has been adopted by many researchers such as Bernard and Gerlach (1996), Moneta (2003) and Hao and Ng (2011). Estrella and Mishkin (1998:47) maintain that, for a given estimated equation, the pseudo-$R^2$ is a measure of goodness of fit that corresponds to the coefficient of determination in a standard linear regression. This measure can be specified as follows (Dueker, 1997:43):

$$\text{pseudo-}R^2 = 1 - \left( \frac{\ln L_u}{\ln L_c} \right)^{-\frac{2}{n} \ln L_c}$$

Where $\ln L_u$ indicates the unrestricted maximum value of the log-likelihood of the estimated model, $\ln L_c$ represents the maximum value under the constraint that all coefficients other than the constant are zero, and $n$ is the number of observations. According to Hao and Ng (2011:1306), a high pseudo-$R^2$ indicates that the explanatory variables used are relevant, in that they increase the likelihood function of the model relative to a model that has only a constant term. As such, the pseudo-$R^2$ is used in conjunction with the test statistic to determine the time lags that provide the best fit for each of the explanatory variables considered.
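For concreteness, the goodness-of-fit measure can be computed from two log-likelihood values and the sample size. The sketch below assumes the Estrella form of the measure, 1 - (ln L_u / ln L_c) raised to the power -(2/n) ln L_c; the function name and the numbers are invented for the example and are not results from this study.

```python
def estrella_pseudo_r2(loglik_unrestricted, loglik_constant_only, n_obs):
    """Estrella-type pseudo R-squared from two maximised log-likelihoods.

    Both log-likelihoods are expected to be negative, with the unrestricted
    value at least as large as the constant-only value.
    """
    ratio = loglik_unrestricted / loglik_constant_only
    exponent = -(2.0 / n_obs) * loglik_constant_only
    return 1.0 - ratio ** exponent


# Hypothetical values: 80 quarters, constant-only log-likelihood of -40.
no_improvement = estrella_pseudo_r2(-40.0, -40.0, 80)
some_improvement = estrella_pseudo_r2(-20.0, -40.0, 80)
```

When the explanatory variables add nothing, the two log-likelihoods coincide and the measure is 0; as the unrestricted log-likelihood rises towards zero, the measure approaches 1.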

While the pseudo-$R^2$ is used to determine the time lags that produce the best fit of the explanatory variables used, it cannot be used to compare the forecasting performance of explanatory variables in a specified model. This is because the underlying dependent variable is unobservable, so it is impossible to determine what percentage of its variance is explained by the model (Hao and Ng, 2011:1306). According to Estrella and Mishkin (1998:47), values of the pseudo-$R^2$ other than 0 and 1 have no intuitive interpretation. For instance, a pseudo-$R^2$ of 0.65 indicates a 65% increase in the likelihood function of the estimated model, a figure with no apparent meaning that could be used to compare explanatory variables and models.

Thus, to compare the accuracy of the predictions produced by the explanatory variables, the forecasting error of the probit model is used (Moneta, 2003:26). Since the ideal aim of the probit model is to produce 1 in the event of a recession and 0 otherwise, a lower forecasting error may be associated with better predictive accuracy. Studies such as Dotsey (1998) and Khomo and Aziakpono (2007) make use of the root mean square error (RMSE) as the forecasting measure. This study also uses the RMSE to compare the predictive performance of the explanatory variables. The RMSE is a frequently used error measure that captures the dispersion around the true value of a data point by measuring the difference between the values predicted by a model and the actual observed values (Gujarati, 2003: 901). In other words, it indicates how close the fitted values are to the true values of the data points. Using the RMSE as a comparative tool therefore makes it possible to determine how the predictive accuracy of the explanatory variables changes across time periods. Moreover, since the measure is frequently used in similar studies, it allows the results to be compared with those of previous studies.
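The RMSE calculation described above can be sketched in a few lines of Python. The function name and the probability values are illustrative assumptions; the inputs are the recession probabilities produced by a model and the corresponding 0/1 recession indicators.

```python
import math


def rmse(predicted_probs, actual):
    """Root mean square error between forecast probabilities and 0/1 outcomes."""
    if len(predicted_probs) != len(actual):
        raise ValueError("inputs must have the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted_probs, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up forecasts from two hypothetical explanatory variables.
actual = [0, 0, 1, 1]
probs_variable_a = [0.1, 0.2, 0.8, 0.9]
probs_variable_b = [0.5, 0.5, 0.5, 0.5]

rmse_a = rmse(probs_variable_a, actual)
rmse_b = rmse(probs_variable_b, actual)
```

A model that always predicts 0.5 has an RMSE of exactly 0.5, which gives a simple reference point: an explanatory variable is informative on this measure only to the extent that its RMSE falls below that of such an uninformative forecast.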

To test the effectiveness of the dynamic probit model used in this study, the model was tested both in-sample and out-of-sample. The in-sample run tests the forecasting ability of the model within the period over which it was estimated, while the out-of-sample run tests the model on quarters beyond the estimation period. Dueker (2002:30) asserts that out-of-sample testing helps to ensure that the model parameters do not over-fit the in-sample data. According to Estrella and Mishkin (1996:1), the out-of-sample performance of a model provides a much truer reflection of its real-world forecasting ability.

In this study, the in-sample testing was performed with data from 1980 to 2000, while the out-of-sample testing was conducted with data from 2001 to 2012. The out-of-sample period was chosen for two reasons. First, it allowed for the evaluation of the performance of the dynamic probit model in 2003, a year for which the model produced an incorrect probability of recession, as reported by Khomo and Aziakpono (2007). In their study, the probit model was analysed only in-sample. Second, it allowed for the evaluation of the real-world forecasting ability of the dynamic probit model during the recent global economic recession of 2007.
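The sample split can be sketched as follows. Only the year boundaries (1980 to 2000 in-sample, 2001 to 2012 out-of-sample) come from the text; the tuple layout, the function name and the assumption that the data run over complete calendar years of quarterly observations are made for the example.

```python
def split_sample(observations, last_in_sample_year=2000, last_out_sample_year=2012):
    """Split (year, quarter, value) tuples into in-sample and out-of-sample lists."""
    in_sample = [obs for obs in observations if obs[0] <= last_in_sample_year]
    out_sample = [
        obs for obs in observations
        if last_in_sample_year < obs[0] <= last_out_sample_year
    ]
    return in_sample, out_sample


# Placeholder quarterly index for 1980 to 2012; the value field is a dummy.
observations = [
    (year, quarter, None)
    for year in range(1980, 2013)
    for quarter in range(1, 5)
]

in_sample, out_sample = split_sample(observations)
```

Under these assumptions the model coefficients would be estimated on the in-sample quarters only and then held fixed when producing recession probabilities for the out-of-sample quarters, so that the later forecasts do not use information from the period being forecast.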

In document The explanatory power of the yield curve in predicting recessions in South Africa (Page 61-66)