Forecasting evaluation - Empirical analysis

3.4 Empirical analysis

3.4.3 Forecasting evaluation

Our forecasting analysis is based on model estimations up to the end of 1999 and an out-of-sample evaluation period from 2000 to 2010. Subsample model estimations as well as descriptive statistics for the estimated time-varying persistence are presented


in the Appendix. Note that the LM test statistic is now found to be significant at the 1% level for the RV_t(22) variable. Also, the average estimated persistence from the TVP-GARCH model estimations is found to be lower for the subsample, which excludes the financial crisis, than for the full sample.

In the direct forecasting evaluation, we use the mean squared error (MSE) and the quasi-likelihood (QLIKE) loss functions, since Patton (2011) showed that both are robust in the sense that they yield the same ranking of two volatility forecasts when an observed (unbiased) volatility proxy is used instead of the unobserved true volatility. The two loss functions differ in an important way: the MSE is a symmetric loss function, whereas the QLIKE depends on the relative forecast error and penalizes forecasts that underestimate volatility more heavily. Moreover, Brownlees et al. (2011) show that the MSE has a bias that is proportional to the true volatility, whereas the bias of the QLIKE is independent of the volatility level.

Let $RV_{t+l}$ denote the realized volatility proxy that is based on 5-minute intraday returns and let $\hat{h}_{t+l|t}$ denote the $l$-step-ahead volatility forecast.$^{20}$ For observation $t$, the two loss functions are then given by

\[
MSE_t = \left( RV_{t+l} - \hat{h}_{t+l|t} \right)^2, \qquad
QLIKE_t = \frac{RV_{t+l}}{\hat{h}_{t+l|t}} - \log\!\left( \frac{RV_{t+l}}{\hat{h}_{t+l|t}} \right) - 1.
\]
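The asymmetry of the QLIKE can be illustrated with a minimal sketch of the two loss functions as defined above. The function names and the numerical values are hypothetical and chosen only for illustration; they are not taken from the empirical analysis.

```python
import math


def mse_loss(rv, h_hat):
    """Squared difference between the volatility proxy and the forecast."""
    return (rv - h_hat) ** 2


def qlike_loss(rv, h_hat):
    """QLIKE loss; it depends only on the ratio of proxy to forecast."""
    ratio = rv / h_hat
    return ratio - math.log(ratio) - 1.0


# Hypothetical example: proxy equal to 1.0, forecasts off by 0.5 in either direction.
rv = 1.0
under_forecast, over_forecast = 0.5, 1.5

# The MSE treats both errors identically.
print(mse_loss(rv, under_forecast), mse_loss(rv, over_forecast))

# The QLIKE assigns the larger loss to the forecast that underestimates volatility.
print(qlike_loss(rv, under_forecast), qlike_loss(rv, over_forecast))
```

Both losses are zero when the forecast equals the proxy, so only their behaviour away from that point differs.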

Note that at the beginning of period $t+1$, the time-varying GARCH coefficient $\beta_{t+1} = \beta_1 + \beta_2 F(\gamma, \Phi' x_t)$ is predetermined with respect to $\mathcal{F}_t$, since the transition function includes only lags of $x$ up to period $t$. Thus, the one-step-ahead volatility prediction $\hat{h}_{t+1|t}$ from the TVP-GARCH-MIDAS model is simply $h_{t+1}$, just as in the GJR-GARCH model. In computing volatility forecasts from the TVP-GARCH-MIDAS model beyond horizon $l = 1$, we make the simplifying assumption that $E[\beta_{t+l} \mid \mathcal{F}_t] = \beta_{t+1}$ for $l > 1$, that is, we keep $\beta_{t+1}$ fixed. Volatility forecasts are then obtained iteratively in a similar way as in the GJR-GARCH model.
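The iteration with a fixed GARCH coefficient can be sketched as follows. This is a minimal illustration under assumptions that are not stated in this section: a plain GJR-GARCH(1,1) recursion with intercept `omega`, ARCH coefficient `alpha` and asymmetry coefficient `gamma`, innovations that are symmetric around zero (so that the asymmetry term enters with weight 1/2), and no separate long-term MIDAS component. All names and parameter values are hypothetical.

```python
def iterate_forecasts(h_next, omega, alpha, gamma, beta_next, max_horizon):
    """Return [h_{t+1|t}, ..., h_{t+L|t}], holding the GARCH coefficient at beta_next.

    Assumes E[h_{t+l} | F_t] = omega + (alpha + gamma / 2 + beta_next) * E[h_{t+l-1} | F_t]
    for l > 1, i.e. a GJR-GARCH(1,1) recursion with symmetric innovations.
    """
    persistence = alpha + gamma / 2.0 + beta_next
    forecasts = [h_next]  # the one-step forecast is known at time t
    for _ in range(max_horizon - 1):
        forecasts.append(omega + persistence * forecasts[-1])
    return forecasts


# Hypothetical parameters: persistence 0.98, unconditional variance 0.02 / 0.02 = 1.0.
path = iterate_forecasts(h_next=2.0, omega=0.02, alpha=0.03, gamma=0.10,
                         beta_next=0.90, max_horizon=65)
print(path[0], path[1], path[-1])
```

Starting above the unconditional level, the forecasts decay geometrically towards it at a rate given by the persistence, so a lower value of `beta_next` implies faster mean reversion.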

The results of a forecast evaluation based on the QLIKE for daily forecast horizons l = 1, 10, 22, 65 are presented in Table 3.7. Based on a Diebold-Mariano test, we find significant improvements over forecasts from the GJR-GARCH benchmark model for the TVP-GARCH-MIDAS model including the RV across all horizons. On the other hand, the models including the ADS yield no significant improvements over the benchmark model. We find similar results for the mean squared error loss function. In Figure 3.11, we show the $R^2$s obtained from Mincer-Zarnowitz regressions across horizons l = 1, ..., 65, that is, from regressing realized volatility for period t + l on a constant and the respective l-step-ahead volatility forecast given t. The $R^2$ obtained from the RV forecasts consistently lies above the other ones across the horizons. On the other hand, the $R^2$ from the benchmark model is not distinguishable from those of the ADS models. In summary, we find strong evidence that the model with time-varying persistence determined by RV_t(22) significantly improves volatility forecasts over the GJR-GARCH model.
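The two evaluation tools used in this paragraph can be sketched in a few lines. The code below is an illustration only, not the implementation behind Table 3.7 or Figure 3.11. It assumes that the Diebold-Mariano statistic is computed from the loss differential with a Bartlett-weighted long-run variance over `horizon - 1` lags (the section does not say which variance estimator is used), and it uses the fact that the R² of a regression on a constant and a single regressor equals the squared sample correlation. All function names are hypothetical.

```python
import math


def diebold_mariano(loss_a, loss_b, horizon=1):
    """DM statistic for equal expected loss; negative values favour forecast A."""
    d = [a - b for a, b in zip(loss_a, loss_b)]
    n = len(d)
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[i] - d_bar) * (d[i - k] - d_bar) for i in range(k, n)) / n

    # Assumed estimator: Bartlett weights over horizon - 1 lags of the differential.
    long_run_var = autocov(0)
    for k in range(1, horizon):
        long_run_var += 2.0 * (1.0 - k / horizon) * autocov(k)
    return d_bar / math.sqrt(long_run_var / n)


def mincer_zarnowitz_r2(realized, forecast):
    """R^2 from regressing realized values on a constant and the forecast."""
    n = len(realized)
    mean_y = sum(realized) / n
    mean_x = sum(forecast) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(forecast, realized))
    s_xx = sum((x - mean_x) ** 2 for x in forecast)
    s_yy = sum((y - mean_y) ** 2 for y in realized)
    return s_xy ** 2 / (s_xx * s_yy)


# Hypothetical losses: forecast A has the smaller loss in every period.
losses_a = [0.1, 0.2, 0.1, 0.2]
losses_b = [0.3, 0.5, 0.2, 0.6]
print(diebold_mariano(losses_a, losses_b))

# A forecast that is a linear function of the outcome attains an R^2 of one.
print(mincer_zarnowitz_r2([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

Under standard conditions the statistic is compared with standard normal critical values. The Mincer-Zarnowitz R² measures only how well the forecast tracks the outcome after a linear correction, so a biased forecast can still attain a high value.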

Finally, we compare the volatility forecasts across volatility regimes in order to get a sense of when the adjustment in volatility persistence pays off the most. We follow the approach in Lanne and Saikkonen (2005) and split realized volatility into three categories. Then, based on each realized volatility observation, we calculate forecasts implied by the different model estimations at horizons from 1 to 65 days and take the average of the forecasts for a given horizon within each category. The average realized volatility as well as the volatility forecasts at each horizon are depicted in Figure 3.12. The forecasts are based on the initial realized volatility value $RV_0$ with $RV_0 < 0.6$, $0.6 \le RV_0 < 4.3$, and $RV_0 \ge 4.3$. The thresholds correspond to the 50% and 95% quantiles of realized volatility, and most of the observations falling into the last category coincide with the financial crisis period. Note that we plug in $RV_0$ as a starting value for all models and then iterate the forecasts based on the respective persistence estimates. This exercise therefore does not evaluate the actual volatility (point) forecasts, but rather compares the evolution of persistence that is implied by the models.
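The grouping and averaging step described above can be sketched as follows. This is a minimal illustration that uses the two thresholds reported in the text; the function names, the regime labels and the example data are hypothetical, and each path may hold either model forecasts or subsequent realized volatilities.

```python
def classify(rv0, low=0.6, high=4.3):
    """Assign an initial realized volatility value to one of three categories."""
    if rv0 < low:
        return "low"
    if rv0 < high:
        return "medium"
    return "high"


def average_paths_by_regime(initial_rv, paths):
    """Average the paths horizon by horizon within each category.

    paths[i] holds the values at horizons 1, 2, ... that follow initial_rv[i].
    """
    sums, counts = {}, {}
    for rv0, path in zip(initial_rv, paths):
        regime = classify(rv0)
        if regime not in sums:
            sums[regime] = [0.0] * len(path)
            counts[regime] = 0
        sums[regime] = [s + p for s, p in zip(sums[regime], path)]
        counts[regime] += 1
    return {r: [s / counts[r] for s in sums[r]] for r in sums}


# Hypothetical data: four starting values with two-horizon paths each.
initial_rv = [0.5, 0.3, 1.0, 5.0]
paths = [[0.5, 0.6], [0.3, 0.4], [1.0, 1.0], [5.0, 4.0]]
print(average_paths_by_regime(initial_rv, paths))
```

Comparing the averaged forecast paths with the averaged realized paths within a category then shows whether a model implies too much or too little persistence in that regime.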

We find that, on average, forecasts from the TVP-GARCH-MIDAS model with RV_t(22) capture the actual rate of persistence of realized volatility very well for the first (low initial $RV_0$) and the last (high initial $RV_0$) category. The two models with the ADS variables yield similar forecast persistence, though the negative ADS improves over the standard ADS variable in the high initial realized volatility regime. In the low volatility regime, the TVP-GARCH-MIDAS models imply a lower volatility persistence than the GJR-GARCH, but the level implied by the ADS is still too high compared to the actual persistence of realized volatility in this regime. Similarly, the ADS overestimates persistence in the high volatility regime.$^{21}$

$^{21}$ In line with the full-sample estimations, the TVP-GARCH-MIDAS model with realized volatility yields a greater time variation in persistence. The minimum persistence value for this model is 0.7788, compared to 0.9777 for the ADS and 0.9829 for the negative ADS. The maximum persistence value for the model with realized volatility is 0.9545, compared to 0.9972 for the ADS and 0.9922 for the negative ADS. The estimated persistence implied by the GJR-GARCH model equals 0.9881. The full descriptive

In document Financial Volatility, Dynamic Correlations, and Macroeconomic Fundamentals (Page 104-107)