3 Vector Autoregressive and Vector Error Correction
3.6 Forecasting VAR Processes and VECMs
So far in this chapter we have focused on constructing an adequate model for the DGP of a system of variables. Once such a model has been found, it may be used for forecasting as well as economic analysis. In this section, forecasting VAR processes will be discussed first. In Section 3.7, the concept of Granger-causality will be introduced as one tool for economic analysis. The concept is based on forecast performance and has received considerable attention in the theoretical and empirical literature. Other tools for analyzing VARs and VECMs will be discussed in the next chapter.
3.6.1 Known Processes
Forecasting vector processes is completely analogous to forecasting univariate processes, as discussed in Chapter 2. The levels VAR form (3.1) is particularly convenient to use in forecasting the variables $y_t$. We will again initially ignore deterministic terms and exogenous variables. Moreover, it is assumed first that the process parameters are known. Suppose the $u_t$'s are generated by an independent white noise process. In that case the minimum mean-squared error (MSE) forecast is the conditional expectation. For example, at forecast origin $T$, an $h$-step ahead forecast is obtained recursively as
$$y_{T+h|T} = A_1 y_{T+h-1|T} + \cdots + A_p y_{T+h-p|T}, \tag{3.45}$$
where $y_{T+j|T} = y_{T+j}$ for $j \le 0$. The corresponding forecast error is
$$y_{T+h} - y_{T+h|T} = u_{T+h} + \Phi_1 u_{T+h-1} + \cdots + \Phi_{h-1} u_{T+1}, \tag{3.46}$$
where it can be shown by successive substitution that
$$\Phi_s = \sum_{j=1}^{s} \Phi_{s-j} A_j, \quad s = 1, 2, \ldots, \tag{3.47}$$
with $\Phi_0 = I_K$ and $A_j = 0$ for $j > p$ [see Lütkepohl (1991, Sec. 11.3)]. As in the univariate case, $u_t$ is the 1-step forecast error in period $t-1$, and the forecasts are unbiased; that is, the forecast errors have expectation 0. The MSE matrix of an $h$-step forecast is
$$\Sigma_y(h) = E\{(y_{T+h} - y_{T+h|T})(y_{T+h} - y_{T+h|T})'\} = \sum_{j=0}^{h-1} \Phi_j \Sigma_u \Phi_j'. \tag{3.48}$$
If $u_t$ is uncorrelated white noise and is not necessarily independent over time, the forecasts obtained via a recursion as in (3.45) are just best linear forecasts.
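The recursions (3.45), (3.47), and (3.48) can be sketched in a few lines of Python with NumPy. This is a minimal illustration for a process without deterministic terms and with known coefficients; the function names `var_forecast`, `ma_matrices`, and `forecast_mse` are hypothetical and chosen here only for exposition.

```python
import numpy as np

def var_forecast(A, y_hist, h):
    """Recursive forecasts y_{T+1|T}, ..., y_{T+h|T} as in (3.45).

    A      : list of the p coefficient matrices A_1, ..., A_p (each K x K)
    y_hist : array of shape (n, K) with n >= p; the last row is y_T
    """
    p = len(A)
    # Most recent value first: [y_T, y_{T-1}, ..., y_{T-p+1}]
    lags = [np.asarray(y_hist[-i], dtype=float) for i in range(1, p + 1)]
    out = []
    for _ in range(h):
        y_next = sum(A[i] @ lags[i] for i in range(p))
        out.append(y_next)
        # Forecasts replace unknown future values in the next step
        lags = [y_next] + lags[:-1]
    return np.array(out)

def ma_matrices(A, h):
    """Phi_0, ..., Phi_{h-1} from recursion (3.47), using A_j = 0 for j > p."""
    p = len(A)
    K = A[0].shape[0]
    Phi = [np.eye(K)]
    for s in range(1, h):
        Phi.append(sum(Phi[s - j] @ A[j - 1] for j in range(1, min(s, p) + 1)))
    return Phi

def forecast_mse(A, Sigma_u, h):
    """Sigma_y(h) = sum_{j=0}^{h-1} Phi_j Sigma_u Phi_j' as in (3.48)."""
    return sum(P @ Sigma_u @ P.T for P in ma_matrices(A, h))
```

For a VAR(1) the list `A` contains a single matrix, and `forecast_mse(A, Sigma_u, 1)` reduces to `Sigma_u`, in line with the remark that $u_t$ is the 1-step forecast error.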
Also analogous to the univariate case, the forecast MSEs $\Sigma_y(h)$ for a stationary process converge to the unconditional covariance matrix of $y_t$, $E[(y_t - E(y_t))(y_t - E(y_t))'] = \sum_{j=0}^{\infty} \Phi_j \Sigma_u \Phi_j'$. Thus, the forecast uncertainty as reflected in the MSEs is bounded even for long-term forecasts for stationary processes. In contrast, for integrated processes the MSEs are generally unbounded as the horizon $h$ goes to infinity. Thus, the forecast uncertainty increases without bounds for forecasts of the distant future. This does not rule out, however, that forecasts of some components or linear combinations of I(1) variables have bounded MSEs. In fact, forecasts of cointegration relations have bounded MSEs even for horizons approaching infinity because they are forecasts for stationary variables.
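The contrast between bounded and unbounded MSEs is easiest to see in a worked special case. The following univariate AR(1) example, $y_t = a y_{t-1} + u_t$ with error variance $\sigma_u^2$, is an illustration added here and is not taken from the text. In this case the matrices in the forecast error representation are the scalars $a^j$, so that

$$\Sigma_y(h) = \sigma_u^2 \sum_{j=0}^{h-1} a^{2j} = \sigma_u^2\,\frac{1 - a^{2h}}{1 - a^2} \;\longrightarrow\; \frac{\sigma_u^2}{1 - a^2} \quad (h \to \infty) \qquad \text{for } |a| < 1,$$

whereas for the random walk, $a = 1$,

$$\Sigma_y(h) = \sigma_u^2 \sum_{j=0}^{h-1} 1 = h\,\sigma_u^2 \;\longrightarrow\; \infty \quad (h \to \infty).$$

The first limit is the unconditional variance of the stationary AR(1) process; the second grows linearly in the horizon.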
The corresponding forecast intervals reflect these properties as well. If the process $y_t$ is Gaussian, that is, $u_t \sim \text{iid } N(0, \Sigma_u)$, the forecast errors are also multivariate normal. Using this result, the following forecast intervals can be established:
$$[\,y_{k,T+h|T} - c_{1-\gamma/2}\,\sigma_k(h),\; y_{k,T+h|T} + c_{1-\gamma/2}\,\sigma_k(h)\,]. \tag{3.49}$$
Here $c_{1-\gamma/2}$ is the $(1 - \gamma/2)100$ percentage point of the standard normal distribution, $y_{k,T+h|T}$ denotes the $k$th component of $y_{T+h|T}$, and $\sigma_k(h)$ denotes the square root of the $k$th diagonal element of $\Sigma_y(h)$, that is, $\sigma_k(h)$ is the standard deviation of the $h$-step forecast error for the $k$th component of $y_t$. Obviously, if $\sigma_k(h)$ is unbounded for $h \to \infty$, the same is true for the length of the interval in (3.49).
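As a minimal sketch, the intervals in (3.49) can be computed from a point forecast and the corresponding MSE matrix as follows. The function name `forecast_intervals` and its arguments are hypothetical; the normal quantile is taken from Python's standard library, and Gaussian errors are assumed as in the text.

```python
import numpy as np
from statistics import NormalDist

def forecast_intervals(y_fore, Sigma_h, gamma=0.05):
    """Componentwise forecast intervals as in (3.49).

    y_fore  : length-K point forecast y_{T+h|T}
    Sigma_h : K x K forecast MSE matrix Sigma_y(h)
    gamma   : one minus the coverage level (0.05 gives 95% intervals)
    """
    c = NormalDist().inv_cdf(1 - gamma / 2)   # standard normal quantile
    sigma = np.sqrt(np.diag(Sigma_h))         # sigma_k(h) for k = 1, ..., K
    y_fore = np.asarray(y_fore, dtype=float)
    return y_fore - c * sigma, y_fore + c * sigma
```

The function returns the lower and upper bounds as two arrays of length $K$. The interval length for component $k$ is $2 c_{1-\gamma/2}\,\sigma_k(h)$, so it grows with the horizon exactly as $\sigma_k(h)$ does.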
Of course, if the DGP is modeled as a VECM, it may be rewritten in VAR form for forecasting. Alternatively, equivalent forecasting equations can be obtained directly from the VECM.
If a variable enters the system in differenced form only, it is, of course, still possible to generate forecasts of the levels. This can be done by using the relation between first differences and levels mentioned in the univariate case (see Chapter 2). More precisely, suppose that $y_{kt}$ enters as $\Delta y_{kt}$ only. Then $y_{k,T+h} = y_{k,T} + \Delta y_{k,T+1} + \cdots + \Delta y_{k,T+h}$, and thus an $h$-step forecast $y_{k,T+h|T} = y_{k,T} + \Delta y_{k,T+1|T} + \cdots + \Delta y_{k,T+h|T}$ may be obtained via forecasting the differences. The properties of the forecast errors including their MSEs follow from the joint distribution of the forecasts $\Delta y_{k,T+1|T}, \ldots, \Delta y_{k,T+h|T}$ [see Lütkepohl (1991)].
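Accumulating forecasts of the differences into forecasts of the levels amounts to a cumulative sum. The sketch below illustrates this for a single variable; the function name `level_forecasts` is hypothetical.

```python
import numpy as np

def level_forecasts(y_last, diff_forecasts):
    """Level forecasts y_{k,T+1|T}, ..., y_{k,T+h|T} from forecasts of the differences.

    y_last         : last observed level y_{k,T}
    diff_forecasts : forecasts of the first differences for horizons 1, ..., h
    """
    # y_{k,T+h|T} = y_{k,T} + sum of the first h difference forecasts
    return y_last + np.cumsum(np.asarray(diff_forecasts, dtype=float))
```

Note that this gives the point forecasts only. Because the forecast errors of the differences at different horizons are generally correlated, the MSE of a level forecast is not simply the sum of the MSEs of the difference forecasts.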
If deterministic or exogenous variables are present, or both, it is straightforward to extend the formula (3.45) to allow for such terms. Because the future development of deterministic variables is known by definition of the term "deterministic," they are particularly easy to handle. They may simply be added to the stochastic part. Exogenous variables may be more difficult to deal with in some respects. They are also easy to handle if their future development is known. Otherwise they have to be predicted along with the endogenous variables, in which case a model for their DGP is called for. Alternatively, if the exogenous variables are under full control of a policy maker, it may be desirable to forecast the endogenous variables conditionally on a specific future path of the exogenous variables to check the future implications of their specific values.
Suppose the following reduced form model is given:
$$y_t = A_1 y_{t-1} + \cdots + A_p y_{t-p} + C D_t + B z_t + u_t.$$
As usual, $D_t$ summarizes the deterministic terms and $z_t$ represents exogenous variables. In that case, one may consider conditional expectations
$$E(y_{T+h} \mid y_T, y_{T-1}, \ldots, z_{T+h}, z_{T+h-1}, \ldots) = A_1 E(y_{T+h-1} \mid \cdots) + \cdots + A_p E(y_{T+h-p} \mid \cdots) + C D_{T+h} + B z_{T+h}.$$
The forecast errors and MSEs will be unaffected if there is no uncertainty in the future values of the exogenous variables.
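A conditional forecast of this kind can be sketched by adding the deterministic and exogenous terms in each step of the recursion. The function name `var_forecast_exog` is hypothetical, and the future paths of $D_t$ and $z_t$ are assumed to be supplied by the user, for instance as a policy scenario.

```python
import numpy as np

def var_forecast_exog(A, C, B, y_hist, D_future, z_future):
    """Forecasts conditional on given future paths of D_t and z_t.

    A        : list of the p coefficient matrices A_1, ..., A_p (each K x K)
    C, B     : coefficient matrices of the deterministic and exogenous terms
    y_hist   : array of shape (n, K) with n >= p; the last row is y_T
    D_future : array of shape (h, number of deterministic terms), rows D_{T+1}, ..., D_{T+h}
    z_future : array of shape (h, number of exogenous variables), rows z_{T+1}, ..., z_{T+h}
    """
    p = len(A)
    lags = [np.asarray(y_hist[-i], dtype=float) for i in range(1, p + 1)]
    out = []
    for D_t, z_t in zip(D_future, z_future):
        # VAR part plus deterministic and exogenous contributions
        y_next = sum(A[i] @ lags[i] for i in range(p)) + C @ D_t + B @ z_t
        out.append(y_next)
        lags = [y_next] + lags[:-1]
    return np.array(out)
```

The forecast horizon is determined by the number of rows supplied in `D_future` and `z_future`. Different scenarios for the exogenous variables can be compared by calling the function with different `z_future` paths.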
3.6.2 Estimated Processes
So far we have worked under the assumption that the DGP is known, including its parameters. Of course, this assumption is unrealistic in practice. Therefore we will now consider the implications of using estimated VARs for the forecast precision. Denoting the optimal $h$-step forecast by $y_{T+h|T}$ as in (3.45) and furnishing its counterpart based on estimated coefficients by a hat give
$$\hat{y}_{T+h|T} = \hat{A}_1 \hat{y}_{T+h-1|T} + \cdots + \hat{A}_p \hat{y}_{T+h-p|T}, \tag{3.50}$$
where $\hat{y}_{T+j|T} = y_{T+j}$ for $j \le 0$ and the $\hat{A}_i$'s ($i = 1, \ldots, p$) are estimated parameters. The corresponding forecast error is
$$y_{T+h} - \hat{y}_{T+h|T} = [y_{T+h} - y_{T+h|T}] + [y_{T+h|T} - \hat{y}_{T+h|T}] = \sum_{j=0}^{h-1} \Phi_j u_{T+h-j} + [y_{T+h|T} - \hat{y}_{T+h|T}]. \tag{3.51}$$
The first term on the right-hand side involves future residuals $u_t$ with $t > T$ only, whereas the second term is determined by present and past variables if only past variables have been used for estimation. It follows that the two terms are independent if $u_t$ is independent white noise. Moreover, under standard assumptions, the difference $y_{T+h|T} - \hat{y}_{T+h|T}$ is small in probability as the sample size used for estimation gets large and the VAR coefficients are estimated more and more precisely. Hence, the forecast error covariance matrix is
$$\Sigma_{\hat{y}}(h) = E\{(y_{T+h} - \hat{y}_{T+h|T})(y_{T+h} - \hat{y}_{T+h|T})'\} = \Sigma_y(h) + o(1). \tag{3.52}$$
Here the quantity $o(1)$ denotes a term that tends to zero with increasing sample size. Thus, as far as the forecast MSE is concerned, the estimation uncertainty may be ignored in large samples. The same holds for setting up asymptotic forecast intervals. In small samples, it may still be preferable to include a correction term. Clearly, such a term will depend on the precision of the estimators. Hence, if precise forecasts are desired, it is a good strategy to look for precise parameter estimators. Further details on possible correction terms may be found in Lütkepohl (1991, Chapter 3) for the stationary case and in Reimers (1991), Engle & Yoo (1987), and Basu & Sen Roy (1987) for nonstationary processes.
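The effect of replacing known by estimated coefficients can be illustrated with a small simulation. The sketch below is an assumption-laden illustration, not the procedure used in the text: it simulates a stationary bivariate VAR(1) with an arbitrarily chosen coefficient matrix, estimates $A_1$ by least squares, and forms the plug-in 1-step forecast as in (3.50). The function name `fit_var1` and the degrees-of-freedom adjustment in the residual covariance estimate are choices made here for illustration.

```python
import numpy as np

def fit_var1(y):
    """Least-squares estimate of A_1 in y_t = A_1 y_{t-1} + u_t (no deterministic terms)."""
    X, Y = y[:-1], y[1:]
    # Solve Y ~= X @ A_1' in the least-squares sense
    A1_T, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ A1_T
    # One possible degrees-of-freedom adjustment (an assumption of this sketch)
    Sigma_u = resid.T @ resid / (len(Y) - y.shape[1])
    return A1_T.T, Sigma_u

# Simulate a stationary VAR(1) with a hypothetical coefficient matrix
rng = np.random.default_rng(0)
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
y = np.zeros((2001, 2))
for t in range(1, len(y)):
    y[t] = A1 @ y[t - 1] + rng.standard_normal(2)

A1_hat, Sigma_u_hat = fit_var1(y)
y_opt = A1 @ y[-1]        # optimal 1-step forecast with known A_1
y_hat = A1_hat @ y[-1]    # plug-in forecast based on the estimate
```

With a long sample, `y_hat` should lie close to `y_opt`, which is the sense in which the second term in (3.51) becomes negligible. Shortening the simulated sample makes the discrepancy, and hence the case for a small-sample correction, more visible.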
Again, extensions to processes with deterministic terms and exogenous variables are straightforward. The problems associated with the use of estimated rather than known parameters are analogous to those discussed for the VAR parameters. Of course, correction factors for forecast MSEs and forecast intervals may become more complicated, depending on the terms to be included in addition to the VAR part.
Example. To give an example we have reestimated the subset VECM (3.41) using only data up to the fourth quarter of 1994, and we use that model to predict the interest rate and inflation variables for the next 16 quarters after the sample end. The resulting forecasts are presented in Figure 3.6. The 95% forecast intervals, which are also shown in the figure, do not take into account the estimation uncertainty. Notice that JMulTi does not provide a correction of forecast intervals for parameter estimation uncertainty if VECMs are used. In other words, the forecast intervals shown in Figure 3.6 are smaller than more precise forecast intervals based on an asymptotic approximation that takes into account estimation uncertainty. Nevertheless all of the actually observed values for the years 1995–98 are within the forecast intervals. To see this fact a little better, the forecast period is magnified in the lower part of Figure 3.6. That all observed values are within the approximate 95% forecast intervals may be viewed as an additional confirmation of the model adequacy for forecasting purposes.
In Figure 3.6 the two time series are plotted in addition to the forecasts.
In these plots the time series variability can be compared with the size of the forecast intervals, and it becomes apparent that the forecast intervals reflect the overall variability of the series, as one would expect. Clearly, the intrinsic variability of a series must be taken into account in assessing the uncertainty of a forecast. Hence, it is not surprising that, especially for longer term forecasts, the overall series variability is reflected in the forecast intervals.