2 Univariate Time Series Analysis
2.5 Model Specification
Specifying the kinds of models we have discussed so far requires deciding on the orders of the various operators and possibly deterministic terms and distributional assumptions. This can be done by fitting a model, which includes all the terms that may be of interest, and then performing tests for model adequacy and model reduction in the usual way. This approach is limited by the fact, however, that the parameters in an overspecified ARMA model may not be unique; therefore, the estimators do not have the usual asymptotic properties.
Thus, model selection procedures are often applied for specifying the orders.
We discuss some of them in the context of pure AR models next.
2.5.1 AR Order Specification Criteria
Many of the AR order selection criteria are of the general form

$$Cr(n) = \log \sigma_u^2(n) + c_T \varphi(n),$$

where $\sigma_u^2(n) = T^{-1} \sum_{t=1}^{T} \hat{u}_t(n)^2$ is the error variance estimator based on the OLS residuals $\hat{u}_t(n)$ from an AR model of order $n$, $c_T$ is a sequence indexed by the sample size, and $\varphi(n)$ is a function that penalizes large AR orders. For the criteria discussed in this section, $\varphi(n)$ is the order of the fitted process and $c_T$ is a weighting factor that may depend on the sample size. The way this factor is chosen effectively distinguishes the different criteria. The first term on the right-hand side, $\log \sigma_u^2(n)$, measures the fit of a model with order $n$. This term decreases for increasing order because there is no correction for degrees of freedom in the variance estimator. It is important to notice, however, that the sample size is assumed to be constant for all orders $n$ and, hence, the number of presample values set aside for estimation is determined by the maximum order $p_{\max}$, say. The order that minimizes the criterion is chosen as estimator $\hat{p}$ of the true AR order $p$.
The following are examples of criteria that have been used in practice:
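$$\mathrm{AIC}(n) = \log \sigma_u^2(n) + \frac{2}{T}\, n \quad \text{[Akaike (1973, 1974)]},$$

$$\mathrm{HQ}(n) = \log \sigma_u^2(n) + \frac{2 \log\log T}{T}\, n \quad \text{[Hannan \& Quinn (1979)]}$$

and

$$\mathrm{SC}(n) = \log \sigma_u^2(n) + \frac{\log T}{T}\, n \quad \text{[Schwarz (1978) and Rissanen (1978)]}.$$

Table 2.1. Order selection criteria for U.S. investment series

| n | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AIC(n) | 2.170 | 1.935 | 1.942 | 1.950 | 1.942 | 1.963 | 1.990 | 2.018 | 1.999 | 1.997 | 2.032 |
| HQ(n) | 2.180 | 1.956 | 1.974 | 1.997 | 1.995 | 2.027 | 2.065 | 2.104 | 2.097 | 2.107 | 2.153 |
| SC(n) | 2.195 | 1.987 | 2.020 | 2.059 | 2.073 | 2.122 | 2.176 | 2.231 | 2.241 | 2.268 | 2.331 |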
Here the term $c_T$ equals $2/T$, $2 \log\log T/T$, and $\log T/T$ for the Akaike information criterion (AIC), the Hannan-Quinn criterion (HQ), and the Schwarz criterion (SC), respectively. The criteria have the following properties: AIC asymptotically overestimates the order with positive probability, HQ estimates the order consistently ($\operatorname{plim} \hat{p} = p$), and SC is even strongly consistent ($\hat{p} \to p$ a.s.) under quite general conditions if the actual DGP is a finite-order AR process and the maximum order $p_{\max}$ is larger than the true order. These results hold for both stationary and integrated processes [Paulsen (1984)]. Denoting the orders selected by the three criteria by $\hat{p}(\mathrm{AIC})$, $\hat{p}(\mathrm{HQ})$, and $\hat{p}(\mathrm{SC})$, respectively, the following relations hold even in small samples of fixed size $T \ge 16$ [see Lütkepohl (1991, Chapters 4 and 11)]:

$$\hat{p}(\mathrm{SC}) \le \hat{p}(\mathrm{HQ}) \le \hat{p}(\mathrm{AIC}).$$

Thus, using SC results in more parsimonious specifications with fewer parameters than HQ and AIC if there are differences in the orders chosen by the three criteria.
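The three criteria can be computed along the following lines. This is a minimal sketch rather than a reference implementation: the function name `ar_order_criteria`, the inclusion of an intercept, and the simulated AR(1) series are assumptions made for illustration. As required in the text, every order is fitted on the same effective sample, with the first `pmax` observations set aside as presample values.

```python
import numpy as np


def ar_order_criteria(y, pmax, intercept=True):
    """Return AIC, HQ and SC for AR orders n = 0, ..., pmax.

    All orders are fitted by OLS on the same T = len(y) - pmax observations,
    so the first pmax values serve as presample values for every n.
    """
    y = np.asarray(y, dtype=float)
    T = len(y) - pmax                      # common effective sample size
    target = y[pmax:]
    weights = {"AIC": 2.0 / T,
               "HQ": 2.0 * np.log(np.log(T)) / T,
               "SC": np.log(T) / T}
    crit = {name: [] for name in weights}
    for n in range(pmax + 1):
        # Intercept is an assumption; the penalty counts AR lags only.
        cols = [np.ones(T)] if intercept else []
        cols += [y[pmax - j:len(y) - j] for j in range(1, n + 1)]
        if cols:
            X = np.column_stack(cols)
            beta, *_ = np.linalg.lstsq(X, target, rcond=None)
            resid = target - X @ beta
        else:
            resid = target
        sigma2 = np.mean(resid ** 2)       # no degrees-of-freedom correction
        for name, c_T in weights.items():
            crit[name].append(np.log(sigma2) + c_T * n)
    return {name: np.array(vals) for name, vals in crit.items()}


# Hypothetical example: simulated AR(1) with coefficient 0.5.
rng = np.random.default_rng(0)
u = rng.standard_normal(200)
y = np.zeros(200)
for t in range(1, 200):
    y[t] = 0.5 * y[t - 1] + u[t]

crit = ar_order_criteria(y, pmax=10)
p_hat = {name: int(np.argmin(vals)) for name, vals in crit.items()}
print(p_hat)
```

Because the same residual variances enter all three criteria and only the weights $c_T$ differ, the selected orders from such a routine satisfy the ordering of SC, HQ, and AIC stated above whenever the effective sample size is at least 16.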
In Table 2.1, the values of the order selection criteria for the U.S. investment series are given. They all suggest an order of 1, although it was seen earlier that the coefficient attached to lag four has a t-value greater than 2. Using the t-ratios of the estimated coefficients and reducing the lag length by 1 if the t-ratio of the coefficient associated with the highest lag is smaller than 2 or some other threshold value is another obvious possibility for choosing the lag length. Of course, by relying on model selection criteria one may end up with a different model than with sequential testing procedures or other possible tools for choosing the AR order.
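The sequential t-ratio rule mentioned above could be sketched as follows. Several details are assumptions not fixed by the text: the function name `select_order_by_t_ratio`, the intercept, the use of the absolute t-ratio, the degrees-of-freedom corrected variance for the standard errors, and the fact that each order is fitted on all observations available for it. The simulated series is hypothetical.

```python
import numpy as np


def select_order_by_t_ratio(y, pmax, threshold=2.0):
    """Reduce the lag length from pmax until the highest lag is significant."""
    y = np.asarray(y, dtype=float)
    for n in range(pmax, 0, -1):
        target = y[n:]
        T = len(target)
        # Regressors: constant (assumption) and y_{t-1}, ..., y_{t-n}.
        X = np.column_stack([np.ones(T)] +
                            [y[n - j:len(y) - j] for j in range(1, n + 1)])
        beta, *_ = np.linalg.lstsq(X, target, rcond=None)
        resid = target - X @ beta
        sigma2 = resid @ resid / (T - X.shape[1])
        cov = sigma2 * np.linalg.inv(X.T @ X)
        t_last = beta[-1] / np.sqrt(cov[-1, -1])
        if abs(t_last) >= threshold:
            return n                       # highest lag is kept
    return 0                               # no lag passed the threshold


# Hypothetical example: only the second lag matters in the DGP.
rng = np.random.default_rng(0)
u = rng.standard_normal(500)
y = np.zeros(500)
for t in range(2, 500):
    y[t] = 0.5 * y[t - 2] + u[t]

p_seq = select_order_by_t_ratio(y, pmax=6)
print(p_seq)
```

Unlike the criteria, this rule stops at the first significant highest lag it meets when coming down from `pmax`, so it can settle on a larger order than AIC, HQ, or SC would choose for the same data.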
2.5.2 Specifying More General Models
In principle, model selection criteria may also be used in specifying more general models such as ARMA processes. One possible difficulty may be that estimation of many models with different orders is required, some of which have overspecified orders, and thus cancellation of parts of the AR and MA operators is possible. In that case iterative algorithms may not converge owing to the nonuniqueness of parameters. Therefore, simpler estimation methods are sometimes proposed for ARMA models at the specification stage. For example, the method for computing start-up values for ML estimation may be used (see Section 2.4.2). In other words, an AR($h$) model with large order $h$ is fitted first by OLS to obtain residuals $\hat{u}_t(h)$. Then models of the form

$$y_t = \alpha_1 y_{t-1} + \cdots + \alpha_n y_{t-n} + u_t + m_1 \hat{u}_{t-1}(h) + \cdots + m_l \hat{u}_{t-l}(h) \tag{2.12}$$

are fitted for all combinations $(n, l)$ for which $n, l \le p_{\max} < h$. The combination of orders minimizing a criterion

$$Cr(n, l) = \log \sigma_u^2(n, l) + c_T \varphi(n, l)$$
is then chosen as an estimator for the true order $(p, q)$. This procedure was proposed by Hannan & Rissanen (1982). It is therefore known as the Hannan–Rissanen procedure. Here the symbols have definitions analogous to those of the pure AR case. In other words, $\sigma_u^2(n, l) = T^{-1} \sum_{t=1}^{T} \hat{u}_t(n, l)^2$, where $\hat{u}_t(n, l)$ is the residual from fitting (2.12) by OLS, $c_T$ is a sequence depending on the sample size $T$, and $\varphi(n, l)$ is a function that penalizes large orders. For example, the corresponding AIC is now

$$\mathrm{AIC}(n, l) = \log \sigma_u^2(n, l) + \frac{2}{T}(n + l).$$
Here the choice of $h$ and $p_{\max}$ may affect the estimated ARMA orders. Hannan & Rissanen (1982) have suggested letting $h$ increase slightly faster than $\log T$. In any case, $h$ needs to be greater than $p_{\max}$, which in turn may depend on the data of interest. For example, $p_{\max}$ should take into account the observation frequency.
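The two stages of the Hannan–Rissanen procedure can be sketched as below. The helper names `lagmat` and `hannan_rissanen_aic`, the mean adjustment inside the function, the common second-stage sample starting at observation `h + pmax`, and the simulated series are all assumptions made for this illustration; the text does not prescribe these implementation details.

```python
import numpy as np


def lagmat(x, lags, start):
    """Columns x[t-1], ..., x[t-lags] for t = start, ..., len(x) - 1."""
    return [x[start - j:len(x) - j] for j in range(1, lags + 1)]


def hannan_rissanen_aic(y, h, pmax):
    """AIC(n, l) grid for n, l = 0, ..., pmax from the two-stage regressions."""
    if not pmax < h:
        raise ValueError("pmax must be smaller than h")
    y = np.asarray(y, dtype=float)
    y = y - y.mean()                       # mean adjustment (assumption)

    # Stage 1: long AR(h) by OLS; residuals exist for t = h, ..., len(y) - 1.
    Xh = np.column_stack(lagmat(y, h, h))
    a, *_ = np.linalg.lstsq(Xh, y[h:], rcond=None)
    u_hat = np.full(len(y), np.nan)
    u_hat[h:] = y[h:] - Xh @ a

    # Stage 2: regress y_t on n lags of y and l lags of u_hat, as in (2.12),
    # using the same observations for every combination (n, l).
    start = h + pmax
    target = y[start:]
    T = len(target)
    aic = np.empty((pmax + 1, pmax + 1))
    for n in range(pmax + 1):
        for l in range(pmax + 1):
            cols = lagmat(y, n, start) + lagmat(u_hat, l, start)
            if cols:
                X = np.column_stack(cols)
                b, *_ = np.linalg.lstsq(X, target, rcond=None)
                resid = target - X @ b
            else:
                resid = target
            aic[n, l] = np.log(np.mean(resid ** 2)) + 2.0 * (n + l) / T
    return aic


# Hypothetical example: AR(1) with coefficient 0.5, 100 observations kept.
rng = np.random.default_rng(1)
u = rng.standard_normal(150)
y = np.zeros(150)
for t in range(1, 150):
    y[t] = 0.5 * y[t - 1] + u[t]
y = y[50:]                                 # drop burn-in values

aic = hannan_rissanen_aic(y, h=8, pmax=4)
n_hat, l_hat = np.unravel_index(np.argmin(aic), aic.shape)
print(n_hat, l_hat)
```

Rows of the returned grid index the AR order and columns the MA order. Replacing the weight `2.0 / T` by `np.log(T) / T` would give the corresponding SC grid.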
Table 2.2. Hannan–Rissanen model selection for generated AR(1) ($y_t = 0.5y_{t-1} + u_t$) with $h = 8$ and $p_{\max} = 4$

| Selection criterion | AR order | MA order 0 | MA order 1 | MA order 2 | MA order 3 | MA order 4 |
|---|---|---|---|---|---|---|
| AIC | 0 | 0.259 | 0.093 | 0.009 | −0.013 | −0.002 |
| AIC | 1 | −0.055∗ | −0.034 | −0.017 | −0.001 | 0.019 |
| AIC | 2 | −0.034 | −0.012 | 0.004 | 0.017 | 0.032 |
| AIC | 3 | −0.012 | 0.011 | 0.006 | 0.028 | 0.049 |
| AIC | 4 | 0.005 | 0.028 | 0.026 | 0.045 | 0.060 |

Generally, there may be deterministic terms in the DGP. They can, of course, be accommodated in the procedure. For example, if the observations fluctuate around a nonzero mean, the sample mean should be subtracted, that is, the observations should be mean-adjusted before the Hannan–Rissanen procedure is applied. Similarly, if the series has a deterministic linear trend, $\mu_0 + \mu_1 t$, then the trend parameters may be estimated by OLS in a first step and the estimated trend function is subtracted from the original observations before the order selection procedure is applied. Alternatively, the linear trend may be estimated from the first-stage AR($h$) approximation, and the corresponding trend function may be subtracted from the $y_t$'s before the ARMA order selection routine is applied. It is also worth noting that, in this procedure, the stochastic part is assumed to be stationary. Therefore, if the original series is integrated, it should be differenced appropriately to make it stationary.
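The first-step trend adjustment and the differencing of an integrated series might look as follows. This is only a sketch: the function name `remove_linear_trend`, the time index running from 1 to $T$, and the artificial input series are assumptions chosen for illustration.

```python
import numpy as np


def remove_linear_trend(y):
    """Subtract the OLS estimate of mu0 + mu1 * t from y (t = 1, ..., T)."""
    y = np.asarray(y, dtype=float)
    t = np.arange(1, len(y) + 1)
    X = np.column_stack([np.ones(len(y)), t])
    mu, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ mu, mu


# A pure linear trend is removed completely (hypothetical values).
trend_only = 3.0 + 0.2 * np.arange(1, 51)
adjusted, mu_hat = remove_linear_trend(trend_only)

# For an integrated series, first differences are taken instead.
shocks = np.random.default_rng(2).standard_normal(50)
random_walk = np.cumsum(shocks)
differenced = np.diff(random_walk)         # equals shocks[1:], one value lost
```

The adjusted or differenced series would then be passed to the order selection routine in place of the original observations.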
For illustration, $T = 100$ observations were generated with the AR(1) process $y_t = 0.5y_{t-1} + u_t$, and the Hannan–Rissanen procedure was applied with $h = 8$ and $p_{\max} = 4$. These settings may be realistic for a quarterly series. The results obtained with the AIC criterion are shown in Table 2.2. In this case the ARMA orders $p = 1$ and $q = 0$ are detected correctly. Although the series was generated without a deterministic term, a constant term is included in the procedure. More precisely, the sample mean is subtracted from the $y_t$'s before the procedure is applied.
There are also other formal statistical procedures for choosing ARMA orders [e.g., Judge, Griffiths, Hill, Lütkepohl & Lee (1985, Section 7.5)]. A more subjective method is the classical Box–Jenkins approach to ARMA model specification. It relies on an examination of the sample autocorrelations and partial autocorrelations of a series to decide on the orders. As we have seen in Section 2.3, the true autocorrelations of a pure MA process have a cutoff point corresponding to the MA order, whereas the partial autocorrelations of such processes taper off. In contrast, for pure, finite-order AR processes the autocorrelations taper off, whereas the partial autocorrelations have a cutoff point corresponding to the AR order. These facts can help in choosing AR and MA orders.
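The cutoff and tapering patterns can be made concrete for the simplest cases. In the following worked example the symbols $\rho_h$ for the autocorrelation at lag $h$ and $a_h$ for the partial autocorrelation at lag $h$ are notation assumed here; the coefficient names $\alpha$ and $m$ follow (2.12).

```latex
% AR(1): y_t = \alpha y_{t-1} + u_t, |\alpha| < 1
\rho_h = \alpha^h \quad (h = 0, 1, 2, \dots), \qquad
a_1 = \alpha, \quad a_h = 0 \ \text{for } h > 1 .

% MA(1): y_t = u_t + m u_{t-1}
\rho_1 = \frac{m}{1 + m^2}, \qquad \rho_h = 0 \ \text{for } h > 1 .

% Numerical values
\alpha = 0.5:\ \rho_1 = 0.5,\ \rho_2 = 0.25,\ \rho_3 = 0.125; \qquad
m = -0.7:\ \rho_1 = \frac{-0.7}{1.49} \approx -0.47 .
```

For the AR(1) the autocorrelations decline geometrically while the partial autocorrelations vanish after lag 1; for the MA(1) the autocorrelations vanish after lag 1.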
Figure 2.9. Estimated autocorrelations and partial autocorrelations of artificially generated AR(1) and AR(2) time series (AR(1): $y_t = 0.5y_{t-1} + u_t$; AR(2): $y_t = y_{t-1} - 0.5y_{t-2} + u_t$; sample size $T = 100$).

For example, in Figure 2.9 sample autocorrelations and partial autocorrelations of AR(1) and AR(2) processes are depicted that illustrate the point. In particular, the autocorrelations and partial autocorrelations of the first AR(1) clearly reflect the theoretical properties of the corresponding population quantities (see also Figure 2.6). The dashed lines in the figures are $\pm 2/\sqrt{T}$ bounds around the zero line that can be used to assess whether the estimated quantities are different from zero, as explained in Section 2.2.2. Generally, it is important to keep in mind that the sampling variability of the autocorrelations and partial autocorrelations may lead to patterns that are not easily associated with a particular process order. For example, the second set of autocorrelations and partial autocorrelations of an AR(1) process shown in Figure 2.9 are generated with the same DGP, $y_t = 0.5y_{t-1} + u_t$, as the first set, and still they cannot be associated quite so easily with an AR(1) process.
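Sample autocorrelations, partial autocorrelations, and the $\pm 2/\sqrt{T}$ bounds of the kind plotted in such figures can be computed roughly as follows. The function names, the use of the Durbin–Levinson recursion for the partial autocorrelations, and the simulated AR(1) series are assumptions of this sketch; the figures in the text may have been produced with a different estimator.

```python
import numpy as np


def sample_acf(y, max_lag):
    """Sample autocorrelations rho_0, ..., rho_max_lag (mean-adjusted)."""
    y = np.asarray(y, dtype=float)
    d = y - y.mean()
    denom = d @ d
    return np.array([d[k:] @ d[:len(d) - k] / denom
                     for k in range(max_lag + 1)])


def sample_pacf(y, max_lag):
    """Partial autocorrelations a_1, ..., a_max_lag via Durbin-Levinson."""
    rho = sample_acf(y, max_lag)
    pacf = np.empty(max_lag)
    phi = np.zeros(0)                      # coefficients of the order k-1 fit
    for k in range(1, max_lag + 1):
        num = rho[k] - phi @ rho[k - 1:0:-1]
        den = 1.0 - phi @ rho[1:k]
        a_kk = num / den
        phi = np.append(phi - a_kk * phi[::-1], a_kk)
        pacf[k - 1] = a_kk
    return pacf


# Hypothetical example: AR(1) with coefficient 0.5 and T = 100.
rng = np.random.default_rng(4)
u = rng.standard_normal(150)
y = np.zeros(150)
for t in range(1, 150):
    y[t] = 0.5 * y[t - 1] + u[t]
y = y[50:]

acf = sample_acf(y, 12)
pacf = sample_pacf(y, 12)
bound = 2.0 / np.sqrt(len(y))
outside_acf = np.flatnonzero(np.abs(acf[1:]) > bound) + 1    # lags outside
outside_pacf = np.flatnonzero(np.abs(pacf) > bound) + 1
print(outside_acf, outside_pacf)
```

Rerunning the example with different seeds gives a feel for how much the estimated patterns can vary across realizations of the same DGP.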
Also for the AR(2) processes underlying Figure 2.9, specifying the order correctly from the estimated autocorrelations and partial autocorrelations is not easy. In fact, the pattern for the first AR(2) process is similar to the one of an MA(1) process shown in Figure 2.7, and the pattern obtained for the second AR(2) time series could easily come from a mixed ARMA process. In summary, the estimated autocorrelations and partial autocorrelations in Figure 2.9 show that guessing the ARMA orders from these quantities can be a challenge. This experience should not be surprising because even the true theoretical autocorrelations and partial autocorrelations of AR, MA, and mixed ARMA processes can be very similar. Therefore, it is not easy to discriminate between them on the basis of limited sample information.
Figure 2.10. Estimated autocorrelations and partial autocorrelations of artificially generated MA(1) and MA(2) time series (MA(1): yt = ut− 0.7ut−1; MA(2): yt= ut− ut−1+ 0.5ut−2; sample size T = 100).
Figure 2.11. Estimated autocorrelations and partial autocorrelations of artificially generated ARMA(1,1) time series (DGP: yt = 0.5yt−1+ ut+ 0.5ut−1; sample size T = 100).
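Table 2.3. Hannan–Rissanen model selection for generated MA(2) ($y_t = u_t - u_{t-1} + 0.5u_{t-2}$) with $h = 8$ and $p_{\max} = 4$

| Selection criterion | AR order | MA order 0 | MA order 1 | MA order 2 | MA order 3 | MA order 4 |
|---|---|---|---|---|---|---|
| AIC | 0 | 0.820 | 0.358 | 0.155 | 0.171 | 0.164 |
| AIC | 1 | 0.192 | 0.118 | 0.131 | 0.115∗ | 0.134 |
| AIC | 2 | 0.127 | 0.138 | 0.153 | 0.134 | 0.157 |
| AIC | 3 | 0.138 | 0.159 | 0.162 | 0.156 | 0.179 |
| AIC | 4 | 0.144 | 0.151 | 0.173 | 0.178 | 0.201 |
| SC | 0 | 0.820 | 0.386 | 0.212 | 0.255 | 0.277 |
| SC | 1 | 0.220 | 0.174∗ | 0.216 | 0.227 | 0.275 |
| SC | 2 | 0.183 | 0.223 | 0.266 | 0.275 | 0.326 |
| SC | 3 | 0.223 | 0.272 | 0.303 | 0.325 | 0.376 |
| SC | 4 | 0.257 | 0.292 | 0.342 | 0.375 | 0.426 |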
A similar situation can be observed in Figures 2.10 and 2.11, where sample autocorrelations and partial autocorrelations of MA(1), MA(2), and ARMA(1,1) processes are depicted. Although the MA(1) processes can perhaps be inferred from the estimated autocorrelations and partial autocorrelations (see also the theoretical quantities depicted in Figure 2.7), it is difficult to guess the orders of the underlying DGPs of the other time series correctly. In this context it may be of interest that using the Hannan–Rissanen procedure instead of looking at the sample autocorrelations and partial autocorrelations does not necessarily result in correct estimates of the ARMA orders. For example, we have applied the procedure to the MA(2) time series underlying the last set of autocorrelations and partial autocorrelations shown in Figure 2.10. Using an AR order of $h = 8$ in fitting a long AR in the first step of the procedure and a maximum order of $p_{\max} = 4$ in the second step, we obtained the results in Table 2.3. Neither the AIC nor the SC finds the correct orders. This outcome illustrates that it may also be difficult for formal procedures to find the true ARMA orders on the basis of a time series with moderate length. Again, given the possible similarity of the theoretical autocorrelations and partial autocorrelations of ARMA processes with different orders, this observation should not be surprising. Nevertheless, less experienced time series analysts may be better off using more formal and less subjective approaches such as the Hannan–Rissanen procedure.
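The orders implied by Table 2.3 can be read off mechanically by locating the minimum of each grid. The short sketch below simply transcribes the table values (rows are AR orders 0 to 4, columns are MA orders 0 to 4); the variable names are arbitrary.

```python
import numpy as np

# Values transcribed from Table 2.3; rows: AR order, columns: MA order.
aic = np.array([
    [0.820, 0.358, 0.155, 0.171, 0.164],
    [0.192, 0.118, 0.131, 0.115, 0.134],
    [0.127, 0.138, 0.153, 0.134, 0.157],
    [0.138, 0.159, 0.162, 0.156, 0.179],
    [0.144, 0.151, 0.173, 0.178, 0.201],
])
sc = np.array([
    [0.820, 0.386, 0.212, 0.255, 0.277],
    [0.220, 0.174, 0.216, 0.227, 0.275],
    [0.183, 0.223, 0.266, 0.275, 0.326],
    [0.223, 0.272, 0.303, 0.325, 0.376],
    [0.257, 0.292, 0.342, 0.375, 0.426],
])

aic_orders = tuple(int(i) for i in np.unravel_index(np.argmin(aic), aic.shape))
sc_orders = tuple(int(i) for i in np.unravel_index(np.argmin(sc), sc.shape))
print(aic_orders)  # (1, 3)
print(sc_orders)   # (1, 1)
```

AIC thus points to an ARMA(1,3) and SC to an ARMA(1,1), whereas the series was generated by an MA(2), that is, with orders (0, 2).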
Also, keep in mind that finding the correct ARMA orders in the specification step of the modeling procedure may not be possible in practice anyway when real economic data are analyzed. Usually in that case no true ARMA orders exist because the actual DGP is a more complicated creature. All we can hope for is finding a good and, for the purposes of the analysis, useful approximation.
Moreover, the model specification procedures should just be regarded as a preliminary way to find a satisfactory model for the DGP. These procedures need to be complemented by a thorough model checking in which modifications of, and changes to, the preliminary model are possible. Model checking procedures are described next.