3.3 (Hyper-) parameter estimation for UC models
3.4 Advanced Examples using Variance Intervention
The following two examples introduce an iterative approach for the simultaneous estimation of the trend NVR and perturbations when variance intervention is required. Example 3.3 Steel consumption in the UK revisited
In Chapter 2, Example 2.3 considered the quarterly steel consumption data (Figure 2.4) in the context of a simple trend only model. However, as pointed out then, a better option
would be to estimate a UC model in which all the components were fitted jointly, an approach pursued below.
Spectral identification is a very important step in this analysis, required in order to check for the existence of all or part of the seasonal harmonics. In this case, it is often preferable to calculate the AR-spectrum of the series when any dramatic jumps in the trend have already been removed using the reconst function, as discussed in Example 2.2 (see Figure 2.6). Such jumps in the trend could distort the spectrum estimate of the original series, especially the frequency band corresponding to the trend, as illustrated in Figure 3.8. In fact, it is clear from Figure 3.8, that while the distortion is minimal with respect to the seasonal spectral peaks (4 and 2 samples/cycle), it is quite noticeable in the low frequency band of the spectrum. Figure 3.8 also reveals the existence of an approximate four years cycle in the data and most of the associated harmonics (note that the seasonal peak is one of these harmonics). 20 10 6.67 5 4 3.33 2.86 2.5 2.22 20 0 1 2 3 4 Period (samples/cycle) lo g1 0( P)
Figure 3.8 AR(17)-spectrum of raw data (dotted) and the reconstructed series from Figure 2.6 (solid).
Because of such potential distortions in the spectrum, frequency domain methods should always be applied with care, especially when estimating the trend (and cyclical) NVR hyper-parameters. Indeed, this is a typical case where ML in the time domain with simultaneous variance intervention estimation may yield a superior answer. In this regard, a DHR model may be estimated by entering the following commands.
>> load steel.dat
>> P = [0 16 8 16/3 4 2]; >> TVP = [1 0];
>> nvr = dhropt(steel, P, TVP, -17, [],[],[],[],[],[],[],[],[87 106]); >> [fit, fitse, tr, trse, comp] = ...
>> dhr(steel, P, TVP, nvr, [], [], [], [], [], [87 106]);
The fourth input argument to dhropt specifies ML optimization using a frequency domain estimation based on the AR(17) model as the initial condition. The final input argument
provides the variance intervention points, using a similar syntax to the irwsm/irwsmopt pair introduced in Chapter 2.
A more satisfactory solution may be obtained by an iterative procedure which combines both frequency and time domain methods. Although more complex to implement, this approach arguably provides a more objective and complete UC analysis. Indeed, while the simple commands above may be regarded as the basic default option, the code below is a good illustration of the open architecture of CAPTAIN in the MATLAB® environment, by
which each user may find different ways to exploit potential solutions to a particular time series problem.
The idea here is that NVR frequency estimation based on the raw data is contaminated by jumps in the trend. If these jumps were known, they could be removed and the contamination problem would disappear. Obviously, such information is not immediately available and has to be estimated simultaneously with the NVR hyper-parameters. In other words, trend jumps may be estimated conditional on given NVR parameters by variance intervention, while the NVR parameters themselves may be estimated in the frequency domain conditional on given estimates of the trend jumps. In this case, a simultaneous, unconditional estimate of the trend jumps and NVR parameters may be obtained by the following iterative algorithm (cf. Example 2.3).
>> Int=[87 106]; % interventions 1975Q1 and 1980Q1 >> nvr0 = irwsmopt(steel, 1, 'f12', Int);
>> tnew = irwsm(steel, 1, nvr0, Int);
>> ynew = steel - tnew + reconst(tnew, Int); >> P = [0 16 8 16/3 4 2]; TVP = [1 0];
>> tol = 1e-4; iter= 0; obj= 1000*tol; >> while obj > tol
>> nvr = dhropt(ynew, P, TVP, 17); >> [fit, fitse, tnew, trse, comp] = ...
>> dhr(steel, P, TVP, nvr, [], [], [], [], [], Int); >> ynew = steel - tnew(:, 1) + reconst(tnew(:, 1), Int);
>> if iter>0, obj = max(abs(nvrold-nvr)); end
>> nvrold = nvr; iter = iter+1; >> [iter obj]
>> end
The first four lines yield an initial estimate of the constructed series without the trend jumps based on the IRW + noise model (see Figure 2.6). Subsequently, the iterations are built in such a way that the NVR parameters are estimated using continuously updated versions of the jump-free series (first line after the while sentence). However, in each
case the smoothing is based on the DHR model for the raw data, the current estimate of the NVR parameters and the variance intervention points. The iterations end when the
difference between two subsequent estimates of NVR values are less than the tol control
value, here with NVR values [0.0001,0.0434,0.0000,0.0009,0.0233,0.0017]. Finally, the trend estimates, together with the cyclical and seasonal components are all illustrated in Figure 3.9, obtained as follows.
>> subplot(311), plot([steel tnew]) % Trend and series >> subplot(312), plot(sum(comp(:, 1:3)')') % Cycle
>> subplot(313), plot(sum(comp(:, 4:5)')') % Seasonal
300 400 500 T re nd -50 0 50 C yc le 1955 1960 1965 1970 1975 1980 1985 1990 -40 -20 0 20 S ea so na l Months
Figure 3.9 Estimated components of the steel consumption series, using the frequency domain method.
Example 3.4 Car drivers killed and seriously injured
In order to show that the iterative procedure developed in the previous example is applicable to other data sets, a similar analysis is performed for the monthly car drivers killed and seriously injured in Great Britain from January 1969 to December 1984. Figure 3.10 displays this series using the following commands.
>> load cars.dat
>> t = (1969 : 1/12 : 1984.99)'; >> plot(t, cars)
1970 1975 1980 1985 1500 2000 2500 C as ua lti es Months
Figure 3.10 Car drivers killed and seriously injured in Great Britain from January 1969 to December 1984.
A mild trend and seasonal components are clearly visible in these data. However, more interesting, are the apparent jumps in the series related to seat-belt legislations in January 1974 and February 1983 (samples 61 and 179). As before, these may be accounted for with the trend signal using variance interventions. In this case, it is also clear that the seasonal pattern changes in 1974 and 1975 after the first intervention point, so for the purposes of the present example, a third intervention is set at sample 74 (February 1975).
An initial estimate of the trend is first obtained using irwsm/irwsmopt with variance interventions. The AR-spectrum of the raw data and the reconstructed intervened data are shown in Figure 3.11, where it is clear that the final harmonic is not present and is, therefore, not necessary in the analysis. Furthermore, the distortion due to the trend jumps are less significant here than in the steel consumption series. Nonetheless, for illustrative purposes, the full iterative procedure outlined in the previous subsection is still implemented and yields the components illustrated in Figure 3.12, together with the reconstructed series in Figure 3.13.
20 10 6.67 5 4 3.33 2.86 2.5 2.22 20 0 1 2 3 4 Period (samples/cycle) lo g1 0( P)
1500 2000 2500 T re nd -200 0 200 400 Se as on al 1970 1975 1980 1985 -200 -100 0 100 200 Ir re gu la r Months
Figure 3.12 Estimated components of the driver casualties data.
1970 1975 1980 1985 1500 2000 2500 C as ua lt ie s Months
3.5 Conclusions
The present chapter has introduced the Unobserved Components (UC) modelling tools in
CAPTAIN,and shown how the toolbox may be utilised for forecasting and signal extraction
of periodic time series with widely varying characteristics.
One point worth stressing again, is that in every example included in the chapter, the seasonal components are such that not all of the theoretical harmonics are observed in the data. In this regard, the present authors believe that the identification stage utilised above, based on both spectral and time domain methods, is a particularly important part of the analysis, although it is often neglected in conventional UC modelling. In addition to providing evidence on whether all or only some of the harmonics are really necessary (especially important for frequency domain estimation methods), this also identifies the relative importance of each component (measured by the relative magnitude of their NVR). This latter information is useful if the model has to be constrained in some way at the estimation stage.
To illustrate the methodology, the models above have been limited to a trend and a cyclical or seasonal component. However, in fact, CAPTAIN allows for a much wider range of
models than considered so far. The following chapters, therefore, introduce various additional components such as exogenous variables.
Chapters 2 and 3 have developed an approach to nonstationary signal processing based on the identification and estimation of time variable parameter (TVP) stochastic models. The methodological tools that underpin this modelling philosophy are unified in terms of the discrete-time Unobserved Components (UC) model (3.1). Here, in order to allow for non stationarity in the time series yt, it is assumed that the various components in the model, including the trend Tt, can be characterised by TVP’s.
Most often, the nature of such parametric time variability will not be known prior to the analysis and so each TVP is defined as a non stationary stochastic variable. This adds a statistical degree of freedom to the estimation problem, so allowing for the estimation of any ‘slow’ parameter variations. By slow, we mean here variations that are slow in relation to the variations in the time series itself. Such variations may result from slow physical changes in the process or from some form of nonlinearity in the data. In this manner, the models obtained are all inherently self-adaptive: namely, they change their parameters automatically in an optimal manner to reflect changes in the nature of the time series. For this reason, they can be exploited in applications such as self-adaptive forecasting, operational control and management.
In practice, as mentioned in Chapter 3, not all the possible components in the UC model are necessary: indeed, the simultaneous presence of all these components can induce identifiability problems in which it is not possible to unambiguously estimate the model. For this reason, the models considered in Chapter 3 are limited to univariate time series characterised by a trend, together with a sustained cyclical and/or seasonal component. To complete the discussion, therefore, the present Chapter considers the other optimal recursive TVP models that may be estimated using CAPTAIN.
In particular, the Chapter describes the entire class of TVP, or ‘dynamic’, regression models, including Dynamic Linear Regression (DLR), Dynamic Harmonic Regression (DHR) and Dynamic Auto-Regression (DAR), as well as the closely related, TVP version of the Auto-Regressive eXogenous variables model (DARX). Finally, the Chapter considers an alternative Dynamic Transfer Function (DTF) model, estimated using an