Prediction error models
5.4 Results from the system identification procedure
5.4.2 Comparison of different model structures
Selecting an appropriate model structure and model orders is a difficult step and requires considerable background knowledge of the model structures. In general, even though a complicated (ie high order) model may explain the characteristics of the unknown system better than a less complicated (ie lower order) model, choosing it may lead to an over-parameterised problem. This is why the number of parameters is explicitly taken into account in a criterion of the type described in (5.27), which tends to select models of lower order than the criterion described in (5.25).
In this section, four candidate model structures (ARX, ARMAX, OE and BJ) have been preselected to represent the lymph node system. Subsequently, three data sets (R705, R634 and R797) were randomly selected and, for each data set, the “best” model was sought within each candidate model structure. Details of this procedure have been given in sections 5.3.4 and 5.3.5. As mentioned in section 5.3.5, the final model in the present study is chosen according to the simulation performance of the selected models. The simulation cost function (VN, listed as Vsim in Table 5.1) is calculated for all the selected models (from each individual data set) and used to compare their simulation performances. The results of this study are shown in Table 5.1.
TABLE 5.1: A comparison of the different model structures and their simulation performances (Vsim).

Sheep   N     nk   Vsim
                   OE          BJ              ARX         ARMAX
R705    168   7    0.0062      0.0071          0.0101      0.0075
                   (1,2,nk)    (1,1,0,2,nk)    (3,1,nk)    (2,1,2,nk)
R634    108   5    0.0217      0.0245          0.0732      0.0240
                   (1,2,nk)    (1,0,1,2,nk)    (3,1,nk)    (2,1,2,nk)
R797    156   5    0.0138      0.0148          0.0366      0.0492
                   (1,2,nk)    (1,0,1,2,nk)    (3,1,nk)    (2,1,2,nk)

Remarks: The orders of the models are given in parentheses. N and nk denote the number of data points and the value of the system delay used in the calculation. Vsim denotes the simulation cost function described in (5.28).
Among the four candidate models, the OE model gives the minimum value of VN in all 3 cases. This implies that the simulation performance of the OE models is the best of the four candidate models. Indeed, as described before, ARX and ARMAX models have a common polynomial A(q) in the input-output dynamics G(q) and the noise dynamics H(q). Any misfit in the noise dynamics (for example, because H(q) cannot be modelled exactly) will be reflected by a misfit in the input-output dynamics. This misfit leads to poor simulation results. However, this is not the case for OE or BJ models, which have independently parameterised G(q) and H(q). So, even if H(q) cannot be modelled exactly (for example, if H(q) is fixed to 1 as in the case of OE models), this does not necessarily lead to a misfit in G(q), provided that the model structure used is flexible enough (ie nb and nf are large enough in the case of an OE model).
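For reference, the shared and independent parameterisations discussed above can be written out explicitly. The block below uses the standard prediction error definitions of the four structures; the placement of the delay nk as an explicit factor and the indexing of B(q) from b_0 are assumptions chosen to match the parameter names in Table 5.2, and may differ from the notation of the earlier sections. It requires the amsmath package.

```latex
% General model: y(t) = G(q) u(t) + H(q) e(t), with e(t) white noise.
% Assumed polynomial conventions (q^{-1} is the backward shift operator):
%   A(q) = 1 + a_1 q^{-1} + ... + a_{n_a} q^{-n_a}   (likewise C, D, F)
%   B(q) = b_0 + b_1 q^{-1} + ... + b_{n_b - 1} q^{-(n_b - 1)}
\begin{align*}
\text{ARX:}\quad   & G(q) = q^{-n_k}\,\frac{B(q)}{A(q)}, & H(q) &= \frac{1}{A(q)} \\
\text{ARMAX:}\quad & G(q) = q^{-n_k}\,\frac{B(q)}{A(q)}, & H(q) &= \frac{C(q)}{A(q)} \\
\text{OE:}\quad    & G(q) = q^{-n_k}\,\frac{B(q)}{F(q)}, & H(q) &= 1 \\
\text{BJ:}\quad    & G(q) = q^{-n_k}\,\frac{B(q)}{F(q)}, & H(q) &= \frac{C(q)}{D(q)}
\end{align*}
```

In the ARX and ARMAX rows the denominator A(q) appears in both G(q) and H(q), so an error in fitting the noise dynamics is forced onto the input-output dynamics. In the OE and BJ rows no polynomial is shared between G(q) and H(q).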
The OE model is to be preferred over the BJ model since only a correct modelling of G(q) is of interest in the present study. The identification of an unnecessary noise dynamics model would only lead to an increase in the variance of the identified parameters, because more parameters (including some which are unnecessary) would have to be identified from the same data set. Note that in validating an OE model (Figure 5.5), only the lower plot (Reu) should be considered, since the noise dynamics H(q) of the OE model (which is equal to one) is unsuitable as a description of the true noise dynamics.
Overall, the parameters obtained from the OE models have much smaller standard deviations (SD) than those obtained from the other models (Table 5.2). For example, in the BJ model of R705, the value of f2 is 0.5303 while its SD is 0.1304 (ie almost 25% of its own value). An extreme example of this is found in the ARMAX models of both R705 and R797. In these, the SD values of c1 (ie 0.0617 and 0.0935, respectively) exceed the c1 values themselves (ie 0.0190 and 0.0533, respectively). Such high SD values reflect a high degree of uncertainty about the estimated parameters in that model, which indicates that, in practice, its use should be avoided. The relatively low SD of the parameters estimated in the OE model provides further evidence supporting the selection of the low order OE model as the “final model” in the present study.
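The relative uncertainties quoted above are simply the ratio of each SD to the magnitude of its estimate. A minimal sketch of that arithmetic follows; the function name `relative_sd` and the dictionary layout are illustrative choices, and the numbers are the ones quoted in the text for f2 and c1.

```python
def relative_sd(estimate, sd):
    """Return the standard deviation as a fraction of the estimate's magnitude."""
    return sd / abs(estimate)


# (sheep, model, parameter) -> (estimate, SD), values as quoted in the text
quoted = {
    ("R705", "BJ", "f2"): (0.5303, 0.1304),
    ("R705", "ARMAX", "c1"): (0.0190, 0.0617),
    ("R797", "ARMAX", "c1"): (0.0533, 0.0935),
}

for (sheep, model, name), (estimate, sd) in quoted.items():
    ratio = relative_sd(estimate, sd)
    # A ratio above 1 means the SD is larger than the estimate itself.
    flag = "SD exceeds estimate" if ratio > 1.0 else "SD below estimate"
    print(f"{sheep} {model} {name}: SD/|estimate| = {ratio:.2f} ({flag})")
```

A ratio close to or above 1 means the data do not distinguish the parameter from zero with any confidence, which is the sense in which the ARMAX estimates of c1 are described as unreliable.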
TABLE 5.2: A comparison of the parameters and their standard deviations estimated for the four candidate models.

Each entry is given as estimated parameter ± standard deviation (SD). A dash (-) marks a parameter that is not part of the model.

Sheep R705
Model   b0                 c1                 d1                 f1                 f2
OE      0.0122 ±0.0015     -                  -                  -1.7928 ±0.0250    0.8033 ±0.0237
BJ      0.0263 ±0.0071     0.6425 ±0.1210     -                  -1.5078 ±0.1363    0.5303 ±0.1304

Model   a1                 a2                 a3                 b0                 c1                 c2
ARX     -1.5369 ±0.0704    0.9796 ±0.1157     -0.4092 ±0.0678    0.0400 ±0.0082     -                  -
ARMAX   -1.8460 ±0.0680    0.8539 ±0.0650     -                  0.0093 ±0.0041     0.0190 ±0.0617     -0.9286 ±0.0601

Sheep R634
Model   b0                 c1                 d1                 f1                 f2
OE      0.0233 ±0.0021     -                  -                  -1.7489 ±0.0224    0.7649 ±0.0210
BJ      0.0257 ±0.0059     -                  -0.9006 ±0.0527    -1.7115 ±0.0721    0.7295 ±0.0686

Model   a1                 a2                 a3                 b0                 c1                 c2
ARX     -1.5935 ±0.0910    0.9580 ±0.1526     -0.3329 ±0.0863    0.0476 ±0.0102     -                  -
ARMAX   -1.7986 ±0.0433    0.8126 ±0.0406     -                  0.0200 ±0.0042     -0.0677 ±0.0642    -0.9322 ±0.0641

Sheep R797
Model   b0                 c1                 d1                 f1                 f2
OE      0.0186 ±0.0029     -                  -                  -1.6867 ±0.0438    0.6981 ±0.0420
BJ      0.0155 ±0.0032     -                  -0.9504 ±0.0323    -1.7813 ±0.0493    0.7908 ±0.0477

Model   a1                 a2                 a3                 b0                 c1                 c2
ARX     -1.7245 ±0.0778    1.0002 ±0.1370     -0.2603 ±0.0754    0.0247 ±0.0059     -                  -
ARMAX   -1.8998 ±0.0662    0.9049 ±0.0643     -                  0.0083 ±0.0045     0.0533 ±0.0935     -0.6312 ±0.0901
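To make the notion of simulation performance concrete, the sketch below simulates an OE model of order (1,2,nk) as a noise-free difference equation, using the OE estimates for R705 from Table 5.2 and nk = 7 from Table 5.1. It assumes the convention F(q) = 1 + f1 q^-1 + f2 q^-2 with B(q) = b0, and it assumes that the simulation cost is a mean squared error between measured and simulated outputs; the exact form of (5.28) is not reproduced here and may differ. The function names are illustrative.

```python
def simulate_oe(u, b0, f1, f2, nk):
    """Noise-free simulation of y(t) = b0*u(t-nk) - f1*y(t-1) - f2*y(t-2).

    Samples before t = 0 are taken as zero (assumed initial conditions).
    """
    y = []
    for t in range(len(u)):
        u_delayed = u[t - nk] if t >= nk else 0.0
        y_prev1 = y[t - 1] if t >= 1 else 0.0
        y_prev2 = y[t - 2] if t >= 2 else 0.0
        y.append(b0 * u_delayed - f1 * y_prev1 - f2 * y_prev2)
    return y


def mean_squared_error(measured, simulated):
    """Assumed stand-in for the simulation cost: mean squared simulation error."""
    n = len(measured)
    return sum((m - s) ** 2 for m, s in zip(measured, simulated)) / n


# OE estimates for R705 (Table 5.2) and the delay used for R705 (Table 5.1)
B0, F1, F2, NK = 0.0122, -1.7928, 0.8033, 7

# Unit step input, used here only as an illustrative test signal
step_input = [1.0] * 600
step_response = simulate_oe(step_input, B0, F1, F2, NK)

# Static gain of B(q)/F(q), obtained by setting q = 1
steady_state_gain = B0 / (1.0 + F1 + F2)
print(f"simulated final value: {step_response[-1]:.4f}")
print(f"static gain b0/(1+f1+f2): {steady_state_gain:.4f}")
```

With measured data in place of the step input, `mean_squared_error(measured_output, simulate_oe(measured_input, ...))` would give a simulation cost of the assumed form. Because the simulated output depends only on past inputs and past simulated outputs, never on measured outputs, this is a stricter test of G(q) than one-step-ahead prediction.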
5.4.3 Model validation
The OE model of order (1,2,nk) was finally selected to model the lymph node system. Residual analysis was performed on all data sets and their final OE models. According to the OE model structure, the noise dynamics are assumed to be 1 (ie H(q)=1). Thus, the autocorrelation of the residuals (Ree) is ignored and the cross correlation between the input and the residuals (Reu) is examined instead. The cross correlation between the inputs and the residuals of each OE model should give a result close to zero for all lags. Any OE model whose cross correlation exceeds the 99% confidence interval at any lag should be rejected. Typical results of the residual analysis for the data sets of R705 are given in Figure 5.5 and demonstrate that the OE model is validated, ie there is no value of Reu which exceeds the 99% confidence interval.
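The rejection rule described above can be sketched as follows. The code computes the normalised sample cross correlation between residuals and input for non-negative lags and compares it with a bound of 2.576/sqrt(N). That bound is a simplifying assumption: it is the 99% interval for a correlation estimate when at least one of the two sequences is white, and the interval actually used for Figure 5.5 may have been computed differently. The function names, the default of 20 lags and the test signals are illustrative.

```python
import math


def cross_correlation(residuals, inputs, max_lag):
    """Normalised sample cross correlation r_eu(tau) for tau = 0..max_lag."""
    n = len(residuals)
    if n != len(inputs):
        raise ValueError("residuals and inputs must have the same length")
    mean_e = sum(residuals) / n
    mean_u = sum(inputs) / n
    e = [x - mean_e for x in residuals]
    u = [x - mean_u for x in inputs]
    scale = math.sqrt(sum(x * x for x in e) * sum(x * x for x in u))
    if scale == 0.0:
        raise ValueError("residuals and inputs must not be constant")
    # Pair the residual at time t with the input tau samples earlier.
    return [
        sum(e[t] * u[t - tau] for t in range(tau, n)) / scale
        for tau in range(max_lag + 1)
    ]


def is_validated(residuals, inputs, max_lag=20, z=2.576):
    """True if no lag exceeds the assumed 99% bound z / sqrt(N)."""
    bound = z / math.sqrt(len(residuals))
    r_eu = cross_correlation(residuals, inputs, max_lag)
    return all(abs(r) <= bound for r in r_eu)


# Deterministic illustration: an alternating input, residuals that are
# uncorrelated with it, and residuals that are a delayed copy of it.
u = [(-1.0) ** t for t in range(400)]
e_uncorrelated = [1.0, 1.0, -1.0, -1.0] * 100
e_delayed_input = [0.0, 0.0, 0.0] + u[:-3]

print(is_validated(e_uncorrelated, u))   # uncorrelated residuals pass
print(is_validated(e_delayed_input, u))  # residuals containing the input fail
```

Residuals that still contain a delayed copy of the input indicate that part of the input-output dynamics has not been captured by G(q), which is why such a model is rejected regardless of how the noise is modelled.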
Figure 5.5: Residual analysis. Upper panel: autocorrelation function of the residuals, Ree. Lower panel: cross correlation function between the input and the residuals, Reu. Horizontal axis: lag.
The lower panel illustrates the residual analysis of the data sets of R705, which indicates that the OE model is validated (ie there is no value of Reu exceeding the 99% confidence interval, defined by the region between the two dotted lines). In the OE model, the noise dynamics, H(q), are assumed to be 1. The autocorrelation of the residuals, Ree (upper panel), has to be ignored as it is evident that H(q)=1 is not a good noise model.