Chapter 7 LONDON UNDERGROUND CASE-STUDY
7.5 Model estimation results
7.5.5 Model comparison
The final stage of the analysis, before model validation, is to compare substantive estimation results across models. It is of interest to establish whether the non-EUT models proposed in this research actually outperform the EVT and EUT models, and which modelling techniques lead to an improvement in model fit. Answering these questions relies on a series of formal statistical tests to determine the structure that best fits the data. It is essential to realize that the candidate models differ in their parameters and even in their structures. Specifically, PT and CPT cannot be treated as parametric generalizations of any of the other models. We therefore applied both nested and non-nested tests to assess their empirical performance.
Table 7.11 shows the measures of fit in the estimation sample. The adjusted likelihood ratio index favours the PT and CPT models, which attain the highest values on both versions of the index reported in the table (0.126 and 0.026). Only the SEV model reaches the same level as PT and CPT, and only on the second index (0.026), while CPT appears to outperform PT slightly from an AIC point of view, with an improvement of just 0.336 units. This finding highlights the importance of reference dependence and diminishing sensitivity for modelling route choice behaviour, but does not necessarily support the complexity of CPT.
[Figure: weighting function plotted against probability; parameter value = 0.38]
Measure     EVT       EUT       SEV       SEU       RDEV      RDEU      PT        CPT
Parameters  4         6         5         7         5         7         7         8
LL(β)       -301.040  -298.705  -296.311  -295.458  -298.547  -296.824  -294.209  -293.041
ρ̄² (1)      0.115     0.115     0.125     0.122     0.119     0.118     0.126     0.126
ρ̄² (2)      0.014     0.015     0.026     0.022     0.018     0.017     0.026     0.026
AIC         610.080   609.410   602.622   604.916   607.094   607.648   602.418   602.082
BIC         626.409   633.903   623.033   633.492   627.505   636.224   630.994   634.740
ConAIC      630.409   639.903   628.033   640.492   632.505   643.224   637.994   642.740
CorAIC      610.636   610.973   603.596   607.266   608.068   609.998   604.768   605.446
LR_EVT                4.670     9.458     11.164    4.986     8.432     NA        NA
LR_EUT                          4.788     6.494     0.316     3.762     NA        NA
LL(β): The final log-likelihood based on the calibration sample;
ConAIC: Consistent AIC;
CorAIC: Corrected AIC;
LR_EVT: Likelihood ratio w.r.t EVT;
LR_EUT: Likelihood ratio w.r.t EUT;
NA: Not applicable.
Table 7.11: Measures of goodness-of-fit
When the other measures of fit are used, we obtain, perhaps surprisingly, a rather different conclusion. The SEV model is consistently favoured by BIC, consistent AIC and corrected AIC. Given the relatively simple model structure of SEV, we can conclude that its good performance is due to the nonlinear probability weighting function. This finding suggests that ignoring the weighting function may lead to an incorrect model for analysing risky choice behaviour, and it is therefore essential to account for subjective probability in future research. SEU provides an even better LL than SEV, but it fails to compete with SEV once the number of parameters is taken into account.
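The penalised measures discussed above can be recomputed from the log-likelihoods and parameter counts in Table 7.11. The following is a minimal sketch rather than the procedure used in this research: it assumes the standard definitions AIC = -2LL + 2K, BIC = -2LL + K ln N and consistent AIC = -2LL + K(ln N + 1), and it assumes a sample size of N = 438, which is not stated in this section but is the value that reproduces the BIC column. The corrected AIC is left out because its exact form is not given here. All names in the code are illustrative.

```python
import math

# Final log-likelihood LL(beta) and number of parameters K, copied from Table 7.11.
MODELS = {
    "EVT":  (-301.040, 4),
    "EUT":  (-298.705, 6),
    "SEV":  (-296.311, 5),
    "SEU":  (-295.458, 7),
    "RDEV": (-298.547, 5),
    "RDEU": (-296.824, 7),
    "PT":   (-294.209, 7),
    "CPT":  (-293.041, 8),
}

# ASSUMPTION: the sample size is not given in this section; 438 is the value
# that reproduces the BIC column of Table 7.11.
N_OBS = 438


def information_criteria(ll, k, n):
    """Return (AIC, BIC, consistent AIC) for log-likelihood ll, k parameters, n observations."""
    aic = -2.0 * ll + 2.0 * k
    bic = -2.0 * ll + k * math.log(n)
    caic = -2.0 * ll + k * (math.log(n) + 1.0)
    return aic, bic, caic


criteria = {name: information_criteria(ll, k, N_OBS) for name, (ll, k) in MODELS.items()}

# Lower values are better for all three criteria.
best_by_aic = min(criteria, key=lambda m: criteria[m][0])
best_by_bic = min(criteria, key=lambda m: criteria[m][1])

for name, (aic, bic, caic) in criteria.items():
    print(f"{name:5s} AIC={aic:8.3f}  BIC={bic:8.3f}  ConAIC={caic:8.3f}")
print("lowest AIC:", best_by_aic, "| lowest BIC:", best_by_bic)
```

Under these assumptions the lowest AIC belongs to CPT and the lowest BIC to SEV. The heavier per-parameter penalty of BIC (ln 438 is roughly 6.08, against 2 for AIC) is what moves the ranking from the eight-parameter CPT to the five-parameter SEV.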
We cannot find evidence in favour of rank dependence using any of the measures of fit. In fact, although RDEV and RDEU outperform EVT and EUT, they provide a worse model fit than SEV and SEU. Given that RDEV and RDEU add rank dependence to SEV and SEU, their unexpectedly weak performance raises the question of whether rank dependence actually matters in reality. If it is not a factor of concern to passengers, future research should pay more attention to other models such as PT and SEV. If it does matter, we should concentrate on techniques for determining the real rank order of outcomes as perceived by passengers in an RP context.
To test whether the fit of the EUT and non-EUT models differs significantly, we use both nested and non-nested tests. The non-nested test applies to PT and CPT, while an LR test is used for the remaining, nested models. The LR statistics for EVT, EUT, SEV, SEU, RDEV and RDEU are set out at the bottom of Table 7.11. Among the nested models, SEV significantly improves model fit compared to EVT, with a p-value of less than 0.005. Although nonlinear utility is not included in SEV, it still significantly outperforms the EUT model, which again reinforces the benefit of using the nonlinear probability weighting function. SEU also gives a statistically significant improvement in terms of the LR statistic, with a p-value of 0.01. RDEV provides a similar LL to EUT, but RDEV is still preferred, given that it has one parameter fewer than EUT. Although the LL of RDEU is almost 2 units higher than that of EUT, it only delivers a p-value of 0.05 compared to EUT.
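The LR statistics and p-values quoted here can be checked with a few lines of code. This is an illustrative sketch under two assumptions that are not spelled out in this section: that the LR statistic is 2[LL(unrestricted) - LL(restricted)], and that its degrees of freedom equal the difference in the number of parameters between the two models. The chi-square tail probability is computed with the standard library only; the function names are hypothetical.

```python
import math

# Final log-likelihoods and parameter counts from Table 7.11 (nested models only).
LL = {"EVT": -301.040, "EUT": -298.705, "SEV": -296.311,
      "SEU": -295.458, "RDEV": -298.547, "RDEU": -296.824}
K = {"EVT": 4, "EUT": 6, "SEV": 5, "SEU": 7, "RDEV": 5, "RDEU": 7}


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with integer df >= 1."""
    if x <= 0.0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2, then the recurrence
    # Q(d + 2) = Q(d) + half**(d/2) * exp(-half) / Gamma(d/2 + 1).
    if df % 2 == 1:
        q, d = math.erfc(math.sqrt(half)), 1
    else:
        q, d = math.exp(-half), 2
    while d < df:
        q += half ** (d / 2.0) * math.exp(-half) / math.gamma(d / 2.0 + 1.0)
        d += 2
    return q


def lr_test(restricted, unrestricted):
    """Return (LR statistic, degrees of freedom, p-value) for two nested models."""
    df = K[unrestricted] - K[restricted]
    if df <= 0:
        # ASSUMPTION: the test is only applied when the larger model has more parameters.
        raise ValueError("unrestricted model must have more parameters")
    stat = 2.0 * (LL[unrestricted] - LL[restricted])
    return stat, df, chi2_sf(stat, df)


for restricted, unrestricted in [("EVT", "SEV"), ("EVT", "SEU"), ("EUT", "RDEU")]:
    stat, df, p = lr_test(restricted, unrestricted)
    print(f"{unrestricted} vs {restricted}: LR={stat:.3f}, df={df}, p={p:.4f}")
```

With these inputs, the three comparisons return the LR statistics 9.458, 11.164 and 3.762 listed in Table 7.11, with 1, 3 and 1 degrees of freedom respectively.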
We also applied a non-nested test to compare PT and CPT with all the other models (refer to Chapter 5 for the non-nested test method), with the results shown in Table 7.12.
PT          Test statistic  P-value
vs EVT      -3.265          0.001
vs EUT      -2.827          0.002
vs SEV      -1.485          0.069
vs SEU      -1.581          0.057
vs RDEV     -2.584          0.005
vs RDEU     -2.287          0.012

CPT         Test statistic  P-value
vs EVT      -3.464          0.000
vs EUT      -3.054          0.001
vs SEV      -1.881          0.030
vs SEU      -1.958          0.026
vs RDEV     -2.831          0.002
vs RDEU     -2.562          0.005
vs PT       -1.156          0.124
Table 7.12: Non-nested test results for PT and CPT
It should be noted that each p-value is an upper bound on the probability that the competing model is in fact correct and PT or CPT attains the higher adjusted likelihood ratio index only by chance. The results clearly favour the PT and CPT models, which outperform most of the other models with very low p-values, the exceptions being SEV and SEU. Specifically, both PT and CPT give a statistically significantly better model fit than the EVT and EUT models, with p-values between 0.000 and 0.002. They also merit more consideration than the RDEV and RDEU models for this data, given that the highest p-value is only 0.012, for PT versus RDEU. We cannot be entirely sure, however, whether the PT and CPT models perform better than the SEV and SEU models, in particular for PT versus SEV, where the p-value is approximately 0.07.
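The test statistics in Table 7.12 can be recovered from the log-likelihoods and parameter counts in Table 7.11. The sketch below is an assumption about the form of the Chapter 5 test, not a restatement of it: it uses the bound based on the adjusted likelihood ratio index commonly attributed to Ben-Akiva and Swait, written so that the null log-likelihood cancels, with statistic -sqrt(2[(LL_f - K_f) - (LL_g - K_g)] + (K_f - K_g)) for a better-fitting model f against a competitor g, and the standard normal CDF of that statistic as the p-value bound. The function name is hypothetical.

```python
import math

# Final log-likelihoods and parameter counts from Table 7.11.
LL = {"EVT": -301.040, "EUT": -298.705, "SEV": -296.311, "SEU": -295.458,
      "RDEV": -298.547, "RDEU": -296.824, "PT": -294.209, "CPT": -293.041}
K = {"EVT": 4, "EUT": 6, "SEV": 5, "SEU": 7,
     "RDEV": 5, "RDEU": 7, "PT": 7, "CPT": 8}


def nonnested_test(f, g):
    """Return (test statistic, upper-bound p-value) for model f against model g.

    ASSUMPTION: this is the adjusted-likelihood-ratio bound described in the
    lead-in; the method actually used is defined in Chapter 5.
    """
    arg = 2.0 * ((LL[f] - K[f]) - (LL[g] - K[g])) + (K[f] - K[g])
    if arg <= 0.0:
        raise ValueError("model f does not have the higher adjusted fit")
    stat = -math.sqrt(arg)
    # Standard normal CDF evaluated at the (negative) test statistic.
    p_bound = 0.5 * math.erfc(-stat / math.sqrt(2.0))
    return stat, p_bound


for f, g in [("PT", "EVT"), ("PT", "SEV"), ("CPT", "SEV"), ("CPT", "PT")]:
    stat, p_bound = nonnested_test(f, g)
    print(f"{f} vs {g}: statistic={stat:.3f}, p <= {p_bound:.3f}")
```

Under this assumed form, the computed statistics agree with Table 7.12 to three decimal places for the pairs shown. The p-value bounds may differ from the tabulated ones in the last digit, depending on how the normal probabilities were evaluated and rounded.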
To conclude, the comparisons based on the calibration sample consistently indicate that non-EUT models are preferred to the EVT and EUT models. CPT provides the highest LL, but, according to the non-nested test results, its improvement in fit does not seem to outweigh the penalty for its additional parameters. SEV turns out to be an efficient model specification, with a fair model fit and a relatively simple model structure. These findings highlight the particular importance of three modelling techniques, namely nonlinear utility (and diminishing sensitivity), the nonlinear probability weighting function, and reference dependence.