
The prediction of collision occurrence is important but forms only part of the picture. Understanding collision severities allows resource to be prioritised rationally (Milton et al., 2008). This section reviews the collision severity modelling literature to describe the range of econometric latent variable choice models used for this purpose. Savolainen et al. (2011) have reviewed and assessed the methodological approaches to collision severity modelling and discussed future methodological directions.

Collision severities are not measured precisely and are often reported on ordinal scales which do not usually include abbreviated injury scales determined by medical staff. Severity can range from property damage only, through various grades of injury (e.g. possible, slight or serious) to those involving fatalities which may occur up to a year later depending on the jurisdiction. These data can be converted into binary (e.g. fatal or not) or nominal forms depending on the type of model chosen (Savolainen et al., 2011).

Chapter 3: Review of statistical techniques for road safety improvement 49

Logistic or probit regression models have been used for binary collision outcomes. Shibata et al. (1994) used a logit model to identify personal and behavioural risk factors associated with fatal outcomes in Japanese motor vehicle collisions. Boufous et al. (2009) used a binary logistic model to find factors associated with injury in work-related collisions. A logistic model indicated occupational stress and safety climate to be significant predictors of fatigue-related near misses for occupational drivers (Strahan et al., 2008). Binary models can be chained to form a sequential binary model to represent ordinal data (Savolainen et al., 2007; Yamamoto et al., 2008).

Ordinal logit or ordinal probit models appear to be ideally suited to the prediction of progressive severity category membership (Savolainen et al., 2007). These models hypothesise a latent interval variable that determines collision severity. Fitting the model determines the significant variables and calculates threshold values that assign the category membership. Both the logit and probit forms were first used in road safety to investigate the influence of Australian road users’ personal attributes on injury severity (O'Donnell et al., 1996). No differences were reported between the results obtained from the two models. Ordinal probit models were also used to analyse Singaporean motorcyclists’ collision severity (Quddus et al., 2002). Fitting these ordinal response models uses fixed explanatory variables to estimate model parameters and category threshold values, but there may be heterogeneity across the categories. This issue has been overcome by including random variables to produce a mixed ordinal logit model (Eluru et al., 2007). This model was given further flexibility to yield the mixed generalised ordinal response logit (MGORL) model, which allows the thresholds to vary with fixed and random variables and hence by observation (Eluru et al., 2008). These latent class ordinal response models have the disadvantage of using the same variables for each outcome level.
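The way an ordinal logit model turns a latent variable and a set of thresholds into category probabilities can be sketched as follows. This is a minimal illustration only, not the specification used in any of the cited studies: the function name `ordered_logit_probs`, the linear predictor value and the three threshold values are hypothetical.

```python
import numpy as np

def ordered_logit_probs(eta, thresholds):
    """Category probabilities for an ordered logit model.

    eta        : latent linear predictor (x'beta) for one collision
    thresholds : increasing cut-points separating the severity categories
    """
    cuts = np.asarray(thresholds, dtype=float)
    # Cumulative probabilities P(y <= k) from the logistic CDF
    cdf = 1.0 / (1.0 + np.exp(-(cuts - eta)))
    cdf = np.concatenate(([0.0], cdf, [1.0]))
    # Differencing the cumulative probabilities gives P(y = k)
    return np.diff(cdf)

# Hypothetical values: three thresholds give four severity categories
probs = ordered_logit_probs(eta=0.4, thresholds=[-1.0, 0.5, 2.0])
# A larger latent value shifts probability towards the highest category
probs_high = ordered_logit_probs(eta=1.5, thresholds=[-1.0, 0.5, 2.0])
```

Because a single latent value moves all categories together, raising `eta` can only shift probability mass in one direction along the scale.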

Further issues exist with the logit model (Yamamoto et al., 2008). Ordinal models assume an orderly and continuous increase or decrease in probability of group membership which may not always be appropriate. For example, air bags may simultaneously decrease the probabilities of membership of both the highest and lowest severity categories, a situation that cannot be incorporated in an ordered logit or probit model (Savolainen et al., 2011). This suggests that unordered choice models such as multinomial logistic (MNL) models would be useful. These models assume severity categories are nominal, multinomially distributed and independent (McCullagh et al., 1989). MNL models have been used to analyse factors determining severity in Belgian collision segments (Depaire et al., 2008).

MNL models are based on logit models where a separate model is built for each outcome and the predicted observation outcome is assigned as that with the highest probability. MNL regression has the advantage of simple computation but suffers from the independence of irrelevant alternatives (IIA) property. These models have fixed odds ratios between options even when one or more options are close or perfect substitutes, an example being the introduction of another colour of bus to a commuter’s journey choices (Savolainen et al., 2011). The problem arises from the inability of the MNL model to include correlated errors. Kennedy (2008) gives five solutions to IIA: ignore the issue altogether; combine categories that are correlated; fit a multinomial probit (MNP) model in which the error terms are multivariate normal and thus allow correlation; fit a nested model; or fit a mixed (random variable) model. No references to MNP models in collision severity modelling have been found.
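The fixed odds ratio implied by IIA can be illustrated with a small numerical sketch of the bus example. The utilities are hypothetical, the car alternative is an assumption added for illustration, and `mnl_probs` is an illustrative helper rather than a function from any particular library.

```python
import numpy as np

def mnl_probs(utilities):
    """Multinomial logit probabilities (softmax of the utilities)."""
    u = np.asarray(utilities, dtype=float)
    expu = np.exp(u - u.max())  # subtract the maximum for numerical stability
    return expu / expu.sum()

# Hypothetical utilities: car and red bus are equally attractive
p_before = mnl_probs([1.0, 1.0])        # car, red bus
# A blue bus, identical to the red bus apart from colour, is introduced
p_after = mnl_probs([1.0, 1.0, 1.0])    # car, red bus, blue bus

# The car to red-bus odds are unchanged by the new alternative
ratio_before = p_before[0] / p_before[1]
ratio_after = p_after[0] / p_after[1]
```

In this sketch the car share falls from one half to one third although the blue bus is a perfect substitute for the red bus only, which is the behaviour the remedies listed above are intended to avoid.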

Nested models group correlated choices together within the same nest while placing other uncorrelated choices into separate nests (Savolainen et al., 2011). Thus IIA is allowed to exist between some of the choices but not others. Fitting these models uses the generalised extreme value distribution for the errors rather than a normal distribution. Using nested logit models, Shankar et al. (1996) analysed the severity of collisions on a Washington interstate highway, while Savolainen et al. (2007) investigated the factors contributing to single vehicle motorcycle collisions.
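A minimal sketch of how nesting relaxes IIA is given below, using the standard two-level nested logit probability formula. The function name `nested_logit_probs`, the utilities and the dissimilarity (scale) parameters are hypothetical, and the car and bus alternatives are chosen only to mirror the bus example used for IIA.

```python
import numpy as np

def nested_logit_probs(nests, scales):
    """Two-level nested logit probabilities.

    nests  : list of lists of utilities, one inner list per nest
    scales : one dissimilarity parameter per nest (1.0 reduces to MNL)
    """
    inclusive, within = [], []
    for v, lam in zip(nests, scales):
        e = np.exp(np.asarray(v, dtype=float) / lam)
        within.append(e / e.sum())                # P(alternative | nest)
        inclusive.append(lam * np.log(e.sum()))   # scaled inclusive value
    nest_p = np.exp(inclusive) / np.exp(inclusive).sum()  # P(nest)
    # Unconditional probability = P(nest) * P(alternative | nest)
    return [p_m * w for p_m, w in zip(nest_p, within)]

# Hypothetical example: car alone in one nest, red and blue bus in another
mnl_like = nested_logit_probs([[1.0], [1.0, 1.0]], scales=[1.0, 1.0])
correlated = nested_logit_probs([[1.0], [1.0, 1.0]], scales=[1.0, 0.5])
car_share_mnl = mnl_like[0][0]
car_share_nested = correlated[0][0]
```

With both scale parameters equal to one the result matches the MNL shares; lowering the scale of the bus nest represents correlation between the two buses, so the car retains a larger share than under MNL.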

Mixed (random effects) logit models address the problems of MNL models by using random parameters, allowing correlation between categories. Such parameters can vary by observation and, unlike multinomial probit models, the error terms are not restricted to multivariate normality. However, the integral contained in the model's density function makes computation difficult. Brownstone et al. (1999) have demonstrated the use of random-parameter discrete outcome models (multinomial and ordered logit models). Mixed logit models were used by Milton et al. (2008) to evaluate the severity of traffic injuries in Washington State using aggregated data.

Underreporting of collisions is a general issue in road safety but was not explicitly addressed.
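The integral in the mixed logit model discussed above is commonly approximated by simulation, averaging ordinary logit probabilities over draws of the random parameters. The sketch below illustrates that idea only and is not the estimator used in the cited studies. It assumes independent normally distributed parameters and category-specific attributes, and the name `mixed_logit_probs` and all numerical values are hypothetical.

```python
import numpy as np

def mixed_logit_probs(x, beta_mean, beta_sd, n_draws=2000, seed=0):
    """Simulated mixed logit probabilities for one observation.

    x         : (n_categories, n_vars) attributes of each severity category
    beta_mean : (n_vars,) means of the random parameters
    beta_sd   : (n_vars,) standard deviations of the random parameters
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    # Assumption: parameters are drawn from independent normal distributions
    betas = rng.normal(beta_mean, beta_sd, size=(n_draws, x.shape[1]))
    utilities = betas @ x.T                          # (n_draws, n_categories)
    utilities -= utilities.max(axis=1, keepdims=True)
    expu = np.exp(utilities)
    logit = expu / expu.sum(axis=1, keepdims=True)   # MNL probabilities per draw
    # Averaging over the draws approximates the integral over the parameters
    return logit.mean(axis=0)

# Hypothetical data: three severity categories described by two variables
x = [[0.0, 0.0], [1.0, 0.5], [0.5, 1.5]]
sim_probs = mixed_logit_probs(x, beta_mean=[0.3, -0.2], beta_sd=[0.5, 0.4])
```

Setting the standard deviations to zero removes the randomness, and each draw then returns the same MNL probabilities.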

The customary separation of count and severity models has been questioned and has led to the development of multivariate approaches to collision modelling. Wang et al. (2011) reported the estimation of the number of collisions at different severities using a two-stage mixed multivariate model comprising Bayesian and mixed logit models to represent the severity and frequency outcomes.

While collision severity models have been extensively used in general collision analysis, they have not been applied in work-related road safety.