4 STUDY DESIGN AND METHODOLOGY

4.3 Analysis and methodological approach

To test the hypotheses (Section 4.1), the datasets introduced in Section 4.2 are linked on the basis of common driver, vehicle, trip and road characteristics. Measures of driver behaviour are the core units of analysis. Specifically, these are the frequencies and magnitudes of speeding, aggressive acceleration and aggressive braking. In this research, driving 1 km/h or more above the posted speed limit is considered to be speeding. This threshold reflects the enforcement regime in place at the time the data were collected and the fact that speedometers are designed to overstate the actual speed (Australian Commonwealth Government, 2004), so there is a built-in tolerance of approximately 3 km/h between the GPS speed and the speed indicated on the speedometer. Acceleration of 4 m/s² or more and braking of -4 m/s² or more (that is, a deceleration of at least 4 m/s²) are considered to be aggressive. The rationale for these thresholds is discussed in Section 8.4.2. The thresholds are based on previous studies (discussed in Section 2.2.2) which found that behaviour in excess of these levels is commonly observed immediately before crashes.
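As an illustration only, the three thresholds stated above could be applied to a single second of GPS data roughly as follows. The record layout and field names (`speed_kmh`, `limit_kmh`, `accel_ms2`) are assumptions made for this sketch and are not taken from the study's data dictionary.

```python
from dataclasses import dataclass

# Thresholds as stated in the text.
SPEEDING_MARGIN_KMH = 1.0    # 1 km/h or more above the posted limit
ACCEL_THRESHOLD_MS2 = 4.0    # acceleration of 4 m/s^2 or more
BRAKE_THRESHOLD_MS2 = -4.0   # braking of -4 m/s^2 or stronger


@dataclass
class Observation:
    """One second of driving. Field names are hypothetical."""
    speed_kmh: float   # GPS speed
    limit_kmh: float   # posted speed limit where the vehicle is travelling
    accel_ms2: float   # longitudinal acceleration, negative when braking


def classify(obs: Observation) -> dict:
    """Return a flag for each of the three behaviours for one observation."""
    excess_kmh = obs.speed_kmh - obs.limit_kmh
    return {
        "speeding": excess_kmh >= SPEEDING_MARGIN_KMH,
        "aggressive_acceleration": obs.accel_ms2 >= ACCEL_THRESHOLD_MS2,
        "aggressive_braking": obs.accel_ms2 <= BRAKE_THRESHOLD_MS2,
    }
```

For example, an observation at 61 km/h in a 60 km/h zone with no acceleration would be flagged as speeding only, whereas 60.5 km/h with an acceleration of -4.2 m/s² would be flagged as aggressive braking only.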

Figure 4-6 illustrates the different components that fit into the study and which ones are used to test each of the hypotheses. Hypotheses 1 use only data from the 'before' phase whilst Hypotheses 2 use data from both the 'before' and the 'after' phases. Regardless of the original source of each variable, the final dataset groups variables into four categories: driver and vehicle, trip (temporal), road segment (spatial) and behavioural. The driver and vehicle variables remain unchanged for the duration of the study whilst the other variables change at various frequencies. Factors exogenous to the driver and vehicle that potentially influence behavioural outcomes are controlled by a Temporal and Spatial Identifier (TSI), which uniquely describes the environment in which an observation occurs. TSIs are created by combining the temporal and spatial variables; each unique combination forms a single TSI. The behavioural measures are analysed within these unique environments. For Hypotheses 2, changes in the behavioural measures before and after drivers are made aware of their behaviour are compared between phases within each TSI. The construction of TSIs is further discussed in Chapter 7.
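The idea that each unique combination of temporal and spatial variables forms one TSI can be sketched as below. The four variables used here (`time_of_day`, `day_type`, `speed_limit_kmh`, `road_type`) are placeholders chosen for illustration; the variables actually used in the study are defined in Chapter 7 and are not reproduced here.

```python
from typing import NamedTuple


class Environment(NamedTuple):
    """Temporal and spatial variables for one observation (hypothetical set)."""
    time_of_day: str       # temporal, e.g. "day" or "night"
    day_type: str          # temporal, e.g. "weekday" or "weekend"
    speed_limit_kmh: int   # spatial
    road_type: str         # spatial


def assign_tsi(environments):
    """Give every unique combination of variables its own integer identifier.

    Identifiers are issued in order of first appearance, starting at 1.
    Returns the identifier for each observation and the lookup table.
    """
    tsi_of = {}
    labels = []
    for env in environments:
        if env not in tsi_of:
            tsi_of[env] = len(tsi_of) + 1
        labels.append(tsi_of[env])
    return labels, tsi_of
```

Two observations receive the same identifier only if every temporal and spatial variable matches, so behavioural measures compared within one identifier are compared within the same environment.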

Figure 4-6: Methodological framework

To account for the hierarchical relationships between the variables included in this study, composite profiles are built to describe drivers' behaviour as shown in Figure 4-7. The driver behaviour profile (DBP) is the sum, over the three behaviours (speeding, aggressive acceleration and aggressive braking), of the frequency of each behaviour multiplied by its magnitude and by the weight associated with that behaviour.[68] This is done for each TSI and weighted by the distance travelled such that (for example) a TSI comprising 100 km of driving has twice the weight of a TSI with 50 km of driving. The driver and vehicle variables, including a driver's risk perceptions and personality characteristics, are also linked to each DBP. Factors associated only with the 'after' period are also included as additional elements. The a priori expectation is that the driver and vehicle characteristics influence the driver behaviour profile. By creating composite profiles that describe the driver, vehicle and behaviour it is possible to model and compare driver behaviour across time, within the same environment or between drivers. The composition, calculation and use of these profiles are explained in detail in Chapter 8.

[68] These weights are derived from the literature. The rationale for these weights is discussed in Section
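A minimal sketch of this calculation is given below, under two assumptions that are not confirmed by the text: the per-TSI scores are combined as a distance-weighted mean, and all behaviour weights are set to a placeholder value of 1.0 (the study's actual weights come from the literature and are not stated here). The function and variable names are hypothetical.

```python
# Placeholder weights, NOT the values used in the study.
WEIGHTS = {
    "speeding": 1.0,
    "aggressive_acceleration": 1.0,
    "aggressive_braking": 1.0,
}


def tsi_score(behaviours, weights=WEIGHTS):
    """Sum of frequency x magnitude x weight over the behaviours in one TSI.

    behaviours maps a behaviour name to a (frequency, magnitude) pair.
    """
    return sum(
        frequency * magnitude * weights[name]
        for name, (frequency, magnitude) in behaviours.items()
    )


def driver_behaviour_profile(tsis, weights=WEIGHTS):
    """Distance-weighted mean of the per-TSI scores (assumed combination rule).

    tsis is a list of (distance_km, behaviours) pairs, one per TSI.
    """
    total_km = sum(distance_km for distance_km, _ in tsis)
    if total_km == 0:
        raise ValueError("no distance travelled")
    weighted_sum = sum(
        distance_km * tsi_score(behaviours, weights)
        for distance_km, behaviours in tsis
    )
    return weighted_sum / total_km
```

With one TSI of 100 km scoring 10 and another of 50 km scoring 4, the profile is (100 × 10 + 50 × 4) / 150 = 8, so the longer TSI contributes twice the weight of the shorter one.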

Figure 4-7: Driver profiles

4.3.1 Levels of aggregation of GPS data

The large volume of data, in terms of both the number of records and the number of individual and composite variables, requires analyses to be conducted at various levels of aggregation. Although information is lost when aggregating, many types of analyses would be too computationally intensive, if not impossible, to conduct using fully disaggregate data. Aggregating the data by driver or space and analysing the aggregate data therefore allows the selection of variables to be included in the more disaggregate analyses to be refined gradually.

The results and analysis section is organised in order of aggregation, from the most aggregate to the most disaggregate. Figure 4-8 summarises the different levels used. The same variable(s) may be used in more than one level of aggregation. Each level is differentiated by its temporal or spatial coverage, that is, by how much time or distance is covered by each aggregated observation.

Figure 4-8: Summary of levels of aggregation

Data processing is conducted at the most disaggregate level (second-by-second) where each observation represents one second of driving behaviour. Modelling is conducted primarily at the spatiotemporal and road segment levels and to a lesser extent at the most aggregate level where one observation represents one driver.
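The relationship between the levels can be sketched as follows, assuming each second-by-second record has already been reduced to a hypothetical tuple of (driver identifier, environment identifier, speeding flag) and that records are in time order. Real road segments would also need to respect trip boundaries, which this sketch ignores.

```python
from collections import defaultdict
from itertools import groupby


def road_segments(records):
    """Road segment level: collapse consecutive seconds sharing driver and environment.

    Returns (driver, environment, seconds, speeding_seconds) per run.
    """
    segments = []
    for (driver, env), run in groupby(records, key=lambda r: (r[0], r[1])):
        run = list(run)
        segments.append((driver, env, len(run), sum(r[2] for r in run)))
    return segments


def by_environment(records):
    """Spatiotemporal level: all seconds of one driver in one environment."""
    totals = defaultdict(lambda: [0, 0])
    for driver, env, speeding in records:
        totals[(driver, env)][0] += 1
        totals[(driver, env)][1] += speeding
    return {key: tuple(value) for key, value in totals.items()}


def by_driver(records):
    """Most aggregate level: one observation per driver."""
    totals = defaultdict(lambda: [0, 0])
    for driver, _env, speeding in records:
        totals[driver][0] += 1
        totals[driver][1] += speeding
    return {key: tuple(value) for key, value in totals.items()}
```

A driver who spends two seconds in environment 1, one second in environment 2 and then returns to environment 1 for one second yields three road segments, two spatiotemporal observations and one driver-level observation, which illustrates how each step up in aggregation covers more time per observation.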

4.3.2 Data processing and analytical techniques

This section summarises the data processing and analytical techniques that are applied in this research. Table 4-5 outlines the research process. The individual steps are discussed in detail in the indicated chapters.

Level 4 (Most Disaggregate): Second-by-second (original GPS data)
Level 3: Road segment (consecutive observations sharing the same spatiotemporal characteristics)
Level 2 (Spatiotemporal Environment): All observations (same driver) sharing the same spatiotemporal characteristics
Level 1 (Most Aggregate)

Table 4-5: Summary of data processing and analysis steps

Description Category Section(s)
