
CHAPTER 5: PHASE 2 NATURALISTIC GROUP RIDING STUDY:

5.3 Part C: Group rider violations

5.3.7 Part C: Statistical analysis

5.3.7.1 Description of violations

Initially, the 64 ‘red light violations’ were described using percentages in terms of rider behaviour and situational characteristics. The exposure measure for ‘red light violations’ was the total number of red traffic lights. Therefore, the characteristics of the 64 violations were compared to the 473 red lights where no violation occurred in terms of ‘day of week’, ‘direction of travel’ of the group and ‘number of riders’ using chi-square tests for categorical variables and t-tests for continuous variables. Due to the low number of ‘red light violations’ and small number of violations per group, no further analyses were undertaken for ‘red light violations’.
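As an illustration of the chi-square comparison described above, the following minimal Python sketch computes a Pearson chi-square test of independence for a 2 × 2 table. The row totals (64 violations and 473 red lights without a violation) are taken from the text. The split into two ‘day of week’ categories is hypothetical and is not study data, and the function name `chi_square_2x2` is an assumption introduced for illustration only.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    No continuity correction is applied. With 1 degree of freedom the
    p-value equals erfc(sqrt(statistic / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (table[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Rows: violation, no violation. Columns: two 'day of week' categories.
# Row totals (64, 473) come from the text; the column split is HYPOTHETICAL.
observed = [[40, 24],
            [250, 223]]
stat, p_value = chi_square_2x2(observed)
print(f"chi-square = {stat:.3f}, p = {p_value:.3f}")
```

A continuous characteristic such as ‘number of riders’ would instead be compared between the two sets of red lights with a t-test, as stated above.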

Secondly, the 103 ‘stop sign violations’ were described using percentages in terms of situational characteristics. The exposure measure for ‘stop sign violations’ was the total number of stop signs. Therefore, the characteristics of the 103 violations were compared to the 26 stop signs where no violation occurred in terms of ‘day of week’, ‘direction of travel’ of the group and ‘number of riders’ using chi-square and t-tests. Since ‘stop sign violations’ were determined to be based completely on traffic circumstances and not group characteristics, no further analyses were undertaken for ‘stop sign violations’.

Finally, the 232 ‘other violations’ were described in terms of violation type (‘one-way [...] percentages. Then, the total number of ‘other violations’ (combined) was described in terms of group characteristics, using percentages.

5.3.7.2 Univariate analyses for other violations

Further analyses examining the number of ‘other violations’ (combined) were undertaken. The exposure measure for ‘other violations’ was hours of eligible footage, since these violation types could generally occur at any time. First, outlier values for the rate of ‘other violations’ (number per hour) were identified using boxplots. Then, using the value modification method, each outlier was replaced with the largest observed value that was not an outlier (Kwak & Kim, 2017). The rate of ‘other violations’ was calculated for the 91 eligible trips and presented by group characteristics, in terms of means and SDs.
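The outlier handling described above can be sketched as follows. This is a minimal illustration, assuming that boxplot outliers are values more than 1.5 times the interquartile range above the third quartile; the exact quartile definition used by the boxplot software is not stated in the text, so the `inclusive` method is also an assumption. Only high outliers are handled, matching the description of replacement by the largest non-outlying value. The rates shown are hypothetical, not study data, and the function name is illustrative.

```python
import statistics

def replace_high_outliers(rates, k=1.5):
    """Replace each high outlier with the largest value that is not an outlier."""
    # Quartiles by the 'inclusive' method (an assumption; software may differ).
    q1, _, q3 = statistics.quantiles(rates, n=4, method="inclusive")
    upper_fence = q3 + k * (q3 - q1)
    largest_inlier = max(r for r in rates if r <= upper_fence)
    return [largest_inlier if r > upper_fence else r for r in rates]

# HYPOTHETICAL rates of 'other violations' per hour for eight trips.
rates = [0.5, 1.0, 1.2, 1.5, 1.8, 2.0, 2.4, 9.0]
cleaned = replace_high_outliers(rates)
print(cleaned)
```

With these hypothetical values, only the final rate lies above the upper fence, so it is replaced by 2.4 and the other values are unchanged.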

The data was clustered. However, for this examination of group characteristics associated with the rate of violations per trip, it was the clustering of trips within riding groups that needed to be accounted for. Ordinary Poisson or negative binomial regression, used to examine count outcomes, assumes that all observations are independent of each other. Since this assumption was violated, GEE modelling was again chosen. GEE negative binomial regression was used since the dependent variable, ‘number of other violations’, was count data and was over-dispersed (the conditional variance exceeded the conditional mean) (Coxe, West, & Aiken, 2009). Negative binomial regression has an extra parameter to model the over-dispersion (Coxe et al., 2009). As for standard negative binomial regression, the GEE model provides an IRR and 95% CI, but the model also accounts for the dependence within clusters. An IRR is the ratio of two incidence rates, where an incidence rate is the number of events (‘number of other violations per trip’) divided by the person-time at risk (‘hours of eligible footage per trip’) (Hilbe, 2011).
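To make the two quantities in this paragraph concrete, the following short Python sketch computes an incidence rate ratio from counts and exposure, and a crude variance-to-mean ratio for a set of counts. All numbers are hypothetical and are not study data, and the function names are introduced for illustration. The variance-to-mean ratio is a marginal check only; over-dispersion in the regression sense concerns the conditional variance given the covariates.

```python
import statistics

def incidence_rate(events, exposure_hours):
    """Number of events divided by the time at risk."""
    return events / exposure_hours

def incidence_rate_ratio(events_a, hours_a, events_b, hours_b):
    """Ratio of the incidence rate in group A to that in group B."""
    return incidence_rate(events_a, hours_a) / incidence_rate(events_b, hours_b)

def dispersion_ratio(counts):
    """Sample variance divided by mean; values well above 1 suggest over-dispersion."""
    return statistics.variance(counts) / statistics.mean(counts)

# HYPOTHETICAL: 12 violations in 4 hours versus 6 violations in 4 hours.
irr = incidence_rate_ratio(12, 4.0, 6, 4.0)

# HYPOTHETICAL violation counts for five trips.
counts = [0, 1, 1, 2, 8]
ratio = dispersion_ratio(counts)
print(irr, round(ratio, 2))
```

An IRR above 1 indicates a higher rate of violations per hour in the first group than in the second.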

Due to the clustered data and the ‘number of other violations per trip’ being related to exposure (‘hours of eligible footage per trip’), univariate analyses were conducted using GEE negative binomial regression in SPSS version 22. The ‘number of other violations per trip’ (count) was entered as the dependent variable. Then the univariate [...] using unadjusted IRRs and 95% CIs with only one variable entered in the model. This method was used so that the univariate associations could be examined while still accounting for the clustering of trips within groups and also for exposure.

Each riding group was treated as a different ‘subject’ in SPSS and the different trips within each group were entered as the ‘within subject’ variable in the GEE model. The natural log ‘ln (hours of eligible footage per trip)’ was entered as the offset variable to control for exposure, as required for negative binomial regression which uses a log link (Coxe et al., 2009). An exchangeable working correlation matrix was again chosen. From the GEE modelling, correlations among observations from within the same cluster (riding group) were estimated to range from 0.105 to 0.368 for the independent variables, indicating weak to moderate marginal correlations (Cohen, 1988).
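The role of the offset and the exchangeable working correlation can be written out as below. This is a sketch under assumed notation, none of which appears in the original text: $Y_{ij}$ is the number of other violations on trip $j$ of riding group $i$, $\mu_{ij}$ its expected value, $t_{ij}$ the hours of eligible footage, $x_{ij}$ a single independent variable, $\alpha$ the over-dispersion parameter and $\rho$ the common within-group correlation. The quadratic variance function shown is the usual negative binomial form and is assumed here rather than taken from the text.

```latex
\begin{align}
\ln(\mu_{ij}) &= \ln(t_{ij}) + \beta_0 + \beta_1 x_{ij}
  && \text{(log link; offset coefficient fixed at 1)} \\
\ln\!\left(\frac{\mu_{ij}}{t_{ij}}\right) &= \beta_0 + \beta_1 x_{ij}
  && \text{(equivalent model for the rate per hour)} \\
\mathrm{IRR} &= \exp(\beta_1)
  && \text{(rate ratio per unit increase in } x_{ij}\text{)} \\
\mathrm{Var}(Y_{ij}) &= \mu_{ij} + \alpha\,\mu_{ij}^{2}
  && \text{(over-dispersion when } \alpha > 0\text{)} \\
\mathrm{Corr}(Y_{ij}, Y_{ik}) &= \rho \quad (j \neq k)
  && \text{(exchangeable working correlation)}
\end{align}
```

Because the offset moves $\ln(t_{ij})$ to the left-hand side, the exponentiated coefficients are interpreted as ratios of violations per hour rather than of raw counts.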

5.3.7.3 Multivariate GEE negative binomial regression model

A multivariate GEE negative binomial regression model was fitted in order to examine the association between multiple group and trip-related factors and the rate of ‘other violations’. Again, ‘number of other violations per trip’ was entered as the dependent variable and ‘ln (hours of eligible footage per trip)’ was entered as the offset variable. Group-level variables considered for inclusion as independent variables in the model were those with p-values of less than 0.25 in the univariate analyses. Trip-level variables considered for inclusion in the model were based on findings from the limited literature on factors associated with rider violations.
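The screening rule for group-level variables can be sketched as a simple filter. The variable names and p-values below are placeholders and are not study results, and the function name is an assumption introduced for illustration; only the 0.25 threshold comes from the text.

```python
def screen_variables(univariate_p_values, threshold=0.25):
    """Return the names of variables whose univariate p-value is below the threshold."""
    return [name for name, p in univariate_p_values.items() if p < threshold]

# HYPOTHETICAL univariate p-values for three placeholder group-level variables.
univariate_p_values = {
    "characteristic_a": 0.03,
    "characteristic_b": 0.61,
    "characteristic_c": 0.19,
}
candidates = screen_variables(univariate_p_values)
print(candidates)
```

A relatively liberal threshold such as 0.25 is commonly used at this stage so that variables with weak univariate associations are not excluded before adjustment for other factors.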

The final multivariate GEE model included the following independent variables: ‘organisational structure’ (semi-formal, informal, formal), ‘sprint points’ (no, yes) and ‘average number of riders’ (count). Since the ‘organisational structure’ variable was based on five specific group characteristics, ‘organisational structure’ was then removed and the multivariate model run with each of the five specific characteristics entered separately (‘cost’, ‘committee/incorporated business’, ‘written code of conduct’, ‘designated ride leader’ and ‘uniform’). This was to determine which of these specific characteristics were associated with the rate of other violations per trip. QIC and QICC values were also determined for each of the five models.


5.4 Ethical considerations

Phase 2 was approved by the Curtin University Human Research Ethics Committee as a ‘sub-study’ of the larger ARC-funded cycling study. Group riders who participated in Phase 2 of the study received a PIS and signed a consent form (Appendix 10) before cameras were attached to their bicycles. A waiver of consent was granted by the Curtin University Human Research Ethics Committee concerning group riders other than the study participant who appeared in the group riding footage. This was on the grounds of negligible risk from being filmed, the potential road safety benefits, the impracticability of obtaining consent from all riders who appear in the footage and sufficient protection of privacy. Participants were asked to ride as they usually would while cameras were attached, so participation did not increase their risk of crash involvement. Participants were informed that in the event of a crash, video footage could be subpoenaed in a court of law. Otherwise, all footage and data collected were kept completely confidential and only viewed by the Curtin University researchers. All participants gave verbal permission for the researcher-administered questionnaire to be audio recorded. The identity of all participants and groups will be concealed in any publications.

5.5 Data management

Phase 2 data was stored according to the Curtin University Research Data and Primary Materials Policy. All paper-based data including consent forms from Phase 2 were stored in a locked filing cabinet at C-MARC. Online survey data from Phase 2 was recorded and stored electronically using the Google Docs program and was password protected. Final databases were downloaded and saved with only participant IDs and no identifying information. All electronic files including questionnaire databases, video footage, GPS data, maps and audio files were stored on the Curtin Research drive in a project folder that could only be accessed by nominated researchers on the project. All data and files will be retained for a period of seven years following the conclusion of the project and then destroyed.


In document Group Riding in Western Australia: An Examination of Crashes, Outcomes and Behaviour (Page 191-195)