CHAPTER 4: DEVELOPMENT OF MACRO-LEVEL COLLISION PREDICTION MODELS
4.10 Summary and Recommendations for Future Work
This study used a dataset for Regina, Saskatchewan, Canada to develop and compare two types of geographically weighted regression model: GWPR and GWNBR. It initially considered 66 input variables. A set of conventional NB models showed that six of the 66 variables were statistically significant in FI collision prediction and eight were statistically significant in PDO collision prediction, giving a total of 11 different variables (all significant at a confidence level of at least 80%). No socio-demographic input variables were statistically significant. The statistically significant variables included traffic volume, road inventory variables (e.g., arterial length and intersection density), and land use variables (e.g., industrial area and commercial area).
Moran’s I local indicator was used to check for spatial dependency in the variables. As all the selected variables showed spatial dependence, advanced models (GWPR and GWNBR) were used to handle this issue.
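As an illustration of the spatial dependence check described above, the following minimal sketch computes Moran's I for a single variable, both as one local statistic per zone and as the corresponding global value. The function name, the binary contiguity weights, and the four-zone example values are hypothetical and are not taken from the Regina dataset; the study's own software and weighting scheme may differ.

```python
import numpy as np

def morans_i(values, weights):
    """Return (global Moran's I, local Moran's I per zone) for one variable.

    `weights` is an n x n spatial weights matrix with a zero diagonal.
    """
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()                 # deviations from the mean
    m2 = (z ** 2).sum() / x.size     # second moment of the deviations
    lag = w @ z                      # weighted sum of neighbouring deviations
    local_i = z * lag / m2           # one local statistic per zone
    global_i = local_i.sum() / w.sum()
    return global_i, local_i

# Hypothetical example: four zones in a row, adjacent zones are neighbours.
w = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
volumes = [10.0, 12.0, 30.0, 33.0]   # made-up traffic volumes
global_i, local_i = morans_i(volumes, w)
print(round(global_i, 3), np.round(local_i, 3))
```

A positive value indicates that similar values tend to cluster in neighbouring zones. A formal test would additionally need a reference distribution (for example, from random permutations), which this sketch omits.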
The study evaluated the impact of two types of bandwidth (fixed Gaussian and adaptive bi-square) on the predictive performance of the GWPR and GWNBR collision prediction models. It used cross-validation error to select the optimal bandwidth for each approach. The results showed that the type of bandwidth could significantly affect the predictive performance of the collision prediction models. The fixed Gaussian bandwidth appeared to offer better predictive performance than the adaptive bi-square bandwidth for all the GWPR and GWNBR models developed to predict the number of zonal-level FI and PDO collisions. The parameters of the models developed using the fixed Gaussian bandwidth varied over a wider range than those of the models developed using the adaptive bi-square bandwidth. This wider range appeared to help explain unobserved heterogeneity within each zone and to improve the predictive performance of the collision prediction models.
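The difference between the two bandwidth types can be made concrete with a short sketch of the kernel weights. It assumes the commonly used forms of the two kernels: a Gaussian kernel with one fixed bandwidth for all zones, and a bi-square kernel whose bandwidth at each zone is the distance to its k-th nearest neighbour. The function names, the coordinates, and the values of the bandwidth and k are hypothetical.

```python
import numpy as np

def fixed_gaussian_weights(dist, bandwidth):
    """Gaussian kernel: every zone gets a positive weight that decays with distance."""
    return np.exp(-0.5 * (np.asarray(dist, dtype=float) / bandwidth) ** 2)

def adaptive_bisquare_weights(dist, k):
    """Bi-square kernel with a zone-specific bandwidth.

    The bandwidth for zone i is the distance to its k-th nearest neighbour
    (column 0 of the sorted distances is the zone itself, assuming distinct
    zone locations). Weights fall to exactly zero at or beyond that distance.
    """
    dist = np.asarray(dist, dtype=float)
    b = np.sort(dist, axis=1)[:, k]
    ratio = dist / b[:, None]
    return np.where(ratio < 1.0, (1.0 - ratio ** 2) ** 2, 0.0)

# Hypothetical zone centroids, unevenly spaced along a line.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [6.0, 0.0], [10.0, 0.0]])
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)

w_gauss = fixed_gaussian_weights(dist, bandwidth=2.0)
w_bisq = adaptive_bisquare_weights(dist, k=2)
print(np.round(w_gauss[0], 3))
print(np.round(w_bisq[0], 3))
```

With the fixed kernel, the number of zones that contribute meaningfully to a local fit depends on how densely zones are packed around the regression point. With the adaptive kernel, that number stays roughly constant while the spatial extent of the neighbourhood varies.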
The seven most commonly used GOF tests gave inconsistent results and did not favour either the GWPR or the GWNBR models. The CURE plots provided additional insights and helped us select the better performing model between GWPR and GWNBR. The GWNBR models were preferable to the GWPR models for explaining collision variation across zones in the study area considered, although the performance gap between the two models was not large.
The smaller cumulative residuals, tighter error band, and greater convergence to zero of the fixed Gaussian bandwidth results compared to the adaptive bi-square bandwidth results suggested that the fixed Gaussian bandwidth models were preferable to the adaptive bi-square bandwidth models. The observations regarding the fixed Gaussian and adaptive bi-square bandwidth approaches cannot, however, be generalized for other collision datasets as the findings may be unique to this study’s data. This issue requires additional investigation. Rather than rely on the experience of previous studies, it may be advisable to use the characteristics of the particular dataset to select the most appropriate bandwidth.
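The quantities compared in a CURE plot (cumulative residuals and their error band) can be sketched as follows. The sketch assumes one commonly used form of the limits, in which the band at each point is two standard deviations computed from the cumulative squared residuals and is forced to zero at the last observation. The function name and the small observed and fitted values are hypothetical.

```python
import numpy as np

def cure_data(observed, fitted):
    """Return the sorted fitted values, cumulative residuals, and 2-sigma band.

    Observations are ordered by fitted value. The band uses
    sigma*(i)^2 = s(i) * (1 - s(i) / s(n)), where s(i) is the running sum of
    squared residuals; s(n) must be non-zero.
    """
    observed = np.asarray(observed, dtype=float)
    fitted = np.asarray(fitted, dtype=float)
    order = np.argsort(fitted)
    resid = (observed - fitted)[order]
    cum_resid = np.cumsum(resid)
    cum_sq = np.cumsum(resid ** 2)
    var_star = np.clip(cum_sq * (1.0 - cum_sq / cum_sq[-1]), 0.0, None)
    return fitted[order], cum_resid, 2.0 * np.sqrt(var_star)

# Made-up zonal collision counts and model predictions.
obs = [2, 0, 5, 3]
fit = [1.5, 0.5, 4.0, 3.5]
x_sorted, cum_resid, band = cure_data(obs, fit)
print(cum_resid, np.round(band, 3))
```

A model whose cumulative residuals oscillate around zero and stay inside the band is usually read as fitting well over the whole range of fitted values, which is the sense in which a tighter band and smaller cumulative residuals favour one model over another.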
Future work should investigate a better way to select the optimal bandwidth for use in a GWPR or GWNBR model. The cross-validation method used to select the bandwidth in this study produced similar individual parameter values across the models, but suggested different predictive performance for the different models. Future work should also investigate a better way to evaluate the predictive performance of the collision models, as the various GOF tests gave inconsistent results. CURE plots provide richer information than a single numeric GOF test result and appear to offer a potentially better technique.
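The cross-validation approach to bandwidth selection discussed above can be sketched as a leave-one-out search over candidate bandwidths. To keep the example short, the sketch swaps in a geographically weighted linear (least squares) regression with a fixed Gaussian kernel in place of the Poisson and NB models used in the study, so it illustrates only the selection logic. All names, the candidate bandwidths, and the simulated data are hypothetical.

```python
import numpy as np

def cv_score(coords, X, y, bandwidth):
    """Leave-one-out squared prediction error for a GW linear regression."""
    n = len(y)
    dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    Xd = np.column_stack([np.ones(n), X])       # add an intercept column
    sse = 0.0
    for i in range(n):
        w = np.exp(-0.5 * (dist[i] / bandwidth) ** 2)
        w[i] = 0.0                              # leave zone i out of its own fit
        sw = np.sqrt(w)
        # Weighted least squares via row scaling by sqrt(weight).
        beta, *_ = np.linalg.lstsq(Xd * sw[:, None], y * sw, rcond=None)
        sse += (y[i] - Xd[i] @ beta) ** 2
    return sse

def select_bandwidth(coords, X, y, candidates):
    """Return the candidate with the smallest CV error, plus all scores."""
    scores = [cv_score(coords, X, y, b) for b in candidates]
    return candidates[int(np.argmin(scores))], scores

# Simulated data: the slope on x drifts from west to east.
rng = np.random.default_rng(0)
coords = rng.uniform(0.0, 10.0, size=(30, 2))
x = rng.normal(size=30)
y = (1.0 + 0.3 * coords[:, 0]) * x + rng.normal(scale=0.1, size=30)

candidates = [1.0, 2.0, 4.0, 8.0]
best, scores = select_bandwidth(coords, x, y, candidates)
print(best)
```

A grid search is used here only for clarity; a one-dimensional optimizer over the bandwidth is a common alternative when each model fit is expensive.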
A third area in need of research is the over-dispersion issue. As it is known that conventional NB models have an advantage over conventional Poisson models when handling an over-dispersed dataset, GWPR and GWNBR models should be compared to increase our understanding of how they handle over-dispersion.
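For reference, the distinction underlying this comparison can be written in terms of the variance functions of the two count models. The notation below assumes the usual NB2 parameterization with dispersion parameter \alpha, which is an assumption here because this section does not restate the model forms; \mu_i denotes the expected number of collisions in zone i.

```latex
\begin{align}
  \text{Poisson:} \quad & \operatorname{Var}(y_i) = \mu_i \\
  \text{NB:}      \quad & \operatorname{Var}(y_i) = \mu_i + \alpha \mu_i^{2},
                          \qquad \alpha \ge 0
\end{align}
```

Under this form the NB variance reduces to the Poisson variance as \alpha approaches zero, so the size of the estimated dispersion parameter is one indication of how much over-dispersion remains after the spatially varying parameters have been fitted.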
Lastly, the processing time required to develop the GWPR and GWNBR models was not analyzed quantitatively. This issue was outside the scope of this study, and processing time varies with the computer used, but a substantial amount of time (between 30 and 40 minutes) was needed to obtain calibrated parameters for each GWPR and GWNBR model developed. Improving the optimization techniques used to search for the parameters of GWPR and GWNBR models may help reduce the processing time and may encourage the application of GWPR and GWNBR models over conventional NB models wherever appropriate and feasible.
CHAPTER 5: DEVELOPMENT OF MACRO-LEVEL CRIMES PREDICTION MODELS