Integrated Accident Resilience Framework (IARF) – A Theoretical Approach Using Spatial and Statistical Analysis


Sumanta Ghosh
Qatar Design Consortium, Doha, Qatar

Srinivasan Manickam
Qatar Design Consortium, Doha, Qatar

Muhammad Haider
Qatar Design Consortium, Doha, Qatar

ABSTRACT

Road accidents are a persistent problem for governments worldwide. Data show that someone dies on the road every 24 seconds (WHO, 2018). Road accidents generally have multiple causes, such as traffic volume and composition, speed, infrastructure condition, climatic conditions, and vehicle factors. This paper proposes an Integrated Accident Resilience Framework (IARF): a theoretical method that may help transportation agencies and governments develop a practical system for crash analysis and mitigation. The framework consists of data collection, storage, and analysis stages, which together make it possible to compute correlations between crash causation parameters and crash frequency. The analysis relies on a GIS platform and on a negative binomial regression model. The computed results help identify the parameters most strongly linked to traffic accidents and their contribution to crash frequency at black spot locations, so that appropriate remedial measures can be taken in collision-prone areas to mitigate future crashes. The methodology can also be scaled up to a city-level network: the entire transportation network can be spatially marked to develop a resilient accident management strategy, potentially in real time.

Keywords: Framework; Accident; Resilience; Parameters; Correlation

1 INTRODUCTION

The World Health Organization recognizes that, of all the systems citizens encounter in their day-to-day lives, road transport is the most complex and the most dangerous (WHO, 2018). Road accidents are globally the leading cause of death for individuals aged 5 to 29. Approximately 1.35 million deaths occur due to vehicle crashes every year, and on current trends this number could rise to 2 million deaths per year by 2030 (WHO, 2018). An often overlooked fact is that, along with the loss of lives, accidents carry an economic cost of 1-3% of a nation’s total GDP (WHO, 2015).

International Conference on Civil Infrastructure and Construction (CIC 2020) February 2-5, 2020 Doha, Qatar


Although it is impossible to prevent road crashes completely, effective corrective actions can help reduce future crash rates. Traditionally, traffic agencies identified possible areas of high collision risk with methods such as the ratio of crash rate to traffic volume at a location. Other locations with similar traffic and geometric conditions were then compared to refine the investigation. However, such methods of predicting crash frequency and its contributing variables are now viewed as outdated and shortsighted (Al-Marafi & Somasundaraswaran, 2018). Accurately verifying the causes of crashes at specific locations is vital for identifying the major deficiencies of the road network within a study area and for selecting appropriate mitigation measures. This requires a practical data collection, storage, analysis, and management system, referred to here as an accident resilience framework. The success of such a framework depends on the ability of transportation agencies, analysts, the police, academics, and local governments to work collaboratively. This collaboration is essential to ensure that the whole process is carried out accurately and that the necessary actions are taken based on its conclusions.

2 BACKGROUND RESEARCH

Traffic safety studies aim to relate crash frequency to the variables that may cause accidents. Such variables include road geometry, environmental conditions, traffic factors, vehicle factors, and human behavior (Tarko et al., 2008). Early models for predicting traffic accidents used simple multiple linear regression, which assumes normally distributed errors. Analysts later determined that a Poisson distribution is a better fit for crash occurrence (Mohammed et al., 2018). However, the Poisson distribution restricts the mean and variance to be equal, an assumption that does not always hold: crash frequency data tend to be over-dispersed, with a variance larger than the mean. Recent models therefore use negative binomial regression, which accounts for over-dispersion in crash counts and relaxes the condition that the mean equals the variance (Zhang et al., 2014; Abdulhafedh, 2017). Negative binomial regression models have been used for decades and remain the most widely accepted practice for modelling crash data. Applications can be found in multiple safety research studies such as “Forecasting Crashes at the Planning Level” (Guevara et al., 2004), “The Relationship Between Truck Accidents and Geometric Design of Road Sections: Poisson versus Negative Binomial Regressions” (Miaou, 1994), “Accident Prediction Model for Freeways” (Persaud & Dzbik, 1993), and “Negative Binomial Regression Model for Road Crash Severity Prediction” (Naghawi, 2018).
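As a minimal illustration of the over-dispersion argument above, the following sketch compares the variance of a set of crash counts with their mean. The counts are hypothetical and the function name is an assumption introduced for illustration; a ratio well above 1 suggests that the Poisson assumption of equal mean and variance does not hold.

```python
from statistics import mean, pvariance


def dispersion_ratio(counts):
    """Return variance / mean for a list of crash counts."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean crash count is zero; ratio is undefined")
    return pvariance(counts) / m


# Hypothetical yearly crash counts on ten road segments (illustrative only).
crash_counts = [0, 1, 0, 2, 9, 0, 1, 14, 0, 3]

ratio = dispersion_ratio(crash_counts)
# A ratio near 1 is compatible with a Poisson model; a ratio well above 1
# indicates over-dispersion, which motivates negative binomial regression.
overdispersed = ratio > 1.0
```

This is only a descriptive check on raw counts; a formal assessment would test for over-dispersion after fitting the regression model.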

To develop an accurate statistical model of crash data, the collection and storage of data are as important as the analysis. GIS platforms are a key tool for visualizing and storing road accident data (Zhang et al., 2014). They have been used in several traffic studies such as “Crash Prediction and Risk Evaluation Based on Traffic Analysis Zones” (Zhang et al., 2014), “A GIS based traffic accident data collection, referencing and analysis framework for Abu Dhabi” (Khan et al., 2004), “GIS for Road Accident Analysis in Hong Kong” (Chan, 2009), and “GIS-based spatial analysis of urban traffic accidents: Case study in Mashhad, Iran” (Shafabakhsh et al., 2017). Safety analysts prefer GIS for its efficiency in handling data and for its spatial analysis techniques that help identify accident black spot locations, such as proximity analysis with buffer operations (Chan, 2009) or hotspot analysis (Shafabakhsh et al., 2017).

3 FRAMEWORK

After examining the studies discussed in the literature review, an accident resilience framework is developed. The framework comprises two stages. The first stage focuses on a procedure for the police to record, store, and maintain data in a database accessible to transportation agencies and analysts. The successful execution of this stage is critical for accurate model development, as it directly affects the final results. The second stage describes how the data can be analyzed using multiple tools to identify accident-prone areas and mitigate future collisions there. This stage is the responsibility of transportation agencies and their analysts working towards a mitigation strategy for accident black spot locations.

The effectiveness of the framework lies in its combination of spatial and statistical analysis tools, which gives analysts the insight needed to dissect the major concerns in high-risk areas. For this reason, the GIS platform is used in conjunction with modern statistical software to produce the final model results. These results specify the contribution of each tested parameter to crash frequency. Their accuracy is validated with multiple measures, e.g. mean absolute deviation and mean squared predictive error, to ensure that the model’s predictive performance is acceptable.

3.1 Accident Resilience Framework - Stage 1

As seen below in Figure 1, the first stage of the accident resilience framework covers data collection and storage. Responsibility for carrying out this component properly falls on the police or enforcement officers who arrive at the site of the collision.


3.1.1 STEP 1: Data Collection

The first step involves completing a crash description log that clearly notes the type and severity of the collision. Other details such as the time, date, and weather/environmental conditions must also be noted. The second component of the data collection stage consists of recording the exact location of the crash. This can be done in two ways: either by using a GPS receiver or by pinning the crash location on a map on a mobile phone. The accuracy of the pinned crash location is crucial, as it plays a vital role in examining collision black spots in the analysis stage.

3.1.2 STEP 2: Data Storage

Once the data has been collected, the police must ensure the proper storage of all recorded on-site data. The data will be uploaded by the police onto a centralized database which can be easily accessed by transportation agencies/analysts. This database can be referred to as a “Crash Analysis System” (CAS). The Crash Analysis System will be in the form of a GIS compatible map with geo-referenced locations of all crashes as uploaded by the police, including the details recorded from the crash description log.
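As a rough sketch of what a single entry in such a Crash Analysis System might look like, the example below stores the crash description log fields named above in a record and converts it to a GeoJSON point feature, a format most GIS platforms can import. The field names, identifier scheme, and sample values are assumptions made for illustration and are not prescribed by the framework.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CrashRecord:
    """One crash description log entry (field names are hypothetical)."""
    crash_id: str
    date: str        # ISO 8601 date, e.g. "2019-03-14"
    time: str        # local time, e.g. "17:45"
    crash_type: str  # e.g. "rear-end"
    severity: str    # e.g. "minor", "serious", "fatal"
    weather: str
    latitude: float
    longitude: float


def to_geojson_feature(record):
    """Convert a record to a GeoJSON Point feature.

    GeoJSON stores coordinates in (longitude, latitude) order.
    """
    properties = asdict(record)
    latitude = properties.pop("latitude")
    longitude = properties.pop("longitude")
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [longitude, latitude]},
        "properties": properties,
    }


# Hypothetical crash near Doha, for illustration only.
example = CrashRecord("C-0001", "2019-03-14", "17:45", "rear-end",
                      "minor", "clear", 25.2854, 51.5310)
feature = to_geojson_feature(example)
feature_json = json.dumps(feature)
```

Keeping the log details as feature properties means the spatial and descriptive data travel together when the database is queried by analysts.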

As documented in earlier work, the Highway Safety Information System (HSIS) was a database containing accident data for a select group of states in the United States (USDOT, 1999). The HSIS was integrated with a GIS to develop a crash referencing and analysis system. This improved the ability of analysts and engineers to identify safety deficiencies and perform countermeasure evaluation studies. Furthermore, the reporting of accidents became quicker and more accurate with the combined use of the global positioning system (GPS). Problems were quickly evaluated based on spatial relationships, and even non-traditional data such as land use, zoning ordinances, and population characteristics were fed into the GIS (USDOT, 1999). Hence, the development of a GIS-based crash analysis system is of paramount importance for an effective spatial analysis of crash data in any region, and the application of GIS has been integrated within our framework. In the next stage, spatial data captured in stage 1 are further analyzed in the GIS platform, and crash data are statistically modelled.

3.2 Accident Resilience Framework - Stage 2

As seen in Figure 2 below, the second stage of the accident resilience framework will focus on the analysis of the collected/stored data in a sequential process. The steps involved in this process are briefly explained in the subsequent sections below.


3.2.1 STEP 1: Spatial Analysis

The first step of the analysis stage is a spatial analysis on a GIS platform. The spatial analysis helps identify road segments that are prone to frequent collisions, otherwise known as accident black spots (Reshma & Sharif, 2012). This can be done in two ways: by a proximity analysis using the buffer operation (Chan, 2009), or by computing the nearest neighbor index followed by a hotspot analysis (Shafabakhsh et al., 2017).

The nearest neighbor index compares the observed mean distance between each feature and its nearest neighbor with the mean distance expected for a random pattern. It can be computed with the pattern analysis toolset of a GIS platform. An index below 1 indicates clustering, whereas an index above 1 indicates a tendency towards dispersion. If clustering is present, a hotspot analysis is conducted to identify areas of high (hot spot) and low (cold spot) collision frequency. This is done with the spatial statistics tool, using the number of crashes per feature as the input parameter. The generated layer shows hot and cold spots and helps define the segments of interest that serve as datasets for statistical modelling in the next stage. If the nearest neighbor index is above 1, the data exhibit dispersion and a proximity analysis may be conducted for a more adequate visualization of the data.

The second method of obtaining accident black spots is a proximity analysis using the buffer toolset. Aggregation through the buffering operation helps focus on locations of concern that should undergo mitigation assessment. Larger buffers correspond to areas with a higher frequency of collisions within a specified distance. This technique is showcased in the Hong Kong study mentioned in the literature review (Chan, 2009).
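The two spatial checks described above can be sketched outside a GIS platform as follows. This is a simplified illustration, not the implementation of any particular GIS tool: it assumes crash locations are given as projected coordinates in metres, that the study-area size is supplied by the analyst, and that the expected nearest neighbor distance for a random pattern is 0.5 / sqrt(n / area). All function names and sample points are hypothetical.

```python
import math


def nearest_neighbor_index(points, area):
    """Ratio of observed to expected mean nearest-neighbor distance.

    points: list of (x, y) tuples in projected units (e.g. metres)
    area:   size of the study area in the same units squared
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 / math.sqrt(n / area)  # mean distance for a random pattern
    return observed / expected


def crashes_within_buffer(centre, crashes, radius):
    """Count crash points lying within `radius` of `centre`."""
    return sum(1 for c in crashes if math.dist(centre, c) <= radius)


# Hypothetical point patterns inside a study area of 25 square units.
clustered = [(0.01 * x, 0.01 * y) for x in range(5) for y in range(5)]
regular = [(float(x), float(y)) for x in range(5) for y in range(5)]
nni_clustered = nearest_neighbor_index(clustered, area=25.0)  # below 1
nni_regular = nearest_neighbor_index(regular, area=25.0)      # above 1

# Hypothetical crash locations and two candidate junctions, 50 m buffers.
crashes = [(10, 10), (12, 9), (11, 14), (200, 200), (205, 198), (500, 40)]
count_a = crashes_within_buffer((11, 11), crashes, radius=50)
count_b = crashes_within_buffer((202, 199), crashes, radius=50)
```

The pairwise distance search is quadratic in the number of points, which is adequate for a sketch; a GIS platform or a spatial index would be preferable for city-scale data.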

Using either or both methods helps the user define the high-risk road segments for which all input variable information will be needed from the appropriate departments. These details are explained in the next section.

3.2.2 STEP 2: Parameters - Input Variables

The next phase of the framework showcases the parameters to be assessed for their potential contribution to road accidents in the analyzed black spots.

Table 1: Selected Input Parameters and Required Details


For the black spot road segments identified in the previous section, their respective geometric, traffic, weather, and crash data are needed. When the framework is applied locally, information on road geometry characteristics can be requested from the local public works authority, traffic data can be extracted from the transport model of the region, and climate data can be obtained from the local meteorology department. Geometric details of roads may also be extracted directly from the GIS map features as a backup if the goal is a quick analysis. The parameters that should be tested are listed in Table 1 above. Depending on the available data, further parameters may be added at the user’s discretion. The available input parameters are then modelled statistically to establish a relationship between the parameters (independent variables) and the number of crashes (dependent variable).

3.2.3 STEP 3: Statistical Model

A comparative assessment of different statistical models has been carried out to determine the most suitable crash modelling technique (Mohammed et al., 2018). Negative binomial regression was found to be the best suited. The general form of the negative binomial regression model is presented in equation (1) below:

(1)

where P(y_i) is the probability of segment i experiencing y_i accidents in one year and λ_i is the expected number of accidents per year on segment i. ɛ_i is an error term. Adding ɛ_i allows the variance to differ from the mean, which removes the equal mean and variance limitation of the Poisson distribution (Mohammed et al., 2018).
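For reference, one common way of writing a negative binomial crash model that is consistent with the symbols defined above is sketched below. The coefficient vector \beta, the covariate vector X_i, the dispersion parameter \alpha, and the mean \mu_i are notational assumptions introduced here and may differ from the notation of the cited studies.

```latex
\ln \lambda_i = \beta X_i + \varepsilon_i,
\qquad
P(y_i \mid \varepsilon_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{\,y_i}}{y_i!}

% If \exp(\varepsilon_i) is gamma-distributed with mean 1 and variance \alpha,
% then with \mu_i = \exp(\beta X_i):
\operatorname{E}[y_i] = \mu_i,
\qquad
\operatorname{Var}(y_i) = \mu_i + \alpha\,\mu_i^{2}
```

Under these assumptions the variance exceeds the mean whenever \alpha > 0, and the model reduces to Poisson regression as \alpha approaches 0.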

The model is to be developed in software capable of performing negative binomial regression. Several statistical packages are available, such as SPSS (Chengye & Ranjitkar, 2013), amL (Guevara et al., 2004), and R (Oh et al., 2003), which allow users to input the dependent and independent variable data in a spreadsheet format. These packages can fit a negative binomial regression model to the data and present the output in a tabular format.

As described in section 3.2.2, causation parameters such as geometric data, traffic data, and climate data act as the independent variables of the model, while the crash frequency within the specified road segment acts as the dependent variable. Each year’s data for a segment is treated as a separate sample. The most recent 20% of the data should not be used for model development, as it is reserved for validating prediction performance. The negative binomial regression model quantifies the degree of impact of each input parameter through its computed coefficient value, as explained in the next section.

3.2.4 STEP 4: Results

The results of the model take the form of a table providing coefficient/estimate values for each parameter along with other computed values such as standard errors, z-values, t-statistics, and goodness of fit. Examples of such result sets are presented in the studies “Modelling Motorway Accidents using Negative Binomial Regression” (Chengye & Ranjitkar, 2013) and “Negative Binomial Regression Model for Road Crash Severity Prediction” (Naghawi, 2018). The key takeaway from the results is the set of estimated coefficients, which express the contribution of each parameter to crash frequency in the black spots. For example, a positive coefficient of high magnitude indicates a strong contribution to road accidents, whereas a negative coefficient of high magnitude indicates that the parameter reduces crash frequency. As a coefficient approaches 0, the effect of the parameter on crash frequency becomes negligible.

3.2.5 STEP 5: Model Validation

As stated in section 3.2.3, the latest 20% of the crash data is withheld from model development and used instead for validating the model. The two techniques used for model validation are the computation of the Mean Absolute Deviation (MAD) and the Mean-Squared Predictive Error (MSPE), as described in the study “Validation of FHWA Crash Models for Rural Intersections” (Oh et al., 2003).
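The hold-out rule above might be sketched as follows. The paper does not state how the 20% share is measured, so this illustration assumes that whole years are withheld and that the share is rounded to the nearest number of years (at least one); the sample structure and function name are likewise hypothetical.

```python
def split_by_recent_years(samples, holdout_share=0.2):
    """Split segment-year samples so the most recent years form the validation set.

    Whole years are held out so that no year is divided between the
    development and validation sets (an assumption, see lead-in).
    """
    years = sorted({s["year"] for s in samples})
    n_holdout = max(1, round(holdout_share * len(years)))
    holdout_years = set(years[-n_holdout:])
    development = [s for s in samples if s["year"] not in holdout_years]
    validation = [s for s in samples if s["year"] in holdout_years]
    return development, validation


# Hypothetical crash counts for two segments over 2015-2019;
# each segment-year is treated as a separate sample.
yearly_counts = {"S1": [4, 6, 5, 7, 3], "S2": [0, 1, 0, 2, 1]}
samples = [
    {"segment": segment, "year": year, "crashes": crashes}
    for segment, counts in yearly_counts.items()
    for year, crashes in zip(range(2015, 2020), counts)
]

development, validation = split_by_recent_years(samples)
```

With five years of data, the most recent year (2019) is withheld, which corresponds to exactly 20% of the samples in this example.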

The Mean Absolute Deviation measures the average misprediction of the model. A value closer to 0 implies that the model predicts the observed data well on average. The equation for the Mean Absolute Deviation (MAD) is given as follows:

$$\mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n}\left|\hat{Y}_i - Y_i\right| \tag{2}$$

where $\hat{Y}_i$ is the fitted value of $Y_i$ and $n$ is the data sample size.
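As a small worked example, the mean absolute deviation described above, together with the mean-squared predictive error introduced in the next paragraph, can be computed as follows. The predicted and observed values are hypothetical, and the function names are assumptions made for illustration.

```python
def mean_absolute_deviation(predicted, observed):
    """Average absolute difference between predicted and observed counts."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def mean_squared_predictive_error(predicted, observed):
    """Average squared difference, computed on the validation data set."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)


# Hypothetical model predictions and observed crash counts for four
# held-out segment-years.
predicted = [2.5, 0.8, 4.1, 1.0]
observed = [3, 1, 4, 0]

mad = mean_absolute_deviation(predicted, observed)         # about 0.45
mspe = mean_squared_predictive_error(predicted, observed)  # about 0.325
```

Because the squared error penalizes large misses more heavily, a single badly predicted segment affects the MSPE more than the MAD, which is one reason for reporting both.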

The second step of validating the model is to compute the Mean-Squared Predictive Error (MSPE), which measures the prediction error on a validation or external data set. The equation for the Mean-Squared Predictive Error (MSPE) is given as follows:

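$$\mathrm{MSPE} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{Y}_i - Y_i\right)^2 \tag{3}$$

where $\hat{Y}_i$ is the predicted value and $Y_i$ the observed value for sample $i$, and $n$ is here the size of the validation data set.

Based on the magnitude of the deviation between the estimated and actual values of crash frequency, an iterative model building process may be adopted for a more accurate representation. This iterative process may consist of a series of steps, including the elimination of skewed datasets and the removal of less significant input parameters.

3.2.6 STEP 6: Crash Mitigation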

A crash mitigation strategy can be formulated based on the results of the regression model. The final estimate/coefficient values for each parameter specify the effect each has on crash frequency. This gives analysts a clearer picture of the likely common causes of crashes at the collision hotspots. For example, if maximum curvature is found to cause the biggest increase in crash frequency, analysts can provide proper signage or reduce the speed limit to ensure drivers can safely manoeuvre around the curves. Similarly, a parameter with a negative coefficient is associated with reduced crash frequency. For example, if the median width is modelled to have a negative coefficient, this implies that a wider median is beneficial in mitigating crashes. The crash mitigation strategy proposes mitigations in order of weightage, with the highest contributors to crash frequency mitigated first.
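The ordering of mitigations by weightage can be sketched as below. The example assumes a log link, as is usual for negative binomial regression, so that a coefficient b corresponds to a multiplicative change of exp(b) in expected crash frequency per unit increase of the parameter. The coefficient values and parameter names are hypothetical and are not results from any study; in practice the parameters would need comparable units or standardization before their coefficients are ranked against each other.

```python
import math


def percent_change_per_unit(coefficient):
    """Percent change in expected crash frequency per one-unit increase
    of a parameter, assuming a log link."""
    return (math.exp(coefficient) - 1.0) * 100.0


def mitigation_priority(coefficients):
    """Names of parameters with positive coefficients, largest first."""
    risky = {name: b for name, b in coefficients.items() if b > 0}
    return sorted(risky, key=risky.get, reverse=True)


# Hypothetical coefficient estimates for three tested parameters.
coefficients = {
    "max_curvature": 0.42,
    "heavy_vehicle_share": 0.15,
    "median_width": -0.30,
}

priority = mitigation_priority(coefficients)
effects = {name: percent_change_per_unit(b) for name, b in coefficients.items()}
```

In this illustration the curvature parameter would be addressed first, while the negative median width coefficient marks a feature that appears protective rather than one requiring mitigation.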

4 BENEFITS OF THE FRAMEWORK

The application of the accident resilience framework proposed in this paper is beneficial for society as it:

a. Promotes the creation of a crash analysis system which will act as a database for future research and development in the area of accident resilience.

b. Combines spatial and statistical analysis tools that are best suited to visualize and model crash data, based on modern day research.

c. Is dynamic in nature and allows analysts to choose any accident-related parameters to be tested, based on the depth of available data.

d. Depicts the contribution that tested parameters have on road accidents occurring at black spot locations.

e. Offers a quick and cost-effective procedure to acquire test results.

f. Allows transportation agencies and analysts to develop mitigation strategies based on a validated model.

g. Encourages a range of stakeholders to cooperate and work in unison to improve the safety of the public.

5 FUTURE WORK

The primary focus of the proposed framework is the modelling of the total number of crashes and their association with the various parameters that may contribute to their occurrence. The model can help identify hazardous road segments that need to be treated with the necessary mitigation measures. However, there is still scope for further work on crash data modelling, such as correlating crashes with zonal parameters covering demographic, social, and economic factors. Zonal factors that have not been considered here include age, gender, population density, average income per person, occupation types, total driving experience, and behavioral characteristics. Including them will require a detailed survey or census of the population in the study area in order to obtain an accurate and explainable model.

Another area with potential for more research is the GIS platform. The GIS platform is an effective way to visualize and store crash data. However, GIS data in 3D can be more difficult to create and maintain than 2D data. For example, a road network modelled in 3D contains z-values that are considerably more complex to maintain than a 2D model. Newer developments in the application of 3D GIS can nevertheless help create a more comprehensive understanding of a particular area, in our case the high-risk collision areas. Moreover, 3D modelling can help retrieve geometric parameter information such as alignments and grading much faster than relying on public authorities to provide the data, which is likely to take longer and delay the analysis.


readiness. For example, once the final model has been validated to identify the probabilities of crashes in real time, a mechanism/strategy can be developed for authorities such as emergency response teams to reach these accident black spots as quickly as possible. If parameters such as road curvature, traffic composition, or vertical alignment are computed to have a significantly high contribution to crashes at a location with high validity (95%, 90% or 85% etc.), the spatial model will immediately alert users to a high risk (85%+ probability) of an accident occurring. This will further enhance the operational strategy of accident response teams and help ensure that citizens affected by road accidents are assisted promptly and effectively. It is also worth mentioning that there is a prospect of developing an accident resilience framework for Qatar using the theoretical model provided in this paper. Such a framework would be particularly beneficial for a booming nation like Qatar, which is experiencing rapid population growth as well as increasing traffic saturation.
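One possible way to turn model output into the kind of alert described in this section is sketched below. It assumes that the fitted negative binomial model supplies an expected crash count mu for a location over some prediction window together with a dispersion parameter alpha, and it uses the 85% threshold mentioned above. The function names, the choice of window, and the sample values are hypothetical; the paper does not specify how the alert probability would be computed.

```python
import math


def prob_at_least_one_crash(mu, alpha):
    """Probability of one or more crashes in the prediction window.

    mu:    expected crash count in the window (from the fitted model)
    alpha: negative binomial dispersion; alpha = 0 gives the Poisson case
    """
    if mu < 0 or alpha < 0:
        raise ValueError("mu and alpha must be non-negative")
    if alpha == 0:
        p_zero = math.exp(-mu)
    else:
        p_zero = (1.0 + alpha * mu) ** (-1.0 / alpha)
    return 1.0 - p_zero


def should_alert(mu, alpha, threshold=0.85):
    """True when the crash probability reaches the alert threshold."""
    return prob_at_least_one_crash(mu, alpha) >= threshold


# Hypothetical predictions for two black spot locations.
alert_low = should_alert(mu=3.0, alpha=0.5)   # probability 0.84
alert_high = should_alert(mu=4.0, alpha=0.5)  # probability about 0.89
```

The threshold and the length of the prediction window would need to be agreed with the emergency response teams, since both determine how often alerts are raised.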

6 CONCLUSION

Traffic accident prediction and safety evaluation are critical in combating the ever-growing problem of road accidents. This paper presents a feasible Integrated Accident Resilience Framework (IARF) that combines multiple modern methods and tools. The framework encompasses a broad series of actions that different stakeholders can take to improve the transport system. A set procedure is described for how crash data are to be collected and stored by local police, a step that serves as the baseline for all future crash analysis studies. GIS is used for spatial analysis because it offers a range of tools for identifying high-risk areas that need immediate assessment. The two major tools selected are hotspot analysis and proximity analysis. Both are effective ways to visualize the study areas and are quick to perform. Three classes of parameters to be tested have been included: road geometry, traffic factors, and environmental conditions. Specific parameters may be added or removed depending on the depth of information available to analysts. For the statistical modelling of crashes, the reviewed literature confirmed that negative binomial regression is the most widely accepted and best fitting regression for crash data. The results are provided in a tabular format that identifies the magnitude of the effect each parameter has on crash frequency in high-risk areas within a region, municipality, or city. Lastly, a mitigation strategy is developed based on the ranked weightage of the parameters that contribute most to crash frequency.

Regarding future work, there is room for further research to improve the accuracy of the model by including demographic, social, and economic factors, and by incorporating 3D technology for modelling road networks in GIS. There is also an opportunity to improve the operational readiness of the model by formulating accident response strategies for high alert results from the model. Lastly, there is a definite possibility of implementing a similar accident resilience framework in a growing nation such as Qatar.


REFERENCES

Abdulhafedh, A. (2017). Road crash prediction models: different statistical modeling approaches. Journal of Transportation Technologies, 07(02), 190-205.

Al-Marafi, M. N. & Somasundaraswaran, K. (2018). Review of crash prediction models and their applicability in black spot identification to improve road safety. Indian Journal of Science and Technology, 11(5), 1-7.

Chan, W. Y. (2009). GIS for Road Accident Analysis in Hong Kong. A Journal of the Association of Chinese Professionals in Geographic Information Systems, 10(01), 58-67.

Chengye, P. & Ranjitkar, P. (2013). Modelling motorway accidents using negative binomial regression. Journal of the Eastern Asia Society for Transport Studies, 10, 1946-1963.

Guevara, F. L. D., Washington, S. P. & Oh, J. (2004). Forecasting Crashes at the Planning Level: Simultaneous Negative Binomial Crash Model Applied in Tucson, Arizona. Transportation Research Record: Journal of the Transportation Research Board, 1897(1), 191-199.

Khan, A., Salim, A., Kathairi, A. & Garib, A. (2004). A GIS based traffic accident data collection, referencing and analysis framework for Abu Dhabi. Paper presented in CODATU XI: World Congress: Towards more attractive urban transportation, Bucharest, Romania.

Miaou, S.-P. (1994). The relationship between truck accidents and geometric design of road sections: Poisson versus negative binomial regressions. Accident Analysis & Prevention, 26(4), 471-482.

Mohammed, A., Ambak, K., Mosa, A. & Syamsunur, D. (2018). Classification of Traffic Accident Prediction Models: A Review Paper. International Journal of Advances in Science Engineering and Technology, 6(2), 35-38.

Naghawi, H. (2018). Negative Binomial Regression Model for Road Crash Severity Prediction. Modern Applied Science, 12(4), 38.

Oh, J., Lyon, C., Washington, S., Persaud, B. & Bared, J. (2003). Validation of FHWA Crash Models for Rural Intersections: Lessons Learned. Transportation Research Record: Journal of the Transportation Research Board, 1840(1), 41-49.

Peden, M. M., World Health Organization & World Bank. (2004). World Report on Road Traffic Injury Prevention. Chapter 1, Page 3. World Health Organization.

Persaud, B. & Dzbik, L. (1993). Accident Prediction Model for Freeways. Transportation Research Record, 1401, 55-60.

Reshma, E. K. & Sharif, U. (2012). Prioritization of Accident Black Spots Using GIS. International Journal of Emerging Technology and Advanced Engineering, 02(09), 117-125.

Shafabakhsh, G. A., Famili, A. & Bahadori, M. S. (2017). GIS-based spatial analysis of urban traffic accidents: Case study in Mashhad, Iran. Journal of Traffic and Transportation Engineering (English Edition), 4(3), 290-299.

Tarko, A. P., Inerowicz, M., Ramos, J. & Li, W. (2008). Tool with road-level crash prediction for transportation safety planning. Transportation Research Record: Journal of the Transportation Research Board, 2083(1), 16-25.

USDOT. (1999). GIS-Based Crash Referencing and Analysis System. U.S. Department of Transportation, Washington D.C.

World Health Organization. (2018). Global Status Report on Road Safety 2018. Chapter 1, Page 2. Geneva, Switzerland.

World Health Organization. (2015). Global Status Report on Road Safety. Chapter 1, Page 4. Geneva, Switzerland.

Zhang, C., Yan, X., Ma, L. & An, M. (2014). Crash Prediction and Risk Evaluation Based on Traffic Analysis Zones. Mathematical Problems in Engineering, 2014, 1-9.

Cite this article as: Ghosh S., Manickam S., Haider M., “Integrated Accident Resilience Framework (IARF) – A Theoretical Approach Using Spatial and Statistical Analysis”, International Conference on Civil Infrastructure and Construction (CIC 2020), Doha, Qatar, 2-5 February 2020, DOI: https://doi.