1
Samerei SA, Aghabayk K. Interpretable machine learning for evaluating risk factors of freeway crash severity. Int J Inj Contr Saf Promot 2024; 31:534-550. [PMID: 38768184 DOI: 10.1080/17457300.2024.2351972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 04/27/2024] [Accepted: 05/02/2024] [Indexed: 05/22/2024]
Abstract
Machine learning (ML) models are widely employed for crash severity modelling, yet their interpretability remains underexplored. Interpretation is crucial for comprehending ML results and aiding informed decision-making. This study aims to implement an interpretable ML approach to visualize the impacts of factors on crash severity using 5 years of freeway data from Iran. Methods including classification and regression trees (CART), K-nearest neighbours (KNN), random forest (RF), artificial neural network (ANN) and support vector machines (SVM) were applied, with RF demonstrating superior accuracy, recall, F1-score and ROC. Accumulated local effects (ALE) were utilized for interpretation. Findings suggest that light traffic conditions (volume/capacity < 0.5), with critical values around 0.05 or 0.38, and a higher proportion of large trucks and buses, particularly at 10% and 4%, are associated with severe crashes. Additionally, speeds exceeding 90 km/h, drivers younger than 30 years, rollover crashes, collisions with fixed objects and barriers, nighttime driving and driver fatigue elevate the likelihood of severe crashes.
Affiliation(s)
- Seyed Alireza Samerei
  - School of Civil Engineering, College of Engineering, University of Tehran, Tehran, Iran
- Kayvan Aghabayk
  - School of Civil Engineering, College of Engineering, University of Tehran, Tehran, Iran

2
Zhou Y, Fu C, Jiang X, Yu Q, Liu H. Who might encounter hard-braking while speeding? Analysis for regular speeders using low-frequency taxi trajectories on arterial roads and explainable AI. Accident Analysis and Prevention 2024; 195:107382. [PMID: 37979465 DOI: 10.1016/j.aap.2023.107382] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/07/2023] [Revised: 09/29/2023] [Accepted: 11/13/2023] [Indexed: 11/20/2023]
Abstract
Regular speeders are those who speed repeatedly during a given period. Among their speeding behaviors, some occurring in specific scenarios may cause more hazards to road users. Therefore, there is a need to evaluate the driving risks if the regular speeders have different speeding propensities. This study considers speeding-related hard-braking events (SHEs) as a safety surrogate measure and recognizes the regular speeders who encounter at least one SHE during the study period as risky individuals. To identify speeding behaviors and hard-braking events from low-frequency GPS trajectories, we compare the average travel speed between pairwise adjacent GPS points to the posted speed limit and examine the speed curve and the corresponding travel distance between these GPS points, respectively. Thereafter, a logistic model, XGBoost, and three 1D Convolutional Neural Networks (CNNs) including AlexNet CNN, Mini-AlexNet CNN, and Simple CNN are respectively developed to recognize the regular speeders who encountered SHEs based on their speeding propensities. The proposed Mini-AlexNet CNN achieves a global F1-score of 91% and recall of 90% on the testing data, which are superior to other models. Further, the study uses the Shapley Additive exPlanation (SHAP) framework to visually interpret the contribution of speeding propensities to SHE likelihood. It is found that speeding by 50% or greater for no more than 285 m is the most dangerous kind among all the speeding behaviors. Speeding on roads without bicycle lanes or on roads with roadside parking and excessive accesses increases the probability of encountering SHEs. Based on the analyses, we put forward tailored recommendations that aim to restrict hazard-related speeding behaviors rather than speeding behaviors of all kinds.
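The speeding-identification step described in this abstract (average travel speed between adjacent GPS points compared with the posted limit) can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the function names, the tuple layout of a GPS point, and the use of straight-line (haversine) distance instead of map-matched road distance are all assumptions made here.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    r = 6_371_000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def speeding_segments(points, limit_kmh):
    """Flag segments whose average speed exceeds the posted limit.

    points: list of (timestamp_s, lat, lon) tuples sorted by time
            (hypothetical layout, chosen for this sketch).
    Returns a list of (segment_index, avg_speed_kmh, excess_ratio),
    where excess_ratio = avg_speed / limit - 1 (0.5 means 50% over).
    """
    flagged = []
    for i in range(len(points) - 1):
        t0, lat0, lon0 = points[i]
        t1, lat1, lon1 = points[i + 1]
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicated or out-of-order fixes
        avg_kmh = haversine_m(lat0, lon0, lat1, lon1) / dt * 3.6
        if avg_kmh > limit_kmh:
            flagged.append((i, avg_kmh, avg_kmh / limit_kmh - 1.0))
    return flagged
```

Straight-line distance underestimates the distance actually driven on curved roads, so a sketch like this would tend to miss some speeding segments rather than invent them.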
Affiliation(s)
- Yue Zhou
  - Flight Technology College, Civil Aviation Flight University of China, Guanghan 618307, China
- Chuanyun Fu
  - School of Transportation Science and Engineering, Harbin Institute of Technology, Harbin 150090, China
- Xinguo Jiang
  - School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 611756, China; National United Engineering Laboratory of Integrated and Intelligent Transportation, Southwest Jiaotong University, Chengdu 611756, China; National Engineering Laboratory of Integrated Transportation Big Data Application Technology, Southwest Jiaotong University, Chengdu 611756, China
- Qiong Yu
  - School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 611756, China
- Haiyue Liu
  - School of Transportation and Logistics, Southwest Jiaotong University, Chengdu 611756, China

3
Xiao Y, Lin L, Zhou H, Tan Q, Wang J, Yang Y, Xu Z. Fatal crashes and rare events logistic regression: an exploratory empirical study. Front Public Health 2024; 11:1294338. [PMID: 38249366 PMCID: PMC10796722 DOI: 10.3389/fpubh.2023.1294338] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 11/27/2023] [Indexed: 01/23/2024] Open
Abstract
Objective: Fatal road accidents are statistically rare, posing challenges for accurate estimation through the classic logit model (LM). This study seeks to validate the efficacy of a rare events logistic model (RELM) in enhancing the precision of fatal crash estimations. Methods: Both LM and RELM were employed to examine the relationship between pertinent risk factors and the incidence of fatal crashes. Crash-injury datasets sourced from Hillsborough County, Florida served as the empirical basis for evaluating the performance metrics of both LM and RELM. Results: The analysis revealed that RELM yielded more accurate predictions of fatal crashes compared to LM. Receiver operating characteristic (ROC) curves were constructed, and the area under the curve (AUC) for each model was computed to offer a comparative performance assessment. The empirical evidence notably favored RELM over LM as substantiated by superior AUC values. Conclusion: The study offers empirical validation that RELM is demonstrably more proficient in predicting fatal crashes than the LM, thereby recommending its application for nuanced traffic safety analytics.
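The abstract does not say which rare-events correction the RELM applies. One widely cited ingredient of rare-events logistic regression is the prior correction of the intercept (King and Zeng), used when the estimation sample contains a larger share of events than the population. The sketch below shows only that correction and is an assumption about what such a model may involve, not a description of the study's method; the function names and the numbers in the usage note are hypothetical.

```python
import math

def prior_corrected_intercept(b0, sample_rate, population_rate):
    """Adjust a logit intercept estimated on an event-enriched sample.

    b0:              intercept estimated from the sample
    sample_rate:     share of events (e.g. fatal crashes) in the sample
    population_rate: assumed share of events in the full population
    Correction: b0 - ln[((1 - tau) / tau) * (ybar / (1 - ybar))],
    with tau = population_rate and ybar = sample_rate.
    """
    tau, ybar = population_rate, sample_rate
    return b0 - math.log(((1.0 - tau) / tau) * (ybar / (1.0 - ybar)))

def logistic_probability(intercept, coefs, x):
    """Predicted event probability for feature vector x under a logit model."""
    z = intercept + sum(c * xi for c, xi in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))
```

For example, an intercept of 0.0 fitted on a balanced sample (50% events) with an assumed population rate of 1% is corrected to -ln(99), so the baseline predicted probability drops from 0.5 to 0.01.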
Affiliation(s)
- Yuxie Xiao
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
  - Engineering Consulting Department, Changsha Planning and Design Institute Co., Ltd., Changsha, China
- Lulu Lin
  - School of Public Health, Sun Yat-sen University, Guangzhou, China
- Hanchu Zhou
  - School of Traffic and Transportation Engineering, Central South University, Changsha, China
- Qian Tan
  - Engineering Consulting Department, Changsha Planning and Design Institute Co., Ltd., Changsha, China
- Junjie Wang
  - Institute of Transportation System Science and Engineering, Beijing Jiaotong University, Beijing, China
- Yi Yang
  - School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, China
  - National Engineering Laboratory of Integrated Transportation Big Data Application Technology, Chengdu, China
- Zhongzhi Xu
  - School of Public Health, Sun Yat-sen University, Guangzhou, China

4
Liu D, Li D, Sze NN, Ding H, Song Y. An integrated data- and theory-driven crash severity model. Accident Analysis and Prevention 2023; 193:107282. [PMID: 37722256 DOI: 10.1016/j.aap.2023.107282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Revised: 08/31/2023] [Accepted: 09/01/2023] [Indexed: 09/20/2023]
Abstract
For crash severity modeling, researchers typically view theory-driven models and data-driven models as different or even conflicting approaches. The reason is that data-driven machine-learning models offer good predictability but weak interpretability, while theory-driven models have robust interpretability but moderate predictability. In order to alleviate the tension between them, this study proposes an integrated data- and theory-driven crash-severity model, known as the Embedded Fusion model based on Text Vector Representations (TVR-EF), by leveraging the complementary strengths of both. The model specification consists of two parts. (i) The data-driven component not only mitigates the deficiencies of traditional econometric models, where one-hot encoding is frequently used and makes it impossible to observe semantic relatedness between variable categories, but also enhances the interpretability of the relationship between crash severity and potential influencing factors using the learned embedding weight matrix. (ii) In the theory-driven component, the multinomial logit model is implemented as a 2D-Convolutional Neural Network (2D-CNN) to increase flexibility and decrease dependency on prior knowledge for different crash-severity outcomes. A crash dataset from Guangdong Province, China, is utilized to estimate the TVR-EF model, which is then benchmarked against two traditional econometric models and three widely used machine-learning models. Results indicate that the TVR-EF model not only improves predictive performance but is also easier to interpret.
Affiliation(s)
- Dongjie Liu
  - School of Transportation, Southeast University, Nanjing, Jiangsu 211189, China
- Dawei Li
  - School of Transportation, Southeast University, Nanjing, Jiangsu 211189, China; Jiangsu Key Laboratory of Urban ITS, Southeast University, Nanjing, Jiangsu 211189, China; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Nanjing, Jiangsu 211189, China
- N N Sze
  - Department of Civil and Environmental Engineering, The Hong Kong Polytechnic University, Hong Kong SAR, China
- Hongliang Ding
  - Department of Civil and Environmental Engineering, The Hong Kong Polytechnic University, Hong Kong SAR, China; Institute of Smart City and Intelligent Transportation, Institute of Urban Rail Transportation, Southwest Jiaotong University, Chengdu, Sichuan 611756, China
- Yuchen Song
  - School of Transportation, Southeast University, Nanjing, Jiangsu 211189, China

5
Zuo D, Qian C, Xiao D, Xu X, Wang H. Data-driven crash prediction by injury severity using a recurrent neural network model based on Keras framework. Int J Inj Contr Saf Promot 2023; 30:561-570. [PMID: 37493264 DOI: 10.1080/17457300.2023.2239211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Accepted: 07/18/2023] [Indexed: 07/27/2023]
Abstract
With the development of big data technology and the improvement of deep learning technology, data-driven and machine learning applications have been widely employed. By adopting the data-driven machine learning method, with the help of clustering processing of data sets, a recurrent neural network (RNN) model based on the Keras framework is proposed to predict injury severity in urban areas. First, with crash data from 2014 to 2017 in Nevada, the OPTICS clustering algorithm is employed to extract the crash injury records in Las Vegas. Next, by virtue of Keras' high efficiency and strong scalability, the loss function, activation function and optimizer of the deep learning model are determined to realize the training of the model and the visualization of the training results, and the RNN model is constructed. Finally, on the basis of training and testing data, the model can predict injury severity with high accuracy and high training speed. The results provide an alternative and some potential insights for injury severity prediction.
Affiliation(s)
- Dajie Zuo
  - School of Transportation and Logistics, Southwest Jiaotong University, Chengdu, China
- Cheng Qian
  - Shanghai Municipal Engineering Design Institute (Group) Co. Ltd, Shanghai, China
- Daiquan Xiao
  - School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan, China
- Xuecai Xu
  - School of Civil and Hydraulic Engineering, Huazhong University of Science and Technology, Wuhan, China
- Hui Wang
  - Wuhan Huake Quanda Transport Planning and Design Consulting Co. Ltd, Wuhan, China

6
Elalouf A, Birfir S, Rosenbloom T. Developing machine-learning-based models to diminish the severity of injuries sustained by pedestrians in road traffic incidents. Heliyon 2023; 9:e21371. [PMID: 38027877 PMCID: PMC10665667 DOI: 10.1016/j.heliyon.2023.e21371] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/27/2022] [Revised: 10/16/2023] [Accepted: 10/20/2023] [Indexed: 12/01/2023] Open
Abstract
An essential step in devising measures to improve road safety is road accident prediction. In particular, it is important to identify the risk factors that increase the likelihood of severe injuries in the event of an accident. There are two distinct ways of analyzing data in order to produce predictions: machine learning and statistical methods. This study explores the severity of road traffic injuries sustained by pedestrians through the use of machine-learning methodology. In general, the goal of the statistician is to model and understand the connections between variables, whereas machine learning focuses on more intricate and expansive datasets, with the aim of creating algorithms that can recognize patterns and make predictions without being explicitly programmed. The ability to handle very large datasets constitutes a distinct advantage of machine learning over statistical techniques. In addition, machine-learning models can be adapted to a wide range of data sources and problem domains, and can be utilized for numerous tasks, from image identification to natural language processing. Machine-learning models may be taught to recognize patterns and make predictions automatically, minimizing the need for manual involvement and enabling rapid data processing of enormous quantities of data. The use of new data to retrain or fine-tune a machine-learning model allows the model to adapt to changing conditions and enhances its accuracy over time. Finally, while non-linear interactions between variables can be difficult to predict using conventional statistical techniques, they can be recognized by machine-learning models. The study begins by compiling an inventory of features linked to both the accident and the environment, focusing on those that exert the greatest influence on the severity of pedestrian injuries. The "optimal" algorithm is then chosen based on its superior levels of accuracy, precision, recall, and F1 score. 
The developed model should not be regarded as fixed; it should be updated and retrained on a regular basis using new traffic accident data that mirror the evolving interplay between the road environment, driver characteristics, and pedestrian conduct. Having been constructed using Israeli data, the current model is predictive of injury outcomes within Israel. For broader applicability, the model should undergo retraining and reassessment using traffic accident data from the pertinent country or region.
Affiliation(s)
- Amir Elalouf
  - Bar-Ilan University, Department of Management, Ramat-Gan 52900, Israel
- Slava Birfir
  - Bar-Ilan University, Department of Management, Ramat-Gan 52900, Israel
  - Elbit Systems Company, Haifa 3100401, Israel
- Tova Rosenbloom
  - Bar-Ilan University, Department of Management, Ramat-Gan 52900, Israel

7
Li Y, Yang Z, Xing L, Yuan C, Liu F, Wu D, Yang H. Crash injury severity prediction considering data imbalance: A Wasserstein generative adversarial network with gradient penalty approach. Accident Analysis and Prevention 2023; 192:107271. [PMID: 37659275 DOI: 10.1016/j.aap.2023.107271] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/28/2022] [Revised: 07/29/2023] [Accepted: 08/24/2023] [Indexed: 09/04/2023]
Abstract
For each road crash event, it is necessary to predict its injury severity. However, predicting crash injury severity with imbalanced data frequently results in an ineffective classifier. Due to the rarity of severe injuries in road traffic crashes, crash data are extremely imbalanced among injury severity classes, which makes training prediction models challenging. To achieve interclass balance, it is possible to generate certain minority class samples using data augmentation techniques. Aiming to address the imbalance issue of crash injury severity data, this study applies a novel deep learning method, the Wasserstein generative adversarial network with gradient penalty (WGAN-GP), to investigate a massive amount of crash data, which can generate synthetic injury severity data linked to traffic crashes to rebalance the dataset. To evaluate the effectiveness of the WGAN-GP model, we systematically compare the performance of various commonly used sampling techniques (random under-sampling, random over-sampling, synthetic minority over-sampling technique and adaptive synthetic sampling) with respect to dataset balance and crash injury severity prediction. After rebalancing the dataset, this study categorizes the crash injury severity using logistic regression, multilayer perceptron, random forest, AdaBoost and XGBoost. The AUC, specificity and sensitivity are employed as evaluation indicators to compare the prediction performances. Results demonstrate that sampling techniques can considerably improve the prediction performance of minority classes in an imbalanced dataset, and the combination of XGBoost and WGAN-GP performs best with an AUC of 0.794 and a sensitivity of 0.698. Finally, the interpretability of the model is improved by the explainable machine learning technique SHAP (SHapley Additive exPlanation), allowing for a deeper understanding of the effects of each variable on crash injury severity. 
Findings of this study shed light on the prediction of crash injury severity with data imbalance using data-driven approaches.
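Of the baseline sampling techniques listed in this abstract, random over-sampling is the simplest to illustrate, together with the sensitivity and specificity indicators used for evaluation. The sketch below is a plain-Python illustration under assumed names (`random_oversample`, `sensitivity_specificity`); it does not reproduce the WGAN-GP generator or the study's pipeline.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class. Returns new lists (X, y)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        idx = [i for i, lab in enumerate(y) if lab == label]
        for _ in range(target - n):
            j = rng.choice(idx)  # sample with replacement from this class
            X_out.append(X[j])
            y_out.append(label)
    return X_out, y_out

def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity (recall of the positive class) and specificity
    (recall of the negative class) for a binary problem."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec
```

In practice, resampling of this kind is normally applied to the training split only, so that the test data keep the original class proportions.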
Affiliation(s)
- Ye Li
  - School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, China; Hunan Key Laboratory of Smart Roadway and Cooperative Vehicle-Infrastructure Systems, Changsha University of Science & Technology, Changsha, 410114 Hunan, China
- Zhanhao Yang
  - School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, China
- Lu Xing
  - School of Traffic and Transportation Engineering, Changsha University of Science and Technology, Changsha, Hunan 410114, China
- Chen Yuan
  - School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, China; Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong, China
- Fei Liu
  - School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, China
- Dan Wu
  - School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, China
- Haifei Yang
  - School of Civil and Transportation Engineering, Hohai University, Nanjing, Jiangsu 210098, China

8
Almannaa M, Zawad MN, Moshawah M, Alabduljabbar H. Investigating the effect of road condition and vacation on crash severity using machine learning algorithms. Int J Inj Contr Saf Promot 2023; 30:392-402. [PMID: 37079354 DOI: 10.1080/17457300.2023.2202660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 03/14/2023] [Accepted: 04/10/2023] [Indexed: 04/21/2023]
Abstract
Investigating the contributing factors to traffic crash severity is a demanding topic in research focusing on traffic safety and policies. This research investigates the impact of 16 roadway condition features and vacations (along with the spatial and temporal factors and road geometry) on crash severity for major intra-city roads in Saudi Arabia. We used a crash dataset that covers four years (Oct. 2016 - Feb. 2021) with more than 59,000 crashes. Machine learning algorithms were utilized to predict the crash severity outcome (non-fatal/fatal) for three types of roads: single, multilane, and freeway. Furthermore, features that have a strong impact on crash severity were examined. Results show that only 4 out of 16 road condition variables were found to contribute to crash severity, namely: paints, cat eyes, fence side, and metal cable. Additionally, vacation was found to be a contributing factor to crash severity, meaning that crashes occurring during vacations are more severe than those on non-vacation days.
Affiliation(s)
- Mohammed Almannaa
  - Department of Civil Engineering, College of Engineering, King Saud University, Riyadh, Saudi Arabia
- Md Nabil Zawad
  - Department of Civil Engineering, College of Engineering, King Saud University, Riyadh, Saudi Arabia
- May Moshawah
  - Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
- Haifa Alabduljabbar
  - Department of Computer Science, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia

9
Mohammadpour SI, Khedmati M, Zada MJH. Classification of truck-involved crash severity: Dealing with missing, imbalanced, and high dimensional safety data. PLoS One 2023; 18:e0281901. [PMID: 36947539 PMCID: PMC10032500 DOI: 10.1371/journal.pone.0281901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2022] [Accepted: 02/02/2023] [Indexed: 03/23/2023] Open
Abstract
While the cost of road traffic fatalities in the U.S. surpasses $240 billion a year, the availability of high-resolution datasets allows meticulous investigation of the contributing factors to crash severity. In this paper, the dataset for Trucks Involved in Fatal Accidents in 2010 (TIFA 2010) is utilized to classify the truck-involved crash severity where there exist different issues including missing values, imbalanced classes, and high dimensionality. First, a decision tree-based algorithm, the Synthetic Minority Oversampling Technique (SMOTE), and the Random Forest (RF) feature importance approach are employed for missing value imputation, minority class oversampling, and dimensionality reduction, respectively. Afterward, a variety of classification algorithms, including RF, K-Nearest Neighbors (KNN), Multi-Layer Perceptron (MLP), Gradient-Boosted Decision Trees (GBDT), and Support Vector Machine (SVM) are developed to reveal the influence of the introduced data preprocessing framework on the output quality of ML classifiers. The results show that the GBDT model outperforms all the other competing algorithms for the non-preprocessed crash data based on the G-mean performance measure, but the RF makes the most accurate prediction for the treated dataset. This finding indicates that after the feature selection is conducted to alleviate the computational cost of the machine learning algorithms, bagging (bootstrap aggregating) of decision trees in RF leads to a better model than boosting them via GBDT. Besides, the adopted feature importance approach decreases the overall accuracy by only up to 5% in most of the estimated models. Moreover, the worst class recall value of the RF algorithm without prior oversampling is only 34.4%, compared to the corresponding value of 90.3% in the up-sampled model, which validates the proposed multi-step preprocessing scheme. 
This study also identifies the temporal and spatial (roadway) attributes, as well as crash characteristics, and Emergency Medical Service (EMS) as the most critical factors in truck crash severity.
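The G-mean measure and the "worst class recall" mentioned in this abstract are both derived from per-class recall. A minimal sketch, assuming the common definition of G-mean as the geometric mean of the per-class recalls (the abstract does not spell out its exact formula); the function names are hypothetical.

```python
import math

def per_class_recall(y_true, y_pred):
    """Recall for each class that appears in y_true, as a dict."""
    recalls = {}
    for label in sorted(set(y_true)):
        total = sum(1 for t in y_true if t == label)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        recalls[label] = hits / total
    return recalls

def g_mean(y_true, y_pred):
    """Geometric mean of per-class recalls; zero if any class is never recovered."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return math.prod(recalls) ** (1.0 / len(recalls))

def worst_class_recall(y_true, y_pred):
    """Smallest per-class recall, the figure that exposes a neglected minority class."""
    return min(per_class_recall(y_true, y_pred).values())
```

With four majority cases all classified correctly and one of two minority cases missed, overall accuracy is 5/6 while the G-mean is about 0.707 and the worst class recall is 0.5, which is why these measures are preferred over accuracy for imbalanced crash data.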
Affiliation(s)
- Majid Khedmati
  - Department of Industrial Engineering, Sharif University of Technology, Tehran, Iran

10
Se C, Champahom T, Jomnonkwao S, Ratanavaraha V. Motorcyclist injury severity analysis: a comparison of Artificial Neural Networks and random parameter model with heterogeneity in means and variances. Int J Inj Contr Saf Promot 2022; 29:500-515. [PMID: 35666153 DOI: 10.1080/17457300.2022.2081985] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 10/18/2022]
Abstract
In Thailand, the motorcyclist mortality rate is steadily on the rise and remains a serious concern for highway administrators and a burden on both the economy and local people. Using motorcycle-crash data in Thailand from 2016 to 2019, this study empirically employed and compared the Artificial Neural Networks (ANN) model and the random parameters binary probit model with heterogeneity in means and variances (RPBPHM) to explore the effects of a wide range of associated risk characteristics on the severity outcomes of motorcyclists. Study results revealed that the probabilities of injury or fatal crashes increase for crashes that involve male riders, riding with a pillion, speeding, improper overtaking, riders under the influence of alcohol, fatigued riders, undivided roads and so on. The probability of a non-injury crash increases for crashes on main or frontage traffic lanes, four-lane roads, concrete roads, during rain, involving collision with other motorcycles, rear-end crashes, sideswipe crashes, single-motorcycle crashes and crashes within urban areas. The RPBPHM models were found to outperform the ANN model (quadratic support vector machine) in all performance metrics. The findings could potentially assist policymakers, safety professionals, practitioners, trainers, government agencies or highway designers in future planning and serve as guidance for mitigation policies directed at safety improvement for motorcyclists.
Affiliation(s)
- Chamroeun Se
  - School of Transportation Engineering, Institute of Engineering, Suranaree University of Technology, Nakhon Ratchasima, Thailand
- Thanapong Champahom
  - Department of Management, Faculty of Business Administration, Rajamangala University of Technology Isan, Nakhon Ratchasima, Thailand
- Sajjakaj Jomnonkwao
  - School of Transportation Engineering, Institute of Engineering, Suranaree University of Technology, Nakhon Ratchasima, Thailand
- Vatanavongs Ratanavaraha
  - School of Transportation Engineering, Institute of Engineering, Suranaree University of Technology, Nakhon Ratchasima, Thailand

11
Sattar K, Chikh Oughali F, Assi K, Ratrout N, Jamal A, Masiur Rahman S. Transparent deep machine learning framework for predicting traffic crash severity. Neural Comput Appl 2022. [DOI: 10.1007/s00521-022-07769-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]

12
AlKheder S, AlRukaibi F, Aiash A. Analysis of risk factors affecting traffic accident injury in United Arab Emirates (UAE). Eur J Trauma Emerg Surg 2022; 48:4823-4835. [PMID: 35674805 DOI: 10.1007/s00068-022-02010-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2021] [Accepted: 05/15/2022] [Indexed: 11/03/2022]
Abstract
The mortality and severe injuries due to traffic accidents in the United Arab Emirates (UAE) heighten the need for a study that can identify the consequential risk factors. This study utilized 5,740 police reports of traffic accidents that occurred in Abu Dhabi, UAE, between 2008 and 2013. A multinomial logit regression model was applied to determine the significant factors among the 14 potential risk factors that were used in this study. The dependent variable was the level of injury, which consisted of four categories: slight injury, medium injury, severe injury, and fatal injury. The results showed that being a pedestrian, an unused seatbelt, roads with four or more lanes, male casualties, a speed limit of 100 km/h or higher, and casualties older than 60 years were the factors that increase the probability of being involved in a fatal traffic accident. In contrast, rear-end collisions and intersections had a lower probability of causing fatal injury. Then, the eight significant predictors were included in a neural network to compare the performance of both methods and to identify the normalized importance values for the significant independent variables. The neural network proved to be more accurate in general than traditional regression models such as the multinomial logit model.
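A multinomial logit model of the kind named in this abstract turns one linear utility per injury category into category probabilities through a softmax, with one category serving as the reference (all coefficients zero). The sketch below shows only that mapping; the function name, the dictionary layout, and every coefficient value are hypothetical and are not estimates from the study.

```python
import math

def mnl_probabilities(x, coefs):
    """Category probabilities under a multinomial logit model.

    x:     list of predictor values
    coefs: dict mapping category -> (intercept, list of weights);
           the reference category carries zeros throughout.
    """
    utilities = {
        cat: b0 + sum(w * xi for w, xi in zip(weights, x))
        for cat, (b0, weights) in coefs.items()
    }
    top = max(utilities.values())  # subtract the maximum for numerical stability
    exps = {cat: math.exp(u - top) for cat, u in utilities.items()}
    total = sum(exps.values())
    return {cat: e / total for cat, e in exps.items()}

# Hypothetical coefficients for a single binary predictor
# (1 = casualty was a pedestrian); "slight" is the reference category.
example_coefs = {
    "slight": (0.0, [0.0]),
    "medium": (-0.5, [0.2]),
    "severe": (-1.5, [0.6]),
    "fatal": (-3.0, [1.2]),
}
example_probs = mnl_probabilities([1.0], example_coefs)
```

A positive weight for a category raises its odds relative to the reference category by a factor of exp(weight) per unit of the predictor, which is how statements such as "increases the probability of a fatal outcome" are usually read off such a model.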
Affiliation(s)
- Sharaf AlKheder
  - Civil Engineering Department, College of Engineering and Petroleum, Kuwait University, Kuwait City, Kuwait
- Fahad AlRukaibi
  - Civil Engineering Department, College of Engineering and Petroleum, Kuwait University, Kuwait City, Kuwait
- Ahmad Aiash
  - ETSECCPB-School of Civil Engineering of Barcelona, Universitat Politècnica de Catalunya, Barcelona, Spain

13
An Injury-Severity-Prediction-Driven Accident Prevention System. Sustainability 2022. [DOI: 10.3390/su14116569] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2023]
Abstract
Traffic accidents are inevitable events that occur unexpectedly and unintentionally. Therefore, analyzing traffic data is essential to prevent fatal accidents. Traffic data analysis provides insights into significant factors and driver behavioral patterns causing accidents. Combining these patterns and the prediction model into an accident prevention system can assist in reducing and preventing traffic accidents. This study applied various machine learning models, including neural network, ordinal regression, decision tree, support vector machines, and logistic regression, to obtain a robust prediction model for injury severity. The trained model provides timely and accurate predictions of accident occurrence and injury severity using real-world traffic accident datasets. We proposed an informative negative data generator using feature weights derived from multinomial logit regression to balance the non-fatal accident data. Our aim is to resolve the bias in favor of the majority class as well as to improve performance. We evaluated the overall and class-level performance of the machine learning models based on accuracy and mean squared error scores. Neural networks with three hidden layers outperformed the other models with MSE scores of 0.254 ± 0.038 and 0.173 ± 0.016 for two different datasets. A neural network, which provides more accurate and reliable results, should be integrated into the accident prevention system.
|
14
|
Wen X, Xie Y, Jiang L, Li Y, Ge T. On the interpretability of machine learning methods in crash frequency modeling and crash modification factor development. ACCIDENT; ANALYSIS AND PREVENTION 2022; 168:106617. [PMID: 35202941 DOI: 10.1016/j.aap.2022.106617] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/13/2021] [Revised: 01/29/2022] [Accepted: 02/15/2022] [Indexed: 06/14/2023]
Abstract
Machine learning (ML) model interpretability has attracted much attention recently given the promising performance of ML methods in crash frequency studies. Extracting accurate relationships between risk factors and crash frequency is important for understanding the causal effects of risk factors and developing safety countermeasures. However, no study comprehensively summarizes ML model interpretation methods and provides guidance for safety researchers and practitioners. This research aims to fill this gap. Model-based and post-hoc ML interpretation methods are critically evaluated and compared to study their suitability in crash frequency modeling. These methods include classification and regression tree (CART), multivariate adaptive regression splines (MARS), Local Interpretable Model-agnostic Explanations (LIME), Local Sensitivity Analysis (LSA), Partial Dependence Plots (PDP), Global Sensitivity Analysis (GSA), and SHapley Additive exPlanations (SHAP). Model-based interpretation methods cannot reveal the detailed interaction relationships among risk factors. LIME can only be used to analyze the effects of a risk factor at the prediction level. LSA and PDP assume that different risk factors are independently distributed. Both GSA and SHAP can account for the potential correlation among risk factors. However, only SHAP can visualize the detailed relationships between crash outcomes and risk factors. This study also demonstrates the potential and benefits of using ML and SHAP to derive Crash Modification Factors (CMF). Finally, it is emphasized that statistical and ML models may not directly differentiate causation from correlation. Understanding the differences between them is critical for developing reliable safety countermeasures.
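To make the Shapley-value idea behind SHAP concrete, the following is a minimal sketch of exact Shapley attribution for a single prediction. It is not the SHAP library and not the authors' procedure: the function `shapley_values`, the toy model, its feature names, and the baseline input are all hypothetical, and "absent" features are simply replaced by a baseline value, which is one common simplification.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one prediction, by enumerating feature orderings.

    Features not yet "switched on" take their value from `baseline`
    (an assumption of this sketch; other value functions are possible).
    """
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)            # start from the baseline input
        previous = predict(current)
        for j in order:
            current[j] = instance[j]        # switch feature j to its observed value
            value = predict(current)
            totals[j] += value - previous   # marginal contribution of feature j
            previous = value
    return [t / len(orderings) for t in totals]


# Hypothetical crash-frequency model with an interaction between two risk factors.
def toy_model(x):
    traffic, curvature, lighting = x
    return 2.0 * traffic + 1.0 * curvature + 0.5 * traffic * curvature - 1.5 * lighting


instance = [3.0, 2.0, 1.0]   # illustrative site
baseline = [1.0, 0.0, 0.0]   # illustrative reference site
phi = shapley_values(toy_model, instance, baseline)
print(phi)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is shared between the two features that produce it. Enumerating all orderings is only feasible for a handful of features; practical tools rely on approximations.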
Affiliation(s)
- Xiao Wen
- Department of Civil and Environmental Engineering, University of Massachusetts Lowell, United States
| | - Yuanchang Xie
- Department of Civil and Environmental Engineering, University of Massachusetts Lowell, United States.
| | - Liming Jiang
- Department of Civil and Environmental Engineering, University of Massachusetts Lowell, United States
| | - Yan Li
- Department of Computer Science, University of Massachusetts Lowell, United States
| | - Tingjian Ge
- Department of Computer Science, University of Massachusetts Lowell, United States
| |
|
15
|
Santos K, Dias JP, Amado C. A literature review of machine learning algorithms for crash injury severity prediction. JOURNAL OF SAFETY RESEARCH 2022; 80:254-269. [PMID: 35249605 DOI: 10.1016/j.jsr.2021.12.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/10/2021] [Revised: 07/21/2021] [Accepted: 12/08/2021] [Indexed: 06/14/2023]
Abstract
INTRODUCTION: Road traffic crashes represent a major public health concern, so it is important to understand the factors associated with increased injury severity among those involved in a road crash. Determining such factors is essential to help decision making in road safety management, improving road safety, and reducing the severity of future crashes. METHOD: This paper presents a recent literature review of the methods that have been applied to road crash injury severity modeling. It includes 56 studies from 2001 to 2021 that consider more than 20 different statistical or machine learning techniques. RESULTS: Random Forest was the algorithm with the best results, achieving the best performance in 70% of the times that it was applied and in 29% of all studies. Support Vector Machine and Decision Tree achieved the best performance in 53% and 31% of the times they were applied and in 16% and 14% of all studies, respectively. Bayesian Networks and K-Nearest Neighbors achieved the best performance in 67% and 40% of the times that they were used but only achieved the best performance in 4% and 7% of all the studies analyzed, respectively. CONCLUSIONS: At this point, Random Forest proved to be a good approach for road traffic crash injury severity prediction, followed by Support Vector Machine, Decision Tree, and K-Nearest Neighbor. However, there is still a lot of room in this area to explore other techniques that can best suit this purpose, as not only the model's performance should be considered but also causality issues, unobserved heterogeneity, and temporal instability. Practical Applications: This review enables researchers to understand the recent techniques applied in the analysis of injury severity modeling, and the ones that achieved the best performance results. Based on the reviewed studies, challenges and future research directions are presented.
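The two percentages reported for each algorithm use different denominators: the number of studies that applied the algorithm, and the total number of studies reviewed. A small sketch of the arithmetic follows. The counts for Random Forest (applied 23 times, best 16 times) are illustrative values chosen only because they round to the reported 70% and 29% with 56 studies; they are not taken from the paper, and the function name is hypothetical.

```python
def best_performance_rates(times_applied, times_best, total_studies):
    """Return (share of applications won, share of all studies won)."""
    share_when_applied = times_best / times_applied
    share_of_all_studies = times_best / total_studies
    return share_when_applied, share_of_all_studies


# Illustrative counts only (assumed, not reported in the abstract).
when_applied, of_all = best_performance_rates(
    times_applied=23, times_best=16, total_studies=56
)
print(f"best when applied: {when_applied:.1%}, best across all studies: {of_all:.1%}")
```

An algorithm that is rarely applied can therefore score well on the first ratio and poorly on the second, which is the pattern the abstract describes for Bayesian Networks.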
Affiliation(s)
- Kenny Santos
- IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal.
| | - João P Dias
- IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal.
| | - Conceição Amado
- Department of Mathematics and CEMAT, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
| |
|
16
|
Wu K, Zhang F, Zhang YH, Yan Y, Butt SI. Surrogate-adjoint refine based global optimization method combining with multi-stage fuzzy clustering space reduction strategy for expensive problems. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]
|
17
|
Ma Z, Mei G, Cuomo S. An analytic framework using deep learning for prediction of traffic accident injury severity based on contributing factors. ACCIDENT; ANALYSIS AND PREVENTION 2021; 160:106322. [PMID: 34365042 DOI: 10.1016/j.aap.2021.106322] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/30/2021] [Revised: 07/22/2021] [Accepted: 07/23/2021] [Indexed: 06/13/2023]
Abstract
Vulnerable road users (VRUs) are exposed to the highest risk in the road traffic environment. Analyzing contributing factors that affect injury severity facilitates injury severity prediction and further application in developing countermeasures to guarantee VRU safety. Recently, machine learning approaches have been introduced, in which analyses tend to be one-sided and may ignore important information. To solve this problem, this paper proposes a comprehensive analytic framework that employs a deep learning model referred to as the stacked sparse autoencoder (SSAE) to predict the injury severity of traffic accidents based on contributing factors. The essential idea of the method is to integrate various analyses into an analytical framework that performs corresponding data processing and analysis by different machine learning approaches. In the proposed method, first, we utilize a machine learning approach (i.e., CatBoost) to analyze the importance and dependence of the contributing factors to injury severity and remove low correlation factors; second, according to the geographical information, we classify the data into different classes by utilizing a machine learning approach (i.e., k-means clustering); third, by employing high correlation factors, we employ an SSAE-based deep learning model to perform injury severity prediction in each data class. By experiments with a real-world traffic accident dataset, we demonstrated the effectiveness and applicability of the framework. Specifically, (1) the importance and dependence of contributing factors were obtained by CatBoost and the Shapley value, and (2) the SSAE-based deep learning model achieved the best performance compared to other baseline models. The proposed analytic framework can also be utilized for other accident data for severity or other risk indicator analyses involving VRU safety.
Affiliation(s)
- Zhengjing Ma
- School of Engineering and Technology, China University of Geosciences (Beijing), Beijing 100083, China
| | - Gang Mei
- School of Engineering and Technology, China University of Geosciences (Beijing), Beijing 100083, China.
| | - Salvatore Cuomo
- Department of Mathematics and Applications "R. Caccioppoli", University of Naples Federico II, Italy
| |
|
18
|
Using Hybrid Artificial Intelligence and Evolutionary Optimization Algorithms for Estimating Soybean Yield and Fresh Biomass Using Hyperspectral Vegetation Indices. REMOTE SENSING 2021. [DOI: 10.3390/rs13132555] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
Recent advances in high-throughput field phenotyping, combined with sophisticated big data analysis methods, have provided plant breeders with unprecedented tools for a better prediction of important agronomic traits, such as yield and fresh biomass (FBIO), at early growth stages. This study aimed to demonstrate the potential use of 35 selected hyperspectral vegetation indices (HVI), collected at the R5 growth stage, for predicting soybean seed yield and FBIO. Two artificial intelligence algorithms, ensemble-bagging (EB) and deep neural network (DNN), were used to predict soybean seed yield and FBIO using HVI. Considering HVI as input variables, coefficients of determination (R²) of 0.76 and 0.77 for yield and 0.91 and 0.89 for FBIO were obtained using DNN and EB, respectively. In this study, we also used hybrid DNN-SPEA2 to estimate the optimum HVI values in soybeans with maximized yield and FBIO production. In addition, to identify the most informative HVI in predicting yield and FBIO, the recursive feature elimination wrapper method was used, and the top-ranking HVI were determined to be associated with the red (670 nm) and near-infrared (800 nm) regions. Overall, this study introduced hybrid DNN-SPEA2 as a robust mathematical tool for optimizing and using informative HVI for estimating soybean seed yield and FBIO at early growth stages, which can be employed by soybean breeders for discriminating superior genotypes in large breeding populations.
|
19
|
Wang Q, Gan S, Chen W, Li Q, Nie B. A data-driven, kinematic feature-based, near real-time algorithm for injury severity prediction of vehicle occupants. ACCIDENT; ANALYSIS AND PREVENTION 2021; 156:106149. [PMID: 33933716 DOI: 10.1016/j.aap.2021.106149] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/07/2021] [Revised: 03/14/2021] [Accepted: 04/15/2021] [Indexed: 06/12/2023]
Abstract
Accurate real-time prediction of occupant injury severity in unavoidable collision scenarios is a prerequisite for enhancing road traffic safety with the development of highly automated vehicles. Specifically, a safety prediction model provides a decision reference for the trajectory planning system in the pre-crash phase and the adaptive restraint system in the in-crash phase. The main goal of the current study is to construct a data-driven, vehicle kinematic feature-based model to realize accurate and near real-time prediction of in-vehicle occupant injury severity. A large-scale numerical database was established focusing on occupant kinetics. A first-step deep-learning model was established to predict occupant kinetics and injury severity using a convolutional neural network (CNN). To reduce the computational time for real-time application, the second step was to extract simplified kinematic features from vehicle crash pulses via a feature extraction method, which was inspired by a visualization approach applied to the CNN-based model. The features were incorporated with a low-complexity machine-learning algorithm and achieved satisfactory accuracy (85.4 % on the numerical database, 78.7 % on a 192-case real-world dataset) and decreased computational time (1.2 ± 0.4 ms) on the prediction tasks. This study demonstrated the feasibility of using data-driven and feature-based approaches to achieve accurate injury risk estimation prior to collision. The proposed model is expected to provide a decision reference for integrated safety systems in the next generation of automated vehicles.
Affiliation(s)
- Qingfan Wang
- State Key Lab of Automotive Safety and Energy, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China
| | - Shun Gan
- State Key Lab of Automotive Safety and Energy, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China
| | - Wentao Chen
- State Key Lab of Automotive Safety and Energy, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China
| | - Quan Li
- State Key Lab of Automotive Safety and Energy, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China
| | - Bingbing Nie
- State Key Lab of Automotive Safety and Energy, School of Vehicle and Mobility, Tsinghua University, Beijing, 100084, China.
| |
|
20
|
Thapa D, Mishra S. Using worker's naturalistic response to determine and analyze work zone crashes in the presence of work zone intrusion alert systems. ACCIDENT; ANALYSIS AND PREVENTION 2021; 156:106125. [PMID: 33878572 DOI: 10.1016/j.aap.2021.106125] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/24/2020] [Revised: 03/08/2021] [Accepted: 04/02/2021] [Indexed: 06/12/2023]
Abstract
Work zone Intrusion Alert Systems (WZIAS) are alert mechanisms that detect vehicles intruding into a work zone and alert workers to them. These systems predominantly employ two components: sensors placed near the work zone perimeter that detect intrusions, and alarms placed close to or carried by the workers that alert them. This study investigates the association between the layout of these components for three WZIAS and work zone crashes, based on worker reaction. Also, the key determinants of work zone crashes in the presence of the WZIAS are identified using survival analysis. The ideal deployment strategy and use case scenarios for the three WZIAS are presented based on the findings of the study. The systems were subjected to rigorous testing that emulated intrusions to record worker reaction and determine occurrence of crashes. Analysis of the results indicates that the key determinants of work zone crashes are speed of the intruding vehicle, distance between the sensor and worker, and accuracy of a system in detecting intrusions and alerting workers. Results from field experiments suggest that identification of appropriate use cases for WZIAS is necessary to ensure they work effectively. Based on the findings from this study, it is suggested that current guidelines on work zones be modified to standardize WZIAS setup.
Affiliation(s)
- Diwas Thapa
- Department of Civil Engineering, University of Memphis, Memphis, TN, 38152, United States.
| | - Sabyasachee Mishra
- Department of Civil Engineering, University of Memphis, Memphis, TN, 38152, United States.
| |
|
21
|
Montella A, Mauriello F, Pernetti M, Rella Riccardi M. Rule discovery to identify patterns contributing to overrepresentation and severity of run-off-the-road crashes. ACCIDENT; ANALYSIS AND PREVENTION 2021; 155:106119. [PMID: 33848813 DOI: 10.1016/j.aap.2021.106119] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/23/2020] [Revised: 03/04/2021] [Accepted: 03/30/2021] [Indexed: 06/12/2023]
Abstract
The main objective of this paper was to analyse the roadway, environmental, and driver-related factors associated with an overrepresentation of frequency and severity of run-off-the-road (ROR) crashes. The data used in this study refer to the 6167 crashes that occurred on the Naples-Candela section of the A16 motorway, Italy, in the period from 2001 to 2011. The analysis was carried out using the rule discovery technique owing to its ability to extract previously unknown and indistinguishable knowledge from large amounts of data by investigating patterns that occur together in a given event. The rules were filtered by support, confidence, and lift, and validated by the lift increase criterion. A two-step analysis was carried out. In the first step, rules discovering factors contributing to ROR crashes were identified. In the second step, studying only ROR crashes, rules discovering factors contributing to severe and fatal injury (KSI) crashes were identified. As a result, 94 significant rules for ROR crashes and 129 significant rules for KSI crashes were identified. These rules represent several combinations of geometric design, roadside, barrier performance, crash dynamic, vehicle, environmental and drivers' characteristics associated with an overrepresentation of frequency and severity of ROR crashes. From the methodological point of view, study results show that the Apriori algorithm was effective in providing new information which was previously hidden in the data. Finally, several countermeasures to solve or mitigate the safety issues identified in this study were discussed. It is worthwhile to observe that the study showed a combination of factors contributing to the overrepresentation of frequency and severity of ROR crashes. Consequently, the implementation of a combination of countermeasures is recommended.
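The three filters named in the abstract (support, confidence, lift) have standard definitions for an association rule "antecedent → consequent". The sketch below computes them on a handful of made-up crash records; the records, item labels, and the function name `rule_metrics` are hypothetical and do not come from the A16 dataset, and the thresholds and lift increase criterion used in the study are not reproduced here.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent.

    Assumes at least one record contains the antecedent and at least one
    contains the consequent (otherwise the ratios are undefined).
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_both / n                 # share of records matching the whole rule
    confidence = n_both / n_a            # P(consequent | antecedent)
    lift = confidence / (n_c / n)        # confidence relative to the base rate
    return support, confidence, lift


# Made-up crash records, each a set of attribute labels.
crashes = [
    {"night", "curve", "ROR"},
    {"night", "curve", "ROR"},
    {"night", "straight", "other"},
    {"day", "curve", "ROR"},
    {"day", "straight", "other"},
    {"day", "straight", "other"},
    {"day", "curve", "other"},
    {"night", "curve", "ROR"},
]
support, confidence, lift = rule_metrics(crashes, {"night", "curve"}, {"ROR"})
print(support, confidence, lift)
```

A lift above 1 means the consequent is overrepresented among records matching the antecedent, which is the sense of "overrepresentation" used for ROR and KSI crashes above.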
Affiliation(s)
- Alfonso Montella
- University of Naples Federico II, Department of Civil, Architectural and Environmental Engineering, Via Claudio 21, 80125, Naples, Italy.
| | - Filomena Mauriello
- University of Naples Federico II, Department of Civil, Architectural and Environmental Engineering, Via Claudio 21, 80125, Naples, Italy.
| | - Mariano Pernetti
- University of Campania Luigi Vanvitelli, Department of Engineering, Via Roma 29, 81031, Aversa, CE, Italy.
| | - Maria Rella Riccardi
- University of Naples Federico II, Department of Civil, Architectural and Environmental Engineering, Via Claudio 21, 80125, Naples, Italy.
| |
|
22
|
Hosseinzadeh A, Moeinaddini A, Ghasemzadeh A. Investigating factors affecting severity of large truck-involved crashes: Comparison of the SVM and random parameter logit model. JOURNAL OF SAFETY RESEARCH 2021; 77:151-160. [PMID: 34092305 DOI: 10.1016/j.jsr.2021.02.012] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/20/2020] [Revised: 12/08/2020] [Accepted: 02/22/2021] [Indexed: 06/12/2023]
Abstract
INTRODUCTION: Reducing the severity of crashes is a top priority for safety researchers due to its impact on saving human lives. Because of safety concerns posed by large trucks and the high rate of fatal large truck-involved crashes, an exploration into large truck-involved crashes could help determine factors that are influential in crash severity. The current study focuses on large truck-involved crashes to predict factors influencing crash injury severity. METHOD: Two techniques have been utilized: Random Parameter Binary Logit (RPBL) and Support Vector Machine (SVM). Models have been developed to estimate: (1) multivehicle (MV) truck-involved crashes, in which large truck drivers are at fault, (2) MV truck-involved crashes, in which large truck drivers are not at fault, and (3) single-vehicle (SV) large truck crashes. RESULTS: Fatigue and deviation to the left were found to be the most important contributing factors that lead to fatal crashes when the large truck driver is at fault. Outcomes show that there are differences among significant factors between RPBL and SVM. For instance, unsafe lane-changing was significant in all three categories in RPBL, but only in SV large truck crashes in SVM. CONCLUSIONS: The outcomes showed the importance of complementary approaches that incorporate both parametric RPBL and non-parametric SVM to identify the main contributing factors affecting the severity of large truck-involved crashes. Also, the results highlighted the importance of categorization based on the at-fault party. Practical Applications: Unrealistic schedules and expectations of trucking companies can cause excessive stress for large truck drivers, which could lead to further neglect of their fatigue. Enacting and enforcing comprehensive regulations regarding large truck drivers' working schedules and direct and constant surveillance by authorities would significantly decrease large truck-involved crashes.
Affiliation(s)
- Aryan Hosseinzadeh
- Department of Civil and Environmental Engineering, University of Louisville, Louisville, KY 40292, United States.
| | - Amin Moeinaddini
- Department of Civil and Environmental Engineering, Amirkabir University of Technology, Tehran, Iran
| | - Ali Ghasemzadeh
- Department of Civil and Architectural Engineering, University of Wyoming, Laramie, WY 82071, United States
| |
|
23
|
Jamal A, Zahid M, Tauhidur Rahman M, Al-Ahmadi HM, Almoshaogeh M, Farooq D, Ahmad M. Injury severity prediction of traffic crashes with ensemble machine learning techniques: a comparative study. Int J Inj Contr Saf Promot 2021; 28:408-427. [PMID: 34060410 DOI: 10.1080/17457300.2021.1928233] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
A better understanding of injury severity risk factors is fundamental to improving crash prediction and effective implementation of appropriate mitigation strategies. Traditional statistical models widely used in this regard have predefined correlation and intrinsic assumptions, which, if flouted, may yield biased predictions. The present study investigates the possibility of using the eXtreme Gradient Boosting (XGBoost) model compared with a few traditional machine learning algorithms (logistic regression, random forest, and decision tree) for crash injury severity analysis. The data used in this study were obtained from the traffic safety department, ministry of transport (MOT) at Riyadh, KSA, and contain 13,546 motor vehicle collisions along 15 rural highways reported between January 2017 and December 2019. Empirical results obtained using k-fold cross-validation (k = 10) for various performance metrics showed that the XGBoost technique outperformed other models in terms of the collective predictive performance as well as injury severity individual class accuracies. XGBoost feature importance analysis indicated that collision type, weather status, road surface conditions, on-site damage type, lighting conditions, and vehicle type are the few sensitive variables in predicting the crash injury severity outcome. Finally, a comparative analysis of XGBoost based on different performance statistics showed that our model outperformed most previous studies.
Affiliation(s)
- Arshad Jamal
- Department of Civil and Environmental Engineering, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia
| | - Muhammad Zahid
- College of Metropolitan Transportation, Beijing University of Technology, Beijing, China
| | - Muhammad Tauhidur Rahman
- Department of City and Regional Planning, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia
| | - Hassan M Al-Ahmadi
- Department of Civil and Environmental Engineering, King Fahd University of Petroleum & Minerals, Dhahran, Saudi Arabia
| | - Meshal Almoshaogeh
- Department of Civil Engineering, College of Engineering, Qassim University, Buraydah, Qassim, Saudi Arabia
| | - Danish Farooq
- Department of Transport Technology and Economics, Budapest University of Technology and Economics, Budapest, Hungary.,Department of Civil Engineering, University of Engineering and Technology Peshawar (Bannu Campus), Peshawar, Pakistan
| | - Mahmood Ahmad
- Department of Civil Engineering, University of Engineering and Technology Peshawar (Bannu Campus), Peshawar, Pakistan
| |
|
24
|
Wang X, Qu Z, Song X, Bai Q, Pan Z, Li H. Incorporating accident liability into crash risk analysis: A multidimensional risk source approach. ACCIDENT; ANALYSIS AND PREVENTION 2021; 153:106035. [PMID: 33607319 DOI: 10.1016/j.aap.2021.106035] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/04/2020] [Revised: 01/13/2021] [Accepted: 02/07/2021] [Indexed: 06/12/2023]
Abstract
In the field of traffic safety, the occurrence of accidents remains a cause of concern for road regulators as well as users. Exploring the risk factors that induce accidents and quantifying the accident risk will not only benefit the prevention and control of traffic accidents but also assist in developing effective risk propagation models for road accidents. This study uses detailed accident record data to mine the risk factors affecting the occurrence of accidents, and to quantify the accident risk under combinations of risk factors. First, by reviewing relevant literature and analyzing historical accidents, we construct a multi-dimension characterization framework of risk factors with a bi-level structure. The Human Factors Analysis and Classification System (HFACS) is applied to supplement and improve the framework. Next, under this framework, we identify the risk factors in traffic accident records, and analyze the statistical characteristics at the level of risk sources and risk characteristics. Then, the concept of accident liability weight is proposed to measure the impact of risk factors on accident occurrence. Through the liability affirmation of risk factors, the accident probabilities are updated. Last, we establish an accident risk quantify model (ARQM) based on the mean mutual information to compare the likelihood of accidents in different scenarios. In addition, we compare the accident probability and risk under equivalent liability and liability affirmation, and give some fundamental ideas regarding how to effectively prevent accidents.
Affiliation(s)
- Xin Wang
- Department of Transportation, Jilin University, Changchun, 130022, China.
| | - Zhaowei Qu
- Department of Transportation, Jilin University, Changchun, 130022, China.
| | - Xianmin Song
- Department of Transportation, Jilin University, Changchun, 130022, China.
| | - Qiaowen Bai
- Department of Transportation, Jilin University, Changchun, 130022, China
| | - Zhaotian Pan
- Department of Transportation, Jilin University, Changchun, 130022, China
| | - Haitao Li
- Department of Transportation, Jilin University, Changchun, 130022, China
| |
|
25
|
Zhang X, Wen H, Yamamoto T, Zeng Q. Investigating hazardous factors affecting freeway crash injury severity incorporating real-time weather data: Using a Bayesian multinomial logit model with conditional autoregressive priors. JOURNAL OF SAFETY RESEARCH 2021; 76:248-255. [PMID: 33653556 DOI: 10.1016/j.jsr.2020.12.014] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/09/2020] [Revised: 09/22/2020] [Accepted: 12/16/2020] [Indexed: 06/12/2023]
Abstract
INTRODUCTION: It has been demonstrated that weather conditions have significant impacts on freeway safety. However, when employing an econometric model to examine freeway crash injury severity, most of the existing studies tend to categorize several different adverse weather conditions such as rainy, snowy, and windy conditions into one category, "adverse weather," which might lead to a large amount of information loss and estimation bias. Hence, to overcome this issue, real-time weather data, the value of meteorological elements when crashes occurred, are incorporated into the dataset for freeway crash injury analysis in this study. METHODS: Due to the possible existence of spatial correlations in freeway crash injury data, this study presents a new method, the spatial multinomial logit (SMNL) model, to consider the spatial effects in the framework of the multinomial logit (MNL) model. In the SMNL model, the Gaussian conditional autoregressive (CAR) prior is adopted to capture the spatial correlation. In this study, the model results of the SMNL model are compared with the model results of the traditional multinomial logit (MNL) model. In addition, Bayesian inference is adopted to estimate the parameters of these two models. RESULT: The result of the SMNL model shows the significance of the spatial terms, which demonstrates the existence of spatial correlation. In addition, the SMNL model has a better model fitting ability than the MNL model. Through the parameter estimate results, risk factors such as vertical grade, visibility, emergency medical services (EMS) response time, and vehicle type have significant effects on freeway injury severity. Practical Application: According to the results, corresponding countermeasures for freeway roadway design, traffic management, and vehicle design are proposed to improve freeway safety. For example, steep slopes should be avoided if possible, and in-lane rumble strips should be recommended for steep down-slope segments. Besides, traffic volume proportion of large vehicles should be limited when the wind speed exceeds a certain grade.
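For orientation, a multinomial logit model with a Gaussian CAR spatial term is commonly written in the generic form below. This is a textbook-style sketch, not the specification reported in the paper: the symbols are assumptions introduced here, with $\mathbf{x}_i$ the covariates of crash $i$, $\boldsymbol{\beta}_k$ the coefficients for severity level $k$ of $K$, $s(i)$ the road segment containing crash $i$, $\partial s$ the set of segments adjacent to $s$, $n_s$ their number, and $\sigma_k^2$ a variance parameter. The coefficients and spatial terms of one reference level are fixed at zero for identification.

```latex
P(y_i = k) =
  \frac{\exp\!\left(\mathbf{x}_i^{\top}\boldsymbol{\beta}_k + \phi_{s(i),k}\right)}
       {\sum_{m=1}^{K} \exp\!\left(\mathbf{x}_i^{\top}\boldsymbol{\beta}_m + \phi_{s(i),m}\right)},
\qquad
\phi_{s,k} \mid \phi_{-s,k} \sim
  \mathcal{N}\!\left(\frac{1}{n_s}\sum_{r \in \partial s} \phi_{r,k},\; \frac{\sigma_k^{2}}{n_s}\right)
```

Setting every $\phi_{s,k}$ to zero recovers the ordinary MNL model, which is why the two models can be compared directly on fit.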
Affiliation(s)
- Xuan Zhang
- School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, Guangdong, 510641, PR China.
| | - Huiying Wen
- School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, Guangdong, 510641, PR China.
| | - Toshiyuki Yamamoto
- Institute of Materials and Systems for Sustainability, Nagoya University, Nagoya 464-8603, Japan.
| | - Qiang Zeng
- School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, Guangdong, 510641, PR China.
| |
|
26
|
Machine learning applied to road safety modeling: A systematic literature review. JOURNAL OF TRAFFIC AND TRANSPORTATION ENGINEERING (ENGLISH EDITION) 2020. [DOI: 10.1016/j.jtte.2020.07.004] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]
|
27
|
Li J, Liu J, Liu P, Qi Y. Analysis of Factors Contributing to the Severity of Large Truck Crashes. ENTROPY 2020; 22:e22111191. [PMID: 33286959 PMCID: PMC7711803 DOI: 10.3390/e22111191] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/05/2020] [Revised: 10/15/2020] [Accepted: 10/16/2020] [Indexed: 11/16/2022]
Abstract
Crashes that involve large trucks often result in immense human, economic, and social losses. To prevent and mitigate severe large truck crashes, factors contributing to the severity of these crashes need to be identified before appropriate countermeasures can be explored. In this research, we applied three tree-based machine learning (ML) techniques, i.e., random forest (RF), gradient boost decision tree (GBDT), and adaptive boosting (AdaBoost), to analyze the factors contributing to the severity of large truck crashes. Besides, a mixed logit model was developed as a baseline model to compare with the factors identified by the ML models. The analysis was performed based on the crash data collected from the Texas Crash Records Information System (CRIS) from 2011 to 2015. The results of this research demonstrated that the GBDT model outperforms other ML methods in terms of its prediction accuracy and its capability in identifying more contributing factors that were also identified by the mixed logit model as significant factors. Besides, the GBDT method can effectively identify both categorical and numerical factors, and the directions and magnitudes of the impacts of the factors identified by the GBDT model are all reasonable and explainable. Among the identified factors, driving under the influence of drugs, alcohol, and fatigue are the most important factors contributing to the severity of large truck crashes. In addition, the existence of curbs and medians and of lanes and shoulders with sufficient width can prevent severe large truck crashes.
Affiliation(s)
- Jinhong Li
- School of Mathematics and Statistics, Qilu University of Technology (Shandong Academy of Sciences), University Road 3501, Jinan 250353, China
- Jinli Liu
- Department of Transportation Studies, Texas Southern University, 3100 Cleburne Street, Houston, TX 77004-9986, USA
- Pengfei Liu
- Department of Civil and Environmental Engineering, the University of North Carolina at Charlotte, EPIC Building, Room 3366, 9201 University City Boulevard, Charlotte, NC 28223-0001, USA
- Yi Qi
- Department of Transportation Studies, Texas Southern University, 3100 Cleburne Street, Houston, TX 77004-9986, USA
- Correspondence:
|
28
|
Assi K. Traffic Crash Severity Prediction-A Synergy by Hybrid Principal Component Analysis and Machine Learning Models. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2020; 17:E7598. [PMID: 33086567 PMCID: PMC7589286 DOI: 10.3390/ijerph17207598] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/25/2020] [Revised: 10/14/2020] [Accepted: 10/17/2020] [Indexed: 12/24/2022]
Abstract
The accurate prediction of road traffic crash (RTC) severity contributes to generating crucial information, which can be used to adopt appropriate measures to reduce the aftermath of crashes. This study aims to develop a hybrid system using principal component analysis (PCA) with multilayer perceptron neural networks (MLP-NN) and support vector machines (SVM) in predicting RTC severity. PCA shows that the first nine components have an eigenvalue greater than one. The cumulative variance percentage explained by these principal components was found to be 67%. The prediction accuracies of the models developed using the original attributes were compared with those of the models developed using principal components. It was found that the testing accuracies of MLP-NN and SVM increased from 64.50% and 62.70% to 82.70% and 80.70%, respectively, after using principal components. The proposed models would be beneficial to trauma centers in predicting crash severity with high accuracy so that they would be able to prepare for appropriate and prompt medical treatment.
Affiliation(s)
- Khaled Assi
- Civil & Environmental Engineering Department, King Fahd University of Petroleum & Minerals, Dhahran 31261, Saudi Arabia
|
29
|
Exploring the Injury Severity Risk Factors in Fatal Crashes with Neural Network. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2020; 17:ijerph17207466. [PMID: 33066522 PMCID: PMC7602238 DOI: 10.3390/ijerph17207466] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 08/25/2020] [Revised: 09/29/2020] [Accepted: 10/01/2020] [Indexed: 01/28/2023]
Abstract
A better understanding of circumstances contributing to the severity outcome of traffic crashes is an important goal of road safety studies. An in-depth crash injury severity analysis is vital for the proactive implementation of appropriate mitigation strategies. This study proposes an improved feed-forward neural network (FFNN) model for predicting injury severity associated with individual crashes using three years (2017–2019) of crash data collected along 15 rural highways in the Kingdom of Saudi Arabia (KSA). A total of 12,566 crashes were recorded during the study period with a binary injury severity outcome (fatal or non-fatal injury) for the variable to be predicted. An FFNN architecture with back-propagation (BP) as the training algorithm, a logistic activation function, and six neurons in the hidden layer yielded the best model performance. Results of model prediction for the test data were analyzed using different evaluation metrics such as overall accuracy, sensitivity, and specificity. Prediction results showed the adequacy and robust performance of the proposed method. A detailed sensitivity analysis of the optimized NN was also performed to show the impact and relative influence of different predictor variables on resulting crash injury severity. The sensitivity analysis results indicated that factors such as traffic volume, average travel speeds, weather conditions, on-site damage conditions, road and vehicle type, and involvement of pedestrians are the most sensitive variables. The methods applied in this study could be used in big data analysis of crash data, which can serve as a rapid and useful tool for policymakers to improve highway safety.
|
30
|
Zou X, Vu HL, Huang H. Fifty Years of Accident Analysis & Prevention: A Bibliometric and Scientometric Overview. ACCIDENT; ANALYSIS AND PREVENTION 2020; 144:105568. [PMID: 32562929 DOI: 10.1016/j.aap.2020.105568] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 06/11/2019] [Revised: 03/31/2020] [Accepted: 04/18/2020] [Indexed: 06/11/2023]
Abstract
Accident Analysis & Prevention (AA&P) is a leading academic journal established in 1969 that serves as an important scientific communication platform for road safety studies. To celebrate its 50th anniversary of publishing outstanding and insightful studies, a multi-dimensional statistical and visualized analysis of the AA&P publications between 1969 and 2018 was performed using the Web of Science (WoS) Core Collection database, bibliometrics and mapping-knowledge-domain (MKD) analytical methods, and scientometric tools. It was shown that the annual number of AA&P's publications has grown exponentially and that over the course of its development, AA&P has been a leader in the field of road safety, both in terms of innovation and dissemination. By determining its key source countries and organizations, core authors, highly co-cited published documents, and high burst-strength publications, we showed that AA&P's areas of focus include the "effects of hazard and risk perception on driving behavior", "crash frequency modeling analysis", "intentional driving violations and aberrant driving behavior", "epidemiology, assessment and prevention of road traffic injuries", and "crash-injury severity modeling analysis". Furthermore, the key burst papers that have played an important role in advancing research and guiding AA&P in new directions - particularly those in the fields of crash frequency and crash-injury severity modeling analyses - were identified. Finally, a modified Haddon matrix in the era of intelligent, connected and autonomous transportation systems is proposed to provide new insights into the emerging generation of road safety studies.
Affiliation(s)
- Xin Zou
- Institute of Transport Studies, Monash University, Clayton, VIC 3800, Australia.
- Hai L Vu
- Institute of Transport Studies, Monash University, Clayton, VIC 3800, Australia
- Helai Huang
- School of Traffic and Transportation Engineering, Central South University, Changsha 410075, China
|
31
|
Katanalp BY, Eren E. The novel approaches to classify cyclist accident injury-severity: Hybrid fuzzy decision mechanisms. ACCIDENT; ANALYSIS AND PREVENTION 2020; 144:105590. [PMID: 32623320 DOI: 10.1016/j.aap.2020.105590] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2020] [Revised: 05/09/2020] [Accepted: 05/10/2020] [Indexed: 06/11/2023]
Abstract
In this study, two novel fuzzy decision approaches, where the fuzzy logic (FL) model was revised with the C4.5 decision tree (DT) algorithm, were applied to the classification of cyclist injury-severity in bicycle-vehicle accidents. The study aims to evaluate two main research topics. The first one is investigation of the effect of road infrastructure, road geometry, street, accident, atmospheric and cyclist related parameters on the classification of cyclist injury-severity similarly to other studies in the literature. The second one is examination of the performance of the new fuzzy decision approaches described in detail in this study for the classification of cyclist injury-severity. For this purpose, the data set containing bicycle-vehicle accidents in 2013-2017 was analyzed with the classic C4.5 algorithm and two different hybrid fuzzy decision mechanisms, namely DT-based converted FL (DT-CFL) and novel DT-based revised FL (DT-RFL). The model performances were compared according to their accuracy, precision, recall, and F-measure values. The results indicated that the parameters that have the greatest effect on the injury-severity in bicycle-vehicle accidents are gender, vehicle damage-extent, road-type as well as the highly effective parameters such as pavement type, accident type, and vehicle-movement. The most successful classification performance among the three models was achieved by the DT-RFL model with 72.0% F-measure and 69.96% accuracy. With 59.22% accuracy and 57.5% F-measure values, the DT-CFL model, rules of which were created according to the splitting criteria of C4.5 algorithm, gave worse results in the classification of the injury-severity in bicycle-vehicle accidents than the classical C4.5 algorithm. In light of these results, the use of fuzzy decision mechanism models presented in this study on more comprehensive datasets is recommended for further studies.
Affiliation(s)
- Burak Yiğit Katanalp
- Adana Alparslan Turkes Science and Technology University, Faculty of Engineering, Civil Engineering Department, Adana, Turkey.
- Ezgi Eren
- Adana Alparslan Turkes Science and Technology University, Faculty of Engineering, Civil Engineering Department, Adana, Turkey.
|
32
|
Development of a Binary Classification Model to Assess Safety in Transportation Systems Using GMDH-Type Neural Network Algorithm. SUSTAINABILITY 2020. [DOI: 10.3390/su12176735] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 12/23/2022]
Abstract
Evaluating road safety is an enduring research topic in Infrastructure and Transportation Engineering. The prediction of crash risk is very important for avoiding other crashes and safeguarding road users. According to this task, awareness of the number of vehicles involved in an accident contributes greatly to safety analysis, hence, it is necessary to predict it. In this study, the main aim is to develop a binary model for predicting the number of vehicles involved in an accident using Neural Networks and the Group Method of Data Handling (GMDH). For this purpose, 775 accident cases were accurately recorded and evaluated from the urban and rural areas of Cosenza in southern Italy and some notable parameters were considered as input data including Daylight, Weekday, Type of accident, Location, Speed limit and Average speed; and the number of vehicles involved in an accident was considered as output. In this study, 581 cases were selected randomly from the dataset to train and the rest were used to test the developed binary model. A confusion matrix and a Receiver Operating Characteristic curve were used to investigate the performance of the proposed model. According to the obtained results, the accuracy values of the prediction model were 83.5% and 85.7% for testing and training, respectively. Finally, it can be concluded that the developed binary model can be applied as a reliable tool for predicting the number of vehicles involved in an accident.
|
33
|
Lee J, Chung K, Papakonstantinou I, Kang S, Kim DK. An optimal network screening method of hotspot identification for highway crashes with dynamic site length. ACCIDENT; ANALYSIS AND PREVENTION 2020; 135:105358. [PMID: 31765928 DOI: 10.1016/j.aap.2019.105358] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/17/2019] [Revised: 09/20/2019] [Accepted: 11/05/2019] [Indexed: 06/10/2023]
Abstract
We propose a novel network screening method for hotspot (i.e., sites that suffer from high collision concentration and have high potential for safety improvement) identification based on the optimization framework to maximize the total summation of a selected safety measure for all hotspots considering a resource constraint for conducting detailed engineering studies (DES). The proposed method allows the length of each hotspot to be determined dynamically based on constraints the users impose. The calculation of the Dynamic Site Length (DSL) method is based on Dynamic Programming, and it is shown to be effective to find the close-to-optimal solution with computationally feasible complexity. The screening method has been demonstrated using historical crash data from extended freeway routes in San Francisco, California. Using the Empirical Bayesian (EB) estimate as a safety measure, we compare the performance of the proposed DSL method with other conventional screening methods, Sliding Window (SW) and Continuous Risk Profile (CRP), in terms of their optimal objective value (i.e., performance of detecting sites under the highest risk). Moreover, their spatio-temporal consistency is compared through the site and method consistency tests. Findings show that DSL can outperform SW and CRP in investigating more hotspots under the same amount of resources allocated to DES by pinpointing hotspot locations with greater accuracy and showing improved spatio-temporal consistency.
Affiliation(s)
- Jinwoo Lee
- The Cho Chun Shik Graduate School of Green Transportation, Korea Advanced Institute of Science and Technology, 193, Munji-ro, Yuseong-gu, Daejeon, 34051, Republic of Korea.
- Koohong Chung
- School of Civil, Environmental and Architectural Engineering, Korea University, 145 Anam-ro, Seongbuk Gu, Seoul, 02841, Republic of Korea.
- Ilia Papakonstantinou
- Department of Civil and Urban Engineering, New York University, Brooklyn, NY, 11201, United States.
- Seungmo Kang
- School of Civil, Environmental and Architectural Engineering, Korea University, 145 Anam-ro, Seongbuk Gu, Seoul, 02841, Republic of Korea.
- Dong-Kyu Kim
- Department of Civil and Environmental Engineering, Seoul National University, 1 Gwanak-ro, Gwanak-gu, Seoul, 08826, Republic of Korea.
|
34
|
Siamidoudaran M, Iscioglu E, Siamidodaran M. Traffic injury severity prediction along with identification of contributory factors using learning vector quantization: a case study of the city of London. SN APPLIED SCIENCES 2019. [DOI: 10.1007/s42452-019-1314-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
|
35
|
Zeng Q, Gu W, Zhang X, Wen H, Lee J, Hao W. Analyzing freeway crash severity using a Bayesian spatial generalized ordered logit model with conditional autoregressive priors. ACCIDENT; ANALYSIS AND PREVENTION 2019; 127:87-95. [PMID: 30844540 DOI: 10.1016/j.aap.2019.02.029] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 11/11/2018] [Revised: 02/21/2019] [Accepted: 02/27/2019] [Indexed: 06/09/2023]
Abstract
This study develops a Bayesian spatial generalized ordered logit model with conditional autoregressive priors to examine severity of freeway crashes. Our model can simultaneously account for the ordered nature in discrete crash severity levels and the spatial correlation among adjacent crashes without fixing the thresholds between crash severity levels. The crash data from Kaiyang Freeway, China in 2014 are collected for the analysis, where crash severity levels are defined considering the combination of injury severity, financial loss, and numbers of injuries and deaths. We calibrate the proposed spatial model and compare it with a traditional generalized ordered logit model via Bayesian inference. The superiority of the spatial model is indicated by its better model fit and the statistical significance of the spatial term. Estimation results show that driver type, season, traffic volume and composition, response time for emergency medical services, and crash type have significant effects on crash severity propensity. In addition, vehicle type, season, time of day, weather condition, vertical grade, bridge, traffic volume and composition, and crash type have significant impacts on the threshold between median and severe crash levels. The average marginal effects of the contributing factors on each crash severity level are also calculated. Based on the estimation results, several countermeasures regarding driver education, traffic rule enforcement, vehicle and roadway engineering, and emergency services are proposed to mitigate freeway crash severity.
Affiliation(s)
- Qiang Zeng
- School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, Guangdong, 510641, PR China; Department of Electrical Engineering, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region, PR China.
- Weihua Gu
- Department of Electrical Engineering, The Hong Kong Polytechnic University, Hong Kong Special Administrative Region, PR China.
- Xuan Zhang
- School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, Guangdong, 510641, PR China.
- Huiying Wen
- School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, Guangdong, 510641, PR China.
- Jinwoo Lee
- The Cho Chun Shik Graduate School of Green Transportation, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea.
- Wei Hao
- School of Traffic and Transportation, Changsha University of Science and Technology, Changsha, Hunan, 410114, PR China.
|
36
|
Dong C, Xie K, Sun X, Lyu M, Yue H. Roadway traffic crash prediction using a state-space model based support vector regression approach. PLoS One 2019; 14:e0214866. [PMID: 30951535 PMCID: PMC6450638 DOI: 10.1371/journal.pone.0214866] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/10/2018] [Accepted: 03/21/2019] [Indexed: 11/18/2022]
Abstract
Conventional traffic crash analysis methods focus on identifying the relationship between traffic crash outcomes and impact risk factors and explaining the effects of risk factors, which ignores the changes of roadway systems and can lead to inaccurate results in traffic crash predictions. To address this issue, an innovative two-step method is proposed and a support vector regression (SVR) model is formulated into a state-space model (SSM) framework for traffic crash prediction. The SSM was developed in the first step to identify the dynamic evolution process of the roadway systems that is caused by the changes of traffic flow and to predict the changes of impact factors in roadway systems. Using the predicted impact factors, the SVR model was incorporated in the second step to perform the traffic crash prediction. A five-year dataset obtained from 1152 roadway segments in Tennessee was employed to validate the model effectiveness. The proposed models result in an average prediction MAPE of 7.59%, an MAE of 0.11, and an RMSD of 0.32. For the performance comparison, an SVR model and a multivariate negative binomial (MVNB) model were developed to do the same task. The results show that the proposed model has superior performance in terms of prediction accuracy compared to the SVR and MVNB models. Compared to the SVR and MVNB models, the benefit of incorporating a state-space model to identify the changes of roadway systems is significantly evident in the proposed models for all crash types, and the prediction accuracy as measured by MAPE can be improved by 4.360% and 6.445% on average, respectively. Apart from accuracy improvement, the proposed models are more robust and the predictions can retain a smoother pattern. Furthermore, the results show that the proposed model has a more precise and synchronized response behavior to the high variations of the observed data, especially for the phenomenon of extra zeros.
Affiliation(s)
- Chunjiao Dong
- Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Ministry of Transport, Beijing Jiaotong University, Shangyuancun, Haidian District, Beijing, China
- Kun Xie
- National Demonstration Center for Experimental Traffic and Transportation Education, School of Traffic and Transportation, Beijing Jiaotong University, Shangyuancun, Haidian District, Beijing, China
- * E-mail:
- Xubin Sun
- School of Electronic and Information Engineering, Beijing Jiaotong University, Shangyuancun, Haidian District, Beijing, China
- Miaomiao Lyu
- School of Transportation and Logistics, Southwest Jiaotong University, Jinniu District Chengdu, China
- Hao Yue
- Key Laboratory of Transport Industry of Big Data Application Technologies for Comprehensive Transport, Ministry of Transport, Beijing Jiaotong University, Shangyuancun, Haidian District, Beijing, China
|
37
|
Tang J, Liang J, Han C, Li Z, Huang H. Crash injury severity analysis using a two-layer Stacking framework. ACCIDENT; ANALYSIS AND PREVENTION 2019; 122:226-238. [PMID: 30390518 DOI: 10.1016/j.aap.2018.10.016] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Received: 05/20/2018] [Revised: 10/18/2018] [Accepted: 10/22/2018] [Indexed: 06/08/2023]
Abstract
Crash injury severity analysis is useful for traffic management agencies to further understand the severity of crashes. A two-layer Stacking framework is proposed in this study to predict the crash injury severity: the first layer integrates advantages of three base classification methods: RF (Random Forests), AdaBoost (Adaptive Boosting), and GBDT (Gradient Boosting Decision Tree); the second layer completes classification of crash injury severity based on a Logistic Regression model. A total of 5538 crashes were recorded at 326 freeway diverge areas. In the model calibration, several parameters including the number of trees in three base classification methods, learning rate, and regularization coefficient are optimized via a systematic grid search approach. In the model validation, the performance of the Stacking model is compared with several traditional models including the Support Vector Machine (SVM), Multi-Layer Perceptron (MLP) and Random Forests (RF) in the multi-classification experiments. The prediction results show that the Stacking model achieves superior performance evaluated by two indicators: accuracy and recall. Furthermore, all the factors used in severity prediction are classified into different categories according to their influence on the results, and sensitivity analysis of several significant factors is finally implemented to explore the impact of their value variation on the prediction accuracy.
Affiliation(s)
- Jinjun Tang
- School of Traffic and Transportation Engineering, Smart Transport Key Laboratory of Hunan Province, Central South University, Changsha, 410075, China
- Jian Liang
- School of Traffic and Transportation Engineering, Smart Transport Key Laboratory of Hunan Province, Central South University, Changsha, 410075, China
- Chunyang Han
- School of Traffic and Transportation Engineering, Smart Transport Key Laboratory of Hunan Province, Central South University, Changsha, 410075, China
- Zhibin Li
- School of Transportation, Southeast University, Nanjing, 210096, China.
- Helai Huang
- School of Traffic and Transportation Engineering, Smart Transport Key Laboratory of Hunan Province, Central South University, Changsha, 410075, China
|
38
|
Study on Crash Injury Severity Prediction of Autonomous Vehicles for Different Emergency Decisions Based on Support Vector Machine Model. ELECTRONICS 2018. [DOI: 10.3390/electronics7120381] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/17/2022]
Abstract
Motor vehicle crashes remain a leading cause of life and property loss to society. Autonomous vehicles can mitigate the losses by making appropriate emergency decisions, and the crash injury severity prediction model is the basis for autonomous vehicles to make decisions in emergency situations. In this paper, based on the support vector machine (SVM) model and NASS/GES crash data, three SVM crash injury severity prediction models (B-SVM, T-SVM, and BT-SVM) corresponding to braking, turning, and braking + turning respectively are established. The vehicle relative speed (REL_SPEED) and the gross vehicle weight rating (GVWR) are introduced into the impact indicators of the prediction models. Secondly, the ordered logit (OL) and back propagation neural network (BPNN) models are established to validate the accuracy of the SVM models. The results show that the SVM models perform better than the other two. Next, the impact of REL_SPEED and GVWR on injury severity is analyzed quantitatively by sensitivity analysis; the results demonstrate that an increase in REL_SPEED and GVWR makes vehicle crashes more serious. Finally, the same crash samples under normal road and environmental conditions are input into B-SVM, T-SVM, and BT-SVM respectively, and the output results are compared and analyzed. The results show that with other conditions being the same, as the REL_SPEED increased from the low (0–20 mph) to middle (20–45 mph) and then to the high range (45–75 mph), the best emergency decision with the minimum crash injury severity will gradually transition from braking to turning and then to braking + turning.
|
39
|
Lee J, Chae J, Yoon T, Yang H. Traffic accident severity analysis with rain-related factors using structural equation modeling - A case study of Seoul City. ACCIDENT; ANALYSIS AND PREVENTION 2018; 112:1-10. [PMID: 29306084 DOI: 10.1016/j.aap.2017.12.013] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 05/29/2017] [Revised: 11/09/2017] [Accepted: 12/14/2017] [Indexed: 06/07/2023]
Abstract
Weather conditions are strongly correlated with traffic accident severity. In particular, rain-related factors are an important cause of traffic accidents due to the poor visibility and reduced friction resulting from slippery road conditions. This paper presents a systematic approach to analyze the extent to which the rainfall intensity and level of water depth are responsible for traffic accidents using Seoul City, Korea, as a case study. The rainfall and traffic accident data over a nine-year period (from 2007 to 2015) for Seoul were analyzed through Structural Equation Modeling to identify the relationships among variables by handling endogenous and exogenous variables simultaneously. In the model, four latent variables, namely those representing the road; traffic, environmental, and human factors; and rain and water depth factors, were defined and the coefficients of the latent, endogenous, and exogenous variables were estimated to obtain the level of accident severity. Furthermore, a statistical goodness of fit index was suggested for model fitting. In conclusion, traffic, environmental, and human factors; rain and water depth factors; and road factors are mutually correlated with the level of accident severity. Compact cars, young drivers, female drivers, heavy rain, deep water, and roads with a long drainage length are more likely to be associated with an increase in the level of accident severity, as are features like a tangent, down slope, right-hand curve, and shorter curve length.
Affiliation(s)
- Jonghak Lee
- WISE Institute, 11F Centennial Complex, Hankuk University of Foreign Studies, 81 oedae-ro, Mohyeon-myeon, Cheoin-gu Gyeonggi-do, 17035, Republic of Korea.
- Junghyo Chae
- WISE Institute, 11F Centennial Complex, Hankuk University of Foreign Studies, 81 oedae-ro, Mohyeon-myeon, Cheoin-gu Gyeonggi-do, 17035, Republic of Korea.
- Taekwan Yoon
- Smart Infrastructure Center, Korea Research Institute for Human Settlements, 5 Gukchaegyeonguwon-ro, Sejong-si, 30149, Republic of Korea.
- Hojin Yang
- WISE Institute, 11F Centennial Complex, Hankuk University of Foreign Studies, 81 oedae-ro, Mohyeon-myeon, Cheoin-gu Gyeonggi-do, 17035, Republic of Korea.
|
40
|
Severity Prediction of Traffic Accidents with Recurrent Neural Networks. APPLIED SCIENCES-BASEL 2017. [DOI: 10.3390/app7060476] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 11/16/2022]
|
41
|
Zeng Q, Huang H, Pei X, Wong SC, Gao M. Rule extraction from an optimized neural network for traffic crash frequency modeling. ACCIDENT; ANALYSIS AND PREVENTION 2016; 97:87-95. [PMID: 27591417 DOI: 10.1016/j.aap.2016.08.017] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 03/29/2015] [Revised: 04/22/2016] [Accepted: 08/17/2016] [Indexed: 06/06/2023]
Abstract
This study develops a neural network (NN) model to explore the nonlinear relationship between crash frequency and risk factors. To eliminate the possibility of over-fitting and to deal with the black-box characteristic, a network structure optimization algorithm and a rule extraction method are proposed. A case study compares the performance of the trained and modified NN models with that of the traditional negative binomial (NB) model for analyzing crash frequency on road segments in Hong Kong. The results indicate that the optimized NNs have somewhat better fitting and predictive performance than the NB models. Moreover, the smaller training/testing errors in the optimized NNs with pruned input and hidden nodes demonstrate the ability of the structure optimization algorithm to identify the insignificant factors and to improve the model generalization capacity. Furthermore, the rule-set extracted from the optimized NN model can reveal the effect of each explanatory variable on the crash frequency under different conditions, and implies the existence of nonlinear relationship between factors and crash frequency. With the structure optimization algorithm and rule extraction method, the modified NN model has great potential for modeling crash frequency, and may be considered as a good alternative for road safety analysis.
Affiliation(s)
- Qiang Zeng
- School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, Guangdong 510641, PR China; Urban Transport Research Center, School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, PR China.
- Helai Huang
- Urban Transport Research Center, School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, PR China.
- Xin Pei
- Department of Automation, Tsinghua University, Beijing, PR China.
- S C Wong
- Department of Civil Engineering, The University of Hong Kong, Pokfulam Road, Hong Kong.
- Mingyun Gao
- Business School of Hunan University, Changsha, Hunan 410082, PR China.
42
Zeng Q, Wen H, Huang H. The interactive effect on injury severity of driver-vehicle units in two-vehicle crashes. JOURNAL OF SAFETY RESEARCH 2016; 59:105-111. [PMID: 27846993 DOI: 10.1016/j.jsr.2016.10.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 05/18/2016] [Revised: 09/22/2016] [Accepted: 10/06/2016] [Indexed: 06/06/2023]
Abstract
INTRODUCTION This study sets out to investigate the interactive effect on injury severity of driver-vehicle units in two-vehicle crashes. METHOD A Bayesian hierarchical ordered logit model is proposed to relate the variation and correlation of injury severity of drivers involved in two-vehicle crashes to the factors of both driver-vehicle units and the crash configurations. A total of 6417 crash records with 12,834 vehicles involved in Florida are used for model calibration. RESULTS The results show that older, female and not-at-fault drivers and those without use of safety equipment are more likely to be injured but less likely to injure the drivers in the other vehicles. New vehicles and lower speed ratios are associated with lower injury degree of both drivers involved. Compared with automobiles, vans, pick-ups, light trucks, median trucks, and heavy trucks possess better self-protection and stronger aggressivity. The points of impact closer to the driver's seat in general indicate a higher risk to the own drivers while engine cover and vehicle rear are the least hazardous to other drivers. Head-on crashes are significantly more severe than angle and rear-end crashes. We found that more severe crashes occurred on roadways than on shoulders or safety zones. CONCLUSIONS Based on these results, some suggestions for traffic safety education, enforcement and engineering are made. Moreover, significant within-crash correlation is found in the crash data, which demonstrates the applicability of the proposed model.
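As a rough illustration of the ordered logit link that underlies the model named in this abstract, the sketch below computes the probability of each injury-severity category from a linear predictor and a set of increasing thresholds. It is a minimal sketch only: the function names, the threshold values and the idea of adding a shared crash-level term to the predictor are assumptions made for illustration, and the Bayesian hierarchical estimation used in the study is not reproduced.

```python
import math


def logistic(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(eta, thresholds):
    """Return P(Y = k) for each ordered category k.

    eta        -- linear predictor (hypothetically, covariate effects plus
                  a crash-level term shared by both drivers in a crash)
    thresholds -- increasing cut points; K-1 thresholds give K categories
    """
    # Cumulative probabilities P(Y <= k) = logistic(threshold_k - eta)
    cumulative = [logistic(t - eta) for t in thresholds] + [1.0]
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


# Illustrative values only: three severity levels, two thresholds.
low_risk = ordered_logit_probs(eta=0.0, thresholds=[-1.0, 1.0])
high_risk = ordered_logit_probs(eta=2.0, thresholds=[-1.0, 1.0])
```

A larger linear predictor shifts probability mass toward the most severe category, which is how a positive coefficient is read in this kind of model.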
Affiliation(s)
- Qiang Zeng
  - School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, Guangdong 510641, PR China.
- Huiying Wen
  - School of Civil Engineering and Transportation, South China University of Technology, Guangzhou, Guangdong 510641, PR China.
- Helai Huang
  - Urban Transport Research Center, School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, PR China.
43
Huang H, Li C, Zeng Q. Crash protectiveness to occupant injury and vehicle damage: An investigation on major car brands. ACCIDENT; ANALYSIS AND PREVENTION 2016; 86:129-136. [PMID: 26551733 DOI: 10.1016/j.aap.2015.10.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 05/07/2015] [Revised: 09/13/2015] [Accepted: 10/10/2015] [Indexed: 06/05/2023]
Abstract
This study sets out to investigate vehicles' crash protectiveness on occupant injury and vehicle damage, which can be deemed an extension of the traditional crash worthiness. A Bayesian bivariate hierarchical ordered logistic (BVHOL) model is developed to estimate the occupant protectiveness (OP) and vehicle protectiveness (VP) of 23 major car brands in Florida, while considering vehicles' crash aggressivity and controlling for external factors. The proposed model not only retains the strengths of the existing hierarchical ordered logistic (HOL) model, i.e. specifying the order characteristics of crash outcomes and cross-crash heterogeneities, but also accounts for the correlation between the two crash responses, driver injury and vehicle damage. A total of 7335 two-vehicle-crash records with 14,670 cars involved in Florida are used for the investigation. From the estimation results, it is found that most of the luxury cars such as Cadillac, Volvo and Lexus possess excellent OP and VP while some brands such as KIA and Saturn perform very badly in both aspects. The ranks of the estimated safety performance indices are also compared with their counterparts in the study by Huang et al. [Huang, H., Hu, S., Abdel-Aty, M., 2014. Indexing crash worthiness and crash aggressivity by major car brands. Safety Science 62, 339-347]. The results show that the rank of the occupant protectiveness index (OPI) is relatively coherent with that of the crash worthiness index, but the ranks of the crash aggressivity index in the two studies differ more from each other. Meanwhile, a great discrepancy between the OPI rank and that of the vehicle protectiveness index is found. Moreover, the results of control variables and hyper-parameter estimation, as well as comparison to HOL models with separate or identical threshold errors, demonstrate the validity and advancement of the proposed model and the robustness of the estimated OP and VP.
Affiliation(s)
- Helai Huang
  - Urban Transport Research Center, School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, PR China.
- Chunyang Li
  - Urban Transport Research Center, School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, PR China.
- Qiang Zeng
  - Urban Transport Research Center, School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, PR China.
44
Dong N, Huang H, Zheng L. Support vector machine in crash prediction at the level of traffic analysis zones: Assessing the spatial proximity effects. ACCIDENT; ANALYSIS AND PREVENTION 2015; 82:192-198. [PMID: 26091769 DOI: 10.1016/j.aap.2015.05.018] [Citation(s) in RCA: 60] [Impact Index Per Article: 6.0] [Received: 12/15/2014] [Revised: 05/22/2015] [Accepted: 05/26/2015] [Indexed: 06/04/2023]
Abstract
In zone-level crash prediction, accounting for spatial dependence has become an extensively studied topic. This study proposes a Support Vector Machine (SVM) model to address complex, large and multi-dimensional spatial data in crash prediction. A Correlation-based Feature Selector (CFS) was applied to evaluate candidate factors possibly related to zonal crash frequency in handling high-dimension spatial data. To demonstrate the proposed approaches and to compare them with the Bayesian spatial model with conditional autoregressive prior (i.e., CAR), a dataset from Hillsborough County, Florida was employed. The results showed that SVM models accounting for spatial proximity outperform the non-spatial model in terms of model fitting and predictive performance, which indicates the reasonableness of considering cross-zonal spatial correlations. The best predictive capability, relatively, is associated with the model considering proximity of the centroid distance, using the RBF kernel and setting aside 10% of the whole dataset as the testing data, which further exhibits SVM models' capacity for addressing comparatively complex spatial data in regional crash prediction modeling. Moreover, SVM models exhibit better goodness-of-fit than CAR models when the whole dataset is used as the sample. A sensitivity analysis of the centroid-distance-based spatial SVM models was conducted to capture the impacts of explanatory variables on the mean predicted probabilities for crash occurrence. The results conform to the coefficient estimation in the CAR models, which supports the employment of the SVM model as an alternative in regional safety modeling.
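For readers unfamiliar with the RBF kernel mentioned in this abstract, the sketch below shows the standard Gaussian radial basis function, K(x, y) = exp(-gamma * ||x - y||^2), and the Gram matrix an SVM would work with. It is a minimal sketch in plain Python: the function names, the gamma value and the three toy zone feature vectors are assumptions for illustration and are not taken from the study, whose spatial proximity features and tuning are not reproduced here.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian RBF kernel: exp(-gamma * squared Euclidean distance)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(points, gamma=0.5):
    """Pairwise kernel values for a list of feature vectors."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


# Hypothetical feature vectors for three zones (illustrative values only).
zones = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
K = gram_matrix(zones, gamma=0.5)
```

The kernel equals 1 for identical inputs and decays toward 0 as two zones' feature vectors move apart, so gamma controls how local the fitted decision function is.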
Affiliation(s)
- Ni Dong
  - Urban Transport Research Center, School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan, 410075 PR China.
- Helai Huang
  - Urban Transport Research Center, School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan, 410075 PR China.
- Liang Zheng
  - Urban Transport Research Center, School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan, 410075 PR China.