1
Wu D, Xing L, Li Y, Wong YD, Lee JJ, Dong C. A framework for real-time traffic risk prediction incorporating cost-sensitive learning and dynamic thresholds. ACCIDENT; ANALYSIS AND PREVENTION 2025; 218:108087. [PMID: 40328008 DOI: 10.1016/j.aap.2025.108087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2025] [Revised: 04/07/2025] [Accepted: 04/28/2025] [Indexed: 05/08/2025]
Abstract
In recent years, researchers have explored an innovative approach that leverages real vehicle trajectory data to simultaneously derive traffic state and risk level for real-time risk prediction, which is crucial for traffic safety. However, existing studies largely overlook the costs associated with incorrect predictions and the varying consequences of different misclassifications, which undermines the reliability of the obtained prediction results. To address these gaps, this study refined traffic risk classification into four levels (i.e., no, low, medium, and high risks) and incorporated misclassification costs into the prediction process through cost-sensitive learning (CSL). Furthermore, considering that multi-class prediction tasks often face performance degradation and increased risk level granularity worsens class imbalance, further amplifying this degradation, this study introduced dynamic thresholds (DTs) to improve model performance. The aforementioned cost coefficients and thresholds were pinpointed using a genetic algorithm (GA). Furthermore, the employed data, comprising variables related to traffic state and associated risk data, were sourced from the HighD dataset. Subsequently, CSL-DTs-based models were built by integrating CSL and DTs with four distinct baseline machine/deep learning models, and the prediction performance (e.g., precision) and computation time of these models were compared. Results show that, compared to the corresponding baseline models, the proposed models perform better for multi-class prediction tasks. Additionally, the computation time of the CSL-DTs-based models is found to be acceptable for real-time prediction purposes. Finally, to ensure the reliability of the results obtained through the GA optimization (e.g., avoiding local optima), convergence curves were plotted, confirming the robustness of the optimization process. 
A robustness analysis also demonstrates that the models are highly stable under slight perturbations of cost coefficients and thresholds, with minimal impact on performance. Findings of this study are expected to enhance the reliability of real-time traffic risk prediction, holding the promise of significantly promoting proactive traffic safety management.
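To make the two ideas in this abstract concrete, here is a minimal Python sketch of a cost-sensitive decision rule and a per-class threshold rule over four risk levels. The abstract does not give the paper's formulation, so the cost matrix, the threshold values, and all function names below are illustrative assumptions. In the study these quantities are tuned by a genetic algorithm, which is not reproduced here.

```python
from typing import List, Sequence

RISK_LEVELS = ["no", "low", "medium", "high"]


def expected_costs(probs: Sequence[float], cost: Sequence[Sequence[float]]) -> List[float]:
    """Expected cost of predicting each class j.

    cost[i][j] is the (assumed) cost of predicting class j when the true class is i.
    """
    n = len(probs)
    return [sum(probs[i] * cost[i][j] for i in range(n)) for j in range(n)]


def predict_min_cost(probs: Sequence[float], cost: Sequence[Sequence[float]]) -> int:
    """Cost-sensitive decision: choose the class with the lowest expected cost."""
    costs = expected_costs(probs, cost)
    return min(range(len(costs)), key=costs.__getitem__)


def predict_with_thresholds(probs: Sequence[float], thresholds: Sequence[float]) -> int:
    """Threshold-adjusted decision: choose the class whose probability is largest
    relative to its own threshold (one possible way to apply per-class thresholds)."""
    ratios = [p / t for p, t in zip(probs, thresholds)]
    return max(range(len(ratios)), key=ratios.__getitem__)


# Hypothetical cost matrix: under-predicting risk costs 5 per level, over-predicting 1 per level.
COST = [[0 if i == j else (5 * (i - j) if j < i else (j - i)) for j in range(4)] for i in range(4)]

probs = [0.50, 0.20, 0.10, 0.20]  # illustrative classifier output, not data from the paper
print(RISK_LEVELS[probs.index(max(probs))])                                # plain argmax
print(RISK_LEVELS[predict_min_cost(probs, COST)])                          # cost-sensitive
print(RISK_LEVELS[predict_with_thresholds(probs, [0.7, 0.3, 0.2, 0.25])])  # thresholded
```

With these made-up numbers, plain argmax picks "no" while both adjusted rules pick "high", which shows how asymmetric costs or lowered thresholds can shift predictions toward rare but costly classes.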
Affiliation(s)
- Dan Wu
  - School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, China; School of Civil and Environmental Engineering, Nanyang Technological University, 639798, Singapore.
- Lu Xing
  - School of Automation, Central South University, Changsha, China; Department of Automation, Tsinghua University, Beijing, China; Hunan Key Laboratory of Smart Roadway and Cooperative Vehicle-infrastructure Systems, Changsha University of Science & Technology, China.
- Ye Li
  - School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, China.
- Yiik Diew Wong
  - School of Civil and Environmental Engineering, Nanyang Technological University, 639798, Singapore.
- Jaeyoung Jay Lee
  - School of Traffic and Transportation Engineering, Central South University, Changsha, Hunan 410075, China; Department of Civil, Environmental & Construction Engineering, University of Central Florida, Orlando, FL 32816, USA.
- Changyin Dong
  - School of Aeronautics, Northwestern Polytechnical University, Xi'an 710072, China; National Key Laboratory of Aircraft Configuration Design, Xi'an 710072, China.
2
Xuan Q, Zhang G, Wei S, Li K. Bayesian networks for identifying causal effects of factors on crash injury severity at signalized intersections. Int J Inj Contr Saf Promot 2025:1-9. [PMID: 40305029 DOI: 10.1080/17457300.2025.2495141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2024] [Revised: 03/30/2025] [Accepted: 04/15/2025] [Indexed: 05/02/2025]
Abstract
Signalized intersections are areas where traffic crashes with severe injuries frequently happen. Although existing studies have explored the factors affecting crash injury severity at signalized intersections, they often fail to capture the intricate causal relationships between factors. This study therefore uses a Bayesian network to reveal the factors contributing to injury severity and the causal relationships between them, with the use of crash data extracted from the Crash Report Sampling System in 2021. The K2 algorithm and the Expectation-Maximization algorithm are adopted for structure learning and parameter learning in the Bayesian network, respectively. The results indicate that 1) factors such as speeding, drunk driving, and use of airbags can significantly affect the injury severity, 2) causal relationships exist between distraction, running the red signal, collision type, and crash injury severity, and 3) compared to the random parameter logit model and random forest, the Bayesian network has better accuracy in predicting the crash injury severity. The findings can serve to propose effective traffic safety intervention measures to reduce the injury severity of crashes at signalized intersections.
Affiliation(s)
- Qianwei Xuan
  - College of Engineering, Zhejiang Normal University, Jinhua, China
- Guopeng Zhang
  - College of Engineering, Zhejiang Normal University, Jinhua, China
  - Key Laboratory of Urban Rail Transit Intelligent Operation and Maintenance Technology & Equipment of Zhejiang Province, Zhejiang Normal University, Zhejiang, China
- Shuwu Wei
  - College of Engineering, Zhejiang Normal University, Jinhua, China
- Kun Li
  - College of Engineering, Zhejiang Normal University, Jinhua, China
3
Li M, Shen ZL, Xian HC, Zheng ZJ, Yu ZW, Liang XH, Gao R, Tang YL, Zhang Z. A Recognition System for Diagnosing Salivary Gland Neoplasms Based on Vision Transformer. THE AMERICAN JOURNAL OF PATHOLOGY 2025; 195:221-231. [PMID: 39490441 DOI: 10.1016/j.ajpath.2024.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2024] [Revised: 09/09/2024] [Accepted: 09/19/2024] [Indexed: 11/05/2024]
Abstract
Salivary gland neoplasms (SGNs) represent a group of human neoplasms characterized by a remarkable cytomorphologic diversity, which frequently poses diagnostic challenges. Accurate histologic categorization of salivary gland tumors is crucial to make precise diagnoses and guide decisions regarding patient management. Within the scope of this study, a computer-aided diagnosis model using Vision Transformer (ViT), a cutting-edge deep learning model in computer vision, was developed to accurately classify the most prevalent subtypes of SGNs. These subtypes include pleomorphic adenoma, myoepithelioma, Warthin tumor, basal cell adenoma, oncocytic adenoma, cystadenoma, mucoepidermoid carcinoma, and salivary adenoid cystic carcinoma. The data set comprised 3046 whole slide images of histologically confirmed salivary gland tumors, encompassing nine distinct tissue categories. SGN-ViT exhibited impressive performance in classifying the eight salivary gland tumors, achieving an accuracy of 0.9966, an area under the receiver operating characteristic curve value of 0.9899, precision of 0.9848, recall of 0.9848, and an F1 score of 0.9848. Diagnostic performance of SGN-ViT surpassed that of benchmark models. In a subset of 100 whole slide images, SGN-ViT demonstrated comparable diagnostic performance to that of the chief pathologist while significantly reducing the diagnosis time. These observations indicate that SGN-ViT holds the potential to serve as a valuable computer-aided diagnostic tool for salivary gland tumors, enhancing the diagnostic accuracy of junior pathologists.
Affiliation(s)
- Mao Li
  - State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, Department of Pathology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Ze-Liang Shen
  - State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, Department of Pathology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Hong-Chun Xian
  - State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, Department of Pathology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Zhi-Jian Zheng
  - State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, Department of Pathology, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Zhen-Wei Yu
  - State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, Department of Oral and Maxillofacial Surgery, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Xin-Hua Liang
  - State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, Department of Oral and Maxillofacial Surgery, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Rui Gao
  - State Key Laboratory of Electronic Thin Films and Integrated Devices, University of Electronic Science and Technology of China, Chengdu, China
- Ya-Ling Tang
  - State Key Laboratory of Oral Diseases and National Clinical Research Center for Oral Diseases, Department of Pathology, West China Hospital of Stomatology, Sichuan University, Chengdu, China.
- Zhong Zhang
  - State Key Laboratory of Electronic Thin Films and Integrated Devices, University of Electronic Science and Technology of China, Chengdu, China.
4
Chammout B, Ahmed MO, El-adaway I. Assessing the Critical Factors Influencing Worker Safety in Roadway Work Zones. JOURNAL OF CONSTRUCTION ENGINEERING AND MANAGEMENT 2024; 150. [DOI: 10.1061/jcemd4.coeng-15205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 07/02/2024] [Indexed: 01/05/2025]
Affiliation(s)
- Bahaa Chammout
  - Ph.D. Student, Dept. of Civil, Architectural, and Environmental Engineering, Missouri Univ. of Science and Technology, 326 Butler-Carlton Hall, 1401 N. Pine St., Rolla, MO 65409
- Muaz O. Ahmed
  - Construction Engineer, Crawford, Murphy & Tilly, Inc., 550 N Commons Dr., Suite 116, Aurora, IL 60504; formerly, Ph.D. Candidate, Dept. of Civil, Architectural, and Environmental Engineering, Missouri Univ. of Science and Technology, Rolla, MO 65409
- Islam El-adaway
  - Associate Dean for Academic Partnerships, Hurst-McCarthy Professor of Construction Engineering and Management, Professor of Civil Engineering, and Founding Director of the Missouri Consortium of Construction Innovation, Dept. of Civil, Architectural, and Environmental Engineering, Dept. of Engineering Management and Systems Engineering, Missouri Univ. of Science and Technology, 228 Butler-Carlton Hall, 1401 N. Pine St., Rolla, MO 65409 (corresponding author). ORCID:
5
Liao H, Li Y, Li Z, Bian Z, Lee J, Cui Z, Zhang G, Xu C. Real-time accident anticipation for autonomous driving through monocular depth-enhanced 3D modeling. ACCIDENT; ANALYSIS AND PREVENTION 2024; 207:107760. [PMID: 39226856 DOI: 10.1016/j.aap.2024.107760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 07/12/2024] [Accepted: 08/24/2024] [Indexed: 09/05/2024]
Abstract
The primary goal of traffic accident anticipation is to foresee potential accidents in real time using dashcam videos, a task that is pivotal for enhancing the safety and reliability of autonomous driving technologies. In this study, we introduce an innovative framework, AccNet, which significantly advances the prediction capabilities beyond the current state-of-the-art 2D-based methods by incorporating monocular depth cues for sophisticated 3D scene modeling. Addressing the prevalent challenge of skewed data distribution in traffic accident datasets, we propose the Binary Adaptive Loss for Early Anticipation (BA-LEA). This novel loss function, together with a multi-task learning strategy, shifts the focus of the predictive model towards the critical moments preceding an accident. We rigorously evaluate the performance of our framework on three benchmark datasets - Dashcam Accident Dataset (DAD), Car Crash Dataset (CCD), and AnAn Accident Detection (A3D), and DADA-2000 Dataset - demonstrating its superior predictive accuracy through key metrics such as Average Precision (AP) and mean Time-To-Accident (mTTA).
Affiliation(s)
- Haicheng Liao
  - State Key Laboratory of Internet of Things for Smart City and Department of Computer and Information Science, University of Macau, Macao Special Administrative Region of China
- Yongkang Li
  - Department of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Zhenning Li
  - State Key Laboratory of Internet of Things for Smart City and Departments of Civil and Environmental Engineering and Computer and Information Science, University of Macau, Macao Special Administrative Region of China.
- Zilin Bian
  - Transportation Planning and Engineering in the Department of Civil and Urban Engineering, New York University, NY, United States
- Jaeyoung Lee
  - School of Traffic and Transportation Engineering, Central South University, Changsha, China
- Zhiyong Cui
  - School of Transportation Science and Engineering, Beihang University, Beijing, China
- Guohui Zhang
  - Department of Civil and Environmental Engineering, University of Hawaii, Honolulu HI, United States
- Chengzhong Xu
  - State Key Laboratory of Internet of Things for Smart City and Department of Computer and Information Science, University of Macau, Macao Special Administrative Region of China
6
Yan X, He J, Wu G, Sun S, Wang C, Fang Z, Zhang C. Driving risk identification of urban arterial and collector roads based on multi-scale data. ACCIDENT; ANALYSIS AND PREVENTION 2024; 206:107712. [PMID: 39002352 DOI: 10.1016/j.aap.2024.107712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Revised: 06/18/2024] [Accepted: 07/07/2024] [Indexed: 07/15/2024]
Abstract
Urban arterial and collector roads, while interconnected within the urban transportation network, serve distinct purposes, leading to different driving risk profiles. Investigating these differences using advanced methods is of paramount significance. This study aims to achieve this by primarily collecting and processing relevant vehicle trajectory data alongside driver-vehicle-road-environment data. A comprehensive risk assessment matrix is constructed to assess driving risks, incorporating multiple conflict and traffic flow indicators with statistically temporal stability. The Entropy weight-TOPSIS method and the K-means algorithm are employed to determine the risk scores and levels of the target arterial and collector roads. Using risk levels as the outcome variables and multi-scale features as the explanatory variables, random parameters models with heterogeneity in means and variances are developed to identify the determinants of driving risks at different levels. Likelihood ratio tests and comparisons of out-of-sample and within-sample prediction are conducted. Results reveal significant statistical differences in the risk profiles between arterial and collector roads. The marginal effects of significant parameters are then calculated separately for arterial and collector roads, indicating that several factors have different impacts on the probability of risk levels for arterial and collector roads, such as the number of movable elements in road landscape pictures, the standard deviation of the vehicle's lateral acceleration, the average standard deviation of speed for all vehicles on the road segment, and the number of one-way lanes on the road segment. Some practical implications are provided based on the findings. Future research can be implemented by expanding the collected data to different regions and cities over longer periods.
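The scoring step named in this abstract, the Entropy weight-TOPSIS method, can be sketched in a few lines of Python. This is a generic textbook version under stated assumptions, not the authors' implementation: every indicator is assumed to be non-negative and oriented so that larger means riskier, there are at least two road segments, at least one indicator varies across segments, and the function names and sample matrix are hypothetical. The K-means step that turns scores into risk levels is omitted.

```python
import math
from typing import List, Sequence


def entropy_weights(matrix: Sequence[Sequence[float]]) -> List[float]:
    """Entropy weights for the columns (indicators) of a non-negative matrix."""
    m, n = len(matrix), len(matrix[0])
    diversities = []
    for j in range(n):
        col_sum = sum(row[j] for row in matrix)
        shares = [row[j] / col_sum for row in matrix]
        # Shannon entropy scaled to [0, 1]; 0 * ln(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(m)
        diversities.append(1.0 - entropy)
    total = sum(diversities)
    return [d / total for d in diversities]


def topsis_scores(matrix: Sequence[Sequence[float]], weights: Sequence[float]) -> List[float]:
    """Relative closeness to the ideal solution for each row (road segment)."""
    n = len(matrix[0])
    # Vector-normalize each column, then apply the indicator weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n)] for row in matrix]
    best = [max(row[j] for row in weighted) for j in range(n)]
    worst = [min(row[j] for row in weighted) for j in range(n)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical data: three road segments, two risk indicators.
segments = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
weights = entropy_weights(segments)
print(weights)
print(topsis_scores(segments, weights))
```

Indicators whose values differ more across segments carry more information (lower entropy) and therefore receive larger weights; the closeness score then ranks segments between the worst and best observed profiles.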
Affiliation(s)
- Xintong Yan
  - School of Transportation, Southeast University, 2 Si pai lou, Nanjing 210096, PR China.
- Jie He
  - School of Transportation, Southeast University, 2 Si pai lou, Nanjing 210096, PR China.
- Guanhe Wu
  - HUAWEI Software Technology Co., Ltd., Yuhuatai, Nanjing 518116, PR China.
- Shuang Sun
  - BYD Co., Ltd., 2 Yadi, Xi'an 710119, PR China.
- Chenwei Wang
  - School of Transportation, Southeast University, 2 Si pai lou, Nanjing 210096, PR China.
- Zhiming Fang
  - School of Transportation, Southeast University, 2 Si pai lou, Nanjing 210096, PR China.
- Changjian Zhang
  - School of Transportation, Southeast University, 2 Si pai lou, Nanjing 210096, PR China.
7
Sharma A, Lysenko A, Jia S, Boroevich KA, Tsunoda T. Advances in AI and machine learning for predictive medicine. J Hum Genet 2024; 69:487-497. [PMID: 38424184 PMCID: PMC11422165 DOI: 10.1038/s10038-024-01231-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 02/04/2024] [Accepted: 02/12/2024] [Indexed: 03/02/2024]
Abstract
The field of omics, driven by advances in high-throughput sequencing, faces a data explosion. This abundance of data offers unprecedented opportunities for predictive modeling in precision medicine, but also presents formidable challenges in data analysis and interpretation. Traditional machine learning (ML) techniques have been partly successful in generating predictive models for omics analysis but exhibit limitations in handling potential relationships within the data for more accurate prediction. This review explores a revolutionary shift in predictive modeling through the application of deep learning (DL), specifically convolutional neural networks (CNNs). Using transformation methods such as DeepInsight, omics data with independent variables in tabular (table-like, including vector) form can be turned into image-like representations, enabling CNNs to capture latent features effectively. This approach not only enhances predictive power but also leverages transfer learning, reducing computational time, and improving performance. However, integrating CNNs in predictive omics data analysis is not without challenges, including issues related to model interpretability, data heterogeneity, and data size. Addressing these challenges requires a multidisciplinary approach, involving collaborations between ML experts, bioinformatics researchers, biologists, and medical doctors. This review illuminates these complexities and charts a course for future research to unlock the full predictive potential of CNNs in omics data analysis and related fields.
Affiliation(s)
- Alok Sharma
  - Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan.
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
  - Institute for Integrated and Intelligent Systems, Griffith University, Queensland, Australia.
- Artem Lysenko
  - Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan.
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
- Shangru Jia
  - Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Keith A Boroevich
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Tatsuhiko Tsunoda
  - Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan.
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
  - Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan.
8
Wang X, Su Y, Zheng Z, Xu L. Prediction and interpretive of motor vehicle traffic crashes severity based on random forest optimized by meta-heuristic algorithm. Heliyon 2024; 10:e35595. [PMID: 39224374 PMCID: PMC11367028 DOI: 10.1016/j.heliyon.2024.e35595] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 07/24/2024] [Accepted: 07/31/2024] [Indexed: 09/04/2024] Open
Abstract
Providing accurate prediction of the severity of traffic collisions is vital to improve the efficiency of emergency response and reduce casualties, accordingly improving traffic safety and reducing traffic congestion. However, achieving both predictive accuracy of the model and interpretability of the predicted outcomes has remained a persistent challenge. We propose a prediction framework, a Random Forest optimized by a meta-heuristic algorithm, that integrates the spatiotemporal characteristics of crashes. Through predictive analysis of motor vehicle traffic crash data on interstate highways within the United States in 2020, we compared the accuracy of various ensemble models and single-classification prediction models. The results show that the Random Forest (RF) model optimized by the Crown Porcupine Optimizer (CPO) has the best prediction results, and the accuracy, recall, F1 score, and precision can reach more than 90%. We found that factors such as Temperature and Weather are closely related to vehicle traffic crashes. Closely related indicators were analyzed interpretatively using a geographic information system (GIS) based on the feature importance ranking of the results. The framework enables more accurate prediction of motor vehicle traffic crashes and discovers the important factors leading to motor vehicle traffic crashes with an explanation. The study proposes that in some areas consideration should be given to adding measures such as nighttime lighting devices and nighttime fatigue driving alert devices to ensure safe driving. It offers references for policymakers to address traffic management and urban development issues.
Affiliation(s)
- Xing Wang
  - School of Civil Engineering and Transportation, Northeast Forestry University, Harbin, 150040, China
- Yikun Su
  - School of Civil Engineering and Transportation, Northeast Forestry University, Harbin, 150040, China
- Zhizhe Zheng
  - School of Civil Engineering and Transportation, Northeast Forestry University, Harbin, 150040, China
- Liang Xu
  - School of Civil Engineering, Changchun Institute of Technology, Changchun, 130012, China
9
Sharma A, López Y, Jia S, Lysenko A, Boroevich KA, Tsunoda T. Enhanced analysis of tabular data through Multi-representation DeepInsight. Sci Rep 2024; 14:12851. [PMID: 38834670 PMCID: PMC11724076 DOI: 10.1038/s41598-024-63630-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 05/30/2024] [Indexed: 06/06/2024] Open
Abstract
Tabular data analysis is a critical task in various domains, enabling us to uncover valuable insights from structured datasets. While traditional machine learning methods can be used for feature engineering and dimensionality reduction, they often struggle to capture the intricate relationships and dependencies within real-world datasets. In this paper, we present Multi-representation DeepInsight (MRep-DeepInsight), a novel extension of the DeepInsight method designed to enhance the analysis of tabular data. By generating multiple representations of samples using diverse feature extraction techniques, our approach is able to capture a broader range of features and reveal deeper insights. We demonstrate the effectiveness of MRep-DeepInsight on single-cell datasets, Alzheimer's data, and artificial data, showcasing an improved accuracy over the original DeepInsight approach and machine learning methods like random forest, XGBoost, LightGBM, FT-Transformer and L2-regularized logistic regression. Our results highlight the value of incorporating multiple representations for robust and accurate tabular data analysis. By leveraging the power of diverse representations, MRep-DeepInsight offers a promising new avenue for advancing decision-making and scientific discovery across a wide range of fields.
Affiliation(s)
- Alok Sharma
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
  - Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, Australia.
  - Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan.
- Shangru Jia
  - Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Artem Lysenko
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan
- Keith A Boroevich
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Tatsuhiko Tsunoda
  - Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
  - Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan.
  - Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan.
10
Gedamu WT, Plank-Wiedenbeck U, Wodajo BT. A spatial autocorrelation analysis of road traffic crash by severity using Moran's I spatial statistics: A comparative study of Addis Ababa and Berlin cities. ACCIDENT; ANALYSIS AND PREVENTION 2024; 200:107535. [PMID: 38489942 DOI: 10.1016/j.aap.2024.107535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 02/25/2024] [Accepted: 03/02/2024] [Indexed: 03/17/2024]
Abstract
Methodological advancements in road safety research reveal an increasing inclination toward integrating spatial approaches in hot spot identification, spatial pattern analysis, and developing spatially lagged models. Previous studies on hot spot identification and spatial pattern analysis have overlooked crash severities and the spatial autocorrelation of crashes by severity, missing valuable insights into crash patterns and underlying factors. This study investigates the spatial autocorrelation of crash severity by taking two capital cities, Addis Ababa and Berlin, as a case study and compares patterns in low and high-income countries. The study used three-year crash data from each city. It employed the average nearest neighbor distance (ANND) method to determine the significance of spatial clustering of crash data by severity, Global Moran's I to examine the statistical significance of spatial autocorrelation, and Local Moran's I to identify significant cluster locations with High-High (HH) and Low-Low (LL) crash severity values. The ANND analysis reveals a significant clustering of crashes by severity in both cities, except in Berlin's fatal crashes. However, different Global Moran's I results were obtained for the two cities, with a strong and statistically significant value for Addis Ababa compared to Berlin. The Local Moran's I result indicates that the central business district and residential areas have LL values, while the city's outskirts exhibit HH values in Addis Ababa. With some persistent HH value locations, Berlin's HH and LL grid clusters are intermingled on the city's periphery. Socio-economic factors, road user behavior and roadway factors contribute to the difference in the result. Nevertheless, it is interesting to note the similarity of significant HH value locations on the outskirts of both cities. Finally, the results are consistent with previous studies and indicate the need for further investigation in other locations.
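For readers unfamiliar with the statistic used in this study, the standard definition of Global Moran's I can be computed directly. The sketch below is a plain-Python illustration of that definition only; the four-cell grid, the binary contiguity weights, and the sample values are assumptions made for the example, and the significance testing the study relies on (as well as ANND and Local Moran's I) is not covered.

```python
from typing import Sequence


def global_morans_i(values: Sequence[float], weights: Sequence[Sequence[float]]) -> float:
    """Global Moran's I.

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2,
    where S0 is the sum of all spatial weights w_ij.
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


# Four grid cells in a row; cells that share an edge are neighbours (weight 1).
W = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = global_morans_i([1, 1, 5, 5], W)    # similar values sit next to each other
alternating = global_morans_i([1, 5, 1, 5], W)  # neighbours always differ
print(clustered, alternating)
```

The clustered layout gives a positive value (about 0.33) and the alternating layout gives -1, matching the usual reading that positive values indicate spatial clustering of similar values and negative values indicate dispersion.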
Affiliation(s)
- Wondwossen Taddesse Gedamu
  - Chair of Transport System Planning, Faculty of Civil Engineering, Bauhaus University Weimar, Schwanseestr. 13, 99423 Weimar, Germany; School of Civil & Environmental Engineering, Addis Ababa Institute of Technology, AAiT, Addis Ababa University, Addis Ababa, Ethiopia.
- Uwe Plank-Wiedenbeck
  - Chair of Transport System Planning, Faculty of Civil Engineering, Bauhaus University Weimar, Schwanseestr. 13, 99423 Weimar, Germany
- Bikila Teklu Wodajo
  - School of Civil & Environmental Engineering, Addis Ababa Institute of Technology, AAiT, Addis Ababa University, Addis Ababa, Ethiopia
11
Khan MN, Das S, Liu J. Predicting pedestrian-involved crash severity using inception-v3 deep learning model. ACCIDENT; ANALYSIS AND PREVENTION 2024; 197:107457. [PMID: 38219599 DOI: 10.1016/j.aap.2024.107457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 12/17/2023] [Accepted: 01/02/2024] [Indexed: 01/16/2024]
Abstract
This research leverages a novel deep learning model, Inception-v3, to predict pedestrian crash severity using data collected over five years (2016-2021) from Louisiana. The final dataset incorporates forty different variables related to pedestrian attributes, environmental conditions, and vehicular specifics. Crash severity was classified into three categories: fatal, injury, and no injury. The Boruta algorithm was applied to determine the importance of variables and investigate contributing factors to pedestrian crash severity, revealing several associated aspects, including pedestrian gender, pedestrian and driver impairment, posted speed limits, alcohol involvement, pedestrian age, visibility obstruction, roadway lighting conditions, and both pedestrian and driver conditions, including distraction and inattentiveness. To address data imbalance, the study employed Random Under Sampling (RUS) and the Synthetic Minority Oversampling Technique (SMOTE). The DeepInsight technique transformed numeric data into images. Subsequently, five crash severity prediction models were developed with Inception-v3, considering various scenarios, including original, under-sampled, over-sampled, a combination of under and over-sampled data, and the top twenty-five important variables. Results indicated that the model applying both over and under sampling outperforms models based on other data balancing techniques in terms of several performance metrics, including accuracy, sensitivity, precision, specificity, false negative ratio (FNR), false positive ratio (FPR), and F1-score. This model achieved prediction accuracies of 93.5%, 77.5%, and 85.9% for fatal, injury, and no injury categories, respectively. Additionally, comparative analysis based on several performance metrics and McNemar's tests demonstrated that the predictive performance of the Inception-v3 deep learning model is statistically superior compared to traditional machine learning and statistical models. 
The insights from this research can be effectively harnessed by safety professionals, emergency service providers, traffic management centers, and vehicle manufacturers to enhance their safety measures and applications.
Affiliation(s)
- Md Nasim Khan
- Senior Engineer, AtkinsRealis, 11801 Domain Blvd Suite 500, Austin, TX 78758, United States.
| | - Subasish Das
- Assistant Professor, Texas State University, 601 University Drive, San Marcos, TX 78666, United States.
| | - Jinli Liu
- Geography and Environmental Studies, Texas State University, 601 University Drive, San Marcos, TX 78666, United States.
| |
|
12
|
Balawi M, Tenekeci G. Time series traffic collision analysis of London hotspots: Patterns, predictions and prevention strategies. Heliyon 2024; 10:e25710. [PMID: 38384520 PMCID: PMC10878868 DOI: 10.1016/j.heliyon.2024.e25710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2023] [Revised: 01/13/2024] [Accepted: 02/01/2024] [Indexed: 02/23/2024] Open
Abstract
Despite recent accident-prevention measures, road collisions, mainly on London's "A" roads, persist and endanger vulnerable users in particular. Analysing evidence from London's A-roads unveils issues, concerns, and trends. This study utilises extensive data to target factors that magnify accidents: speed, traffic, and interactions with vulnerable users. Stats 19 and transport data, including volumes, types, speeds, and congestion parameters, are analysed alongside the collision data. Descriptive statistics were first employed to understand the nature of the data. This supported cleansing the data of outliers and of periods that were subject to incidents and interventions. Predictive models were then developed to analyse and forecast accident frequency: ARIMA and SARIMAX models forecasted accident rates and interventions, with ARIMA yielding higher accuracy. The method of analysis resulted in a statistically reliable formulation of the main factors, enabling its use for similar cities across the world. The formulated analysis revealed population density, weather, and time of day as key contributors. The analysis supported the identification of strategies, namely infrastructure improvements and traffic control measures, and of the effects on severity and on vulnerable users in particular. The analysis reveals distinct patterns of causation, leading to focused recommendations on infrastructure enhancements, traffic control measures, and the impact on severity and vulnerable users, deviating from prior research findings. These insights aid safer London roads and have global predictive and mitigation value.
Affiliation(s)
- Mohammad Balawi
- Cyprus International University, North Nicosia, North Cyprus
| | - Goktug Tenekeci
- Jacobs; and Cyprus International University, North Nicosia, North Cyprus
| |
|
13
|
Thapa D, Mishra S, Velaga NR, Patil GR. Advancing proactive crash prediction: A discretized duration approach for predicting crashes and severity. ACCIDENT; ANALYSIS AND PREVENTION 2024; 195:107407. [PMID: 38056024 DOI: 10.1016/j.aap.2023.107407] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/07/2023] [Revised: 11/04/2023] [Accepted: 11/24/2023] [Indexed: 12/08/2023]
Abstract
Driven by advancements in data-driven methods, recent developments in proactive crash prediction models have primarily focused on implementing machine learning and artificial intelligence. However, from a causal perspective, statistical models are preferred for their ability to estimate effect sizes using variable coefficients and elasticity effects. Most statistical framework-based crash prediction models adopt a case-control approach, matching crashes to non-crash events. However, accurately defining the crash-to-non-crash ratio and incorporating crash severities pose challenges. Few studies have ventured beyond the case-control approach to develop proactive crash prediction models, such as the duration-based framework. This study extends the duration-based modeling framework to create a novel framework for predicting crashes and their severity. Addressing the increased computational complexity resulting from incorporating crash severities, we explore a tradeoff between model performance and estimation time. Results indicate that a 15% sample drawn at the epoch level achieves a balanced approach, reducing data size while maintaining reasonable predictive accuracy. Furthermore, stability analysis of predictor variables across different samples reveals that variables such as Time of day (Early afternoon), Weather condition (Clear), Lighting condition (Daytime), Illumination (Illuminated), and Volume require larger samples for more accurate coefficient estimation. Conversely, Daytime (Early morning, Late morning, Late afternoon), Lighting condition (Dark lighted), Terrain (Flat), Land use (Commercial, Rural), Number of lanes, and Speed converge towards true estimates with small incremental increases in sample size. The validation reveals that the model performs better in highway segments experiencing more frequent crashes (segments where the duration between crashes is less than 100 h, or approximately 4 days).
Affiliation(s)
- Diwas Thapa
- Department of Civil Engineering, University of Memphis, Memphis, TN 38152, United States.
| | - Sabyasachee Mishra
- Department of Civil Engineering, University of Memphis, Memphis, TN 38152, United States.
| | - Nagendra R Velaga
- Department of Civil Engineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India.
| | - Gopal R Patil
- Department of Civil Engineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India.
| |
|
14
|
Venthuruthiyil SP, Thapa D, Mishra S. Towards smart work zones: Creating safe and efficient work zones in the technology era. JOURNAL OF SAFETY RESEARCH 2023; 87:345-366. [PMID: 38081707 DOI: 10.1016/j.jsr.2023.08.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/28/2022] [Revised: 04/10/2023] [Accepted: 08/08/2023] [Indexed: 12/18/2023]
Abstract
INTRODUCTION Work Zones (WZs) have long been identified as a source of traffic fatalities and delays. Despite considerable technological advances that have alleviated many operational challenges associated with a WZ, social concerns about safety and mobility near WZs remain. Notably, the concept of a Smart Work Zone (SWZ) emerged from the compelling need to improve the safety and mobility of traffic and other WZ participants. This study reviewed the literature to assimilate studies related to SWZ Systems (SWZSs), report their findings, and ascertain a future path forward. METHOD To accomplish this, the existing WZ-related literature base was clustered into safety and traffic mobility topics using Latent Dirichlet Allocation (LDA) modeling. A thorough investigation of the pivotal inferences for the research topics was undertaken to comprehend current SWZ technologies and the need for further research. RESULTS The review uncovered the prominent features of SWZSs reported in the literature and the hindrances to their adoption. The most reported hindrances are the cost and effort associated with development, installation, and relocation. We uncover that Connected Autonomous Vehicles, vehicle-to-vehicle, and vehicle-to-infrastructure communication, along with technology-based worker training are the most promising next frontier for SWZ. CONCLUSION Significant research gaps exist in the literature regarding developing and implementing SWZS. Additionally, little effort has been directed toward developing workers' skills and competency. Practical approaches such as Virtual Reality (VR)-based training are necessary to bring workers up to pace with the developing SWZ technologies. PRACTICAL APPLICATIONS Future research should be directed towards interconnecting and implementing available safety technologies to automate WZ safety and management. Workers should be trained using more practical techniques. 
In this context, using VR will enable the simulation of hazardous events in a safe environment while also improving workers' skill retention.
Affiliation(s)
- Suvin P Venthuruthiyil
- Department of Civil Engineering, Indian Institute of Technology Hyderabad, Kandi, Telangana, India-502285.
| | - Diwas Thapa
- Associate Professor and Faudree Professor of Civil Engineering, Department of Civil Engineering, University of Memphis, 3815 Central Avenue, Memphis, TN 38152, United States.
| | - Sabyasachee Mishra
- Associate Professor and Faudree Professor of Civil Engineering, Department of Civil Engineering, University of Memphis, 3815 Central Avenue, Memphis, TN 38152, United States.
| |
|
15
|
Liu D, Li D, Sze NN, Ding H, Song Y. An integrated data- and theory-driven crash severity model. ACCIDENT; ANALYSIS AND PREVENTION 2023; 193:107282. [PMID: 37722256 DOI: 10.1016/j.aap.2023.107282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/15/2023] [Revised: 08/31/2023] [Accepted: 09/01/2023] [Indexed: 09/20/2023]
Abstract
For crash severity modeling, researchers typically view theory-driven models and data-driven models as different or even conflicting approaches. The reason is that data-driven machine-learning models offer good predictability but weak interpretability, while theory-driven models have robust interpretability but moderate predictability. In order to alleviate the tension between them, this study proposes an integrated data- and theory-driven crash-severity model, known as the Embedded Fusion model based on Text Vector Representations (TVR-EF), by leveraging the complementary strengths of both. The model specification consists of two parts. (i) The data-driven component not only mitigates the deficiencies of traditional econometric models, where one-hot encoding is frequently used and makes it impossible to observe semantic relatedness between variable categories, but also enhances the interpretability of the relationship between crash severity and potential influencing factors using the learned embedding weight matrix. (ii) In the theory-driven component, the multinomial logit model is implemented as a 2D-Convolutional Neural Network (2D-CNN) to increase flexibility and decrease dependency on prior knowledge for different crash-severity outcomes. A crash dataset from Guangdong Province, China, is utilized to estimate the TVR-EF model, which is then benchmarked against two traditional econometric models and three widely used machine-learning models. Results indicate that the TVR-EF model not only improves predictive performance but is also easier to interpret.
Affiliation(s)
- Dongjie Liu
- School of Transportation, Southeast University, Nanjing, Jiangsu 211189, China
| | - Dawei Li
- School of Transportation, Southeast University, Nanjing, Jiangsu 211189, China; Jiangsu Key Laboratory of Urban ITS, Southeast University, Nanjing, Jiangsu 211189, China; Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, Nanjing, Jiangsu 211189, China.
| | - N N Sze
- Department of Civil and Environmental Engineering, The Hong Kong Polytechnic University, Hong Kong SAR, China
| | - Hongliang Ding
- Department of Civil and Environmental Engineering, The Hong Kong Polytechnic University, Hong Kong SAR, China; Institute of Smart City and Intelligent Transportation, Institute of Urban Rail Transportation, Southwest Jiaotong University, Chengdu, Sichuan 611756, China
| | - Yuchen Song
- School of Transportation, Southeast University, Nanjing, Jiangsu 211189, China
| |
|
16
|
Maghelal P, Ali AH, Azar E, Jayaraman R, Khalaf K. Severity of vehicle-to-vehicle accidents in the UAE: An exploratory analysis using machine learning algorithms. Heliyon 2023; 9:e20694. [PMID: 37829796 PMCID: PMC10565775 DOI: 10.1016/j.heliyon.2023.e20694] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2023] [Revised: 10/02/2023] [Accepted: 10/04/2023] [Indexed: 10/14/2023] Open
Abstract
The World Health Organization (WHO) identifies road traffic injuries as a global health problem. The Eastern Mediterranean region in particular suffers from low traffic safety levels, recording the third highest death per capita ratio in the world. It is critical to evaluate and understand the causes of crashes and their severity levels as a first step to devising policies that aim to reduce these causes. Previous studies examining the frequency or severity of crashes present important limitations that motivate the need for the current work. While these studies have investigated the relation of contributing factors to the severity of crashes, only recently has the importance of these factors been investigated. Even then, few studies have explored various machine learning models, and none in the Middle Eastern region. This is critical because the WHO report concludes that the chances of dying in a traffic crash in this region are second only to Africa per 100,000 population. This is a first study analyzing the severity of vehicle-to-vehicle crashes among drivers in the United Arab Emirates. Traffic crash data were obtained from the Abu Dhabi Police, consisting of 11,400 observations during the period 2014-2017. Machine learning algorithms, including gradient boosting (GB), support vector machines (SVM), and random forest (RF), were trained and tested to predict crash severity and extract (using feature analysis) its determinants. The models were evaluated using two performance metrics: prediction accuracy and F1-scores. The RF model outperformed both GB and SVM, with the confusion matrix of RF reporting a better prediction for all four crash severity classes. The feature importance analysis indicates that the age of the car, the age of the injured, and the age of the initiator have the highest effect on severity, which is an important finding as the listed factors were rarely considered in previous studies. 
Vehicle and road characteristics such as vehicle class, crash type, and lighting are slightly associated with the severity. Consistent with other studies, gender was the least essential predictor of severity. Recommendations are finally provided to the Abu Dhabi Department of Municipalities and Transport (AD-DMT) authority to guide the development of road safety policies and countermeasures to mitigate the occurrence and severity of crashes.
Affiliation(s)
- Praveen Maghelal
- Faculty of Resilience, Rabdan Academy, Abu Dhabi, United Arab Emirates
| | - Abdulrahim Haroun Ali
- Industrial and Systems Engineering, Khalifa University, Abu Dhabi, United Arab Emirates
| | - Elie Azar
- Civil and Environmental Engineering, Carleton University, Ottawa, ON, Canada
| | - Raja Jayaraman
- Industrial and Systems Engineering, Khalifa University, Abu Dhabi, United Arab Emirates
| | - Kinda Khalaf
- Biomedical Engineering and Health Engineering Innovation Center, Khalifa University, Abu Dhabi, United Arab Emirates
| |
|
17
|
Li P, Li J. Exploration of the application of Grey-Markov models in the causality analysis of traffic accidents in roundabouts. PLoS One 2023; 18:e0287045. [PMID: 37768978 PMCID: PMC10538742 DOI: 10.1371/journal.pone.0287045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/28/2023] [Accepted: 05/30/2023] [Indexed: 09/30/2023] Open
Abstract
We propose a multivariate Grey-Markov model to quantify traffic accident risk from different causality factors in roundabouts that is uniquely suited for the scarce and stochastic traffic crash data from roundabouts. A data sample of traffic crashes occurring in roundabouts in the U.S. State of Michigan from 2016 to 2021 was collected to investigate the capabilities of this modeling methodology. The multivariate grey model (MGM(1,4)) was constructed using grey relational analysis to determine the best dimensions for model optimization. Then, the Markov chain is introduced to address the unfitness of stochastic, fluctuating data in the MGM(1,4) model. Finally, our proposed hybrid MGM(1,4)-Markov model is compared with other models and validated. This study highlights the superior predictive performance of our MGM(1,4)-Markov model in forecasting roundabout traffic accidents under data-limited conditions, achieving a 3.02% accuracy rate, in contrast to the traditional GM(1,1) model at 8.30% and the MGM(1,4) model at 4.47%. Moreover, incorporating human, vehicle, and environmental risk factors into a multivariate crash system yields more accurate predictions than merely aggregating crash counts.
Affiliation(s)
- Peijing Li
- Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, Michigan, United States of America
| | - Jian Li
- College of Fashion and Design, Donghua University, Shanghai, China
| |
|
18
|
Gregurić M, Vrbanić F, Ivanjko E. Towards the spatial analysis of motorway safety in the connected environment by using explainable deep learning. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2023.110523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 04/03/2023]
|
19
|
Sharma A, Lysenko A, Boroevich KA, Tsunoda T. DeepInsight-3D architecture for anti-cancer drug response prediction with deep-learning on multi-omics. Sci Rep 2023; 13:2483. [PMID: 36774402 PMCID: PMC9922304 DOI: 10.1038/s41598-023-29644-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2022] [Accepted: 02/08/2023] [Indexed: 02/13/2023] Open
Abstract
Modern oncology offers a wide range of treatments, and therefore choosing the best option for a particular patient is very important for an optimal outcome. Multi-omics profiling in combination with AI-based predictive models has great potential for streamlining these treatment decisions. However, these encouraging developments continue to be hampered by the very high dimensionality of the datasets in combination with insufficiently large numbers of annotated samples. Here we proposed a novel deep learning-based method to predict patient-specific anticancer drug response from three types of multi-omics data. The proposed DeepInsight-3D approach relies on structured data-to-image conversion that then allows use of convolutional neural networks, which are particularly robust to high dimensionality of the inputs while retaining capabilities to model highly complex relationships between variables. Of particular note, we demonstrate that in this formalism additional channels of an image can be effectively used to accommodate data from different omics layers while implicitly encoding the connection between them. DeepInsight-3D was able to outperform other state-of-the-art methods applied to this task. The proposed improvements can facilitate the development of better personalized treatment strategies for different cancers in the future.
Affiliation(s)
- Alok Sharma
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
- Institute for Integrated and Intelligent Systems, Griffith University, Brisbane, Australia.
| | - Artem Lysenko
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
- Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan.
| | - Keith A Boroevich
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Tatsuhiko Tsunoda
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan.
- Laboratory for Medical Science Mathematics, Department of Biological Sciences, School of Science, The University of Tokyo, Tokyo, Japan.
- Laboratory for Medical Science Mathematics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan.
| |
|
20
|
Mohammadpour SI, Khedmati M, Zada MJH. Classification of truck-involved crash severity: Dealing with missing, imbalanced, and high dimensional safety data. PLoS One 2023; 18:e0281901. [PMID: 36947539 PMCID: PMC10032500 DOI: 10.1371/journal.pone.0281901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2022] [Accepted: 02/02/2023] [Indexed: 03/23/2023] Open
Abstract
While the cost of road traffic fatalities in the U.S. surpasses $240 billion a year, the availability of high-resolution datasets allows meticulous investigation of the contributing factors to crash severity. In this paper, the dataset for Trucks Involved in Fatal Accidents in 2010 (TIFA 2010) is utilized to classify the truck-involved crash severity where there exist different issues including missing values, imbalanced classes, and high dimensionality. First, a decision tree-based algorithm, the Synthetic Minority Oversampling Technique (SMOTE), and the Random Forest (RF) feature importance approach are employed for missing value imputation, minority class oversampling, and dimensionality reduction, respectively. Afterward, a variety of classification algorithms, including RF, K-Nearest Neighbors (KNN), Multi-Layer Perceptron (MLP), Gradient-Boosted Decision Trees (GBDT), and Support Vector Machine (SVM) are developed to reveal the influence of the introduced data preprocessing framework on the output quality of ML classifiers. The results show that the GBDT model outperforms all the other competing algorithms for the non-preprocessed crash data based on the G-mean performance measure, but the RF makes the most accurate prediction for the treated dataset. This finding indicates that after the feature selection is conducted to alleviate the computational cost of the machine learning algorithms, bagging (bootstrap aggregating) of decision trees in RF leads to a better model rather than boosting them via GBDT. Besides, the adopted feature importance approach decreases the overall accuracy by only up to 5% in most of the estimated models. Moreover, the worst class recall value of the RF algorithm without prior oversampling is only 34.4% compared to the corresponding value of 90.3% in the up-sampled model which validates the proposed multi-step preprocessing scheme. 
This study also identifies the temporal and spatial (roadway) attributes, as well as crash characteristics, and Emergency Medical Service (EMS) as the most critical factors in truck crash severity.
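The G-mean and worst-class recall cited in this abstract can be illustrated with a minimal sketch. It assumes the common definition of G-mean as the geometric mean of per-class recalls; the abstract does not state the exact formula used, and the confusion matrix below is hypothetical, not taken from TIFA 2010.

```python
import math


def per_class_recall(confusion):
    """Recall per class; confusion[i][j] counts true class i predicted as class j."""
    recalls = []
    for i, row in enumerate(confusion):
        total = sum(row)
        recalls.append(row[i] / total if total else 0.0)
    return recalls


def g_mean(confusion):
    """Geometric mean of per-class recalls (assumed definition of G-mean)."""
    recalls = per_class_recall(confusion)
    return math.prod(recalls) ** (1.0 / len(recalls))


# Hypothetical imbalanced two-class result (illustrative numbers only).
cm = [[90, 10],   # majority class: 90 of 100 recalled
      [6, 4]]     # minority class: 4 of 10 recalled

accuracy = (cm[0][0] + cm[1][1]) / sum(sum(row) for row in cm)
print("accuracy:", round(accuracy, 3))
print("worst-class recall:", min(per_class_recall(cm)))
print("G-mean:", round(g_mean(cm), 3))
```

In this toy case overall accuracy stays high while the G-mean is pulled down by the weak minority-class recall, which is the reason such a measure is often preferred for imbalanced severity data.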
Affiliation(s)
| | - Majid Khedmati
- Department of Industrial Engineering, Sharif University of Technology, Tehran, Iran
| | | |
|
21
|
Fiorentini N, Leandri P, Losa M. Defining machine learning algorithms as accident prediction models for Italian two-lane rural, suburban, and urban roads. Int J Inj Contr Saf Promot 2022; 29:450-462. [PMID: 35613339 DOI: 10.1080/17457300.2022.2075397] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022]
Abstract
Four Accident Prediction Models have been defined for Italian two-lane rural, suburban, and urban roads by exploiting different Machine Learning Algorithms. Specifically, a Classification and Regression Tree, a Boosted Regression Tree, a Random Forest, and a Support Vector Machine have been implemented to predict the number of Fatal and Injury (FI) crashes on a 905-km network, which experienced 5,802 FI crashes in 2008-2016. The dataset incorporates geometrical, functional, and environmental information. Several performance measures have been computed, such as the Determination Coefficient, Mean Absolute Error, and Root Mean Square Error, together with scatterplots. Outcomes suggest that the Support Vector Machine outperforms the other Machine Learning Algorithms for predicting Fatal and Injury crashes. In addition, the computation of Predictor Importance shows that traffic flow, the density of intersections, driveway density, and type of area are the factors with the greatest impact on crash likelihood. Road authorities may use these findings for conducting reliable safety analyses.
Affiliation(s)
- Nicholas Fiorentini
- Department of Civil and Industrial Engineering (DICI), Engineering School, University of Pisa, Pisa, Italy
| | - Pietro Leandri
- Department of Civil and Industrial Engineering (DICI), Engineering School, University of Pisa, Pisa, Italy
| | - Massimo Losa
- Department of Civil and Industrial Engineering (DICI), Engineering School, University of Pisa, Pisa, Italy
| |
|
22
|
Chughtai JUR, Haq IU, Muneeb M. An attention-based recurrent learning model for short-term travel time prediction. PLoS One 2022; 17:e0278064. [PMID: 36454768 PMCID: PMC9714702 DOI: 10.1371/journal.pone.0278064] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2022] [Accepted: 11/09/2022] [Indexed: 12/03/2022] Open
Abstract
With the advent of Big Data technology and the Internet of Things, Intelligent Transportation Systems (ITS) have become inevitable for future transportation networks. Travel time prediction (TTP) is an essential part of ITS and plays a pivotal role in congestion avoidance and route planning. Novel data sources such as smartphones and in-vehicle navigation applications allow traffic conditions in smart cities to be analyzed and forecast more reliably than ever. Such a massive amount of geospatial data provides a rich source of information for TTP. The Gated Recurrent Unit (GRU) has been successfully applied to traffic prediction problems due to its ability to handle long-term traffic sequences. However, the existing GRU does not consider the relationship between various historical travel time positions in the sequences for traffic prediction. To cope with this problem, we propose an attention-based GRU model for short-term travel time prediction, enabling the GRU to learn the relevant context in historical travel time sequences and update the weights of hidden states accordingly. We evaluated the proposed model using FCD data from Beijing. To demonstrate the generalization of our proposed model, we performed a robustness analysis by adding noise obeying a Gaussian distribution. The experimental results on test data indicated that our proposed model performed better than the existing deep learning time-series models in terms of Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Coefficient of Determination (R²).
Affiliation(s)
- Jawad-ur-Rehman Chughtai
- Department of Computer and Information Sciences (DCIS), PIEAS, Islamabad, Pakistan
- Digital Disruption Lab, DCIS, PIEAS, Islamabad, Pakistan
| | - Irfan Ul Haq
- Department of Computer and Information Sciences (DCIS), PIEAS, Islamabad, Pakistan
- Digital Disruption Lab, DCIS, PIEAS, Islamabad, Pakistan
| | - Muhammad Muneeb
- Department of Mathematics, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
| |
|
23
|
Abdi A, Seyedabrishami S, Llorca C, Moreno AT. Exploring the effects of stationary camera spots on inferences drawn from real-time crash severity models. Sci Rep 2022; 12:20321. [PMID: 36434001 PMCID: PMC9700803 DOI: 10.1038/s41598-022-24102-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2022] [Accepted: 11/10/2022] [Indexed: 11/27/2022] Open
Abstract
This study combined crash reports, land use, real-time traffic, and weather data to form an integrated database to analyze the severity of crashes taking place on rural highways. As the traffic cameras are placed at fixed locations, there is a wide range of measured distances between crashes and the selected nearest camera for extracting traffic variables. This may change the significance of traffic variables. For the first time, spacing was introduced as the distance around the detectors within which traffic characteristics are attributed to crashes. Classification and Regression Tree (CART) was employed as an interpretable tool to explore how spacing affects model performance and the significance of traffic variables. Twelve spacing scenarios from 250 to 3000 m were evaluated. Except for short spacings suffering from the low sample size issue, each model has a good predictive performance based on overall accuracy and F2 score in the 1000-3000 m spacings. In this range, three dominant rules emerged: (1) high deviations of speed on the roads surrounded by wastelands are associated with severe crashes; (2) faded markings in residential zones increase the likelihood of severe outcomes; (3) installation of barriers decreases the probability of severe crashes. Comparing the Variable Importance Measure (VIM) reveals that the total importance of traffic variables reduces as the spacing increases. Also, results indicate that average speed is significant until 1750 m, but speed deviation, traffic flow, and percent of heavy vehicles are more stable variables for further spacings. In conclusion, for the first time, spacing scenarios were evaluated systematically and shown to have a remarkable impact on the significance of variables. This novel research provides guidance not only on the spacing but also on which real-time traffic variables have a greater impact on crash severity, along with design, land use, and environmental variables.
Affiliation(s)
- Amirhossein Abdi
- Faculty of Civil and Environmental Engineering, Tarbiat Modares University, P.O. Box 14115-397, Tehran, Iran
| | - Seyedehsan Seyedabrishami
- Faculty of Civil and Environmental Engineering, Tarbiat Modares University, P.O. Box 14115-397, Tehran, Iran.
| | | | | |
|
24
|
Abstract
To achieve greater sustainability of the traffic system, the trend of traffic accidents in road traffic was analysed. Injuries from traffic accidents are among the leading factors in the suffering of people around the world. Injuries from road traffic accidents are predicted to be the third leading factor contributing to human deaths. Road traffic accidents have decreased in most countries during the last decade because of the Decade of Action for Road Safety 2011–2020. The main reasons behind the reduction of traffic accidents are improvements in the construction of vehicles and roads, the training and education of drivers, and advances in medical technology and medical care. The primary objective of this paper is to investigate the pattern in the time series of traffic accidents in the city of Belgrade. Time series have been analysed using exploratory data analysis to describe and understand the data, the method of regression, and the Box–Jenkins seasonal autoregressive integrated moving average model (SARIMA). The study found that the time series has a pronounced seasonal character. The model presented in the paper has a mean absolute percentage error (MAPE) of 5.22%, which can be seen as an indicator that the prognosis is acceptably accurate. Forecasting the number of traffic accidents may be a strategy to achieve different goals, such as traffic safety campaigns, traffic safety strategies, and action plans to achieve the objectives defined in traffic safety strategies.
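For readers unfamiliar with the MAPE figure quoted in this abstract, a minimal sketch of the standard definition follows. The function name and the sample counts are hypothetical and are not data from the Belgrade study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent (standard definition)."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical monthly accident counts (illustrative only).
observed = [100, 200, 400]
predicted = [110, 190, 400]
print(round(mape(observed, predicted), 2))
```

Because each error is scaled by the observed value, MAPE is comparable across series of different magnitude, but it cannot be computed for periods with zero observed accidents.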
|
25
|
Zhang S, Khattak A, Matara CM, Hussain A, Farooq A. Hybrid feature selection-based machine learning Classification system for the prediction of injury severity in single and multiple-vehicle accidents. PLoS One 2022; 17:e0262941. [PMID: 35108288 PMCID: PMC8809572 DOI: 10.1371/journal.pone.0262941] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/09/2021] [Accepted: 01/07/2022] [Indexed: 11/19/2022]
Abstract
To undertake a reliable analysis of injury severity in road traffic accidents, a complete understanding of important attributes is essential. As a result of the shift from traditional statistical parametric procedures to computer-aided methods, machine learning approaches have become an important aspect in predicting the severity of road traffic injuries. The paper presents a hybrid feature selection-based machine learning classification approach for detecting significant attributes and predicting injury severity in single and multiple-vehicle accidents. To begin, we employed a Random Forests (RF) classifier in conjunction with an intrinsic wrapper-based feature selection approach called the Boruta Algorithm (BA) to find the relevant important attributes that determine injury severity. The influential attributes were then fed into a set of four classifiers to accurately predict injury severity (Naive Bayes (NB), K-Nearest Neighbor (K-NN), Binary Logistic Regression (BLR), and Extreme Gradient Boosting (XGBoost)). According to BA's experimental investigation, the vehicle type was the most influential factor, followed by the month of the year, the driver's age, and the alignment of the road segment. The driver's gender, the presence of a median, and the presence of a shoulder were all found to be unimportant. According to classifier performance measures, XGBoost surpasses the other classifiers in terms of prediction performance. Using the specified attributes, the accuracy, Cohen's Kappa, F1-Measure, and AUC-ROC values of the XGBoost were 82.10%, 0.607, 0.776, and 0.880 for single vehicle accidents and 79.52%, 0.569, 0.752, and 0.86 for multiple-vehicle accidents, respectively.
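Cohen's Kappa, one of the measures reported above, corrects raw accuracy for the agreement expected by chance. The sketch below computes it from a confusion matrix; the function name `cohens_kappa` and the 2x2 matrix are hypothetical illustrations, not values from the paper.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, columns: predicted)."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


# Hypothetical balanced two-class example (illustrative only).
matrix = [[40, 10],
          [10, 40]]
print(round(cohens_kappa(matrix), 3))
```

In this made-up example the accuracy is 0.8 but half of that agreement is expected by chance, so kappa comes out near 0.6.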
Affiliation(s)
- Shuguang Zhang
- CCCC Southwest Investment & Development Company Limited, Beijing, China
- Afaq Khattak
- The Key Laboratory of Road and Traffic Engineering, Ministry of Education, Tongji University, Jiading, Shanghai, China
- Arshad Hussain
- NUST Institute of Civil Engineering, National University of Sciences and Technology, Islamabad, Pakistan
- Asim Farooq
- Head of Department at Centre of Excellence in Transportation Engineering, Pak-Austria Fachhochschule, Institute of Applied Sciences, Haripur, Pakistan
|
26
|
|
27
|
Vaiyapuri T, Gupta M. Traffic accident severity prediction and cognitive analysis using deep learning. Soft comput 2021. [DOI: 10.1007/s00500-021-06515-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/04/2023]
|
28
|
Crash Injury Severity Prediction Using an Ordinal Classification Machine Learning Approach. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2021; 18:ijerph182111564. [PMID: 34770076 PMCID: PMC8583475 DOI: 10.3390/ijerph182111564] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2021] [Revised: 10/27/2021] [Accepted: 10/30/2021] [Indexed: 11/16/2022]
Abstract
In many related works, nominal classification algorithms ignore the order between injury severity levels and make sub-optimal predictions. Existing ordinal classification methods suffer from rank inconsistency and rank non-monotonicity. The aim of this paper is to propose an ordinal classification approach to predict traffic crash injury severity and to test its performance against existing machine learning classification methods. First, we compare the performance of the neural network, XGBoost, and SVM classifiers in injury severity prediction. Second, we utilize a severity category-combination method with oversampling to relieve the class-imbalance problem prevalent in crash data. Third, we take advantage of probability calibration and optimal probability threshold moving to improve the prediction ability of ordinal classification. The proposed approach satisfies the rank consistency and rank monotonicity requirements and is shown by statistical significance tests to be superior to other ordinal classification methods and to nominal classification machine learning methods. Important factors relating to injury severity are selected based on their permutation feature importance scores. We find that converting severity levels into three classes (minor injury, moderate injury, and serious injury) can substantially improve the prediction precision.
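The threshold-moving step mentioned above can be illustrated in its simplest binary form: instead of the default 0.5 cut-off, a decision threshold is chosen on validation data to maximise a target metric. The sketch below assumes F1 as that metric, which the abstract does not specify; the function names, labels, and probabilities are hypothetical, and the ordinal multi-class setting of the paper is not reproduced here.

```python
def f1_at_threshold(y_true, probs, threshold):
    """F1 score of the positive class when predicting 1 for probs >= threshold."""
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    fn = sum(1 for y, p in zip(y_true, probs) if p < threshold and y == 1)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(y_true, probs, candidates):
    """Return the candidate threshold with the highest F1 (first one wins ties)."""
    return max(candidates, key=lambda t: f1_at_threshold(y_true, probs, t))


# Hypothetical imbalanced validation set: 3 positives out of 8 samples.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.05, 0.12, 0.22, 0.28, 0.35, 0.41, 0.45, 0.62]
grid = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]

chosen = best_threshold(labels, scores, grid)
print(chosen, round(f1_at_threshold(labels, scores, chosen), 3))
```

With these made-up scores the default 0.5 cut-off recovers only one of the three positives, while a lower threshold recovers all three at the cost of one false positive, which is the usual motivation for moving the threshold on imbalanced crash data.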
|
29
|
Sharma A, Lysenko A, Boroevich KA, Vans E, Tsunoda T. DeepFeature: feature selection in nonimage data using convolutional neural network. Brief Bioinform 2021; 22:6343526. [PMID: 34368836 PMCID: PMC8575039 DOI: 10.1093/bib/bbab297] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 05/11/2021] [Revised: 06/30/2021] [Accepted: 07/14/2021] [Indexed: 12/14/2022]
Abstract
Artificial intelligence methods offer exciting new capabilities for the discovery of biological mechanisms from raw data because they are able to detect vastly more complex patterns of association that cannot be captured by classical statistical tests. Among these methods, deep neural networks are currently among the most advanced approaches and, in particular, convolutional neural networks (CNNs) have been shown to perform excellently for a variety of difficult tasks. Despite that, the application of this type of network to high-dimensional omics data and, most importantly, the meaningful interpretation of the results returned by such models in a biomedical context remain an open problem. Here we present an approach applying a CNN to nonimage data for feature selection. Our pipeline, DeepFeature, can both successfully transform omics data into a form that is optimal for fitting a CNN model and return sets of the most important genes used internally for computing predictions. Within the framework, the Snowfall compression algorithm is introduced to enable more elements in the fixed pixel framework, and a region accumulation and element decoder is developed to find elements or genes from the class activation maps. In comparative tests on a cancer type prediction task, DeepFeature simultaneously achieved superior predictive performance and a better ability to discover key pathways and biological processes meaningful for this context. Capabilities offered by the proposed framework can enable the effective use of powerful deep learning methods to facilitate the discovery of causal mechanisms in high-dimensional biomedical data.
Affiliation(s)
- Alok Sharma
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan
- Artem Lysenko
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan
- Keith A Boroevich
- Laboratory for Medical Science Mathematics, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan
- Edwin Vans
- STEMP, University of the South Pacific, Suva, Fiji
- Tatsuhiko Tsunoda
- Laboratory for Medical Science Mathematics, Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo 113-0033, Japan
|
30
|
Modeling Focused-Ultrasound Response for Non-Invasive Treatment Using Machine Learning. Bioengineering (Basel) 2021; 8:bioengineering8060074. [PMID: 34206007 PMCID: PMC8226898 DOI: 10.3390/bioengineering8060074] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/30/2021] [Revised: 05/27/2021] [Accepted: 05/27/2021] [Indexed: 11/16/2022]
Abstract
The interactions between body tissues and a focused ultrasound beam can be evaluated using various numerical models. Among these, the Rayleigh-Sommerfeld and angular spectrum methods are considered to be the most effective in terms of accuracy. However, they are computationally expensive, which is one of the underlying issues of most computational models. Typically, evaluations using these models require a significant amount of time (hours to days) if realistic scenarios such as tissue inhomogeneity or non-linearity are considered. This study aims to address this issue by developing a rapid estimation model for ultrasound therapy using a machine learning algorithm. Several machine learning models were trained on a very large dataset (19,227 simulations), and the performance of these models was evaluated with metrics such as Root Mean Squared Error (RMSE), R-squared (R2), Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC). The resulting random forest model provides superior accuracy, with an R2 value of 0.997, an RMSE of 0.0123, an AIC of -82.56, and a BIC of -81.65 on an external test dataset. The results indicate the efficacy of the random forest-based model for the focused ultrasound response, and practical adoption of this approach will improve the therapeutic planning process by minimizing simulation time.
|