1
Sohrabi S, Lord D, Dadashova B, Mannering F. Assessing the collective safety of automated vehicle groups: A duration modeling approach of accumulated distances between crashes. Accid Anal Prev 2024; 198:107454. [PMID: 38290409] [DOI: 10.1016/j.aap.2023.107454] [Received: 03/09/2023] [Revised: 12/19/2023] [Accepted: 12/29/2023] [Indexed: 02/01/2024]
Abstract
Ideally, the evaluation of automated vehicles would involve the careful tracking of individual vehicles and recording of observed crash events. Unfortunately, due to the low frequency of crash events, such data would require many years to acquire and could place the motorized public at risk if defective automated technologies were present. To acquire information on the safety effectiveness of automated vehicles more quickly, this paper uses the collective crash histories of a group of automated vehicles and applies a duration modeling approach to the accumulated distances between crashes. To demonstrate the applicability of this approach as a method to compare automated and conventional vehicles (human drivers), an empirical assessment was undertaken using two comparable sources of data. For conventional vehicles, police- and non-police-reportable crashes were collected from the Second Strategic Highway Research Program's naturalistic driving study; for automated vehicles, data from the California Department of Motor Vehicles Autonomous Vehicle Tester program were used (105 crashes from 59 permit holders driving ∼2.8 million miles were used for the analysis). The results of the empirical study showed that automated driving was safer at the 95% confidence level, with a higher number of miles between crashes relative to conventional vehicle counterparts. The findings indicate that the number of miles between crashes would increase by roughly 27% when switching from conventional vehicles to automated vehicles. Despite limited data, which mandated a group-vehicle approach, this study can be considered a reasonable initial approximation of automated vehicle safety.
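The core quantity in this duration framing is the accumulated mileage between successive crashes. As a minimal sketch (not the paper's full duration model, which allows covariates and duration dependence), the snippet below assumes exponentially distributed inter-crash mileage, so the group-level crash rate is simply crashes per mile; the inputs are the figures reported in the abstract.

```python
import math

def miles_between_crashes(total_miles, n_crashes):
    """MLE of the mean accumulated mileage between crashes, assuming
    inter-crash distances are exponentially distributed (constant hazard)."""
    rate = n_crashes / total_miles          # crashes per mile
    mean_gap = 1.0 / rate                   # expected miles between crashes
    # Rough 95% CI via the normal approximation to the Poisson count:
    # n_crashes +/- 1.96 * sqrt(n_crashes).
    lo = total_miles / (n_crashes + 1.96 * math.sqrt(n_crashes))
    hi = total_miles / (n_crashes - 1.96 * math.sqrt(n_crashes))
    return mean_gap, (lo, hi)

# Figures from the abstract: 105 crashes over ~2.8 million AV miles
av_gap, av_ci = miles_between_crashes(2_800_000, 105)
print(round(av_gap))   # ~26,667 miles between crashes
```

The constant-hazard assumption is the simplification here; the paper's duration model relaxes it.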
Affiliation(s)
- Soheil Sohrabi, Safe Transportation Research and Education Center, University of California, Berkeley, CA, USA.
- Dominique Lord, Zachry Department of Civil and Environmental Engineering, Texas A&M University, TX, USA.
- Bahar Dadashova, Texas A&M Transportation Institute, Texas A&M University, TX, USA.
- Fred Mannering, Center for Urban Transportation Research, University of South Florida, FL, USA.

2
Kuo PF, Sulistyah UD, Putra IGB, Lord D. Exploring the spatial relationship of e-bike and motorcycle crashes: Implications for risk reduction. J Safety Res 2024; 88:199-216. [PMID: 38485363] [DOI: 10.1016/j.jsr.2023.11.007] [Received: 03/24/2023] [Revised: 08/02/2023] [Accepted: 11/09/2023] [Indexed: 03/19/2024]
Abstract
INTRODUCTION Electric bicycles, or e-bikes, have become very popular over the past decade. In order to reduce the risk of crashes, it is necessary to understand the contributing factors. While several researchers have examined these elements, few have considered the spatial heterogeneity between crashes and environmental variables, such as Points of Interest (POI). In addition, there is a scarcity of studies comparing the crash-related factors of e-bikes and motorcycles. Because e-bikes cannot cover the same range that motorcycles can, given the two modes' differing speed and range capabilities, POIs also tend to affect different areas/bandwidths for each mode. METHOD In this study, we compared e-bike and motorcycle crashes at 11 different types of POIs in Taipei from 2016 to 2020. Since crashes are sparse events and easily affected by the Modifiable Areal Unit Problem (MAUP), Kernel Density Estimation (KDE) was employed to transform crash points (count data) into crash risk surfaces (continuous data). Additionally, an advanced variant of Geographically Weighted Regression (GWR), Multiscale Geographically Weighted Regression (MGWR), was utilized to predict crash risk because it allows each predictor to have a different bandwidth. RESULTS The results showed: (a) For e-bike crashes, the MGWR model outperformed the GWR and OLS models in terms of AIC values, while the MGWR and GWR performed similarly with regard to motorcycle crashes; (b) The analysis revealed e-bike and motorcycle crash risk to be associated with various types of POIs. E-bike crashes tended to occur more frequently in areas with more schools, supermarkets, intersections, and elderly people, whereas motorcycle crashes were more likely to occur in areas with a high number of restaurants and intersections. The search bandwidths of e-bikes were inconsistent and narrower than those of motorcycles.
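The count-to-surface step can be sketched with a plain Gaussian KDE. This is a simplified stand-in for the study's KDE/MGWR pipeline, with hypothetical crash coordinates and an arbitrary bandwidth:

```python
import numpy as np

def kde_surface(points, grid_x, grid_y, bandwidth):
    """Gaussian kernel density surface over a grid from crash point
    coordinates -- the count-to-continuous transformation described above."""
    gx, gy = np.meshgrid(grid_x, grid_y)
    density = np.zeros_like(gx, dtype=float)
    for (px, py) in points:
        d2 = (gx - px) ** 2 + (gy - py) ** 2
        density += np.exp(-d2 / (2.0 * bandwidth ** 2))
    # Scale so each kernel integrates to one crash
    density /= 2.0 * np.pi * bandwidth ** 2
    return density

# Toy example: three crash points on a 1 km x 1 km area (hypothetical data)
crashes = [(200.0, 300.0), (220.0, 310.0), (800.0, 700.0)]
xs = np.linspace(0, 1000, 101)
ys = np.linspace(0, 1000, 101)
surface = kde_surface(crashes, xs, ys, bandwidth=100.0)
print(surface.shape)   # (101, 101)
```

In MGWR, unlike this fixed-bandwidth toy, each predictor's bandwidth is calibrated separately, which is what allows the narrower e-bike search ranges to emerge.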
Affiliation(s)
- Pei-Fen Kuo, Department of Geomatics, National Cheng Kung University, Taiwan.
- Dominique Lord, Zachry Department of Civil and Environmental Engineering, Texas A&M University, USA.

3
Tamakloe R, Adanu EK, Atandzi J, Das S, Lord D, Park D. Stability of factors influencing walking-along-the-road pedestrian injury severity outcomes under different lighting conditions: A random parameters logit approach with heterogeneity in means and out-of-sample predictions. Accid Anal Prev 2023; 193:107333. [PMID: 37832357] [DOI: 10.1016/j.aap.2023.107333] [Received: 06/28/2023] [Revised: 09/27/2023] [Accepted: 09/29/2023] [Indexed: 10/15/2023]
Abstract
Pedestrians walking along the road's edge are more exposed and vulnerable than those on designated crosswalks. Often, they remain oblivious to the imminent perils of potential collisions with vehicles, making crashes involving these pedestrians relatively unique compared to others. While previous research has recognized that surrounding lighting conditions influence traffic crashes, the effect of different lighting conditions on walking-along-the-road pedestrian injury severity outcomes remains unexplored. This study examines the variations in the impact of risk factors on walking-along-the-road pedestrian-involved crash injury severity across various lighting conditions. Preliminary stability tests on the walking-along-the-road pedestrian-involved crash data obtained from Ghana revealed that the effect of most risk factors on injury severity outcomes is likely to differ under each lighting condition, warranting the estimation of separate models for each lighting condition. Thus, the data were grouped based on lighting conditions, and different models were estimated employing the random parameters logit model with heterogeneity in the means approach to capture different levels of unobserved heterogeneity in the crash data. From the results, heavy vehicles, shoulder presence, and older drivers were associated with fatal walking-along-the-road pedestrian outcomes during daylight conditions; indicators for male pedestrians and speeding had stronger associations with fatalities on unlit roads at night; and crashes occurring on Tuesdays and Wednesdays were likely to be severe on lit roads at night.
From the marginal effect estimates, although some explanatory variables showed consistent effects across lighting conditions, others, such as indicators for pedestrians aged under 25 years and between 25 and 44 years, exhibited significant variations in their impact across different lighting conditions, supporting the finding that the effects of risk factors are unstable. Further, the out-of-sample simulations underscored the shifts in factor effects between different lighting conditions, highlighting that enhancing visibility could play a pivotal role in significantly reducing fatalities associated with pedestrians walking along the road. Targeted engineering, education, and enforcement countermeasures are proposed from these insights to improve pedestrian safety locally and internationally.
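The "heterogeneity in means" idea can be illustrated with a toy binary random-parameters logit in which a coefficient's mean shifts with a context variable. Everything below (the speeding indicator x, the lighting covariate z, the coefficient values) is hypothetical, and the probability is obtained by Monte Carlo simulation rather than by estimation:

```python
import numpy as np

rng = np.random.default_rng(0)

def mixed_logit_prob(x, z, beta_mean, beta_std, theta, n_draws=5000):
    """Monte Carlo probability of a severe outcome in a binary
    random-parameters logit where the random coefficient's mean shifts
    with a characteristic z (heterogeneity in the mean)."""
    # Random coefficient: beta_i ~ Normal(beta_mean + theta * z, beta_std)
    draws = rng.normal(beta_mean + theta * z, beta_std, size=n_draws)
    utility = draws * x
    return float(np.mean(1.0 / (1.0 + np.exp(-utility))))

# Hypothetical: a 'speeding' indicator (x=1) whose mean effect is stronger
# on unlit roads at night (z=1) than in daylight (z=0)
p_day = mixed_logit_prob(x=1.0, z=0.0, beta_mean=0.2, beta_std=0.5, theta=0.8)
p_dark = mixed_logit_prob(x=1.0, z=1.0, beta_mean=0.2, beta_std=0.5, theta=0.8)
print(p_dark > p_day)   # True
```

The study's models are far richer (multinomial outcomes, many covariates, estimated distributions), but the mechanism, a covariate shifting the mean of a random coefficient, is the same.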
Affiliation(s)
- Reuben Tamakloe, Eco-friendly Smart Vehicle Research Center, Korea Advanced Institute of Science and Technology, Daejeon, South Korea; Cho Chun Shik Graduate School of Green Transportation, Korea Advanced Institute of Science and Technology, Daejeon, South Korea; Department of Transportation Engineering, The University of Seoul, Seoul, South Korea.
- Emmanuel Kofi Adanu, Alabama Transportation Institute, The University of Alabama, Tuscaloosa, USA.
- Jonathan Atandzi, School of Modern Logistics, Zhejiang Wanli University, Zhejiang Ningbo, China.
- Subasish Das, Ingram School of Engineering, Texas State University, San Marcos, USA.
- Dominique Lord, Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, USA.
- Dongjoo Park, Department of Transportation Engineering, The University of Seoul, Seoul, South Korea.

4
Tahir HB, Yasmin S, Lord D, Haque MM. Examining the performance of engineering treatment evaluation methodologies using the hypothetical treatment and actual treatment settings. Accid Anal Prev 2023; 188:107108. [PMID: 37178500] [DOI: 10.1016/j.aap.2023.107108] [Received: 03/13/2023] [Revised: 04/19/2023] [Accepted: 05/03/2023] [Indexed: 05/15/2023]
Abstract
The selection of a treatment evaluation methodology is paramount in determining reliable crash modification factors (CMFs) for engineering treatments. A lack of ground truth makes it cumbersome to examine the performance of treatment evaluation methodologies, and a sound methodological framework is critical for evaluating them. In addressing these challenges, this study proposed a framework for assessing treatment evaluation methodologies using both hypothetical treatments with known ground truth and actual real-world treatments. In particular, this study examined three before-after treatment evaluation approaches: (1) Empirical Bayes, (2) Simulation-based Empirical Bayes, and (3) Full Bayes methods. In addition, this study examined the Cross-Sectional treatment evaluation methodology. The methodological framework utilized five datasets of hypothetical treatments with known ground truth based on the hotspot identification method and a real-world dataset of wide centerline treatment on two-lane, two-way rural highways in Queensland, Australia. Results showed that all the methods could identify the ground truth of the hypothetical treatments, but the Full Bayes approach predicted the known ground truth better than the Empirical Bayes, Simulation-based Empirical Bayes, and Cross-Sectional methods. The Full Bayes approach was also found to provide the most precise estimate for the real-world wide centerline treatment along rural highways compared to the other methods. Moreover, the current study highlighted that the Cross-Sectional method offers a viable estimate of treatment effectiveness when before-period data are limited.
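For orientation, a bare-bones Empirical Bayes before-after calculation (without the variance-correction term of the full method, and with hypothetical counts and SPF predictions) looks like this:

```python
def eb_expected(observed, spf_pred, inv_dispersion):
    """Empirical Bayes estimate of a site's expected crashes before
    treatment: a weighted blend of the SPF prediction and the observed
    count, with the usual Negative Binomial weight."""
    w = 1.0 / (1.0 + spf_pred / inv_dispersion)
    return w * spf_pred + (1.0 - w) * observed

def cmf_before_after(obs_before, obs_after, spf_before, spf_after, phi):
    """Naive EB before-after crash modification factor (variance
    correction omitted for brevity)."""
    eb_before = eb_expected(obs_before, spf_before, phi)
    expected_after = eb_before * (spf_after / spf_before)  # exposure scaling
    return obs_after / expected_after

# Hypothetical hotspot: 12 crashes observed before vs an SPF prediction of 6
cmf = cmf_before_after(obs_before=12, obs_after=7, spf_before=6.0,
                       spf_after=6.0, phi=2.0)
print(round(cmf, 2))   # 0.67, i.e., ~33% crash reduction in this toy case
```

The EB blend is what corrects the regression-to-the-mean bias that a naive before-after comparison (7 vs 12) would suffer from; the paper's contribution is benchmarking this and the FB and cross-sectional alternatives against known ground truth.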
Affiliation(s)
- Hassan Bin Tahir, Queensland University of Technology, School of Civil and Environmental Engineering, Brisbane, Australia.
- Shamsunnahar Yasmin, Queensland University of Technology, School of Civil and Environmental Engineering, Centre for Accident Research and Road Safety - Queensland (CARRS-Q), Brisbane, Australia.
- Dominique Lord, Texas A&M University, Zachry Department of Civil and Environmental Engineering, TX, USA.
- Md Mazharul Haque, Queensland University of Technology, School of Civil and Environmental Engineering, Brisbane, Australia.

5
Islam ASMM, Shirazi M, Lord D. Finite mixture Negative Binomial-Lindley for modeling heterogeneous crash data with many zero observations. Accid Anal Prev 2022; 175:106765. [PMID: 35947924] [DOI: 10.1016/j.aap.2022.106765] [Received: 08/19/2021] [Revised: 06/22/2022] [Accepted: 06/25/2022] [Indexed: 06/15/2023]
Abstract
Crash data are often highly dispersed; they may also include a large number of zero observations or have a long tail. The traditional Negative Binomial (NB) model cannot model these data properly. To overcome this issue, the Negative Binomial-Lindley (NB-L) model has been proposed as an alternative to the NB for analyzing data with these characteristics. Research studies have shown that the NB-L model provides superior performance compared to the NB when data include numerous zero observations or have a long tail. In addition, crash data are often collected from sites with different spatial or temporal characteristics, so it is not unusual to assume that crash data are drawn from multiple subpopulations. Finite mixture models are powerful tools that can be used to account for underlying subpopulations and capture population heterogeneity. This research documents the derivations and characteristics of the finite mixture NB-L model (FMNB-L) for analyzing data generated from heterogeneous subpopulations with many zero observations and a long tail. We demonstrated the application of the model to identify subpopulations with a simulation study. We then used the FMNB-L model to estimate statistical models for Texas four-lane freeway crashes. These data have unique characteristics: they are highly dispersed and include many locations with a very large number of crashes as well as a significant number of locations with zero crashes. We used multiple goodness-of-fit metrics to compare the FMNB-L model with the NB, NB-L, and finite mixture NB models. The FMNB-L identified two subpopulations in the dataset, and the results show a significantly better fit by the FMNB-L compared to the other analyzed models.
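A finite mixture NB-L generator can be sketched by picking a subpopulation, drawing a Lindley frailty, and then drawing an NB count via its Poisson-Gamma representation. The parameters below are invented purely to show the excess-zero, long-tail shape such data take:

```python
import numpy as np

rng = np.random.default_rng(42)

def r_fmnb_lindley(n, weights, mus, phis, thetas):
    """Draws from a finite mixture NB-Lindley: pick a subpopulation k,
    scale its NB mean mu_k by a Lindley(theta_k) frailty, then draw the
    count from NB(mu_k * eps, phi_k) via the Poisson-Gamma mixture."""
    comp = rng.choice(len(weights), size=n, p=weights)
    mu = np.take(mus, comp)
    phi = np.take(phis, comp)
    theta = np.take(thetas, comp)
    # Lindley(theta) is a mixture of Exp(theta) and Gamma(2, theta)
    pick = rng.random(n) < theta / (theta + 1.0)
    eps = np.where(pick, rng.exponential(1.0 / theta),
                   rng.gamma(2.0, 1.0 / theta))
    # NB(mean, phi) as Poisson with Gamma(phi, mean/phi) intensity
    lam = rng.gamma(phi, (mu * eps) / phi)
    return rng.poisson(lam)

# Two hypothetical subpopulations: a low-mean majority and a high-mean minority
y = r_fmnb_lindley(20000, weights=[0.7, 0.3], mus=[0.3, 4.0],
                   phis=[1.0, 2.0], thetas=[2.0, 2.0])
print("zero share:", (y == 0).mean(), "max count:", y.max())
```

Even this toy generator produces the characteristic mixture shape: a large spike at zero from the first subpopulation alongside a long right tail from the second.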
Affiliation(s)
- A S M Mohaiminul Islam, Department of Civil and Environmental Engineering, University of Maine, Orono, ME 04469, USA.
- Mohammadali Shirazi, Department of Civil and Environmental Engineering, University of Maine, Orono, ME 04469, USA.
- Dominique Lord, Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, TX 77843, USA.

6
Khodadadi A, Tsapakis I, Shirazi M, Das S, Lord D. Derivation of the Empirical Bayesian method for the Negative Binomial-Lindley generalized linear model with application in traffic safety. Accid Anal Prev 2022; 170:106638. [PMID: 35339878] [DOI: 10.1016/j.aap.2022.106638] [Received: 09/27/2021] [Revised: 03/07/2022] [Accepted: 03/12/2022] [Indexed: 06/14/2023]
Abstract
The expected crash frequency is the long-term average crash count for a specific site. It is extensively used to systematically evaluate the crash risk associated with roadway elements. To estimate expected crashes, the Empirical Bayesian (EB) approach is typically employed. The EB method is a computationally convenient approximation to the Full Bayesian (FB) method and has gained popularity due to its simple interpretation, computational efficiency, and ability to account for regression-to-the-mean bias. However, the common EB method used in traffic safety analysis is only applicable when the traditional Negative Binomial (NB) model is used. The NB model, however, is not a suitable choice when data are highly dispersed, skewed, or contain a large number of zero observations. The Negative Binomial-Lindley (NB-L) model is a mixture of the NB and Lindley distributions and has shown superior fit compared to the NB model, especially when the dataset is characterized by excess zero observations. Even though several studies have used the NB-L in developing crash prediction models, the application of the NB-L in other safety-related tasks (e.g., hot spot identification) has been largely neglected. This study proposed a framework to develop the EB method for the NB-L model and subsequently estimate the expected crash values. A comparison between the EB and FB estimates was performed to validate the approximation framework in general. The results indicated that the proposed EB framework is able to estimate expected crashes with precision comparable to the FB estimate, but at much less computational cost. In addition, a site ranking analysis using the EB estimates was conducted to validate the proposed approximation method in safety studies; any other type of safety analysis that requires access to the expected crashes can also benefit from the proposed EB method.
This study concluded that the proposed EB framework can properly approximate the underlying FB approach and can reasonably be considered an alternative to the traditional EB formula derived from the NB model. The results of this study can help extend the application of advanced predictive models beyond predicting crashes to other safety-related tasks, with no additional computational effort.
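The paper derives its own EB approximation; purely as an illustration of what is being approximated, the NB-L posterior mean can be brute-forced numerically by integrating the Lindley-distributed frailty out of the NB likelihood (all parameter values below are hypothetical):

```python
import numpy as np
from math import lgamma

def nb_pmf(y, mean, phi):
    """Negative Binomial pmf with mean `mean` and inverse dispersion `phi`."""
    p = phi / (phi + mean)
    return np.exp(lgamma(y + phi) - lgamma(phi) - lgamma(y + 1)
                  + phi * np.log(p) + y * np.log(1.0 - p))

def lindley_pdf(eps, theta):
    return theta ** 2 / (theta + 1.0) * (1.0 + eps) * np.exp(-theta * eps)

def eb_nbl(y, mu, phi, theta, n_grid=4000):
    """Numerical posterior mean E[mu * eps | y] for the NB-Lindley model:
    y | eps ~ NB(mu * eps, phi), eps ~ Lindley(theta). A uniform grid means
    the spacing cancels in the ratio of sums."""
    eps = np.linspace(1e-4, 20.0, n_grid)
    post = np.array([nb_pmf(y, mu * e, phi) for e in eps]) * lindley_pdf(eps, theta)
    return mu * float(np.sum(eps * post) / np.sum(post))

# Hypothetical site: 5 observed crashes against an SPF mean of 1.2
est = eb_nbl(y=5, mu=1.2, phi=1.5, theta=2.0)
print(round(est, 2))
```

The posterior mean lands between the SPF-implied prior mean and the observed count, which is exactly the shrinkage behavior the EB approximation is meant to reproduce cheaply.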
Affiliation(s)
- Ali Khodadadi, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States.
- Ioannis Tsapakis, Texas A&M Transportation Institute, 3500 NW Loop 410, Suite 315, San Antonio, TX 78229, United States.
- Subasish Das, Texas A&M Transportation Institute, 3135 TAMU, College Station, TX 77843, United States.
- Dominique Lord, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States.

7
Yang J, Guo X, Xu M, Wang L, Lord D. Alcohol-impaired motorcyclists versus car drivers: A comparison of crash involvement and legal consequence from adjudication data. J Safety Res 2021; 79:292-303. [PMID: 34848010] [DOI: 10.1016/j.jsr.2021.09.011] [Received: 10/07/2020] [Revised: 04/08/2021] [Accepted: 09/23/2021] [Indexed: 06/13/2023]
Abstract
INTRODUCTION Driving under the influence (DUI) increases the probability of motor-vehicle collisions, especially for motorcycles, which offer less protection. This study aimed to identify commonalities and differences between criminal DUI offenses (i.e., with a blood alcohol concentration (BAC) of 80 mg/dL or higher) committed by motorcyclists and car drivers. METHODS A total of 10,457 motorcycle DUIs and 8,402 car DUIs were compared using a series of logistic regression models, with data extracted from the documents of adjudication decisions by the courts of Jiangsu, China. RESULTS The results revealed that offenders in the high-BAC group (i.e., 200 mg/dL or higher) accounted for more than 20% of total DUI offenses and were more likely to be involved in a crash and punished with a longer detention. Compared to alcohol-impaired car drivers, motorcyclists had a higher likelihood of crash involvement and were more likely to be responsible for single-vehicle crashes associated with higher odds of sustained injury. In the verdict, motorcycle offenders were more likely to receive a less severe penalty. CONCLUSIONS Interventions are clearly required that focus on reducing offenses in the high-BAC group. For alcohol-impaired motorcyclists, the risks of crash and injury climb more steeply with BAC than they do for car drivers. Factors including frequent occurrence, uncertainty of detection, and short-term sentences may weaken the deterrent effect of the criminalization of motorcycle DUI. Practical Applications: Traffic-related adjudication data can support traffic safety analysis. Strategies such as combating motorcycle violations (e.g., unlicensed operation or driving unsafe vehicles) and undertaking education and awareness campaigns are expected to aid DUI prevention.
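In the simplest two-group case, the logistic-regression comparisons reduce to odds ratios. A sketch with a Woolf confidence interval and invented counts (not the study's data):

```python
import math

def odds_ratio(a, b, c, d):
    """Odds ratio and Woolf 95% CI from a 2x2 table:
    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases."""
    orr = (a * d) / (b * c)
    se = math.sqrt(1/a + 1/b + 1/c + 1/d)   # SE of log odds ratio
    lo = math.exp(math.log(orr) - 1.96 * se)
    hi = math.exp(math.log(orr) + 1.96 * se)
    return orr, (lo, hi)

# Hypothetical counts: crash involvement among motorcycle vs car DUI offenders
orr, ci = odds_ratio(a=3000, b=7457, c=1500, d=6902)
print(round(orr, 2))   # 1.85: higher crash odds for the motorcycle group
```

The study's actual models adjust for BAC level and other covariates, which a raw 2x2 table cannot do.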
Affiliation(s)
- Jie Yang, Development Research Institute of Transportation Governed by Law, School of Law, Southeast University, Nanjing 210096, China.
- Xiaoyu Guo, Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, TX 77843-3136, USA.
- Minchuan Xu, Judicial Big Data Research Center, School of Law, Southeast University, Nanjing 210096, China.
- Lusheng Wang, Judicial Big Data Research Center, School of Law, Southeast University, Nanjing 210096, China.
- Dominique Lord, Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, TX 77843-3136, USA.

8
Khodadadi A, Tsapakis I, Das S, Lord D, Li Y. Application of different negative binomial parameterizations to develop safety performance functions for non-federal aid system roads. Accid Anal Prev 2021; 156:106103. [PMID: 33866155] [DOI: 10.1016/j.aap.2021.106103] [Received: 12/09/2020] [Revised: 03/12/2021] [Accepted: 03/22/2021] [Indexed: 06/12/2023]
Abstract
Safety performance functions (SPFs) are the main building blocks in understanding the relationships between crash risk factors and crash frequencies. Many research efforts have focused on high-volume roadways that typically experience more crashes, but only a few studies have documented SPFs for non-federal aid system (NFAS) roads, including rural minor collectors, rural local roads, and urban local roads. NFAS roads are characterized by unique features such as lower speeds and shorter segment lengths, and they usually experience fewer crashes given their low exposure. As a result, there is a clear need to investigate the safety issues associated with NFAS roadways and generate distinct SPFs for them. The main objective of this study is to bridge this gap in the literature and develop SPFs for NFAS roads. This study examined the application of traditional Negative Binomial (NB) models and zero-favored negative binomial models (i.e., the Negative Binomial-Lindley, NB-L). Both groups of models were formulated with different variance and dispersion structures. Using crash, roadway inventory, and traffic volume data from 2014 to 2018 in Virginia, the results showed that the NB-L models perform better than the traditional NB models. Furthermore, an appropriate variance structure along with a reasonably chosen dispersion function can further improve model performance.
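SPFs of the kind compared here typically take a log-linear form in traffic volume (AADT) and segment length. A sketch with hypothetical coefficients for a low-volume road:

```python
import math

def spf_expected_crashes(aadt, length_mi, beta0, beta1, beta2=1.0):
    """Common SPF functional form used with NB/NB-L count models:
    E[crashes/yr] = exp(beta0) * AADT^beta1 * L^beta2."""
    return math.exp(beta0) * aadt ** beta1 * length_mi ** beta2

# Hypothetical coefficients for a low-volume rural local road segment
mu = spf_expected_crashes(aadt=400, length_mi=0.5, beta0=-7.5, beta1=0.8)
print(round(mu, 3))   # a small expected annual crash count, as typical of NFAS roads
```

The mean function is the same across NB and NB-L parameterizations; what the paper varies is the variance/dispersion structure wrapped around it.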
Affiliation(s)
- Ali Khodadadi, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States.
- Ioannis Tsapakis, Texas A&M Transportation Institute, 3500 NW Loop 410, San Antonio, TX 78229, United States.
- Subasish Das, Texas A&M Transportation Institute, 3500 NW Loop 410, San Antonio, TX 78229, United States.
- Dominique Lord, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States.
- Yingfeng Li, Virginia Tech Transportation Institute, 3500 Transportation Research Plaza, Building 1 R207, Blacksburg, VA 24061, United States.

9
Li X, Mousavi SM, Dadashova B, Lord D, Wolshon B. Toward a crowdsourcing solution to identify high-risk highway segments through mining driving jerks. Accid Anal Prev 2021; 155:106101. [PMID: 33848812] [DOI: 10.1016/j.aap.2021.106101] [Received: 11/26/2020] [Revised: 01/28/2021] [Accepted: 03/21/2021] [Indexed: 06/12/2023]
Abstract
Traffic crashes have become a leading cause of preventable deaths globally. Identifying high-risk segments not only helps safety specialists better understand crash patterns but also reminds road users to be aware of driving risks. This study reports on a new crowdsourcing solution to identify high-risk highway segments by analyzing driving jerks. Driving jerks represent abrupt changes in acceleration, which have been shown to be closely related to traffic risks. In this study, we first calculate driving jerks from each participant's naturalistic driving data and identify "unsafe" drivers based on their jerk ratio. Then, we propose an improved line-constrained clustering method to identify each participant's jerk clusters on each road. These individual-specific jerk clusters are overlaid on road networks to identify potentially risky segments. By synthesizing the potentially risky segments reported by different participants, we obtain the final detection results for high-risk highway segments. To evaluate the proposed solution's effectiveness, we compare the jerk-cluster-determined risky segments with crash-rate-determined risky segments. The study results demonstrate that our crowdsourcing solution can effectively identify high-risk road segments with an estimated 75% accuracy. More importantly, by analyzing this valuable surrogate measure, safety specialists can identify hazardous road segments before crashes occur.
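Jerk is the time derivative of acceleration, so on a uniformly sampled speed trace it is a second difference. A minimal sketch of a per-driver jerk ratio (the threshold and the traces are hypothetical, not the study's definition):

```python
import numpy as np

def jerk_series(speed_mps, dt):
    """Jerk (rate of change of acceleration) from a uniformly sampled
    speed trace: second difference of speed divided by dt^2."""
    accel = np.diff(speed_mps) / dt
    return np.diff(accel) / dt

def jerk_ratio(speed_mps, dt, threshold=2.0):
    """Share of samples whose absolute jerk exceeds a threshold (m/s^3):
    a simple per-driver metric in the spirit of the study's jerk ratio."""
    j = jerk_series(speed_mps, dt)
    return float(np.mean(np.abs(j) > threshold))

# A smooth trace vs one with an abrupt braking onset (hypothetical, 10 Hz)
smooth = np.linspace(20.0, 22.0, 50)                              # gentle
abrupt = np.concatenate([np.full(25, 20.0), 20.0 - 0.5 * np.arange(25)])
print(jerk_ratio(smooth, dt=0.1), jerk_ratio(abrupt, dt=0.1))
```

The smooth trace produces a jerk ratio of zero, while the abrupt braking onset registers a single high-jerk event; aggregating such events per driver and per road segment is the crowdsourcing step the abstract describes.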
Affiliation(s)
- Xiao Li, Texas A&M Transportation Institute, Bryan, TX 77807, USA.
- Dominique Lord, Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, TX 77843-3136, USA.
- Brian Wolshon, Gulf Coast Research Center for Evacuation and Transportation Resiliency, Louisiana State University, Baton Rouge, LA 70803, USA.

10
Kuo PF, Lord D. A visual approach for defining the spatial relationships among crashes, crimes, and alcohol retailers: Applying the color mixing theorem to define the colocation pattern of multiple variables. Accid Anal Prev 2021; 154:106062. [PMID: 33711749] [DOI: 10.1016/j.aap.2021.106062] [Received: 09/24/2020] [Revised: 02/21/2021] [Accepted: 02/24/2021] [Indexed: 06/12/2023]
Abstract
In traffic safety studies, the few scholars who have focused on analyzing disaggregated data obtained results that have been either difficult to explain or demonstrate because they did not provide clear visual maps or utilize statistical tests to quantify the spatial relationships. In order to increase the use of such disaggregated spatial methods in traffic safety studies, the current study documents the application of a new RGB (red, green, blue) model that combines the color additive theorem with kernel density estimation (KDE) maps to define crash colocation patterns and the coincidence spaces of related variables. This study contributes to the literature in three major ways: (1) a new RGB model was established and applied in the field of traffic safety; (2) the variable dimensions were expanded from two to three; and (3) the dimension of uncertainty was also included. When the new RGB model was applied to data collected in College Station, Texas, the results indicated that the new colocation map is able to clearly and accurately define colocation hotspots of crashes, crimes, and alcohol retailers. As expected, these hotspots are located in areas with many bars and the largest strip malls and busiest intersections. The intensity maps provided results consistent with the colocation maps. However, the uncertainty map did not show a relatively higher level of certainty regarding the location of hotspots, as we expected, because the input of each variable was not related to the highest kernel value. Therefore, future scholars should focus on the colocation and intensity maps while using the uncertainty map as a reference for individual event risk evaluation only.
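The RGB idea can be sketched by normalizing three density surfaces and stacking them as color channels: cells that are high on all three layers render near white, i.e., three-way colocation. Toy surfaces only, standing in for the study's KDE layers:

```python
import numpy as np

def rgb_colocation(d_crash, d_crime, d_alcohol):
    """Stack three density surfaces into an RGB image using additive
    color mixing: red = crashes, green = crimes, blue = alcohol retailers.
    Near-white cells (all channels high) mark three-way colocation."""
    def norm(d):
        d = np.asarray(d, dtype=float)
        return d / d.max() if d.max() > 0 else d
    return np.dstack([norm(d_crash), norm(d_crime), norm(d_alcohol)])

# Toy 2x2 surfaces: the top-left cell is high on all three layers
img = rgb_colocation([[9, 1], [0, 0]],
                     [[8, 0], [1, 0]],
                     [[7, 0], [0, 1]])
print(img[0, 0])   # near-white channel values => a colocation hotspot
```

In the study, each channel would be a KDE surface rather than a toy grid, but the channel-stacking and interpretation are the same.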
Affiliation(s)
- Pei-Fen Kuo, Department of Geomatics, National Cheng-Kung University, Taiwan.
- Dominique Lord, Zachry Department of Civil and Environmental Engineering, Texas A&M University, USA.

11
Iio K, Guo X, Lord D. Examining driver distraction in the context of driving speed: An observational study using disruptive technology and naturalistic data. Accid Anal Prev 2021; 153:105983. [PMID: 33618100] [DOI: 10.1016/j.aap.2021.105983] [Received: 05/18/2020] [Revised: 12/02/2020] [Accepted: 01/05/2021] [Indexed: 06/12/2023]
Abstract
Considering the number of people who have been involved in crashes associated with driver distraction, it is important to understand the characteristics of distracted driving on public roadways. While experiments have indicated that driver distractions are associated with slower driving speeds, the methodologies tend to have limited external validity, and observational studies are often conducted under limited circumstances, be it time or location. Therefore, in order to better understand the nature of driver distractions, the authors investigated the relationships between driving speed, posted speed limits, and phone handling frequency through naturalistic driving data obtained (via disruptive technology) from 8,240 mobile application users on state-maintained highways throughout Texas. As a measure of manual distraction, a phone handling rate (PHR; times per hour driven) was calculated based on phone rotations. Within-subject comparisons were drawn for driving speed and posted speed limits under normal driving conditions and distracted conditions. The analysis revealed a strong negative correlation between PHR and driving speed (rs = -0.87). Paired t-tests revealed significantly lower driving speeds (p = 0.000 < 0.01, d = -0.48, η = 0.69) and posted speed limits (p = 0.000 < 0.01, d = -0.20, η = 0.42) during phone handling events when compared to driving without phone handling. On average, users drove 3.26 mph slower in distracted conditions than in undistracted conditions. Driving speed had a larger effect size than posted speed limits. The findings were in line with existing theories and experiments as well as other observational studies conducted at fixed locations. Although this research did not reveal causal relations, it is noteworthy that speed reduction with manual distraction was observed under real road conditions. Spatial analyses are recommended in order to paint a more thorough picture of speed reduction, its relationship to space, and crash risks related to distracted driving.
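The within-subject comparison rests on a paired t statistic and a paired-samples Cohen's d. A sketch with invented per-driver mean speeds (the study's actual sample is 8,240 users):

```python
import math
import statistics

def paired_t(before, after):
    """Paired t statistic and paired-samples Cohen's d for within-subject
    comparisons (e.g., driving speed with vs without phone handling)."""
    diffs = [a - b for a, b in zip(after, before)]
    mean_d = statistics.mean(diffs)
    sd_d = statistics.stdev(diffs)
    n = len(diffs)
    t = mean_d / (sd_d / math.sqrt(n))
    d = mean_d / sd_d          # Cohen's d on the paired differences
    return t, d

# Hypothetical per-driver mean speeds (mph): undistracted vs while handling phone
undistracted = [62.1, 58.4, 65.0, 70.2, 55.9, 60.3]
distracted = [59.0, 55.2, 62.8, 66.1, 54.0, 57.5]
t, d = paired_t(undistracted, distracted)
print(t < 0 and d < 0)   # slower when distracted => negative effect, as in the study
```

Pairing each driver with themselves is what removes between-driver speed differences from the comparison, which is why the abstract reports within-subject tests rather than a pooled two-sample test.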
Affiliation(s)
- Kentaro Iio, Traf-IQ, Inc., 14811 St. Mary's Lane, Suite 180, Houston, TX 77079, United States.
- Xiaoyu Guo, Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, TX 77843-3136, United States.
- Dominique Lord, Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, TX 77843-3136, United States.

12
Sohrabi S, Khodadadi A, Mousavi SM, Dadashova B, Lord D. Quantifying the automated vehicle safety performance: A scoping review of the literature, evaluation of methods, and directions for future research. Accid Anal Prev 2021; 152:106003. [PMID: 33571922] [DOI: 10.1016/j.aap.2021.106003]
Abstract
Vehicle automation safety must be evaluated not only for market success but also for more informed decision-making about Automated Vehicles' (AVs) deployment and for supporting policies and regulations to govern AVs' unintended consequences. This study is designed to identify AV safety quantification studies, evaluate the quantification approaches used in the literature, and uncover the gaps and challenges in AV safety evaluation. We employed a scoping review methodology to identify the approaches used in the literature to quantify AV safety. After screening and reviewing the literature, six approaches were identified: target crash population, traffic simulation, driving simulator, road test data analysis, system failure risk assessment, and safety effectiveness estimation. We ran two evaluations on the identified approaches. First, we investigated each approach in terms of its input (required data, assumptions, etc.), output (safety evaluation metrics), and application (to estimate AVs' safety implications at the vehicle, transportation system, and society levels). Second, we qualitatively compared them in terms of three criteria: availability of input data, suitability for evaluating different automation levels, and reliability of estimations. This review identifies four challenges in AV safety evaluation: (a) shortcomings in AV safety evaluation approaches, (b) uncertainties in AV implementations and their impacts on AV safety, (c) potentially riskier behavior of AV passengers as well as other road users, and (d) emerging safety issues related to AV implementations. This review is expected to help researchers and rulemakers choose the most appropriate quantification method based on their goals and study limitations. Future research is required to address the identified challenges in AV safety evaluation.
Affiliation(s)
- Soheil Sohrabi
- Zachry Department of Civil & Environmental Engineering, Texas A&M University, Texas, USA; Texas A&M Transportation Institute (TTI), Texas A&M University, Texas, USA.
- Ali Khodadadi
- Zachry Department of Civil & Environmental Engineering, Texas A&M University, Texas, USA.
- Seyedeh Maryam Mousavi
- Zachry Department of Civil & Environmental Engineering, Texas A&M University, Texas, USA; Texas A&M Transportation Institute (TTI), Texas A&M University, Texas, USA.
- Bahar Dadashova
- Texas A&M Transportation Institute (TTI), Texas A&M University, Texas, USA.
- Dominique Lord
- Zachry Department of Civil & Environmental Engineering, Texas A&M University, Texas, USA.

13
Mousavi SM, Osman OA, Lord D, Dixon KK, Dadashova B. Investigating the safety and operational benefits of mixed traffic environments with different automated vehicle market penetration rates in the proximity of a driveway on an urban arterial. Accid Anal Prev 2021; 152:105982. [PMID: 33497855] [DOI: 10.1016/j.aap.2021.105982]
Abstract
Traffic congestion is steadily increasing, especially in large cities, due to rapid urbanization. Congestion not only deteriorates traffic operations and degrades traffic safety, but also imposes costs on road users. These concerns grow in more complicated situations, such as unsignalized intersections and driveways, where maneuvers depend entirely on drivers' judgment. Urban arterials are characterized by closely spaced signalized and unsignalized intersections and high traffic volumes, which make them a priority when analyzing traffic safety and operations. Autonomous Vehicles (AVs) provide ample opportunities to overcome these challenges. In essence, this study evaluates the impact of various AV Market Penetration Rates (MPRs) on the safety and operation of urban arterials in the proximity of a driveway under different traffic levels of service (LOS). Twenty-four separate scenarios were developed using VISSIM, considering six AV MPRs (0 %, 10 %, 25 %, 50 %, 75 %, and 100 %) and four LOS (A, B, C, and D). Various operational and safety measures were analyzed, including traffic density, traffic speed, traffic conflicts (rear-end and lane-changing), and driving volatility. The trajectory- and lane-based analysis of traffic density indicates that increasing MPR significantly improves the overall traffic density in all scenarios, especially under high traffic LOS. Additionally, increasing the MPR and decreasing the traffic volume of the network increases the mean speed significantly, by up to 6 %. Exploring the safety of the scenarios indicates that increasing the MPR from 0 % to 100 % across all LOS decreases the number of rear-end conflicts by 84-100 % and lane-changing conflicts by 42-100 %. Moreover, assessing the longitudinal driving volatility measures, which represent risky driving behaviors, showed that higher MPRs significantly reduce some of the driving volatility measures and enhance safety.
Affiliation(s)
- Seyedeh Maryam Mousavi
- Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, TX, 77840, USA; Texas A&M Transportation Institute (TTI), Texas A&M University, Bryan, TX, 77807, USA.
- Osama A Osman
- Department of Civil and Chemical Engineering, University of Tennessee, Chattanooga, TN, 37403, USA.
- Dominique Lord
- Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, TX, 77840, USA.
- Karen K Dixon
- Texas A&M Transportation Institute (TTI), Texas A&M University, Bryan, TX, 77807, USA.
- Bahar Dadashova
- Texas A&M Transportation Institute (TTI), Texas A&M University, Bryan, TX, 77807, USA.

14
Guo X, Wu L, Lord D. Generalized criteria for evaluating hotspot identification methods. Accid Anal Prev 2020; 145:105684. [PMID: 32801091] [DOI: 10.1016/j.aap.2020.105684]
Abstract
Hotspot identification (HSID) is one of the most important components in the highway safety management process. Previous research has found that hazardous sites identified with different methods are not consistent. It is therefore necessary to evaluate the performance of various HSID methods. The existing evaluation criteria are limited to two consecutive periods and do not consider the temporal instability of crashes. In addition, one existing criterion does not precisely evaluate HSID methods under given circumstances. This paper proposed three generalized criteria to evaluate the performance of HSID methods: (1) the High Crashes Consistency Test (HCCT), to evaluate HSID methods in terms of their reliability in identifying sites with high crash counts; (2) the Common Sites Consistency Test (CSCT), to gauge HSID methods in consistently identifying a set of common sites as hazardous; and (3) the Absolute Rank Differences Test (ARDT), to measure the consistency of HSID methods via the absolute differences in site rankings. Further, three commonly used HSID methods are applied to estimate crashes on Texas rural two-lane roadway segments with eight years of crash data. The performance of these three HSID methods was evaluated to validate the proposed criteria. Comparisons between the existing criteria and the generalized criteria revealed that: (1) the generalized criteria are capable of evaluating different HSID methods over multiple periods; and (2) the generalized criteria produce more consistent results with less discrepancy in scores for the best-identified HSID method.
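The common-sites idea behind the CSCT extends naturally beyond two consecutive periods. A minimal sketch, assuming hypothetical crash counts and a simple top-k ranking by observed counts (the site count, number of periods, and k below are made up; the paper's exact test statistic may differ):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical crash counts: 200 sites observed over 4 periods (illustrative).
n_sites, n_periods, k = 200, 4, 20
true_risk = rng.gamma(2.0, 1.5, n_sites)              # latent site risk
counts = rng.poisson(true_risk, size=(n_periods, n_sites))

def top_k_sites(period_counts, k):
    """Indices of the k highest-crash sites in one period."""
    return set(np.argsort(period_counts)[-k:])

# Common-sites consistency: share of top-k sites that every period
# identifies in common (1.0 = perfectly consistent across periods).
common = set.intersection(*(top_k_sites(c, k) for c in counts))
csct = len(common) / k
print(f"common-sites consistency over {n_periods} periods: {csct:.2f}")
```

A score near 1 means the ranking is stable across periods; a score near 0 signals that the apparent hotspots are largely driven by period-to-period noise.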
Affiliation(s)
- Xiaoyu Guo
- Zachry Department of Civil and Environmental Engineering, Texas A&M University, 3136 TAMU, College Station, TX, 77843-3136, United States.
- Lingtao Wu
- Center for Transportation Safety, Texas A&M Transportation Institute, Texas A&M University System, 3135 TAMU, College Station, TX, 77843-3135, United States.
- Dominique Lord
- Zachry Department of Civil and Environmental Engineering, Texas A&M University, 3136 TAMU, College Station, TX, 77843-3136, United States.

15
Dadashova B, Arenas-Ramires B, Mira-McWillaims J, Dixon K, Lord D. Analysis of crash injury severity on two trans-European transport network corridors in Spain using discrete-choice models and random forests. Traffic Inj Prev 2020; 21:228-233. [PMID: 32160016] [DOI: 10.1080/15389588.2020.1733539]
Abstract
Objective: The objective of this paper is to identify crash severity contributing factors and evaluate their impact on multiple-vehicle crashes on two high-use Trans-European interurban freight corridors in Spain (southern Europe): Madrid-Irún and Barcelona-Almería. Methods: We used both logistic regression and random forests to identify crash severity predictors and estimate their impacts on crash outcomes. Although both statistical methods can provide useful information to help explain the safety implications of highway crashes, using both may enable a more comprehensive understanding of this phenomenon. For this effort, we disaggregated the crash data into different crash types (i.e., head-on, angle, sideswipe, and rear-end) and analyzed the data using roadway design elements, driver characteristics, and environmental factors. To identify the most important predictors of crash severity, we used the random forests data mining approach. We then used ordered logit models to estimate the effect of external factors on the severity of each crash type. Finally, we assessed the accuracy of the model estimates using bootstrap sampling. Results: The data mining analyses indicated that roadway design factors such as horizontal and vertical curvature, superelevation, and lane and shoulder width are among the most important factors associated with crash severity. The logistic regression results show that the impact of a given roadway element on the crash outcome is conditional on the crash type, and the direction of the effect is not always consistent. Conclusions: The contribution of this paper to the existing literature is two-fold: first, it provides a safety analysis of two of the most important freight corridors in Spain and southern Europe; second, it addresses the existing gap in the literature relating to the comparison and compatibility of data mining and logistic regression models.
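The bootstrap accuracy assessment mentioned in the Methods can be sketched generically with a percentile bootstrap. The binary severity outcomes below are hypothetical, not the corridors' records, and the statistic (the severe-crash share) is chosen only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical binary severity outcomes (1 = severe) for one crash type.
severe = rng.binomial(1, 0.18, 400)

def bootstrap_ci(x, stat, n_boot=2000, alpha=0.05, rng=rng):
    """Percentile bootstrap confidence interval for a statistic of x."""
    boots = np.array([stat(rng.choice(x, x.size, replace=True))
                      for _ in range(n_boot)])
    lo, hi = np.quantile(boots, [alpha / 2, 1 - alpha / 2])
    return lo, hi

lo, hi = bootstrap_ci(severe, np.mean)
print(f"95% bootstrap CI for severe-crash share: [{lo:.3f}, {hi:.3f}]")
```

The same resampling loop works unchanged for model-based statistics (e.g., a refitted coefficient) by swapping out `stat`.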
Affiliation(s)
- Bahar Dadashova
- Texas A&M Transportation Institute, Texas A&M University System, College Station, Texas, USA.
- Blanca Arenas-Ramires
- University Institute of Automobile Research (INSIA), Technical University of Madrid (UPM), Madrid, Spain.
- Jose Mira-McWillaims
- University Institute of Automobile Research (INSIA), Technical University of Madrid (UPM), Madrid, Spain.
- Karen Dixon
- Texas A&M Transportation Institute, Texas A&M University System, College Station, Texas, USA.
- Dominique Lord
- Texas A&M Transportation Institute, Texas A&M University System, College Station, Texas, USA.
- Zachry Department of Civil and Environmental Engineering, Texas A&M University, College Station, Texas, USA.

16
Kuo PF, Lord D. Applying the colocation quotient index to crash severity analyses. Accid Anal Prev 2020; 135:105368. [PMID: 31812898] [DOI: 10.1016/j.aap.2019.105368]
Abstract
Examining the spatial relationships among crashes of various severity levels is essential for gaining a better understanding of the severity distribution and of potential contributing factors to collisions. However, relatively few scholars have focused on analyzing this type of data. Therefore, in this study, we utilized a new index, the colocation quotient, to measure the spatial associations among crashes of various severities that occurred in College Station, Texas. This method has been widely used to characterize colocation patterns in categorized data in various fields, but it had not yet been applied to crash severity data. According to our findings, (1) crashes tended to be at the same injury level as neighboring crashes, most markedly for fatal crashes and second most for non-injury crashes; (2) the colocation quotient matrix tended to be symmetrical for non-injury crashes versus injury crashes (minor injury, major injury, and fatal); and (3) DWIs (driving while intoxicated) and hit-and-runs did not show a strong pattern. These colocation quotient results could be helpful for predicting crash severity and for providing traffic engineers with more effective traffic safety measures.
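A minimal sketch of the nearest-neighbor form of the colocation quotient on hypothetical crash points. This follows one common definition (the ratio of the observed share of a-type points whose nearest neighbor is b-type to b's share of all other points), which may not match the paper's exact variant; locations, labels, and proportions are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical crash locations and severity labels (illustrative only).
n = 300
xy = rng.uniform(0, 10, size=(n, 2))
labels = rng.choice(["non-injury", "injury", "fatal"], n, p=[0.6, 0.3, 0.1])

def colocation_quotient(xy, labels, a, b):
    """CLQ from category a to b: share of a-points whose nearest neighbor
    is type b, relative to b's share among all other points."""
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)                  # a point is not its own neighbor
    nn = labels[np.argmin(d, axis=1)]            # nearest-neighbor label per point
    idx_a = labels == a
    c_ab = np.sum(nn[idx_a] == b)                # a-points with a b-type neighbor
    n_a, n_b = idx_a.sum(), np.sum(labels == b)
    n_b_prime = n_b - 1 if a == b else n_b       # self excluded when a == b
    return (c_ab / n_a) / (n_b_prime / (len(labels) - 1))

clq = colocation_quotient(xy, labels, "fatal", "fatal")
print(f"CLQ(fatal -> fatal) = {clq:.2f}")  # approximately 1 in expectation under random labeling
```

Values well above 1 for a same-severity pair would echo the paper's finding that crashes tend to colocate with crashes of the same injury level.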
Affiliation(s)
- Pei-Fen Kuo
- Department of Geomatics, National Cheng Kung University, Taiwan.

17
Mao H, Deng X, Lord D, Flintsch G, Guo F. Adjusting finite sample bias in traffic safety modeling. Accid Anal Prev 2019; 131:112-121. [PMID: 31252329] [DOI: 10.1016/j.aap.2019.05.026]
Abstract
Poisson and negative binomial (NB) regression models are fundamental statistical tools for traffic safety evaluation. Their parameter estimates can suffer from finite sample bias when event frequency is low, a situation common in safety research because crashes are rare events. In this study, we apply a bias-correction procedure to the parameter estimation of Poisson and NB regression models. We provide a general bias-correction formulation and illustrate the finite sample bias through a special scenario with a single binary explanatory variable. Several factors affecting the magnitude of the bias are identified, including the number of crashes and the balance of the crash counts within strata of a categorical explanatory variable. Simulations are conducted to examine the properties of the bias-corrected coefficient estimators. The results show that the bias-corrected estimators generally provide less bias and smaller variance. The effect is especially pronounced when the crash count in one stratum is between 5 and 50. We apply the proposed method to a case study of infrastructure safety evaluation. Three scenarios were evaluated: all crashes collected over three years, and two hypothetical situations in which crash information was collected for "half-year" and "quarter-year" periods. The case-study results confirm that the magnitude of the bias correction is larger for smaller crash counts. This paper demonstrates the finite sample bias associated with small numbers of crashes and suggests that bias adjustment can provide more accurate estimation when evaluating the impacts of crash risk factors.
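In the single-binary-covariate scenario the paper uses for illustration, the slope MLE of a Poisson regression has the closed form log(mean(y1)/mean(y0)), which makes the finite-sample bias easy to see by Monte Carlo. A sketch under assumed rates and stratum sizes (the rate ratio, means, and sample sizes below are made up, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(2024)

# Illustrative simulation: Poisson regression with one binary covariate.
beta_true = np.log(1.5)          # true log rate ratio (assumed)
n0 = n1 = 20                     # few sites -> low total crash counts
mu0, mu1 = 0.5, 0.5 * 1.5        # stratum means (expected totals of 10 and 15)

est = []
for _ in range(20000):
    y0 = rng.poisson(mu0, n0)
    y1 = rng.poisson(mu1, n1)
    if y0.sum() == 0 or y1.sum() == 0:
        continue                 # slope MLE undefined when a stratum has no crashes
    est.append(np.log(y1.mean() / y0.mean()))

bias = np.mean(est) - beta_true  # nonzero in finite samples
print(f"finite-sample bias of slope MLE: {bias:+.4f}")
```

Increasing the expected stratum counts (larger `n0`, `n1`, or `mu`) shrinks the bias toward zero, mirroring the paper's observation that the correction matters most for small crash counts.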
Affiliation(s)
- Huiying Mao
- Department of Statistics, Virginia Tech, Blacksburg, VA 24061, USA.
- Xinwei Deng
- Department of Statistics, Virginia Tech, Blacksburg, VA 24061, USA.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843-3136, USA.
- Gerardo Flintsch
- Virginia Tech Transportation Institute, Virginia Tech, Blacksburg, VA 24061, USA; Charles E. Via, Jr. Department of Civil and Environmental Engineering, Virginia Tech, Blacksburg, VA 24061, USA.
- Feng Guo
- Department of Statistics, Virginia Tech, Blacksburg, VA 24061, USA; Virginia Tech Transportation Institute, Virginia Tech, Blacksburg, VA 24061, USA.

18
Khazraee SH, Johnson V, Lord D. Bayesian Poisson hierarchical models for crash data analysis: Investigating the impact of model choice on site-specific predictions. Accid Anal Prev 2018; 117:181-195. [PMID: 29705601] [DOI: 10.1016/j.aap.2018.04.016]
Abstract
The Poisson-gamma (PG) and Poisson-lognormal (PLN) regression models are among the most popular means for motor vehicle crash data analysis. Both models belong to the Poisson-hierarchical family of models. While numerous studies have compared the overall performance of alternative Bayesian Poisson-hierarchical models, little research has addressed the impact of model choice on the expected crash frequency prediction at individual sites. This paper sought to examine whether there are any trends among candidate models' predictions, e.g., whether an alternative model's prediction for sites with certain conditions tends to be higher (or lower) than that from another model. In addition to the PG and PLN models, this research formulated a new member of the Poisson-hierarchical family of models: the Poisson-inverse gamma (PIGam). Three field datasets (from Texas, Michigan, and Indiana) covering a wide range of over-dispersion characteristics were selected for analysis. This study demonstrated that the model choice can be critical when the calibrated models are used for prediction at new sites, especially when the data are highly over-dispersed. For all three datasets, the PIGam model would predict higher expected crash frequencies than would the PLN and PG models, in order, indicating a clear link between the models' predictions and the shape of their mixing distributions (i.e., inverse gamma, lognormal, and gamma, respectively). The thicker tail of the PIGam and PLN models (in order) may provide an advantage when the data are highly over-dispersed. The analysis results also illustrated a major deficiency of the Deviance Information Criterion (DIC) in comparing the goodness-of-fit of hierarchical models; models with drastically different sets of coefficients (and thus predictions for new sites) may yield similar DIC values, because the DIC only accounts for the parameters in the lowest (observation) level of the hierarchy and ignores the higher levels (regression coefficients).
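The link between mixing-distribution tails and predictions can be illustrated by sampling gamma, lognormal, and inverse-gamma distributions matched to the same mean and variance, then comparing upper-tail quantiles. The moment-matching below (mean 1, variance v) is standard algebra; the choice v = 1 and the sample size are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# Match all three mixing distributions to mean 1 and variance v.
v, n = 1.0, 400_000

gamma_s = rng.gamma(shape=1 / v, scale=v, size=n)        # gamma: mean 1, var v

s2 = np.log(1 + v)                                       # lognormal parameters
lognorm_s = rng.lognormal(mean=-s2 / 2, sigma=np.sqrt(s2), size=n)

a = 2 + 1 / v                                            # inverse-gamma: X = (a-1)/G,
invgamma_s = (a - 1) / rng.gamma(shape=a, scale=1.0, size=n)  # G ~ Gamma(a, 1)

q = 0.999
tails = {name: np.quantile(s, q) for name, s in
         [("gamma", gamma_s), ("lognormal", lognorm_s),
          ("inverse-gamma", invgamma_s)]}
print(tails)   # tail thickness orders as gamma < lognormal < inverse-gamma
```

The ordering of the 99.9th percentiles mirrors the abstract's observation that the PIGam's thicker-tailed mixing distribution yields higher predictions than the PLN's, which in turn exceeds the PG's.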
Affiliation(s)
- S Hadi Khazraee
- Uber Technologies, Inc., San Francisco, CA, 94103, United States.
- Valen Johnson
- Department of Statistics, Texas A&M University, College Station, TX, 77843-3143, United States.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX, 77843-3136, United States.

19
Ye Z, Xu Y, Lord D. Crash data modeling with a generalized estimator. Accid Anal Prev 2018; 117:340-345. [PMID: 29758516] [DOI: 10.1016/j.aap.2018.04.026]
Abstract
The investigation of relationships between traffic crashes and relevant factors is important in traffic safety management. Various methods have been developed for modeling crash data. In real world scenarios, crash data often display the characteristics of over-dispersion. However, on occasion, some crash datasets have exhibited under-dispersion, especially in cases where the data are conditioned upon the mean. The commonly used models (such as the Poisson and NB regression models) have limitations in coping with varying degrees of dispersion. In light of this, a generalized event count (GEC) model, which can handle over-, equi-, and under-dispersed data, is proposed in this study. This model was first applied to case studies using data from Toronto, characterized by over-dispersion, and then to crash data from railway-highway crossings in Korea, characterized by under-dispersion. The results from the GEC model were compared with those from the negative binomial and the hyper-Poisson models. The case studies show that the proposed model performs well for crash data characterized by over- and under-dispersion. Moreover, the proposed model simplifies the modeling process and the prediction of crash data.
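A quick way to see the over- versus under-dispersion distinction the abstract describes is the variance-to-mean ratio of the counts. The two toy samples below are invented for illustration (not the Toronto or Korea data):

```python
import numpy as np

# Illustrative crash-count samples: the variance-to-mean ratio is a
# quick screen for the dispersion type a count model must handle.
overdispersed = np.array([0, 0, 1, 3, 0, 7, 2, 0, 12, 1, 0, 5])
underdispersed = np.array([2, 3, 2, 2, 3, 2, 3, 2, 2, 3, 3, 2])

def dispersion_index(y):
    """Variance-to-mean ratio: >1 over-, ~1 equi-, <1 under-dispersed."""
    return y.var(ddof=1) / y.mean()

print(f"over:  {dispersion_index(overdispersed):.2f}")
print(f"under: {dispersion_index(underdispersed):.2f}")
```

The NB model only accommodates ratios above 1, which is why a more general estimator is needed when the ratio (conditional on the mean) falls below 1.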
Affiliation(s)
- Zhirui Ye
- Jiangsu Key Laboratory of Urban ITS, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, School of Transportation, Southeast University, 2 Sipailou, Nanjing, Jiangsu, 210096, China.
- Yueru Xu
- Jiangsu Key Laboratory of Urban ITS, Jiangsu Province Collaborative Innovation Center of Modern Urban Traffic Technologies, School of Transportation, Southeast University, 2 Sipailou, Nanjing, Jiangsu, 210096, China.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX, 77843-3136, United States.

20
Abstract
This paper develops a semi-nonparametric Poisson regression model to analyze motor vehicle crash frequency data collected from rural multilane highway segments in California, US. Motor vehicle crash frequency on rural highways is a topic of interest in transportation safety due to higher driving speeds and the resulting severity levels. Unlike the traditional negative binomial (NB) model, the semi-nonparametric Poisson regression model can accommodate unobserved heterogeneity following a highly flexible semi-nonparametric (SNP) distribution. Simulation experiments demonstrate that the SNP distribution can closely mimic a large family of distributions, including normal, log-gamma, bimodal, and trimodal distributions. Empirical estimation results show that such flexibility can greatly improve model precision and the overall goodness-of-fit. The semi-nonparametric distribution can provide a better understanding of crash data structure through its ability to capture potential multimodality in the distribution of unobserved heterogeneity. When the estimated coefficients of the empirical models are compared, the SNP and NB models are found to have substantially different coefficients for the dummy variable indicating lane width. The SNP model, with its better statistical performance, suggests that the NB model overestimates the effect of lane width on crash frequency reduction by 83.1%.
Affiliation(s)
- Xin Ye
- Key Laboratory of Road and Traffic Engineering of Ministry of Education, College of Transportation Engineering, Tongji University, Shanghai, China.
- Ke Wang
- Key Laboratory of Road and Traffic Engineering of Ministry of Education, College of Transportation Engineering, Tongji University, Shanghai, China.
- Yajie Zou
- Key Laboratory of Road and Traffic Engineering of Ministry of Education, College of Transportation Engineering, Tongji University, Shanghai, China.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX, United States of America.

21
Zou Y, Ash JE, Park BJ, Lord D, Wu L. Empirical Bayes estimates of finite mixture of negative binomial regression models and its application to highway safety. J Appl Stat 2017. [DOI: 10.1080/02664763.2017.1389863]
Affiliation(s)
- Yajie Zou
- Key Laboratory of Road and Traffic Engineering of Ministry of Education, Tongji University, Shanghai, People's Republic of China.
- John E. Ash
- Department of Civil and Environmental Engineering, University of Washington, Seattle, WA, USA.
- Byung-Jung Park
- Department of Transportation Engineering, Myongji University, Seoul, Korea.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX, USA.
- Lingtao Wu
- Texas A&M Transportation Institute, Texas A&M University System, College Station, TX, USA.

22
Shirazi M, Dhavala SS, Lord D, Geedipally SR. A methodology to design heuristics for model selection based on the characteristics of data: Application to investigate when the Negative Binomial Lindley (NB-L) is preferred over the Negative Binomial (NB). Accid Anal Prev 2017; 107:186-194. [PMID: 28886410] [DOI: 10.1016/j.aap.2017.07.002]
Abstract
Safety analysts usually use post-modeling methods, such as goodness-of-fit statistics or the likelihood ratio test, to decide between two or more competing distributions or models. Such metrics require all competing distributions to be fitted to the data before any comparison can be made. Given the continuous growth in new statistical distributions, choosing the best one using such post-modeling methods is not a trivial task, in addition to all the theoretical or numerical issues the analyst may face during the analysis. Furthermore, and most importantly, these measures or tests do not provide any intuition into why a specific distribution (or model) is preferred over another (goodness-of-logic). This paper addresses these issues by proposing a methodology to design heuristics for model selection based on the characteristics of the data, in terms of descriptive summary statistics, before fitting the models. The proposed methodology employs two analytic tools: (1) Monte Carlo simulations and (2) machine learning classifiers, to design easy-to-use heuristics that predict the label of the 'most-likely-true' distribution for analyzing data. The methodology was applied to investigate when the recently introduced Negative Binomial Lindley (NB-L) distribution is preferred over the Negative Binomial (NB) distribution. Heuristics were designed to select the 'most-likely-true' distribution between these two, given a set of prescribed summary statistics of the data. The proposed heuristics were successfully compared against classical tests on several real or observed datasets. Not only are they easy to use and free of post-modeling inputs, but they also give the analyst useful information about why the NB-L is preferred over the NB, or vice versa, when modeling data.
Affiliation(s)
- Mohammadali Shirazi
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, United States.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, United States.

23
Wu L, Lord D. Examining the influence of link function misspecification in conventional regression models for developing crash modification factors. Accid Anal Prev 2017; 102:123-135. [PMID: 28282580] [DOI: 10.1016/j.aap.2017.02.012]
Abstract
This study further examined the use of regression models for developing crash modification factors (CMFs), specifically focusing on misspecification of the link function. The primary objectives were to validate the accuracy of CMFs derived from the commonly used regression models (i.e., generalized linear models or GLMs with additive linear link functions) when some of the variables have nonlinear relationships, and to quantify the amount of bias as a function of the nonlinearity. Using the concept of artificial realistic data, various linear and nonlinear crash modification functions (CM-Functions) were assumed for three variables. Crash counts were randomly generated based on these CM-Functions. CMFs were then derived from regression models for three different scenarios, and the results were compared with the assumed true values. The main findings are summarized as follows: (1) when some variables have nonlinear relationships with crash risk, the CMFs derived for them from the commonly used GLMs are all biased, especially in areas away from the baseline conditions (e.g., boundary areas); (2) as the nonlinearity increases (i.e., the nonlinear relationship becomes stronger), the bias becomes more significant; (3) the quality of CMFs for variables having linear relationships can be degraded when they are mixed with variables having nonlinear relationships, although the accuracy may still be acceptable; and (4) misuse of the link function for one or more variables can also lead to biased estimates for other parameters. This study highlights the importance of the link function when using regression models for developing CMFs.
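Under a log link with an additive linear predictor, a CMF for a variable follows directly from its coefficient. A hedged sketch: the coefficient value and the nonlinear "truth" below are made up solely to illustrate the bias pattern the study describes, with 12 ft lane width taken as the baseline condition:

```python
import math

# Hypothetical fitted GLM coefficient for lane width (per foot); illustrative.
beta_lane_width = -0.08

def cmf(x, x_base, beta):
    """CMF implied by a GLM with a log link and additive linear predictor."""
    return math.exp(beta * (x - x_base))

def cmf_true(x, x_base):
    """Hypothetical nonlinear true relationship the linear link misses."""
    return math.exp(-0.005 * (x**2 - x_base**2))

for w in (10, 12, 14):  # lane widths in feet; 12 ft is the baseline
    print(w, round(cmf(w, 12, beta_lane_width), 3), round(cmf_true(w, 12), 3))
```

At the baseline both functions equal 1 by construction; the gap between the linear-link CMF and the nonlinear truth widens as the lane width moves away from the baseline, which is the boundary-area bias described in finding (1).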
Affiliation(s)
- Lingtao Wu
- Texas A&M Transportation Institute, Texas A&M University System, 3135 TAMU, College Station, TX 77843-3135, United States.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States.

24
Hamza AV, Nikroo A, Alger E, Antipa N, Atherton LJ, Barker D, Baxamusa S, Bhandarkar S, Biesiada T, Buice E, Carr E, Castro C, Choate C, Conder A, Crippen J, Dylla-Spears R, Dzenitis E, Eddinger S, Emerich M, Fair J, Farrell M, Felker S, Florio J, Forsman A, Giraldez E, Hein N, Hoover D, Horner J, Huang H, Kozioziemski B, Kroll J, Lawson B, Letts SA, Lord D, Mapoles E, Mauldin M, Miller P, Montesanti R, Moreno K, Parham T, Nathan B, Reynolds J, Sater J, Segraves K, Seugling R, Stadermann M, Strauser R, Stephens R, Suratwala TI, Swisher M, Taylor JS, Wallace R, Wegner P, Wilkens H, Yoxalla B. Target Development for the National Ignition Campaign. Fusion Science and Technology 2017. [DOI: 10.13182/fst15-163]
Affiliation(s)
- A. V. Hamza
- Lawrence Livermore National Laboratory, Livermore, California 94550
- A. Nikroo
- General Atomics, La Jolla, California 92121
- E. Alger
- General Atomics, La Jolla, California 92121
- N. Antipa
- Lawrence Livermore National Laboratory, Livermore, California 94550
- L. J. Atherton
- Lawrence Livermore National Laboratory, Livermore, California 94550
- D. Barker
- Lawrence Livermore National Laboratory, Livermore, California 94550
- S. Baxamusa
- Lawrence Livermore National Laboratory, Livermore, California 94550
- S. Bhandarkar
- Lawrence Livermore National Laboratory, Livermore, California 94550
- T. Biesiada
- Lawrence Livermore National Laboratory, Livermore, California 94550
- E. Buice
- Lawrence Livermore National Laboratory, Livermore, California 94550
- E. Carr
- Lawrence Livermore National Laboratory, Livermore, California 94550
- C. Castro
- Lawrence Livermore National Laboratory, Livermore, California 94550
- C. Choate
- Lawrence Livermore National Laboratory, Livermore, California 94550
- A. Conder
- Lawrence Livermore National Laboratory, Livermore, California 94550
- J. Crippen
- General Atomics, La Jolla, California 92121
- R. Dylla-Spears
- Lawrence Livermore National Laboratory, Livermore, California 94550
- E. Dzenitis
- Lawrence Livermore National Laboratory, Livermore, California 94550
- M. Emerich
- General Atomics, La Jolla, California 92121
- J. Fair
- Lawrence Livermore National Laboratory, Livermore, California 94550
- M. Farrell
- General Atomics, La Jolla, California 92121
- S. Felker
- Lawrence Livermore National Laboratory, Livermore, California 94550
- J. Florio
- General Atomics, La Jolla, California 92121
- A. Forsman
- General Atomics, La Jolla, California 92121
- N. Hein
- General Atomics, La Jolla, California 92121
- D. Hoover
- General Atomics, La Jolla, California 92121
- J. Horner
- Lawrence Livermore National Laboratory, Livermore, California 94550
- H. Huang
- General Atomics, La Jolla, California 92121
- B. Kozioziemski
- Lawrence Livermore National Laboratory, Livermore, California 94550
- J. Kroll
- Lawrence Livermore National Laboratory, Livermore, California 94550
- B. Lawson
- Lawrence Livermore National Laboratory, Livermore, California 94550
- S. A. Letts
- Lawrence Livermore National Laboratory, Livermore, California 94550
- D. Lord
- Lawrence Livermore National Laboratory, Livermore, California 94550
- E. Mapoles
- Lawrence Livermore National Laboratory, Livermore, California 94550
- M. Mauldin
- General Atomics, La Jolla, California 92121
- P. Miller
- Lawrence Livermore National Laboratory, Livermore, California 94550
- R. Montesanti
- Lawrence Livermore National Laboratory, Livermore, California 94550
- K. Moreno
- General Atomics, La Jolla, California 92121
- T. Parham
- Lawrence Livermore National Laboratory, Livermore, California 94550
- B. Nathan
- Lawrence Livermore National Laboratory, Livermore, California 94550
- J. Reynolds
- Lawrence Livermore National Laboratory, Livermore, California 94550
- J. Sater
- Lawrence Livermore National Laboratory, Livermore, California 94550
- K. Segraves
- Lawrence Livermore National Laboratory, Livermore, California 94550
- R. Seugling
- Lawrence Livermore National Laboratory, Livermore, California 94550
- M. Stadermann
- Lawrence Livermore National Laboratory, Livermore, California 94550
- T. I. Suratwala
- Lawrence Livermore National Laboratory, Livermore, California 94550
- M. Swisher
- Lawrence Livermore National Laboratory, Livermore, California 94550
- J. S. Taylor
- Lawrence Livermore National Laboratory, Livermore, California 94550
- R. Wallace
- Lawrence Livermore National Laboratory, Livermore, California 94550
- P. Wegner
- Lawrence Livermore National Laboratory, Livermore, California 94550
- H. Wilkens
- General Atomics, La Jolla, California 92121
- B. Yoxalla
- Lawrence Livermore National Laboratory, Livermore, California 94550

25
Shirazi M, Reddy Geedipally S, Lord D. A Monte-Carlo simulation analysis for evaluating the severity distribution functions (SDFs) calibration methodology and determining the minimum sample-size requirements. Accid Anal Prev 2017; 98:303-311. [PMID: 27810672 DOI: 10.1016/j.aap.2016.10.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/16/2016] [Revised: 09/26/2016] [Accepted: 10/04/2016] [Indexed: 06/06/2023]
Abstract
Severity distribution functions (SDFs) are used in highway safety to estimate the severity of crashes and to conduct different types of safety evaluations and analyses. Developing a new SDF is a difficult task that demands significant time and resources. To simplify the process, the Highway Safety Manual (HSM) has started to document SDF models for different types of facilities; SDF models have recently been introduced for freeways and ramps in the HSM addendum. However, because these models are fitted and validated using data from a select number of states, they need to be calibrated to local conditions when applied in a new jurisdiction. The HSM provides a methodology for calibrating the models through a scalar calibration factor, but this methodology has never been validated through research, and there are no concrete guidelines for selecting a reliable sample size. Using extensive simulation, this paper documents an analysis that examined the bias between the 'true' and 'estimated' calibration factors. The results indicate that as the true calibration factor deviates further from 1, more bias is observed between the 'true' and 'estimated' factors. In addition, simulation studies were performed to determine the calibration sample size for various conditions. It was found that as the average coefficient of variation (CV) of the 'KAB' and 'C' crashes increases, the analyst needs to collect a larger sample to calibrate SDF models. Taking this observation into account, sample-size guidelines are proposed based on the average CV of the crash severities used for the calibration process.
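The 'true' versus 'estimated' comparison can be organised as a small Monte-Carlo harness. The sketch below shows only the skeleton of such a check, with the calibration factor estimated as total observed over total predicted severe crashes; the paper applies the comparison with the full HSM SDF methodology, where the bias grows as the true factor moves away from 1. All values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2024)

def estimation_gap(true_c, n_sites=40, reps=5000):
    """Monte-Carlo gap between the 'true' and 'estimated' calibration
    factors, with C-hat = sum(observed) / sum(predicted) severe crashes."""
    pred = rng.gamma(2.0, 1.0, n_sites)      # model-predicted severe (KAB) counts
    c_hats = [rng.poisson(true_c * pred).sum() / pred.sum() for _ in range(reps)]
    return np.mean(c_hats) - true_c

gap_at_1 = estimation_gap(1.0)    # true factor at the nominal value
gap_far = estimation_gap(2.5)     # true factor far from 1
print(gap_at_1, gap_far)
```

For this simple ratio estimator the average gap is near zero in both cases; the systematic bias the paper documents emerges from the SDF-specific calibration steps, which is exactly what a harness of this shape is built to detect.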
Affiliation(s)
- Mohammadali Shirazi
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, United States.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, United States.

26
Park BJ, Lord D, Wu L. Finite mixture modeling approach for developing crash modification factors in highway safety analysis. Accid Anal Prev 2016; 97:274-287. [PMID: 27974277 DOI: 10.1016/j.aap.2016.10.023] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/03/2016] [Revised: 10/17/2016] [Accepted: 10/18/2016] [Indexed: 06/06/2023]
Abstract
This study investigated the relative performance of two models, the negative binomial (NB) model and the two-component finite mixture of negative binomial models (FMNB-2), for developing crash modification factors (CMFs). Crash data on rural multilane divided highways in California and Texas were modeled with the two models, and crash modification functions (CMFunctions) were derived. The CMFunction estimated from the FMNB-2 model showed several good properties over that from the NB model. First, the safety effect of a covariate was better reflected by the CMFunction developed using the FMNB-2 model, since the model accounts for the differential responsiveness of crash frequency to the covariate. Second, the CMFunction derived from the FMNB-2 model can capture nonlinear relationships between a covariate and safety. Finally, following the same concept as for NB models, the combined CMFs of multiple treatments were estimated using the FMNB-2 model. The results indicated that the combined CMFs are not simply the product of the single-treatment CMFs (i.e., their safety effects are not independent under FMNB-2 models), and Adjustment Factors (AFs) were developed accordingly. The analysis revealed that the current Highway Safety Manual method could over- or under-estimate the combined CMFs under particular combinations of covariates. Safety analysts are encouraged to consider using FMNB-2 models for developing CMFs and AFs.
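The finding that combined CMFs are not multiplicative falls directly out of the mixture-mean algebra. A small numeric sketch with hypothetical weights and coefficients (not the fitted California/Texas models):

```python
import numpy as np

# Two-component mixture: weights and component coefficients
# (intercept, treatment A, treatment B) -- hypothetical values.
w = np.array([0.6, 0.4])
beta = np.array([[0.2, -0.5, -0.3],    # component 1
                 [1.0, -0.1, -0.8]])   # component 2 responds differently

def mixture_mean(xa, xb):
    """Expected crash count under the two-component mixture."""
    return w @ np.exp(beta @ np.array([1.0, xa, xb]))

cmf_a = mixture_mean(1, 0) / mixture_mean(0, 0)
cmf_b = mixture_mean(0, 1) / mixture_mean(0, 0)
cmf_ab = mixture_mean(1, 1) / mixture_mean(0, 0)
print(cmf_ab, cmf_a * cmf_b)   # joint CMF differs from the product
```

Because each component responds differently to the treatments, the joint effect of applying both is not the product of the single-treatment CMFs, which is why the study derives Adjustment Factors.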
Affiliation(s)
- Byung-Jung Park
- Department of Transportation Engineering, Myongji University, Republic of Korea.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States.
- Lingtao Wu
- Texas A&M Transportation Institute, Texas A&M University System, 3135 TAMU, College Station, TX 77843-3135, United States.

27
Shirazi M, Lord D, Geedipally SR. Sample-size guidelines for recalibrating crash prediction models: Recommendations for the highway safety manual. Accid Anal Prev 2016; 93:160-168. [PMID: 27183517 DOI: 10.1016/j.aap.2016.04.011] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/01/2016] [Revised: 04/08/2016] [Accepted: 04/09/2016] [Indexed: 06/05/2023]
Abstract
The Highway Safety Manual (HSM) prediction models are fitted and validated using crash data collected from a selected number of states in the United States. Therefore, for a jurisdiction to fully benefit from applying these models, it is necessary to calibrate or recalibrate them to local conditions. The first edition of the HSM recommends calibrating the models using a one-size-fits-all sample of 30-50 locations with a total of at least 100 crashes per year. However, this recommendation is not fully supported by documented studies. The objectives of this paper are consequently: (1) to examine the required sample size based on the characteristics of the data that will be used for the calibration or recalibration process; and (2) to propose revised guidelines. The objectives were accomplished using simulation runs for different scenarios characterizing the sample mean and variance of the data. The simulation results indicate that as the ratio of the standard deviation to the mean (i.e., the coefficient of variation) of the crash data increases, a larger sample size is warranted to reach a given level of accuracy. Taking this observation into account, sample-size guidelines were prepared based on the coefficient of variation of the crash data used for the calibration process. The guidelines were then successfully applied to two observed datasets. The proposed guidelines can be used for all facility types and for both segment and intersection prediction models.
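The CV-to-sample-size relationship can be checked with a quick simulation. This is a sketch under assumed values (negative binomial counts via a gamma-Poisson mixture, mean 2 crashes per site, a ±10% accuracy target); the paper's guidelines come from a much fuller set of scenarios.

```python
import numpy as np

rng = np.random.default_rng(7)

def accuracy(n_sites, cv, reps=3000):
    """P(calibration factor estimated within +/-10% of its true value 1.0)
    for NB crash counts with the given coefficient of variation."""
    mu = 2.0                                  # mean crashes per site
    a = ((cv * mu) ** 2 - mu) / mu**2         # NB dispersion matching that CV
    hits = 0
    for _ in range(reps):
        lam = rng.gamma(1 / a, a * mu, n_sites)   # gamma-Poisson mixture -> NB
        c_hat = rng.poisson(lam).sum() / (mu * n_sites)
        hits += abs(c_hat - 1.0) <= 0.1
    return hits / reps

acc_low_cv = accuracy(50, cv=1.2)    # HSM-style 50-site sample, tamer data
acc_high_cv = accuracy(50, cv=2.0)   # same sample size, noisier data
print(acc_low_cv, acc_high_cv)
```

The same 50-site sample achieves noticeably lower accuracy on the high-CV data, so more sites are needed there; that is the basis for guidelines indexed by the coefficient of variation rather than a one-size-fits-all count.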
Affiliation(s)
- Mohammadali Shirazi
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, United States.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, United States.

28
Shirazi M, Lord D, Dhavala SS, Geedipally SR. A semiparametric negative binomial generalized linear model for modeling over-dispersed count data with a heavy tail: Characteristics and applications to crash data. Accid Anal Prev 2016; 91:10-18. [PMID: 26945472 DOI: 10.1016/j.aap.2016.02.020] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/09/2015] [Revised: 02/21/2016] [Accepted: 02/22/2016] [Indexed: 06/05/2023]
Abstract
Crash data can often be characterized by over-dispersion, a heavy (long) tail, and many observations with the value zero. Over the last few years, a small number of researchers have started developing and applying novel multi-parameter models to analyze such data. These multi-parameter models have been proposed to overcome the limitations of the traditional negative binomial (NB) model, which cannot handle this kind of data efficiently. The research documented in this paper continues that line of work. Its objective is to document the development and application of a flexible NB generalized linear model with randomly distributed mixed effects characterized by the Dirichlet process (NB-DP) for modeling crash data. The objective was accomplished using two datasets, with the new model compared to the NB model and the recently introduced model based on the mixture of the NB and Lindley distributions (NB-L). Overall, the study shows that the NB-DP model offers better performance than the NB model when data are over-dispersed and heavy-tailed. The NB-DP performed better than the NB-L when the dataset had a heavy tail but a smaller percentage of zeros; the two models performed similarly when the dataset contained a large proportion of zeros. In addition to greater flexibility, the NB-DP provides a clustering by-product that allows the safety analyst to better understand the characteristics of the data, such as the identification of outliers and sources of dispersion.
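The Dirichlet-process random effect behind the NB-DP model can be sampled with truncated stick-breaking. This sketches only the prior's clustering behaviour (illustrative concentration parameter and base measure, no model fitting):

```python
import numpy as np

rng = np.random.default_rng(0)

# Truncated stick-breaking draw from a Dirichlet process (concentration = 1)
# over site-level rate multipliers, the random effect used by the NB-DP model.
K, alpha = 50, 1.0
v = rng.beta(1.0, alpha, K)
w = v * np.concatenate([[1.0], np.cumprod(1.0 - v[:-1])])
atoms = rng.gamma(2.0, 0.5, K)            # draws from an assumed base measure

# Each of 500 sites picks one atom; ties form clusters of similar sites --
# the clustering by-product the abstract mentions.
sites = rng.choice(K, size=500, p=w / w.sum())
n_clusters = np.unique(sites).size
print(n_clusters)
```

Although 500 sites are drawn, they concentrate on a handful of atoms, so sites sharing an atom can be read as a cluster with a common dispersion behaviour.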
Affiliation(s)
- Mohammadali Shirazi
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, United States.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX 77843, United States.

29
Imprialou MIM, Quddus M, Pitfield DE, Lord D. Re-visiting crash-speed relationships: A new perspective in crash modelling. Accid Anal Prev 2016; 86:173-185. [PMID: 26571206 DOI: 10.1016/j.aap.2015.10.001] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/21/2015] [Revised: 08/20/2015] [Accepted: 10/01/2015] [Indexed: 06/05/2023]
Abstract
Although speed is considered one of the main crash contributory factors, research findings are inconsistent. Regardless of the robustness of their statistical approaches, crash frequency models typically employ crash data aggregated using spatial criteria (e.g., crash counts by link, termed the link-based approach). In this approach, the variability in crashes between links is explained by highly aggregated average measures that may be inappropriate, especially for time-varying variables such as speed and volume. This paper re-examines crash-speed relationships by creating a new crash data aggregation approach that better represents the road conditions just before crash occurrences. Crashes are aggregated according to the similarity of their pre-crash traffic and geometric conditions, forming an alternative crash count dataset termed the condition-based approach. Crash-speed relationships are separately developed and compared for both approaches using the crashes that occurred on the Strategic Road Network of England in 2012. The datasets are modelled by injury severity using multivariate Poisson lognormal regression, with multivariate spatial effects for the link-based model, under a full Bayesian inference approach. The condition-based approach shows that high speeds increase crash frequency; the link-based model suggests the opposite, a negative speed-crash relationship regardless of crash severity. The differences between the results imply that data aggregation is a crucial, yet so far overlooked, methodological element of crash data analysis that can directly affect modelling outcomes.
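The aggregation effect can be reproduced with a toy network in which congested links have low average speeds but high crash exposure. This is a deliberately simple Simpson's-paradox-style illustration with made-up numbers, not the paper's multivariate Poisson lognormal model:

```python
import numpy as np

rng = np.random.default_rng(3)

n_links = 200
congestion = rng.uniform(0.0, 1.0, n_links)
avg_speed = 70 - 40 * congestion + rng.normal(0, 3, n_links)   # congested = slow
exposure = 1 + 9 * congestion                                  # congested = busy

# Built-in MICRO-level effect: higher speed raises per-exposure crash risk.
rate_per_exposure = 0.3 * np.exp(0.01 * avg_speed)
crashes = rng.poisson(exposure * rate_per_exposure)

link_corr = np.corrcoef(avg_speed, crashes)[0, 1]
print(link_corr)   # negative: link-based aggregation reverses the sign
```

Exposure dominates the link-level counts, so the aggregated correlation comes out negative even though the simulated pre-crash risk rises with speed, mirroring the link-based versus condition-based contrast in the abstract.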
Affiliation(s)
- Maria-Ioanna M Imprialou
- School of Civil and Building Engineering, Loughborough University, Loughborough LE11 3TU, United Kingdom.
- Mohammed Quddus
- School of Civil and Building Engineering, Loughborough University, Loughborough LE11 3TU, United Kingdom
- David E Pitfield
- School of Civil and Building Engineering, Loughborough University, Loughborough LE11 3TU, United Kingdom
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States

30
Khazraee SH, Sáez-Castillo AJ, Geedipally SR, Lord D. Application of the Hyper-Poisson Generalized Linear Model for Analyzing Motor Vehicle Crashes. Risk Anal 2015; 35:919-930. [PMID: 25385093 DOI: 10.1111/risa.12296] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/04/2023]
Abstract
The hyper-Poisson distribution can handle both over- and underdispersion, and its generalized linear model formulation allows the dispersion of the distribution to be observation-specific and dependent on model covariates. This study's objective is to examine the potential applicability of a newly proposed generalized linear model framework for the hyper-Poisson distribution in analyzing motor vehicle crash count data. The hyper-Poisson generalized linear model was first fitted to intersection crash data from Toronto, characterized by overdispersion, and then to crash data from railway-highway crossings in Korea, characterized by underdispersion. The results of this study are promising. When fitted to the Toronto data set, the goodness-of-fit measures indicated that the hyper-Poisson model with a variable dispersion parameter provided a statistical fit as good as the traditional negative binomial model. The hyper-Poisson model was also successful in handling the underdispersed data from Korea; the model performed as well as the gamma probability model and the Conway-Maxwell-Poisson model previously developed for the same data set. The advantages of the hyper-Poisson model studied in this article are noteworthy. Unlike the negative binomial model, which has difficulties in handling underdispersed data, the hyper-Poisson model can handle both over- and underdispersed crash data. Although not a major issue for the Conway-Maxwell-Poisson model, the effect of each variable on the expected mean of crashes is easily interpretable in the case of this new model.
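The dispersion behaviour is easy to verify numerically from the hyper-Poisson pmf, p(x) ∝ λ^x / (γ)_x with (γ)_x the Pochhammer symbol (γ = 1 recovers the Poisson). Parameter values here are arbitrary:

```python
def hyper_poisson_pmf(lam, gamma, max_x=60):
    """Hyper-Poisson pmf p(x) proportional to lam**x / pochhammer(gamma, x),
    normalised by direct summation (gamma = 1 gives the Poisson)."""
    weights, poch = [], 1.0
    for x in range(max_x):
        weights.append(lam**x / poch)
        poch *= gamma + x                  # advance to (gamma)_{x+1}
    z = sum(weights)
    return [wt / z for wt in weights]

def mean_var(p):
    m = sum(x * px for x, px in enumerate(p))
    return m, sum((x - m) ** 2 * px for x, px in enumerate(p))

m1, v1 = mean_var(hyper_poisson_pmf(1.0, 2.0))   # gamma > 1
m2, v2 = mean_var(hyper_poisson_pmf(3.0, 0.4))   # gamma < 1
print(v1 / m1, v2 / m2)   # over-dispersed (> 1) and under-dispersed (< 1)
```

The variance-to-mean ratio exceeds 1 for γ > 1 and falls below 1 for γ < 1, which is the single-family over/underdispersion flexibility the abstract highlights.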
Affiliation(s)
- S Hadi Khazraee
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX, USA
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX, USA

31
Peng Y, Lord D, Zou Y. Applying the Generalized Waring model for investigating sources of variance in motor vehicle crash analysis. Accid Anal Prev 2014; 73:20-26. [PMID: 25173723 DOI: 10.1016/j.aap.2014.07.031] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/21/2014] [Revised: 07/28/2014] [Accepted: 07/28/2014] [Indexed: 06/03/2023]
Abstract
As one of the major analysis methods, statistical models play an important role in traffic safety analysis. They can be used for a wide variety of purposes, including establishing relationships between variables and understanding the characteristics of a system. The purpose of this paper is to document a new type of model that can help with the latter, based on the Generalized Waring (GW) distribution. The GW model yields more information about the sources of the variance observed in datasets than traditional models, such as the negative binomial (NB) model. In this regard, the GW model separates the observed variability into three parts: (1) randomness, which reflects the model's uncertainty; (2) proneness, which refers to internal differences between entities or observations; and (3) liability, which is the variance caused by external factors that are difficult to identify and have not been included as explanatory variables in the model. The analyses were carried out on two observed datasets to explore potential sources of variation. The results show that the GW model provides meaningful information about sources of variance in crash data and also performs better than the NB model.
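The three-way split follows the law of total variance. Below is a generic hierarchical-Poisson sketch of the decomposition, with illustrative gamma mixing distributions rather than the GW closed-form expressions:

```python
import numpy as np

rng = np.random.default_rng(11)

m, n = 2.0, 400_000
u = rng.gamma(5.0, 1 / 5.0, n)      # proneness: stable site-to-site differences
v = rng.gamma(8.0, 1 / 8.0, n)      # liability: external, unidentified factors
lam = m * u * v                     # crash rate; Y | lam ~ Poisson(lam)
y = rng.poisson(lam)

randomness = lam.mean()                        # E[Var(Y | lam)] for a Poisson
proneness = m**2 * u.var()                     # Var_u(E_v[lam | u]), E[v] = 1
liability = m**2 * np.mean(u**2) * v.var()     # E_u[Var_v(lam | u)]
total = randomness + proneness + liability
print(total, y.var())               # the three parts reproduce Var(Y)
```

The simulated total variance matches the sum of the three components, showing how a fitted model of this family can attribute observed dispersion to randomness, proneness, or liability.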
Affiliation(s)
- Yichuan Peng
- Department of Civil Engineering, University of Central Florida, 4000 Central Florida Blvd, Orlando, FL 32816, United States.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States.
- Yajie Zou
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States.

32
Park BJ, Lord D, Lee C. Finite mixture modeling for vehicle crash data with application to hotspot identification. Accid Anal Prev 2014; 71:319-326. [PMID: 24992301 DOI: 10.1016/j.aap.2014.05.030] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/22/2014] [Revised: 05/26/2014] [Accepted: 05/27/2014] [Indexed: 06/03/2023]
Abstract
The application of finite mixture regression models has recently gained interest among highway safety researchers because of its considerable potential for addressing unobserved heterogeneity. Finite mixture models assume that the observations in a sample arise from two or more unobserved components with unknown proportions. Both fixed and varying weight parameter models have been shown to be useful for explaining the heterogeneity and the nature of the dispersion in crash data. Given the superior performance of the finite mixture model, this study used observed and simulated data to investigate the relative performance of the finite mixture model and the traditional negative binomial (NB) model for hotspot identification. For the observed data, rural multilane segment crash data for divided highways in California and Texas were used. The results showed that the difference, measured by the percentage deviation in ranking orders, was relatively small for this dataset. Nevertheless, the ranking results from the finite mixture model were considered more reliable than those from the NB model because of the better model specification, a finding also supported by the simulation study, which produced a high number of false positives and negatives when a mis-specified model was used for hotspot identification. Regarding the optimal threshold value for identifying hotspots, a further simulation analysis indicated a trade-off between the false discovery rate (increasing) and the false negative rate (decreasing). Since the costs associated with false positives and false negatives differ, it is suggested that the threshold value be chosen by weighing these two costs against each other so that unnecessary expenses are minimized.
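The threshold trade-off can be illustrated with a toy ranking exercise: latent site risks are known to the simulation, sites are ranked on noisy observed counts, and the flagging threshold is varied. This is naive count ranking with made-up parameters, not the fitted mixture model:

```python
import numpy as np

rng = np.random.default_rng(5)

n = 1000
true_mean = rng.gamma(2.0, 1.0, n)          # latent site risk
obs = rng.poisson(true_mean)                # one period of noisy counts

n_hot = 50                                  # "true" hotspots: top 5% latent risk
true_hot = set(np.argsort(true_mean)[-n_hot:])
rank = np.argsort(obs)

fdrs, fnrs = [], []
for k in (25, 50, 100):                     # stricter -> looser threshold
    flagged = set(rank[-k:])
    fdrs.append(len(flagged - true_hot) / k)      # false discoveries
    fnrs.append(len(true_hot - flagged) / n_hot)  # missed hotspots
print(fdrs, fnrs)
```

Loosening the threshold sweeps in more true hotspots (false negatives fall) at the price of more false discoveries, so the cost-weighted threshold choice the abstract recommends is a genuine trade-off.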
Affiliation(s)
- Byung-Jung Park
- Department of Transportation Engineering, Myongji University, South Korea.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States.
- Chungwon Lee
- Department of Civil and Environmental Engineering, Seoul National University, South Korea

33
Heydari S, Miranda-Moreno LF, Lord D, Fu L. Bayesian methodology to estimate and update safety performance functions under limited data conditions: a sensitivity analysis. Accid Anal Prev 2014; 64:41-51. [PMID: 24316506 DOI: 10.1016/j.aap.2013.11.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/21/2013] [Revised: 10/27/2013] [Accepted: 11/01/2013] [Indexed: 06/02/2023]
Abstract
In road safety studies, decision makers must often cope with limited data conditions. In such circumstances, the maximum likelihood estimation (MLE), which relies on asymptotic theory, is unreliable and prone to bias. Moreover, it has been reported in the literature that (a) Bayesian estimates might be significantly biased when using non-informative prior distributions under limited data conditions, and that (b) the calibration of limited data is plausible when existing evidence in the form of proper priors is introduced into analyses. Although the Highway Safety Manual (2010) (HSM) and other research studies provide calibration and updating procedures, the data requirements can be very taxing. This paper presents a practical and sound Bayesian method to estimate and/or update safety performance function (SPF) parameters combining the information available from limited data with the SPF parameters reported in the HSM. The proposed Bayesian updating approach has the advantage of requiring fewer observations to get reliable estimates. This paper documents this procedure. The adopted technique is validated by conducting a sensitivity analysis through an extensive simulation study with 15 different models, which include various prior combinations. This sensitivity analysis contributes to our understanding of the comparative aspects of a large number of prior distributions. Furthermore, the proposed method contributes to unification of the Bayesian updating process for SPFs. The results demonstrate the accuracy of the developed methodology. Therefore, the suggested approach offers considerable promise as a methodological tool to estimate and/or update baseline SPFs and to evaluate the efficacy of road safety countermeasures under limited data conditions.
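The updating mechanics can be shown with the simplest conjugate case, a Gamma prior on a Poisson crash rate. This is a stand-in for the paper's SPF setting, with hypothetical prior values standing in for the HSM-reported parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

true_rate = 1.2
y = rng.poisson(true_rate, size=5)           # very limited local crash data

def posterior_mean(a, b):
    """Gamma(a, b) prior (mean a/b) -> Gamma(a + sum(y), b + n) posterior."""
    return (a + y.sum()) / (b + len(y))

vague = posterior_mean(0.01, 0.01)           # near non-informative prior
informative = posterior_mean(12.0, 10.0)     # past evidence: rate near 1.2
print(vague, informative)
```

The informative posterior is a precision-weighted compromise between the prior mean and the sample mean, which is why far fewer observations are needed for a stable estimate when proper priors encode past evidence.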
Affiliation(s)
- Shahram Heydari
- Department of Civil and Environmental Engineering, University of Waterloo, 200 University Avenue W., Waterloo, ON N2L 3G1, Canada.
- Luis F Miranda-Moreno
- Department of Civil Engineering and Applied Mechanics, McGill University, 817 Sherbrooke St. W., Montreal, QC H3A 2K6, Canada.
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, College Station, TX, USA.
- Liping Fu
- Department of Civil and Environmental Engineering, University of Waterloo, 200 University Avenue W., Waterloo, ON N2L 3G1, Canada.

34
Ye Z, Zhang Y, Lord D. Goodness-of-fit testing for accident models with low means. Accid Anal Prev 2013; 61:78-86. [PMID: 23219076 DOI: 10.1016/j.aap.2012.11.007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/26/2011] [Revised: 09/06/2012] [Accepted: 11/08/2012] [Indexed: 06/01/2023]
Abstract
The modeling of relationships between motor vehicle crashes and underlying factors has been investigated for more than three decades. Recently, many highway safety studies have documented the use of negative binomial (NB) regression models. On rare occasions, the Poisson model may be the only alternative, especially when the crash sample mean is low. Pearson's X² and the scaled deviance (G²) are two common test statistics that have been proposed as measures of goodness-of-fit (GOF) for Poisson or NB models. Unfortunately, transportation safety analysts often deal with crash data characterized by low sample mean values, and under such conditions the traditional test statistics may not perform well. This study has three objectives. The first is to examine the traditional test statistics and compare their performance for the GOF of accident models subject to low sample means. The second is to propose a new test statistic for the Poisson regression model that, unlike the grouped G² method, does not depend on the sample size. The proposed method is easy to use and does not require grouping the data, which is time consuming and may not be feasible when the sample size is small; moreover, it can be used at lower sample means than documented in previous studies. The third objective is to provide guidance on how and when to use appropriate test statistics for both Poisson and NB regression models.
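The two classical statistics are one-liners, and computing them on data with a low sample mean shows the setting the paper re-examines. Illustrative data with the true mean supplied, so X² should sit near its degrees of freedom:

```python
import numpy as np

rng = np.random.default_rng(9)

mu = np.full(200, 0.5)                 # low sample mean, as in the paper
y = rng.poisson(mu)

# Pearson's X^2 and the scaled deviance G^2 for a Poisson model
pearson_x2 = np.sum((y - mu) ** 2 / mu)
with np.errstate(divide="ignore", invalid="ignore"):
    term = np.where(y > 0, y * np.log(y / mu), 0.0)   # 0 * log 0 -> 0
scaled_deviance_g2 = 2.0 * np.sum(term - (y - mu))
print(pearson_x2, scaled_deviance_g2)
```

With many zero counts, G² tends to drift away from the chi-square reference even under the true model, which motivates the grouped-G² workaround and the paper's sample-size-free alternative.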
Affiliation(s)
- Zhirui Ye
- School of Transportation, Southeast University, Nanjing, Jiangsu, China.

35
Zou Y, Geedipally SR, Lord D. Evaluating the double Poisson generalized linear model. Accid Anal Prev 2013; 59:497-505. [PMID: 23954684 DOI: 10.1016/j.aap.2013.07.017] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/29/2012] [Revised: 07/09/2013] [Accepted: 07/12/2013] [Indexed: 06/02/2023]
Abstract
The objectives of this study are to: (1) examine the applicability of the double Poisson (DP) generalized linear model (GLM) for analyzing motor vehicle crash data characterized by over- and under-dispersion, and (2) compare the performance of the DP GLM with the Conway-Maxwell-Poisson (COM-Poisson) GLM in terms of goodness-of-fit and theoretical soundness. The DP distribution has seldom been investigated or applied since its introduction two decades ago; the hurdle is its normalizing constant (or multiplicative constant), which is not available in closed form. This study proposes a new method to approximate the normalizing constant of the DP with high accuracy and reliability. The DP GLM and COM-Poisson GLM were developed using two observed over-dispersed datasets and one observed under-dispersed dataset. The modeling results indicate that the DP GLM, with its normalizing constant approximated by the new method, can handle crash data characterized by over- and under-dispersion. Its performance is comparable to the COM-Poisson GLM in terms of goodness-of-fit (GOF), although the COM-Poisson GLM provides a slightly better fit; for the over-dispersed data, the DP GLM performs similarly to the NB GLM. Considering that the DP GLM can be estimated with inexpensive computation and that its coefficients are simpler to interpret, it offers a flexible and efficient alternative for modeling count data.
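The normalizing-constant hurdle can be seen by brute force: sum Efron's unnormalised double-Poisson pmf and compare the exact constant with Efron's own closed-form approximation. The paper's new method is not reproduced here; parameter values are illustrative, and the formulas below are Efron's as commonly cited:

```python
import math

def dp_unnormalised_sum(mu, theta, max_y=400):
    """Sum of the unnormalised double-Poisson pmf (Efron's form);
    its reciprocal is the exact normalising constant c(mu, theta)."""
    total = math.sqrt(theta) * math.exp(-theta * mu)       # y = 0 term
    for y in range(1, max_y):
        log_t = (0.5 * math.log(theta) - theta * mu
                 - y + y * math.log(y) - math.lgamma(y + 1)
                 + theta * y * (1.0 + math.log(mu) - math.log(y)))
        total += math.exp(log_t)
    return total

mu, theta = 4.0, 0.6                     # over-dispersed case (theta < 1)
c_exact = 1.0 / dp_unnormalised_sum(mu, theta)
c_efron = 1.0 / (1.0 + (1.0 - theta) / (12.0 * theta * mu)
                 * (1.0 + 1.0 / (theta * mu)))
print(c_exact, c_efron)
```

Working in log space avoids overflow for large counts, and the truncated sum gives a reference value against which any closed-form approximation of the constant can be judged.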
Affiliation(s)
- Yaotian Zou
- School of Civil Engineering, Purdue University, 550 Stadium Mall Drive, West Lafayette, IN 47907-2051, United States.

36
Miranda-Moreno LF, Heydari S, Lord D, Fu L. Bayesian road safety analysis: incorporation of past evidence and effect of hyper-prior choice. J Safety Res 2013; 46:31-40. [PMID: 23932683 DOI: 10.1016/j.jsr.2013.03.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/09/2012] [Revised: 03/11/2013] [Accepted: 03/11/2013] [Indexed: 06/02/2023]
Abstract
PROBLEM This paper addresses two related issues in applying hierarchical Bayesian models to road safety analysis: (a) how to incorporate available information from previous studies or past experience into the (hyper-)prior distributions for model parameters, and (b) what the potential benefits of incorporating past evidence are when working with scarce accident data (i.e., when calibrating models with crash datasets characterized by a very low average number of accidents and a small number of sites). METHOD A simulation framework was developed to evaluate the performance of alternative hyper-priors, including informative and non-informative Gamma, Pareto, and Uniform distributions. Within this framework, different data scenarios (i.e., numbers of observations and years of data) were defined and tested using crash data collected at 3-legged rural intersections in California and on rural 4-lane highway segments in Texas. RESULTS This study shows that the accuracy of model parameter estimates (inverse dispersion parameter) is considerably improved by incorporating past evidence, in particular when working with a small number of observations and crash data with a low mean. The results also illustrate that when the sample size (more than 100 sites) and the number of years of crash data are relatively large, neither the incorporation of past experience nor the choice of hyper-prior distribution has much effect on the final results of a traffic safety analysis. CONCLUSIONS As a potential solution to the problem of low sample mean and small sample size, this paper offers practical guidance on how to encode past evidence in informative hyper-priors. By combining evidence from past studies with the data available, model parameter estimates can be significantly improved. The effect of prior choice appears to be less important for hotspot identification.
IMPACT ON INDUSTRY The results show the benefits of incorporating prior information when working with limited crash data in road safety studies.
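One simple way to build an informative Gamma hyper-prior from past evidence is moment matching (a generic sketch, not the paper's procedure; the function name is hypothetical):

```python
def gamma_hyperprior_from_past_evidence(past_estimates):
    """Moment-match an informative Gamma(shape, rate) hyper-prior for the NB
    inverse dispersion parameter to estimates reported in past studies
    (Gamma mean = shape/rate, variance = shape/rate**2)."""
    n = len(past_estimates)
    m = sum(past_estimates) / n
    var = sum((x - m) ** 2 for x in past_estimates) / (n - 1)
    return m * m / var, m / var  # (shape, rate)
```

For example, past inverse-dispersion estimates of 0.8, 1.0, and 1.2 yield roughly Gamma(25, 25), i.e., a prior centered at 1.0 with a tight spread.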
Affiliation(s)
- Luis F Miranda-Moreno
- Department of Civil Engineering and Applied Mechanics, McGill University, Macdonald Engineering Building, 817 Sherbrooke St. W., Montreal, Quebec H3A 2K6, Canada.
37. Zou Y, Zhang Y, Lord D. Application of finite mixture of negative binomial regression models with varying weight parameters for vehicle crash data analysis. Accid Anal Prev 2013; 50:1042-1051. [PMID: 23022076] [DOI: 10.1016/j.aap.2012.08.004]
Abstract
Recently, a finite mixture of negative binomial (NB) regression models has been proposed to address the unobserved heterogeneity problem in vehicle crash data. This approach can provide useful information about features of the population under study. In a standard finite mixture of regression models, previous studies have used a fixed weight parameter applied to the entire dataset; however, several studies suggest modeling the weight parameter as a function of the explanatory variables. The objective of this study is to investigate the differences in modeling and fitting results between the two-component finite mixture of NB regression models with fixed weight parameters (FMNB-2) and with varying weight parameters (GFMNB-2), and to compare the group classifications from both models. To this end, the FMNB-2 and GFMNB-2 models are applied to two crash datasets. The important findings can be summarized as follows: first, the GFMNB-2 models provide more reasonable classification results, as well as better statistical fitting performance, than the FMNB-2 models; second, the GFMNB-2 models better reveal the source of the dispersion observed in the crash data. It is therefore concluded that in many cases the GFMNB-2 models may be a better alternative to the FMNB-2 models for explaining the heterogeneity and the nature of the dispersion in crash data.
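The GFMNB-2 density described above can be sketched as a two-component NB mixture whose weight follows a logistic function of the covariates (a minimal sketch under assumed log-linear mean functions; parameter names are illustrative, not from the paper):

```python
import math

def nb_pmf(y, mu, phi):
    """NB pmf with mean mu and inverse dispersion phi (variance = mu + mu**2 / phi)."""
    return math.exp(math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
                    + phi * math.log(phi / (phi + mu)) + y * math.log(mu / (phi + mu)))

def gfmnb2_pmf(y, x, beta1, beta2, gamma, phi1, phi2):
    """Two-component finite mixture of NB regressions in which the mixing
    weight varies with the covariates through a logistic link (GFMNB-2 form)."""
    mu1 = math.exp(sum(b * xi for b, xi in zip(beta1, x)))  # component-1 mean
    mu2 = math.exp(sum(b * xi for b, xi in zip(beta2, x)))  # component-2 mean
    w = 1.0 / (1.0 + math.exp(-sum(g * xi for g, xi in zip(gamma, x))))  # w(x) in (0, 1)
    return w * nb_pmf(y, mu1, phi1) + (1.0 - w) * nb_pmf(y, mu2, phi2)
```

Setting gamma to a constant vector recovers the fixed-weight FMNB-2 special case.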
Affiliation(s)
- Yajie Zou
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States.
38. Lord D, Kuo PF. Examining the effects of site selection criteria for evaluating the effectiveness of traffic safety countermeasures. Accid Anal Prev 2012; 47:52-63. [PMID: 22405239] [DOI: 10.1016/j.aap.2011.12.008]
Abstract
The primary objective of this paper is to describe how site selection effects can influence the estimated safety effectiveness of treatments. More specifically, the goal is to quantify the bias in the safety effectiveness of a treatment as a function of different entry criteria as well as other factors associated with crash data, and to propose a new method to minimize this bias when a control group is not available. The study objective was accomplished using simulated data. The proposed method was compared to the four most common types of before-after studies: the Naïve method, the control group (CG) method, the empirical Bayes (EB) method based on the method of moments (EB(MM)), and the EB method based on a control group (EB(CG)). Five scenarios were examined: a direct comparison of the methods, different dispersion parameter values of the Negative Binomial model, different sample sizes, different values of the index of safety effectiveness (θ), and different levels of uncertainty associated with the index. Based on the simulated scenarios (also supported theoretically), the study results showed that higher entry criteria, larger values of the safety effectiveness, and smaller dispersion parameter values all produce a larger selection bias. Furthermore, among the methods evaluated, the Naïve and EB(MM) methods are both significantly affected by selection bias. The CG and EB(CG) methods can eliminate the site selection bias, as long as the characteristics of the control group (truncated data for the CG method or the non-truncated sample population for the EB(CG) method) are exactly the same as those of the treatment group. In practice, finding control-group datasets with exactly the same characteristics as the treatment group may not always be feasible. To overcome this problem, the method proposed in this study can be used to adjust the Naïve estimator of the index of safety effectiveness, even when the mean and dispersion parameter are not properly estimated.
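For context, the textbook EB combination underlying the EB(MM) and EB(CG) variants compared above can be sketched as follows (this is the standard Hauer-style shrinkage estimate, not the paper's proposed bias-adjustment method):

```python
def eb_estimate(observed, mu_spf, phi):
    """Empirical Bayes estimate of a site's expected crash count, shrinking the
    observed count toward the safety performance function (SPF) prediction;
    phi is the NB inverse dispersion parameter, so w -> 1 as phi grows
    (i.e., the SPF is trusted more when dispersion is small)."""
    w = phi / (phi + mu_spf)
    return w * mu_spf + (1.0 - w) * observed
```

With an SPF prediction of 4 crashes, phi = 4, and 10 observed crashes, the weight is 0.5 and the EB estimate is 7.0; selecting sites on high observed counts is exactly what the shrinkage is meant to counteract.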
Affiliation(s)
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, USA
39. Geedipally SR, Lord D, Dhavala SS. The negative binomial-Lindley generalized linear model: characteristics and application using crash data. Accid Anal Prev 2012; 45:258-265. [PMID: 22269508] [DOI: 10.1016/j.aap.2011.07.012]
Abstract
There has been a considerable amount of work devoted by transportation safety analysts to the development and application of new and innovative models for analyzing crash data. One important characteristic of crash data documented in the literature concerns datasets that contain a large number of zeros and a long or heavy tail (which creates highly dispersed data). For such datasets, the number of sites where no crash is observed is so large that traditional distributions and regression models, such as the Poisson and Poisson-gamma (negative binomial, NB) models, cannot be used efficiently. To overcome this problem, the NB-Lindley (NB-L) distribution has recently been introduced for analyzing count data characterized by excess zeros. The objective of this paper is to document the application of an NB generalized linear model with Lindley mixed effects (NB-L GLM) for analyzing traffic crash data. The study objective was accomplished using simulated and observed datasets. The simulated dataset was used to show the general performance of the model. The model was then applied to two observed datasets, one of which was characterized by a large number of zeros. The NB-L GLM was compared with the NB and zero-inflated models. Overall, the study shows that the NB-L GLM offers superior performance over the NB and zero-inflated models not only when datasets are characterized by a large number of zeros and a long tail, but also when the crash dataset is highly dispersed.
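The mixing structure behind the NB-L model can be illustrated with a small Monte Carlo sampler (a sketch assuming the common representation of the Lindley as a two-component gamma mixture; function names are not from the paper):

```python
import math
import random

def sample_lindley(theta, rng=random):
    """Draw a Lindley(theta) variate via its representation as a mixture:
    Gamma(1, theta) with probability theta/(theta+1), else Gamma(2, theta)."""
    shape = 1.0 if rng.random() < theta / (theta + 1.0) else 2.0
    return rng.gammavariate(shape, 1.0 / theta)  # gammavariate takes (shape, scale)

def sample_nb_lindley(mu, phi, theta, rng=random):
    """Draw one NB-Lindley count: a Lindley frailty term multiplies the NB mean,
    inflating zeros and thickening the tail relative to a plain NB."""
    eps = sample_lindley(theta, rng)
    lam = rng.gammavariate(phi, mu * eps / phi)  # NB as a gamma-mixed Poisson
    # Poisson(lam) draw by inversion (adequate for moderate lam)
    u, p, k = rng.random(), math.exp(-lam), 0
    c = p
    while u > c:
        k += 1
        p *= lam / k
        c += p
    return k
```

The Lindley(theta) frailty has mean (theta + 2) / (theta * (theta + 1)), which is how the long-term mean of the compound model stays strictly positive.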
Affiliation(s)
- Srinivas Reddy Geedipally
- Texas Transportation Institute, Texas A&M University, 3135 TAMU, College Station, TX 77843-3135, United States.
40. Patil S, Geedipally SR, Lord D. Analysis of crash severities using nested logit model: accounting for the underreporting of crashes. Accid Anal Prev 2012; 45:646-653. [PMID: 22269553] [DOI: 10.1016/j.aap.2011.09.034]
Abstract
Recent studies in highway safety have demonstrated the usefulness of logit models for modeling crash injury severities. These models enable one to identify and quantify the effects of factors that contribute to particular levels of severity. Most often, such models are estimated assuming that each injury severity level in the data is sampled with equal probability. However, traffic crash data are generally characterized by underreporting, especially when crashes result in lower injury severity. Thus, the sample used for an analysis is often outcome-based, which can bias the estimated model parameters. This is more of a problem when a nested logit model specification is used instead of a multinomial logit model and when the true shares of the outcomes (injury severity levels) in the population are not known, which is almost always the case. This study demonstrates the application of a recently proposed weighted conditional maximum likelihood estimator to the problem of crash underreporting when using a nested logit model for crash severity analyses.
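The weighting idea behind estimators of this kind (in the spirit of the classic weighted exogenous sample maximum likelihood correction for outcome-based sampling; not necessarily the exact estimator used in the paper) can be sketched as:

```python
def wesml_weights(true_shares, sample_counts):
    """Observation weights that correct outcome-based (choice-based) sampling:
    an observation whose outcome is j gets weight
    (population share of j) / (sample share of j)."""
    n = sum(sample_counts.values())
    return {j: true_shares[j] / (sample_counts[j] / n) for j in true_shares}
```

Outcomes that are underreported in the sample (typically low-severity crashes) receive weights above one, while overrepresented outcomes are down-weighted; the practical difficulty the abstract highlights is that the true population shares are rarely known.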
Affiliation(s)
- Sunil Patil
- RAND Europe, Westbrook Center, Milton Road, Cambridge-CB4 1YG, UK.
41. Francis RA, Geedipally SR, Guikema SD, Dhavala SS, Lord D, LaRocca S. Characterizing the performance of the Conway-Maxwell Poisson generalized linear model. Risk Anal 2012; 32:167-183. [PMID: 21801191] [DOI: 10.1111/j.1539-6924.2011.01659.x]
Abstract
Count data are pervasive in many areas of risk analysis; deaths, adverse health outcomes, infrastructure system failures, and traffic accidents are all recorded as count events, for example. Risk analysts often wish to estimate the probability distribution for the number of discrete events as part of doing a risk assessment. Traditional count data regression models of the type often used in risk assessment for this problem suffer from limitations due to the assumed variance structure. A more flexible model based on the Conway-Maxwell Poisson (COM-Poisson) distribution was recently proposed, a model that has the potential to overcome the limitations of the traditional model. However, the statistical performance of this new model has not yet been fully characterized. This article assesses the performance of a maximum likelihood estimation method for fitting the COM-Poisson generalized linear model (GLM). The objectives of this article are to (1) characterize the parameter estimation accuracy of the MLE implementation of the COM-Poisson GLM, and (2) estimate the prediction accuracy of the COM-Poisson GLM using simulated data sets. The results of the study indicate that the COM-Poisson GLM is flexible enough to model under-, equi-, and overdispersed data sets with different sample mean values. The results also show that the COM-Poisson GLM yields accurate parameter estimates. The COM-Poisson GLM provides a promising and flexible approach for performing count data regression.
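The COM-Poisson pmf whose estimation is evaluated in this article can be sketched directly (a brute-force evaluation that truncates the infinite normalizing sum; this is illustrative, not the authors' MLE code):

```python
import math

def com_poisson_pmf(y, lam, nu, max_y=500):
    """COM-Poisson pmf P(Y = y) = lam**y / (y!)**nu / Z(lam, nu), where the
    normalizing constant Z is the (truncated) sum of lam**j / (j!)**nu.
    nu > 1 gives under-dispersion, nu < 1 over-dispersion, nu = 1 Poisson."""
    def log_kernel(j):
        return j * math.log(lam) - nu * math.lgamma(j + 1)
    terms = [log_kernel(j) for j in range(max_y + 1)]
    m = max(terms)  # log-sum-exp for numerical stability
    log_z = m + math.log(sum(math.exp(t - m) for t in terms))
    return math.exp(log_kernel(y) - log_z)
```

Because the dispersion parameter nu acts on the factorial term, a single functional form covers under-, equi-, and overdispersed counts, which is the flexibility the article is assessing.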
Affiliation(s)
- Royce A Francis
- Department of Engineering Management and Systems Engineering, George Washington University, Washington, DC, USA
42. Lord D, Geedipally SR. The negative binomial-Lindley distribution as a tool for analyzing crash data characterized by a large amount of zeros. Accid Anal Prev 2011; 43:1738-1742. [PMID: 21658501] [DOI: 10.1016/j.aap.2011.04.004]
Abstract
The modeling of crash count data is a very important topic in highway safety. As documented in the literature, given the characteristics of crash data, transportation safety analysts have proposed a significant number of analysis tools, statistical methods, and models for analyzing such data. One recurring data issue concerns crash datasets with a large number of zeros and a long or heavy tail; using the wrong statistical tools or methods on such datasets can lead to erroneous results or conclusions. The purpose of this paper is therefore to introduce the negative binomial-Lindley (NB-L), a distribution recently proposed for analyzing data characterized by a large number of zeros. The NB-L can handle this kind of dataset while maintaining characteristics similar to the traditional negative binomial (NB): it is a two-parameter distribution, and its long-term mean is never equal to zero. To examine this distribution, simulated and observed data were used. The results show that the NB-L can provide a better statistical fit than the traditional NB for datasets that contain a large number of zeros.
Affiliation(s)
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, USA.
43. Savolainen PT, Mannering FL, Lord D, Quddus MA. The statistical analysis of highway crash-injury severities: a review and assessment of methodological alternatives. Accid Anal Prev 2011; 43:1666-1676. [PMID: 21658493] [DOI: 10.1016/j.aap.2011.03.025]
Abstract
Reducing the severity of injuries resulting from motor-vehicle crashes has long been a primary emphasis of highway agencies and motor-vehicle manufacturers. While progress can be simply measured by the reduction in injury levels over time, insights into the effectiveness of injury-reduction technologies, policies, and regulations require a more detailed empirical assessment of the complex interactions that vehicle, roadway, and human factors have on resulting crash-injury severities. Over the years, researchers have used a wide range of methodological tools to assess the impact of such factors on disaggregate-level injury-severity data, and recent methodological advances have enabled the development of sophisticated models capable of more precisely determining the influence of these factors. This paper summarizes the evolution of research and current thinking as it relates to the statistical analysis of motor-vehicle injury severities, and provides a discussion of future methodological directions.
Affiliation(s)
- Peter T Savolainen
- Department of Civil and Environmental Engineering, Wayne State University, 5050 Anthony Wayne Drive, Detroit, MI 48202-3902, United States.
44. Lord D, Page R. Elucidating the functions of key regulators in biofilm formation and dispersal. Acta Crystallogr A 2011. [DOI: 10.1107/s010876731108785x]
45. Ye Z, Veneziano D, Lord D. Safety impact of Gateway Monuments. Accid Anal Prev 2011; 43:290-300. [PMID: 21094327] [DOI: 10.1016/j.aap.2010.08.027]
Abstract
Gateway Monuments are free-standing roadside structures or signage that communicate the name of a city, county, or township to motorists. The placement of such monuments within state-controlled right-of-way is a relatively recent occurrence in California. As a result, the California Department of Transportation (Caltrans) initiated research to quantify the impacts this type of signage may have on crashes in its vicinity; to date, no specific research has examined the impact such features have on crashes. To determine whether these features affected safety, a before-after study using the Empirical Bayes (EB) technique was conducted, with reference groups and Safety Performance Functions adapted from existing studies, eliminating the need to calibrate new models. Results indicated that, on an individual basis, no deterioration in safety was observed at any monument site. When all sites were examined collectively (under two different scenarios), the calculated index of effectiveness values were 0.978 and 0.680, corresponding to 2.2% and 32.0% reductions in crashes, respectively. In addition to the EB method, naïve study methods (with and without AADT taken into account) were applied to the study data; results (crash reductions) from these methods likewise showed that the presence of Gateway Monuments did not have a negative impact on traffic safety. However, the EB technique should be employed with care when reference groups are adopted from other jurisdictions, as this may affect the validity of the EB results. In light of these results, Caltrans may continue to participate in the Gateway Monument Program at its discretion with the knowledge that roadway safety is not impacted by the monuments.
Affiliation(s)
- Zhirui Ye
- Western Transportation Institute, Montana State University, Bozeman, MT, USA.
46. Li X, Lord D, Zhang Y. Development of Accident Modification Factors for Rural Frontage Road Segments in Texas Using Generalized Additive Models. J Transp Eng 2011. [DOI: 10.1061/(asce)te.1943-5436.0000202]
Affiliation(s)
- Xiugang Li
- Transportation Analyst, Oregon Dept. of Transportation, 555 13th St. NE Ste 2, Salem, OR 97301
- Dominique Lord
- Associate Professor, Texas A&M Univ., CE/TTI 301-A, 3136 TAMU, College Station, TX 77843 (corresponding author)
- Yunlong Zhang
- Associate Professor, Texas A&M Univ., CE/TTI 301-G, 3136 TAMU, College Station, TX 77843
47. Fitzpatrick K, Lord D, Park BJ. Horizontal Curve Accident Modification Factor with Consideration of Driveway Density on Rural Four-Lane Highways in Texas. J Transp Eng 2010. [DOI: 10.1061/(asce)te.1943-5436.0000145]
Affiliation(s)
- Kay Fitzpatrick
- Senior Research Engineer, Texas Transportation Institute, Texas A&M Univ. System, 3135 TAMU, College Station, TX 77843-3135 (corresponding author)
- Dominique Lord
- Associate Professor, Texas A&M Univ., 3136 TAMU, College Station, TX 77843-3136
- Byung-Jung Park
- Graduate Research Assistant, Texas Transportation Institute, Texas A&M Univ. System, 3135 TAMU, College Station, TX 77843-3135
48. Lord D, Geedipally SR, Guikema SD. Extension of the application of Conway-Maxwell-Poisson models: analyzing traffic crash data exhibiting underdispersion. Risk Anal 2010; 30:1268-1276. [PMID: 20412518] [DOI: 10.1111/j.1539-6924.2010.01417.x]
Abstract
The objective of this article is to evaluate the performance of the COM-Poisson GLM for analyzing crash data exhibiting underdispersion (when conditional on the mean). The COM-Poisson distribution, originally developed in 1962, has recently been reintroduced by statisticians for analyzing count data subject to either over- or underdispersion. Over the last year, the COM-Poisson GLM has been evaluated in the context of crash data analysis, and it has been shown that the model performs as well as the Poisson-gamma model for crash data exhibiting overdispersion. To accomplish the objective of this study, several COM-Poisson models were estimated using crash data collected at 162 railway-highway crossings in South Korea between 1998 and 2002. This dataset has been shown to exhibit underdispersion when models linking crash data to various explanatory variables are estimated. The modeling results were compared to those produced by the Poisson and gamma probability models documented in a previously published study. The results of this research show that the COM-Poisson GLM can handle crash data when the modeling output shows signs of underdispersion. They also show that the model proposed in this study provides better statistical performance than the gamma probability and traditional Poisson models, at least for this dataset.
Affiliation(s)
- Dominique Lord
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, USA.
49. Geedipally SR, Lord D. Investigating the effect of modeling single-vehicle and multi-vehicle crashes separately on confidence intervals of Poisson-gamma models. Accid Anal Prev 2010; 42:1273-1282. [PMID: 20441842] [DOI: 10.1016/j.aap.2010.02.004]
Abstract
Crash prediction models still constitute one of the primary tools for estimating traffic safety, and these statistical models play a vital role in various types of safety studies. With few exceptions, they have been employed to estimate the number of crashes per unit of time for an entire highway segment or intersection, without distinguishing the influence that different sub-groups of crashes have on crash risk. The two most important sub-groups identified in the literature are single- and multi-vehicle crashes. Recently, some researchers have noted that developing two distinct models for these two crash categories provides better predictive performance than a single combined model. Thus, there is a need to determine whether a significant difference exists in the computation of confidence intervals when a single model is applied rather than two distinct models for single- and multi-vehicle crashes; building confidence intervals has many important applications in highway safety. This paper investigates the effect of modeling single- and multi-vehicle (head-on and rear-end only) crashes separately versus together on the prediction of confidence intervals of Poisson-gamma models. Confidence intervals were calculated for total (all severities) crash models and for fatal and severe injury crash models. The data used for the comparison were collected on Texas multilane undivided highways for the years 1997-2001. This study shows that modeling single- and multi-vehicle crashes separately predicts larger confidence intervals than modeling them together in a single model; this difference is much larger for fatal and injury crash models than for models covering all severity levels. Furthermore, it is found that single- and multi-vehicle crashes are not independent. Thus, a joint (bivariate) model that accounts for the correlation between single- and multi-vehicle crashes is developed; it predicts wider confidence intervals than a univariate model for all severities. Finally, simulation results show that separate models predict values closer to the true confidence intervals, and thus this research supports previous studies recommending that single- and multi-vehicle crashes be modeled separately when analyzing highway segments.
50. Park BJ, Lord D, Hart JD. Bias properties of Bayesian statistics in finite mixture of negative binomial regression models in crash data analysis. Accid Anal Prev 2010; 42:741-749. [PMID: 20159102] [DOI: 10.1016/j.aap.2009.11.002]
Abstract
Factors that cause heterogeneity in crash data are often unknown to researchers, and failure to accommodate such heterogeneity in statistical models can undermine the validity of empirical results. A recently proposed finite mixture of negative binomial regression models has shown a potential advantage in addressing unobserved heterogeneity as well as providing useful information about features of the population under study. Despite its usefulness, however, no study has examined the performance of this finite mixture under the various sample sizes and sample-mean values that are common in crash data analysis. This study investigated the bias associated with the Bayesian summary statistics (posterior mean and median) of the dispersion parameters in the two-component finite mixture of negative binomial regression models. A simulation study was conducted using various sample sizes under different sample-mean values. Two prior specifications (non-informative and weakly informative) on the dispersion parameter were also compared. The results showed that the posterior mean under the non-informative prior exhibits a high bias for the dispersion parameter and should be avoided when the dataset contains fewer than 2,000 observations (even for high sample-mean values). The posterior median showed much better bias properties, particularly at small sample sizes and small sample means; however, as the sample size increases, the posterior median under the non-informative prior also begins to exhibit an upward bias. In such cases, the posterior mean or median under the weakly informative prior provides smaller bias. Based on the simulation results, guidelines on the choice of priors and summary statistics are presented for different sample sizes and sample-mean values.
Affiliation(s)
- Byung-Jung Park
- Zachry Department of Civil Engineering, Texas A&M University, 3136 TAMU, College Station, TX 77843-3136, United States