Understanding the effects of underreporting on injury severity estimation of single-vehicle motorcycle crashes: A hybrid approach incorporating majority class oversampling and random parameters with heterogeneity-in-means
Authors: Nawaf Alnawmasi, Apostolos Ziakopoulos, Athanasios Theofilatos, Yasir Ali
Journal: Analytic Methods in Accident Research, Volume 45, Article 100372
Published: 2025-01-23
DOI: 10.1016/j.amar.2025.100372
URL: https://www.sciencedirect.com/science/article/pii/S221366572500003X
Journal impact factor: 12.5 (JCR Q1, Public, Environmental & Occupational Health)
Citations: 0
Abstract
The underreporting of crash data is a well-documented issue in the road safety literature, but few studies have addressed it in the context of crash injury severity analysis. This paper provides an empirical assessment of the impact of underreporting on injury severity estimation for single-vehicle motorcycle crashes using a hybrid approach. Unlike traditional machine learning methods that oversample the minority class (the category with fewer observations, such as fatal and severe injuries), the present study oversamples the majority class (i.e., minor injuries), which is often underreported in crash datasets, thus providing a fresh perspective on this issue. Random parameters models with heterogeneity in means and variances were then estimated. The results, supported by likelihood ratio tests, indicate that the key variables influencing motorcyclists’ injury severities remain consistent across the original and oversampled data models. Specifically, crashes occurring while slowing down or stopping are associated with lower injury severity, whereas negotiating a right turn increases the probability of severe injury. Interestingly, crashes on dry pavement are associated with higher injury severity than crashes on wet pavement, likely because riders adjust their behavior in adverse weather conditions to compensate for the risk. Overall, the oversampled models yield significantly lower marginal effects than the original model. This study provides a foundation for further examination of underreporting in crash injury severity modelling and highlights the need to capture the dynamics of crash injuries, suggesting that alternative approaches could improve understanding and, in turn, road safety management. Future studies are encouraged to replicate this methodology to validate the findings and to use other advanced machine learning algorithms, such as tree-based models, to assess underreporting mitigation.
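The abstract does not state how the majority class was resampled or by how much, so the following is only a minimal sketch of the general idea, assuming plain random oversampling with replacement. The function name `oversample_majority`, the `factor` parameter (an assumed ratio of true to reported minor-injury crashes), and the toy records are all hypothetical and are not taken from the study.

```python
import random
from collections import Counter


def oversample_majority(records, label_key, majority_label, factor, seed=0):
    """Return the records plus extra majority-class rows drawn with replacement.

    `factor` is a hypothetical ratio of assumed-true to reported majority
    count (e.g. 1.5 adds 50% more majority-class rows).
    """
    if factor < 1.0:
        raise ValueError("factor must be >= 1.0")
    rng = random.Random(seed)  # fixed seed so the resample is reproducible
    majority = [r for r in records if r[label_key] == majority_label]
    n_extra = round(len(majority) * (factor - 1.0))
    # Copy each sampled row so later edits do not alias the original records.
    extra = [dict(rng.choice(majority)) for _ in range(n_extra)]
    return list(records) + extra


# Toy data invented for illustration only (not from the study).
crashes = (
    [{"severity": "minor", "dry_pavement": i % 2} for i in range(60)]
    + [{"severity": "severe", "dry_pavement": 1} for _ in range(30)]
    + [{"severity": "fatal", "dry_pavement": 1} for _ in range(10)]
)

augmented = oversample_majority(crashes, "severity", "minor", factor=1.5)
counts = Counter(r["severity"] for r in augmented)
print(counts)
```

In a workflow like the one the abstract describes, the severity model would be estimated once on the original records and once on the augmented ones, and the resulting marginal effects compared. Only the minor-injury rows are duplicated here; the severe and fatal counts are left unchanged.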
About the journal:
Analytic Methods in Accident Research is a journal that publishes articles related to the development and application of advanced statistical and econometric methods in studying vehicle crashes and other accidents. The journal aims to demonstrate how these innovative approaches can provide new insights into the factors influencing the occurrence and severity of accidents, thereby offering guidance for implementing appropriate preventive measures. While the journal primarily focuses on the analytic approach, it also accepts articles covering various aspects of transportation safety (such as road, pedestrian, air, rail, and water safety), construction safety, and other areas where human behavior, machine failures, or system failures lead to property damage or bodily harm.