Analytic Methods in Accident Research: Latest Articles

Revisiting traffic conflict modelling: comparing generalized Pareto and Lomax models for failure-induced and proximity-based conflicts
IF 12.6; Zone 1 (Engineering and Technology); Q1, PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH; Pub Date: 2026-03-01; Epub Date: 2026-01-05; DOI: 10.1016/j.amar.2026.100418
Harpreet Singh, Shimul Md Mazharul Haque
Over the past few decades, traffic conflict modelling with proximity-based conflicts has emerged as a key approach for estimating crash risk from traffic conflicts, with extreme value models providing a rigorous framework for extrapolating rare-event probabilities. However, proximity-based definitions of conflicts may lead to biased estimation, as they often include interactions arising from deliberate and controlled driving behaviours that may not correspond to actual crash likelihood. In contrast, failure-induced conflicts that take into account evasive action and response delays can potentially overcome this limitation. Despite these advances, a comprehensive comparison of proximity-based and failure-induced conflicts within crash risk modelling is still lacking. This study addresses this gap by comparing and evaluating the performance of different threshold-exceedance modelling approaches for crash frequency estimation. Three threshold-exceedance models are evaluated in the study, including (i) a Lomax model applied to response delays during failure-induced conflicts, (ii) a Generalized Pareto Distribution model for proximity-based conflicts, and (iii) a Generalized Pareto Distribution model for failure-induced conflicts. Empirical analysis is conducted using high-resolution trajectory data from four signalized intersections in Brisbane, Australia. The results indicate that both the Generalized Pareto Distribution model for failure-induced conflicts and the Lomax model, representing response delays within failure-induced conflicts, provided reasonable estimates of historical rear-end crashes, with predicted crash counts contained within the 95 % confidence interval of the observed crash data. In contrast, the Generalized Pareto Distribution model for the proximity-based conflicts overestimated the crash frequency. 
Notably, within the failure-induced conflicts, the Generalized Pareto Distribution model demonstrated greater accuracy than the Lomax model, yielding estimates closer to the observed mean and with narrower confidence bounds, thereby indicating higher predictive precision. Overall, the findings underscore the value of incorporating failure-induced conflicts into traffic conflict modelling, revealing that the Generalized Pareto Distribution model with failure-induced conflicts provides more accurate and reliable crash risk estimates.
Analytic Methods in Accident Research, Volume 49, Article 100418. Journal Article.
Citations: 0
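The Lomax and Generalized Pareto models compared in the abstract above are closely related: a Lomax distribution is a Generalized Pareto Distribution whose shape parameter is strictly positive. The sketch below checks that relationship numerically with SciPy. The parameter values and the helper name `lomax_as_gpd` are illustrative assumptions and are not taken from the paper.

```python
import numpy as np
from scipy.stats import genpareto, lomax


def lomax_as_gpd(c, scale):
    """Map Lomax(shape=c, scale) to the equivalent GPD (xi, sigma).

    Lomax survival: (1 + x/scale)^(-c)
    GPD survival:   (1 + xi*x/sigma)^(-1/xi)
    Matching the two gives xi = 1/c and sigma = scale/c.
    """
    return 1.0 / c, scale / c


# Made-up parameters for illustration only (not estimates from the study).
c, scale = 4.0, 2.0
xi, sigma = lomax_as_gpd(c, scale)

# Exceedance sizes above a threshold at which the two tails are compared.
x = np.array([0.0, 0.5, 1.0, 3.0])
lomax_tail = lomax.sf(x, c, scale=scale)
gpd_tail = genpareto.sf(x, xi, scale=sigma)

# The two survival functions should agree up to floating-point error.
print(np.allclose(lomax_tail, gpd_tail))
```

Because the Lomax form only covers a positive shape parameter (an unbounded, heavy tail), a Generalized Pareto fit can also represent zero or negative shapes, including tails with a finite upper bound. Note that in the study the Lomax model is applied to response delays, which is a different variable from the conflict measures, so this identity alone does not explain the reported differences.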
Accounting for temporal instability in macro-level crash frequency modeling: A framework integrating high-resolution traffic dynamic patterns and spatiotemporal approaches
IF 12.6; Zone 1 (Engineering and Technology); Q1, PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH; Pub Date: 2026-03-01; Epub Date: 2025-12-18; DOI: 10.1016/j.amar.2025.100417
Jin Liu, Hao Yu, Guoyao Yang, Zhenning Li, Pan Liu
Small-scale temporal factors have substantial effects on zonal crash risk, yet their influence has long been overlooked due to data limitations. This omission may introduce confounding and bias the estimation of observed annually aggregated variables. This study aims to combine high-resolution traffic dynamic patterns derived from taxi trajectory data and advanced spatiotemporal models to account for hourly-scale temporal instability in year-level crash frequency modeling. Two sets of hourly-scale spatiotemporal traffic dynamic patterns were extracted, enabling the development of small-scale models. A spatiotemporal model with an adaptive smoothing spatial specification was employed to further capture temporal effects. Model specification results revealed strong temporal autocorrelation at the hourly scale, with a magnitude comparable to the spatial autocorrelation. The results showed that the model effectively captured the varying safety impacts of unobserved temporal factors across hourly intervals, and that the extracted high-resolution patterns successfully internalized previously unobserved time-relevant information into the model’s linear component. Comparative analyses demonstrated that both incorporating traffic dynamic patterns and accounting for hourly-scale spatiotemporal autocorrelation in crash frequency modeling significantly improved the model fit and predictive performance. The proposed framework detected a broader set of risk factors than purely spatial models, and yielded less biased posterior means and more rigorous intervals through disentangling small-scale noise from fixed-effect signals and preventing pseudo-replication of observations. The extracted traffic flow dynamics also represent a fundamental yet traditionally inaccessible set of factors in regional crash analysis. This study established their macro-level associations with crash risk, revealing that higher mean speeds and lower speed fluctuations are linked to reduced crash risk. 
Additionally, these patterns can be regarded as “high-frequency” features, and those extracted from just one month of data proved sufficient to construct models comparable to those based on the full-year dataset. This finding enables a practical framework that leverages real-time updated high-frequency variables as model inputs for rolling crash risk prediction. The methods and findings of this study offer practitioners in-depth macro-level insights into crash causation and valuable guidance on regional traffic safety interventions.
Analytic Methods in Accident Research, Volume 49, Article 100417. Journal Article.
Citations: 0
An evasive action-based bivariate extreme value model for estimating pedestrian crash frequency using traffic conflicts
IF 12.6; Zone 1 (Engineering and Technology); Q1, PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH; Pub Date: 2026-03-01; Epub Date: 2026-02-02; DOI: 10.1016/j.amar.2026.100420
Saransh Sahu, Yasir Ali, Sebastien Glaser, Shimul Md Mazharul Haque
Traditional models, employing extreme value theory for estimating pedestrian crashes from traffic conflicts, commonly utilise popular conflict measures, such as post encroachment time and gap time. Whilst these measures have proven useful, they are limited in identifying a vehicle–pedestrian conflict based on a fixed threshold value and depend on subjective graphical-based extreme identification methods, which neither fully capture the dynamic interactions between vehicles and pedestrians nor account for road user behaviour to identify conflicting events. This study proposes a bivariate extreme value modelling framework that analyses evasive action-based traffic conflicts by integrating risk force theory and artificial intelligence-based video analytics to estimate pedestrian crash frequency by severity. The methodological framework quantifies crash risk dynamically during vehicle–pedestrian interactions and identifies traffic conflict events based on evasive behaviours. Traffic conflicts are modelled using a Generalised Pareto distribution to capture the tail behaviour of high-risk conflicts. The proposed econometric modelling framework was validated using 72 h of traffic movement data from three signalised intersections in Queensland, Australia. Results demonstrate that the Generalised Pareto distributions effectively fit evasive action-based vehicle–pedestrian conflicts, with estimated total pedestrian crash frequency and severe crash frequency aligning closely with historical crash records, thereby supporting the validity of the proposed model. This study presents a scalable, behaviourally grounded methodology as an alternative to a subjective conflict identification approach, enabling continuous risk assessment for proactive pedestrian safety management and real-time safety analysis.
Analytic Methods in Accident Research, Volume 49, Article 100420. Journal Article.
Citations: 0
The role of regional economic conditions in active traveler injury severity: Accounting for COVID-19 pandemic disruptions
IF 12.6; Zone 1 (Engineering and Technology); Q1, PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH; Pub Date: 2026-03-01; Epub Date: 2026-01-17; DOI: 10.1016/j.amar.2026.100419
Zehao Wang, Wei (David) Fan
Regional economic disparities contribute to a disproportionate number of fatal and severe crashes among active travelers (pedestrians and bicyclists) in economically disadvantaged areas. Such road safety inequalities may be further exacerbated by external shocks such as the COVID-19 pandemic, due to regional variations in safety resilience. However, few studies have examined how the determinants of injury severity vary across regions with differing economic conditions, while accounting for COVID-contributing temporal shifts. This study uses North Carolina as a case study, classifying counties into three groups (i.e., highly, moderately, and least distressed counties) based on four economic indicators, and defining three pandemic periods (i.e., before, during, and after the pandemic). A partially constrained random parameter multinomial logit model with heterogeneity in the means and variances is estimated for crashes in each county group. Results show that the effects of factors are more stable in the least distressed counties, suggesting stronger safety resilience under external shocks. Additionally, during the pandemic, alcohol-impaired driving significantly affected injury severity only in highly and moderately distressed counties. Out-of-sample predictions further suggest that the probability of severe injuries among active travelers increases with rising regional economic distress and after the pandemic. Moreover, compared to the least distressed counties, the reduced safety resilience in highly and moderately distressed counties is attributed to weaker recovery and resistance capacities, respectively. These findings provide valuable insights for formulating region-specific policies, detecting system vulnerabilities, and promoting equitable and sustainable active transportation systems.
Analytic Methods in Accident Research, Volume 49, Article 100419. Journal Article.
Citations: 0
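As a rough illustration of the model class named in the abstract above, the following sketch simulates outcome probabilities for a multinomial logit with one random parameter whose mean and variance depend on observation-level covariates. It uses a commonly seen form, β_i = β + δ·z_i + σ·exp(ω·w_i)·ν_i with ν_i standard normal. That form, the single random parameter, the three severity outcomes, the function name, and all numbers are assumptions for illustration; they are not the specification or the estimates reported in the paper.

```python
import numpy as np


def simulated_severity_probs(x, z, w, beta, delta, sigma, omega,
                             base_utils, n_draws=1000, seed=0):
    """Average logit probabilities over draws of one random parameter.

    x: covariate carrying the random parameter (enters the last outcome).
    z, w: covariates shifting the parameter's mean and variance.
    base_utils: fixed utility of each outcome before adding beta_i * x.
    """
    rng = np.random.default_rng(seed)
    nu = rng.standard_normal(n_draws)

    # Heterogeneity in the mean (delta * z) and in the variance (exp(omega * w)).
    beta_i = beta + delta * z + sigma * np.exp(omega * w) * nu

    # One row of utilities per draw; only the last outcome uses beta_i.
    utils = np.tile(np.asarray(base_utils, dtype=float), (n_draws, 1))
    utils[:, -1] += beta_i * x

    # Numerically stable softmax per draw, then average across draws.
    utils -= utils.max(axis=1, keepdims=True)
    expu = np.exp(utils)
    probs = expu / expu.sum(axis=1, keepdims=True)
    return probs.mean(axis=0)


# Hypothetical inputs: outcomes ordered as (no injury, minor, severe).
p = simulated_severity_probs(x=1.0, z=0.5, w=0.2, beta=0.3, delta=0.1,
                             sigma=0.8, omega=0.5,
                             base_utils=[0.0, -0.5, -1.5])
print(p, p.sum())
```

Setting `sigma` and `delta` to zero collapses the sketch to an ordinary multinomial logit, which is a convenient sanity check before estimating anything.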
A note on observed injury bias in police-reported pre-crash travel speed estimates
IF 12.6; Zone 1 (Engineering and Technology); Q1, PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH; Pub Date: 2025-12-01; Epub Date: 2025-10-12; DOI: 10.1016/j.amar.2025.100407
Mouyid Islam, Fred Mannering
Vehicle pre-crash travel speed is one of the most important determinants of driver injury severity. However, pre-crash travel speed estimates made by police officers, especially those in crashes with less severe injuries (where there is less of a need for high levels of accuracy due to potential litigation), can be susceptible to biases because of the tendency to associate less severe driver injuries with lower pre-crash travel speeds. This potential bias makes the use of pre-crash travel speeds in injury-severity modeling highly problematic due to its endogeneity with injury severity. To detect the presence and extent of this problem, a bias correction term for pre-crash travel speed estimation equations is applied by treating injury-severity level (discrete) and pre-crash travel speed (continuous) as a discrete/continuous econometric model. The findings show that for severe injury crashes, the bias correction is statistically insignificant, reflecting the increased accuracy required of police officers in severe crashes. However, for crashes resulting in less severe occupant injuries, there is a significant bias resulting from observed injury levels, which distorts the effects of explanatory variables on pre-crash travel speed estimates. The results of this paper not only provide empirical evidence of potential endogeneity problems in models of crash injury severity but also underscore the need to more fully consider potential endogeneity issues and their associated consequences in statistical models and machine learning models.
Analytic Methods in Accident Research, Volume 48, Article 100407. Journal Article.
Citations: 0
Bayesian forecasting of short-term crash risk with conditional extreme value models: A comparison between one-stage and two-stage approaches
IF 12.6; Zone 1 (Engineering and Technology); Q1, PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH; Pub Date: 2025-12-01; Epub Date: 2025-10-28; DOI: 10.1016/j.amar.2025.100409
Depeng Niu, Tarek Sayed
Extreme Value Theory (EVT) has become a widely used approach for quantifying crash risk from traffic conflict data. Most existing applications, however, rely on unconditional models, which fail to adequately capture dependence in extreme traffic conflicts and do not reliably predict future crash risk. To demonstrate the potential of conditional EVT models for advancing short-term crash risk forecasting, this study compares two conditional EVT approaches within a Bayesian framework that address extremal dependence from distinct perspectives. The first approach is the two-stage GARCH-EVT framework, where conditional mean and variance are modeled using GARCH-type specifications before EVT is applied to the standardized residuals. Both traditional and covariate-augmented variants are examined. The second approach uses a one-stage conditional peak-over-threshold (POT) model, represented by the score-driven POT model, which directly captures dynamics in the conditional exceedance probability and the distribution of exceedance sizes. Crash risk is quantified using two conditional tail risk measures, Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR), with forecasting performance evaluated through traditional and comparative backtesting. An empirical study examines rear-end conflicts collected at two signalized intersections over four observation days to generate one-cycle-ahead crash risk forecasts during the out-of-sample period. Traditional backtesting indicates that both the covariate-augmented GARCH-EVT and the score-driven POT approaches produce valid and comparable forecasts, with the two-stage method yielding estimates with lower uncertainty. Comparative backtesting, however, shows that the score-driven POT model achieves slightly superior forecasting accuracy. 
The weaker performance of the two-stage framework can be attributed to partial removal of extremal dependence, sensitivity to substitute values in cycles without conflicts, and the limitations inherent in its two-stage structure.
Analytic Methods in Accident Research, Volume 48, Article 100409.
Citations: 0
Analyzing crash injury severities with deep learning and advanced statistical models: An assessment of methodological challenges
IF 12.6 | Zone 1, Engineering & Technology | Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH | Pub Date: 2025-12-01 | Epub Date: 2025-09-08 | DOI: 10.1016/j.amar.2025.100405
MohammadAli Seyfi, Amir Mohammad Karimi Mamaghan, Ali Behnood, Fred Mannering
In this research, statistical and deep learning models are applied to determine factors that affect motorcycle crash-injury severities. Four methodological challenges are considered: 1) imbalanced data (because fatal injuries are an exceedingly small portion of all resulting injury outcomes); 2) unobserved heterogeneity (because many unobserved factors will influence resulting injury severities); 3) quantification of variable effects; and 4) the possibility of temporally shifting relationships among variables. Convolutional neural networks and deep neural networks are the deep learning models considered, and a random parameters logit model with heterogeneity in means and variances is the statistical model considered. Extensive experimentation indicated that data imbalance and unobserved heterogeneity could be best handled in deep learning models with a Bayesian deep neural network with a random generator and weighted loss function. Because statistical modeling indicated significant shifts in model parameters over time, the data were segmented by year and both statistical and deep learning models were estimated. While techniques are available for deep learning to potentially handle data imbalance and unobserved heterogeneity, the quantification of variable effects and temporal shifts remains a challenge. For example, a comparison of variable effects shows that the deep learning estimates are generally inconsistent with the plausible values generated by the statistical models in terms of magnitude and occasionally in terms of direction, indicating a need for improvements in deep-learning variable-effect extraction methods. The findings also show the need for future work to isolate the effect of complex temporal relationships that are currently embedded in deep learning approaches. The data segmentation used in statistical models to isolate temporal effects, and even the use of all data with newly defined time-dependent variables, may not be a viable deep learning option because of the potential loss in predictive performance.
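To make the "weighted loss function" idea concrete, here is a minimal Python sketch of a class-weighted cross-entropy in which rare outcomes (such as fatal injuries) receive larger weights. The inverse-frequency weighting rule and both function names are assumptions chosen for illustration; the abstract does not state which weighting scheme the authors used, and the Bayesian network and random generator are not reproduced here.

```python
import math


def inverse_frequency_weights(labels, n_classes):
    """Weight each class by n / (n_classes * count), so rare classes count more."""
    counts = [0] * n_classes
    for y in labels:
        counts[y] += 1
    n = len(labels)
    return [n / (n_classes * c) if c > 0 else 0.0 for c in counts]


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p(true class); probs[i] is a per-class probability list."""
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        total += -weights[y] * math.log(max(p[y], 1e-12))  # clip to avoid log(0)
        weight_sum += weights[y]
    return total / weight_sum


if __name__ == "__main__":
    # Hypothetical toy data: class 1 (the rare outcome) is 10% of the sample
    labels = [0] * 9 + [1]
    weights = inverse_frequency_weights(labels, n_classes=2)
    probs = [[0.9, 0.1]] * 9 + [[0.6, 0.4]]
    print(weights, weighted_cross_entropy(probs, labels, weights))
```

With these weights, the single rare observation contributes as much to the loss as the nine common ones combined, which discourages a model from ignoring the minority class.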
Analytic Methods in Accident Research, Volume 48, Article 100405.
Citations: 0
Joint analysis on pedestrian injury severity across vehicle movements at intersections: Addressing temporal instability and spatial correlations
IF 12.6 | Zone 1, Engineering & Technology | Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH | Pub Date: 2025-12-01 | Epub Date: 2025-10-07 | DOI: 10.1016/j.amar.2025.100406
Chenzhu Wang, Mohamed Abdel-Aty, Natalia Barbour
Intersection-related vehicle–pedestrian collisions present a significant challenge in transportation safety due to the complexity and hazards of intersections within urban road networks. This study introduces a spatially aggregated ordered logit model with a joint multivariate normal structure, which offers distinct advantages over conventional models by capturing correlations among vehicle movement types (left-turn, straight, and right-turn) and accounting for residual aggregation at both intersection and county levels. Using a dataset of 4280 pedestrian-vehicle crashes in Florida from 2019 to 2023, incorporating pedestrian, driver, vehicle, intersection, environmental, crash, and temporal characteristics, the proposed model demonstrates superior performance in capturing interdependencies among vehicle maneuvers. Four variables are consistently significant over time: pedestrians under 18 years old, urban areas, major-roadway speed limits below 30 mph, and lighted roadways at night. In contrast, several other variables are significant only in specific years, reflecting notable temporal variation in their impact on pedestrian injury severity. A series of statistical tests, including normality tests, spatial autocorrelation tests, and assessments of independence and homoscedasticity, were conducted to validate the model. The results confirm the model's ability to satisfy critical statistical assumptions (normality, independence, homoscedasticity, and spatial autocorrelation) and its robustness in achieving a high degree of spatial independence. The findings underscore the need for targeted safety measures and intersection design strategies to mitigate collision risks. By offering enhanced accuracy, temporal flexibility, and spatial insights, the proposed modeling approach provides a robust framework for developing evidence-based safety interventions and optimizing intersection designs to reduce pedestrian injury severity.
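For readers unfamiliar with the ordered logit building block, the following Python sketch shows how a single linear predictor and a set of increasing cutpoints translate into probabilities for ordered severity levels. It is a plain ordered logit only; the spatial aggregation and joint multivariate normal structure described in the abstract are not included, and the function names, the predictor value, and the cutpoints are hypothetical.

```python
import math


def logistic(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))


def ordered_logit_probs(xb, cutpoints):
    """Probabilities of each ordered category.

    xb: linear predictor (x'beta) for one crash.
    cutpoints: strictly increasing thresholds; J cutpoints give J + 1 categories.
    Uses P(y <= j) = logistic(cutpoint_j - xb).
    """
    if any(b <= a for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be strictly increasing")
    cdf = [logistic(c - xb) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cdf:
        probs.append(value - previous)
        previous = value
    return probs


if __name__ == "__main__":
    # Hypothetical values: three severity levels (e.g. minor, serious, fatal)
    print(ordered_logit_probs(xb=0.4, cutpoints=[-0.5, 1.5]))
```

A larger linear predictor shifts probability mass toward the higher severity categories, which is how a positive coefficient is interpreted in this family of models.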
Analytic Methods in Accident Research, Volume 48, Article 100406.
Citations: 0
A note on random parameters models of crash injury severities with k-means clustering for data preprocessing
IF 12.6 | Zone 1, Engineering & Technology | Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH | Pub Date: 2025-12-01 | Epub Date: 2025-10-12 | DOI: 10.1016/j.amar.2025.100408
Nawaf Alnawmasi, Fred Mannering
Many recent studies have shown that data segmentation (seeking to segment the data into potentially homogeneous groups by factors such as data-collection year, driver age, driver gender, driver behaviors, etc.) can significantly improve crash injury-severity model estimation results. However, the choice of the segmentation criterion is often speculative and based on a predetermined expectation of homogeneity by the analyst. In an effort to improve model estimation results, a potential alternative to analyst-specified data segmentation is to preprocess the data using multivariate machine learning techniques. This paper demonstrates the potential of data preprocessing using k-means clustering as a means to improve the estimation of statistical models. Empirical results show that combining k-means clustering with data segmentation by year (to account for temporal shifts in parameters) results in an improved statistical fit; the approach is a hybrid of analyst-specified and machine learning data segmentation. Furthermore, a comparison of the marginal effects generated by the clustered and non-clustered models suggests that the preprocessing of data by clustering techniques can result in more precise marginal effect estimates to guide safety policies. The findings show considerable potential for using machine learning algorithms, such as k-means clustering, to improve the estimation results of statistical models.
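The preprocessing step can be sketched with a small, dependency-free implementation of Lloyd's k-means algorithm in Python. This is an illustrative sketch rather than the authors' pipeline: the function names, the random initialization, and the toy feature vectors are assumptions, and a real application would first scale the crash attributes and then estimate a separate severity model within each cluster and year.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # random initial centers
    labels = None
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points
        ]
        # Update step: each center moves to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centers


if __name__ == "__main__":
    # Hypothetical two-feature records forming two obvious groups
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (100.0, 100.0), (100.0, 101.0), (101.0, 100.0)]
    cluster_labels, cluster_centers = kmeans(data, k=2)
    print(cluster_labels, cluster_centers)
```

The returned labels can then be used as the segmentation variable in place of, or in addition to, an analyst-chosen grouping such as data-collection year.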
Analytic Methods in Accident Research, Volume 48, Article 100408.
Citations: 0
Grouped random parameters Poisson-Lindley model with spatial effects addressing crashes at intersections: Insights from visual environment features and spatiotemporal instability
IF 12.5 | Zone 1, Engineering & Technology | Q1 PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH | Pub Date: 2025-09-01 | Epub Date: 2025-05-15 | DOI: 10.1016/j.amar.2025.100387
Chenzhu Wang, Mohamed Abdel-Aty, Lei Han
This study investigates the unobserved heterogeneity and spatiotemporal variations in the effects of visual environment features on intersection crash frequency. A Grouped Random Parameters Poisson-Lindley model with Spatial Effects is developed to account for spatial variations at both the macro (county) and micro (intersection) levels. The analysis utilizes crash data from 2,044 intersections across 12 Florida counties, collected between 2020 and 2022, along with explanatory variables including traffic flow, geometric design characteristics, and visual environment features (extracted from Google Street View images). Compared with existing methods (e.g., Fixed, Random Parameters, and Grouped Random Parameters Poisson-Lindley models), the proposed approach, which incorporates both macro- and micro-level spatial effects, demonstrates significantly improved model performance. Additionally, the temporal variations of explanatory variables over the three-year period are clearly identified through out-of-sample predictions and marginal effects analysis. Two visual environment features, Vegetation and Grass, are found to have grouped random parameters, highlighting the varying impact of these features on intersection crash frequency across the 12 counties. The findings also reveal a strengthening of micro-level spatial effects, indicating heightened spatial correlations between adjacent intersections following the COVID-19 pandemic. Key factors influencing crash frequency include traffic volume, four-legged intersections, major roads with more than four lanes, wider minor roads, and a higher proportion of vehicles in the drivers' field of vision. These results provide valuable insights into the influence of drivers' visual environment on intersection safety and offer policy recommendations for enhancing traffic safety.
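As background for the count distribution named in the title, the Python sketch below evaluates the one-parameter Poisson-Lindley probability mass function, P(Y = y) = θ²(y + θ + 2) / (θ + 1)^(y + 3), and its mean (θ + 2) / (θ(θ + 1)). Only this baseline distribution is shown; the grouped random parameters, covariates, and spatial effects of the proposed model are not reproduced, and the function names and the value of `theta` are assumptions used for illustration.

```python
def poisson_lindley_pmf(y, theta):
    """P(Y = y) for the one-parameter Poisson-Lindley distribution, theta > 0."""
    if theta <= 0 or y < 0:
        raise ValueError("theta must be positive and y non-negative")
    return theta ** 2 * (y + theta + 2.0) / (theta + 1.0) ** (y + 3)


def poisson_lindley_mean(theta):
    """Closed-form mean of the Poisson-Lindley distribution."""
    return (theta + 2.0) / (theta * (theta + 1.0))


if __name__ == "__main__":
    theta = 1.0  # hypothetical parameter value
    for crashes in range(5):
        print(crashes, poisson_lindley_pmf(crashes, theta))
    print("mean:", poisson_lindley_mean(theta))
```

Compared with a Poisson distribution that has the same mean, this distribution has a larger variance, which is why it is a candidate for over-dispersed crash counts.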
Analytic Methods in Accident Research, Volume 47, Article 100387.
Citations: 0