A genetic programming approach for real-time crash prediction to solve trade-off between interpretability and accuracy

Journal of Transportation Safety & Security (IF 2.6; Region 3, Engineering Technology; JCR Q3, Transportation). Pub Date: 2022-05-31. DOI: 10.1080/19439962.2022.2076756

Xiaochi Ma, Jian Lu, Xian Liu, Weibin Qu

Journal of Transportation Safety & Security, volume 65 1, pages 421-443. Publication type: Journal Article.

Citations: 5

Abstract

Real-time crash risk prediction is an active topic in emerging technology. Because a basic theory of risk formation is lacking, previous studies have focused on applying complex models to improve prediction accuracy while neglecting the interpretation of variables. Traditional statistical methods, by contrast, can interpret variables but predict poorly. The result is a trade-off between interpretability and accuracy. In this study, based on traffic flow information from an elevated expressway, an improved genetic programming (GP) approach with an elite gene bank is applied to obtain an explicit crash risk function of traffic flow and thereby resolve this trade-off. Logistic regression and a back-propagation neural network combined with partial dependence plots were used as baseline methods to assess the interpretability and accuracy of GP. The GP prediction model is found to select important variables and to resolve the trade-off, offering both good interpretability and good accuracy. The results show that crash risk in traffic flow comes mainly from the traffic volume, the speed of the upstream section, and the speed of the current section. Furthermore, the theory that the error of GP stems from unobserved heterogeneity and the crash mechanism is proposed.
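To make the idea of evolving an explicit risk function concrete, the following is a minimal sketch of genetic programming with an elite archive. It is not the authors' implementation: the abstract does not specify how the elite gene bank works, so it is assumed here to be a small archive of the best expressions found so far that is carried into every generation. The feature names, the logistic link, the operator set, and the synthetic data are likewise illustrative assumptions.

```python
import math
import random

# Hypothetical feature names; the abstract does not list the paper's variables.
FEATURES = ["volume", "upstream_speed", "current_speed"]
OPS = {"add": lambda a, b: a + b, "sub": lambda a, b: a - b, "mul": lambda a, b: a * b}
SYMBOLS = {"add": "+", "sub": "-", "mul": "*"}

def random_tree(rng, depth=3):
    """Grow a random expression: a feature name, a constant, or (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(FEATURES) if rng.random() < 0.7 else round(rng.uniform(-1, 1), 2)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, row):
    if isinstance(tree, tuple):
        return OPS[tree[0]](evaluate(tree[1], row), evaluate(tree[2], row))
    return row[tree] if isinstance(tree, str) else tree

def risk(tree, row):
    """Map the raw expression value into (0, 1) with a logistic link (an assumption)."""
    z = max(-30.0, min(30.0, evaluate(tree, row)))
    return 1.0 / (1.0 + math.exp(-z))

def loss(tree, data):
    """Mean squared error against 0/1 crash labels; lower is better."""
    return sum((risk(tree, row) - y) ** 2 for row, y in data) / len(data)

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node of the expression."""
    yield path, tree
    if isinstance(tree, tuple):
        yield from subtrees(tree[1], path + (1,))
        yield from subtrees(tree[2], path + (2,))

def replace(tree, path, new):
    """Return a copy of tree with the node at path swapped for new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, a, b):
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace(a, path, donor)

def mutate(rng, tree):
    path, _ = rng.choice(list(subtrees(tree)))
    return replace(tree, path, random_tree(rng, depth=2))

def show(tree):
    """Render the expression as a readable formula."""
    if isinstance(tree, tuple):
        return f"({show(tree[1])} {SYMBOLS[tree[0]]} {show(tree[2])})"
    return str(tree)

def evolve(data, generations=10, pop_size=40, bank_size=5, seed=0, max_nodes=31):
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    bank = []  # assumed "elite gene bank": best distinct expressions seen so far
    for _ in range(generations):
        pool = list(dict.fromkeys(bank + population))  # de-duplicate, keep order
        scored = sorted(pool, key=lambda t: loss(t, data))
        bank = scored[:bank_size]
        parents = scored[: max(2, pop_size // 2)]
        children = []
        while len(children) < pop_size - len(bank):
            child = crossover(rng, rng.choice(parents), rng.choice(parents))
            if rng.random() < 0.2:
                child = mutate(rng, child)
            if sum(1 for _ in subtrees(child)) <= max_nodes:  # limit bloat
                children.append(child)
        population = bank + children  # elites re-enter the next generation
    return bank[0]

def make_synthetic_data(n=100, seed=1):
    """Synthetic stand-in data, NOT the paper's expressway data."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        row = {name: rng.random() for name in FEATURES}
        noisy = row["volume"] - row["current_speed"] + rng.gauss(0, 0.1)
        data.append((row, 1 if noisy > 0 else 0))
    return data

if __name__ == "__main__":
    data = make_synthetic_data()
    best = evolve(data)
    print("risk = sigmoid(", show(best), ")  loss =", round(loss(best, data), 4))
```

Because the archive is always merged back into the candidate pool, the best loss cannot get worse from one generation to the next. The evolved expression can also be printed and read directly, which is the interpretability property the abstract emphasises.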




Source journal: Journal of Transportation Safety & Security (Transportation)

CiteScore: 6.00

Self-citation rate: 15.40%

Number of publications