Chen Yuan, Ye Li, Helai Huang, Shiqi Wang, Zhenhao Sun, Yan Li
Analytic Methods in Accident Research, Volume 35, Article 100217. Published 2022-09-01. DOI: 10.1016/j.amar.2022.100217. https://www.sciencedirect.com/science/article/pii/S2213665722000069
Using traffic flow characteristics to predict real-time conflict risk: A novel method for trajectory data analysis
Real-time conflict prediction from traffic flow characteristics has received far less attention than crash-based prediction. This study explores the relationship between conflicts and traffic flow features while accounting for heterogeneity, and develops predictive models that identify conflict-prone conditions in real time. High-resolution trajectory data from the HighD dataset serve as the empirical data. The trajectory data analysis uses a novel method that combines a virtual detector approach for traffic feature extraction with a two-step framework. The framework consists of an exploratory study using a random parameter logit model with heterogeneity in means and variances, and a comparative study of several machine learning methods: eXtreme Gradient Boosting (boosting), Random Forest (bagging), Support Vector Machine (single classifier), and Multilayer Perceptron (deep neural network). Results indicate that (1) traffic flow characteristics significantly affect the probability of conflict occurrence; (2) the statistical model that considers heterogeneity in means outperforms its counterpart, and lane difference variables significantly affect the means of the random parameters for both lane variables and lane difference variables; (3) eXtreme Gradient Boosting trained on an under-sampled dataset is the best model, with the highest AUC of 0.871 and a precision of 0.867, showing that re-sampling techniques can significantly improve model performance. The proposed model is sensitive to the conflict threshold. Sensitivity analysis on feature selection further confirms that conflict risk prediction should consider both subject lane features and lane difference features, which is consistent with the exploratory analysis based on the statistical model. This consistency between the statistical models and the machine learning methods improves the interpretability of the machine learning results.
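The abstract reports that training on an under-sampled dataset improved AUC and precision, but it does not say which under-sampling scheme was used. The following is a minimal sketch, assuming plain random under-sampling to a 1:1 class ratio, of how such a dataset could be balanced and how the two reported metrics are computed. The function names (`undersample`, `auc`, `precision`) and the toy labels and scores are hypothetical illustrations, not the authors' code or data.

```python
import random


def undersample(rows, labels, seed=0):
    """Randomly drop majority-class rows until both classes have equal size.

    Assumption: a 1:1 ratio; the abstract does not state the ratio used.
    """
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def precision(labels, scores, threshold=0.5):
    """Share of intervals flagged as conflict-prone that contain a conflict."""
    flagged = [y for s, y in zip(scores, labels) if s >= threshold]
    return sum(flagged) / len(flagged) if flagged else 0.0


# Toy data (invented): 1 = conflict interval, 0 = non-conflict interval.
toy_rows = [[float(i)] for i in range(8)]    # placeholder feature vectors
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1]        # imbalanced, 6 vs 2
toy_scores = [0.1, 0.2, 0.3, 0.6, 0.2, 0.4, 0.7, 0.5]  # hypothetical model output

balanced_rows, balanced_labels = undersample(toy_rows, toy_labels)
toy_auc = auc(toy_labels, toy_scores)
toy_precision = precision(toy_labels, toy_scores)
```

In practice the balanced rows would be used only for training, while AUC and precision are evaluated on held-out data; the abstract does not say whether the authors' test data kept the original class distribution.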
About the journal:
Analytic Methods in Accident Research is a journal that publishes articles related to the development and application of advanced statistical and econometric methods in studying vehicle crashes and other accidents. The journal aims to demonstrate how these innovative approaches can provide new insights into the factors influencing the occurrence and severity of accidents, thereby offering guidance for implementing appropriate preventive measures. While the journal primarily focuses on the analytic approach, it also accepts articles covering various aspects of transportation safety (such as road, pedestrian, air, rail, and water safety), construction safety, and other areas where human behavior, machine failures, or system failures lead to property damage or bodily harm.