Alimul Rajee, Md. Shahriare Satu, Mohammad Zoynul Abedin, K.M. Akkas Ali, Saad Aloteibi, Mohammad Ali Moni
Knowledge-Based Systems, Volume 311, Article 113089. Published 2025-02-28 (Epub 2025-02-01). Impact factor 7.6; JCR Q1 (Computer Science, Artificial Intelligence). DOI: 10.1016/j.knosys.2025.113089. URL: https://www.sciencedirect.com/science/article/pii/S0950705125001364
WFFS—An ensemble feature selection algorithm for heterogeneous traffic accident data analysis
Traffic accidents are unexpected incidents in which one or more vehicles collide, damaging property and killing or injuring many people. They impose significant social burdens, including loss of life, serious injuries, and economic losses from medical costs, property damage, and lost productivity. Such incidents cause great hardship for the people affected. Many factors contribute to traffic accidents, including infrastructure, weather, vehicle, and driver-related issues. This work investigates these contributing factors in order to improve road safety. In this study, an ensemble machine learning model, Weighted Fusion-Based Feature Selection (WFFS), is proposed to identify the significant features and so help reduce the effects of traffic accidents. A large number of traffic accident records from the United Kingdom (UK) were gathered and split into several folds, which were cleaned and balanced using techniques such as percentage-based removal, the Synthetic Minority Oversampling Technique (SMOTE), and random oversampling. WFFS was then applied to each fold to identify the most significant features for predicting traffic accident severity more accurately. Different classifiers, including tree-based, bagging, boosting, and voting classifiers, were trained on the WFFS-generated feature subsets and performed better than on the primary data and other feature subsets. The random tree-based bagging method achieved the highest accuracy, 97.28%, in predicting accident severity on the WFFS subset, which contains 18 features. Overall, the classifiers achieved better accuracy with WFFS in 6 of 11 cases. The method is recommended for policymakers and transportation engineers who need to identify potentially hazardous locations and take appropriate measures to diminish the effects of traffic accidents.
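The abstract does not spell out how WFFS combines its base selectors, so the following is only a minimal sketch of one generic way a weighted fusion of feature scores could work, not the authors' algorithm. The selector names (`info_gain`, `chi2`, `relief`), their weights, the feature names, and every score are hypothetical values chosen for illustration. Each selector's scores are min-max normalized, multiplied by that selector's weight, and summed per feature; the top-ranked features form the subset.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale one selector's scores to [0, 1] so selectors are comparable."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # A selector that scores every feature equally carries no ranking signal.
        return {name: 0.0 for name in scores}
    return {name: (value - lo) / (hi - lo) for name, value in scores.items()}


def weighted_fusion(selector_scores: Dict[str, Dict[str, float]],
                    selector_weights: Dict[str, float],
                    top_k: int) -> List[str]:
    """Fuse per-selector feature scores into one ranking and keep the top_k."""
    total_weight = sum(selector_weights.values())
    fused: Dict[str, float] = {}
    for selector, scores in selector_scores.items():
        weight = selector_weights[selector] / total_weight
        for feature, value in min_max_normalize(scores).items():
            fused[feature] = fused.get(feature, 0.0) + weight * value
    # Sort by fused score (descending); break ties by feature name.
    ranked = sorted(fused, key=lambda f: (-fused[f], f))
    return ranked[:top_k]


# Hypothetical scores from three assumed base selectors over four made-up features.
scores = {
    "info_gain": {"speed_limit": 0.40, "weather": 0.10, "light": 0.30, "day": 0.00},
    "chi2": {"speed_limit": 50.0, "weather": 30.0, "light": 10.0, "day": 10.0},
    "relief": {"speed_limit": 0.2, "weather": 0.8, "light": 0.2, "day": 0.2},
}
weights = {"info_gain": 0.5, "chi2": 0.3, "relief": 0.2}  # assumed weights

selected = weighted_fusion(scores, weights, top_k=2)
print(selected)
```

In a fold-based setup like the one described, a function of this kind would be called once per fold, and the resulting subsets would then be passed to the classifiers for comparison against the full feature set.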
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.