Estimating the occurrence of broken rails in commuter railroads with machine learning algorithms

IF 2.1; Zone 4, Engineering & Technology; JCR Q3, ENGINEERING, CIVIL; Proceedings of the Institution of Mechanical Engineers Part F-Journal of Rail and Rapid Transit; Pub Date: 2024-09-04; DOI: 10.1177/09544097241280848

Di Kang, Junyan Dai, Xiang Liu, Zheyong Bian, Asim Zaman, Xin Wang



Abstract

Preventing broken rails is critical to track infrastructure safety. The increasing availability of rail data makes data-driven analysis a promising avenue for enhancing railroad safety. Previous research has predominantly concentrated on predicting broken rails on freight railroads, while commuter railroads have received limited attention. To address this gap, this paper presents an analytical modeling framework based on machine learning (ML) algorithms (including LightGBM, XGBoost, Random Forests, and Logistic Regression) to investigate the occurrence of broken rails on commuter rail segments. The framework uses features such as gradient, curvature, annual traffic, operational speed, and the history of prior rail defects. Because most commuter rail segments experienced no broken rails during the study period, broken rail instances form only a small sample and the data are imbalanced. We address this imbalance with oversampling techniques, including ADASYN, random oversampling, and SMOTE. The findings indicate that, for the dataset employed in this study, LightGBM combined with random oversampling performs best. Based on the feature importance results, the critical factors influencing the prediction of broken rail occurrences on this commuter railroad are gradient, operational speed, and prior rail defects.
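As a rough illustration of the random oversampling step named in the abstract, the sketch below duplicates minority-class rows (sampled with replacement) until both classes are the same size. It is not the authors' implementation: the function name `random_oversample`, the three numeric features, and the 95/5 class split are all assumptions, and the data are synthetic rather than taken from the paper.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Resample each smaller class with replacement until it matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Append (target - n) randomly chosen duplicates; zero for the largest class.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Synthetic toy data (NOT from the paper): 100 hypothetical rail segments,
# each with three made-up numeric features; 5 of them are labeled as broken-rail cases.
rng = random.Random(42)
X = [(rng.uniform(0, 2), rng.uniform(0, 6), rng.randint(0, 3)) for _ in range(100)]
y = [0] * 95 + [1] * 5

X_bal, y_bal = random_oversample(X, y)
print(Counter(y), "->", Counter(y_bal))
```

In practice, oversampling would normally be applied to the training split only, so that duplicated rows do not leak into the data used for evaluation. The balanced rows could then be passed to any of the classifiers listed in the abstract.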




Source journal

Proceedings of the Institution of Mechanical Engineers Part F-Journal of Rail and Rapid Transit (Engineering & Technology - Engineering: Mechanical)

CiteScore

4.80

Self-citation rate

10.00%

Articles published

Review time

7 months

Journal description: The Journal of Rail and Rapid Transit is devoted to engineering in its widest interpretation applicable to rail and rapid transit. The Journal aims to promote the sharing of technical knowledge, ideas and experience between engineers and researchers working in the railway field.