Railroad accident analysis by machine learning and natural language processing
Raj Bridgelall, Denver D. Tolliver
Journal of Rail Transport Planning & Management, volume 29, Article 100429
Published: 2023-12-09
DOI: 10.1016/j.jrtpm.2023.100429
URL: https://www.sciencedirect.com/science/article/pii/S2210970623000616
Citation count: 0
Abstract
The evolving complexities of railroad systems also increase their vulnerability to failure from human error. This study compared the outcomes of two workflows that incorporated 11 different machine learning techniques to identify characteristics of railroad operations that are generally associated with human-caused accidents. The first workflow engineered features from the fixed attribute fields of a large railroad accident database, and the second applied natural language processing to extract features from the unstructured accident narratives. Both workflows applied a Shapley game-theoretic model to rank the importance of features based on their marginal contribution towards predicting accident cause. Among several interesting findings, some of the most unexpected were that human-caused accidents are generally not associated with high train speeds or derailment-type accidents, and that shoving cars is riskier than pulling cars. These and other findings from this study can inform management decisions, planning, and policies to minimize the risk of human-caused accidents.
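The Shapley ranking mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names `train_speed` and `shoving_move` are hypothetical, and the scores in `TOY_SCORES` are invented for illustration, not taken from the study. Each feature's Shapley value is its marginal contribution to a score, averaged over every subset of the remaining features.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley value of each feature for a subset-scoring function."""
    n = len(features)
    result = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # size of the subset that f joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to subset s
                total += weight * (value(s | {f}) - value(s))
        result[f] = total
    return result


# Invented scores (assumption): stand-ins for the accuracy of a classifier
# trained on each feature subset. These are NOT results from the study.
TOY_SCORES = {
    frozenset(): 0.50,
    frozenset({"train_speed"}): 0.52,
    frozenset({"shoving_move"}): 0.70,
    frozenset({"train_speed", "shoving_move"}): 0.74,
}


def toy_value(subset):
    return TOY_SCORES[frozenset(subset)]


phi = shapley_values(["train_speed", "shoving_move"], toy_value)
ranking = sorted(phi, key=phi.get, reverse=True)
print(ranking, phi)
```

The values sum to the gap between the full-feature score and the empty-set score, which is the property that makes the ranking a fair attribution. Exact enumeration grows exponentially with the number of features, so practical tools approximate it.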