Antoine Plissonneau, Luca Jourdan, Damien Trentesaux, Lotfi Abdi, Mohamed Sallak, Abdelghani Bekrar, Benjamin Quost, Walter Schön
Journal of Rail Transport Planning & Management, Volume 31, Article 100453
DOI: 10.1016/j.jrtpm.2024.100453
Published: 10 June 2024
Deep reinforcement learning with predictive auxiliary task for autonomous train collision avoidance
The contribution of this paper is a deep reinforcement learning (DRL)-based method for autonomous train collision avoidance. While DRL applied to collision avoidance for autonomous vehicles has shown promising results compared to traditional methods, train-like vehicles have not yet been addressed. In addition, DRL applied to collision avoidance suffers from sparse rewards, which can lead to poor convergence and long training times. To overcome these limitations, this paper proposes a method for training a reinforcement learning (RL) agent for collision avoidance using local obstacle information mapped into occupancy grids. The method also integrates a network architecture containing a predictive auxiliary task of future-state prediction, which encourages the intermediate representation to be predictive of obstacle trajectories. A comparison study conducted on multiple simulated scenarios demonstrates that the trained policy outperforms other deep-learning-based policies as well as human driving in terms of both safety and efficiency. As a first step toward the certification of a DRL-based method, this paper proposes to approximate the policy learned by the RL agent with an interpretable decision tree. Although this approximation results in a loss of performance, it enables a safety analysis of the learned function and thus paves the way to using the strengths of RL in certifiable algorithms. As this work pioneers the use of RL for collision avoidance of rail-guided vehicles, and to facilitate future work by other engineers and researchers, an RL-ready simulator is provided with this paper.
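To illustrate the kind of local-obstacle encoding the abstract describes, the following is a minimal, hypothetical sketch of mapping obstacle positions into a binary occupancy grid. The grid size, cell resolution, and origin are assumptions, not the paper's actual parameters.

```python
import numpy as np

def to_occupancy_grid(obstacles, grid_size=(32, 32), cell=1.0, origin=(0.0, 0.0)):
    """Map local obstacle positions (x, y) into a binary occupancy grid.

    Hypothetical helper: each obstacle within range marks its cell as
    occupied (1.0); obstacles outside the grid are ignored.
    """
    grid = np.zeros(grid_size, dtype=np.float32)
    for x, y in obstacles:
        i = int((x - origin[0]) / cell)
        j = int((y - origin[1]) / cell)
        if 0 <= i < grid_size[0] and 0 <= j < grid_size[1]:
            grid[i, j] = 1.0
    return grid
```

Such a grid gives the RL agent a fixed-size observation regardless of how many obstacles are present, which is one common motivation for this representation.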
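The decision-tree approximation step can likewise be sketched as policy distillation: sample (state, action) pairs from a trained policy and fit an interpretable tree on them. The stand-in "policy" below (brake whenever any cell in the front rows of a toy occupancy grid is occupied), the grid size, and the tree depth are all illustrative assumptions, not the paper's actual setup.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def oracle_policy(grids):
    """Hypothetical stand-in policy: brake (action 1) if any cell in the
    two front rows is occupied, otherwise keep speed (action 0)."""
    front = grids[:, :2, :].reshape(len(grids), -1)
    return (front.max(axis=1) > 0.5).astype(int)

# Sample toy occupancy-grid states and label them with the policy's actions.
rng = np.random.default_rng(0)
states = (rng.random((500, 4, 4)) > 0.8).astype(np.float32)
actions = oracle_policy(states)

# Fit an interpretable tree on the sampled pairs, mirroring the
# approximation step described in the abstract.
tree = DecisionTreeClassifier(max_depth=8, random_state=0)
tree.fit(states.reshape(len(states), -1), actions)
```

Each root-to-leaf path of the fitted tree is a human-readable rule over grid cells, which is what makes a safety analysis of the approximated policy tractable.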