Adaptive rescheduling of rail transit services with short-turnings under disruptions via a multi-agent deep reinforcement learning approach

IF 5.8 | Zone 1, Engineering & Technology | Q1 Economics | Transportation Research Part B-Methodological | Pub Date: 2024-09-04 | DOI: 10.1016/j.trb.2024.103067

Chengshuo Ying , Andy H.F. Chow , Yimo Yan , Yong-Hong Kuo , Shouyang Wang

Transportation Research Part B-Methodological, Volume 188, Article 103067. https://www.sciencedirect.com/science/article/pii/S0191261524001917

Citations: 0

Abstract

This paper presents a novel multi-agent deep reinforcement learning (MADRL) approach for real-time rescheduling of rail transit services with short-turnings during a complete track blockage on a double-track service corridor. The optimization problem is modeled as a Markov decision process in which multiple control agents, one for each directional line, reschedule train services to recover the system. To keep the computation efficient, we employ a multi-agent policy optimization framework in which each control agent uses a decentralized policy function to derive local decisions, together with a centralized value function approximation (VFA) that estimates global system state values. Both the policy functions and the VFAs are represented by multi-layer artificial neural networks (ANNs). A multi-agent proximal policy optimization gradient algorithm is developed to train the policies and VFAs through iterative simulated system transitions. The proposed framework is implemented and tested on real-world scenarios using data collected from the London Underground, UK. Computational results show that the developed framework outperforms previous distributed control algorithms and conventional metaheuristic methods in computational effectiveness. We also provide managerial implications for train rescheduling during disruptions with different durations, locations, and passenger behaviors. Additional experiments show that a generalized model allows the proposed MADRL framework to scale to disruptions of uncertain duration. This study contributes to real-time rail transit management with innovative control and optimization techniques.
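The abstract's combination of decentralized policies, a centralized value function, and a proximal policy optimization update can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the network architectures, state features, action sets, or reward, so the linear policy and value functions, the three-action set, and every function name below are assumptions chosen only to show the shape of the computation.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights: Sequence[Sequence[float]],
                 local_obs: Sequence[float]) -> List[float]:
    """Decentralized policy (assumed linear): one agent maps its own
    local observation to action probabilities."""
    logits = [sum(w * o for w, o in zip(row, local_obs)) for row in weights]
    return softmax(logits)


def central_value(v_weights: Sequence[float],
                  global_state: Sequence[float]) -> float:
    """Centralized value function approximation (assumed linear):
    scores the global state shared by all agents."""
    return sum(w * s for w, s in zip(v_weights, global_state))


def td_advantage(reward: float, value_now: float, value_next: float,
                 gamma: float = 0.99) -> float:
    """One-step advantage estimate from the centralized critic."""
    return reward + gamma * value_next - value_now


def clipped_surrogate(ratio: float, advantage: float,
                      eps: float = 0.2) -> float:
    """Standard PPO clipped surrogate for a single sample."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


def multi_agent_objective(new_probs: Sequence[float],
                          old_probs: Sequence[float],
                          advantage: float, eps: float = 0.2) -> float:
    """Sum of per-agent clipped surrogates; every agent uses the same
    advantage because the critic is centralized."""
    return sum(clipped_surrogate(n / o, advantage, eps)
               for n, o in zip(new_probs, old_probs))


if __name__ == "__main__":
    # Hypothetical setup: two agents (one per directional line), each
    # choosing among three illustrative actions from a 2-feature observation.
    obs = [[1.0, 0.5], [0.2, 0.8]]
    old_w = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
    new_w = [[0.3, 0.0], [0.0, 0.1], [0.0, 0.0]]
    action = 0  # index of the action each agent took in this sample

    global_now = obs[0] + obs[1]
    global_next = [0.9, 0.4, 0.3, 0.7]
    v_w = [0.5, -0.2, 0.1, 0.3]
    adv = td_advantage(-1.0, central_value(v_w, global_now),
                       central_value(v_w, global_next))

    old_p = [policy_probs(old_w, o)[action] for o in obs]
    new_p = [policy_probs(new_w, o)[action] for o in obs]
    print("advantage:", adv)
    print("objective:", multi_agent_objective(new_p, old_p, adv))
```

In a full training loop the objective would be averaged over many simulated transitions and maximized by gradient ascent on the policy weights, while the value weights are fitted to observed returns. The clipping keeps each agent's updated policy close to the one that generated the data.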

Source journal

Transportation Research Part B-Methodological (Engineering & Technology; Engineering: Civil)

CiteScore: 12.40
Self-citation rate: 8.80%
Articles published: 143
Review time: 14.1 weeks

About the journal: Transportation Research: Part B publishes papers on all methodological aspects of the subject, particularly those that require mathematical analysis. The general theme of the journal is the development and solution of problems that are adequately motivated to deal with important aspects of the design and/or analysis of transportation systems. Areas covered include: traffic flow; design and analysis of transportation networks; control and scheduling; optimization; queuing theory; logistics; supply chains; development and application of statistical, econometric and mathematical models to address transportation problems; cost models; pricing and/or investment; traveler or shipper behavior; cost-benefit methodologies.