Valerio Agasucci, Giorgio Grani, Leonardo Lamorgese
{"title":"利用深度强化学习解决列车调度问题","authors":"Valerio Agasucci , Giorgio Grani , Leonardo Lamorgese","doi":"10.1016/j.jrtpm.2023.100394","DOIUrl":null,"url":null,"abstract":"<div><p>Every day, railways experience disturbances and disruptions, both on the network and the fleet side, that affect the stability of rail traffic. Induced delays propagate through the network, which leads to a mismatch in demand and offer for goods and passengers, and, in turn, to a loss in service quality. In these cases, it is the duty of human traffic controllers, the so-called dispatchers, to do their best to minimize the impact on traffic. However, dispatchers inevitably have a limited depth of perception of the knock-on effect of their decisions, particularly how they affect areas of the network that are outside their direct control. In recent years, much work in Decision Science has been devoted to developing methods to solve the problem automatically and support the dispatchers in this challenging task. This paper investigates Machine Learning-based methods for tackling this problem, proposing two different Deep Q-Learning methods(Decentralized and Centralized). Numerical results show the superiority of these techniques respect to the classical linear Q-Learning based on matrices. Moreover the Centralized approach is compared with a MILP formulation showing interesting results. The experiments are inspired on data provided by a U.S. class 1 railroad.</p></div>","PeriodicalId":51821,"journal":{"name":"Journal of Rail Transport Planning & Management","volume":"26 ","pages":"Article 100394"},"PeriodicalIF":2.6000,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Solving the train dispatching problem via deep reinforcement learning\",\"authors\":\"Valerio Agasucci , Giorgio Grani , Leonardo Lamorgese\",\"doi\":\"10.1016/j.jrtpm.2023.100394\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Every day, railways experience disturbances and disruptions, both on the network and the fleet side, that affect the stability of rail traffic. Induced delays propagate through the network, which leads to a mismatch in demand and offer for goods and passengers, and, in turn, to a loss in service quality. In these cases, it is the duty of human traffic controllers, the so-called dispatchers, to do their best to minimize the impact on traffic. However, dispatchers inevitably have a limited depth of perception of the knock-on effect of their decisions, particularly how they affect areas of the network that are outside their direct control. In recent years, much work in Decision Science has been devoted to developing methods to solve the problem automatically and support the dispatchers in this challenging task. This paper investigates Machine Learning-based methods for tackling this problem, proposing two different Deep Q-Learning methods(Decentralized and Centralized). Numerical results show the superiority of these techniques respect to the classical linear Q-Learning based on matrices. Moreover the Centralized approach is compared with a MILP formulation showing interesting results. The experiments are inspired on data provided by a U.S. 
class 1 railroad.</p></div>\",\"PeriodicalId\":51821,\"journal\":{\"name\":\"Journal of Rail Transport Planning & Management\",\"volume\":\"26 \",\"pages\":\"Article 100394\"},\"PeriodicalIF\":2.6000,\"publicationDate\":\"2023-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Rail Transport Planning & Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2210970623000264\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"TRANSPORTATION\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Rail Transport Planning & Management","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2210970623000264","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"TRANSPORTATION","Score":null,"Total":0}
Solving the train dispatching problem via deep reinforcement learning
Every day, railways experience disturbances and disruptions, both on the network and the fleet side, that affect the stability of rail traffic. Induced delays propagate through the network, which leads to a mismatch between demand and supply for goods and passengers and, in turn, to a loss in service quality. In these cases, it is the duty of human traffic controllers, the so-called dispatchers, to do their best to minimize the impact on traffic. However, dispatchers inevitably have a limited depth of perception of the knock-on effects of their decisions, particularly how they affect areas of the network outside their direct control. In recent years, much work in Decision Science has been devoted to developing methods that solve the problem automatically and support dispatchers in this challenging task. This paper investigates Machine Learning-based methods for tackling this problem, proposing two different Deep Q-Learning methods (Decentralized and Centralized). Numerical results show the superiority of these techniques with respect to classical linear Q-Learning based on matrices. Moreover, the Centralized approach is compared with a MILP formulation, showing interesting results. The experiments are based on data provided by a U.S. Class 1 railroad.
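To make the Deep Q-Learning idea concrete, the following is a minimal sketch of a DQN agent for a dispatching-style decision, assuming a toy state vector of per-train delays and block-occupancy flags and a two-action space (hold or proceed). The state encoding, network size, reward, and hyperparameters are placeholder assumptions for illustration only; this is not the paper's Decentralized or Centralized implementation.

```python
# Minimal Deep Q-Learning sketch (illustrative, not the authors' code).
# Assumptions: an 8-dimensional state (delays, occupancy flags) and
# two actions (0 = hold train, 1 = let train proceed).
import random
from collections import deque

import torch
import torch.nn as nn
import torch.optim as optim

STATE_DIM = 8    # assumed feature vector length
N_ACTIONS = 2    # assumed action space: hold / proceed
GAMMA = 0.99     # discount factor

class QNetwork(nn.Module):
    """Small feed-forward approximator of Q(s, a)."""
    def __init__(self) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, N_ACTIONS),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

q_net = QNetwork()
optimizer = optim.Adam(q_net.parameters(), lr=1e-3)
# Replay buffer of (state, action, reward, next_state, done) tensors.
replay = deque(maxlen=10_000)

def act(state: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy action selection over the learned Q-values."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(state).argmax().item())

def train_step(batch_size: int = 32) -> None:
    """One gradient step on the standard DQN temporal-difference target."""
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)
    s, a, r, s2, done = map(torch.stack, zip(*batch))
    q_sa = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * q_net(s2).max(dim=1).values * (1 - done)
    loss = nn.functional.mse_loss(q_sa, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

A training loop would interleave act() with environment transitions, push the resulting (state, action, reward, next_state, done) tensors into replay, and call train_step() after each transition; a target network and a decaying epsilon, standard additions for DQN, would typically be layered on top.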