Q_EDQ: Efficient path planning in multimodal travel scenarios based on reinforcement learning

IF 5.1, Zone 2 (Engineering and Technology), JCR Q1 TRANSPORTATION, Travel Behaviour and Society, Pub Date: 2024-11-05, DOI: 10.1016/j.tbs.2024.100943

JianQiang Yan, Yinxiang Li, Yuan Gao, BoTing Qu, Jing Chen

Travel Behaviour and Society, Volume 39, Article 100943. Not open access. Source: https://www.sciencedirect.com/science/article/pii/S2214367X24002060 (indexed via Semantic Scholar)

Citations: 0

Abstract




Mobility as a Service (MaaS) has recently attracted increasing attention for integrating multiple transportation modes into a unified travel solution for users. Multimodal transportation planning, however, faces three main challenges: first, constructing a multimodal travel network that covers multiple travel modes and is highly scalable; second, designing a routing algorithm that fully accounts for the dynamic, real-time nature of multimodal travel; and third, formulating a generalized travel cost objective function that reflects the psychological burden that transfers place on passengers. In this study, we first construct an integrated multimodal transport network based on graph theory that covers four transport modes: metro, bus, car-sharing, and walking. We then propose a new algorithm, Q_EDQ, which introduces a double-Q learning mechanism and an optimized dynamic exploration strategy. The algorithm aims to learn the globally optimal path as efficiently as possible, with faster convergence and improved stability. Experiments using real bus and metro data from Xi’an, Shaanxi Province, compare Q_EDQ with traditional genetic algorithms. Across the four experiments, relative to the optimal paths planned by traditional genetic algorithms, the improved Q-algorithm achieved efficiency gains ranging from a minimum of 12.52% to a maximum of 35%. These results demonstrate the improved Q-algorithm's enhanced ability to learn globally optimal paths in complex multimodal transportation networks. Compared with the classical Q algorithm, the model in this study shows an average improvement of 10% to 30% in global optimal path search and in convergence performance, including loss and reward values.
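The abstract names three ingredients: a graph-based multimodal network, a generalized cost that penalizes transfers, and double-Q learning with a dynamic exploration rate. The sketch below shows how these pieces could fit together on a toy network. It is not the authors' Q_EDQ implementation: the node names, travel times, `TRANSFER_PENALTY` value, the linear epsilon decay, and all function names are assumptions made purely for illustration, and the update rule is the textbook double Q-learning rule rather than anything taken from the paper.

```python
import random
from collections import defaultdict

# Hypothetical toy network: (from_node, to_node, mode, travel_minutes).
# Nodes, modes and times are invented for illustration only.
EDGES = [
    ("A", "B", "walk", 6.0),
    ("A", "C", "bus", 10.0),
    ("B", "D", "metro", 8.0),
    ("B", "C", "walk", 5.0),
    ("C", "D", "bus", 12.0),
    ("C", "E", "car_share", 7.0),
    ("D", "E", "metro", 5.0),
]
TRANSFER_PENALTY = 3.0  # assumed cost (in minutes) added for each mode change

GRAPH = defaultdict(list)
for u, v, mode, minutes in EDGES:
    GRAPH[u].append((v, mode, minutes))


def step_cost(prev_mode, mode, minutes):
    """Generalized cost: travel time plus a penalty when the mode changes."""
    transfer = prev_mode is not None and prev_mode != mode
    return minutes + (TRANSFER_PENALTY if transfer else 0.0)


def greedy_action(qa, qb, state, n_actions):
    """Pick the action with the highest combined value of both tables."""
    return max(range(n_actions), key=lambda i: qa[state, i] + qb[state, i])


def train(source, target, episodes=3000, alpha=0.1, gamma=1.0,
          eps_start=1.0, eps_end=0.05, seed=0):
    """Tabular double Q-learning; a state is (node, mode used to arrive)."""
    rng = random.Random(seed)
    qa, qb = defaultdict(float), defaultdict(float)
    for ep in range(episodes):
        # Assumed exploration schedule: linear decay down to a floor.
        eps = max(eps_end, eps_start * (1 - ep / episodes))
        node, prev_mode = source, None
        while node != target:
            state = (node, prev_mode)
            n_actions = len(GRAPH[node])
            if rng.random() < eps:
                a = rng.randrange(n_actions)
            else:
                a = greedy_action(qa, qb, state, n_actions)
            nxt, mode, minutes = GRAPH[node][a]
            reward = -step_cost(prev_mode, mode, minutes)
            next_state = (nxt, mode)
            # Double-Q step: one table selects the next action,
            # the other table evaluates it.
            learn, evaluate = (qa, qb) if rng.random() < 0.5 else (qb, qa)
            if nxt == target:
                bootstrap = 0.0
            else:
                best = max(range(len(GRAPH[nxt])),
                           key=lambda i: learn[next_state, i])
                bootstrap = evaluate[next_state, best]
            learn[state, a] += alpha * (reward + gamma * bootstrap - learn[state, a])
            node, prev_mode = nxt, mode
    return qa, qb


def greedy_path(qa, qb, source, target):
    """Follow the learned greedy policy and accumulate generalized cost."""
    node, prev_mode, path, total = source, None, [source], 0.0
    while node != target:
        a = greedy_action(qa, qb, (node, prev_mode), len(GRAPH[node]))
        nxt, mode, minutes = GRAPH[node][a]
        total += step_cost(prev_mode, mode, minutes)
        path.append(nxt)
        node, prev_mode = nxt, mode
    return path, total


if __name__ == "__main__":
    qa, qb = train("A", "E")
    print(greedy_path(qa, qb, "A", "E"))
```

Including the arrival mode in the state is what lets the transfer penalty enter the learned values; the toy graph is acyclic so every episode terminates, whereas a real network would need a step limit or cycle handling.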


Source journal

Travel Behaviour and Society (TRANSPORTATION)

CiteScore: 9.80

Self-citation rate: 7.70%

Articles published: 109

Journal description: Travel Behaviour and Society is an interdisciplinary journal publishing high-quality original papers that report leading-edge research in theories, methodologies and applications concerning transportation issues and challenges that involve social and spatial dimensions. In particular, it provides a discussion forum for major research in travel behaviour, transportation infrastructure, transportation and environmental issues, mobility and social sustainability, transportation geographic information systems (TGIS), transportation and quality of life, transportation data collection and analysis, etc.