arXiv - CS - Artificial Intelligence · arXiv:2409.00754 · published 2024-09-01
Authors: Jiaming Yin, Weixiong Rao, Yu Xiao, Keshuang Tang
Cooperative Path Planning with Asynchronous Multiagent Reinforcement Learning
In this paper, we study the shortest path problem (SPP) with multiple
source-destination pairs (MSD), namely MSD-SPP, to minimize the average travel
time of all shortest paths. The inherent traffic capacity limits within a road
network contribute to competition among vehicles. Multi-agent reinforcement
learning (MARL) models cannot offer effective and efficient path-planning
cooperation due to the asynchronous decision-making setting in MSD-SPP, where
vehicles (a.k.a. agents) cannot simultaneously complete routing actions within
the previous time step. To tackle the efficiency issue, we propose to divide
the entire road network into multiple sub-graphs and then execute a two-stage
process of inter-region and intra-region route planning. To address the
asynchrony issue, in the proposed asyn-MARL framework we first design a global
state that exploits a low-dimensional vector to implicitly represent the joint
observations and actions of the agents. We then develop a novel trajectory
collection mechanism to decrease the redundancy in training trajectories.
Additionally, we design a novel actor network to facilitate cooperation among
vehicles heading toward the same or nearby destinations, and a reachability
graph aimed at preventing infinite loops in routing paths. On both synthetic
and real road networks, our evaluation results demonstrate that our approach
outperforms state-of-the-art planning approaches.
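The two-stage scheme described in the abstract can be illustrated with a minimal sketch: partition the network into regions, plan a coarse route over a region-level graph built from boundary edges (inter-region stage), then stitch together shortest paths inside each region (intra-region stage). The partition, the boundary-edge selection, and the plain Dijkstra sub-routine below are illustrative assumptions for a toy network, not the paper's actual learned planner.

```python
import heapq

def dijkstra(adj, src, dst):
    # Standard Dijkstra over a weighted adjacency dict {u: [(v, w), ...]}.
    dist, prev, pq = {src: 0.0}, {}, [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(pq, (d + w, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

# Toy road network: nodes a1, a2 in region A; b1, b2 in region B.
adj = {
    "a1": [("a2", 1.0)],
    "a2": [("a1", 1.0), ("b1", 2.0)],  # boundary edge A -> B
    "b1": [("b2", 1.0)],
    "b2": [],
}
region = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}

def two_stage_route(adj, region, src, dst):
    # Stage 1 (inter-region): plan on a coarse region graph whose edges
    # are the boundary edges between regions.
    region_adj, boundary = {}, {}
    for u, nbrs in adj.items():
        for v, w in nbrs:
            ru, rv = region[u], region[v]
            if ru != rv:
                region_adj.setdefault(ru, []).append((rv, w))
                boundary.setdefault((ru, rv), (u, v))
    region_path = dijkstra(region_adj, region[src], region[dst])
    # Stage 2 (intra-region): shortest paths inside each region, stitched
    # together at the boundary edges chosen in stage 1.
    full, cur = [], src
    for ru, rv in zip(region_path, region_path[1:]):
        u, v = boundary[(ru, rv)]
        full += dijkstra(adj, cur, u)
        cur = v
    return full + dijkstra(adj, cur, dst)
```

Planning `a1 -> b2` first yields the region route `A -> B`, then the stitched node path `a1 -> a2 -> b1 -> b2`; the efficiency gain comes from searching small sub-graphs instead of the whole network.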
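The abstract's reachability graph can likewise be sketched under one plausible construction: keep only the edges that strictly decrease the shortest distance to the destination, so any route an agent follows is necessarily acyclic. This distance-based filtering is an assumption for illustration; the paper's own reachability graph may be built differently.

```python
import heapq

def dist_to(adj, dst):
    # Dijkstra on the reversed graph: shortest distance from every node to dst.
    radj = {}
    for u, nbrs in adj.items():
        for v, w in nbrs:
            radj.setdefault(v, []).append((u, w))
    dist, pq = {dst: 0.0}, [(0.0, dst)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in radj.get(u, []):
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist

def reachability_graph(adj, dst):
    # Keep only edges that strictly decrease the distance to the destination,
    # so every remaining path is loop-free and ends at dst.
    d = dist_to(adj, dst)
    return {
        u: [(v, w) for v, w in nbrs
            if u in d and v in d and d[v] < d[u]]
        for u, nbrs in adj.items()
    }

# A network with a cycle s <-> a <-> b; filtering removes every cycle edge.
net = {
    "s": [("a", 1.0), ("b", 1.0)],
    "a": [("s", 1.0), ("b", 1.0)],
    "b": [("a", 1.0), ("t", 1.0)],
    "t": [],
}
```

Restricting an agent's action set to the surviving edges guarantees that even an imperfect learned policy cannot route a vehicle in an infinite loop.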