Cross-Region Courier Displacement for On-Demand Delivery With Multi-Agent Reinforcement Learning
Authors: Shuai Wang; Shijie Hu; Baoshen Guo; Guang Wang
Journal: IEEE Transactions on Big Data, vol. 9, no. 5, pp. 1321-1333
Publication date: 2023-03-28
DOI: 10.1109/TBDATA.2023.3262408
URL: https://ieeexplore.ieee.org/document/10083277/
On-demand delivery has become a prevalent way for people to order meals and groceries online, especially during the pandemic. It is essential to dispatch massive numbers of orders to a limited pool of couriers to satisfy on-demand delivery users, especially during peak hours. Existing studies mainly focus on order dispatching within a region and are difficult to apply to the cross-region courier displacement problem because of (1) unique practical factors, including regional spatial-temporal demand-supply dynamics and strict delivery time constraints, and (2) the large-scale setting and high-dimensional decision space created by the massive number of couriers in on-demand delivery. To address these challenges, we propose an efficient cross-region courier displacement framework, Courier Displacement Reinforcement Learning (CDRL), based on centralized multi-agent actor-critic. CDRL first designs the actor-critic network with a time-varying displacement intensity control module to capture demand-supply dynamics, and then uses a centralized-training, decentralized-execution multi-agent framework to address large-scale coordination. One month of real-world order records collected from one of the biggest on-demand delivery services in the world is used to evaluate our design. Extensive results show that our method achieves a 47.97% improvement in balancing supply and demand while reducing idle ride time by 24.62%.
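The abstract does not give CDRL's network details, so the following is only a minimal sketch of the information split it describes: each courier's actor acts on local inputs, while a critic scores the joint regional state. Every name here (`intensity`, `actor_policy`, `centralized_critic`), the peak-hour set, the 0.8/0.2 intensity values, and the gap-based score are hypothetical choices made for illustration. Nothing is learned in this sketch, and it is not the authors' implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def intensity(hour, peak_hours=(11, 12, 17, 18), high=0.8, low=0.2):
    """Hypothetical time-varying displacement intensity in [0, 1].

    Assumption: couriers are allowed to move more freely during peak hours.
    """
    return high if hour in peak_hours else low


def actor_policy(local_logits, stay_index, hour):
    """Decentralized actor: uses only this courier's local logits.

    The intensity k blends the raw policy with a pure "stay" action,
    so a small k keeps most probability mass on staying put.
    """
    k = intensity(hour)
    probs = [k * p for p in softmax(local_logits)]
    probs[stay_index] += 1.0 - k
    return probs


def centralized_critic(supply, demand):
    """Centralized critic stand-in: sees every region at once.

    Scores the joint state as the negative total supply-demand gap
    (an illustrative assumption, not a learned value function).
    """
    return -sum(abs(s - d) for s, d in zip(supply, demand))


if __name__ == "__main__":
    # One courier choosing among 3 regions; index 0 is "stay".
    print(actor_policy([0.0, 0.0, 0.0], stay_index=0, hour=12))
    print(actor_policy([0.0, 0.0, 0.0], stay_index=0, hour=3))
    # Joint state before and after moving one courier from region 0 to 1.
    print(centralized_critic([3, 1], [2, 2]))
    print(centralized_critic([2, 2], [2, 2]))
```

In a full centralized-training, decentralized-execution setup, the critic's score would drive gradient updates to the actors during training only, and at execution time each courier would run just its own `actor_policy`.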
About the journal:
The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.