A Matching Algorithm with Reinforcement Learning and Decoupling Strategy for Order Dispatching in On-Demand Food Delivery

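Tsinghua Science and Technology, vol. 29, no. 2, pp. 386-399. IF 5.2; Region 1 (Computer Science); JCR Q1 (Computer Science, Information Systems). Pub Date: 2023-09-22. DOI: 10.26599/TST.2023.9010069. URL: https://ieeexplore.ieee.org/document/10258151/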

Jingfang Chen;Ling Wang;Zixiao Pan;Yuting Wu;Jie Zheng;Xuetao Ding


Citation count: 0


A Matching Algorithm with Reinforcement Learning and Decoupling Strategy for Order Dispatching in On-Demand Food Delivery

On-demand food delivery (OFD) services have developed rapidly over the past decades, but they face challenges in further improving operation quality. Order dispatching is one of the most pressing problems for OFD platforms: a large number of orders must be dispatched dynamically and reasonably to riders within a very limited decision time. To solve this challenging combinatorial optimization problem, an effective matching algorithm is proposed that fuses reinforcement learning with an optimization method. First, to handle the large problem scale, a decoupling method is designed that reduces the matching space between new orders and riders. Second, to cope with the high dynamism and satisfy the stringent requirements on decision time, a reinforcement learning based dispatching heuristic is presented. Specifically, a sequence-to-sequence neural network is constructed based on the problem characteristics to generate an order priority sequence, and a training approach is specially designed to improve learning performance. A greedy heuristic then dispatches new orders according to the order priority sequence. Numerical experiments on real-world datasets validate the effectiveness of the proposed algorithm. Statistical results show that it solves the problem effectively, improving delivery efficiency while maintaining customer satisfaction.
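The last step described in the abstract, greedy dispatching along an order priority sequence, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract gives no formulas, so the `Rider` and `Order` classes, the Euclidean `distance` cost, the `load_penalty` term, and the `k`-nearest-rider filter (a rough stand-in for the decoupling step that shrinks the matching space) are all hypothetical. The priority sequence is taken as given input here, whereas in the paper it is generated by the sequence-to-sequence network.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Order:
    order_id: str
    x: float  # hypothetical pickup coordinates
    y: float


@dataclass
class Rider:
    rider_id: str
    x: float
    y: float
    capacity: int = 3  # hypothetical limit on orders carried at once
    orders: list = field(default_factory=list)


def distance(order, rider):
    """Euclidean distance, used as an assumed proxy for delivery cost."""
    return math.hypot(order.x - rider.x, order.y - rider.y)


def candidate_riders(order, riders, k):
    """Assumed stand-in for decoupling: keep the k nearest riders with spare capacity."""
    free = [r for r in riders if len(r.orders) < r.capacity]
    return sorted(free, key=lambda r: distance(order, r))[:k]


def greedy_dispatch(priority_sequence, riders, k=2, load_penalty=1.0):
    """Assign orders one by one, in priority order, to the cheapest candidate rider."""
    assignment = {}
    for order in priority_sequence:
        candidates = candidate_riders(order, riders, k)
        if not candidates:
            assignment[order.order_id] = None  # no rider available right now
            continue
        best = min(
            candidates,
            key=lambda r: distance(order, r) + load_penalty * len(r.orders),
        )
        best.orders.append(order.order_id)
        assignment[order.order_id] = best.rider_id
    return assignment
```

Because each order is committed before the next one is considered, the result depends on the priority sequence, which is why the paper devotes a learned model to producing it. Restricting each order to a few candidate riders keeps the per-order work small, in the spirit of the reduced matching space mentioned in the abstract.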


Source journal