Imitation Improvement Learning for Large-Scale Capacitated Vehicle Routing Problems

Viet The Bui, Tien Mai
International Conference on Automated Planning and Scheduling, published 2023-07-01
DOI: 10.1609/icaps.v33i1.27236 (https://doi.org/10.1609/icaps.v33i1.27236)

Abstract

Recent work using deep reinforcement learning (RL) to solve routing problems such as the capacitated vehicle routing problem (CVRP) has focused on improvement learning-based methods, which iteratively improve a given solution until it becomes near-optimal. Although these methods achieve adequate solutions on small problem instances, their efficiency degrades on large-scale ones. In this work, we propose a new improvement learning-based framework built on imitation learning, in which classical heuristics serve as experts that the policy model learns to mimic, producing similar and better solutions. Moreover, to improve scalability, we propose Clockwise Clustering, a novel augmented framework that decomposes a large-scale CVRP into subproblems by sequentially clustering nodes in clockwise order and then learns to solve them simultaneously. Our approaches enhance state-of-the-art CVRP solvers while attaining competitive solution quality on several well-known datasets, including real-world instances with up to 30,000 nodes. Our best methods achieve new state-of-the-art solutions for several large instances and generalize to a wide range of CVRP variants and solvers. We also contribute new datasets and results to test the generalizability of our deep RL algorithms.
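
The abstract does not spell out how Clockwise Clustering forms its subproblems. As a rough illustration only, the sketch below assumes that customers are sorted by their polar angle around the depot in clockwise order and then grouped into consecutive sectors, closing a sector once a demand budget would be exceeded. The function name `clockwise_clusters`, the `max_cluster_demand` parameter, and the greedy closing rule are all hypothetical and may differ from the authors' procedure.

```python
import math


def clockwise_clusters(depot, customers, demands, max_cluster_demand):
    """Group customer indices into consecutive clockwise sectors around the depot.

    Assumption (not from the paper): a sector is closed as soon as adding
    the next customer would push its total demand over max_cluster_demand.
    """
    def clockwise_angle(point):
        # atan2 grows counterclockwise, so negate it to sweep clockwise.
        dx, dy = point[0] - depot[0], point[1] - depot[1]
        return (-math.atan2(dy, dx)) % (2 * math.pi)

    # Visit customers in clockwise order, starting from the positive x-axis.
    order = sorted(range(len(customers)),
                   key=lambda i: clockwise_angle(customers[i]))

    clusters, current, load = [], [], 0
    for i in order:
        if current and load + demands[i] > max_cluster_demand:
            clusters.append(current)
            current, load = [], 0
        current.append(i)
        load += demands[i]
    if current:
        clusters.append(current)
    return clusters


# Four unit-demand customers placed east, north, west and south of the depot.
example = clockwise_clusters(
    depot=(0.0, 0.0),
    customers=[(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)],
    demands=[1, 1, 1, 1],
    max_cluster_demand=2,
)
print(example)
```

Each resulting cluster could then be handed to a separate subproblem solver. How the subproblems are solved simultaneously and how their solutions are combined is not described in the abstract, so the sketch leaves that part out.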