RL SolVeR Pro: Reinforcement Learning for Solving Vehicle Routing Problem

Arun Kumar Kalakanti, Shivani Verma, T. Paul, Takufumi Yoshida
{"title":"RL SolVeR Pro: Reinforcement Learning for Solving Vehicle Routing Problem","authors":"Arun Kumar Kalakanti, Shivani Verma, T. Paul, Takufumi Yoshida","doi":"10.1109/AiDAS47888.2019.8970890","DOIUrl":null,"url":null,"abstract":"Vehicle Routing Problem (VRP) is a well-known NP-hard combinatorial optimization problem at the heart of the transportation and logistics research. VRP can be exactly solved only for small instances of the problem with conventional methods. Traditionally this problem has been solved using heuristic methods for large instances even though there is no guarantee of optimality. Efficient solution adopted to VRP may lead to significant savings per year in large transportation and logistics systems. Much of the recent works using Reinforcement Learning are computationally intensive and face the three curse of dimensionality: explosions in state and action spaces and high stochasticity i.e., large number of possible next states for a given state action pair. Also, recent works on VRP don’t consider the realistic simulation settings of customer environments, stochastic elements and scalability aspects as they use only standard Solomon benchmark instances of at most 100 customers. In this work, Reinforcement Learning Solver for Vehicle Routing Problem (RL SolVeR Pro) is proposed wherein the optimal route learning problem is cast as a Markov Decision Process (MDP). The curse of dimensionality of RL is also overcome by using two-phase solver with geometric clustering. Also, realistic simulation for VRP was used to validate the effectiveness and applicability of the proposed RL SolVeR Pro under various conditions and constraints. Our simulation results suggest that our proposed method is able to obtain better or same level of results, compared to the two best-known heuristics: Clarke-Wright Savings and Sweep Heuristic. 
The proposed RL Solver can be applied to other variants of the VRP and has the potential to be applied more generally to other combinatorial optimization problems.","PeriodicalId":227508,"journal":{"name":"2019 1st International Conference on Artificial Intelligence and Data Sciences (AiDAS)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 1st International Conference on Artificial Intelligence and Data Sciences (AiDAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AiDAS47888.2019.8970890","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16

Abstract

The Vehicle Routing Problem (VRP) is a well-known NP-hard combinatorial optimization problem at the heart of transportation and logistics research. With conventional methods, VRP can be solved exactly only for small instances. Traditionally, large instances have been solved with heuristic methods, even though these offer no guarantee of optimality. An efficient VRP solution can yield significant annual savings in large transportation and logistics systems. Much of the recent work using Reinforcement Learning is computationally intensive and faces three curses of dimensionality: explosions in the state and action spaces, and high stochasticity, i.e., a large number of possible next states for a given state-action pair. Moreover, recent work on VRP does not consider realistic simulation settings of customer environments, stochastic elements, or scalability, as it uses only the standard Solomon benchmark instances of at most 100 customers. In this work, a Reinforcement Learning Solver for the Vehicle Routing Problem (RL SolVeR Pro) is proposed, in which the optimal route learning problem is cast as a Markov Decision Process (MDP). The curse of dimensionality is overcome by using a two-phase solver with geometric clustering. A realistic VRP simulation is used to validate the effectiveness and applicability of the proposed RL SolVeR Pro under various conditions and constraints. Our simulation results suggest that the proposed method obtains results better than or on par with the two best-known heuristics: Clarke-Wright Savings and the Sweep heuristic. The proposed RL solver can be applied to other variants of the VRP and has the potential to be applied more generally to other combinatorial optimization problems.
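The abstract names the Sweep heuristic as one of the two comparison baselines but does not give its implementation. As an illustration only, a minimal sketch of the classical Sweep heuristic (cluster customers by polar angle around the depot, then cut routes at vehicle capacity) might look like the following; the function name, inputs, and greedy route-cutting rule are assumptions, not details from the paper:

```python
import math

def sweep_heuristic(depot, customers, demands, capacity):
    """Sweep heuristic sketch: sort customers by polar angle around
    the depot, then fill routes greedily up to vehicle capacity."""
    # Sort customer indices by their polar angle relative to the depot.
    order = sorted(
        range(len(customers)),
        key=lambda i: math.atan2(customers[i][1] - depot[1],
                                 customers[i][0] - depot[0]),
    )
    routes, current, load = [], [], 0
    for i in order:
        if load + demands[i] > capacity:  # vehicle full: start a new route
            routes.append(current)
            current, load = [], 0
        current.append(i)
        load += demands[i]
    if current:
        routes.append(current)
    return routes
```

In practice, each route produced by the sweep would then be sequenced with a TSP heuristic; this sketch stops at the clustering step, which is conceptually the same cluster-first, route-second structure as the two-phase approach the abstract describes.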