滚动地平线法优化Can-Order策略的强化学习

J. Noh
{"title":"滚动地平线法优化Can-Order策略的强化学习","authors":"J. Noh","doi":"10.3390/systems11070350","DOIUrl":null,"url":null,"abstract":"This study presents a novel approach to a mixed-integer linear programming (MILP) model for periodic inventory management that combines reinforcement learning algorithms. The rolling horizon method (RHM) is a multi-period optimization approach that is applied to handle new information in updated markets. The RHM faces a limitation in easily determining a prediction horizon; to overcome this, a dynamic RHM is developed in which RL algorithms optimize the prediction horizon of the RHM. The state vector consisted of the order-up-to-level, real demand, total cost, holding cost, and backorder cost, whereas the action included the prediction horizon and forecasting demand for the next time step. The performance of the proposed model was validated through two experiments conducted in cases with stable and uncertain demand patterns. The results showed the effectiveness of the proposed approach in inventory management, particularly when the proximal policy optimization (PPO) algorithm was used for training compared with other reinforcement learning algorithms. This study signifies important advancements in both the theoretical and practical aspects of multi-item inventory management.","PeriodicalId":52858,"journal":{"name":"syst mt`lyh","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reinforcement Learning for Optimizing Can-Order Policy with the Rolling Horizon Method\",\"authors\":\"J. Noh\",\"doi\":\"10.3390/systems11070350\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This study presents a novel approach to a mixed-integer linear programming (MILP) model for periodic inventory management that combines reinforcement learning algorithms. The rolling horizon method (RHM) is a multi-period optimization approach that is applied to handle new information in updated markets. The RHM faces a limitation in easily determining a prediction horizon; to overcome this, a dynamic RHM is developed in which RL algorithms optimize the prediction horizon of the RHM. The state vector consisted of the order-up-to-level, real demand, total cost, holding cost, and backorder cost, whereas the action included the prediction horizon and forecasting demand for the next time step. The performance of the proposed model was validated through two experiments conducted in cases with stable and uncertain demand patterns. The results showed the effectiveness of the proposed approach in inventory management, particularly when the proximal policy optimization (PPO) algorithm was used for training compared with other reinforcement learning algorithms. This study signifies important advancements in both the theoretical and practical aspects of multi-item inventory management.\",\"PeriodicalId\":52858,\"journal\":{\"name\":\"syst mt`lyh\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-07-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"syst mt`lyh\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/systems11070350\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"syst mt`lyh","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/systems11070350","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

本研究提出了一种结合强化学习算法的混合整数线性规划(MILP)周期性库存管理模型的新方法。滚动水平法是一种多周期优化方法,用于处理更新市场中的新信息。RHM在容易确定预测范围方面存在局限性;为了克服这个问题,开发了一种动态RHM,其中RL算法优化了RHM的预测范围。状态向量包括到级订单、实际需求、总成本、持有成本和缺货成本,而动作包括预测范围和预测下一个时间步的需求。通过稳定需求模式和不确定需求模式两种情况下的实验,验证了该模型的性能。结果表明,与其他强化学习算法相比,本文提出的方法在库存管理中是有效的,特别是当使用近端策略优化(PPO)算法进行训练时。本研究在多项目库存管理的理论和实践方面都取得了重要进展。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Reinforcement Learning for Optimizing Can-Order Policy with the Rolling Horizon Method
This study presents a novel approach to a mixed-integer linear programming (MILP) model for periodic inventory management that combines reinforcement learning algorithms. The rolling horizon method (RHM) is a multi-period optimization approach that is applied to handle new information in updated markets. The RHM faces a limitation in easily determining a prediction horizon; to overcome this, a dynamic RHM is developed in which RL algorithms optimize the prediction horizon of the RHM. The state vector consisted of the order-up-to-level, real demand, total cost, holding cost, and backorder cost, whereas the action included the prediction horizon and forecasting demand for the next time step. The performance of the proposed model was validated through two experiments conducted in cases with stable and uncertain demand patterns. The results showed the effectiveness of the proposed approach in inventory management, particularly when the proximal policy optimization (PPO) algorithm was used for training compared with other reinforcement learning algorithms. This study signifies important advancements in both the theoretical and practical aspects of multi-item inventory management.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
审稿时长
9 weeks
期刊最新文献
Optimal Government Subsidy Decision and Its Impact on Sustainable Development of a Closed-Loop Supply Chain An Emotional Design Model for Future Smart Product Based on Grounded Theory Evolution Mechanism of Public-Private Partnership Project Trust from the Perspective of the Supply Chain Derivation of Optimal Operation Factors of Anaerobic Digesters through Artificial Neural Network Technology An Industrial Case Study on the Monitoring and Maintenance Service System for a Robot-Driven Polishing Service System under Industry 4.0 Contexts
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1