A deep reinforcement learning approach for online and concurrent 3D bin packing optimisation with bin replacement strategies
Authors: Y.P. Tsang, D.Y. Mo, K.T. Chung, C.K.M. Lee
Journal: Computers in Industry, Volume 164, Article 104202
Publication date: 2024-11-04
DOI: 10.1016/j.compind.2024.104202
URL: https://www.sciencedirect.com/science/article/pii/S0166361524001301
Impact factor: 8.2; JCR: Q1 (Computer Science, Interdisciplinary Applications)
Citations: 0
Abstract
In robotic palletisation, optimal space utilisation remains vital but is also a critical challenge, particularly because of decision complexity and the need for real-time decision-making without complete prior information. The widely adopted rule-based heuristic approaches are easy to use but fail to adapt dynamically to the complex and changing landscape of online 3D bin packing. This study is motivated by the need for a more agile and intelligent system, capable of managing the intricacies of dual-bin scenarios and a variable inflow of items. It introduces a novel deep reinforcement learning (DRL) optimiser that employs a double deep Q-network (DDQN) to obtain optimal packing policies in an online environment, together with two proposed bin replacement strategies. The approach overcomes limitations of previous methods by managing multiple bins simultaneously and by adjusting decisions on the fly based on limited prior knowledge. In a case study involving a logistics company, the proposed optimiser significantly improved average space utilisation across various lookahead scenarios and outperformed traditional heuristics in simulation experiments. The proposed optimiser contributes significantly to the economic and environmental sustainability of robotic warehouses, positioning it as a cornerstone for the future of smart logistics.
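For readers unfamiliar with the DDQN named in the abstract, the following is a minimal sketch of the standard double Q-learning target for a single transition. It is not the authors' implementation: the function name `ddqn_target`, its parameters, and the example Q-values are hypothetical, and the abstract does not describe the paper's state, action, or reward design.

```python
def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Standard double DQN target for one transition (illustrative only).

    next_q_online and next_q_target are lists of Q-values for the next
    state, one entry per action, produced by the online and target
    networks respectively. All names here are hypothetical.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    # The online network selects the greedy next action...
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    # ...and the target network evaluates that selected action.
    return reward + gamma * next_q_target[best_action]


# Made-up Q-values for three candidate actions in the next state.
y = ddqn_target(1.0, [0.2, 0.9, 0.5], [0.4, 0.3, 0.8], gamma=0.5)
```

With these made-up numbers the online network picks the second action, so the target bootstraps from the target network's value for that action (0.3) rather than from the target network's own maximum (0.8). Decoupling selection from evaluation in this way is what distinguishes double DQN from vanilla DQN and is intended to reduce overestimation of Q-values.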
About the journal:
The objective of Computers in Industry is to present original, high-quality, application-oriented research papers that:
• Illuminate emerging trends and possibilities in the utilization of Information and Communication Technology in industry;
• Establish connections or integrations across various technology domains within the expansive realm of computer applications for industry;
• Foster connections or integrations across diverse application areas of ICT in industry.