Marlon Löppenberg , Steve Yuwono , Mochammad Rizky Diprasetya, Andreas Schwung
{"title":"Dynamic robot routing optimization: State–space decomposition for operations research-informed reinforcement learning","authors":"Marlon Löppenberg , Steve Yuwono , Mochammad Rizky Diprasetya, Andreas Schwung","doi":"10.1016/j.rcim.2024.102812","DOIUrl":null,"url":null,"abstract":"<div><p>There is a growing interest in implementing artificial intelligence for operations research in the industrial environment. While numerous classic operations research solvers ensure optimal solutions, they often struggle with real-time dynamic objectives and environments, such as dynamic routing problems, which require periodic algorithmic recalibration. To deal with dynamic environments, deep reinforcement learning has shown great potential with its capability as a self-learning and optimizing mechanism. However, the real-world applications of reinforcement learning are relatively limited due to lengthy training time and inefficiency in high-dimensional state spaces. In this study, we introduce two methods to enhance reinforcement learning for dynamic routing optimization. The first method involves transferring knowledge from classic operations research solvers to reinforcement learning during training, which accelerates exploration and reduces lengthy training time. The second method uses a state–space decomposer to transform the high-dimensional state space into a low-dimensional latent space, which allows the reinforcement learning agent to learn efficiently in the latent space. Lastly, we demonstrate the applicability of our approach in an industrial application of an automated welding process, where our approach identifies the shortest welding pathway of an industrial robotic arm to weld a set of dynamically changing target nodes, poses and sizes. The suggested method cuts computation time by 25% to 50% compared to classic routing algorithms.</p></div>","PeriodicalId":21452,"journal":{"name":"Robotics and Computer-integrated Manufacturing","volume":"90 ","pages":"Article 102812"},"PeriodicalIF":9.1000,"publicationDate":"2024-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0736584524000991/pdfft?md5=ab4f57a94d511a4d5a4e9b1c724edd3f&pid=1-s2.0-S0736584524000991-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Robotics and Computer-integrated Manufacturing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0736584524000991","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
引用次数: 0
Abstract
There is a growing interest in implementing artificial intelligence for operations research in the industrial environment. While numerous classic operations research solvers ensure optimal solutions, they often struggle with real-time dynamic objectives and environments, such as dynamic routing problems, which require periodic algorithmic recalibration. To deal with dynamic environments, deep reinforcement learning has shown great potential with its capability as a self-learning and optimizing mechanism. However, the real-world applications of reinforcement learning are relatively limited due to lengthy training time and inefficiency in high-dimensional state spaces. In this study, we introduce two methods to enhance reinforcement learning for dynamic routing optimization. The first method involves transferring knowledge from classic operations research solvers to reinforcement learning during training, which accelerates exploration and reduces lengthy training time. The second method uses a state–space decomposer to transform the high-dimensional state space into a low-dimensional latent space, which allows the reinforcement learning agent to learn efficiently in the latent space. Lastly, we demonstrate the applicability of our approach in an industrial application of an automated welding process, where our approach identifies the shortest welding pathway of an industrial robotic arm to weld a set of dynamically changing target nodes, poses and sizes. The suggested method cuts computation time by 25% to 50% compared to classic routing algorithms.
期刊介绍:
The journal, Robotics and Computer-Integrated Manufacturing, focuses on sharing research applications that contribute to the development of new or enhanced robotics, manufacturing technologies, and innovative manufacturing strategies that are relevant to industry. Papers that combine theory and experimental validation are preferred, while review papers on current robotics and manufacturing issues are also considered. However, papers on traditional machining processes, modeling and simulation, supply chain management, and resource optimization are generally not within the scope of the journal, as there are more appropriate journals for these topics. Similarly, papers that are overly theoretical or mathematical will be directed to other suitable journals. The journal welcomes original papers in areas such as industrial robotics, human-robot collaboration in manufacturing, cloud-based manufacturing, cyber-physical production systems, big data analytics in manufacturing, smart mechatronics, machine learning, adaptive and sustainable manufacturing, and other fields involving unique manufacturing technologies.