Title: A Combination Feature-Based Reinforcement Learning Approach via Mathematical Optimization
Authors: Fengyuan Shi; Ying Meng; Jiyin Liu; Lixin Tang
DOI: 10.1109/TASE.2025.3544431
Journal: IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 12455-12469
Publication Date: 2025-02-21 (Journal Article)
Impact Factor: 6.4; JCR: Q1 (AUTOMATION & CONTROL SYSTEMS)
URL: https://ieeexplore.ieee.org/document/10898006/
Citations: 0
Abstract
Reinforcement learning is a promising method for solving decision problems, and its potential has been increasingly recognized for large-scale combinatorial optimization problems in recent years. However, existing studies on reinforcement learning for cutting stock problems mostly rely on sequence-to-sequence or graph neural network approaches that use learned experience to make decisions while neglecting the combination features of cutting stock problems. In this paper, we propose a novel reinforcement learning framework for cutting stock problems that integrates integer programming and monotone comparative statics to construct a Markov decision process with a high-quality action space. We start by constructing a new Markov decision process that considers the diagonal structure of the integer programming model for combinatorial optimization problems, and then use column generation to obtain each action by combining multiple decision variables. Furthermore, we design a bipartite graph and a related bipartite graph convolutional network to find the solutions. The results show that the proposed reinforcement learning framework provides a high-quality action space, and the designed bipartite graph convolutional network can effectively select the best actions from the action set.

Note to Practitioners—This article was motivated by the cutting stock problems that arise in various industrial scenarios, such as the wood, steel, paper, and glass industries. We improve reinforcement learning for the cutting stock problem so that it can be adopted in industrial scenarios, which can increase profit and reduce production costs for industrial enterprises. Our improvements can also serve as a reference for solving other combinatorial optimization problems, supporting decision making in industrial production.
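The abstract's key mechanism is using column generation to build the action space: each action is a combination of decision variables (a cutting pattern) priced against the master problem's duals. The paper's exact formulation is not given here, so the following is a minimal sketch of the classical Gilmore-Gomory pricing subproblem for one-dimensional cutting stock, where a new pattern is found by an unbounded knapsack over the dual prices; all names and the numeric data are illustrative assumptions.

```python
def price_pattern(widths, duals, roll_width):
    """Solve the unbounded-knapsack pricing subproblem by dynamic programming.

    widths[i] is the width of item i, duals[i] its dual price from the
    master LP, and roll_width the stock roll capacity. Returns
    (best_value, pattern), where pattern[i] counts pieces of item i in
    the new cutting pattern. The pattern has negative reduced cost
    (i.e., improves the master LP) iff best_value > 1.
    """
    n = len(widths)
    # best[c] = max total dual value achievable within capacity c;
    # choice[c] = index of the item added to reach that value.
    best = [0.0] * (roll_width + 1)
    choice = [-1] * (roll_width + 1)
    for c in range(1, roll_width + 1):
        for i in range(n):
            if widths[i] <= c and best[c - widths[i]] + duals[i] > best[c]:
                best[c] = best[c - widths[i]] + duals[i]
                choice[c] = i
    # Recover the pattern by walking back through the recorded choices.
    pattern, c = [0] * n, roll_width
    while c > 0 and choice[c] != -1:
        pattern[choice[c]] += 1
        c -= widths[choice[c]]
    return best[roll_width], pattern


# Illustrative data: a roll of width 10 and items of widths 3, 5, 7
# with hypothetical dual prices from the restricted master problem.
value, pattern = price_pattern([3, 5, 7], [0.4, 0.55, 0.8], 10)
```

In the framework described above, each such priced column would serve as one candidate action in the Markov decision process, with the learned policy (the bipartite graph convolutional network) selecting among the generated columns rather than among raw variables.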
Journal Introduction:
The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.