A Combination Feature-Based Reinforcement Learning Approach via Mathematical Optimization

IF 6.4 | Zone 2, Computer Science | Q1, Automation & Control Systems | IEEE Transactions on Automation Science and Engineering | Pub Date: 2025-02-21 | DOI: 10.1109/TASE.2025.3544431
Fengyuan Shi;Ying Meng;Jiyin Liu;Lixin Tang
IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 12455-12469. https://ieeexplore.ieee.org/document/10898006/
Citations: 0

Abstract

Reinforcement learning is a promising method for solving decision problems, and its potential for large-scale combinatorial optimization problems has been increasingly recognized in recent years. However, existing studies on reinforcement learning for cutting stock problems mostly rely on sequence-to-sequence or graph neural network approaches that use learned experience to make decisions while neglecting the combination features of cutting stock problems. In this paper, we propose a novel reinforcement learning framework for cutting stock problems that integrates integer programming and monotone comparative statics to construct a Markov decision process with a high-quality action space. We start by constructing a new Markov decision process that considers the diagonal structure of the integer programming model for combinatorial optimization problems, and then use column generation to obtain each action by combining multiple decision variables. Furthermore, we design a bipartite graph and a related bipartite graph convolutional network to find the solutions. The results show that the proposed reinforcement learning framework provides a high-quality action space, and that the designed bipartite graph convolutional network can effectively select the best actions from the action set.

Note to Practitioners: This article was motivated by the cutting stock problems that arise in various industrial scenarios, such as the wood, steel, paper, and glass industries. We improve reinforcement learning for the cutting stock problem so that it can be adopted in industrial scenarios, which can increase the profit and reduce the production cost of industrial enterprises. Our improvement can also serve as a reference when solving other combinatorial optimization problems, supporting decision-making in industrial production.
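The abstract says that each action is obtained by column generation, combining several decision variables into one column. As a rough illustration only, the sketch below shows the classic Gilmore-Gomory pricing step for the one-dimensional cutting stock problem, where a column is a cutting pattern found by solving an unbounded integer knapsack over the current dual prices. This is not the authors' implementation: the function names `price_pattern` and `is_improving`, the integer widths, the dual values, and the assumption that every pattern costs one roll are all hypothetical choices made for this example.

```python
from typing import List, Tuple


def price_pattern(widths: List[int], duals: List[float],
                  roll_width: int) -> Tuple[float, List[int]]:
    """Pricing problem of textbook column generation (assumed setting).

    Maximizes sum(duals[i] * a[i]) subject to
    sum(widths[i] * a[i]) <= roll_width, with a[i] a non-negative integer.
    Returns (best dual value, pattern a).
    """
    n = len(widths)
    best = [0.0] * (roll_width + 1)   # best[c]: max dual value within capacity c
    choice = [-1] * (roll_width + 1)  # item placed last at capacity c, -1 = none

    for c in range(1, roll_width + 1):
        best[c] = best[c - 1]         # option: leave one unit of width unused
        for i in range(n):
            if widths[i] <= c:
                cand = best[c - widths[i]] + duals[i]
                if cand > best[c] + 1e-12:  # strict improvement only
                    best[c] = cand
                    choice[c] = i

    # Walk back through the table to recover the pattern.
    pattern = [0] * n
    c = roll_width
    while c > 0:
        i = choice[c]
        if i == -1:
            c -= 1
        else:
            pattern[i] += 1
            c -= widths[i]
    return best[roll_width], pattern


def is_improving(value: float, tol: float = 1e-9) -> bool:
    """A pattern has negative reduced cost (1 - value) when value > 1.

    Assumes the master problem minimizes the number of rolls used.
    """
    return value > 1.0 + tol


if __name__ == "__main__":
    # Hypothetical data: three item widths, made-up dual prices, roll width 10.
    value, pattern = price_pattern([3, 4, 5], [0.3, 0.45, 0.5], 10)
    print(pattern, is_improving(value))
```

In a framework like the one described, patterns produced this way could form the candidate action set from which a learned policy selects. How the paper actually builds and scores its actions is not specified in the abstract.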
Source journal
IEEE Transactions on Automation Science and Engineering (Engineering and Technology: Automation and Control Systems)
CiteScore: 12.50
Self-citation rate: 14.30%
Articles published: 404
Review time: 3.0 months
Journal description: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.