A novel Soft Actor–Critic framework with disjunctive graph embedding and autoencoder mechanism for Job Shop Scheduling Problems

IF 12.2 | Zone 1, Engineering & Technology | Q1 ENGINEERING, INDUSTRIAL | Journal of Manufacturing Systems | Pub Date: 2024-09-02 | DOI: 10.1016/j.jmsy.2024.08.015

Wenquan Zhang, Fei Zhao, Chuntao Yang, Chao Du, Xiaobing Feng, Yukun Zhang, Zhaoxian Peng, Xuesong Mei

Journal of Manufacturing Systems, Volume 76, Pages 614-626

https://www.sciencedirect.com/science/article/pii/S027861252400178X

Citations: 0

Abstract


A novel Soft Actor–Critic framework with disjunctive graph embedding and autoencoder mechanism for Job Shop Scheduling Problems

The Job-Shop Scheduling Problem (JSSP) is a well-established and classic NP-hard combinatorial optimization issue. The quality of its scheduling scheme directly affects the operational efficiency of manufacturing systems. Priority Dispatching Rules (PDRs) are often utilized to address JSSP in real-world contexts, but the process of creating effective PDRs can be daunting and time-consuming. It also necessitates comprehensive domain knowledge, typically resulting in mediocre performance. In this paper, we introduce a novel reinforcement learning (RL) model called Disjunctive Graph Embedding with Autoencoder Mechanism for Job Shop Scheduling Problems (DGEAM-JSSP), designed to automate PDRs learning. Our proposed model confronts the issue using a Graph Neural Network (GNN) to learn node features that encapsulate the spatial structure of the JSSP graph representation. The ensuing policy network is size-agnostic, enabling effective generalization on larger-scale instances. Additionally, we employ a transformer encoder, incorporating parallel encoding and a self-attention mechanism, to successfully recognize long-term dependencies among operations in large-scale scheduling problems. We also implemented an end-to-end training approach using the Soft Actor–Critic (SAC) algorithm to instruct the two modules. Computational experiment results reveal that, with a single training, our agent successfully learns a superior dispatching policy, surpassing PDRs and state-of-the-art RL frameworks specifically tailored for each JSSP instance size in solution quality, as well as OR-Tools in execution speed. Moreover, results from random and benchmark instances illustrate that the uniquely-modeled learned policies have impressive generalization performance on real-world instances and significantly larger-scale scenarios involving up to 2000 operations.
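For readers unfamiliar with Priority Dispatching Rules, the following is a minimal sketch of one classic hand-written rule, Shortest Processing Time (SPT), applied to a toy two-job, two-machine instance. It only illustrates the kind of baseline the abstract says the learned policy is meant to replace; it is not the authors' DGEAM-JSSP method. The function name `dispatch_spt`, the `(machine, duration)` job encoding, and the toy instance are assumptions made for this illustration.

```python
from typing import List, Tuple

# Assumed encoding: a job is an ordered list of (machine, duration) operations.
Job = List[Tuple[int, int]]


def dispatch_spt(jobs: List[Job]) -> Tuple[int, List[Tuple[int, int, int, int]]]:
    """Schedule jobs by repeatedly dispatching the ready operation with the
    shortest processing time (SPT).

    Returns (makespan, schedule); each schedule entry is
    (job, op_index, start, end).
    """
    n_machines = 1 + max(m for job in jobs for m, _ in job)
    next_op = [0] * len(jobs)         # index of each job's next unscheduled op
    job_ready = [0] * len(jobs)       # time at which each job's last op ends
    machine_ready = [0] * n_machines  # time at which each machine is free
    schedule = []

    while any(next_op[j] < len(jobs[j]) for j in range(len(jobs))):
        # Candidates: the next operation of every unfinished job.
        candidates = [j for j in range(len(jobs)) if next_op[j] < len(jobs[j])]
        # SPT rule: shortest duration first, ties broken by job index.
        j = min(candidates, key=lambda k: (jobs[k][next_op[k]][1], k))
        machine, duration = jobs[j][next_op[j]]
        # An operation starts once its job predecessor and its machine are done.
        start = max(job_ready[j], machine_ready[machine])
        end = start + duration
        schedule.append((j, next_op[j], start, end))
        job_ready[j] = end
        machine_ready[machine] = end
        next_op[j] += 1

    makespan = max(end for *_, end in schedule)
    return makespan, schedule


# Toy instance (hypothetical): job 0 visits machine 0 then 1, job 1 the reverse.
toy_jobs = [[(0, 3), (1, 2)], [(1, 2), (0, 4)]]
makespan, schedule = dispatch_spt(toy_jobs)
print(makespan)  # 7
```

A fixed rule like this ignores the global structure of the instance, which is the limitation that motivates learning the dispatching decision from a graph representation instead.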


Source journal

Journal of Manufacturing Systems (Engineering & Technology: Engineering, Industrial)

CiteScore: 23.30
Self-citation rate: 13.20%
Articles published: 216
Review time: 25 days

About the journal: The Journal of Manufacturing Systems is dedicated to showcasing cutting-edge fundamental and applied research in manufacturing at the systems level. Encompassing products, equipment, people, information, control, and support functions, manufacturing systems play a pivotal role in the economical and competitive development, production, delivery, and total lifecycle of products, meeting market and societal needs. With a commitment to publishing archival scholarly literature, the journal strives to advance the state of the art in manufacturing systems and foster innovation in crafting efficient, robust, and sustainable manufacturing systems. The focus extends from equipment-level considerations to the broader scope of the extended enterprise. The Journal welcomes research addressing challenges across various scales, including nano, micro, and macro-scale manufacturing, and spanning diverse sectors such as aerospace, automotive, energy, and medical device manufacturing.