Wenquan Zhang, Fei Zhao, Chuntao Yang, Chao Du, Xiaobing Feng, Yukun Zhang, Zhaoxian Peng, Xuesong Mei
Title: A novel Soft Actor–Critic framework with disjunctive graph embedding and autoencoder mechanism for Job Shop Scheduling Problems
Journal: Journal of Manufacturing Systems, vol. 76, pp. 614-626 (impact factor 12.2; JCR Q1, Engineering, Industrial)
Publication date: 2024-09-02
DOI: 10.1016/j.jmsy.2024.08.015
URL: https://www.sciencedirect.com/science/article/pii/S027861252400178X
Citations: 0
Abstract
The Job-Shop Scheduling Problem (JSSP) is a classic NP-hard combinatorial optimization problem, and the quality of its schedules directly affects the operational efficiency of manufacturing systems. Priority Dispatching Rules (PDRs) are often used to solve JSSP in real-world settings, but designing effective PDRs is difficult and time-consuming, requires extensive domain knowledge, and typically yields only mediocre performance. In this paper, we introduce a novel reinforcement learning (RL) model, Disjunctive Graph Embedding with Autoencoder Mechanism for Job Shop Scheduling Problems (DGEAM-JSSP), designed to automate the learning of PDRs. The model uses a Graph Neural Network (GNN) to learn node features that capture the spatial structure of the JSSP graph representation. The resulting policy network is size-agnostic, which enables effective generalization to larger-scale instances. Additionally, we employ a transformer encoder, with parallel encoding and a self-attention mechanism, to capture long-term dependencies among operations in large-scale scheduling problems. The two modules are trained end to end with the Soft Actor–Critic (SAC) algorithm. Computational experiments show that, after a single training run, our agent learns a dispatching policy that surpasses, in solution quality, both PDRs and state-of-the-art RL frameworks specifically tailored to each JSSP instance size, and surpasses OR-Tools in execution speed. Moreover, results on random and benchmark instances show that the learned policies generalize well to real-world instances and to significantly larger-scale scenarios involving up to 2000 operations.
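To make the notion of a priority dispatching rule concrete, the following is a minimal sketch of a hand-written PDR, the kind of baseline the abstract says the learned policy is compared against. It uses the standard Shortest Processing Time (SPT) rule purely for illustration; it is not the paper's method. The function name `dispatch_spt`, the instance `toy_jobs`, and the encoding of a job as a list of (machine, duration) pairs are all assumptions made for this example.

```python
from typing import List, Tuple

# Assumed encoding: a job is an ordered list of (machine index, duration) operations.
Job = List[Tuple[int, int]]


def dispatch_spt(jobs: List[Job]) -> int:
    """Build a schedule greedily with the SPT rule and return its makespan."""
    n_machines = 1 + max(m for job in jobs for m, _ in job)
    next_op = [0] * len(jobs)         # index of the next unscheduled operation per job
    job_ready = [0] * len(jobs)       # time at which each job's last scheduled op ends
    machine_ready = [0] * n_machines  # time at which each machine becomes free
    remaining = sum(len(job) for job in jobs)

    while remaining:
        # Eligible operations: the next operation of every unfinished job.
        candidates = [j for j in range(len(jobs)) if next_op[j] < len(jobs[j])]
        # SPT: choose the eligible operation with the shortest duration
        # (ties broken by job index).
        j = min(candidates, key=lambda k: (jobs[k][next_op[k]][1], k))
        machine, duration = jobs[j][next_op[j]]
        # The operation starts once both its job and its machine are available.
        start = max(job_ready[j], machine_ready[machine])
        job_ready[j] = machine_ready[machine] = start + duration
        next_op[j] += 1
        remaining -= 1

    return max(job_ready)


# Hypothetical 3-job, 3-machine instance, not taken from the paper.
toy_jobs: List[Job] = [
    [(0, 3), (1, 2), (2, 2)],
    [(0, 2), (2, 1), (1, 4)],
    [(1, 4), (2, 3)],
]
```

Calling `dispatch_spt(toy_jobs)` returns the makespan of the greedy schedule. A learned dispatcher of the kind described above would, in this framing, replace the fixed `min(...)` selection with a policy that scores the eligible operations from the current state.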
About the journal:
The Journal of Manufacturing Systems is dedicated to showcasing cutting-edge fundamental and applied research in manufacturing at the systems level. Encompassing products, equipment, people, information, control, and support functions, manufacturing systems play a pivotal role in the economical and competitive development, production, delivery, and total lifecycle of products, meeting market and societal needs.
With a commitment to publishing archival scholarly literature, the journal strives to advance the state of the art in manufacturing systems and foster innovation in crafting efficient, robust, and sustainable manufacturing systems. The focus extends from equipment-level considerations to the broader scope of the extended enterprise. The Journal welcomes research addressing challenges across various scales, including nano, micro, and macro-scale manufacturing, and spanning diverse sectors such as aerospace, automotive, energy, and medical device manufacturing.