A deep reinforcement learning based approach for dynamic distributed blocking flowshop scheduling with job insertions

IET Collaborative Intelligent Manufacturing (IF 2.5; JCR Q2, Engineering, Industrial) | Pub Date: 2022-09-09 | DOI: 10.1049/cim2.12060

Xueyan Sun, Birgit Vogel-Heuser, Fandi Bi, Weiming Shen

IET Collaborative Intelligent Manufacturing, vol. 4, no. 3, pp. 166-180. Full text: https://onlinelibrary.wiley.com/doi/10.1049/cim2.12060 (PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cim2.12060)

Citations: 7

Abstract




This paper studies the distributed blocking flowshop scheduling problem (DBFSP) with new job insertions. Rescheduling all remaining jobs after a dynamic event such as a new job insertion is impractical in an actual distributed blocking flowshop production process. A deep reinforcement learning (DRL) algorithm is therefore proposed to optimise the job selection model, and when new jobs arrive only local modifications are made to the original scheduling plan. The objective is to minimise the total completion time deviation of all products, so that all jobs finish on time and storage costs are reduced. First, based on the definition of the dynamic DBFSP, a DRL framework built on the multi-agent deep deterministic policy gradient (MADDPG) is proposed. In this framework, a full schedule is generated by the variable neighbourhood descent algorithm before a dynamic event occurs. In addition, all newly added jobs are reordered before the agents decide which one most urgently needs to be scheduled. The study defines the observations, actions and reward calculation methods, and applies centralised training with distributed execution in MADDPG. Finally, a comprehensive computational experiment compares the proposed method with closely related, well-performing methods. The results indicate that the proposed method solves the dynamic DBFSP effectively and efficiently.
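To make the problem setting in the abstract concrete, the following is a minimal Python sketch of how a schedule in a distributed blocking flowshop might be evaluated and locally modified when a job arrives. It is not the paper's MADDPG method. It uses the standard departure-time recursion for blocking flowshops (no buffers between machines, so a finished job stays on its machine until the next one is free) and a plain greedy best-position insertion in place of the learned policy. Several points are assumptions made for illustration: the "completion time deviation" is taken to be the absolute difference between a job's completion time and its due date, all factories are identical, no job has started processing yet, and the names `blocking_completion_times`, `total_deviation` and `best_insertion` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Schedule = List[List[int]]  # one job sequence per factory (assumed representation)


def blocking_completion_times(
    seq: Sequence[int], proc: Sequence[Sequence[float]]
) -> Dict[int, float]:
    """Completion time of each job in one factory under the blocking constraint.

    proc[j][i] is the processing time of job j on machine i (0-based).
    """
    m = len(proc[0])
    prev = [0] * (m + 1)  # departure times of the previous job, indices 0..m
    done: Dict[int, float] = {}
    for j in seq:
        cur = [0] * (m + 1)
        cur[0] = prev[1]  # job starts on machine 1 once the previous job has left it
        for i in range(1, m + 1):
            finish = cur[i - 1] + proc[j][i - 1]
            # No buffer: the job leaves machine i only after the previous
            # job has left machine i+1. The last machine is never blocked.
            cur[i] = max(finish, prev[i + 1]) if i < m else finish
        done[j] = cur[m]
        prev = cur
    return done


def total_deviation(
    schedule: Schedule, proc: Sequence[Sequence[float]], due: Sequence[float]
) -> float:
    """Sum of |completion time - due date| over all jobs (assumed objective)."""
    total = 0
    for seq in schedule:
        for j, c in blocking_completion_times(seq, proc).items():
            total += abs(c - due[j])
    return total


def best_insertion(
    schedule: Schedule, job: int, proc: Sequence[Sequence[float]], due: Sequence[float]
) -> Tuple[Schedule, float]:
    """Try the new job at every position in every factory; keep the cheapest.

    The relative order of the already scheduled jobs is left unchanged,
    which mirrors the idea of local modification rather than full rescheduling.
    """
    best, best_cost = None, float("inf")
    for f, seq in enumerate(schedule):
        for pos in range(len(seq) + 1):
            cand = [list(s) for s in schedule]
            cand[f].insert(pos, job)
            cost = total_deviation(cand, proc, due)
            if cost < best_cost:
                best, best_cost = cand, cost
    return best, best_cost
```

For instance, with two machines and processing times `[[2, 3], [1, 2], [4, 1]]`, the sequence 0, 1, 2 in a single factory gives completion times 5, 7 and 10: job 1 finishes on the first machine at time 3 but is blocked there until job 0 leaves the second machine at time 5. Exhaustive insertion like this grows with the number of positions and ignores jobs already in progress, which is one reason a learned selection policy can be attractive.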


Source journal

IET Collaborative Intelligent Manufacturing (Engineering: Industrial and Manufacturing Engineering)

CiteScore

9.10

Self-citation rate

2.40%

Articles published

Review time

20 weeks

About the journal: IET Collaborative Intelligent Manufacturing is a Gold Open Access journal that focuses on the development of efficient and adaptive production and distribution systems. It aims to meet the ever-changing market demands by publishing original research on methodologies and techniques for the application of intelligence, data science, and emerging information and communication technologies in various aspects of manufacturing, such as design, modeling, simulation, planning, and optimization of products, processes, production, and assembly. The journal is indexed in COMPENDEX (Elsevier), Directory of Open Access Journals (DOAJ), Emerging Sources Citation Index (Clarivate Analytics), INSPEC (IET), SCOPUS (Elsevier) and Web of Science (Clarivate Analytics).