Reinforcement learning based maintenance scheduling of flexible multi-machine manufacturing systems with varying interactive degradation

Reliability Engineering & System Safety. IF 11; Zone 1 (Engineering Technology); JCR Q1 (ENGINEERING, INDUSTRIAL). Pub Date: 2025-08-01. Epub Date: 2025-03-22. DOI: 10.1016/j.ress.2025.111018
Jiangxi Chen, Xiaojun Zhou
Volume 260, Article 111018. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0951832025002194

Abstract

In flexible multi-machine manufacturing systems, variations in product types dynamically influence machine loads, subsequently affecting the degradation processes of the machines. Moreover, the interactive degradation between the upstream and downstream machines, caused by the product quality deviations, changes with the different production routes for the variable product types. These factors, combined with the uncertain production schedules, present significant challenges for effective maintenance scheduling. To address these challenges, the maintenance scheduling problem is modeled as a Hidden-Mode Markov Decision Process (HM-MDP), where product types are treated as hidden modes that influence machine degradation and the subsequent maintenance decisions. The Interactive Degradation-Aware Proximal Policy Optimization (IDAPPO) reinforcement learning framework is introduced, enhancing the PPO algorithm with Graph Neural Networks (GNNs) to capture interactive degradation among machines and Long Short-Term Memory (LSTM) networks to handle temporal variations in production schedules. An entropy-based exploration strategy further manages the uncertainty of production schedules, enabling IDAPPO to adaptively optimize maintenance actions. Extensive experiments on both small-scale (5-machine) and large-scale (24-machine) systems demonstrate significantly reduced system losses and accelerated convergence of IDAPPO compared to the baseline approaches. These results indicate that IDAPPO provides a scalable and adaptive solution for improving the efficiency and reliability of complex manufacturing environments.
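The abstract names PPO with an entropy-based exploration strategy but does not give the objective. As a rough illustration only, the sketch below computes the standard PPO clipped surrogate with a plain entropy bonus. It is not the authors' IDAPPO: the GNN and LSTM components are omitted, and the function name, argument names, and default coefficients (`clip_eps`, `entropy_coef`) are assumptions made for this example.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, action_probs,
                       clip_eps=0.2, entropy_coef=0.01):
    """Standard PPO clipped surrogate plus an entropy bonus (to be maximized).

    Hypothetical helper for illustration; not taken from the paper.
    new_logp, old_logp: log-probabilities of the taken actions under the
        new and old policies, one entry per sample.
    advantages: advantage estimate per sample.
    action_probs: full action distribution of the new policy per sample.
    """
    surrogate = 0.0
    entropy = 0.0
    for lp_new, lp_old, adv, probs in zip(new_logp, old_logp,
                                          advantages, action_probs):
        # Probability ratio between the new and old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic (minimum) of the unclipped and clipped terms.
        surrogate += min(ratio * adv, clipped * adv)
        # Shannon entropy of the action distribution; higher means more exploration.
        entropy += -sum(p * math.log(p) for p in probs if p > 0.0)
    n = len(advantages)
    return surrogate / n + entropy_coef * entropy / n
```

In a maintenance setting, each sample would correspond to one decision epoch, with the action distribution ranging over candidate maintenance actions. A larger `entropy_coef` keeps the policy more exploratory, which is one common way to cope with uncertain inputs such as production schedules.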
Source journal
Reliability Engineering & System Safety (Management Science - Engineering: Industrial)
CiteScore: 15.20
Self-citation rate: 39.50%
Articles published: 621
Review time: 67 days
About the journal: Elsevier publishes Reliability Engineering & System Safety in association with the European Safety and Reliability Association and the Safety Engineering and Risk Analysis Division. The international journal is devoted to developing and applying methods to enhance the safety and reliability of complex technological systems, like nuclear power plants, chemical plants, hazardous waste facilities, space systems, offshore and maritime systems, transportation systems, constructed infrastructure, and manufacturing plants. The journal normally publishes only articles that involve the analysis of substantive problems related to the reliability of complex systems or present techniques and/or theoretical results that have a discernible relationship to the solution of such problems. An important aim is to balance academic material and practical applications.