Daniel Beahr, Debangsu Bhattacharyya, Douglas A. Allan, Stephen E. Zitney
{"title":"利用强化学习开发增强和替代传统过程控制的算法","authors":"Daniel Beahr , Debangsu Bhattacharyya , Douglas A. Allan , Stephen E. Zitney","doi":"10.1016/j.compchemeng.2024.108826","DOIUrl":null,"url":null,"abstract":"<div><p>This work seeks to allow for the online operation and training of model-free reinforcement learning (RL) agents but limit the risk to system equipment and personnel. The parallel implementation of RL alongside more conventional process control (CPC) allows for the RL algorithm to learn from CPC. The past performance of both methods are assessed on a continuous basis allowing for a transition from CPC to RL and, if needed, transitioning back to CPC from RL. This allows for the RL algorithm to slowly and safely assume control of the process without significant degradation in control performance. It is shown that the RL can derive a near optimal policy even when coupled with a suboptimal CPC. It is also demonstrated that the coupled RL-CPC algorithm learns at a faster rate than traditional RL methods of exploration while the algorithm’s performance does not deteriorate below CPC, even when exposed to an unknown operating condition.</p></div>","PeriodicalId":286,"journal":{"name":"Computers & Chemical Engineering","volume":"190 ","pages":"Article 108826"},"PeriodicalIF":3.9000,"publicationDate":"2024-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Development of algorithms for augmenting and replacing conventional process control using reinforcement learning\",\"authors\":\"Daniel Beahr , Debangsu Bhattacharyya , Douglas A. Allan , Stephen E. Zitney\",\"doi\":\"10.1016/j.compchemeng.2024.108826\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>This work seeks to allow for the online operation and training of model-free reinforcement learning (RL) agents but limit the risk to system equipment and personnel. The parallel implementation of RL alongside more conventional process control (CPC) allows for the RL algorithm to learn from CPC. The past performance of both methods are assessed on a continuous basis allowing for a transition from CPC to RL and, if needed, transitioning back to CPC from RL. This allows for the RL algorithm to slowly and safely assume control of the process without significant degradation in control performance. It is shown that the RL can derive a near optimal policy even when coupled with a suboptimal CPC. 
It is also demonstrated that the coupled RL-CPC algorithm learns at a faster rate than traditional RL methods of exploration while the algorithm’s performance does not deteriorate below CPC, even when exposed to an unknown operating condition.</p></div>\",\"PeriodicalId\":286,\"journal\":{\"name\":\"Computers & Chemical Engineering\",\"volume\":\"190 \",\"pages\":\"Article 108826\"},\"PeriodicalIF\":3.9000,\"publicationDate\":\"2024-08-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers & Chemical Engineering\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0098135424002448\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Chemical Engineering","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0098135424002448","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Development of algorithms for augmenting and replacing conventional process control using reinforcement learning
This work seeks to allow the online operation and training of model-free reinforcement learning (RL) agents while limiting the risk to system equipment and personnel. Implementing RL in parallel with conventional process control (CPC) allows the RL algorithm to learn from the CPC. The past performance of both methods is assessed on a continuous basis, allowing control to transition from CPC to RL and, if needed, back from RL to CPC. This lets the RL algorithm slowly and safely assume control of the process without significant degradation in control performance. It is shown that the RL agent can derive a near-optimal policy even when coupled with a suboptimal CPC. It is also demonstrated that the coupled RL-CPC algorithm learns at a faster rate than traditional RL exploration methods, while its performance does not deteriorate below that of the CPC, even when exposed to an unknown operating condition.
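The supervisory switching idea described in the abstract lends itself to a compact illustration. Below is a minimal, hypothetical Python sketch of one way such a layer could be arranged: a PI loop stands in for the CPC, a placeholder function stands in for a trained RL policy, and a moving-window cost comparison decides which controller holds authority, with an immediate fallback to CPC. The class names, toy plant, window size, and switching margin are all assumptions made for illustration, not the authors' implementation.

```python
# Hypothetical sketch of parallel CPC/RL operation with performance-based
# switching. Everything here is illustrative, not the paper's algorithm.
from collections import deque


class PIController:
    """Stand-in for the conventional process controller (CPC)."""

    def __init__(self, kp=1.0, ki=0.1):
        self.kp, self.ki = kp, ki
        self.integral = 0.0

    def act(self, error, dt=1.0):
        self.integral += error * dt
        return self.kp * error + self.ki * self.integral


class PerformanceMonitor:
    """Moving-window squared-error cost, tracked per controller."""

    def __init__(self, window=50):
        self.costs = {"cpc": deque(maxlen=window), "rl": deque(maxlen=window)}

    def record(self, name, error):
        self.costs[name].append(error ** 2)

    def mean_cost(self, name):
        c = self.costs[name]
        return sum(c) / len(c) if c else float("inf")


def supervisory_step(monitor, active, margin=0.9):
    """Hand authority to RL only once its recent cost beats the CPC's by a
    margin; revert to CPC immediately if RL's recent cost is worse."""
    cpc_cost = monitor.mean_cost("cpc")
    rl_cost = monitor.mean_cost("rl")
    if active == "rl" and rl_cost > cpc_cost:
        return "cpc"
    if active == "cpc" and rl_cost < margin * cpc_cost:
        return "rl"
    return active


def plant(x, u):
    # Toy first-order plant, purely illustrative.
    return 0.9 * x + 0.1 * u


def rl_policy(error):
    # Placeholder for a trained RL agent's policy.
    return 1.2 * error


if __name__ == "__main__":
    cpc = PIController()
    monitor = PerformanceMonitor()
    x, setpoint, active = 0.0, 1.0, "cpc"

    for t in range(300):
        error = setpoint - x
        u_cpc, u_rl = cpc.act(error), rl_policy(error)
        # In this toy, both candidate actions can be scored against the known
        # plant model; the paper instead assesses past closed-loop performance,
        # since a real plant admits only one action at a time.
        monitor.record("cpc", setpoint - plant(x, u_cpc))
        monitor.record("rl", setpoint - plant(x, u_rl))
        x = plant(x, u_cpc if active == "cpc" else u_rl)
        active = supervisory_step(monitor, active)

    print("controller in authority at end:", active)
```

Note the asymmetric switching rule: RL must beat the CPC by a margin before taking over, but loses authority as soon as its windowed cost is merely worse, which mirrors the abstract's emphasis on letting RL assume control slowly and safely.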
About the journal:
Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.