Sim-to-real transfer in reinforcement learning-based, non-steady-state control for chemical plants
Authors: Shumpei Kubosawa, Takashi Onishi, Y. Tsuruoka
Journal: SICE journal of control, measurement, and system integration
Published: 2022-02-15
DOI: https://doi.org/10.1080/18824889.2022.2029033
Citations: 3
Abstract
We present a novel framework for controlling non-steady situations in chemical plants that addresses the behavioural gaps between the simulator used to construct the reinforcement learning-based controller and the real plant in which the framework is deployed. In the field of reinforcement learning, the performance deterioration caused by such gaps is referred to as the simulation-to-reality gap (Sim-to-Real gap) problem. These gaps are triggered by multiple factors, including modelling errors in the simulator, incorrect state identification, and unpredicted disturbances in the real plant. We focus on these issues and divide the objective of performing optimal control in the presence of such gaps into three tasks, namely, (1) identifying the model parameters and current state, (2) optimizing the operation procedures, and (3) bringing the real situation close to the simulated and predicted situation by adjusting the control inputs. Each task is assigned to a reinforcement learning agent, and the agents are trained individually. After training, the agents are integrated and collaborate on the original objective. We evaluate our method in an actual chemical distillation plant; the results demonstrate that our system successfully narrows the gaps caused by an emulated weather disturbance (heavy rain) and by modelling errors, and achieves the desired states.
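The three-task decomposition can be illustrated with a deliberately small sketch. It is not the authors' implementation: the first-order toy process, the function names (`step`, `identify`, `plan`, `track`, `run`), and all numeric values are assumptions made for illustration, and simple hand-written rules stand in for the three trained reinforcement learning agents. It only shows how identification, planning in the simulator, and input adjustment might fit together when the real process is subject to an unmodelled disturbance.

```python
# Toy illustration only: hand-written rules stand in for the three trained
# reinforcement learning agents, and the process model is hypothetical.

def step(x, u, a, d=0.0, dt=0.1):
    """One Euler step of a first-order toy process: dx/dt = -a*x + u + d."""
    return x + dt * (-a * x + u + d)

def identify(xs, us, dt=0.1):
    """Task 1 stand-in: least-squares estimate of `a` from logged data."""
    # Rearranged model: x[t+1] - x[t] - dt*u[t] = a * (-dt*x[t])
    num = sum((xs[t + 1] - xs[t] - dt * us[t]) * (-dt * xs[t])
              for t in range(len(us)))
    den = sum((dt * xs[t]) ** 2 for t in range(len(us)))
    return num / den

def plan(x0, target, a_hat, horizon, dt=0.1, gain=2.0):
    """Task 2 stand-in: roll out the simulator to get a reference plan."""
    xs, us, x = [x0], [], x0
    for _ in range(horizon):
        # Feedforward term plus feedback on the simulated state.
        u = a_hat * target + gain * (target - x)
        x = step(x, u, a_hat, dt=dt)
        us.append(u)
        xs.append(x)
    return xs, us

def track(x_real, x_ref, u_ref, k=5.0):
    """Task 3 stand-in: adjust the planned input to pull the plant to the plan."""
    return u_ref + k * (x_ref - x_real)

def run(a_true=1.5, disturbance=-0.5, target=1.0, horizon=100):
    # Log a short excitation on the "real" process and identify its parameter.
    xs, us = [0.5], [1.0] * 20
    for u in us:
        xs.append(step(xs[-1], u, a_true))
    a_hat = identify(xs, us)

    # Plan in the simulator using the identified parameter.
    ref_x, ref_u = plan(xs[-1], target, a_hat, horizon)

    # Execute on the "real" process under an unmodelled disturbance,
    # once without and once with the tracking correction.
    x_open = x_closed = xs[-1]
    for t in range(horizon):
        x_open = step(x_open, ref_u[t], a_true, d=disturbance)
        u = track(x_closed, ref_x[t], ref_u[t])
        x_closed = step(x_closed, u, a_true, d=disturbance)
    return a_hat, abs(x_open - target), abs(x_closed - target)

if __name__ == "__main__":
    a_hat, err_open, err_closed = run()
    print(f"estimated a: {a_hat:.3f}")
    print(f"final error without tracking: {err_open:.3f}")
    print(f"final error with tracking:    {err_closed:.3f}")
```

In this toy setting, executing the planned inputs unchanged leaves a persistent offset from the target, while the tracking correction reduces that offset without removing it entirely. The sketch says nothing about how the paper's agents are trained or how they perform on the distillation plant.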