Factorial Kernel Dynamic Policy Programming for Vinyl Acetate Monomer Plant Model Control

Yunduan Cui, Lingwei Zhu, Morihiro Fujisaki, H. Kanokogi, Takamitsu Matsubara
{"title":"醋酸乙烯单体工厂模型控制的析因核动态策略规划","authors":"Yunduan Cui, Lingwei Zhu, Morihiro Fujisaki, H. Kanokogi, Takamitsu Matsubara","doi":"10.1109/COASE.2018.8560593","DOIUrl":null,"url":null,"abstract":"This research focuses on applying reinforcement learning towards chemical plant control problems in order to optimize production while maintaining plant stability without requiring knowledge of the plant models. Since a typical chemical plant has a large number of sensors and actuators, the control problem of such a plant can be formulated as a Markov decision process involving high-dimensional state and a huge number of actions that might be difficult to solve by previous methods due to computational complexity and sample insufficiency. To overcome these issues, we propose a new reinforcement learning method, Factorial Kernel Dynamic Policy Programming, that employs 1) a factorial policy model and 2) a factor-wise kernel-based smooth policy update by regularization with the Kullback-Leibler divergence between the current and updated policies. To validate its effectiveness, FKDPP is evaluated via the Vinyl Acetate Monomer plant (VAM) model, a popular benchmark chemical plant control problem. Compared with previous methods that cannot directly process a huge number of actions, our proposed method leverages the same number of training samples and achieves a better control strategy for VAM yield, quality, and plant stability.","PeriodicalId":6518,"journal":{"name":"2018 IEEE 14th International Conference on Automation Science and Engineering (CASE)","volume":"59 1","pages":"304-309"},"PeriodicalIF":0.0000,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"Factorial Kernel Dynamic Policy Programming for Vinyl Acetate Monomer Plant Model Control\",\"authors\":\"Yunduan Cui, Lingwei Zhu, Morihiro Fujisaki, H. Kanokogi, Takamitsu Matsubara\",\"doi\":\"10.1109/COASE.2018.8560593\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This research focuses on applying reinforcement learning towards chemical plant control problems in order to optimize production while maintaining plant stability without requiring knowledge of the plant models. Since a typical chemical plant has a large number of sensors and actuators, the control problem of such a plant can be formulated as a Markov decision process involving high-dimensional state and a huge number of actions that might be difficult to solve by previous methods due to computational complexity and sample insufficiency. To overcome these issues, we propose a new reinforcement learning method, Factorial Kernel Dynamic Policy Programming, that employs 1) a factorial policy model and 2) a factor-wise kernel-based smooth policy update by regularization with the Kullback-Leibler divergence between the current and updated policies. To validate its effectiveness, FKDPP is evaluated via the Vinyl Acetate Monomer plant (VAM) model, a popular benchmark chemical plant control problem. 
Compared with previous methods that cannot directly process a huge number of actions, our proposed method leverages the same number of training samples and achieves a better control strategy for VAM yield, quality, and plant stability.\",\"PeriodicalId\":6518,\"journal\":{\"name\":\"2018 IEEE 14th International Conference on Automation Science and Engineering (CASE)\",\"volume\":\"59 1\",\"pages\":\"304-309\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2018 IEEE 14th International Conference on Automation Science and Engineering (CASE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/COASE.2018.8560593\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 14th International Conference on Automation Science and Engineering (CASE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/COASE.2018.8560593","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 14

Abstract

This research focuses on applying reinforcement learning to chemical plant control problems in order to optimize production while maintaining plant stability, without requiring knowledge of the plant models. Since a typical chemical plant has a large number of sensors and actuators, its control problem can be formulated as a Markov decision process with a high-dimensional state and a huge number of actions, which is difficult for previous methods to solve due to computational complexity and sample insufficiency. To overcome these issues, we propose a new reinforcement learning method, Factorial Kernel Dynamic Policy Programming (FKDPP), that employs 1) a factorial policy model and 2) a factor-wise, kernel-based smooth policy update obtained by regularizing with the Kullback-Leibler divergence between the current and updated policies. To validate its effectiveness, FKDPP is evaluated on the Vinyl Acetate Monomer (VAM) plant model, a popular benchmark chemical plant control problem. Compared with previous methods, which cannot directly process a huge number of actions, the proposed method uses the same number of training samples and achieves a better control strategy for VAM yield, quality, and plant stability.
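The abstract names the two key ingredients of FKDPP: a factorial policy, with one sub-policy per action factor (e.g., per actuator), and a smooth policy update regularized by the Kullback-Leibler divergence between consecutive policies. The sketch below is a minimal, hypothetical illustration of that idea in tabular form only: it uses a toy random environment in place of the VAM plant model, omits the kernel-based function approximation that gives FKDPP its name, and follows the generic Dynamic Policy Programming recursion rather than the paper's exact formulation. All names, sizes, and parameters here are made up for illustration.

```python
import numpy as np

# Illustrative sketch only: a tabular, factorial DPP-style update.
# Assumptions (not from the paper): a toy random environment stands in for
# the VAM plant model, and the kernel-based approximation of FKDPP is omitted.

rng = np.random.default_rng(0)

n_states = 5       # toy discrete state space (the real plant state is high-dimensional)
n_factors = 3      # e.g., one factor per actuator, each with its own sub-policy
n_actions = 4      # discrete settings per actuator
eta = 1.0          # inverse temperature of the softmax policy
gamma = 0.95       # discount factor

# One preference table per action factor: Psi[d][s, a].
# A monolithic policy would instead need n_actions ** n_factors joint actions.
Psi = [np.zeros((n_states, n_actions)) for _ in range(n_factors)]


def boltzmann_softmax(psi_s, eta):
    """Boltzmann-weighted average of preferences (a smooth max)."""
    z = eta * psi_s
    w = np.exp(z - z.max())
    w /= w.sum()
    return float(w @ psi_s)


def factor_policy(psi_s, eta):
    """Softmax policy over a single action factor."""
    z = eta * psi_s
    p = np.exp(z - z.max())
    return p / p.sum()


def dpp_update(psi, s, a, r, s_next):
    """Sample-based DPP-style preference update for one factor.
    The KL regularization between consecutive policies is what yields
    this incremental, preference-based form."""
    psi[s, a] += (r
                  + gamma * boltzmann_softmax(psi[s_next], eta)
                  - boltzmann_softmax(psi[s], eta))


# A few fictitious training steps on random transitions, factor by factor.
for step in range(200):
    s = rng.integers(n_states)
    # Each factor samples its own sub-action; the joint action is the tuple.
    joint_action = [rng.choice(n_actions, p=factor_policy(Psi[d][s], eta))
                    for d in range(n_factors)]
    r = rng.normal()                   # placeholder reward (e.g., yield/quality terms)
    s_next = rng.integers(n_states)    # placeholder next state from the plant
    for d, a in enumerate(joint_action):
        dpp_update(Psi[d], s, a, r, s_next)
```

The benefit of the factorization shows up in the training loop: each factor keeps and updates its own preference table, so the per-step cost grows linearly with the number of actuators rather than exponentially with the size of the joint action space.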