Factorial Kernel Dynamic Policy Programming for Vinyl Acetate Monomer Plant Model Control

Yunduan Cui, Lingwei Zhu, Morihiro Fujisaki, H. Kanokogi, Takamitsu Matsubara
{"title":"Factorial Kernel Dynamic Policy Programming for Vinyl Acetate Monomer Plant Model Control","authors":"Yunduan Cui, Lingwei Zhu, Morihiro Fujisaki, H. Kanokogi, Takamitsu Matsubara","doi":"10.1109/COASE.2018.8560593","DOIUrl":null,"url":null,"abstract":"This research focuses on applying reinforcement learning towards chemical plant control problems in order to optimize production while maintaining plant stability without requiring knowledge of the plant models. Since a typical chemical plant has a large number of sensors and actuators, the control problem of such a plant can be formulated as a Markov decision process involving high-dimensional state and a huge number of actions that might be difficult to solve by previous methods due to computational complexity and sample insufficiency. To overcome these issues, we propose a new reinforcement learning method, Factorial Kernel Dynamic Policy Programming, that employs 1) a factorial policy model and 2) a factor-wise kernel-based smooth policy update by regularization with the Kullback-Leibler divergence between the current and updated policies. To validate its effectiveness, FKDPP is evaluated via the Vinyl Acetate Monomer plant (VAM) model, a popular benchmark chemical plant control problem. Compared with previous methods that cannot directly process a huge number of actions, our proposed method leverages the same number of training samples and achieves a better control strategy for VAM yield, quality, and plant stability.","PeriodicalId":6518,"journal":{"name":"2018 IEEE 14th International Conference on Automation Science and Engineering (CASE)","volume":"59 1","pages":"304-309"},"PeriodicalIF":0.0000,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 14th International Conference on Automation Science and Engineering (CASE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/COASE.2018.8560593","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 14

Abstract

This research focuses on applying reinforcement learning to chemical plant control problems in order to optimize production while maintaining plant stability, without requiring knowledge of the plant model. Since a typical chemical plant has a large number of sensors and actuators, its control problem can be formulated as a Markov decision process with a high-dimensional state space and a huge number of actions, which is difficult for previous methods to solve due to computational complexity and sample insufficiency. To overcome these issues, we propose a new reinforcement learning method, Factorial Kernel Dynamic Policy Programming (FKDPP), that employs 1) a factorial policy model and 2) a factor-wise, kernel-based smooth policy update obtained by regularizing with the Kullback-Leibler divergence between the current and updated policies. To validate its effectiveness, FKDPP is evaluated on the Vinyl Acetate Monomer (VAM) plant model, a popular benchmark chemical plant control problem. Compared with previous methods, which cannot directly handle a huge number of actions, the proposed method uses the same number of training samples and achieves a better control strategy with respect to VAM yield, product quality, and plant stability.
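
The abstract names the two ingredients of FKDPP: a factorial policy model, which gives each actuator (action factor) its own policy instead of one policy over the exponentially large joint action space, and a Kullback-Leibler-regularized, Dynamic-Policy-Programming-style update that keeps successive policies close to each other. The sketch below illustrates the idea in tabular form only; the paper itself uses kernel-based function approximation, and all names and hyperparameters here (FactorialDPPAgent, eta, gamma, alpha) are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: a tabular, factor-wise DPP-style agent.
# The actual FKDPP method uses kernel-based approximation of the preference
# functions; the class/parameter names and update form here are assumptions.
import numpy as np

def boltzmann(pref, eta):
    """Softmax policy over action preferences; eta sets how strongly the
    KL-regularized update may move away from the current policy."""
    z = eta * (pref - pref.max())          # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

class FactorialDPPAgent:
    """One preference table per action factor instead of one over the joint space.

    With d factors of k choices each, this stores d * k preferences per state
    rather than k ** d, which is what makes huge action spaces tractable."""

    def __init__(self, n_states, factor_sizes, eta=1.0, gamma=0.95, alpha=0.1):
        self.eta, self.gamma, self.alpha = eta, gamma, alpha
        # preferences[i][s, a_i]: preference for choice a_i of factor i in state s
        self.preferences = [np.zeros((n_states, k)) for k in factor_sizes]

    def act(self, state, rng):
        """Sample each action factor independently from its own Boltzmann policy."""
        return tuple(
            rng.choice(pref.shape[1], p=boltzmann(pref[state], self.eta))
            for pref in self.preferences
        )

    def update(self, state, action, reward, next_state):
        """Factor-wise, DPP-style preference update: each factor moves its taken
        action's preference by a soft (Boltzmann-averaged) temporal difference,
        yielding a smooth, KL-regularized change of that factor's policy."""
        for i, pref in enumerate(self.preferences):
            soft_v = boltzmann(pref[state], self.eta) @ pref[state]
            soft_v_next = boltzmann(pref[next_state], self.eta) @ pref[next_state]
            td = reward + self.gamma * soft_v_next - soft_v
            pref[state, action[i]] += self.alpha * td

# Toy usage: 4 action factors with 3 choices each (81 joint actions, but only
# 4 * 3 = 12 preferences per state are stored and updated).
rng = np.random.default_rng(0)
agent = FactorialDPPAgent(n_states=5, factor_sizes=[3, 3, 3, 3])
a = agent.act(state=0, rng=rng)
agent.update(state=0, action=a, reward=1.0, next_state=1)
```

Under these assumptions, the factorization is what lets the method handle "a huge number of actions" with the same number of samples: each transition updates every factor's small model at once, and the joint policy is the product of the factor-wise Boltzmann policies.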