End-to-end reinforcement learning of Koopman models for economic nonlinear model predictive control

Journal: Computers & Chemical Engineering, vol. 190, Article 108824
Impact factor: 3.9 | Zone 2 (Engineering & Technology) | JCR Q2 (Computer Science, Interdisciplinary Applications)
Pub Date: 2024-08-05 | DOI: 10.1016/j.compchemeng.2024.108824
Daniel Mayfrank, Alexander Mitsos, Manuel Dahmen
Citations: 0

Abstract


(Economic) nonlinear model predictive control ((e)NMPC) requires dynamic models that are sufficiently accurate and computationally tractable. Data-driven surrogate models for mechanistic models can reduce the computational burden of (e)NMPC; however, such models are typically trained by system identification for maximum prediction accuracy on simulation samples and perform suboptimally in (e)NMPC. We present a method for end-to-end reinforcement learning of Koopman surrogate models for optimal performance as part of (e)NMPC. We apply our method to two applications derived from an established nonlinear continuous stirred-tank reactor model. The controller performance is compared to that of (e)NMPCs utilizing models trained using system identification, and model-free neural network controllers trained using reinforcement learning. We show that the end-to-end trained models outperform those trained using system identification in (e)NMPC, and that, in contrast to the neural network controllers, the (e)NMPC controllers can react to changes in the control setting without retraining.
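
The abstract does not spell out the model structure, so the following is only a minimal sketch of what a Koopman surrogate looks like in general: the state is lifted into observables whose dynamics are linear, which keeps multi-step prediction inside a controller cheap. The sketch fits the linear operators by least squares, i.e. by plain system identification (the baseline the abstract contrasts against), not by the authors' end-to-end reinforcement learning. The toy plant `step`, the hand-picked observables in `lift`, and all parameter values are assumptions made for illustration and have nothing to do with the stirred-tank reactor case studies.

```python
import numpy as np

def step(x, u, a=0.9, b=0.5, c=0.2, d=1.0):
    """Toy nonlinear plant (hypothetical; not the reactor model from the paper)."""
    return np.array([a * x[0], b * x[1] + c * x[0] ** 2 + d * u])

def lift(x):
    """Hand-picked observables z = [x1, x2, x1^2].

    For this toy plant the set is closed under the dynamics, so a linear
    model in z is exact. Real systems usually only admit an approximation.
    """
    return np.array([x[0], x[1], x[0] ** 2])

def fit_koopman(X, U, X_next):
    """Least-squares fit of z_next ~ A z + B u (system identification)."""
    Z = np.array([lift(x) for x in X])            # shape (N, 3)
    Z_next = np.array([lift(x) for x in X_next])  # shape (N, 3)
    ZU = np.hstack([Z, U.reshape(-1, 1)])         # shape (N, 4)
    theta, *_ = np.linalg.lstsq(ZU, Z_next, rcond=None)  # shape (4, 3)
    A = theta[:3].T  # (3, 3) lifted state transition
    B = theta[3:].T  # (3, 1) lifted input matrix
    return A, B

def predict(A, B, x0, u_seq):
    """Roll the linear lifted model forward and read back the state."""
    z = lift(x0)
    traj = []
    for u in u_seq:
        z = A @ z + B[:, 0] * u
        traj.append(z[:2].copy())  # first two observables are the state itself
    return np.array(traj)

# Sample one-step transitions from the toy plant and fit the surrogate.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 2))
U = rng.uniform(-1.0, 1.0, size=200)
X_next = np.array([step(x, u) for x, u in zip(X, U)])
A, B = fit_koopman(X, U, X_next)
```

Calling `predict(A, B, x0, u_seq)` with an initial state and an input sequence returns the predicted state trajectory. In an end-to-end setting, the lifting and the matrices would instead be tuned for closed-loop control performance rather than for one-step prediction error, which is the distinction the abstract draws.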

Source journal
Computers & Chemical Engineering (category: Engineering & Technology, Chemical Engineering)
CiteScore: 8.70
Self-citation rate: 14.00%
Publication volume: 374
Review time: 70 days
Journal description: Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.