A bi-level optimization strategy for flexible and economic operation of the CHP units based on reinforcement learning and multi-objective MPC

Impact Factor: 11 | Zone 1 (Engineering & Technology) | Q1, ENERGY & FUELS | Applied Energy | Pub Date: 2025-08-01 | Epub Date: 2025-04-16 | DOI: 10.1016/j.apenergy.2025.125850
Keyan Zhu, Guangming Zhang, Chen Zhu, Yuguang Niu, Jizhen Liu
Applied Energy, volume 391, Article 125850. https://www.sciencedirect.com/science/article/pii/S030626192500580X

Abstract

Improving the overall performance of combined heat and power (CHP) units is crucial for accommodating renewable energy and conserving energy. To this end, a bi-level optimization strategy based on reinforcement learning (RL) and multi-objective model predictive control (MOMPC) is proposed to enhance the flexibility and economic performance of CHP units. First, a CHP unit model is constructed and its parameters are incorporated into the rolling optimization of the MOMPC, which serves as the lower-level follower and handles the fundamental control task. Second, a bi-level optimization strategy that integrates the twin delayed deep deterministic policy gradient (TD3) algorithm with MOMPC (TD3-MOMPC) is proposed, with the TD3 agent designated as the upper-level leader. The complex flexibility requirements and the optimization control sequence of the CHP unit are decomposed, and the resulting tasks are assigned to the upper-level leader and the lower-level follower for interactive bi-level optimization. Third, a multi-criterion reward function is designed for the upper level, with power flexibility, heating quality, and operational economy serving as the leader's guidance. The actions of the upper-level TD3 agent are then defined as the dynamic weights and time-varying prediction horizons used in the rolling optimization of the MOMPC, which links the two levels and lets the leader guide the follower. Finally, to verify the effectiveness of the strategy, extensive load-variation and disturbance-rejection tests were conducted on a 300 MW CHP unit. The results show that the proposed strategy enhances the unit's load flexibility, heating quality, and operational economy.
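The leader-follower coupling described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: the two-state linear plant (`A`, `B`), the softmax weight mapping, the horizon bounds `h_min`/`h_max`, the reward coefficients, and the fixed `action` that stands in for a trained TD3 policy are all assumptions made for illustration. It only shows the general idea of an upper-level action being decoded into objective weights and a prediction horizon, which a lower-level MPC then uses in one rolling-optimization step.

```python
import numpy as np

def action_to_mpc_params(action, h_min=5, h_max=30):
    """Decode a 4-dim action in [-1, 1] into (weights, horizon).

    Hypothetical mapping: the first three entries become positive
    weights summing to 1 (power tracking, heat tracking, control
    effort); the last entry is scaled to an integer horizon.
    """
    a = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
    raw = np.exp(a[:3])                      # softmax keeps weights positive
    weights = raw / raw.sum()
    horizon = int(round(h_min + (a[3] + 1.0) / 2.0 * (h_max - h_min)))
    return weights, horizon

def mpc_step(A, B, x0, ref, weights, horizon):
    """One step of an unconstrained linear MPC for a 2-state plant.

    Minimizes sum_k [w_p*(x_p - r_p)^2 + w_h*(x_h - r_h)^2 + w_e*|u|^2]
    over the horizon and returns only the first control move.
    """
    n, m = B.shape
    # Stacked prediction over the horizon: X = F x0 + G U
    F = np.vstack([np.linalg.matrix_power(A, k + 1) for k in range(horizon)])
    G = np.zeros((horizon * n, horizon * m))
    for i in range(horizon):
        for j in range(i + 1):
            G[i*n:(i+1)*n, j*m:(j+1)*m] = np.linalg.matrix_power(A, i - j) @ B
    Q = np.kron(np.eye(horizon), np.diag(weights[:2]))  # tracking weights
    R = weights[2] * np.eye(horizon * m)                # effort weight
    err0 = np.tile(ref, horizon) - F @ x0
    U = np.linalg.solve(G.T @ Q @ G + R, G.T @ Q @ err0)
    return U[:m]

def reward(x, ref, u, coeffs=(1.0, 1.0, 0.1)):
    """Toy multi-criterion reward: power error, heat error, effort."""
    c_p, c_h, c_e = coeffs
    return -(c_p * abs(x[0] - ref[0]) + c_h * abs(x[1] - ref[1])
             + c_e * float(u @ u))

# Illustrative plant: state = (power output, heat output); values assumed.
A = np.array([[0.9, 0.05], [0.0, 0.8]])
B = np.array([[0.1, 0.0], [0.02, 0.2]])
x = np.zeros(2)
ref = np.array([1.0, 0.5])
action = np.array([0.5, 0.5, -1.0, 0.0])    # stand-in for a TD3 policy output

rewards = []
for _ in range(50):
    weights, horizon = action_to_mpc_params(action)  # leader sets parameters
    u = mpc_step(A, B, x, ref, weights, horizon)     # follower optimizes
    x = A @ x + B @ u                                # plant moves one step
    rewards.append(reward(x, ref, u))                # feedback to the leader
```

In a full scheme the `action` would be produced by the agent at each step from the observed state, and the collected rewards would drive its training; here it is held constant so that the example stays self-contained.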
Source journal
Applied Energy (Engineering & Technology: Engineering, Chemical)
CiteScore: 21.20
Self-citation rate: 10.70%
Articles published: 1830
Review time: 41 days
Journal description: Applied Energy serves as a platform for sharing innovations, research, development, and demonstrations in energy conversion, conservation, and sustainable energy systems. The journal covers topics such as optimal energy resource use, environmental pollutant mitigation, and energy process analysis. It welcomes original papers, review articles, technical notes, and letters to the editor. Authors are encouraged to submit manuscripts that bridge the gap between research, development, and implementation. The journal addresses a wide spectrum of topics, including fossil and renewable energy technologies, energy economics, and environmental impacts. Applied Energy also explores modeling and forecasting, conservation strategies, and the social and economic implications of energy policies, including climate change mitigation. It is complemented by the open-access journal Advances in Applied Energy.