A bi-level optimization strategy for flexible and economic operation of the CHP units based on reinforcement learning and multi-objective MPC

Applied Energy (IF 11; Zone 1, Engineering & Technology; Q1, Energy & Fuels). Pub Date: 2025-08-01. Epub Date: 2025-04-16. DOI: 10.1016/j.apenergy.2025.125850

Keyan Zhu, Guangming Zhang, Chen Zhu, Yuguang Niu, Jizhen Liu

Applied Energy, vol. 391, Article 125850. Full text: https://www.sciencedirect.com/science/article/pii/S030626192500580X

Abstract

Enhancing the overall performance of combined heat and power (CHP) units is crucial for accommodating renewable energy and conserving energy. To this end, a bi-level optimization strategy based on reinforcement learning (RL) and multi-objective model predictive control (MOMPC) is proposed to improve the flexibility and economic performance of CHP units. First, a CHP unit model is constructed and its parameters are incorporated into the rolling optimization of the MOMPC, which serves as the lower-level follower and handles the basic control task. Second, the twin delayed deep deterministic policy gradient (TD3) algorithm is integrated with MOMPC (TD3-MOMPC), with the TD3 agent designated as the upper-level leader. The complex flexibility requirements and the optimal control sequence of the CHP unit are decomposed, and the resulting tasks are assigned to the upper-level leader and the lower-level follower for interactive bi-level optimization. Third, a multi-criterion reward function is designed for the upper level, with power flexibility, heating quality, and operational economy guiding the leader. The actions of the upper-level TD3 agent are defined as the dynamic weights and time-varying prediction horizons of the MOMPC rolling optimization, which links the two levels and lets the leader guide the follower. Finally, to verify the strategy, extensive load-variation and disturbance-rejection tests were conducted on a 300 MW CHP unit. The results show that the proposed strategy improves the unit's load flexibility, heating quality, and operational economy.
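The leader-follower coupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar plant (`A`, `B`), the hand-written `leader_policy` standing in for a trained TD3 actor, the weight and horizon ranges, and the unconstrained weighted-sum objective are all assumptions chosen for brevity. The sketch only shows the interface, in which the upper level picks the weights and the prediction horizon at each step and the lower-level MPC solves its rolling optimization with those settings.

```python
import numpy as np

A, B = 0.9, 0.5  # hypothetical scalar plant: x[k+1] = A*x[k] + B*u[k]


def leader_policy(error):
    """Stand-in for a trained TD3 actor (an assumption, not the paper's agent).

    Maps the current tracking error to the follower's settings:
    (tracking weight, economy weight, prediction horizon).
    """
    urgency = min(abs(error), 1.0)          # 0 = on target, 1 = far away
    w_track = 1.0 + 9.0 * urgency           # favour tracking when far away
    w_econ = 1.0 - 0.9 * urgency            # favour small moves when close
    horizon = 3 + int(round(7 * urgency))   # time-varying horizon, 3..10
    return w_track, w_econ, horizon


def follower_mpc(x, ref, w_track, w_econ, horizon):
    """Weighted-sum MPC in deviation coordinates; returns the first move.

    Minimises w_track*sum(e^2) + w_econ*sum(v^2) over the horizon, where
    e = x - ref and v = u - u_ss (u_ss is the input that holds x at ref).
    """
    u_ss = (1.0 - A) * ref / B
    e0 = x - ref
    # Stacked prediction over the horizon: E = F*e0 + G @ V
    F = A ** np.arange(1, horizon + 1)
    G = np.zeros((horizon, horizon))
    for i in range(horizon):
        for j in range(i + 1):
            G[i, j] = A ** (i - j) * B
    # Normal equations of the weighted least-squares problem
    H = w_track * G.T @ G + w_econ * np.eye(horizon)
    V = np.linalg.solve(H, -w_track * G.T @ (F * e0))
    return u_ss + V[0]


def simulate(ref=1.0, steps=60):
    """Closed loop: the leader sets weights and horizon, the follower acts."""
    x, log = 0.0, []
    for _ in range(steps):
        w_track, w_econ, horizon = leader_policy(ref - x)
        u = follower_mpc(x, ref, w_track, w_econ, horizon)
        x = A * x + B * u
        log.append((x, u, w_track, w_econ, horizon))
    return log
```

Calling `simulate()` returns one tuple per step, so the weights and horizon chosen by the leader can be inspected alongside the plant response. In a learning setup, `leader_policy` would be replaced by the actor network, and a reward built from the tracking and economy terms would be returned to the agent after each step.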

Source journal

Applied Energy (Engineering & Technology; Engineering: Chemical)

CiteScore: 21.20
Self-citation rate: 10.70%
Publications: 1830
Review time: 41 days

Journal description: Applied Energy serves as a platform for sharing innovations, research, development, and demonstrations in energy conversion, conservation, and sustainable energy systems. The journal covers topics such as optimal energy resource use, environmental pollutant mitigation, and energy process analysis. It welcomes original papers, review articles, technical notes, and letters to the editor. Authors are encouraged to submit manuscripts that bridge the gap between research, development, and implementation. The journal addresses a wide spectrum of topics, including fossil and renewable energy technologies, energy economics, and environmental impacts. Applied Energy also explores modeling and forecasting, conservation strategies, and the social and economic implications of energy policies, including climate change mitigation. It is complemented by the open-access journal Advances in Applied Energy.