Title: A bi-level optimization strategy for flexible and economic operation of the CHP units based on reinforcement learning and multi-objective MPC
Authors: Keyan Zhu, Guangming Zhang, Chen Zhu, Yuguang Niu, Jizhen Liu
Journal: Applied Energy, vol. 391, Article 125850 (impact factor 11.0; JCR Q1, Energy & Fuels)
Published: 2025-08-01 (Epub 2025-04-16)
DOI: 10.1016/j.apenergy.2025.125850
URL: https://www.sciencedirect.com/science/article/pii/S030626192500580X
Citations: 0
Abstract
Enhancing the overall performance of combined heat and power (CHP) units is crucial for accommodating renewable energy and conserving energy. To this end, a bi-level optimization strategy based on reinforcement learning (RL) and multi-objective model predictive control (MOMPC) is proposed to improve the flexibility and economic performance of CHP units. First, a CHP unit model is constructed and its parameters are incorporated into the rolling optimization of the MOMPC, which serves as the lower-level follower and solves the fundamental control problem. Second, the twin delayed deep deterministic policy gradient (TD3) algorithm is integrated with MOMPC (TD3-MOMPC), with the TD3 agent designated as the upper-level leader. By decomposing the complex flexibility requirements and the optimization control sequence of the CHP unit, tasks are assigned to the upper-level leader and the lower-level follower for bi-level interactive optimization. Third, a multi-criterion reward function is designed for the upper level, guided by power flexibility, heating quality, and operational economy. The actions of the upper-level TD3 agent are then defined as the dynamic weights and time-varying prediction horizons used in the rolling optimization of the MOMPC, which links the two levels and lets the leader guide the follower. Finally, to verify the effectiveness of the strategy, extensive load-variation and disturbance-rejection tests were conducted on a 300 MW CHP unit. The results show that the proposed strategy enhances the unit's load flexibility, heating quality, and operational economy.
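The leader-follower interaction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the scalar plant (`A`, `B`), the horizon bounds (`H_MIN`, `H_MAX`), the reward coefficients, and all function names are hypothetical, the MPC is unconstrained, and a fixed `policy` callable stands in for a trained TD3 actor. The sketch only shows how an upper-level action might be decoded into objective weights and a prediction horizon that the lower-level rolling optimization then uses.

```python
import numpy as np

# All constants below are illustrative assumptions, not values from the paper.
A, B = 0.9, 0.1          # toy scalar plant: x[k+1] = A*x[k] + B*u[k]
H_MIN, H_MAX = 5, 30     # assumed bounds for the prediction horizon


def decode_action(action):
    """Map a raw agent action in [-1, 1]^3 to (objective weights, horizon)."""
    a = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
    raw = (a[:2] + 1.0) / 2.0 + 1e-3       # two strictly positive scores
    weights = raw / raw.sum()              # normalise so the weights sum to 1
    horizon = int(round(H_MIN + (a[2] + 1.0) / 2.0 * (H_MAX - H_MIN)))
    return weights, horizon


def mpc_step(x, ref, weights, horizon):
    """Lower-level follower: unconstrained weighted MPC, returns the first move."""
    # Stacked prediction over the horizon: X = F*x + G @ U
    F = np.array([A ** (i + 1) for i in range(horizon)])
    G = np.zeros((horizon, horizon))
    for i in range(horizon):
        for j in range(i + 1):
            G[i, j] = A ** (i - j) * B
    w_track, w_effort = weights
    # Minimise w_track*||F*x + G@U - ref||^2 + w_effort*||U||^2 (normal equations)
    hessian = w_track * G.T @ G + w_effort * np.eye(horizon)
    rhs = w_track * G.T @ (ref - F * x)
    U = np.linalg.solve(hessian, rhs)
    return float(U[0])                     # receding horizon: apply only U[0]


def reward(power_error, heat_error, cost, coeffs=(1.0, 1.0, 0.1)):
    """Assumed multi-criterion reward: penalise load error, heat error, and cost."""
    c1, c2, c3 = coeffs
    return -(c1 * abs(power_error) + c2 * abs(heat_error) + c3 * cost)


def run_episode(policy, steps=50, ref=1.0):
    """Bi-level loop: the leader picks weights/horizon, the follower solves the MPC."""
    x, total = 0.0, 0.0
    for _ in range(steps):
        weights, horizon = decode_action(policy(x, ref))   # upper level
        u = mpc_step(x, ref, weights, horizon)              # lower level
        x = A * x + B * u
        # The toy plant has no heat output, so the heat error is set to 0 here.
        total += reward(ref - x, 0.0, abs(u))
    return x, total
```

Calling `run_episode(lambda x, ref: [1.0, -1.0, 0.0])` runs the loop with a constant action that favours tracking over control effort. In a TD3 setting, the lambda would be replaced by the actor network and the accumulated reward would drive training.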
About the journal:
Applied Energy serves as a platform for sharing innovations, research, development, and demonstrations in energy conversion, conservation, and sustainable energy systems. The journal covers topics such as optimal energy resource use, environmental pollutant mitigation, and energy process analysis. It welcomes original papers, review articles, technical notes, and letters to the editor. Authors are encouraged to submit manuscripts that bridge the gap between research, development, and implementation. The journal addresses a wide spectrum of topics, including fossil and renewable energy technologies, energy economics, and environmental impacts. Applied Energy also explores modeling and forecasting, conservation strategies, and the social and economic implications of energy policies, including climate change mitigation. It is complemented by the open-access journal Advances in Applied Energy.