A trustworthy reinforcement learning framework for autonomous control of a large-scale complex heating system: Simulation and field implementation

IF 10.1 | Zone 1 (Engineering & Technology) | Q1 Energy & Fuels | Applied Energy | Pub Date: 2024-11-08 | DOI: 10.1016/j.apenergy.2024.124815

Amirreza Heidari, Luc Girardin, Cédric Dorsaz, François Maréchal

Applied Energy, Volume 378, Article 124815. Journal Article. Source: https://www.sciencedirect.com/science/article/pii/S0306261924021986

Citations: 0

Abstract

Traditional control approaches heavily rely on hard-coded expert knowledge, complicating the development of optimal control solutions as system complexity increases. Deep Reinforcement Learning (DRL) offers a self-learning control solution, proving advantageous in scenarios where crafting expert-based solutions becomes intricate. This study investigates the potential of DRL for supervisory control in a unique and complex heating system within a large-scale university building. The DRL framework aims to minimize energy costs while ensuring occupant comfort. However, the trial-and-error learning approach of DRL raises concerns about the trustworthiness of executed actions, hindering practical implementation. To address this, the study incorporates action masking, enabling the integration of hard constraints into DRL to enhance user trust. Maskable Proximal Policy Optimization (MPPO) is evaluated alongside standard Proximal Policy Optimization (PPO) and Soft Actor–Critic (SAC). Simulation results reveal that MPPO achieves comparable energy savings (8% relative to the baseline control) with fewer comfort violations than other methods. Therefore, it is selected among the candidate algorithms and experimentally implemented in the university building over one week. Experimental findings demonstrate that MPPO reduces energy costs while maintaining occupant comfort, resulting in a 36% saving compared to a historical day with similar weather conditions. These results underscore the proactive decision-making capability of DRL, establishing its viability for autonomous control in complex energy systems.
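The action-masking idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or the MPPO algorithm itself; it only shows the general mechanism by which a mask can remove forbidden actions from a discrete policy before sampling. The action names, the logits, and the comfort rule (`comfort_mask`, `min_comfort_c`) are all hypothetical and chosen purely for illustration.

```python
import math
import random


def masked_softmax(logits, mask):
    """Return action probabilities with disallowed actions forced to zero."""
    if not any(mask):
        raise ValueError("at least one action must be allowed")
    # Disallowed actions get a logit of -inf, so they contribute exp(-inf) = 0.
    masked = [l if ok else -math.inf for l, ok in zip(logits, mask)]
    peak = max(masked)  # subtract the maximum for numerical stability
    exps = [math.exp(l - peak) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, mask, rng=random):
    """Sample an action index; masked actions can never be drawn."""
    probs = masked_softmax(logits, mask)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical discrete supervisory actions (not taken from the paper).
ACTIONS = ["heat_pump_off", "heat_pump_low", "heat_pump_high"]


def comfort_mask(indoor_temp_c, min_comfort_c=20.0):
    """Hypothetical hard constraint: forbid 'off' when the room is too cold."""
    too_cold = indoor_temp_c < min_comfort_c
    return [not too_cold, True, True]


logits = [2.0, 0.5, 0.1]  # made-up policy outputs that favour "off"
mask = comfort_mask(indoor_temp_c=19.0)
probs = masked_softmax(logits, mask)
action = ACTIONS[sample_action(logits, mask)]
```

In this sketch the unmasked policy would prefer switching the heat pump off, but the mask removes that option whenever the hypothetical comfort limit is violated, so the constraint holds by construction rather than being learned through penalties.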

Source journal

Applied Energy (Engineering & Technology: Chemical Engineering)

CiteScore: 21.20

Self-citation rate: 10.70%

Articles published: 1830

Review time: 41 days

About the journal: Applied Energy serves as a platform for sharing innovations, research, development, and demonstrations in energy conversion, conservation, and sustainable energy systems. The journal covers topics such as optimal energy resource use, environmental pollutant mitigation, and energy process analysis. It welcomes original papers, review articles, technical notes, and letters to the editor. Authors are encouraged to submit manuscripts that bridge the gap between research, development, and implementation. The journal addresses a wide spectrum of topics, including fossil and renewable energy technologies, energy economics, and environmental impacts. Applied Energy also explores modeling and forecasting, conservation strategies, and the social and economic implications of energy policies, including climate change mitigation. It is complemented by the open-access journal Advances in Applied Energy.