Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems

Energy and AI | IF: 9.6 | Q1 (Computer Science, Artificial Intelligence) | Pub Date: 2024-11-22 | DOI: 10.1016/j.egyai.2024.100448
Julian Ruddick, Glenn Ceusters, Gilles Van Kriekinge, Evgenii Genov, Cedric De Cauwer, Thierry Coosemans, Maarten Messagie
{"title":"Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems","authors":"Julian Ruddick ,&nbsp;Glenn Ceusters ,&nbsp;Gilles Van Kriekinge ,&nbsp;Evgenii Genov ,&nbsp;Cedric De Cauwer ,&nbsp;Thierry Coosemans ,&nbsp;Maarten Messagie","doi":"10.1016/j.egyai.2024.100448","DOIUrl":null,"url":null,"abstract":"<div><div>Recent advancements in machine learning based energy management approaches, specifically reinforcement learning with a safety layer (<span>OptLayerPolicy</span>) and a metaheuristic algorithm generating a decision tree control policy (<span>TreeC</span>), have shown promise. However, their effectiveness has only been demonstrated in computer simulations. This paper presents the real-world validation of these methods, comparing them against model predictive control and simple rule-based control benchmarks. The experiments were conducted on the electrical installation of four reproductions of residential houses, each with its own battery, photovoltaic, and dynamic load system emulating a non-controllable electrical load and a controllable electric vehicle charger. The results show that the simple rules, <span>TreeC</span>, and model predictive control-based methods achieved similar costs, with a difference of only 0.6%. The reinforcement learning based method, still in its training phase, obtained a cost 25.5% higher to the other methods. Additional simulations show that the costs can be further reduced by using a more representative training dataset for <span>TreeC</span> and addressing errors in the model predictive control implementation caused by its reliance on accurate data from various sources. The <span>OptLayerPolicy</span> safety layer allows safe online training of a reinforcement learning agent in the real world, given an accurate constraint function formulation. The proposed safety layer method remains error-prone; nonetheless, it has been found beneficial for all investigated methods. The <span>TreeC</span> method, which does require building a realistic simulation for training, exhibits the safest operational performance, exceeding the grid limit by only 27.1 Wh compared to 593.9 Wh for reinforcement learning.</div></div>","PeriodicalId":34138,"journal":{"name":"Energy and AI","volume":"18 ","pages":"Article 100448"},"PeriodicalIF":9.6000,"publicationDate":"2024-11-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Energy and AI","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666546824001149","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Recent advancements in machine learning-based energy management approaches, specifically reinforcement learning with a safety layer (OptLayerPolicy) and a metaheuristic algorithm generating a decision tree control policy (TreeC), have shown promise. However, their effectiveness has only been demonstrated in computer simulations. This paper presents the real-world validation of these methods, comparing them against model predictive control and simple rule-based control benchmarks. The experiments were conducted on the electrical installation of four reproductions of residential houses, each with its own battery, photovoltaic system, and dynamic load system emulating a non-controllable electrical load and a controllable electric vehicle charger. The results show that the simple rule-based, TreeC, and model predictive control methods achieved similar costs, with a difference of only 0.6%. The reinforcement learning-based method, still in its training phase, obtained a cost 25.5% higher than the other methods. Additional simulations show that the costs can be further reduced by using a more representative training dataset for TreeC and by addressing errors in the model predictive control implementation caused by its reliance on accurate data from various sources. The OptLayerPolicy safety layer allows safe online training of a reinforcement learning agent in the real world, given an accurate constraint function formulation. The proposed safety layer method remains error-prone; nonetheless, it has been found beneficial for all investigated methods. The TreeC method, which does require building a realistic simulation for training, exhibits the safest operational performance, exceeding the grid limit by only 27.1 Wh compared to 593.9 Wh for reinforcement learning.
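To make the safety-layer idea concrete: OptLayerPolicy-style approaches intercept the agent's proposed action and project it onto a constraint-satisfying set before execution. The abstract does not include code, so the following is a minimal, hypothetical Python sketch of such a projection for a single home battery under a grid-import limit; the function name, parameters, and the simple clipping logic are illustrative assumptions, not the authors' OptLayerPolicy implementation (which, per the abstract, depends on an accurate constraint function formulation).

```python
# Hypothetical safety-layer projection for a home battery setpoint,
# illustrating the general idea behind OptLayerPolicy-style constrained
# action selection. All names and numbers here are illustrative
# assumptions, not the paper's implementation.

def safe_battery_setpoint(proposed_w: float,
                          soc_wh: float,
                          capacity_wh: float,
                          load_w: float,
                          pv_w: float,
                          grid_limit_w: float,
                          dt_h: float = 0.25) -> float:
    """Clip a proposed battery power (W, positive = charging) so that
    state of charge and the grid-import limit hold over one control
    step of dt_h hours."""
    # State-of-charge bounds on charge/discharge power for this step.
    max_charge_w = (capacity_wh - soc_wh) / dt_h
    max_discharge_w = -soc_wh / dt_h
    # Grid import = household load - PV + battery charging power;
    # charging beyond this bound would exceed the grid limit.
    grid_bound_w = grid_limit_w - (load_w - pv_w)
    upper = min(max_charge_w, grid_bound_w)
    # If the constraints conflict (e.g. load alone exceeds the grid
    # limit), fall back to maximum discharge instead of failing.
    upper = max(upper, max_discharge_w)
    return min(max(proposed_w, max_discharge_w), upper)


# Example: the agent proposes 5 kW of charging, but the grid limit
# leaves room for only 2.5 kW, so the safe action is 2500 W.
print(safe_battery_setpoint(proposed_w=5000, soc_wh=4000,
                            capacity_wh=10000, load_w=1000,
                            pv_w=1000, grid_limit_w=2500))
```

A real safety layer would typically solve a small optimization to find the feasible action closest to the proposal across all constraints jointly; the element-wise clipping above is the simplest special case for a single actuator.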

Source Journal

Energy and AI
Category: Engineering (miscellaneous)
CiteScore: 16.50
Self-citation rate: 0.00%
Annual articles: 64
Review time: 56 days
Latest Articles in This Journal

Neural network potential-based molecular investigation of thermal decomposition mechanisms of ethylene and ammonia
Machine learning for battery quality classification and lifetime prediction using formation data
Enhancing PV feed-in power forecasting through federated learning with differential privacy using LSTM and GRU
Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems
Decentralized coordination of distributed energy resources through local energy markets and deep reinforcement learning