Reinforcement Learning Building Control: An Online Approach with Guided Exploration using Surrogate Models

Sourav Dey, Gregor Henze
ASME Journal of Engineering for Sustainable Buildings and Cities, published 2024-02-23
DOI: 10.1115/1.4064842 (https://doi.org/10.1115/1.4064842)

Abstract

With the incorporation of emerging technologies in buildings, including solar photovoltaics, electric vehicles, battery energy storage, smart devices, internet-of-things (IoT) devices, and sensors, desirable control objectives are becoming increasingly complex, calling for advanced control approaches. Reinforcement learning (RL) is a powerful method for this purpose because it can adapt and learn from interaction with the environment, but it can take a long time to learn and can be unstable initially due to limited knowledge of the environment. In this research, we propose an online RL approach for buildings in which data-driven surrogate models guide the RL agent during its early exploratory training stage. This guidance helps the controller learn a near-optimal policy faster, and with more stable training progress, than a traditional direct plug-and-learn online RL approach. The agents are assisted in their learning and action selection by information from the surrogate models, which generate multiple artificial trajectories starting from the current state. The research explores various surrogate model-assisted training methods and finds that models focusing on artificial trajectories around rule-based controls yield the most stable performance. In contrast, models employing random exploration with a one-step look-ahead approach demonstrate superior overall performance.
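The guided-exploration idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the surrogate dynamics, the reward, the coefficients, and every name here (`surrogate_step`, `reward`, `guided_action`, `baseline`, `spread`) are hypothetical and chosen only for illustration. The sketch samples candidate actions, either uniformly at random or around a rule-based action, rolls each one step forward through a toy surrogate model, and returns the candidate with the best predicted reward.

```python
import random


def surrogate_step(temp, action, outdoor=30.0):
    """Hypothetical one-step surrogate: predicted next zone temperature (deg C).

    `action` is a cooling power fraction in [0, 1]. The coefficients are
    made up for illustration and are not fitted to any building data.
    """
    return temp + 0.1 * (outdoor - temp) - 2.0 * action


def reward(temp, action, setpoint=24.0):
    """Hypothetical reward: penalize comfort deviation and energy use."""
    return -abs(temp - setpoint) - 0.5 * action


def guided_action(temp, n_candidates=10, baseline=None, spread=0.1, rng=None):
    """Pick an exploratory action using a one-step look-ahead on the surrogate.

    If `baseline` is None, candidates are sampled uniformly in [0, 1]
    (random exploration). Otherwise they are sampled around the given
    rule-based action and clipped to [0, 1].
    """
    rng = rng or random.Random()
    if baseline is None:
        candidates = [rng.random() for _ in range(n_candidates)]
    else:
        candidates = [
            min(1.0, max(0.0, rng.gauss(baseline, spread)))
            for _ in range(n_candidates)
        ]
    # Score each candidate by the reward of its predicted next state.
    return max(candidates, key=lambda a: reward(surrogate_step(temp, a), a))


if __name__ == "__main__":
    rng = random.Random(0)
    print("random look-ahead:", guided_action(28.0, rng=rng))
    print("around rule-based:", guided_action(28.0, baseline=0.6, rng=rng))
```

In an online setting, an action chosen this way would be applied to the real building during early training, with the surrogate's influence reduced as the agent gathers its own experience; how that handover is scheduled is not specified in the abstract and is left out of the sketch.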