Human-in-the-loop control strategy for IoT-based smart thermostats with Deep Reinforcement Learning

Energy and AI | IF 9.6 | Q1 (Computer Science, Artificial Intelligence) | Pub Date: 2025-03-15 | DOI: 10.1016/j.egyai.2025.100490

Payam Fatehi Karjou, Fabian Stupperich, Phillip Stoffel, Dirk Müller

Energy and AI, Volume 20, Article 100490. https://www.sciencedirect.com/science/article/pii/S2666546825000229

Citation count: 0

Abstract

Thermostatic Radiator Valves (TRVs) are a widely used technology for regulating room heating in European countries. Smart TRVs can provide significant energy savings, often in the range of 20–40% compared to conventional heating systems. They use sensors and algorithms to learn user behavior and optimize heating schedules accordingly. They can often be easily retrofitted to existing heating systems, making them a practical option for enhancing energy efficiency in existing buildings, especially office buildings, whose operational patterns are highly dynamic. This work presents a novel human-in-the-loop control strategy for Internet of Things (IoT)-based TRVs using Deep Reinforcement Learning (DRL). A key focus of this research is enhancing the adaptability of the agents’ behavior by implementing a more generic and flexible Markov Decision Process (MDP) that promotes policy generalization across diverse scenarios. The study explores the challenges of transferring control behaviors from simulation environments to real-world settings, examines performance across different thermal zones, and evaluates the integration flexibility of the control strategy within building systems. Real-world occupant behavior, including dynamic comfort preferences and occupancy predictions, is incorporated to better align thermostat operation with user preferences. Furthermore, this paper discusses the practical challenges encountered during implementation, including battery consumption of IoT devices, integration of occupancy detection and prediction systems, and maintenance requirements. By addressing these issues, the proposed control strategy seeks to improve the scalability and feasibility of IoT-based TRVs, thereby providing a viable solution for their widespread deployment in buildings.
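The abstract does not spell out the MDP, so the following is only a minimal sketch of what a setpoint-relative formulation for a single radiator valve could look like. It is not the authors' model. The class `ToyTRVEnv`, the first-order thermal dynamics, the reward weights, and all numeric constants are assumptions made for illustration. The observation holds the room temperature relative to the occupant's current setpoint, the outdoor temperature, and a short occupancy forecast. The action is a valve opening between 0 and 1. The reward penalizes discomfort only while the room is occupied, plus a small cost for heating.

```python
from dataclasses import dataclass


@dataclass
class ToyTRVEnv:
    """Toy single-zone radiator environment (illustrative, not the paper's model)."""

    dt_h: float = 0.25          # step length in hours (assumed)
    loss_coeff: float = 0.10    # heat loss per hour per kelvin of indoor/outdoor gap (assumed)
    heat_gain: float = 3.0      # kelvin per hour delivered at a fully open valve (assumed)
    energy_weight: float = 0.1  # reward weight on valve opening (assumed)
    horizon: int = 4            # occupancy forecast length in steps (assumed)

    def reset(self, room_temp, outdoor_temp, setpoint, occupancy):
        self.room_temp = room_temp
        self.outdoor_temp = outdoor_temp
        self.setpoint = setpoint          # occupant's current comfort preference
        self.occupancy = list(occupancy)  # 0/1 schedule, one entry per step
        self.t = 0
        return self._obs()

    def override_setpoint(self, new_setpoint):
        """Human-in-the-loop hook: the occupant changes the preferred temperature."""
        self.setpoint = new_setpoint

    def _obs(self):
        # Pad the forecast with zeros once the schedule runs out.
        forecast = self.occupancy[self.t:self.t + self.horizon]
        forecast += [0] * (self.horizon - len(forecast))
        # Temperature is expressed relative to the setpoint, so the same
        # observation layout applies to rooms with different preferences.
        return (self.room_temp - self.setpoint, self.outdoor_temp, *forecast)

    def step(self, valve):
        valve = min(1.0, max(0.0, valve))  # clip the action to [0, 1]
        occupied = self.occupancy[self.t]
        heating = valve * self.heat_gain
        losses = self.loss_coeff * (self.room_temp - self.outdoor_temp)
        self.room_temp += (heating - losses) * self.dt_h
        # Discomfort counts only while someone is in the room.
        discomfort = abs(self.room_temp - self.setpoint) if occupied else 0.0
        reward = -discomfort - self.energy_weight * valve
        self.t += 1
        done = self.t >= len(self.occupancy)
        return self._obs(), reward, done


if __name__ == "__main__":
    env = ToyTRVEnv()
    obs = env.reset(room_temp=20.0, outdoor_temp=5.0, setpoint=21.0,
                    occupancy=[0, 0, 1, 1, 1, 0])
    done = False
    while not done:
        # Placeholder policy: open the valve if occupancy is current or imminent.
        valve = 1.0 if any(obs[2:4]) else 0.0
        obs, reward, done = env.step(valve)
        print(f"valve={valve:.1f} offset={obs[0]:+.2f} K reward={reward:+.3f}")
```

A DRL agent would replace the placeholder policy in the loop. Calling `override_setpoint` mid-episode mimics an occupant adjusting the thermostat, which shifts the first observation component without changing its meaning.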
