Context-aware reinforcement learning for cooling operation of data centers with an Aquifer Thermal Energy Storage

IF 9.6 | Q1 | COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Energy and AI | Pub Date: 2024-07-05 | DOI: 10.1016/j.egyai.2024.100395
Lukas Leindals, Peter Grønning, Dominik Franjo Dominković, Rune Grønborg Junker
{"title":"使用含水层热能存储的数据中心冷却运行的情境感知强化学习","authors":"Lukas Leindals,&nbsp;Peter Grønning,&nbsp;Dominik Franjo Dominković,&nbsp;Rune Grønborg Junker","doi":"10.1016/j.egyai.2024.100395","DOIUrl":null,"url":null,"abstract":"<div><p>Data centers are often equipped with multiple cooling units. Here, an aquifer thermal energy storage (ATES) system has shown to be efficient. However, the usage of hot and cold-water wells in the ATES must be balanced for legal and environmental reasons. Reinforcement Learning has been proven to be a useful tool for optimizing the cooling operation at data centers. Nonetheless, since cooling demand changes continuously, balancing the ATES usage on a yearly basis imposes an additional challenge in the form of a delayed reward. To overcome this, we formulate a return decomposition, Cool-RUDDER, which relies on simple domain knowledge and needs no training. We trained a proximal policy optimization agent to keep server temperatures steady while minimizing operational costs. Comparing the Cool-RUDDER reward signal to other ATES-associated rewards, all models kept the server temperatures steady at around 30 °C. An optimal ATES balance was defined to be 0% and a yearly imbalance of −4.9% with a confidence interval of [−6.2, −3.8]% was achieved for the Cool 2.0 reward. This outperformed a baseline ATES-associated reward of 0 at −16.3% with a confidence interval of [−17.1, −15.4]% and all other ATES-associated rewards. However, the improved ATES balance comes with a higher energy consumption cost of 12.5% when comparing the relative cost of the Cool 2.0 reward to the zero reward, resulting in a trade-off. Moreover, the method comes with limited requirements and is applicable to any long-term problem satisfying a linear state-transition system.</p></div>","PeriodicalId":34138,"journal":{"name":"Energy and AI","volume":"17 ","pages":"Article 100395"},"PeriodicalIF":9.6000,"publicationDate":"2024-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666546824000612/pdfft?md5=b17bfa78652179749ed19203f3f51d82&pid=1-s2.0-S2666546824000612-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Context-aware reinforcement learning for cooling operation of data centers with an Aquifer Thermal Energy Storage\",\"authors\":\"Lukas Leindals,&nbsp;Peter Grønning,&nbsp;Dominik Franjo Dominković,&nbsp;Rune Grønborg Junker\",\"doi\":\"10.1016/j.egyai.2024.100395\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Data centers are often equipped with multiple cooling units. Here, an aquifer thermal energy storage (ATES) system has shown to be efficient. However, the usage of hot and cold-water wells in the ATES must be balanced for legal and environmental reasons. Reinforcement Learning has been proven to be a useful tool for optimizing the cooling operation at data centers. Nonetheless, since cooling demand changes continuously, balancing the ATES usage on a yearly basis imposes an additional challenge in the form of a delayed reward. To overcome this, we formulate a return decomposition, Cool-RUDDER, which relies on simple domain knowledge and needs no training. We trained a proximal policy optimization agent to keep server temperatures steady while minimizing operational costs. Comparing the Cool-RUDDER reward signal to other ATES-associated rewards, all models kept the server temperatures steady at around 30 °C. 
An optimal ATES balance was defined to be 0% and a yearly imbalance of −4.9% with a confidence interval of [−6.2, −3.8]% was achieved for the Cool 2.0 reward. This outperformed a baseline ATES-associated reward of 0 at −16.3% with a confidence interval of [−17.1, −15.4]% and all other ATES-associated rewards. However, the improved ATES balance comes with a higher energy consumption cost of 12.5% when comparing the relative cost of the Cool 2.0 reward to the zero reward, resulting in a trade-off. Moreover, the method comes with limited requirements and is applicable to any long-term problem satisfying a linear state-transition system.</p></div>\",\"PeriodicalId\":34138,\"journal\":{\"name\":\"Energy and AI\",\"volume\":\"17 \",\"pages\":\"Article 100395\"},\"PeriodicalIF\":9.6000,\"publicationDate\":\"2024-07-05\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2666546824000612/pdfft?md5=b17bfa78652179749ed19203f3f51d82&pid=1-s2.0-S2666546824000612-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Energy and AI\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2666546824000612\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Energy and AI","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666546824000612","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Data centers are often equipped with multiple cooling units. Here, an aquifer thermal energy storage (ATES) system has been shown to be efficient. However, the usage of hot and cold-water wells in the ATES must be balanced for legal and environmental reasons. Reinforcement Learning has been proven to be a useful tool for optimizing the cooling operation at data centers. Nonetheless, since cooling demand changes continuously, balancing the ATES usage on a yearly basis imposes an additional challenge in the form of a delayed reward. To overcome this, we formulate a return decomposition, Cool-RUDDER, which relies on simple domain knowledge and needs no training. We trained a proximal policy optimization agent to keep server temperatures steady while minimizing operational costs. Comparing the Cool-RUDDER reward signal to other ATES-associated rewards, all models kept the server temperatures steady at around 30 °C. An optimal ATES balance was defined to be 0% and a yearly imbalance of −4.9% with a confidence interval of [−6.2, −3.8]% was achieved for the Cool 2.0 reward. This outperformed a baseline ATES-associated reward of 0 at −16.3% with a confidence interval of [−17.1, −15.4]% and all other ATES-associated rewards. However, the improved ATES balance comes with a higher energy consumption cost of 12.5% when comparing the relative cost of the Cool 2.0 reward to the zero reward, resulting in a trade-off. Moreover, the method comes with limited requirements and is applicable to any long-term problem satisfying a linear state-transition system.
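The abstract does not spell out the Cool-RUDDER decomposition itself, but the core idea it names, redistributing a delayed end-of-year ATES balance reward over individual time steps using only per-step well usage, can be illustrated with a small sketch. The Python snippet below is an assumption-laden toy: the function names (`yearly_imbalance`, `redistribute_imbalance_penalty`), the signed-fraction imbalance definition, and the proportional redistribution rule are hypothetical stand-ins, not the formulation from the paper.

```python
import numpy as np

def yearly_imbalance(hot_mwh, cold_mwh):
    """Signed yearly ATES imbalance: 0 when hot- and cold-well usage match.
    Hypothetical definition; the paper's exact formula is not given in the abstract."""
    total = hot_mwh.sum() + cold_mwh.sum()
    return 0.0 if total == 0 else (hot_mwh.sum() - cold_mwh.sum()) / total

def redistribute_imbalance_penalty(hot_mwh, cold_mwh, penalty_scale=1.0):
    """Toy return decomposition: spread the delayed end-of-year imbalance penalty
    over individual time steps, giving each step a share proportional to how much
    it moved the running balance away from zero. Uses only per-step well usage
    (simple domain knowledge), no learned credit-assignment model."""
    net = hot_mwh - cold_mwh                       # signed per-step well usage
    running = np.cumsum(net)                       # running (unnormalised) balance
    prev = np.concatenate(([0.0], running[:-1]))
    worsening = np.clip(np.abs(running) - np.abs(prev), 0.0, None)
    final_penalty = -penalty_scale * abs(yearly_imbalance(hot_mwh, cold_mwh))
    if worsening.sum() == 0.0:                     # perfectly balanced trajectory
        return np.full_like(net, final_penalty / len(net))
    return final_penalty * worsening / worsening.sum()  # shares sum to the delayed penalty

# Example: one year of hourly operation with slightly hot-skewed well usage.
rng = np.random.default_rng(seed=0)
hot = rng.uniform(0.0, 1.00, size=8760)
cold = rng.uniform(0.0, 0.95, size=8760)
per_step = redistribute_imbalance_penalty(hot, cold)
print(f"yearly imbalance: {yearly_imbalance(hot, cold):+.3f}, "
      f"sum of per-step rewards: {per_step.sum():+.3f}")
```

In the paper, a balance-shaping signal of this kind would be combined with temperature-tracking and cost terms and fed to the proximal policy optimization agent; the sketch covers only the delayed-balance part and needs no training, mirroring the abstract's claim that the decomposition relies on simple domain knowledge.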


Source journal: Energy and AI (Engineering, miscellaneous)
CiteScore: 16.50
Self-citation rate: 0.00%
Articles published: 64
Review time: 56 days
Latest articles from this journal:
Neural network potential-based molecular investigation of thermal decomposition mechanisms of ethylene and ammonia
Machine learning for battery quality classification and lifetime prediction using formation data
Enhancing PV feed-in power forecasting through federated learning with differential privacy using LSTM and GRU
Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems
Decentralized coordination of distributed energy resources through local energy markets and deep reinforcement learning