Empirical analysis of blower cooling failure in containment: Effects on IT performance

H. Alissa, K. Nemati, Udaya L. N. Puvvadi, B. Sammakia, K. Ghose, M. Seymour, Russell Tipton, Ken Schneebeli
Published in: 2016 15th IEEE Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm)
Publication date: 2016-05-01
DOI: 10.1109/ITHERM.2016.7517716 (https://doi.org/10.1109/ITHERM.2016.7517716)
Citations: 1

Abstract

Data centers are prone to power outages and cooling failures. During such events, complex transport interactions take place between the cooling system and the IT equipment. Empirical data on this phenomenon are scarce in the current literature because of the complexity and size of such experiments. In this study, a facility-level data center blower cooling failure experiment is run and analyzed. Quantitative instrumentation covers pressure differentials, tile airflow, point air inlet temperatures, air inlet temperature contours, and IT IPMI data during failure and recovery. Qualitative measurements include IR imaging and airflow visualization via smoke trace. To our knowledge, this is the first experimental study in the literature in which an actual multi-aisle facility cooling failure is run with real IT (compute, network, and storage) load in the white space. This enables variations at the facility level to be linked to those at the chip level. Results based on external air inlet temperature sensors show that the containment configuration has a longer uptime during failure. However, the IPMI data show the opposite. In fact, the RTT is reduced by ~70% when the external and internal sensors are compared. This occurs because external impedances formed by the containment during failure degrade the IT airflow systems. The inconsistency between IT IPMI inlet sensors and externally placed IT or rack inlet sensors (based on best practices) is expected to increase as airflow imbalances increase.