Energy proportionality and workload consolidation for latency-critical applications

G. Prekas, Mia Primorac, A. Belay, C. Kozyrakis, Edouard Bugnion
DOI: 10.1145/2806777.2806848
Published in: Proceedings of the Sixth ACM Symposium on Cloud Computing
Publication date: 2015-08-27
URL: https://doi.org/10.1145/2806777.2806848
Citations: 71

Abstract

Energy proportionality and workload consolidation are important objectives for increasing efficiency in large-scale datacenters. Our work focuses on achieving these goals in the presence of applications with μs-scale tail latency requirements. Such applications represent a growing subset of datacenter workloads and are typically deployed on dedicated servers, which is the simplest way to ensure low tail latency across all loads. Unfortunately, this also leads to low energy efficiency and low resource utilization during the frequent periods of medium or low load. We present the OS mechanisms and dynamic control needed to adjust core allocation and voltage/frequency settings based on the measured delays for latency-critical workloads. This allows for energy proportionality and frees the maximum amount of resources per server for other background applications, while respecting service-level objectives. Monitoring hardware queue depths allows us to detect increases in queuing latencies. Carefully coordinated adjustments to the NIC's packet redirection table enable us to reassign flow groups between the threads of a latency-critical application in milliseconds without dropping or reordering packets. We compare the efficiency of our solution to the Pareto-optimal frontier of 224 distinct static configurations. Dynamic resource control saves 44%–54% of processor energy, which corresponds to 85%–93% of the Pareto-optimal upper bound. Dynamic resource control also allows background jobs to run at 32%–46% of their standalone throughput, which corresponds to 82%–92% of the Pareto bound.
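
To make the baseline comparison concrete, the following is a minimal sketch of how a Pareto frontier over static configurations could be computed. It is an illustration only, not the authors' tooling: the `Config` fields, the function names `pareto_frontier` and `cheapest_for_load`, and every number in the sample data are hypothetical and are not taken from the paper, whose 224 measured configurations are not reproduced here. The sketch assumes each static configuration is summarized by the highest load it sustains within the service-level objective and by the processor power it draws.

```python
from typing import List, NamedTuple, Optional


class Config(NamedTuple):
    """One static configuration (all fields are illustrative assumptions)."""
    cores: int         # cores given to the latency-critical application
    freq_ghz: float    # processor frequency setting
    max_load: float    # highest load (requests/s) served within the SLO
    power_w: float     # processor power drawn at this setting


def pareto_frontier(configs: List[Config]) -> List[Config]:
    """Return the configurations that no other configuration dominates.

    A configuration is dominated if another one sustains at least as much
    load while drawing no more power. Exact duplicates are kept only once.
    """
    # Cheapest first; among equal power, the higher-capacity one first.
    ordered = sorted(configs, key=lambda c: (c.power_w, -c.max_load))
    frontier: List[Config] = []
    best_load = float("-inf")
    for c in ordered:
        # Keep c only if it serves more load than every cheaper option.
        if c.max_load > best_load:
            frontier.append(c)
            best_load = c.max_load
    return frontier


def cheapest_for_load(frontier: List[Config], load: float) -> Optional[Config]:
    """Pick the lowest-power frontier configuration that can serve `load`."""
    feasible = [c for c in frontier if c.max_load >= load]
    return min(feasible, key=lambda c: c.power_w) if feasible else None


if __name__ == "__main__":
    # Synthetic sample data, invented purely for illustration.
    sample = [
        Config(1, 1.2, 100_000, 20.0),
        Config(1, 2.4, 180_000, 32.0),
        Config(2, 1.2, 190_000, 30.0),
        Config(2, 2.4, 350_000, 50.0),
        Config(4, 1.2, 340_000, 52.0),
    ]
    frontier = pareto_frontier(sample)
    for c in frontier:
        print(c)
    print(cheapest_for_load(frontier, 150_000))
```

For a given load, the frontier gives the lowest power any static setting could achieve while meeting the objective, which is the kind of bound a dynamic controller's energy savings can be measured against.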