Differential Approximation and Sprinting for Multi-Priority Big Data Engines

R. Birke, Isabelly Rocha, Juan F. Pérez, V. Schiavoni, P. Felber, L. Chen
Published in: Proceedings of the 20th International Middleware Conference, 2019-09-12
DOI: 10.1145/3361525.3361547 (https://doi.org/10.1145/3361525.3361547)
Citation count: 2

Abstract

Today's big data clusters based on the MapReduce paradigm can execute analysis jobs with multiple priorities, providing differential latency guarantees. Traces from production systems show that the latency advantage of high-priority jobs comes at the cost of severe latency degradation for low-priority jobs, as well as daunting resource waste caused by the repeated eviction and re-execution of low-priority jobs. We advocate a new resource management design that exploits the idea of differential approximation and sprinting. The unique combination of approximation and sprinting avoids the eviction of low-priority jobs and the consequent latency degradation and resource waste. To this end, we designed, implemented, and evaluated DiAS, an extension of the Spark processing engine that supports deflating jobs by dropping tasks and sprinting jobs. Our experiments on scenarios with two and three priority classes indicate that DiAS achieves up to 90% and 60% latency reduction for low- and high-priority jobs, respectively. DiAS not only eliminates resource waste but also (surprisingly) lowers energy consumption by up to 30%, at only a marginal accuracy loss for low-priority jobs.
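
To make the idea of deflating a job by dropping tasks more concrete, the following is a minimal sketch in plain Python. It is not the DiAS implementation and does not use the Spark API: the function name run_deflated, the drop_ratio parameter, and the choice of a sum aggregate are all assumptions made for illustration. The sketch runs only a random subset of a job's tasks and scales the partial result to compensate for the dropped ones. Sprinting is not modeled here.

```python
import random


def run_deflated(partitions, drop_ratio, seed=0):
    """Estimate the sum over all partitions while skipping a fraction of tasks.

    Hypothetical illustration: each partition stands for one task, and
    drop_ratio is the fraction of tasks that are dropped instead of executed.
    Returns (estimated_sum, number_of_tasks_executed).
    """
    if not 0.0 <= drop_ratio < 1.0:
        raise ValueError("drop_ratio must be in [0, 1)")
    n_total = len(partitions)
    if n_total == 0:
        return 0.0, 0

    # Always execute at least one task so that an estimate exists.
    n_keep = max(1, round(n_total * (1.0 - drop_ratio)))

    # Choose the surviving tasks uniformly at random (seeded for repeatability).
    rng = random.Random(seed)
    kept = rng.sample(range(n_total), n_keep)

    # Execute only the surviving tasks.
    partial = sum(sum(partitions[i]) for i in kept)

    # Scale the partial result up to account for the dropped tasks.
    return partial * n_total / n_keep, n_keep


if __name__ == "__main__":
    data = [[1] * 10 for _ in range(20)]  # 20 tasks with 10 records each
    estimate, executed = run_deflated(data, drop_ratio=0.25)
    print(f"executed {executed} of {len(data)} tasks, estimate = {estimate}")
```

In this toy setup every partition has the same sum, so the scaled estimate matches the exact answer. With skewed data the estimate would deviate from the exact sum, which corresponds to the accuracy loss that the abstract describes as marginal for low-priority jobs.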