Parity-Only Caching for Robust Straggler Tolerance

Mi Zhang, Qiuping Wang, Zhirong Shen, P. Lee
{"title":"Parity-Only Caching for Robust Straggler Tolerance","authors":"Mi Zhang, Qiuping Wang, Zhirong Shen, P. Lee","doi":"10.1109/MSST.2019.00006","DOIUrl":null,"url":null,"abstract":"Stragglers (i.e., nodes with slow performance) are prevalent and incur performance instability in large-scale storage systems, yet it is challenging to detect stragglers in practice. We make a case by showing how erasure-coded caching provides robust straggler tolerance without relying on timely and accurate straggler detection, while incurring limited redundancy overhead in caching. We first analytically motivate that caching only parity blocks can achieve effective straggler tolerance. To this end, we present POCache, a parity-only caching design that provides robust straggler tolerance. To limit the erasure coding overhead, POCache slices blocks into smaller subblocks and parallelizes the coding operations at the subblock level. Also, it leverages a straggler-aware cache algorithm that takes into account both file access popularity and straggler estimation to decide which parity blocks should be cached. We implement a POCache prototype atop Hadoop 3.1 HDFS, while preserving the performance and functionalities of normal HDFS operations. Our extensive experiments on both local and Amazon EC2 clusters show that in the presence of stragglers, POCache can reduce the read latency by up to 87.9% compared to vanilla HDFS.","PeriodicalId":391517,"journal":{"name":"2019 35th Symposium on Mass Storage Systems and Technologies (MSST)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 35th Symposium on Mass Storage Systems and Technologies (MSST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MSST.2019.00006","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

Stragglers (i.e., nodes with slow performance) are prevalent and incur performance instability in large-scale storage systems, yet it is challenging to detect stragglers in practice. We make a case by showing how erasure-coded caching provides robust straggler tolerance without relying on timely and accurate straggler detection, while incurring limited redundancy overhead in caching. We first analytically motivate that caching only parity blocks can achieve effective straggler tolerance. To this end, we present POCache, a parity-only caching design that provides robust straggler tolerance. To limit the erasure coding overhead, POCache slices blocks into smaller subblocks and parallelizes the coding operations at the subblock level. Also, it leverages a straggler-aware cache algorithm that takes into account both file access popularity and straggler estimation to decide which parity blocks should be cached. We implement a POCache prototype atop Hadoop 3.1 HDFS, while preserving the performance and functionalities of normal HDFS operations. Our extensive experiments on both local and Amazon EC2 clusters show that in the presence of stragglers, POCache can reduce the read latency by up to 87.9% compared to vanilla HDFS.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
鲁棒掉队容忍度的纯奇偶缓存
在大规模存储系统中,散列节点(即性能较低的节点)普遍存在,导致性能不稳定,但在实际应用中,对散列节点的检测是一个挑战。我们通过展示擦除编码缓存如何在不依赖于及时和准确的掉队检测的情况下提供健壮的掉队容忍度,同时在缓存中产生有限的冗余开销来说明这一点。我们首先分析了缓存奇偶校验块可以实现有效的掉队容忍度。为此,我们提出了POCache,这是一种仅限奇偶校验的缓存设计,提供了健壮的掉队容忍度。为了限制擦除编码开销,POCache将块分割成更小的子块,并在子块级别并行化编码操作。此外,它还利用了一个能够感知散列器的缓存算法,该算法同时考虑了文件访问的流行程度和散列器的估计,以决定应该缓存哪些奇偶校验块。我们在Hadoop 3.1 HDFS上实现了一个POCache原型,同时保留了正常HDFS操作的性能和功能。我们在本地和Amazon EC2集群上进行的大量实验表明,在存在离散者的情况下,与普通HDFS相比,POCache可以减少高达87.9%的读延迟。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Mitigate HDD Fail-Slow by Pro-actively Utilizing System-level Data Redundancy with Enhanced HDD Controllability and Observability Fighting with Unknowns: Estimating the Performance of Scalable Distributed Storage Systems with Minimal Measurement Data Towards Virtual Machine Image Management for Persistent Memory CDAC: Content-Driven Deduplication-Aware Storage Cache vNVML: An Efficient User Space Library for Virtualizing and Sharing Non-Volatile Memories
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1