使mapreduce调度在擦除编码存储集群中有效

Runhui Li, P. Lee
{"title":"使mapreduce调度在擦除编码存储集群中有效","authors":"Runhui Li, P. Lee","doi":"10.1109/LANMAN.2015.7114730","DOIUrl":null,"url":null,"abstract":"With the explosive growth of data, enterprises increasingly adopt erasure coding on storage clusters to save storage space. On the other hand, erasure coding incurs higher performance overhead, especially during recovery. This motivates us to study the feasibility of alleviating performance overhead of erasure coding, while maintaining its storage efficiency advantage. In this paper, we study the performance issue of MapReduce when it runs on erasure-coded storage. We first review our previously proposed degraded-first scheduling, which avoids network bandwidth competition among degraded map tasks in failure mode, and hence improves the MapReduce performance over the default locality-first scheduling in MapReduce. We then show that the basic degraded-first scheduling may not work effectively when there are multiple running MapReduce jobs, and hence we propose heuristics to enhance the degraded-first scheduling design. Simulations demonstrate the performance gain of our enhanced degraded-first scheduling in a multi-job scenario. Our work makes a case that a new design of MapReduce scheduling is critical when we move to erasure-coded storage.","PeriodicalId":193630,"journal":{"name":"The 21st IEEE International Workshop on Local and Metropolitan Area Networks","volume":"54 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Making mapreduce scheduling effective in erasure-coded storage clusters\",\"authors\":\"Runhui Li, P. Lee\",\"doi\":\"10.1109/LANMAN.2015.7114730\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With the explosive growth of data, enterprises increasingly adopt erasure coding on storage clusters to save storage space. On the other hand, erasure coding incurs higher performance overhead, especially during recovery. This motivates us to study the feasibility of alleviating performance overhead of erasure coding, while maintaining its storage efficiency advantage. In this paper, we study the performance issue of MapReduce when it runs on erasure-coded storage. We first review our previously proposed degraded-first scheduling, which avoids network bandwidth competition among degraded map tasks in failure mode, and hence improves the MapReduce performance over the default locality-first scheduling in MapReduce. We then show that the basic degraded-first scheduling may not work effectively when there are multiple running MapReduce jobs, and hence we propose heuristics to enhance the degraded-first scheduling design. Simulations demonstrate the performance gain of our enhanced degraded-first scheduling in a multi-job scenario. Our work makes a case that a new design of MapReduce scheduling is critical when we move to erasure-coded storage.\",\"PeriodicalId\":193630,\"journal\":{\"name\":\"The 21st IEEE International Workshop on Local and Metropolitan Area Networks\",\"volume\":\"54 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-04-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"The 21st IEEE International Workshop on Local and Metropolitan Area Networks\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/LANMAN.2015.7114730\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"The 21st IEEE International Workshop on Local and Metropolitan Area Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/LANMAN.2015.7114730","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

随着数据量的爆炸式增长,企业越来越多地在存储集群上采用擦除编码来节省存储空间。另一方面,擦除编码会带来更高的性能开销,特别是在恢复期间。这促使我们研究在保持擦除编码的存储效率优势的同时减轻其性能开销的可行性。在本文中,我们研究了MapReduce在擦除编码存储上运行时的性能问题。我们首先回顾了我们之前提出的退化优先调度,它避免了退化映射任务在故障模式下的网络带宽竞争,从而提高了MapReduce中默认的位置优先调度的性能。然后我们表明,当有多个正在运行的MapReduce作业时,基本的退化优先调度可能无法有效地工作,因此我们提出了启发式方法来增强退化优先调度设计。仿真结果表明,在多作业场景下,增强的退化优先调度的性能提高。我们的工作表明,当我们转向擦除编码存储时,MapReduce调度的新设计是至关重要的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Making mapreduce scheduling effective in erasure-coded storage clusters
With the explosive growth of data, enterprises increasingly adopt erasure coding on storage clusters to save storage space. On the other hand, erasure coding incurs higher performance overhead, especially during recovery. This motivates us to study the feasibility of alleviating performance overhead of erasure coding, while maintaining its storage efficiency advantage. In this paper, we study the performance issue of MapReduce when it runs on erasure-coded storage. We first review our previously proposed degraded-first scheduling, which avoids network bandwidth competition among degraded map tasks in failure mode, and hence improves the MapReduce performance over the default locality-first scheduling in MapReduce. We then show that the basic degraded-first scheduling may not work effectively when there are multiple running MapReduce jobs, and hence we propose heuristics to enhance the degraded-first scheduling design. Simulations demonstrate the performance gain of our enhanced degraded-first scheduling in a multi-job scenario. Our work makes a case that a new design of MapReduce scheduling is critical when we move to erasure-coded storage.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
A novel energy efficient cooperative spectrum sensing scheme for cognitive radio sensor network based on evolutionary game Bitcoin for smart trading in smart grid Scalable mobile backhauling via information-centric networking Virtual-single-cell wireless networks with 3G-LTE-based protocol and PON for backhaul network On exploiting white spaces in WiFi networks for opportunistic M2M communications
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1