Can Non-volatile Memory Benefit MapReduce Applications on HPC Clusters?

Md. Wasi-ur-Rahman, Nusrat S. Islam, Xiaoyi Lu, D. Panda
Venue: 2016 1st Joint International Workshop on Parallel Data Storage and data Intensive Scalable Computing Systems (PDSW-DISCS)
Published: 2016-11-13
DOI: 10.1109/PDSW-DISCS.2016.7 (https://doi.org/10.1109/PDSW-DISCS.2016.7)
Citation count: 5

Abstract

Modern High-Performance Computing (HPC) clusters are equipped with advanced technological resources that must be used effectively to achieve the best performance for end applications. One such resource, Non-Volatile Memory (NVM), offers fast, scalable performance through its DRAM-like performance characteristics. Meanwhile, distributed processing engines such as MapReduce are continuously being enhanced with features that exploit high-performance technologies. In this paper, we present a novel MapReduce framework with an NVRAM-assisted map output spill approach. We have designed our framework on top of the existing RDMA-enhanced Hadoop MapReduce so that end applications benefit from performance enhancements in both the map and reduce phases. Our proposed approach significantly enhances map phase performance, as demonstrated by a wide variety of MapReduce benchmarks and workloads from the Intel HiBench [9] and PUMA [18] suites. Our performance evaluation shows that the NVRAM-based spill approach can improve map execution performance by 2.73x, which contributes to an overall execution improvement of 55% for Sort. Our design also delivers significant performance benefits for other workloads: 54% for TeraSort, 21% for PageRank, and 58% for SelfJoin, among others. To the best of our knowledge, this is the first approach towards leveraging NVRAM in MapReduce execution frameworks for applications on HPC clusters.
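
The abstract does not describe how the NVRAM-assisted spill is implemented, so the following is only an illustrative sketch of the general idea of a map output spill: buffered map output is sorted by partition and key, serialized, and written to a spill target. Everything here is an assumption made for illustration, not the authors' design: the function names (partition_of, encode_spill, spill_to_mapped_region, read_spill), the record layout, and the toy partitioner are hypothetical, and a memory-mapped ordinary file merely stands in for byte-addressable NVRAM.

```python
import mmap
import os
import struct
import tempfile


def partition_of(key: bytes, num_reducers: int) -> int:
    """Toy partitioner (hypothetical; not Hadoop's actual Partitioner)."""
    return sum(key) % num_reducers


def encode_spill(records, num_reducers):
    """Sort buffered map output by (partition, key) and serialize it.

    Assumed record layout: partition, key length, value length
    (three little-endian uint32 fields), then the key and value bytes.
    """
    ordered = sorted(
        records, key=lambda kv: (partition_of(kv[0], num_reducers), kv[0])
    )
    out = bytearray()
    for key, value in ordered:
        header = struct.pack(
            "<III", partition_of(key, num_reducers), len(key), len(value)
        )
        out += header + key + value
    return bytes(out)


def spill_to_mapped_region(payload: bytes, path: str) -> None:
    """Store a spill through a memory mapping instead of write() calls.

    The mapped file is only a stand-in for byte-addressable NVRAM.
    """
    # Size the backing file first; an empty file cannot be mapped.
    with open(path, "wb") as f:
        f.truncate(len(payload))
    if not payload:
        return
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), len(payload)) as region:
            region[:] = payload  # plain memory copy into the mapping
            region.flush()       # push the mapped pages to the backing store


def read_spill(path: str):
    """Decode a spill back into (partition, key, value) tuples."""
    with open(path, "rb") as f:
        data = f.read()
    offset, records = 0, []
    while offset < len(data):
        part, klen, vlen = struct.unpack_from("<III", data, offset)
        offset += 12
        key = data[offset:offset + klen]
        value = data[offset + klen:offset + klen + vlen]
        offset += klen + vlen
        records.append((part, key, value))
    return records


if __name__ == "__main__":
    buffered = [(b"b", b"2"), (b"a", b"1"), (b"c", b"3")]
    spill_path = os.path.join(tempfile.mkdtemp(), "spill.bin")
    spill_to_mapped_region(encode_spill(buffered, 2), spill_path)
    for record in read_spill(spill_path):
        print(record)
```

The point of the sketch is the contrast in the write path: the spill is stored with memory copies into a mapped region rather than with stream writes to a disk file. How the actual framework manages NVRAM, and how it interacts with the RDMA-enhanced shuffle, is not stated in the abstract.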