Application profiling in hierarchical Hadoop for geo-distributed computing environments

Marco Cavallo, G. Modica, Carmelo Polito, O. Tomarchio
Published in: 2016 IEEE Symposium on Computers and Communication (ISCC)
Publication date: 2016-06-27
DOI: https://doi.org/10.1109/ISCC.2016.7543796
Citations: 9

Abstract

Over the past two decades there has been growing interest in defining new distributed computational paradigms capable of manipulating and analyzing huge amounts of data. Among these, MapReduce stands out for its popularity. Its open-source implementation, Hadoop, is widely used in academic environments and is strongly supported by major IT players. In many application scenarios, the data to be processed resides in data centers that are heterogeneous in terms of computing capacity and geographically distant from each other. Unfortunately, in these contexts Hadoop performs very poorly. In this paper we propose to leverage a hierarchical computing framework to boost Hadoop performance in geo-distributed computing environments. The proposed framework gathers up-to-date information from the distributed computing context and exploits it to carry out a smart job scheduling strategy. This work focuses on the study and definition of the application profile of the jobs. We implemented a software prototype of the proposed hierarchical Hadoop framework. Tests run on the prototype demonstrated that the job scheduling system is able to compute a job's execution path and estimate its completion time.
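The abstract does not describe the scheduling model itself, so the following Python sketch is only an illustration of the general idea of comparing execution paths by estimated completion time. It is not the authors' framework. Every name and quantity in it is an assumption introduced for illustration: the `Site` record, the per-site processing rate, the link bandwidth table, and an application profile reduced to a single `output_ratio` (output size divided by input size). The sketch compares two candidate plans, moving all data to one site versus processing locally and merging the reduced outputs, and picks the one with the lower estimate.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Site:
    """Hypothetical description of one data center."""
    name: str
    data_gb: float    # input data stored at this site
    rate_gb_s: float  # assumed processing rate of the job at this site


Links = Dict[Tuple[str, str], float]  # (source, destination) -> bandwidth in GB/s


def _find(sites: List[Site], name: str) -> Site:
    return next(s for s in sites if s.name == name)


def centralized_time(sites: List[Site], links: Links, target: str) -> float:
    """Copy all input to `target`, then process everything there."""
    dest = _find(sites, target)
    # Assumption: transfers from different sites run in parallel.
    transfer = max(
        (s.data_gb / links[(s.name, target)] for s in sites if s.name != target),
        default=0.0,
    )
    total_gb = sum(s.data_gb for s in sites)
    return transfer + total_gb / dest.rate_gb_s


def hierarchical_time(sites: List[Site], links: Links, target: str,
                      output_ratio: float) -> float:
    """Process locally at each site, ship reduced outputs to `target`, merge there."""
    dest = _find(sites, target)
    ready_times = []
    for s in sites:
        local = s.data_gb / s.rate_gb_s
        out_gb = s.data_gb * output_ratio
        ship = 0.0 if s.name == target else out_gb / links[(s.name, target)]
        ready_times.append(local + ship)
    # Assumption: the final merge processes all partial outputs at the target's rate.
    merge_gb = sum(s.data_gb for s in sites) * output_ratio
    return max(ready_times) + merge_gb / dest.rate_gb_s


def best_plan(sites: List[Site], links: Links,
              output_ratio: float) -> Tuple[float, str, str]:
    """Return (estimated seconds, strategy, target site) with the lowest estimate."""
    plans = []
    for s in sites:
        plans.append((centralized_time(sites, links, s.name), "centralized", s.name))
        plans.append((hierarchical_time(sites, links, s.name, output_ratio),
                      "hierarchical", s.name))
    return min(plans)


if __name__ == "__main__":
    sites = [Site("A", 100.0, 1.0), Site("B", 50.0, 0.5)]
    links = {("A", "B"): 0.1, ("B", "A"): 0.1}
    print(best_plan(sites, links, output_ratio=0.1))
```

In this toy setting the slow inter-site link dominates, so shipping only the reduced outputs gives a lower estimate than moving the raw data. With an `output_ratio` close to 1 the advantage shrinks, which is one reason an application profile would matter to a scheduler of this kind.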