Enhancing Hadoop Performance in Homogenous and Heterogeneous Big Data Environments by Dynamic Slot Configuration

E. Hamza
Journal: Advances in Systems Science and Applications, vol. 20, no. 1, pp. 13-26 (Journal Article)
Published: 2020-03-31
DOI: 10.25728/ASSA.2020.20.1.761 (https://doi.org/10.25728/ASSA.2020.20.1.761)
JCR: Q3 (Engineering)

Abstract

Hadoop is one of the best-known platforms for processing large-scale data in parallel in cloud computing. A Hadoop system can be characterized by three main factors: cluster, workload, and user. Each factor can be described as either heterogeneous or homogenous, and together they reflect the degree of heterogeneity of the Hadoop system. The objective of this research is to investigate how strongly the heterogeneity of each factor influences Hadoop performance under different schedulers. Three schedulers are tested and analyzed at different levels of Hadoop heterogeneity: FIFO (First In, First Out), Fair Sharing, and COSHH (Classification and Optimization based Scheduler for Heterogeneous Hadoop). The study relates performance issues to the Hadoop schedulers and gives a comparative performance analysis across different job-submission cases. The jobs are processed in homogenous or heterogeneous data environments, with either fixed or reconfigurable slots between map and reduce tasks, in the Hadoop MapReduce Java cluster programming model. The results show that assigning a tunable knob between map and reduce tasks under certain schedulers, such as FIFO, improved performance significantly, especially in heterogeneous environments, where the workload decreased significantly and the utilization of computational resources clearly increased.
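To make the idea of a reconfigurable slot split concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical policy in which a node's slots are shared between map and reduce tasks in proportion to the pending work of each phase. The function name `split_slots` and its parameters are illustrative and are not Hadoop APIs.

```python
def split_slots(total_slots, pending_map, pending_reduce):
    """Divide a node's task slots between map and reduce work.

    Hypothetical policy (an assumption, not taken from the paper):
    share slots in proportion to pending tasks, keeping at least one
    slot for each phase that still has work queued.
    """
    if total_slots < 2:
        raise ValueError("need at least two slots to split")
    pending = pending_map + pending_reduce
    if pending == 0:
        # Nothing queued: fall back to an even, static-style split.
        map_slots = total_slots // 2
    elif pending_reduce == 0:
        map_slots = total_slots
    elif pending_map == 0:
        map_slots = 0
    else:
        map_slots = round(total_slots * pending_map / pending)
        # Clamp so neither phase is starved while it has pending tasks.
        map_slots = max(1, min(total_slots - 1, map_slots))
    return map_slots, total_slots - map_slots


# A fixed 4/4 configuration next to the reconfigured split
# for a map-heavy backlog on an 8-slot node.
fixed = (4, 4)
dynamic = split_slots(8, pending_map=30, pending_reduce=10)
print(fixed, dynamic)
```

Under a fixed split, slots reserved for the idle phase sit unused. A proportional split like this one is one simple way a tunable knob could raise resource utilization, under the stated assumptions.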
Source journal: Advances in Systems Science and Applications (Engineering, all)
CiteScore: 1.20
Self-citation rate: 0.00%
About the journal: Advances in Systems Science and Applications (ASSA) is an international peer-reviewed open-source online academic journal. Its scope covers all major aspects of systems (and processes) analysis, modeling, simulation, and control, ranging from theoretical and methodological developments to a large variety of application areas. Survey articles and innovative results are also welcome. ASSA is aimed at scientists, engineers, and researchers working on these problems. It is intended as a platform where researchers can communicate and discuss both their specialized issues and interdisciplinary problems of systems analysis and its applications in science and industry, including data science, artificial intelligence, material science, manufacturing, transportation, power and energy, ecology, corporate management, public governance, finance, and many others.