{"title":"Model to estimate the size of a Hadoop cluster - HCEm","authors":"J. Brito, Aleteia P. F. Araujo","doi":"10.1109/PADSW.2014.7097897","DOIUrl":null,"url":null,"abstract":"This paper describes a model which aims to estimate the size of a cluster running Hadoop framework for the processing of large datasets at a given timeframe. As main contributions it denes (i) a light layer of optimization for MapReduce jobs, (ii) presents a model to estimate the size cluster for a Hadoop framework and (iii) performs tests using a real environment - the Amazon Elastic MapReduce. The proposed approach works with the MapReduce to dene the main configuration parameters and determines computational resources of hosts in the cluster in order to meet the desired runtime for the requirements of a given workload requirement. Thus, the results show that the proposed model is able to avoid to over-allocation or sub-allocation of computing resources on a Hadoop cluster.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"99 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PADSW.2014.7097897","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
This paper describes a model that aims to estimate the size of a cluster running the Hadoop framework for processing large datasets within a given timeframe. Its main contributions are that it (i) defines a light optimization layer for MapReduce jobs, (ii) presents a model to estimate the cluster size for the Hadoop framework, and (iii) performs tests in a real environment - Amazon Elastic MapReduce. The proposed approach works with MapReduce to define the main configuration parameters and determines the computational resources of the hosts in the cluster in order to meet the desired runtime for a given workload. The results show that the proposed model is able to avoid over-allocation or under-allocation of computing resources in a Hadoop cluster.
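The abstract does not reproduce the HCEm model itself, so the sketch below is only a rough, assumption-laden illustration of the general idea it describes: sizing a cluster so that a MapReduce workload finishes within a target runtime. All parameter names, the overhead factor, and the example values are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical back-of-the-envelope estimate of Hadoop cluster size.
# NOT the HCEm model from the paper; it only illustrates sizing a cluster
# so a MapReduce workload meets a desired runtime. All parameters are assumed.

def estimate_cluster_size(input_size_gb: float,
                          per_node_throughput_gb_per_h: float,
                          target_runtime_h: float,
                          overhead_factor: float = 1.2) -> int:
    """Return the minimum number of worker nodes needed to process
    input_size_gb within target_runtime_h, given an assumed sustained
    per-node MapReduce throughput and a flat overhead factor covering
    shuffle, replication, and job startup costs."""
    effective_work = input_size_gb * overhead_factor                   # total work incl. overhead
    work_per_node = per_node_throughput_gb_per_h * target_runtime_h    # capacity of one node in the deadline
    return max(1, math.ceil(effective_work / work_per_node))


if __name__ == "__main__":
    # Example (assumed values): 2 TB of input, 25 GB/h per node, 4-hour deadline.
    nodes = estimate_cluster_size(input_size_gb=2048,
                                  per_node_throughput_gb_per_h=25,
                                  target_runtime_h=4)
    print(f"Estimated worker nodes needed: {nodes}")
```

Rounding up with `math.ceil` and clamping to at least one node keeps the estimate on the side of meeting the deadline; the paper's actual model additionally tunes MapReduce configuration parameters, which this sketch does not attempt.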