Dynamic Data Partitioning and Virtual Machine Mapping: Efficient Data Intensive Computation

2013 IEEE 5th International Conference on Cloud Computing Technology and Science. Pub Date: 2013-12-02. DOI: 10.1109/CloudCom.2013.134

Kenn Slagter, Ching-Hsien Hsu, Yeh-Ching Chung

Citations: 4

Abstract

Big data refers to data so large that it exceeds the processing capabilities of traditional systems. Big data can be awkward to work with, and its storage, processing and analysis can be problematic. MapReduce is a recent programming model that can handle big data. It does so by distributing the storage and processing of data among a large number of computers (nodes). However, this means the time required to process a MapReduce job depends on whichever node is last to complete its task. This problem is exacerbated in heterogeneous environments. In this paper we propose a method to improve MapReduce execution in heterogeneous environments. The method dynamically partitions data during the Map phase and uses virtual machine mapping in the Reduce phase in order to maximize resource utilization.
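The dependence of job time on the slowest node can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. It is a toy model that assumes each node processes data at a fixed, known rate, and it compares equal-sized partitions with partitions sized in proportion to node speed. All function names and the example speeds are hypothetical.

```python
from typing import List


def makespan(partition_sizes: List[float], node_speeds: List[float]) -> float:
    """Job completion time: the job ends when the last node finishes."""
    return max(size / speed for size, speed in zip(partition_sizes, node_speeds))


def equal_partition(total: float, node_speeds: List[float]) -> List[float]:
    """Static split: every node gets the same amount of data."""
    n = len(node_speeds)
    return [total / n] * n


def proportional_partition(total: float, node_speeds: List[float]) -> List[float]:
    """Speed-aware split (assumed model): data is shared in proportion to speed."""
    speed_sum = sum(node_speeds)
    return [total * speed / speed_sum for speed in node_speeds]


# Hypothetical heterogeneous cluster: relative processing rates per node.
speeds = [1.0, 2.0, 4.0]
total_data = 700.0

static_time = makespan(equal_partition(total_data, speeds), speeds)
balanced_time = makespan(proportional_partition(total_data, speeds), speeds)

print(f"equal partitions:        {static_time:.1f} time units")
print(f"proportional partitions: {balanced_time:.1f} time units")
```

In this toy model the equal split is held back by the slowest node, while the proportional split lets all nodes finish at the same time. A real system would have to estimate node speeds at run time, which is where dynamic partitioning comes in.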
