Optimal Capacity Allocation for Executing MapReduce Jobs in Cloud Systems

2014 16th International Symposium on Symbolic and Numeric Algorithms for Scientific Computing. Pub Date: 2014-09-01. DOI: 10.1109/SYNASC.2014.58

M. Malekimajd, A. M. Rizzi, D. Ardagna, M. Ciavotta, M. Passacantando, A. Movaghar


Citations: 11

Abstract



Optimal Capacity Allocation for Executing MapReduce Jobs in Cloud Systems

Nowadays, analyzing large amounts of data is of paramount importance for many companies. Big data and business intelligence applications are facilitated by the MapReduce programming model while, at the infrastructure layer, cloud computing provides flexible and cost-effective solutions for allocating large clusters on demand. Capacity allocation in such systems is a key challenge in providing performance for MapReduce jobs and minimizing cloud resource cost. The contribution of this paper is twofold: (i) we formulate a linear programming model able to minimize cloud resource costs and job rejection penalties for the execution of jobs of multiple classes with (soft) deadline guarantees; (ii) we provide new upper and lower bounds for MapReduce job execution time in shared Hadoop clusters. Moreover, our solutions are validated by a large set of experiments. We demonstrate that our method is able to determine the globally optimal solution for systems including up to 1000 user classes in less than 0.5 seconds. Moreover, the execution time of MapReduce jobs is within 19% of our upper bounds on average.
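
To make the trade-off between resource cost and rejection penalties concrete, here is a minimal Python sketch of a toy version of the problem. It is not the paper's model: the abstract does not give the formulation, so the `JobClass` fields, the assumption that each admitted job needs a fixed number of VMs to meet its deadline, the single VM price, and the capacity cap are all hypothetical. Under those assumptions the problem is a continuous-knapsack linear program, which a greedy ordering solves exactly without an LP solver.

```python
from dataclasses import dataclass


@dataclass
class JobClass:
    # All fields are illustrative assumptions, not quantities from the paper.
    name: str
    submitted: float    # number of jobs submitted by this class
    vms_per_job: float  # VMs assumed necessary for one job to meet its deadline
    penalty: float      # cost paid for each rejected job


def allocate(classes, vm_cost, capacity):
    """Solve the toy LP:

        minimize   sum_i vm_cost * vms_per_job_i * x_i + penalty_i * (submitted_i - x_i)
        subject to sum_i vms_per_job_i * x_i <= capacity,  0 <= x_i <= submitted_i

    where x_i is the (fractional) number of admitted jobs of class i.
    This is a continuous knapsack, so admitting classes in decreasing
    order of net saving per VM is optimal.
    """
    def saving_per_vm(c):
        # Saving from admitting instead of rejecting, per VM consumed.
        return (c.penalty - vm_cost * c.vms_per_job) / c.vms_per_job

    admitted = {c.name: 0.0 for c in classes}
    remaining = capacity
    for c in sorted(classes, key=saving_per_vm, reverse=True):
        # Stop once admitting no longer pays off or the cluster is full.
        if saving_per_vm(c) <= 0 or remaining <= 0:
            break
        x = min(c.submitted, remaining / c.vms_per_job)
        admitted[c.name] = x
        remaining -= x * c.vms_per_job

    total_cost = sum(
        vm_cost * c.vms_per_job * admitted[c.name]
        + c.penalty * (c.submitted - admitted[c.name])
        for c in classes
    )
    return admitted, total_cost


if __name__ == "__main__":
    demo = [
        JobClass("A", submitted=10, vms_per_job=2, penalty=5.0),
        JobClass("B", submitted=10, vms_per_job=1, penalty=1.5),
        JobClass("C", submitted=5, vms_per_job=4, penalty=3.0),
    ]
    print(allocate(demo, vm_cost=1.0, capacity=25))
```

In this example class A is admitted in full, class B is admitted until the 25-VM cap is reached, and class C is rejected entirely because running one of its jobs costs more than its rejection penalty. A model closer to the paper's would replace the fixed `vms_per_job` with constraints derived from the execution-time bounds, which this sketch does not attempt.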
