Distributed data placement to minimize communication costs via graph partitioning

Scientific and statistical database management : International Conference, SSDBM ... : proceedings. International Conference on Scientific and Statistical Database Management
Pub Date: 2014-06-30
DOI: 10.1145/2618243.2618258

Lukasz Golab, Marios Hadjieleftheriou, H. Karloff, B. Saha


Citations: 49

Abstract

With the widespread use of shared-nothing clusters of servers, there has been a proliferation of distributed object stores that offer high availability, reliability, and enhanced performance for MapReduce-style workloads. However, data-intensive scientific workflows and join-intensive queries cannot always be evaluated efficiently using MapReduce-style processing without extensive data migrations, which cause network congestion and reduce query throughput. In this paper, we study the problem of computing data placement strategies that minimize the data communication costs incurred by such workloads in a distributed setting.

Our main contribution is a reduction of the data placement problem to the well-studied problem of Graph Partitioning, which is NP-hard but for which efficient approximation algorithms exist. The novelty and significance of this result lie in representing the communication cost exactly and in using standard graphs instead of hypergraphs, which were used in prior work on data placement that optimized for different objectives.

We study several practical extensions of the problem: with load balancing, with replication, and with complex workflows consisting of multiple steps that may be computed on different servers. We provide integer linear programs (IPs) that may be used with any IP solver to find an optimal data placement. For the no-replication case, we use publicly available graph partitioning libraries (e.g., METIS) to efficiently compute nearly optimal solutions. For the versions with replication, we introduce two heuristics that build on the Graph Partitioning solution of the no-replication case. On a workload based on TPC-DS, an IP solver may take weeks to compute an optimal data placement, whereas our reduction produces nearly optimal solutions in seconds.
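To make the link between data placement and graph partitioning concrete, the following is a minimal sketch under simplifying assumptions. It is not the paper's reduction. Each data item is a vertex, and each edge weight stands for the amount of data that would have to be shipped if the two items were stored on different servers. The communication cost of a placement is then the total weight of the cut edges, and a per-server capacity plays the role of the load-balancing constraint. The item names, weights, and the helper functions `cut_cost` and `best_placement` are hypothetical, and the exhaustive search is only workable for toy inputs.

```python
from itertools import product


def cut_cost(edges, placement):
    """Total weight of edges whose endpoints sit on different servers."""
    return sum(w for (u, v), w in edges.items() if placement[u] != placement[v])


def best_placement(items, edges, num_servers, capacity):
    """Exhaustively search all placements (feasible only for tiny inputs)."""
    best, best_cost = None, float("inf")
    for assignment in product(range(num_servers), repeat=len(items)):
        # Load-balancing constraint: no server holds more than `capacity` items.
        if any(assignment.count(s) > capacity for s in range(num_servers)):
            continue
        placement = dict(zip(items, assignment))
        cost = cut_cost(edges, placement)
        if cost < best_cost:
            best, best_cost = placement, cost
    return best, best_cost


# Hypothetical toy workload: an edge weight is the data shipped
# when the two items it connects are placed on different servers.
items = ["A", "B", "C", "D"]
edges = {("A", "B"): 10, ("C", "D"): 8, ("B", "C"): 1, ("A", "D"): 2}

placement, cost = best_placement(items, edges, num_servers=2, capacity=2)
print(placement, cost)
```

In this toy instance the heavily co-accessed pairs (A, B) and (C, D) end up together, and only the two light edges are cut. The search space grows as the number of servers raised to the number of items, which is why exact methods become impractical at realistic sizes and the abstract turns to graph partitioning libraries such as METIS for the no-replication case.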

