Efficient Distributed Algorithms for Minimum Spanning Tree in Dense Graphs
Authors: M. Bateni, Morteza Monemizadeh, Kees Voorintholt
Published in: 2022 IEEE International Conference on Data Mining Workshops (ICDMW), 2022-11-01
DOI: 10.1109/ICDMW58026.2022.00106 (https://doi.org/10.1109/ICDMW58026.2022.00106)
Abstract
In recent years, the Massively Parallel Computation (MPC) model, which captures the MapReduce framework, has become the de facto standard model for large-scale data analysis, given the ubiquity of efficient and affordable cloud implementations. In this model, an input of size $m$ is initially distributed among $t$ machines, each with local space of size $s$. Computation proceeds in synchronous rounds; in each round, every machine performs arbitrary local computation on its data and then sends messages to other machines. In this paper, we study the Minimum Spanning Tree (MST) problem for dense graphs in the MPC model. We say a graph $G=(V,E)$ is relatively dense if $m=\Theta(n^{1+c})$, where $n=\vert V\vert$ is the number of vertices, $m=\vert E\vert$ is the number of edges, and $0 < c\leq 1$. We develop the first work- and space-efficient MPC algorithm that, with high probability, computes an MST of $G$ using $\lceil\log\frac{c}{\epsilon}\rceil+1$ rounds of communication. The algorithm uses $t=O(n^{c-\epsilon})$ machines, each with local storage of size $s=O(n^{1+\epsilon})$, for any $0 < \epsilon\leq c$. Not only is this algorithm simple and easy to implement, it also simultaneously achieves optimal total work, per-machine space, and number of rounds.
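
To make the stated bounds concrete, the following Python sketch evaluates the round count, machine count, and per-machine space for given $n$, $c$, and $\epsilon$. It is only an illustration of the formulas in the abstract, not the paper's algorithm. Two things are assumptions: the logarithm is taken base 2 (the abstract does not state the base), and the constants hidden by the $O(\cdot)$ and $\Theta(\cdot)$ notation are set to 1. The function name `mpc_mst_parameters` is hypothetical.

```python
import math


def mpc_mst_parameters(n: int, c: float, eps: float) -> dict:
    """Evaluate the bounds from the abstract with all hidden constants set to 1.

    Assumption: the logarithm in the round bound is base 2.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    if not (0 < c <= 1):
        raise ValueError("c must satisfy 0 < c <= 1")
    if not (0 < eps <= c):
        raise ValueError("eps must satisfy 0 < eps <= c")

    rounds = math.ceil(math.log2(c / eps)) + 1   # ceil(log(c/eps)) + 1
    machines = n ** (c - eps)                    # t = n^(c - eps)
    space = n ** (1 + eps)                       # s = n^(1 + eps)
    edges = n ** (1 + c)                         # m = n^(1 + c)

    return {
        "rounds": rounds,
        "machines": machines,
        "space_per_machine": space,
        "edges": edges,
        "total_space": machines * space,
    }


if __name__ == "__main__":
    params = mpc_mst_parameters(n=10**6, c=1.0, eps=0.25)
    for name, value in params.items():
        print(f"{name}: {value:.6g}")
```

Under these assumptions the total space $t\cdot s = n^{c-\epsilon}\cdot n^{1+\epsilon} = n^{1+c}$ matches the input size $m$ up to constants, which is one way to read the claim of space efficiency. The sketch also shows the trade-off the bounds imply: a larger $\epsilon$ means more space per machine but fewer rounds, down to a single round when $\epsilon = c$.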