{"title":"A parallel particle cluster algorithm using nearest neighbour graphs and passive target communication","authors":"Matthias Frey, Steven Böing, Rui F. G. Apóstolo","doi":"arxiv-2408.15348","DOIUrl":null,"url":null,"abstract":"We present a parallel cluster algorithm for $N$-body simulations which uses a\nnearest neighbour search algorithm and one-sided messaging passing interface\n(MPI) communication. The nearest neighbour is defined by the Euclidean distance\nin three-dimensional space. The resulting directed nearest neighbour graphs\nthat are used to define the clusters are split up in an iterative procedure\nwith MPI remote memory access (RMA) communication. The method has been\nimplemented as part of the elliptical parcel-in-cell (EPIC) method targeting\ngeophysical fluid flows. The parallel scalability of the algorithm is discussed\nby means of an artificial and a standard fluid dynamics test case. The cluster\nalgorithm shows good weak and strong scalability up to 16,384 cores with a\nparallel weak scaling efficiency of about 80% for balanced workloads. In poorly\nbalanced problems, MPI synchronisation dominates execution of the cluster\nalgorithm and thus drastically worsens its parallel scalability.","PeriodicalId":501422,"journal":{"name":"arXiv - CS - Distributed, Parallel, and Cluster Computing","volume":"45 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-08-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Distributed, Parallel, and Cluster Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.15348","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
We present a parallel cluster algorithm for $N$-body simulations which uses a nearest neighbour search algorithm and one-sided message passing interface (MPI) communication. The nearest neighbour is defined by the Euclidean distance in three-dimensional space. The resulting directed nearest neighbour graphs that are used to define the clusters are split up in an iterative procedure with MPI remote memory access (RMA) communication. The method has been implemented as part of the elliptical parcel-in-cell (EPIC) method targeting geophysical fluid flows. The parallel scalability of the algorithm is discussed by means of an artificial and a standard fluid dynamics test case. The cluster algorithm shows good weak and strong scalability up to 16,384 cores, with a parallel weak scaling efficiency of about 80% for balanced workloads. In poorly balanced problems, MPI synchronisation dominates the execution of the cluster algorithm and thus drastically worsens its parallel scalability.
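Since the abstract refers to one-sided MPI RMA with passive target synchronisation, the following minimal C sketch illustrates that communication pattern in isolation. It is not the authors' EPIC implementation: the label array, the constant N_LOCAL, and the choice of target rank are illustrative assumptions; only the MPI window creation and the passive-target lock/get/unlock sequence reflect the mechanism named in the paper.

/* Minimal sketch of passive-target MPI RMA (illustrative, not the EPIC code):
 * each rank exposes a local array of cluster labels in an MPI window, and any
 * rank may read a remote label with MPI_Get under a passive-target lock,
 * i.e. without the target rank taking part in the communication call. */
#include <mpi.h>
#include <stdio.h>

#define N_LOCAL 4   /* hypothetical number of local parcels per rank */

int main(int argc, char **argv)
{
    int rank, size;
    int labels[N_LOCAL];
    MPI_Win win;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Each parcel starts in its own cluster; give it a globally unique label. */
    for (int i = 0; i < N_LOCAL; ++i)
        labels[i] = rank * N_LOCAL + i;

    /* Expose the local label array for one-sided access by other ranks. */
    MPI_Win_create(labels, N_LOCAL * sizeof(int), sizeof(int),
                   MPI_INFO_NULL, MPI_COMM_WORLD, &win);

    /* Passive-target read: fetch the first label of the next rank. */
    int target = (rank + 1) % size;
    int remote_label;
    MPI_Win_lock(MPI_LOCK_SHARED, target, 0, win);
    MPI_Get(&remote_label, 1, MPI_INT, target,
            0 /* displacement in the target window */, 1, MPI_INT, win);
    MPI_Win_unlock(target, win);   /* completes the MPI_Get */

    printf("rank %d read label %d from rank %d\n", rank, remote_label, target);

    MPI_Win_free(&win);
    MPI_Finalize();
    return 0;
}

In an actual cluster-resolution step one would update remote labels (e.g. with MPI_Put or MPI_Accumulate) rather than only read them, and iterate until the directed nearest neighbour graphs are fully resolved; the sketch above only shows the passive-target access pattern itself.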