Evaluation of a Minimally Synchronous Algorithm for 2:1 Octree Balance

Hansol Suh, T. Isaac
{"title":"Evaluation of a Minimally Synchronous Algorithm for 2:1 Octree Balance","authors":"Hansol Suh, T. Isaac","doi":"10.1109/SC41405.2020.00027","DOIUrl":null,"url":null,"abstract":"The p4est library implements octree-based adaptive mesh refinement (AMR) and has demonstrated parallel scalability beyond 100,000 MPI processes in previous weak scaling studies. This work focuses on the strong scalability of mesh adaptivity in p4est, where the communication pattern of the existing 2:1-balance is a latency bottleneck. The sorting-based algorithm of Malhotra and Biros has balanced communication, but synchronizes all processes. We propose an algorithm that combines sorting and neighbor-to-neighbor exchange to minimize the number of processes each process synchronizes with.We measure the performance of these algorithms on several test problems on Stampede2 at TACC. Both the parallel-sorting and minimally-synchronous algorithms significantly outperform the existing algorithm and have nearly identical performance out to 1,024 Xeon Phi KNL nodes, meaning the asymptotic advantage of the minimally-synchronous algorithm does not translate to improved performance at this scale. We conclude by showing that global metadata communication will limit future strong scaling.","PeriodicalId":424429,"journal":{"name":"SC20: International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"88 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"SC20: International Conference for High Performance Computing, Networking, Storage and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SC41405.2020.00027","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The p4est library implements octree-based adaptive mesh refinement (AMR) and has demonstrated parallel scalability beyond 100,000 MPI processes in previous weak scaling studies. This work focuses on the strong scalability of mesh adaptivity in p4est, where the communication pattern of the existing 2:1-balance is a latency bottleneck. The sorting-based algorithm of Malhotra and Biros has balanced communication, but synchronizes all processes. We propose an algorithm that combines sorting and neighbor-to-neighbor exchange to minimize the number of processes each process synchronizes with.We measure the performance of these algorithms on several test problems on Stampede2 at TACC. Both the parallel-sorting and minimally-synchronous algorithms significantly outperform the existing algorithm and have nearly identical performance out to 1,024 Xeon Phi KNL nodes, meaning the asymptotic advantage of the minimally-synchronous algorithm does not translate to improved performance at this scale. We conclude by showing that global metadata communication will limit future strong scaling.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
一种2:1八叉树平衡最小同步算法的评估
p4est库实现了基于八叉树的自适应网格细化(AMR),并在之前的弱缩放研究中展示了超过100,000 MPI进程的并行可扩展性。本文重点研究了p4test中网格自适应的强大可扩展性,其中现有的2:1-balance通信模式是延迟瓶颈。Malhotra和Biros的基于排序的算法具有平衡的通信,但同步所有进程。我们提出了一种结合排序和邻居间交换的算法,以最小化每个进程同步的进程数量。我们在Stampede2在TACC上的几个测试问题上测量了这些算法的性能。并行排序和最小同步算法都明显优于现有算法,并且在1024 Xeon Phi KNL节点上具有几乎相同的性能,这意味着最小同步算法的渐近优势在这种规模下不会转化为改进的性能。我们的结论是,全球元数据通信将限制未来的强扩展。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
CAB-MPI: Exploring Interprocess Work-Stealing towards Balanced MPI Communication Toward Realization of Numerical Towing-Tank Tests by Wall-Resolved Large Eddy Simulation based on 32 Billion Grid Finite-Element Computation Scalable yet Rigorous Floating-Point Error Analysis Scalable Knowledge Graph Analytics at 136 Petaflop/s BORA: A Bag Optimizer for Robotic Analysis
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1