{"title":"Towards a Scalable and Efficient PGAS-based Distributed OpenMP","authors":"Baodi Shan, Mauricio Araya-Polo, Barbara Chapman","doi":"arxiv-2409.02830","DOIUrl":null,"url":null,"abstract":"MPI+X has been the de facto standard for distributed memory parallel\nprogramming. It is widely used primarily as an explicit two-sided communication\nmodel, which often leads to complex and error-prone code. Alternatively, PGAS\nmodel utilizes efficient one-sided communication and more intuitive\ncommunication primitives. In this paper, we present a novel approach that\nintegrates PGAS concepts into the OpenMP programming model, leveraging the LLVM\ncompiler infrastructure and the GASNet-EX communication library. Our model\naddresses the complexity associated with traditional MPI+OpenMP programming\nmodels while ensuring excellent performance and scalability. We evaluate our\napproach using a set of micro-benchmarks and application kernels on two\ndistinct platforms: Ookami from Stony Brook University and NERSC Perlmutter.\nThe results demonstrate that DiOMP achieves superior bandwidth and lower\nlatency compared to MPI+OpenMP, up to 25% higher bandwidth and down to 45% on\nlatency. DiOMP offers a promising alternative to the traditional MPI+OpenMP\nhybrid programming model, towards providing a more productive and efficient way\nto develop high-performance parallel applications for distributed memory\nsystems.","PeriodicalId":501291,"journal":{"name":"arXiv - CS - Performance","volume":"40 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Performance","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.02830","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
MPI+X has been the de facto standard for distributed-memory parallel programming. It is used primarily as an explicit two-sided communication model, which often leads to complex and error-prone code. Alternatively, the PGAS model offers efficient one-sided communication and more intuitive communication primitives. In this paper, we present DiOMP, a novel approach that integrates PGAS concepts into the OpenMP programming model, leveraging the LLVM compiler infrastructure and the GASNet-EX communication library. Our model addresses the complexity associated with the traditional MPI+OpenMP programming model while ensuring excellent performance and scalability. We evaluate our approach using a set of micro-benchmarks and application kernels on two distinct platforms: Ookami at Stony Brook University and Perlmutter at NERSC. The results demonstrate that DiOMP achieves up to 25% higher bandwidth and up to 45% lower latency than MPI+OpenMP. DiOMP thus offers a promising alternative to the traditional MPI+OpenMP hybrid programming model, providing a more productive and efficient way to develop high-performance parallel applications for distributed-memory systems.
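
To make the two-sided versus one-sided contrast concrete, the sketch below uses only standard MPI (not DiOMP's actual API, which the abstract does not specify): the two-sided exchange requires matching send/receive calls with consistent tags on both ranks, whereas the one-sided, PGAS-style put lets rank 0 write directly into rank 1's exposed memory window with no matching call on the target, the style of communication DiOMP builds on via GASNet-EX.

    #include <mpi.h>

    #define N 1024

    int main(int argc, char **argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        double buf[N];
        for (int i = 0; i < N; i++) buf[i] = (double)rank;

        /* Two-sided style: both ranks must post matching calls, and the
         * programmer must keep tags, counts, and ordering consistent. */
        if (size >= 2) {
            if (rank == 0)
                MPI_Send(buf, N, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
            else if (rank == 1)
                MPI_Recv(buf, N, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
        }

        /* One-sided (PGAS-style) equivalent: rank 0 puts data directly
         * into rank 1's window; rank 1 posts no matching receive. */
        double win_buf[N];
        MPI_Win win;
        MPI_Win_create(win_buf, N * sizeof(double), sizeof(double),
                       MPI_INFO_NULL, MPI_COMM_WORLD, &win);
        MPI_Win_fence(0, win);                 /* open access epoch  */
        if (rank == 0 && size >= 2)
            MPI_Put(buf, N, MPI_DOUBLE, 1, 0, N, MPI_DOUBLE, win);
        MPI_Win_fence(0, win);                 /* close access epoch */

        MPI_Win_free(&win);
        MPI_Finalize();
        return 0;
    }

The one-sided version decouples data movement from synchronization: only the fences are collective, and the target rank never names the source, which is the productivity argument the abstract makes for PGAS-style primitives.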