MVAPICH2-X的高性能Coarray Fortran支持:初步经验和评估

Jian Lin, Khaled Hamidouche, Xiaoyi Lu, Mingzhe Li, D. Panda
{"title":"MVAPICH2-X的高性能Coarray Fortran支持:初步经验和评估","authors":"Jian Lin, Khaled Hamidouche, Xiaoyi Lu, Mingzhe Li, D. Panda","doi":"10.1109/IPDPSW.2015.115","DOIUrl":null,"url":null,"abstract":"Coarray Fortran (CAF) is a parallel programming paradigm that extends Fortran for the partitioned global address space (PGAS) programming model at the language level. The current runtime implementations of CAF are mainly using MPI or GASNet as underlying communication components. MVAPICH2-X is a hybrid MPI+PGAS programming library with a Unified Communication Runtime (UCR) design. In this paper, the classic implementation of CAF runtime in Open UH is redesigned and rebuilt on top of MVAPICH2-X. The proposed design does not only enable the support of MPI+CAF hybrid programming model, but also provides superior performance on most of the CAF one-sided operations and the newly proposed collective operations in Fortran 2015 specification. A comprehensive evaluation with different benchmarks and applications has been performed. Comparing with current GASNet-based solutions, the CAF runtime with MVAPICH2-X can improve the bandwidths of put and bidirectional operations up to 3.5X for inter-node communication, and improve the bandwidths of collective communication operations represented by broadcast up to 3.0X on 64 processes. It also reduces the execution time of NPB CAF benchmarks by up to 18% on 256 processes.","PeriodicalId":340697,"journal":{"name":"2015 IEEE International Parallel and Distributed Processing Symposium Workshop","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"High-Performance Coarray Fortran Support with MVAPICH2-X: Initial Experience and Evaluation\",\"authors\":\"Jian Lin, Khaled Hamidouche, Xiaoyi Lu, Mingzhe Li, D. Panda\",\"doi\":\"10.1109/IPDPSW.2015.115\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Coarray Fortran (CAF) is a parallel programming paradigm that extends Fortran for the partitioned global address space (PGAS) programming model at the language level. The current runtime implementations of CAF are mainly using MPI or GASNet as underlying communication components. MVAPICH2-X is a hybrid MPI+PGAS programming library with a Unified Communication Runtime (UCR) design. In this paper, the classic implementation of CAF runtime in Open UH is redesigned and rebuilt on top of MVAPICH2-X. The proposed design does not only enable the support of MPI+CAF hybrid programming model, but also provides superior performance on most of the CAF one-sided operations and the newly proposed collective operations in Fortran 2015 specification. A comprehensive evaluation with different benchmarks and applications has been performed. Comparing with current GASNet-based solutions, the CAF runtime with MVAPICH2-X can improve the bandwidths of put and bidirectional operations up to 3.5X for inter-node communication, and improve the bandwidths of collective communication operations represented by broadcast up to 3.0X on 64 processes. It also reduces the execution time of NPB CAF benchmarks by up to 18% on 256 processes.\",\"PeriodicalId\":340697,\"journal\":{\"name\":\"2015 IEEE International Parallel and Distributed Processing Symposium Workshop\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-05-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 IEEE International Parallel and Distributed Processing Symposium Workshop\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPDPSW.2015.115\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Parallel and Distributed Processing Symposium Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPSW.2015.115","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

Coarray Fortran (CAF)是一种并行编程范例,它在语言级别上为分区全局地址空间(PGAS)编程模型扩展了Fortran。当前CAF的运行时实现主要使用MPI或GASNet作为底层通信组件。MVAPICH2-X是一个混合MPI+PGAS编程库,具有统一通信运行时(UCR)设计。本文在MVAPICH2-X的基础上,对Open UH中CAF运行时的经典实现进行了重新设计和重构。提出的设计不仅支持MPI+CAF混合编程模型,而且在大多数CAF单面操作和Fortran 2015规范中新提出的集合操作上提供了优越的性能。对不同的基准和应用程序进行了全面的评估。与目前基于gasnet的解决方案相比,采用MVAPICH2-X的CAF运行时可将节点间通信的放操作和双向操作带宽提高3.5倍,将以广播为代表的64进程集体通信操作带宽提高3.0倍。它还可以在256个进程上将NPB CAF基准测试的执行时间最多减少18%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
High-Performance Coarray Fortran Support with MVAPICH2-X: Initial Experience and Evaluation
Coarray Fortran (CAF) is a parallel programming paradigm that extends Fortran for the partitioned global address space (PGAS) programming model at the language level. The current runtime implementations of CAF are mainly using MPI or GASNet as underlying communication components. MVAPICH2-X is a hybrid MPI+PGAS programming library with a Unified Communication Runtime (UCR) design. In this paper, the classic implementation of CAF runtime in Open UH is redesigned and rebuilt on top of MVAPICH2-X. The proposed design does not only enable the support of MPI+CAF hybrid programming model, but also provides superior performance on most of the CAF one-sided operations and the newly proposed collective operations in Fortran 2015 specification. A comprehensive evaluation with different benchmarks and applications has been performed. Comparing with current GASNet-based solutions, the CAF runtime with MVAPICH2-X can improve the bandwidths of put and bidirectional operations up to 3.5X for inter-node communication, and improve the bandwidths of collective communication operations represented by broadcast up to 3.0X on 64 processes. It also reduces the execution time of NPB CAF benchmarks by up to 18% on 256 processes.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Accelerating Large-Scale Single-Source Shortest Path on FPGA Relocation-Aware Floorplanning for Partially-Reconfigurable FPGA-Based Systems iWAPT Introduction and Committees Computing the Pseudo-Inverse of a Graph's Laplacian Using GPUs Optimizing Defensive Investments in Energy-Based Cyber-Physical Systems
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1