MVAPICH2-X的高性能Coarray Fortran支持:初步经验和评估

2015 IEEE International Parallel and Distributed Processing Symposium Workshop Pub Date : 2015-05-25 DOI:10.1109/IPDPSW.2015.115

Jian Lin, Khaled Hamidouche, Xiaoyi Lu, Mingzhe Li, D. Panda

{"title":"MVAPICH2-X的高性能Coarray Fortran支持:初步经验和评估","authors":"Jian Lin, Khaled Hamidouche, Xiaoyi Lu, Mingzhe Li, D. Panda","doi":"10.1109/IPDPSW.2015.115","DOIUrl":null,"url":null,"abstract":"Coarray Fortran (CAF) is a parallel programming paradigm that extends Fortran for the partitioned global address space (PGAS) programming model at the language level. The current runtime implementations of CAF are mainly using MPI or GASNet as underlying communication components. MVAPICH2-X is a hybrid MPI+PGAS programming library with a Unified Communication Runtime (UCR) design. In this paper, the classic implementation of CAF runtime in Open UH is redesigned and rebuilt on top of MVAPICH2-X. The proposed design does not only enable the support of MPI+CAF hybrid programming model, but also provides superior performance on most of the CAF one-sided operations and the newly proposed collective operations in Fortran 2015 specification. A comprehensive evaluation with different benchmarks and applications has been performed. Comparing with current GASNet-based solutions, the CAF runtime with MVAPICH2-X can improve the bandwidths of put and bidirectional operations up to 3.5X for inter-node communication, and improve the bandwidths of collective communication operations represented by broadcast up to 3.0X on 64 processes. It also reduces the execution time of NPB CAF benchmarks by up to 18% on 256 processes.","PeriodicalId":340697,"journal":{"name":"2015 IEEE International Parallel and Distributed Processing Symposium Workshop","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"High-Performance Coarray Fortran Support with MVAPICH2-X: Initial Experience and Evaluation\",\"authors\":\"Jian Lin, Khaled Hamidouche, Xiaoyi Lu, Mingzhe Li, D. Panda\",\"doi\":\"10.1109/IPDPSW.2015.115\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Coarray Fortran (CAF) is a parallel programming paradigm that extends Fortran for the partitioned global address space (PGAS) programming model at the language level. The current runtime implementations of CAF are mainly using MPI or GASNet as underlying communication components. MVAPICH2-X is a hybrid MPI+PGAS programming library with a Unified Communication Runtime (UCR) design. In this paper, the classic implementation of CAF runtime in Open UH is redesigned and rebuilt on top of MVAPICH2-X. The proposed design does not only enable the support of MPI+CAF hybrid programming model, but also provides superior performance on most of the CAF one-sided operations and the newly proposed collective operations in Fortran 2015 specification. A comprehensive evaluation with different benchmarks and applications has been performed. Comparing with current GASNet-based solutions, the CAF runtime with MVAPICH2-X can improve the bandwidths of put and bidirectional operations up to 3.5X for inter-node communication, and improve the bandwidths of collective communication operations represented by broadcast up to 3.0X on 64 processes. It also reduces the execution time of NPB CAF benchmarks by up to 18% on 256 processes.\",\"PeriodicalId\":340697,\"journal\":{\"name\":\"2015 IEEE International Parallel and Distributed Processing Symposium Workshop\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-05-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 IEEE International Parallel and Distributed Processing Symposium Workshop\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPDPSW.2015.115\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Parallel and Distributed Processing Symposium Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPSW.2015.115","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

Coarray Fortran (CAF)是一种并行编程范例，它在语言级别上为分区全局地址空间(PGAS)编程模型扩展了Fortran。当前CAF的运行时实现主要使用MPI或GASNet作为底层通信组件。MVAPICH2-X是一个混合MPI+PGAS编程库，具有统一通信运行时(UCR)设计。本文在MVAPICH2-X的基础上，对Open UH中CAF运行时的经典实现进行了重新设计和重构。提出的设计不仅支持MPI+CAF混合编程模型，而且在大多数CAF单面操作和Fortran 2015规范中新提出的集合操作上提供了优越的性能。对不同的基准和应用程序进行了全面的评估。与目前基于gasnet的解决方案相比，采用MVAPICH2-X的CAF运行时可将节点间通信的放操作和双向操作带宽提高3.5倍，将以广播为代表的64进程集体通信操作带宽提高3.0倍。它还可以在256个进程上将NPB CAF基准测试的执行时间最多减少18%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

High-Performance Coarray Fortran Support with MVAPICH2-X: Initial Experience and Evaluation

Coarray Fortran (CAF) is a parallel programming paradigm that extends Fortran for the partitioned global address space (PGAS) programming model at the language level. The current runtime implementations of CAF are mainly using MPI or GASNet as underlying communication components. MVAPICH2-X is a hybrid MPI+PGAS programming library with a Unified Communication Runtime (UCR) design. In this paper, the classic implementation of CAF runtime in Open UH is redesigned and rebuilt on top of MVAPICH2-X. The proposed design does not only enable the support of MPI+CAF hybrid programming model, but also provides superior performance on most of the CAF one-sided operations and the newly proposed collective operations in Fortran 2015 specification. A comprehensive evaluation with different benchmarks and applications has been performed. Comparing with current GASNet-based solutions, the CAF runtime with MVAPICH2-X can improve the bandwidths of put and bidirectional operations up to 3.5X for inter-node communication, and improve the bandwidths of collective communication operations represented by broadcast up to 3.0X on 64 processes. It also reduces the execution time of NPB CAF benchmarks by up to 18% on 256 processes.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2015 IEEE International Parallel and Distributed Processing Symposium Workshop

自引率

0.00%

发文量

期刊最新文献

Accelerating Large-Scale Single-Source Shortest Path on FPGA Relocation-Aware Floorplanning for Partially-Reconfigurable FPGA-Based Systems iWAPT Introduction and Committees Computing the Pseudo-Inverse of a Graph's Laplacian Using GPUs Optimizing Defensive Investments in Energy-Based Cyber-Physical Systems