MPI与MPI+OpenMP在IBM SP上的NAS基准测试

ACM/IEEE SC 2000 Conference (SC'00) Pub Date : 2000-11-01 DOI:10.1109/SC.2000.10001

F. Cappello, D. Etiemble

{"title":"MPI与MPI+OpenMP在IBM SP上的NAS基准测试","authors":"F. Cappello, D. Etiemble","doi":"10.1109/SC.2000.10001","DOIUrl":null,"url":null,"abstract":"The hybrid memory model of clusters of multiprocessors raises two issues: programming model and performance. Many parallel programs have been written by using the MPI standard. To evaluate the pertinence of hybrid models for existing MPI codes, we compare a unified model (MPI) and a hybrid one (OpenMP fine grain parallelization after profiling) for the NAS 2.3 benchmarks on two IBM SP systems. The superiority of one model depends on 1) the level of shared memory model parallelization, 2) the communication patterns and 3) the memory access patterns. The relative speeds of the main architecture components (CPU, memory, and network) are of tremendous importance for selecting one model. With the used hybrid model, our results show that a unified MPI approach is better for most of the benchmarks. The hybrid approach becomes better only when fast processors make the communication performance significant and the level of parallelization is sufficient.","PeriodicalId":228250,"journal":{"name":"ACM/IEEE SC 2000 Conference (SC'00)","volume":"124 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2000-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"303","resultStr":"{\"title\":\"MPI versus MPI+OpenMP on the IBM SP for the NAS Benchmarks\",\"authors\":\"F. Cappello, D. Etiemble\",\"doi\":\"10.1109/SC.2000.10001\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The hybrid memory model of clusters of multiprocessors raises two issues: programming model and performance. Many parallel programs have been written by using the MPI standard. To evaluate the pertinence of hybrid models for existing MPI codes, we compare a unified model (MPI) and a hybrid one (OpenMP fine grain parallelization after profiling) for the NAS 2.3 benchmarks on two IBM SP systems. The superiority of one model depends on 1) the level of shared memory model parallelization, 2) the communication patterns and 3) the memory access patterns. The relative speeds of the main architecture components (CPU, memory, and network) are of tremendous importance for selecting one model. With the used hybrid model, our results show that a unified MPI approach is better for most of the benchmarks. The hybrid approach becomes better only when fast processors make the communication performance significant and the level of parallelization is sufficient.\",\"PeriodicalId\":228250,\"journal\":{\"name\":\"ACM/IEEE SC 2000 Conference (SC'00)\",\"volume\":\"124 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2000-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"303\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM/IEEE SC 2000 Conference (SC'00)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SC.2000.10001\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM/IEEE SC 2000 Conference (SC'00)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SC.2000.10001","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 303

摘要

多处理器集群的混合内存模型提出了两个问题:编程模型和性能。许多并行程序都是使用MPI标准编写的。为了评估混合模型对现有MPI代码的相关性，我们在两个IBM SP系统上比较了用于NAS 2.3基准测试的统一模型(MPI)和混合模型(分析后的OpenMP细粒度并行化)。一个模型的优越性取决于:1)共享内存模型的并行化程度;2)通信模式;3)内存访问模式。主要架构组件(CPU、内存和网络)的相对速度对于选择一个模型非常重要。使用混合模型，我们的结果表明，统一的MPI方法对大多数基准测试都更好。只有当快速处理器使通信性能显著且并行化水平足够时，混合方法才会变得更好。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

MPI versus MPI+OpenMP on the IBM SP for the NAS Benchmarks

The hybrid memory model of clusters of multiprocessors raises two issues: programming model and performance. Many parallel programs have been written by using the MPI standard. To evaluate the pertinence of hybrid models for existing MPI codes, we compare a unified model (MPI) and a hybrid one (OpenMP fine grain parallelization after profiling) for the NAS 2.3 benchmarks on two IBM SP systems. The superiority of one model depends on 1) the level of shared memory model parallelization, 2) the communication patterns and 3) the memory access patterns. The relative speeds of the main architecture components (CPU, memory, and network) are of tremendous importance for selecting one model. With the used hybrid model, our results show that a unified MPI approach is better for most of the benchmarks. The hybrid approach becomes better only when fast processors make the communication performance significant and the level of parallelization is sufficient.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM/IEEE SC 2000 Conference (SC'00)

自引率

0.00%

发文量

期刊最新文献

An Object-Oriented Job Execution Environment Automatically Tuned Collective Communications Architectural and Performance Evaluation of GigaNet and Myrinet Interconnects on Clusters of Small-Scale SMP Servers Parallel Smoothed Aggregation Multigrid : Aggregation Strategies on Massively Parallel Machines High-Cost CFD on a Low-Cost Cluster