Efficient and scalable all-to-all personalized exchange for InfiniBand-based clusters

International Conference on Parallel Processing, 2004. ICPP 2004. Pub Date: 2004-08-15. DOI: 10.1109/ICPP.2004.1327932

S. Sur, Hyun-Wook Jin, D. Panda


Citations: 24

Abstract

The all-to-all personalized exchange is the densest collective communication operation offered by the MPI specification: every process sends a different message to every other participating process. This collective operation is essential for many parallel scientific applications. As system and message sizes grow, it becomes challenging to offer a fast, scalable, and efficient implementation of this operation. InfiniBand is an emerging modern interconnect that offers very low latency, high bandwidth, and one-sided operations such as RDMA write. Its advanced features, such as RDMA write gather, allow us to design and implement all-to-all algorithms much more efficiently than in the past. Our aim in this work is to design efficient and scalable implementations of traditional personalized exchange algorithms. We present two novel approaches to designing all-to-all algorithms, for short and long messages respectively. The hypercube RDMA write gather and direct eager schemes effectively leverage the RDMA and RDMA with write gather mechanisms offered by InfiniBand. Performance evaluation of our design and implementation shows that it reduces the all-to-all communication time by up to a factor of 3.07 for 32-byte messages on a 16-node InfiniBand cluster. Our analytical models suggest that the proposed designs perform 64% better on InfiniBand clusters with 1024 nodes for a 4k message size.
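To make the communication pattern concrete, here is a minimal sketch of the textbook store-and-forward hypercube algorithm for the all-to-all personalized exchange, simulated in a single Python process. It is an illustration under stated assumptions, not the authors' implementation: it uses no MPI, InfiniBand, or RDMA write gather, and the function name `hypercube_alltoall` and the `send`/`recv` layout are hypothetical choices made for this example.

```python
from typing import List, Tuple

# (source rank, destination rank, payload); this layout is an assumption
# made for the sketch, not something taken from the paper.
Message = Tuple[int, int, str]


def hypercube_alltoall(send: List[List[str]]) -> Tuple[List[List[str]], int]:
    """Simulate a hypercube all-to-all personalized exchange.

    send[i][j] is the payload rank i wants delivered to rank j.
    Returns (recv, steps), where recv[j][i] is the payload rank j
    received from rank i and steps is the number of exchange rounds.
    """
    p = len(send)
    if p == 0 or p & (p - 1):
        raise ValueError("number of ranks must be a power of two")

    # Every rank starts out holding its own p outgoing messages.
    held: List[List[Message]] = [
        [(i, j, send[i][j]) for j in range(p)] for i in range(p)
    ]

    steps = 0
    bit = 1
    while bit < p:
        # In this round each rank talks only to the partner rank ^ bit.
        outgoing: List[List[Message]] = []
        for rank in range(p):
            # Forward every message whose destination differs from this
            # rank in the current bit; keep the rest for later rounds.
            forward = [m for m in held[rank] if ((m[1] ^ rank) & bit)]
            keep = [m for m in held[rank] if not ((m[1] ^ rank) & bit)]
            held[rank] = keep
            outgoing.append(forward)
        # Deliver all forwarded batches after every rank has chosen them,
        # which models the simultaneous pairwise exchange of one round.
        for rank in range(p):
            held[rank ^ bit].extend(outgoing[rank])
        bit <<= 1
        steps += 1

    # After log2(p) rounds every held message has reached its destination.
    recv = [[""] * p for _ in range(p)]
    for rank in range(p):
        for src, dst, payload in held[rank]:
            assert dst == rank
            recv[rank][src] = payload
    return recv, steps
```

With `p = 8` and `send[i][j] = f"{i}->{j}"`, the call `hypercube_alltoall(send)` finishes in 3 rounds, each moving half of every rank's buffer, whereas a direct pairwise exchange would need 7 rounds of single messages. Fewer, larger transfers tend to favor short messages, where per-message startup cost dominates.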
