Efficient On-Demand Connection Management Mechanisms with PGAS Models over InfiniBand

Abhinav Vishnu, M. Krishnan
{"title":"Efficient On-Demand Connection Management Mechanisms with PGAS Models over InfiniBand","authors":"Abhinav Vishnu, M. Krishnan","doi":"10.1109/CCGRID.2010.58","DOIUrl":null,"url":null,"abstract":"In the last decade or so, clusters have observed a tremendous rise in popularity due to the excellent price to performance ratio. A variety of Interconnects have been proposed during this period, with InfiniBand leading the way due to its high performance and open standard. At the same time, multiple programming models have emerged in order to meet the requirements of various applications and their programming models. To support requirements of multiple programming models, InfiniBand provides multiple transport semantics, ranging from unreliable connectionless to reliable connected characteristics. Among them, the reliable connection (RC) semantics is being widely used due to its high performance and support for novel features like Remote Direct Memory Acesss (RDMA), hardware atomics and Network Fault Tolerance. However, the pair wise connection oriented nature of the RC transport semantics limits its scalability and usage at the increasing processor counts. In this paper, we design and implement on-demand connection management approaches in the context of Partitioned Global Address Space (PGAS) programming models, which provided shared memory abstraction and one-sided communication semantics, leading to the development of multiple languages (UPC, X10, Chapel) and libraries (Global Arrays, MPI-RMA). Using Global Arrays as the research vehicle, we implement this approach with Aggregate Remote Memory Copy Interface (ARMCI), the runtime system of Global Arrays. We evaluate our approach, ARMCI-On Demand Connection Management (ARMCI-ODCM) using various micro benchmarks and benchmarks (LU Factorization, Random-Access and Lennard Jones simulation) and application (Subsurface transport over multiple phases (STOMP)). With the performance evaluation for up to 4096 processors, we are able to have a multi-fold reduction in connection memory with a negligible degradation in performance. Using STOMP at 4096 processors, reduces the overall connection memory by 66 times with no performance degradation. To the best of our knowledge, this is the first design, implementation and evaluation of on-demand connection management with InfiniBand using PGAS models.","PeriodicalId":444485,"journal":{"name":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","volume":"105 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCGRID.2010.58","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13

Abstract

In the last decade or so, clusters have observed a tremendous rise in popularity due to the excellent price to performance ratio. A variety of Interconnects have been proposed during this period, with InfiniBand leading the way due to its high performance and open standard. At the same time, multiple programming models have emerged in order to meet the requirements of various applications and their programming models. To support requirements of multiple programming models, InfiniBand provides multiple transport semantics, ranging from unreliable connectionless to reliable connected characteristics. Among them, the reliable connection (RC) semantics is being widely used due to its high performance and support for novel features like Remote Direct Memory Acesss (RDMA), hardware atomics and Network Fault Tolerance. However, the pair wise connection oriented nature of the RC transport semantics limits its scalability and usage at the increasing processor counts. In this paper, we design and implement on-demand connection management approaches in the context of Partitioned Global Address Space (PGAS) programming models, which provided shared memory abstraction and one-sided communication semantics, leading to the development of multiple languages (UPC, X10, Chapel) and libraries (Global Arrays, MPI-RMA). Using Global Arrays as the research vehicle, we implement this approach with Aggregate Remote Memory Copy Interface (ARMCI), the runtime system of Global Arrays. We evaluate our approach, ARMCI-On Demand Connection Management (ARMCI-ODCM) using various micro benchmarks and benchmarks (LU Factorization, Random-Access and Lennard Jones simulation) and application (Subsurface transport over multiple phases (STOMP)). With the performance evaluation for up to 4096 processors, we are able to have a multi-fold reduction in connection memory with a negligible degradation in performance. Using STOMP at 4096 processors, reduces the overall connection memory by 66 times with no performance degradation. To the best of our knowledge, this is the first design, implementation and evaluation of on-demand connection management with InfiniBand using PGAS models.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
基于ib的PGAS模型的高效按需连接管理机制
在过去十年左右的时间里,由于出色的性价比,集群的受欢迎程度大幅上升。在此期间,各种各样的互连被提出,InfiniBand因其高性能和开放标准而引领潮流。同时,为了满足各种应用及其编程模型的需求,出现了多种编程模型。为了支持多种编程模型的需求,InfiniBand提供了多种传输语义,从不可靠的无连接到可靠的连接特征。其中,可靠连接(RC)语义由于其高性能和对远程直接内存访问(RDMA)、硬件原子和网络容错等新特性的支持而得到广泛应用。然而,RC传输语义的面向对连接的特性限制了它的可伸缩性和在处理器数量增加时的使用。在本文中,我们在分区全局地址空间(PGAS)编程模型的背景下设计并实现了按需连接管理方法,该方法提供了共享内存抽象和单向通信语义,从而导致了多种语言(UPC, X10, Chapel)和库(全局数组,MPI-RMA)的发展。本文以全局阵列为研究载体,利用全局阵列的运行时系统——聚合远程内存复制接口(armi)实现了该方法。我们使用各种微基准和基准(LU分解、随机访问和Lennard Jones模拟)以及应用(多阶段地下传输(STOMP))来评估我们的方法——armci随需应变连接管理(ARMCI-ODCM)。通过对多达4096个处理器的性能评估,我们能够将连接内存减少数倍,而性能的下降可以忽略不计。在4096个处理器上使用STOMP,在没有性能下降的情况下,将总连接内存减少了66倍。据我们所知,这是使用PGAS模型的InfiniBand按需连接管理的第一个设计,实施和评估。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
In Search of Visualization Metaphors for PlanetLab Multi-criteria Content Adaptation Service Selection Broker Enabling the Next Generation of Scalable Clusters Development and Support of Platforms for Research into Rare Diseases Using Cloud Constructs and Predictive Analysis to Enable Pre-Failure Process Migration in HPC Systems
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1