基于事后确认的轻量级 RDMA 连接协议

IF 3.4 3区计算机科学 Q1 COMPUTER SCIENCE, THEORY & METHODS Journal of Parallel and Distributed Computing Pub Date : 2024-10-01 DOI:10.1016/j.jpdc.2024.104991

Ke Wu, Dezun Dong, Weixia Xu

{"title":"基于事后确认的轻量级 RDMA 连接协议","authors":"Ke Wu, Dezun Dong, Weixia Xu","doi":"10.1016/j.jpdc.2024.104991","DOIUrl":null,"url":null,"abstract":"<div><div>With the increasing scale and complexity of high-performance computing systems, the rising failure rate poses significant challenges for RDMA networks that aim for high bandwidth and low latency. RDMA networks require hardware-level end-to-end reliable data transmission services to avoid the high cost of software failure recovery. Tianhe HPC interconnection network adopts a NIC-based RDMA reliable connection protocol, RCP. RCP establishes a connection for each message that enters the NIC and releases it after the transmission is complete. However, this introduces an additional round-trip time RTT connection overhead for each message, which severely impacts the performance of networks dominated by short messages in high-performance computing systems. We have found that utilization of receiver-side connection resources has been consistently low because maintaining message-grained connections on the NIC results in rapid release of connections. Therefore, we propose a lightweight RDMA connection protocol based on post-hoc confirmation, PCP. PCP assumes the receiver has connection resources by default and eliminates the need for confirmation from the receiver before sending a message, thus reducing the connection overhead of almost all messages by one RTT. At the same time, PCP also includes mechanisms to address the special case where the receiver lacks connection resources. Evaluation results demonstrate that PCP significantly optimizes short messages and applications dominated by short messages. Moreover, PCP further reduces the usage of receiver-side connection resources. Additionally, PCP does not experience performance degradation even under large-scale heavy loads and severe endpoint congestion.</div></div>","PeriodicalId":54775,"journal":{"name":"Journal of Parallel and Distributed Computing","volume":"195 ","pages":"Article 104991"},"PeriodicalIF":3.4000,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A lightweight RDMA connection protocol based on post-hoc confirmation\",\"authors\":\"Ke Wu, Dezun Dong, Weixia Xu\",\"doi\":\"10.1016/j.jpdc.2024.104991\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>With the increasing scale and complexity of high-performance computing systems, the rising failure rate poses significant challenges for RDMA networks that aim for high bandwidth and low latency. RDMA networks require hardware-level end-to-end reliable data transmission services to avoid the high cost of software failure recovery. Tianhe HPC interconnection network adopts a NIC-based RDMA reliable connection protocol, RCP. RCP establishes a connection for each message that enters the NIC and releases it after the transmission is complete. However, this introduces an additional round-trip time RTT connection overhead for each message, which severely impacts the performance of networks dominated by short messages in high-performance computing systems. We have found that utilization of receiver-side connection resources has been consistently low because maintaining message-grained connections on the NIC results in rapid release of connections. Therefore, we propose a lightweight RDMA connection protocol based on post-hoc confirmation, PCP. PCP assumes the receiver has connection resources by default and eliminates the need for confirmation from the receiver before sending a message, thus reducing the connection overhead of almost all messages by one RTT. At the same time, PCP also includes mechanisms to address the special case where the receiver lacks connection resources. Evaluation results demonstrate that PCP significantly optimizes short messages and applications dominated by short messages. Moreover, PCP further reduces the usage of receiver-side connection resources. Additionally, PCP does not experience performance degradation even under large-scale heavy loads and severe endpoint congestion.</div></div>\",\"PeriodicalId\":54775,\"journal\":{\"name\":\"Journal of Parallel and Distributed Computing\",\"volume\":\"195 \",\"pages\":\"Article 104991\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2024-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Parallel and Distributed Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0743731524001552\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Parallel and Distributed Computing","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0743731524001552","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}

引用次数: 0

摘要

随着高性能计算系统的规模和复杂性不断扩大，故障率的上升给追求高带宽和低延迟的 RDMA 网络带来了巨大挑战。RDMA 网络需要硬件级的端到端可靠数据传输服务，以避免高昂的软件故障恢复成本。天河高性能计算互连网络采用了基于网卡的 RDMA 可靠连接协议 RCP。RCP 为每个进入网卡的报文建立连接，并在传输完成后释放连接。然而，这为每个报文带来了额外的往返时间 RTT 连接开销，严重影响了高性能计算系统中以短报文为主的网络性能。我们发现，接收端连接资源的利用率一直很低，因为在网卡上维护消息粒度连接会导致连接的快速释放。因此，我们提出了一种基于事后确认的轻量级 RDMA 连接协议 PCP。PCP 默认假定接收方拥有连接资源，在发送消息前无需接收方确认，因此几乎所有消息的连接开销都减少了一个 RTT。同时，PCP 还包括处理接收方缺乏连接资源的特殊情况的机制。评估结果表明，PCP 显著优化了短信息和以短信息为主的应用。此外，PCP 还进一步减少了接收方连接资源的使用。此外，即使在大规模重负载和端点严重拥塞的情况下，PCP 也不会出现性能下降。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

A lightweight RDMA connection protocol based on post-hoc confirmation

With the increasing scale and complexity of high-performance computing systems, the rising failure rate poses significant challenges for RDMA networks that aim for high bandwidth and low latency. RDMA networks require hardware-level end-to-end reliable data transmission services to avoid the high cost of software failure recovery. Tianhe HPC interconnection network adopts a NIC-based RDMA reliable connection protocol, RCP. RCP establishes a connection for each message that enters the NIC and releases it after the transmission is complete. However, this introduces an additional round-trip time RTT connection overhead for each message, which severely impacts the performance of networks dominated by short messages in high-performance computing systems. We have found that utilization of receiver-side connection resources has been consistently low because maintaining message-grained connections on the NIC results in rapid release of connections. Therefore, we propose a lightweight RDMA connection protocol based on post-hoc confirmation, PCP. PCP assumes the receiver has connection resources by default and eliminates the need for confirmation from the receiver before sending a message, thus reducing the connection overhead of almost all messages by one RTT. At the same time, PCP also includes mechanisms to address the special case where the receiver lacks connection resources. Evaluation results demonstrate that PCP significantly optimizes short messages and applications dominated by short messages. Moreover, PCP further reduces the usage of receiver-side connection resources. Additionally, PCP does not experience performance degradation even under large-scale heavy loads and severe endpoint congestion.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of Parallel and Distributed Computing 工程技术-计算机：理论方法

CiteScore

10.30

自引率

2.60%

发文量

172

审稿时长

12 months

期刊介绍： This international journal is directed to researchers, engineers, educators, managers, programmers, and users of computers who have particular interests in parallel processing and/or distributed computing. The Journal of Parallel and Distributed Computing publishes original research papers and timely review articles on the theory, design, evaluation, and use of parallel and/or distributed computing systems. The journal also features special issues on these topics; again covering the full range from the design to the use of our targeted systems.