ByteTuning: Watermark Tuning for RoCEv2

IEEE Transactions on Cloud Computing (IF 5; CAS Zone 2, Computer Science; JCR Q1, Computer Science, Information Systems) Pub Date: 2025-01-03 DOI: 10.1109/TCC.2025.3525496
Lizhuang Tan;Zhuo Jiang;Kefei Liu;Haoran Wei;Pengfei Huo;Huiling Shi;Wei Zhang;Wei Su
Volume 13, Issue 1, pp. 303-320. https://ieeexplore.ieee.org/document/10820527/
Citations: 0

Abstract

RDMA over Converged Ethernet v2 (RoCEv2) is one of the most popular high-speed datacenter networking solutions. "Watermark" is the general term for the various trigger and release thresholds of RoCEv2 flow control protocols, and configuring these thresholds well is an important factor in RoCEv2 performance. In this paper, we propose ByteTuning, a centralized watermark tuning system for RoCEv2. First, we report three real cases of network performance degradation caused by non-optimal or improper watermark configuration, and we sweep watermark configurations in three typical scenarios and compare the resulting network performance, which shows that watermark tuning is necessary. Then, based on the RDMA fluid model, we model and evaluate the influence of watermarks on RoCEv2 performance. Next, we introduce the design of ByteTuning, which includes three mechanisms: 1) using a simulated annealing algorithm to make the real-time watermark converge to a near-optimal configuration, 2) using network telemetry to reduce the feedback overhead, and 3) compressing the search space to improve tuning efficiency. Finally, we validate the performance of ByteTuning in multiple real datacenter networking environments, and the results show that ByteTuning outperforms existing solutions.
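To make the first and third mechanisms more concrete, the sketch below shows a generic simulated annealing loop searching a bounded watermark space. It is an illustration only and not ByteTuning's actual algorithm: the parameter names (kmin_kb, kmax_kb, xoff_kb), the bounds, the step size, and the toy cost function are all assumptions introduced here. In a real deployment the cost would presumably come from measured network feedback rather than a formula.

```python
import math
import random

# Hypothetical, narrowed search ranges (in KB) for three thresholds.
# These names and values are assumptions for illustration, not from the paper.
BOUNDS = {"kmin_kb": (10, 400), "kmax_kb": (100, 1600), "xoff_kb": (50, 1000)}
STEP_KB = 10
START = {"kmin_kb": 400, "kmax_kb": 1600, "xoff_kb": 1000}


def toy_cost(cfg):
    """Stand-in for measured feedback: distance from an arbitrary target."""
    target = {"kmin_kb": 100, "kmax_kb": 400, "xoff_kb": 500}
    return sum(((cfg[k] - target[k]) / 100.0) ** 2 for k in target)


def neighbor(cfg, rng):
    """Move one threshold by one step, clamped to its bounds."""
    new = dict(cfg)
    key = rng.choice(sorted(BOUNDS))
    lo, hi = BOUNDS[key]
    new[key] = min(hi, max(lo, new[key] + rng.choice((-STEP_KB, STEP_KB))))
    # Keep the configuration valid: the lower ECN threshold stays below the upper.
    if new["kmin_kb"] >= new["kmax_kb"]:
        return cfg
    return new


def simulated_annealing(cost, start, neighbor_fn, t0=1.0, cooling=0.995,
                        steps=2000, seed=0):
    """Minimize cost(config) with a geometric cooling schedule."""
    rng = random.Random(seed)
    current, current_cost = start, cost(start)
    best, best_cost = current, current_cost
    temp = t0
    for _ in range(steps):
        candidate = neighbor_fn(current, rng)
        candidate_cost = cost(candidate)
        delta = candidate_cost - current_cost
        # Always accept improvements; accept regressions with
        # probability exp(-delta / temp), which shrinks as temp cools.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current, current_cost = candidate, candidate_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temp = max(temp * cooling, 1e-9)
    return best, best_cost
```

Calling simulated_annealing(toy_cost, START, neighbor) returns the best configuration seen and its cost. Narrowing BOUNDS or enlarging STEP_KB shrinks the number of candidate configurations, which is one simple way to read "compressing the search space"; the paper's own compression method may differ.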
Source journal: IEEE Transactions on Cloud Computing (Computer Science-Software)
CiteScore: 9.40
Self-citation rate: 6.20%
Articles published: 167
Journal description: The IEEE Transactions on Cloud Computing (TCC) is dedicated to the multidisciplinary field of cloud computing. It is committed to the publication of articles that present innovative research ideas, application results, and case studies in cloud computing, focusing on key technical issues related to theory, algorithms, systems, applications, and performance.