Enhancing Load Balancing With In-Network Recirculation to Prevent Packet Reordering in Lossless Data Centers

IF 3.6 · CAS Zone 3 (Computer Science) · JCR Q2 (Computer Science, Hardware & Architecture) · IEEE/ACM Transactions on Networking · Pub Date: 2024-03-27 · DOI: 10.1109/TNET.2024.3403671
Jinbin Hu;Yi He;Wangqing Luo;Jiawei Huang;Jin Wang
IEEE/ACM Transactions on Networking, vol. 32, no. 5, pp. 4114-4127. https://ieeexplore.ieee.org/document/10539004/
Citations: 0

Abstract

Many existing load balancing mechanisms work effectively in lossy datacenter networks (DCNs), but they suffer from serious packet reordering in lossless Ethernet DCNs that deploy hop-by-hop Priority-based Flow Control (PFC). The key reason is that prior solutions cannot perceive PFC triggering correctly and in a timely manner when making load balancing decisions. Once a forwarding path pauses transmission because PFC is triggered, the packets allocated to it are blocked, inevitably leading to out-of-order packets and retransmissions. In this paper, we present a Reordering-robust Load Balancing (RLB) scheme with PFC prediction for lossless DCNs. At its heart, RLB leverages the derivative of the ingress queue length to predict PFC triggering and proactively notifies upstream switches to choose an appropriate rerouting path or perform packet recirculation to avoid reordering. Furthermore, under switch failure scenarios, RLB adaptively adjusts the recirculation threshold to mitigate the risk of packet over-recirculation. We have implemented RLB on a hardware programmable switch. As a building block for existing load balancing mechanisms, RLB has been integrated into Presto, LetFlow, Hermes and DRILL. The evaluation results show that the RLB-enhanced solutions deliver significant performance improvements by avoiding packet reordering. For example, RLB reduces the $99^{th}$ percentile flow completion time (FCT) by up to 72%, 67%, 58% and 54% over DRILL, Presto, LetFlow and Hermes, respectively.
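The abstract states only that RLB uses the derivative of the ingress queue length to predict PFC triggering; it does not give the prediction rule. As a minimal sketch of that general idea, and not the authors' algorithm, the Python below estimates the derivative from two queue samples and extrapolates linearly to check whether the queue would reach the PFC pause threshold within a short horizon. The names QueueSample and predict_pfc_trigger, the threshold, and the horizon are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueueSample:
    """One observation of an ingress queue (hypothetical structure)."""
    time_us: float      # sample timestamp in microseconds
    length_bytes: int   # ingress queue occupancy in bytes


def predict_pfc_trigger(prev: QueueSample,
                        curr: QueueSample,
                        pfc_threshold_bytes: int,
                        horizon_us: float) -> bool:
    """Return True if the queue is expected to hit the PFC threshold soon.

    Assumption: growth is extrapolated linearly from the finite-difference
    derivative of the queue length. The paper may use a different rule.
    """
    # Already at or above the pause threshold: PFC is (about to be) triggered.
    if curr.length_bytes >= pfc_threshold_bytes:
        return True

    dt = curr.time_us - prev.time_us
    if dt <= 0:
        # No usable time difference, so no derivative can be estimated.
        return False

    # Finite-difference estimate of d(queue length)/dt in bytes per microsecond.
    slope = (curr.length_bytes - prev.length_bytes) / dt
    if slope <= 0:
        # Queue is stable or draining; no trigger is predicted.
        return False

    time_to_threshold_us = (pfc_threshold_bytes - curr.length_bytes) / slope
    return time_to_threshold_us <= horizon_us


if __name__ == "__main__":
    prev = QueueSample(time_us=0.0, length_bytes=10_000)
    curr = QueueSample(time_us=10.0, length_bytes=30_000)
    # Queue grows by 2000 B/us and is 20 000 B below the threshold.
    print(predict_pfc_trigger(prev, curr, pfc_threshold_bytes=50_000, horizon_us=15.0))
```

In a scheme like the one described, a positive prediction would be the point at which a switch warns its upstream neighbours, so that they can reroute or recirculate packets before the path is actually paused.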
Source journal
IEEE/ACM Transactions on Networking (Engineering & Technology: Telecommunications)
CiteScore: 8.20
Self-citation rate: 5.40%
Articles published: 246
Review time: 4-8 weeks
Journal description: The high-level objective of the IEEE/ACM Transactions on Networking is to publish high-quality, original research results derived from theoretical or experimental exploration of the area of communication/computer networking, covering all sorts of information transport networks over all sorts of physical layer technologies, both wireline (all kinds of guided media, e.g., copper, optical) and wireless (e.g., radio-frequency, acoustic (e.g., underwater), infra-red), or hybrids of these. The journal welcomes applied contributions reporting on novel experiences and experiments with actual systems.