FlowSail: Fine-Grained and Practical Flow Control for Datacenter Networks
Authors: Wenxue Li; Chaoliang Zeng; Jinbin Hu; Kai Chen
Journal: IEEE/ACM Transactions on Networking, vol. 32, no. 5, pp. 3916-3928
Publication date: 2024-03-29
Publication type: Journal Article
DOI: 10.1109/TNET.2024.3406613
URL: https://ieeexplore.ieee.org/document/10541943/
Impact factor: 3.6; JCR: Q2 (Computer Science, Hardware & Architecture)
Citations: 0
Abstract
As datacenter networks continue to support a wider range of applications and faster link speeds, they face the challenge of managing bursty traffic and transient congestion. End-to-end congestion controls (CCs) find it increasingly difficult to maintain effectiveness due to the inherent feedback delay. To address this issue, per-hop flow control (FC) has gained popularity due to its ability to react promptly to transient congestion. However, existing FC mechanisms either lack fine-grained (i.e., per-flow granularity) control or require an impractical number of queues that exceeds the capabilities of commodity switches. In this paper, we introduce FlowSail, an innovative FC scheme that enables fine-grained control at the per-flow level while requiring a practical number of switch queues, theoretically as few as two. The core of FlowSail is an effective approximation of ideal FC by three key design components: dynamic flow-to-queue mapping, hierarchical congested flow identification, and on-demand isolation. We have implemented a prototype of FlowSail using the programmable P4 switch and conducted extensive testbed experiments and simulations. The results indicate that FlowSail effectively sustains performance with significantly fewer queues compared to existing FC schemes. For instance, FlowSail achieves $4.3\times$ lower tail latency under the same number of queues, matches existing FC schemes with $4\times$ fewer queues, and holds robust performance with a minimum of 2 queues.
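The abstract names dynamic flow-to-queue mapping and on-demand isolation but gives no algorithmic detail, so the following is only a toy sketch of the general idea on a two-queue port, not FlowSail's actual design. Everything in it is an assumption made for illustration: the class `TwoQueuePort`, the parameters `congestion_threshold` and `dominant_share`, and the rule that treats the flow with the largest backlog as the congested one. The point it tries to show is that pausing only an isolation queue leaves uncongested flows in the shared queue unaffected.

```python
from collections import Counter


class TwoQueuePort:
    """Toy egress port: one shared queue plus one isolation queue.

    Illustrative assumptions only: the thresholds and the "dominant flow"
    rule below are made up for this sketch, not taken from FlowSail.
    """

    SHARED, ISOLATED = 0, 1

    def __init__(self, congestion_threshold=4, dominant_share=0.5):
        self.queues = {self.SHARED: [], self.ISOLATED: []}
        self.flow_to_queue = {}   # dynamic mapping; unmapped flows use SHARED
        self.paused = set()       # queues for which upstream is asked to pause
        self.congestion_threshold = congestion_threshold
        self.dominant_share = dominant_share

    def enqueue(self, flow_id, packet):
        queue_id = self.flow_to_queue.get(flow_id, self.SHARED)
        self.queues[queue_id].append((flow_id, packet))
        self._isolate_if_congested()

    def _isolate_if_congested(self):
        shared = self.queues[self.SHARED]
        if len(shared) < self.congestion_threshold:
            return  # no congestion, so no isolation is needed
        flow_id, count = Counter(f for f, _ in shared).most_common(1)[0]
        if count / len(shared) < self.dominant_share:
            return  # no single flow dominates the backlog
        # Remap the dominant flow and move its backlog, preserving packet order.
        self.flow_to_queue[flow_id] = self.ISOLATED
        self.queues[self.ISOLATED] += [p for p in shared if p[0] == flow_id]
        self.queues[self.SHARED] = [p for p in shared if p[0] != flow_id]
        self.paused.add(self.ISOLATED)  # pause only the isolated queue

    def drain_isolated(self):
        """Send the isolated backlog, then return its flows to the shared queue."""
        sent = self.queues[self.ISOLATED]
        self.queues[self.ISOLATED] = []
        self.flow_to_queue.clear()  # only ISOLATED mappings are ever stored
        self.paused.discard(self.ISOLATED)
        return sent


# Flow "a" builds a backlog; flow "b" sends a single packet.
port = TwoQueuePort()
for flow_id, seq in [("a", 0), ("b", 0), ("a", 1), ("a", 2), ("a", 3)]:
    port.enqueue(flow_id, seq)
```

In this run, flow "a" is remapped to the isolation queue once the shared queue reaches the assumed threshold, while flow "b" stays in the unpaused shared queue. A real switch would identify congested flows in the data plane and with far less state than a per-packet scan; the hierarchical identification step mentioned in the abstract is not modeled here.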
About the journal:
The IEEE/ACM Transactions on Networking’s high-level objective is to publish high-quality, original research results derived from theoretical or experimental exploration of the area of communication/computer networking, covering all sorts of information transport networks over all sorts of physical layer technologies, both wireline (all kinds of guided media: e.g., copper, optical) and wireless (e.g., radio-frequency, acoustic (e.g., underwater), infra-red), or hybrids of these. The journal welcomes applied contributions reporting on novel experiences and experiments with actual systems.