Proceedings of the 2021 ACM SIGCOMM 2021 Conference最新文献

英文中文

CocoSketch

Proceedings of the 2021 ACM SIGCOMM 2021 Conference

Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472892

Yinda Zhang, Zaoxing Liu, Ruixin Wang, Tong Yang, Jizhou Li, Ruijie Miao, Peng Liu, Ruwen Zhang, Junchen Jiang

Sketch-based measurement has emerged as a promising alternative to the traditional sampling-based network measurement approaches due to its high accuracy and resource efficiency. While there have been various designs around sketches, they focus on measuring one particular flow key, and it is infeasible to support many keys based on these sketches. In this work, we take a significant step towards supporting arbitrary partial key queries, where we only need to specify a full range of possible flow keys that are of interest before measurement starts, and in query time, we can extract the information of any key in that range. We design CocoSketch, which casts arbitrary partial key queries to the subset sum estimation problem and makes the theoretical tools for subset sum estimation practical. To realize desirable resource-accuracy tradeoffs in software and hardware platforms, we propose two techniques: (1) stochastic variance minimization to significantly reduce per-packet update delay, and (2) removing circular dependencies in the per-packet update logic to make the implementation hardware-friendly. We implement CocoSketch on four popular platforms (CPU, Open vSwitch, P4, and FPGA) and show that compared to baselines that use traditional single-key sketches, CocoSketch improves average packet processing throughput by 27.2x and accuracy by 10.4x when measuring six flow keys.

{"title":"CocoSketch","authors":"Yinda Zhang, Zaoxing Liu, Ruixin Wang, Tong Yang, Jizhou Li, Ruijie Miao, Peng Liu, Ruwen Zhang, Junchen Jiang","doi":"10.1145/3452296.3472892","DOIUrl":"https://doi.org/10.1145/3452296.3472892","url":null,"abstract":"Sketch-based measurement has emerged as a promising alternative to the traditional sampling-based network measurement approaches due to its high accuracy and resource efficiency. While there have been various designs around sketches, they focus on measuring one particular flow key, and it is infeasible to support many keys based on these sketches. In this work, we take a significant step towards supporting arbitrary partial key queries, where we only need to specify a full range of possible flow keys that are of interest before measurement starts, and in query time, we can extract the information of any key in that range. We design CocoSketch, which casts arbitrary partial key queries to the subset sum estimation problem and makes the theoretical tools for subset sum estimation practical. To realize desirable resource-accuracy tradeoffs in software and hardware platforms, we propose two techniques: (1) stochastic variance minimization to significantly reduce per-packet update delay, and (2) removing circular dependencies in the per-packet update logic to make the implementation hardware-friendly. We implement CocoSketch on four popular platforms (CPU, Open vSwitch, P4, and FPGA) and show that compared to baselines that use traditional single-key sketches, CocoSketch improves average packet processing throughput by 27.2x and accuracy by 10.4x when measuring six flow keys.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81500936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

1Pipe 1管

Proceedings of the 2021 ACM SIGCOMM 2021 Conference

Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472909

Bojie Li, Gefei Zuo, Wei Bai, Lintao Zhang

This paper proposes 1Pipe, a novel communication abstraction that enables different receivers to process messages from senders in a consistent total order. More precisely, 1Pipe provides both unicast and scattering (i.e., a group of messages to different destinations) in a causally and totally ordered manner. 1Pipe provides a best effort service that delivers each message at most once, as well as a reliable service that guarantees delivery and provides restricted atomic delivery for each scattering. 1Pipe can simplify and accelerate many distributed applications, e.g., transactional key-value stores, log replication, and distributed data structures. We propose a scalable and efficient method to implement 1Pipe inside data centers. To achieve total order delivery in a scalable manner, 1Pipe separates the bookkeeping of order information from message forwarding, and distributes the work to each switch and host. 1Pipe aggregates order information using in-network computation at switches. This forms the “control plane” of the system. On the “data plane”, 1Pipe forwards messages in the network as usual and reorders them at the receiver based on the order information. Evaluation on a 32-server testbed shows that 1Pipe achieves scalable throughput (80M messages per second per host) and low latency (10𝜇s) with little CPU and network overhead. 1Pipe achieves linearly scalable throughput and low latency in transactional key-value store, TPC-C, remote data structures, and replication that outperforms traditional designs by 2∼20x.

{"title":"1Pipe","authors":"Bojie Li, Gefei Zuo, Wei Bai, Lintao Zhang","doi":"10.1145/3452296.3472909","DOIUrl":"https://doi.org/10.1145/3452296.3472909","url":null,"abstract":"This paper proposes 1Pipe, a novel communication abstraction that enables different receivers to process messages from senders in a consistent total order. More precisely, 1Pipe provides both unicast and scattering (i.e., a group of messages to different destinations) in a causally and totally ordered manner. 1Pipe provides a best effort service that delivers each message at most once, as well as a reliable service that guarantees delivery and provides restricted atomic delivery for each scattering. 1Pipe can simplify and accelerate many distributed applications, e.g., transactional key-value stores, log replication, and distributed data structures. We propose a scalable and efficient method to implement 1Pipe inside data centers. To achieve total order delivery in a scalable manner, 1Pipe separates the bookkeeping of order information from message forwarding, and distributes the work to each switch and host. 1Pipe aggregates order information using in-network computation at switches. This forms the “control plane” of the system. On the “data plane”, 1Pipe forwards messages in the network as usual and reorders them at the receiver based on the order information. Evaluation on a 32-server testbed shows that 1Pipe achieves scalable throughput (80M messages per second per host) and low latency (10𝜇s) with little CPU and network overhead. 1Pipe achieves linearly scalable throughput and low latency in transactional key-value store, TPC-C, remote data structures, and replication that outperforms traditional designs by 2∼20x.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"53 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73621185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

ACC: automatic ECN tuning for high-speed datacenter networks ACC:用于高速数据中心网络的自动ECN调优

Proceedings of the 2021 ACM SIGCOMM 2021 Conference

Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472927

Siyu Yan, Xiaoliang Wang, Xiaolong Zheng, Yinben Xia, Derui Liu, Weishan Deng

For the widely deployed ECN-based congestion control schemes, the marking threshold is the key to deliver high bandwidth and low latency. However, due to traffic dynamics in the high-speed production networks, it is difficult to maintain persistent performance by using the static ECN setting. To meet the operational challenge, in this paper we report the design and implementation of an automatic run-time optimization scheme, ACC, which leverages the multi-agent reinforcement learning technique to dynamically adjust the marking threshold at each switch. The proposed approach works in a distributed fashion and combines offline and online training to adapt to dynamic traffic patterns. It can be easily deployed based on the common features supported by major commodity switching chips. Both testbed experiments and large-scale simulations have shown that ACC achieves low flow completion time (FCT) for both mice flows and elephant flows at line-rate. Under heterogeneous production environments with 300 machines, compared with the well-tuned static ECN settings, ACC achieves up to 20% improvement on IOPS and 30% lower FCT for storage service. ACC has been applied in high-speed datacenter networks and significantly simplifies the network operations.

对于广泛部署的基于ecn的拥塞控制方案，标记阈值是实现高带宽和低延迟的关键。然而，由于高速生产网络中流量的动态性，使用静态ECN设置很难保持持久的性能。为了应对操作挑战，本文报告了一种自动运行时优化方案ACC的设计和实现，该方案利用多智能体强化学习技术动态调整每次切换时的标记阈值。所提出的方法以分布式方式工作，并将离线和在线培训相结合，以适应动态流量模式。它可以基于主要商品交换芯片支持的通用特性轻松部署。试验台实验和大规模模拟都表明，ACC在小鼠流和大象流中均实现了低流完成时间(FCT)。在拥有300台机器的异构生产环境下，与经过优化的静态ECN设置相比，ACC在IOPS方面提高了20%，在存储服务方面降低了30%的FCT。ACC已应用于高速数据中心网络，大大简化了网络操作。

{"title":"ACC: automatic ECN tuning for high-speed datacenter networks","authors":"Siyu Yan, Xiaoliang Wang, Xiaolong Zheng, Yinben Xia, Derui Liu, Weishan Deng","doi":"10.1145/3452296.3472927","DOIUrl":"https://doi.org/10.1145/3452296.3472927","url":null,"abstract":"For the widely deployed ECN-based congestion control schemes, the marking threshold is the key to deliver high bandwidth and low latency. However, due to traffic dynamics in the high-speed production networks, it is difficult to maintain persistent performance by using the static ECN setting. To meet the operational challenge, in this paper we report the design and implementation of an automatic run-time optimization scheme, ACC, which leverages the multi-agent reinforcement learning technique to dynamically adjust the marking threshold at each switch. The proposed approach works in a distributed fashion and combines offline and online training to adapt to dynamic traffic patterns. It can be easily deployed based on the common features supported by major commodity switching chips. Both testbed experiments and large-scale simulations have shown that ACC achieves low flow completion time (FCT) for both mice flows and elephant flows at line-rate. Under heterogeneous production environments with 300 machines, compared with the well-tuned static ECN settings, ACC achieves up to 20% improvement on IOPS and 30% lower FCT for storage service. ACC has been applied in high-speed datacenter networks and significantly simplifies the network operations.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82069135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 34

A composition framework for change management 变更管理的组合框架

Proceedings of the 2021 ACM SIGCOMM 2021 Conference

Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472901

A. Mahimkar, Carlos Eduardo de Andrade, R. Sinha, Giritharan Rana

Change management has been a long-standing challenge for network operations. The large scale and diversity of networks, their complex dependencies, and continuous evolution through technology and software updates combined with the risk of service impact create tremendous challenges to effectively manage changes. In this paper, we use data from a large service provider and experiences of their operations teams to highlight the need for quick and easy adaptation of change management capabilities and keep up with the continuous network changes. We propose a new framework CORNET (COmposition fRamework for chaNge managEmenT) with key ideas of modularization of changes into building blocks, flexible composition into change workflows, change plan optimization, change impact verification, and automated translation of high-level change management intent into low-level implementations and mathematical models. We demonstrate the effectiveness of CORNET using real-world data collected from 4G and 5G cellular networks and virtualized services such as VPN and SDWAN running in the cloud as well as experiments conducted on a testbed of virtualized network functions. We also share our operational experiences and lessons learned from successfully using CORNET within a large service provider network over the last three years.

变更管理一直是网络运营的一个长期挑战。网络的大规模和多样性、它们复杂的依赖关系以及通过技术和软件更新而不断发展，再加上服务影响的风险，为有效地管理变化带来了巨大的挑战。在本文中，我们使用了来自一家大型服务提供商的数据及其运营团队的经验，以强调快速简便地适应变更管理能力的需求，并跟上持续的网络变化。我们提出了一个新的框架CORNET(变更管理组合框架)，其关键思想是将变更模块化为构建块，灵活组合为变更工作流，变更计划优化，变更影响验证，以及将高级变更管理意图自动转换为低级实现和数学模型。我们使用从4G和5G蜂窝网络和虚拟化服务(如VPN和SDWAN)中收集的真实数据以及在虚拟化网络功能测试平台上进行的实验来证明CORNET的有效性。我们还分享了过去三年在大型服务提供商网络中成功使用CORNET的操作经验和教训。

{"title":"A composition framework for change management","authors":"A. Mahimkar, Carlos Eduardo de Andrade, R. Sinha, Giritharan Rana","doi":"10.1145/3452296.3472901","DOIUrl":"https://doi.org/10.1145/3452296.3472901","url":null,"abstract":"Change management has been a long-standing challenge for network operations. The large scale and diversity of networks, their complex dependencies, and continuous evolution through technology and software updates combined with the risk of service impact create tremendous challenges to effectively manage changes. In this paper, we use data from a large service provider and experiences of their operations teams to highlight the need for quick and easy adaptation of change management capabilities and keep up with the continuous network changes. We propose a new framework CORNET (COmposition fRamework for chaNge managEmenT) with key ideas of modularization of changes into building blocks, flexible composition into change workflows, change plan optimization, change impact verification, and automated translation of high-level change management intent into low-level implementations and mathematical models. We demonstrate the effectiveness of CORNET using real-world data collected from 4G and 5G cellular networks and virtualized services such as VPN and SDWAN running in the cloud as well as experiments conducted on a testbed of virtualized network functions. We also share our operational experiences and lessons learned from successfully using CORNET within a large service provider network over the last three years.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82292515","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 4

Gimbal 常平架

Proceedings of the 2021 ACM SIGCOMM 2021 Conference

Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472940

Jaehong Min, Ming G. Liu, Tapan Chugh, Chenxingyu Zhao, Andrew Wei, I. Doh, A. Krishnamurthy

Emerging SmartNIC-based disaggregated NVMe storage has become a promising storage infrastructure due to its competitive IO performance and low cost. These SmartNIC JBOFs are shared among multiple co-resident applications, and there is a need for the platform to ensure fairness, QoS, and high utilization. Unfortunately, given the limited computing capability of the SmartNICs and the non-deterministic nature of NVMe drives, it is challenging to provide such support on today's SmartNIC JBOFs. This paper presents Gimbal, a software storage switch that orchestrates IO traffic between Ethernet ports and NVMe drives for co-located tenants. It enables efficient multi-tenancy on SmartNIC JBOFs using the following techniques: a delay-based SSD congestion control algorithm, dynamic estimation of SSD write costs, a fair scheduler that operates at the granularity of a virtual slot, and an end-to-end credit-based flow control channel. Our prototyped system not only achieves up to x6.6 better utilization and 62.6% less tail latency but also improves the fairness for complex workloads. It also improves a commercial key-value store performance in a multi-tenant environment with x1.7 better throughput and 35.0% less tail latency on average.

{"title":"Gimbal","authors":"Jaehong Min, Ming G. Liu, Tapan Chugh, Chenxingyu Zhao, Andrew Wei, I. Doh, A. Krishnamurthy","doi":"10.1145/3452296.3472940","DOIUrl":"https://doi.org/10.1145/3452296.3472940","url":null,"abstract":"Emerging SmartNIC-based disaggregated NVMe storage has become a promising storage infrastructure due to its competitive IO performance and low cost. These SmartNIC JBOFs are shared among multiple co-resident applications, and there is a need for the platform to ensure fairness, QoS, and high utilization. Unfortunately, given the limited computing capability of the SmartNICs and the non-deterministic nature of NVMe drives, it is challenging to provide such support on today's SmartNIC JBOFs. This paper presents Gimbal, a software storage switch that orchestrates IO traffic between Ethernet ports and NVMe drives for co-located tenants. It enables efficient multi-tenancy on SmartNIC JBOFs using the following techniques: a delay-based SSD congestion control algorithm, dynamic estimation of SSD write costs, a fair scheduler that operates at the granularity of a virtual slot, and an end-to-end credit-based flow control channel. Our prototyped system not only achieves up to x6.6 better utilization and 62.6% less tail latency but also improves the fairness for complex workloads. It also improves a commercial key-value store performance in a multi-tenant environment with x1.7 better throughput and 35.0% less tail latency on average.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"63 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80761101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 35

Understanding host network stack overheads 了解主机网络堆栈开销

Proceedings of the 2021 ACM SIGCOMM 2021 Conference

Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472888

Qizhe Cai, Shubham Chaudhary, Midhul Vuppalapati, Jaehyun Hwang, R. Agarwal

Traditional end-host network stacks are struggling to keep up with rapidly increasing datacenter access link bandwidths due to their unsustainable CPU overheads. Motivated by this, our community is exploring a multitude of solutions for future network stacks: from Linux kernel optimizations to partial hardware offload to clean-slate userspace stacks to specialized host network hardware. The design space explored by these solutions would benefit from a detailed understanding of CPU inefficiencies in existing network stacks. This paper presents measurement and insights for Linux kernel network stack performance for 100Gbps access link bandwidths. Our study reveals that such high bandwidth links, coupled with relatively stagnant technology trends for other host resources (e.g., CPU speeds and capacity, cache sizes, NIC buffer sizes, etc.), mark a fundamental shift in host network stack bottlenecks. For instance, we find that a single core is no longer able to process packets at line rate, with data copy from kernel to application buffers at the receiver becoming the core performance bottleneck. In addition, increase in bandwidth-delay products have outpaced the increase in cache sizes, resulting in inefficient DMA pipeline between the NIC and the CPU. Finally, we find that traditional loosely-coupled design of network stack and CPU schedulers in existing operating systems becomes a limiting factor in scaling network stack performance across cores. Based on insights from our study, we discuss implications to design of future operating systems, network protocols, and host hardware.

由于其不可持续的CPU开销，传统的终端主机网络堆栈正在努力跟上快速增长的数据中心访问链路带宽。受此激励，我们的社区正在为未来的网络栈探索多种解决方案:从Linux内核优化到部分硬件卸载，从全新的用户空间栈到专门的主机网络硬件。这些解决方案所探索的设计空间将受益于对现有网络堆栈中CPU效率低下的详细理解。本文介绍了100Gbps访问链路带宽下Linux内核网络堆栈性能的测量和见解。我们的研究表明，这样的高带宽链路，加上其他主机资源相对停滞的技术趋势(例如，CPU速度和容量，缓存大小，NIC缓冲区大小等)，标志着主机网络堆栈瓶颈的根本转变。例如，我们发现单个内核不再能够以线速率处理数据包，从内核到接收器的应用程序缓冲区的数据复制成为核心性能瓶颈。此外，带宽延迟产品的增加超过了缓存大小的增加，导致网卡和CPU之间的DMA管道效率低下。最后，我们发现，在现有的操作系统中，传统的网络堆栈和CPU调度程序的松耦合设计成为跨核扩展网络堆栈性能的限制因素。基于我们研究的见解，我们讨论了对未来操作系统、网络协议和主机硬件设计的影响。

{"title":"Understanding host network stack overheads","authors":"Qizhe Cai, Shubham Chaudhary, Midhul Vuppalapati, Jaehyun Hwang, R. Agarwal","doi":"10.1145/3452296.3472888","DOIUrl":"https://doi.org/10.1145/3452296.3472888","url":null,"abstract":"Traditional end-host network stacks are struggling to keep up with rapidly increasing datacenter access link bandwidths due to their unsustainable CPU overheads. Motivated by this, our community is exploring a multitude of solutions for future network stacks: from Linux kernel optimizations to partial hardware offload to clean-slate userspace stacks to specialized host network hardware. The design space explored by these solutions would benefit from a detailed understanding of CPU inefficiencies in existing network stacks. This paper presents measurement and insights for Linux kernel network stack performance for 100Gbps access link bandwidths. Our study reveals that such high bandwidth links, coupled with relatively stagnant technology trends for other host resources (e.g., CPU speeds and capacity, cache sizes, NIC buffer sizes, etc.), mark a fundamental shift in host network stack bottlenecks. For instance, we find that a single core is no longer able to process packets at line rate, with data copy from kernel to application buffers at the receiver becoming the core performance bottleneck. In addition, increase in bandwidth-delay products have outpaced the increase in cache sizes, resulting in inefficient DMA pipeline between the NIC and the CPU. Finally, we find that traditional loosely-coupled design of network stack and CPU schedulers in existing operating systems becomes a limiting factor in scaling network stack performance across cores. Based on insights from our study, we discuss implications to design of future operating systems, network protocols, and host hardware.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"10 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86539323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 59

Sailfish 旗鱼

Proceedings of the 2021 ACM SIGCOMM 2021 Conference

Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472889

Tian Pan, Nianbing Yu, Chenhao Jia, Jianwen Pi, Liang Xu, Yisong Qiao, Zhiguo Li, Kun Liu, Jie Lu, Jianyuan Lu, Enge Song, Jiao Zhang, Tao Huang, Shunmin Zhu

The cloud gateway is essential in the public cloud as the central hub of cloud traffic. We show that horizontal scaling of software gateways, once sustainable for years, is no longer future-proof facing the massive scale and rapid growth of today's cloud. The root cause is the stagnant performance of the CPU core, which is prone to be overloaded by heavy hitters as traffic growth goes far beyond Moore's law. To address this, we propose emph{Sailfish}, a cloud-scale multi-tenant multi-service gateway accelerated by programmable switches. The new challenge is that large forwarding tables due to multi-tenancy cannot be fit into the limited on-chip memories. To this end, we devise a multi-pronged approach with (1) hardware/software co-design for table sharing, (2) horizontal table splitting among gateway clusters, (3) pipeline-aware table compression for a single node. Compared with the x86 gateway of a similar price, Sailfish reduces latency by 95% (2μs), improves throughput by more than 20x in bps (3.2Tbps) and 71x in pps (1.8Gpps) with packet length < 256B. Sailfish has been deployed in Alibaba Cloud for more than two years. It is the first P4-based cloud gateway in the industry, of which a single cluster carries dozens of Tbps traffic, withstanding peak-hour traffic in large online shopping festivals.

{"title":"Sailfish","authors":"Tian Pan, Nianbing Yu, Chenhao Jia, Jianwen Pi, Liang Xu, Yisong Qiao, Zhiguo Li, Kun Liu, Jie Lu, Jianyuan Lu, Enge Song, Jiao Zhang, Tao Huang, Shunmin Zhu","doi":"10.1145/3452296.3472889","DOIUrl":"https://doi.org/10.1145/3452296.3472889","url":null,"abstract":"The cloud gateway is essential in the public cloud as the central hub of cloud traffic. We show that horizontal scaling of software gateways, once sustainable for years, is no longer future-proof facing the massive scale and rapid growth of today's cloud. The root cause is the stagnant performance of the CPU core, which is prone to be overloaded by heavy hitters as traffic growth goes far beyond Moore's law. To address this, we propose emph{Sailfish}, a cloud-scale multi-tenant multi-service gateway accelerated by programmable switches. The new challenge is that large forwarding tables due to multi-tenancy cannot be fit into the limited on-chip memories. To this end, we devise a multi-pronged approach with (1) hardware/software co-design for table sharing, (2) horizontal table splitting among gateway clusters, (3) pipeline-aware table compression for a single node. Compared with the x86 gateway of a similar price, Sailfish reduces latency by 95% (2μs), improves throughput by more than 20x in bps (3.2Tbps) and 71x in pps (1.8Gpps) with packet length < 256B. Sailfish has been deployed in Alibaba Cloud for more than two years. It is the first P4-based cloud gateway in the industry, of which a single cluster carries dozens of Tbps traffic, withstanding peak-hour traffic in large online shopping festivals.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"25 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81637277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

Congestion detection in lossless networks 无损网络中的拥塞检测

Proceedings of the 2021 ACM SIGCOMM 2021 Conference

Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472899

Yiran Zhang, Yifan Liu, Qingkai Meng, Fengyuan Ren

Congestion detection is the cornerstone of end-to-end congestion control. Through in-depth observations and understandings, we reveal that existing congestion detection mechanisms in mainstream lossless networks (i.e., Converged Enhanced Ethernet and InfiniBand) are improper, due to failing to cognize the interaction between hop-by-hop flow controls and congestion detection behaviors in switches. We define ternary states of switch ports and present Ternary Congestion Detection (TCD) for mainstream lossless networks. Testbed and extensive simulations demonstrate that TCD can detect congestion ports accurately and identify flows contributing to congestion as well as flows only affected by hop-by-hop flow controls. Meanwhile, we shed light on how to incorporate TCD with rate control. Case studies show that existing congestion control algorithms can achieve 3.3x and 2.0x better median and 99th-percentile FCT slowdown by combining with TCD.

拥塞检测是端到端拥塞控制的基础。通过深入的观察和理解，我们发现主流无损网络(即融合增强型以太网和InfiniBand)中现有的拥塞检测机制是不合适的，原因是无法识别交换机中逐跳流控制与拥塞检测行为之间的相互作用。定义了交换机端口的三元状态，提出了主流无损网络的三元拥塞检测(TCD)方法。测试平台和大量的仿真表明，TCD可以准确地检测拥塞端口，并识别导致拥塞的流以及仅受逐跳流控制影响的流。与此同时，我们也就如何把租客收费与费率管制结合起来提出建议。案例研究表明，通过与TCD相结合，现有的拥塞控制算法可以实现3.3倍和2.0倍的中位数和99个百分点的FCT减速。

引用次数: 9

mmTag

Proceedings of the 2021 ACM SIGCOMM 2021 Conference

Pub Date : 2021-08-09 DOI: 10.1145/3452296.3472917

M. Mazaheri, Alex K Chen, Omid Abari

Recent advances in IoT, machine learning and cloud computing have placed a huge strain on wireless networks. In particular, many emerging applications require streaming rich content (such as videos) in real time, while they are constrained by energy sources. A wireless network which supports high data-rate while consuming low-power would be very attractive for these applications. Unfortunately, existing wireless networks do not satisfy this requirement. For example, WiFi backscatter and Bluetooth networks have very low power consumption, but their data-rate is very limited (less than a Mbps). On the other hand, modern WiFi and mmWave networks support high throughput, but have a high power consumption (more than a watt). To address this problem, we present mmTag, a novel mmWave backscatter network which enables low-power high-throughput wireless links for emerging applications. mmTag is a backscatter system which operates in the mmWave frequency bands. mmTag addresses the key challenges that prevent existing backscatter networks from operating at mmWave bands. We implemented mmTag and evaluated its performance empirically. Our results show that mmTag is capable of achieving 1 Gbps and 100 Mbps at 4.6 m and 8 m, respectively, while consuming only 2.4 nJ/bit.

{"title":"mmTag","authors":"M. Mazaheri, Alex K Chen, Omid Abari","doi":"10.1145/3452296.3472917","DOIUrl":"https://doi.org/10.1145/3452296.3472917","url":null,"abstract":"Recent advances in IoT, machine learning and cloud computing have placed a huge strain on wireless networks. In particular, many emerging applications require streaming rich content (such as videos) in real time, while they are constrained by energy sources. A wireless network which supports high data-rate while consuming low-power would be very attractive for these applications. Unfortunately, existing wireless networks do not satisfy this requirement. For example, WiFi backscatter and Bluetooth networks have very low power consumption, but their data-rate is very limited (less than a Mbps). On the other hand, modern WiFi and mmWave networks support high throughput, but have a high power consumption (more than a watt). To address this problem, we present mmTag, a novel mmWave backscatter network which enables low-power high-throughput wireless links for emerging applications. mmTag is a backscatter system which operates in the mmWave frequency bands. mmTag addresses the key challenges that prevent existing backscatter networks from operating at mmWave bands. We implemented mmTag and evaluated its performance empirically. Our results show that mmTag is capable of achieving 1 Gbps and 100 Mbps at 4.6 m and 8 m, respectively, while consuming only 2.4 nJ/bit.","PeriodicalId":20487,"journal":{"name":"Proceedings of the 2021 ACM SIGCOMM 2021 Conference","volume":"77 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85730133","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 25

RoS

Proceedings of the 2021 ACM SIGCOMM 2021 Conference

Pub Date : 2021-08-09 DOI: 10.1007/978-3-662-48986-4_301467

J. Nolan, Kun Qian, Xinyu Zhang

引用次数: 0

下一页尾页

类型

全部化学•材料生命科学医学物理工程技术环境•农林材料科学地球科学法学管理学化学环境科学与生态学计算机科学教育学经济学农林科学人文科学生物学数学物理与天体物理心理学综合性期刊其他工业工程理学历史学农学文学信息工程

数据库

全部 ACS Publications Elsevier ieeexplore Springer The Royal Society of Chemistry Wiley

期刊

Proceedings of the 2021 ACM SIGCOMM 2021 Conference

全部 Acc. Chem. Res. ACS Applied Bio Materials ACS Appl. Electron. Mater. ACS Appl. Energy Mater. ACS Appl. Mater. Interfaces ACS Appl. Nano Mater. ACS Appl. Polym. Mater. ACS BIOMATER-SCI ENG ACS Catal. ACS Cent. Sci. ACS Chem. Biol. ACS Chemical Health & Safety ACS Chem. Neurosci. ACS Comb. Sci. ACS Earth Space Chem. ACS Energy Lett. ACS Infect. Dis. ACS Macro Lett. ACS Mater. Lett. ACS Med. Chem. Lett. ACS Nano ACS Omega ACS Photonics ACS Sens. ACS Sustainable Chem. Eng. ACS Synth. Biol. Anal. Chem. BIOCHEMISTRY-US Bioconjugate Chem. BIOMACROMOLECULES Chem. Res. Toxicol. Chem. Rev. Chem. Mater. CRYST GROWTH DES ENERG FUEL Environ. Sci. Technol. Environ. Sci. Technol. Lett. Eur. J. Inorg. Chem. IND ENG CHEM RES Inorg. Chem. J. Agric. Food. Chem. J. Chem. Eng. Data J. Chem. Educ. J. Chem. Inf. Model. J. Chem. Theory Comput. J. Med. Chem. J. Nat. Prod. J PROTEOME RES J. Am. Chem. Soc. LANGMUIR MACROMOLECULES Mol. Pharmaceutics Nano Lett. Org. Lett. ORG PROCESS RES DEV ORGANOMETALLICS J. Org. Chem. J. Phys. Chem. J. Phys. Chem. A J. Phys. Chem. B J. Phys. Chem. C J. Phys. Chem. Lett. Analyst Anal. Methods Biomater. Sci. Catal. Sci. Technol. Chem. Commun. Chem. Soc. Rev. CHEM EDUC RES PRACT CRYSTENGCOMM Dalton Trans. Energy Environ. Sci. ENVIRON SCI-NANO ENVIRON SCI-PROC IMP ENVIRON SCI-WAT RES Faraday Discuss. Food Funct. Green Chem. Inorg. Chem. Front. Integr. Biol. J. Anal. At. Spectrom. J. Mater. Chem. A J. Mater. Chem. B J. Mater. Chem. C Lab Chip Mater. Chem. Front. Mater. Horiz. MEDCHEMCOMM Metallomics Mol. Biosyst. Mol. Syst. Des. Eng. Nanoscale Nanoscale Horiz. Nat. Prod. Rep. New J. Chem. Org. Biomol. Chem. Org. Chem. Front. PHOTOCH PHOTOBIO SCI PCCP Polym. Chem.

﹀