Ziyuan Liu, Zhixiong Niu, Ran Shu, Liang Gao, Guohong Lai, Na Wang, Zongying He, Jacob Nelson, Dan R. K. Ports, Lihua Yuan, Peng Cheng, Y. Xiong
Proceedings of the 7th Asia-Pacific Workshop on Networking, June 29, 2023. DOI: 10.1145/3600061.3600067
SlimeMold: Hardware Load Balancer at Scale in Datacenter
Stateful load balancers (LBs) are essential services in cloud data centers, playing a crucial role in enhancing the availability and capacity of applications. Numerous studies have proposed methods to improve the throughput, connections per second, and concurrent-flow capacity of a single LB. For instance, with the advancement of programmable switches, hardware-based load balancers (HLBs) have become mainstream due to their high efficiency. However, programmable switches still face limits on registers and table entries, preventing them from fully meeting the performance requirements of data centers. In this paper, rather than solely focusing on enhancing individual HLBs, we introduce SlimeMold, which enables HLBs to work collaboratively at scale as an integrated LB system in data centers. First, we design a novel HLB building block capable of performing load balancing and exchanging state with other building blocks in the data plane. Next, we decouple forwarding and state operations, organizing the state using our proposed 2-level mapping mechanism. Finally, we optimize the system with flow caching and table-entry balancing. We implement a real HLB building block using the Broadcom 56788 SmartToR chip, which attains line rate for state reads and more than 1M OPS for flow write operations. Our simulations demonstrate full scalability in large-scale experiments, supporting 454 million concurrent flows with 512 state-hosting building blocks.
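The abstract describes decoupling forwarding from state and organizing per-flow state with a 2-level mapping across state-hosting building blocks. The paper's actual mechanism is not detailed here, so the sketch below is purely illustrative: it assumes level 1 hash-partitions a flow's 5-tuple to pick a state-hosting block, and level 2 is a per-block table mapping the flow to its assigned backend, giving stateful affinity for subsequent packets. All class and function names are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """Immutable 5-tuple identifying a flow (hashable, so usable as a dict key)."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    proto: str = "tcp"


def _h(key: str) -> int:
    """Deterministic 64-bit hash of a string key."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class TwoLevelMappingSketch:
    """Illustrative 2-level mapping: flow -> state-hosting block -> backend.

    This is a software model of the idea, not the paper's data-plane design.
    """

    def __init__(self, num_blocks: int, backends: list[str]):
        self.num_blocks = num_blocks
        self.backends = backends
        # Level 2: each block hosts its own flow-state table.
        self.block_tables: list[dict[Flow, str]] = [dict() for _ in range(num_blocks)]

    def _block_of(self, flow: Flow) -> int:
        # Level 1: hash the 5-tuple to select the block that hosts this flow's state.
        key = f"{flow.src_ip}:{flow.src_port}->{flow.dst_ip}:{flow.dst_port}/{flow.proto}"
        return _h(key) % self.num_blocks

    def lookup_or_assign(self, flow: Flow) -> str:
        blk = self._block_of(flow)
        table = self.block_tables[blk]
        if flow not in table:
            # First packet of the flow: pick a backend and install state.
            table[flow] = self.backends[_h(repr(flow)) % len(self.backends)]
        # Later packets hit the installed state, preserving backend affinity.
        return table[flow]
```

Because both levels are pure hash lookups, any building block can locate a flow's state-hosting block without central coordination, which is the property that lets independent HLBs behave as one integrated LB.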