A Two-Stage Heavy Hitter Detection System Based on CPU Spikes at Cloud-Scale Gateways
Authors: Jianyuan Lu, Tian Pan, Shan He, Mao Miao, Guangzhe Zhou, Yining Qi, Biao Lyu, Shunmin Zhu
Published in: 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), 2021-07-01
DOI: 10.1109/ICDCS51616.2021.00041 (https://doi.org/10.1109/ICDCS51616.2021.00041)
Citations: 3
Abstract
The cloud network provides shared resources to tens of thousands of tenants to achieve economies of scale. However, heavy hitters caused by a single tenant are likely to interfere with packet processing at the cloud gateways, undermining the predictable performance that other cloud tenants expect. Heavy hitter detection is therefore a key concern at performance-critical cloud gateways, but it faces a dilemma between fine granularity and low overhead. In this work, we present CloudSentry, a scalable two-stage heavy hitter detection system for multi-tenant cloud gateways that addresses this dilemma. CloudSentry first runs a lightweight coarse-grained detection 24/7 to localize infrequent CPU spikes. It then invokes a fine-grained detection to dump and analyze the potential heavy-hitter packets at those CPU spikes. After that, a more comprehensive analysis associates the heavy hitters with cloud service scenarios and invokes a corresponding backpressure procedure. CloudSentry significantly reduces memory, computation, and storage overhead compared with existing approaches. It has been deployed worldwide in Alibaba Cloud for over one year, yielding rich deployment experience. In a gateway cluster with an average traffic throughput of 251 Gbps, CloudSentry consumes only 2%-5% CPU utilization and 8 KB of run-time memory, and produces only 10 MB of heavy hitter logs in one month.
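The two-stage idea in the abstract can be illustrated with a minimal Python sketch. This is not CloudSentry's implementation: the abstract does not give thresholds, data structures, or interfaces, so `CPU_SPIKE_THRESHOLD`, `HEAVY_SHARE`, the `(flow_key, size)` packet tuple, and the `dump_packets` callback are all hypothetical names and values chosen for illustration. The sketch only shows the control flow: a cheap check runs on every interval, and the per-flow counting runs only on intervals flagged as CPU spikes.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

# Hypothetical values, not taken from the paper.
CPU_SPIKE_THRESHOLD = 0.80  # assumed utilization that counts as a spike
HEAVY_SHARE = 0.30          # assumed packet share that marks a heavy hitter

# Simplified stand-in for a dumped packet: (flow_key, size_bytes).
Packet = Tuple[str, int]


def is_cpu_spike(cpu_utilization: float,
                 threshold: float = CPU_SPIKE_THRESHOLD) -> bool:
    """Stage 1 (coarse-grained): a cheap check on a single number."""
    return cpu_utilization >= threshold


def find_heavy_hitters(packets: Iterable[Packet],
                       share: float = HEAVY_SHARE) -> List[str]:
    """Stage 2 (fine-grained): count packets per flow in a dumped sample."""
    counts = Counter(flow for flow, _size in packets)
    total = sum(counts.values())
    if total == 0:
        return []
    # Report flows whose share of the sample meets the assumed threshold.
    return sorted(flow for flow, n in counts.items() if n / total >= share)


def detect(cpu_samples: Sequence[float],
           dump_packets: Callable[[int], Iterable[Packet]]) -> Dict[int, List[str]]:
    """Run stage 2 only for the intervals that stage 1 flags as spikes."""
    report: Dict[int, List[str]] = {}
    for interval, cpu in enumerate(cpu_samples):
        if is_cpu_spike(cpu):
            report[interval] = find_heavy_hitters(dump_packets(interval))
    return report
```

Because `dump_packets` is called only for flagged intervals, the expensive packet dump and per-flow counting are skipped whenever CPU utilization stays below the threshold, which is the overhead trade-off the abstract describes.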