Occam: A Secure and Adaptive Scaling Scheme for Permissionless Blockchain
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00065
Jie Xu, Yingying Cheng, Cong Wang, X. Jia
Blockchain scalability is one of the most desired properties of permissionless blockchains. Many recent blockchain protocols have focused on increasing transaction throughput. However, existing protocols cannot dynamically scale their throughput to meet transaction demand. In this paper, we propose Occam, a secure and adaptive scaling scheme. Occam adaptively changes the transaction throughput by expanding and shrinking according to the transaction demand in the network. We introduce a dynamic mining-difficulty adjustment mechanism and a mining-power load-balancing mechanism to resist various attacks. Furthermore, we implement Occam on an Amazon EC2 cluster with 1,000 full nodes. Experimental results show that Occam greatly increases both the throughput of the blockchain and the utilization of mining power.
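As an illustration of the adaptive loop described above, here is a minimal sketch of a demand-driven expand/shrink decision with per-zone difficulty scaling. The zone abstraction, capacity constant, and difficulty rule are assumptions made for this sketch, not details taken from the paper.

```python
# Hypothetical sketch of demand-driven scaling in the spirit of Occam's
# expand/shrink behavior; the zone model and constants are assumptions.

def target_zones(pending_txs: int, zone_capacity: int,
                 min_zones: int = 1, max_zones: int = 64) -> int:
    """Choose how many parallel mining zones current demand calls for."""
    needed = max(min_zones, -(-pending_txs // zone_capacity))  # ceiling division
    return min(needed, max_zones)

def per_zone_difficulty(base_difficulty: float, zones: int) -> float:
    """Spread total mining power across zones so each zone's block rate stays stable."""
    return base_difficulty / zones

if __name__ == "__main__":
    base = 1_000_000.0
    for pending in (500, 5_000, 50_000, 2_000):   # fluctuating transaction demand
        zones = target_zones(pending, zone_capacity=2_000)
        print(pending, zones, per_zone_difficulty(base, zones))
```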
{"title":"Occam: A Secure and Adaptive Scaling Scheme for Permissionless Blockchain","authors":"Jie Xu, Yingying Cheng, Cong Wang, X. Jia","doi":"10.1109/ICDCS51616.2021.00065","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00065","url":null,"abstract":"Blockchain scalability is one of the most desired properties for permissionless blockchain. Many recent blockchain protocols have focused on increasing the transaction throughput. However, existing protocols cannot dynamically scale the throughput to meet transaction demand. In this paper, we propose Occam, a secure and adaptive scaling scheme. Occam adaptively changes the transaction throughput by expanding and shrinking according to the transaction demand in the network. We introduce a dynamic adjustment mechanism of mining difficulty and a mining power load balancing mechanism to resist various attacks. Furthermore, we implement Occam on Amazon EC2 cluster with 1000 full nodes. Experimental results show that Occam can greatly increase the throughput of the blockchain and the mining power utilization.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"174 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115486021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Re-architecting Distributed Block Storage System for Improving Random Write Performance
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00019
Myoungwon Oh, Jiwoong Park, S. Park, Adel Choi, Jongyoul Lee, Jin-Hyeok Choi, H. Yeom
In cloud ecosystems, distributed block storage systems are used to provide a persistent block storage service, which is a fundamental building block for operating cloud-native services. However, existing distributed storage systems perform poorly for random write workloads in an all-NVMe storage configuration, becoming CPU-bottlenecked. Our roofline-based performance analysis of a conventional distributed block storage system with NVMe SSDs reveals that the bottleneck does not lie in one specific software module, but across the entire software stack: (1) tightly coupled I/O processing, (2) an inefficient threading architecture, and (3) a local backend data store causing excessive CPU usage. To this end, we re-architect a modern distributed block storage system to improve random write performance. The key ingredients of our system are (1) decoupled operation processing using non-volatile memory, (2) prioritized thread control, and (3) a CPU-efficient backend data store. Our system emphasizes low CPU overhead and high CPU efficiency to make efficient use of NVMe SSDs in a distributed storage environment. We implement our system in Ceph. Compared to native Ceph, our prototype delivers more than a 3x improvement for small random write I/Os in terms of both IOPS and latency by efficiently utilizing CPU cores.
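One of the three ingredients, prioritized thread control, can be illustrated with a toy dispatcher that always serves latency-critical client I/O before background work. The priority classes and queue layout below are assumptions for illustration, not Ceph's or the prototype's actual threading code.

```python
# Toy priority dispatcher illustrating "foreground before background" I/O scheduling;
# the class values and queue structure are illustrative assumptions.
import heapq
import itertools

FOREGROUND, BACKGROUND = 0, 1      # lower value = higher priority
_seq = itertools.count()           # tie-breaker keeps FIFO order within a class
_queue = []

def submit(op, priority):
    heapq.heappush(_queue, (priority, next(_seq), op))

def dispatch():
    while _queue:
        _, _, op = heapq.heappop(_queue)
        op()

submit(lambda: print("compaction"), BACKGROUND)
submit(lambda: print("client 4K random write"), FOREGROUND)
submit(lambda: print("scrub"), BACKGROUND)
dispatch()   # the client write runs first; background work is deferred
```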
{"title":"Re-architecting Distributed Block Storage System for Improving Random Write Performance","authors":"Myoungwon Oh, Jiwoong Park, S. Park, Adel Choi, Jongyoul Lee, Jin-Hyeok Choi, H. Yeom","doi":"10.1109/ICDCS51616.2021.00019","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00019","url":null,"abstract":"In cloud ecosystems, distributed block storage systems are used to provide a persistent block storage service, which is the fundamental building block for operating cloud native services. However, existing distributed storage systems performed poorly for random write workloads in an all-NVMe storage configuration, becoming CPU-bottlenecked. Our roofline-based approach to performance analysis on a conventional distributed block storage system with NVMe SSDs reveals that the bottleneck does not lie in one specific software module, but across the entire software stack; (1) tightly coupled I/O processing, (2) inefficient threading architecture, and (3) local backend data store causing excessive CPU usage. To this end, we re-architect a modern distributed block storage system for improving random write performance. The key ingredients of our system are (1) decoupled operation processing using non-volatile memory, (2) prioritized thread control, and (3) CPU-efficient backend data store. Our system emphasizes low CPU overhead and high CPU efficiency to efficiently utilize NVMe SSDs in a distributed storage environment. We implement our system in Ceph. Compared to the native Ceph, our prototype system delivers more than 3x performance improvement for small random write I/Os in terms of both IOPS and latency by efficiently utilizing CPU cores.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117342109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Everyone in SDN Contributes: Fault Localization via Well-Designed Rules
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00043
Zhijun Hu, Libing Wu, Jianxin Li, Chao Ma, Xiaochuan Shi
Probing techniques are widely used to identify faulty nodes in networks. Existing probe-based solutions for SDN fault localization follow one of two approaches: per-rule and per-path. Both promote certain switches to reporters by installing report rules on them. To avoid interfering with other test packets, such report rules must vary between tests or be deleted before the next test, incurring excessive consumption of either switch TCAM resources or bandwidth reserved for control messages. In this paper we present Voyager, a hybrid fault localization solution for SDN that combines the advantages of per-rule and per-path tests. Voyager significantly reduces the number of report rules and allows them to reside and function in switches persistently. With only one well-designed report rule installed per switch, Voyager pinpoints faulty switches easily and tightly by simply sending test packets. Tests in Voyager are parallelizable, and its report rules are non-invasive. Performance evaluation on realistic datasets shows that Voyager is 24.0% to 92.3% faster than existing solutions.
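To make the per-path idea concrete, the following sketch localizes a fault from which switches along a probed path reported the test packet. The report model is a simplification assumed here; it is not Voyager's actual rule design.

```python
# Simplified fault localization from per-switch probe reports (assumed model).

def localize(path, reporters):
    """path: ordered switch IDs on the probe's route.
    reporters: set of switches whose report rule saw the probe.
    Returns the first switch the probe never reached, or None if the path is healthy."""
    for switch in path:
        if switch not in reporters:
            return switch
    return None

path = ["s1", "s2", "s3", "s4"]
print(localize(path, {"s1", "s2"}))   # -> "s3": the fault lies at or before s3
```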
{"title":"Everyone in SDN Contributes: Fault Localization via Well-Designed Rules","authors":"Zhijun Hu, Libing Wu, Jianxin Li, Chao Ma, Xiaochuan Shi","doi":"10.1109/ICDCS51616.2021.00043","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00043","url":null,"abstract":"Probing techniques are widely used to identify faulty nodes in networks. Existing probe-based solutions for SDN fault localizationcan focus on two ways: per-rule and per-path. Both promote some certain switches to reporters by installing on them report rules. To avoid hindering other test packets, such report rules must vary between tests or be deleted before a next test, thus incurring excessive consumption on either TCAM resources of switches or bandwidth reserved for control messages. In this paper we present Voyager, a hybrid fault localization solution for SDN that fully combines the advantages of per-rule and per-path tests. Voyager significantly reduces the number of report rules and allows them to reside and function in switches persistently. With only one well-designed report rule for each switch installed, Voyager pinpoints faulty switches easily and tightly by sending test packets straight. Tests in Voyager are parallelizable and report rules are non-invasive. The performance evaluation on realistic datasets shows that Voyager is 24.0% to 92.3% faster than existing solutions.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116389134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MandiPass: Secure and Usable User Authentication via Earphone IMU
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00070
Jianwei Liu, Wenfan Song, Leming Shen, Jinsong Han, Xian Xu, K. Ren
Biometrics play an important role in user authentication. However, the most widely used biometrics, such as facial features and fingerprints, are easy to capture or record, and thus vulnerable to spoofing attacks. In contrast, intracorporal biometrics, such as electrocardiography and electroencephalography, are hard to collect and hence more secure for authentication. Unfortunately, adopting them is not user-friendly due to their complicated collection methods and the inconvenient constraints they place on users. In this paper, we propose a novel biometric-based authentication system, MandiPass. MandiPass leverages inertial measurement units (IMUs), which are widely deployed in portable devices, to collect an intracorporal biometric from the vibration of the user's mandible. Authentication merely requires the user to voice a short ‘EMM' to generate the vibration. In this way, MandiPass enables secure and user-friendly biometric-based authentication. We theoretically validate the feasibility of MandiPass and develop a two-branch deep neural network for effective biometric extraction. We also utilize a Gaussian matrix to defend against replay attacks. Extensive experimental results with 34 volunteers show that MandiPass achieves an equal error rate of 1.28%, even under various harsh environments.
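The reported 1.28% figure is an equal error rate (EER); the short sketch below shows how an EER can be computed from genuine and impostor match scores. The scores here are synthetic and the threshold sweep is a naive implementation, not the paper's evaluation code.

```python
# Naive EER computation over synthetic match scores (illustrative only).
import numpy as np

def eer(genuine, impostor):
    thresholds = np.sort(np.concatenate([genuine, impostor]))
    best_gap, best_eer = 1.0, 1.0
    for t in thresholds:
        frr = np.mean(genuine < t)     # genuine attempts rejected
        far = np.mean(impostor >= t)   # impostor attempts accepted
        if abs(far - frr) < best_gap:
            best_gap, best_eer = abs(far - frr), (far + frr) / 2
    return best_eer

rng = np.random.default_rng(0)
genuine = rng.normal(0.80, 0.05, 1000)    # synthetic genuine-user scores
impostor = rng.normal(0.50, 0.10, 1000)   # synthetic impostor scores
print(f"EER ~ {eer(genuine, impostor):.2%}")
```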
{"title":"MandiPass: Secure and Usable User Authentication via Earphone IMU","authors":"Jianwei Liu, Wenfan Song, Leming Shen, Jinsong Han, Xian Xu, K. Ren","doi":"10.1109/ICDCS51616.2021.00070","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00070","url":null,"abstract":"Biometric plays an important role in user authentication. However, the most widely used biometrics, such as facial feature and fingerprint, are easy to capture or record, and thus vulnerable to spoofing attacks. On the contrary, intracorporal biometrics, such as electrocardiography and electroencephalography, are hard to collect, and hence more secure for authentication. Unfortunately, adopting them is not user-friendly due to their complicated collection methods and inconvenient constraints on users. In this paper, we propose a novel biometric-based authentication system, namely MandiPass. MandiPass leverages inertial measurement units (IMU), which have been widely deployed in portable devices, to collect intracorporal biometric from the vibration of user's mandible. The authentication merely requires user to voice a short ‘EMM’ for generating the vibration. In this way, MandiPass enables a secure and user-friendly biometric-based authentication. We theoretically validate the feasibility of MandiPass and develop a two-branch deep neural network for effective biometric extraction. We also utilize a Gaussian matrix to defend against replay attacks. Extensive experiment results with 34 volunteers show that MandiPass can achieve an equal error rate of 1.28%, even under various harsh environments.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123549797","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GeoCol: A Geo-distributed Cloud Storage System with Low Cost and Latency using Reinforcement Learning
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00023
Haoyu Wang, Haiying Shen, Zijian Li, Shuhao Tian
More and more web applications are deployed on cloud storage services that store the applications' data objects in geo-distributed datacenters belonging to Cloud Service Providers (CSPs). To provide low request latency to web application users, previous work requires developers either to store more data object replicas in a large number of datacenters or to send redundant requests to multiple datacenters (e.g., the closest ones), both of which increase monetary cost. In this paper, we conducted request latency measurements from a GENI server (as a client) to AWS S3 datacenters for one month, and our observations lay the foundation for our proposed system, GeoCol, a geo-distributed cloud storage system with low cost and latency using reinforcement learning (RL). To achieve the optimal tradeoff between monetary cost and request latency, GeoCol encompasses a request split method and a storage planning method. The request split method uses the SARIMA machine learning (ML) technique to predict the request latency, which feeds an RL model that determines the number of sub-requests and the datacenter for each sub-request, enabling parallel transmissions of a data object. In the storage planning method, each datacenter uses RL to determine whether each data object should be stored and the storage type of each stored data object. Our trace-driven experiments on AWS S3 and the GENI platform show that GeoCol outperforms comparison methods, reducing monetary cost by 32% and data object request latency by 51%.
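A rough sketch of the request-split idea: pick the few datacenters predicted to be fastest and split the object's byte range among them in proportion to predicted speed. The proportional heuristic stands in for GeoCol's RL policy, and no SARIMA prediction is shown; both are assumptions for illustration.

```python
# Illustrative request splitting across datacenters by predicted latency
# (a heuristic stand-in for GeoCol's RL-based split decision).

def split_request(object_size, predicted_latency_ms, k=3):
    """Return {datacenter: (start, end)} byte ranges for the k fastest datacenters,
    giving lower-latency datacenters proportionally larger shares."""
    fastest = sorted(predicted_latency_ms, key=predicted_latency_ms.get)[:k]
    weights = {dc: 1.0 / predicted_latency_ms[dc] for dc in fastest}
    total = sum(weights.values())
    plan, offset = {}, 0
    for dc in fastest:
        share = int(object_size * weights[dc] / total)
        plan[dc] = (offset, offset + share)
        offset += share
    last = fastest[-1]
    plan[last] = (plan[last][0], object_size)   # absorb any rounding remainder
    return plan

print(split_request(10_000_000,
                    {"us-east-1": 30, "us-west-2": 70, "eu-west-1": 90, "ap-south-1": 180}))
```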
{"title":"GeoCol: A Geo-distributed Cloud Storage System with Low Cost and Latency using Reinforcement Learning","authors":"Haoyu Wang, Haiying Shen, Zijian Li, Shuhao Tian","doi":"10.1109/ICDCS51616.2021.00023","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00023","url":null,"abstract":"More and more web applications are deployed on the cloud storage services that store data objects of the web applications in the geo-distributed datacenters belonging to Cloud Service Providers (CSPs). In order to provide low request latency to the web application users, in the previous work, the web application developers need to store more data object replicas in a large number of datacenters or send redundant requests to multiple datacenters (e.g., closest datacenters), both of which increase monetary cost. In this paper, we conducted request latency measurement from a GENI server (as a client) to AWS S3 datacenters for one month, and our observations lay the foundation for our proposed system called GeoCol, a geo-distributed cloud storage system with low cost and latency using reinforcement learning (RL). To achieve the optimal tradeoff between the monetary cost and the request latency, GeoCol encompasses a request split method and a storage planning method. The request split method uses the SARIMA machine learning (ML) technique to predict the request latency as an input to an RL model to determine the number of sub-requests and the datacenter for each sub-request for a request in order to enable the parallel transmissions for a data object. In the storage planning method, each datacenter uses RL to determine whether each data object should be stored and the storage type of each stored data object. Our trace-driven experiment on AWS S3 and GENI platform shows that GeoCol outperforms other comparison methods in monetary cost with 32 % reduction and data object request latency with 51 % reduction.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":" 12","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120831353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Poster: Off-path VoIP Interception Attacks
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00117
Tianxiang Dai, Haya Shulman, M. Waidner
The proliferation of Voice-over-IP (VoIP) technologies makes them a lucrative target of attacks. While many attack vectors have been uncovered, one critical vector has not yet received attention: hijacking telephony via DNS cache poisoning. We demonstrate practical VoIP hijack attacks by manipulating DNS responses with a weak off-path attacker. We evaluate our attacks against popular VoIP telephony systems in the Internet and provide a live demo of the attack against the Extensible Messaging and Presence Protocol at https://sit4.me/M4.
{"title":"Poster: Off-path VoIP Interception Attacks","authors":"Tianxiang Dai, Haya Shulman, M. Waidner","doi":"10.1109/ICDCS51616.2021.00117","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00117","url":null,"abstract":"The proliferation of Voice-over-IP (VoIP) technologies make them a lucrative target of attacks. While many attack vectors have been uncovered, one critical vector has not yet received attention: hijacking telephony via DNS cache poisoning. We demonstrate practical VoIP hijack attacks by manipulating DNS responses with a weak off-path attacker. We evaluate our attacks against popular telephony VoIP systems in the Internet and provide a live demo of the attack against Extensible Messaging and Presence Protocol at https://sit4.me/M4.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123991011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Polygraph: Accountable Byzantine Agreement
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00046
Pierre Civit, Seth Gilbert, V. Gramoli
In this paper, we introduce Polygraph, the first accountable Byzantine consensus algorithm. If among $n$ users $t < n/3$ are malicious, then it ensures consensus; otherwise (if $t \geq n/3$), it eventually detects malicious users that cause disagreement. Polygraph is appealing for blockchain applications as it allows them to totally order blocks in a chain whenever possible, hence avoiding forks and double spending, and otherwise to punish (e.g., via slashing) at least $n/3$ malicious users when a fork occurs. This problem is more difficult than perhaps it first appears. One could try identifying malicious senders by extending classic Byzantine consensus algorithms to piggyback signed messages. We show, however, that to achieve accountability the resulting algorithms would need to exchange $\Omega(\kappa^{2} \cdot n^{5})$ bits, where $\kappa$ is the security parameter of the signature scheme. By contrast, Polygraph has communication complexity $O(\kappa \cdot n^{4})$. Finally, we implement Polygraph in a blockchain and compare it to the Red Belly Blockchain, showing that it commits more than 10,000 Bitcoin-like transactions per second when deployed on 80 geo-distributed machines.
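To give a feel for the gap between the two bounds, a quick back-of-the-envelope comparison (the concrete numbers are chosen here for illustration, not taken from the paper):

```latex
% Ratio of the piggybacking baseline's bit complexity to Polygraph's,
% for an illustrative n = 100 validators and kappa = 256-bit signatures.
\[
\frac{\kappa^{2} \cdot n^{5}}{\kappa \cdot n^{4}} \;=\; \kappa \cdot n
  \;=\; 256 \times 100 \;=\; 25{,}600,
\]
% i.e., at this scale the naive approach exchanges on the order of
% 25,600 times more bits than Polygraph.
```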
{"title":"Polygraph: Accountable Byzantine Agreement","authors":"Pierre Civit, Seth Gilbert, V. Gramoli","doi":"10.1109/ICDCS51616.2021.00046","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00046","url":null,"abstract":"In this paper, we introduce Polygraph, the first accountable Byzantine consensus algorithm. If among $n$ users $t < n/3$ are malicious then it ensures consensus; otherwise (if $tgeq n/3)$, it eventually detects malicious users that cause disagreement. Polygraph is appealing for blockchain applications as it allows them to totally order blocks in a chain whenever possible, hence avoiding forks and double spending and, otherwise, to punish (e.g., via slashing) at least $n/3$ malicious users when a fork occurs. This problem is more difficult than perhaps it first appears. One could try identifying malicious senders by extending classic Byzantine consensus algorithms to piggyback signed messages. We show however that to achieve accountability the resulting algorithms would then need to exchange $Omega(kappa^{2}cdot n^{5})$ bits, where $kappa$ is the security parameter of the signature scheme. By contrast, Polygraph has communication complexity $O(kappacdot n^{4})$. Finally, we implement Polygraph in a blockchain and compare it to the Red Belly Blockchain to show that it commits more than 10,000 Bitcoin-like transactions per second when deployed on 80 geodistributed machines.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125890368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Enabling Low Latency Edge Intelligence based on Multi-exit DNNs in the Wild
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00075
Zhaowu Huang, Fang Dong, Dian Shen, Junxue Zhang, Huitian Wang, Guangxing Cai, Qiang He
In recent years, deep neural networks (DNNs) have powered a boom in artificial intelligence Internet of Things applications with stringent demands for both high accuracy and low latency. A widely adopted solution is to process such computation-intensive DNN inference tasks with edge computing. Nevertheless, existing edge-based DNN processing methods still cannot achieve acceptable performance due to intensive data transmission and unnecessary computation. To address these limitations, we take advantage of multi-exit DNNs (ME-DNNs), which allow tasks to exit early at different depths of the DNN during inference, based on input complexity. However, naively deploying ME-DNNs at the edge still fails to deliver fast and consistent inference in the wild. Specifically, (1) at the model level, unsuitable exit settings introduce additional computational overhead and lead to excessive queuing delay; (2) at the computation level, it is hard to sustain consistently high performance in a dynamic edge computing environment. In this paper, we present a Low Latency Edge Intelligence Scheme based on Multi-Exit DNNs (LEIME) to tackle these problems. At the model level, we propose an exit setting algorithm that automatically builds optimal ME-DNNs with lower time complexity; at the computation level, we present a distributed offloading mechanism that fine-tunes task dispatching at runtime to sustain high performance in dynamic environments, with a close-to-optimal performance guarantee. Finally, we implement a prototype system and extensively evaluate it through testbed and large-scale simulation experiments. Experimental results demonstrate that LEIME significantly improves application performance, achieving 1.1–18.7x speedups in different situations.
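The core early-exit mechanism can be sketched as follows: run the network stage by stage and stop as soon as an intermediate classifier is confident enough. The stage functions and threshold are toy stand-ins; the paper's exit setting algorithm is what actually chooses exit placement and thresholds.

```python
# Toy confidence-threshold early exit for a multi-exit DNN (illustrative only).

def multi_exit_infer(x, stages, exit_threshold=0.9):
    """stages: list of (feature_fn, classifier_fn); each classifier returns
    (label, confidence). Easy inputs exit early; hard inputs use the full network."""
    for depth, (features, classify) in enumerate(stages, start=1):
        x = features(x)
        label, confidence = classify(x)
        if confidence >= exit_threshold:
            return label, depth          # early exit: remaining layers are skipped
    return label, depth                  # deepest exit

# Toy stages whose confidence mimics easy vs. hard inputs.
stages = [
    (lambda x: x, lambda x: ("cat", 0.95 if x == "easy" else 0.60)),
    (lambda x: x, lambda x: ("cat", 0.97)),
]
print(multi_exit_infer("easy", stages))   # exits at depth 1
print(multi_exit_infer("hard", stages))   # falls through to depth 2
```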
{"title":"Enabling Low Latency Edge Intelligence based on Multi-exit DNNs in the Wild","authors":"Zhaowu Huang, Fang Dong, Dian Shen, Junxue Zhang, Huitian Wang, Guangxing Cai, Qiang He","doi":"10.1109/ICDCS51616.2021.00075","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00075","url":null,"abstract":"In recent years, deep neural networks (DNNs) have witnessed a booming of artificial intelligence Internet of Things applications with stringent demands across high accuracy and low latency. A widely adopted solution is to process such computation-intensive DNNs inference tasks with edge computing. Nevertheless, existing edge-based DNN processing methods still cannot achieve acceptable performance due to the intensive transmission data and unnecessary computation. To address the above limitations, we take the advantage of Multi-exit DNNs (ME-DNNs) that allows the tasks to exit early at different depths of the DNN during inference, based on the input complexity. However, naively deploying ME-DNNs in edge still fails to deliver fast and consistent inference in the wild environment. Specifically, 1) at the model-level, unsuitable exit settings will increase additional computational overhead and will lead to excessive queuing delay; 2) at the computation-level, it is hard to sustain high performance consistently in the dynamic edge computing environment. In this paper, we present a Low Latency Edge Intelligence Scheme based on Multi-Exit DNNs (LEIME) to tackle the aforementioned problem. At the model-level, we propose an exit setting algorithm to automatically build optimal ME-DNNs with lower time complexity; At the computation-level, we present a distributed offloading mechanism to fine-tune the task dispatching at runtime to sustain high performance in the dynamic environment, which has the property of close-to-optimal performance guarantee. Finally, we implement a prototype system and extensively evaluate it through testbed and large-scale simulation experiments. Experimental results demonstrate that LEIME significantly improves applications' performance, achieving 1.1–18.7 × speedup in different situations.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128290884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
GreenHetero: Adaptive Power Allocation for Heterogeneous Green Datacenters
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00024
Haoran Cai, Q. Cao, Hong Jiang, Qiang Wang
In recent years, the design of green datacenters and their enabling technologies, including renewable power management, has gained a lot of attention in both industry and academia. However, the maintenance and upgrade of the underlying server systems over time (e.g., server replacement due to failures, capacity increases, or migrations), which make datacenters increasingly heterogeneous in their key processing components (e.g., the capacity and variety of processors, memory, and storage devices), present a great challenge to the optimal allocation of renewable power. In other words, current heterogeneity-unaware power allocation policies fail to achieve optimal performance under a limited and time-varying renewable power supply. In this paper, we propose a dynamic power allocation framework called GreenHetero, which enables adaptive power allocation among heterogeneous servers in green datacenters to achieve optimal performance when the renewable power supply varies. Specifically, the GreenHetero scheduler dynamically maintains and updates a performance-power database for each server configuration and workload type through a lightweight profiling method. Based on this database and power prediction, the scheduler leverages a well-designed solver to determine the optimal power allocation ratio among heterogeneous servers at runtime. Finally, a power enforcer implements the power source selections and the power allocation decisions. We build an experimental prototype to evaluate GreenHetero. The evaluation shows that our solution improves average performance by 1.2x-2.2x and renewable power utilization by up to 2.7x under tens of representative datacenter workloads compared with a heterogeneity-unaware baseline scheduler.
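A simple way to picture the solver's job is a greedy allocator that hands out the renewable budget in small increments to whichever server gains the most throughput per watt, using profiled performance-power curves. The curves and the greedy rule below are assumptions; GreenHetero's actual solver is not described at this level in the abstract.

```python
# Greedy power allocation over profiled performance-power curves
# (an illustrative stand-in for GreenHetero's solver).

def allocate(budget_w, profiles, step_w=50):
    """profiles: {server: perf_fn(watts) -> throughput}. Give out power in
    step_w increments to the server with the largest marginal gain."""
    alloc = {s: 0 for s in profiles}
    while budget_w >= step_w:
        gains = {s: profiles[s](alloc[s] + step_w) - profiles[s](alloc[s])
                 for s in profiles}
        best = max(gains, key=gains.get)
        if gains[best] <= 0:
            break
        alloc[best] += step_w
        budget_w -= step_w
    return alloc

profiles = {
    "old-gen server": lambda w: 0.8 * w,                                        # lower perf/watt
    "new-gen server": lambda w: 1.5 * w if w <= 400 else 600 + 0.2 * (w - 400),  # saturates
}
print(allocate(700, profiles))   # most power flows to the efficient server until it saturates
```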
{"title":"GreenHetero: Adaptive Power Allocation for Heterogeneous Green Datacenters","authors":"Haoran Cai, Q. Cao, Hong Jiang, Qiang Wang","doi":"10.1109/ICDCS51616.2021.00024","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00024","url":null,"abstract":"In recent years, the design of green datacenters and their enabling technologies, including renewable power managements, have gained a lot of attraction in both industry and academia. However, the maintenance and upgrade of the underlying server system over time (e.g., server replacement due to failures, capacity increases, or migrations), which make datacenters increasingly more heterogeneous in their key processing components (e.g., capacity and variety of processors, memory and storage devices), present a great challenge to optimal allocation of renewable power supply. In other words, the current heterogeneity-unaware power allocation policies have failed to achieve optimal performance given a limited and time varying renewable power supply. In this paper, we propose a dynamic power allocation framework called GreenHetero, which enables adaptive power allocation among heterogeneous servers in green datacenters to achieve the optimal performance when the renewable power varies. Specifically, the GreenHetero scheduler dynamically maintains and updates a performance-power database for each server configuration and workload type through lightweight profiling method. Based on the database and power prediction, the scheduler leverages a well-designed solver to determine the optimal power allocation ratio among heterogeneous servers at runtime. Finally, the power enforcer is used to implement the power source selections and the power allocation decisions. We build an experimental prototype to evaluate GreenHetero. The evaluation shows that our solution can improve the average performance by 1.2x-2.2x and the renewable power utilization by up to 2.7x under tens of representative datacenter workloads compared with the heterogeneity-unaware baseline scheduler.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130853585","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TCP BBR in Cloud Networks: Challenges, Analysis, and Solutions
Pub Date: 2021-07-01 | DOI: 10.1109/ICDCS51616.2021.00094
Phuong Ha, Minh Vu, T. Le, Lisong Xu
In 2016, Google introduced BBR, a new model-based TCP congestion control that improves the throughput and latency of Google's backbone and services and is now the second most popular TCP variant on the Internet. As BBR is designed as a general-purpose congestion control to replace widely deployed algorithms such as Reno and CUBIC, studying its performance in different types of networks is important. In this paper, we study BBR's performance in cloud networks, which have grown rapidly but have not been examined in existing BBR work. For the first time, we show both analytically and experimentally that, due to virtual machine (VM) scheduling in cloud networks, BBR underestimates the pacing rate, delivery rate, and estimated bandwidth, which are three key elements of its control loop. This underestimation can compound iteratively and exponentially over time and can cause BBR's throughput to drop to almost zero. We propose a BBR patch that captures the impact of VM scheduling on BBR's model and improves its throughput in cloud networks. Our evaluation of the modified BBR on a testbed and on EC2 shows a significant improvement in throughput and bandwidth estimation accuracy over the original BBR in cloud networks with heavy VM scheduling.
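The compounding effect can be pictured with a toy model: if each measurement window silently includes a VM scheduling pause, the measured delivery rate is scaled down, the next pacing rate is derived from it, and the error feeds back multiplicatively. The numbers and the update rule are illustrative assumptions, not BBR's actual estimator.

```python
# Toy model of a rate estimate decaying under periodic VM scheduling pauses
# (illustrative assumption, not BBR's real bandwidth estimator).

def estimate_decay(true_bw_mbps, pause_fraction, rounds):
    est, history = true_bw_mbps, [true_bw_mbps]
    for _ in range(rounds):
        measured = est * (1 - pause_fraction)  # pause inflates the measurement interval
        est = measured                         # next pacing rate built on the low estimate
        history.append(est)
    return history

print([round(x, 1) for x in estimate_decay(1000, 0.2, 10)])  # 1000 Mbps decays to ~107 Mbps
```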
{"title":"TCP BBR in Cloud Networks: Challenges, Analysis, and Solutions","authors":"Phuong Ha, Minh Vu, T. Le, Lisong Xu","doi":"10.1109/ICDCS51616.2021.00094","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00094","url":null,"abstract":"Google introduced BBR representing a new model-based TCP class in 2016, which improves throughput and latency of Google's backbone and services and is now the second most popular TCP on the Internet. As BBR is designed as a general-purpose congestion control to replace current widely deployed congestion control such as Reno and CUBIC, this raises the importance of studying its performance in different types of networks. In this paper, we study BBR's performance in cloud networks, which have grown rapidly but have not been studied in the existing BBR works. For the first time, we show both analytically and experimentally that due to the virtual machine (VM) scheduling in cloud networks, BBR underestimates the pacing rate, delivery rate, and estimated bandwidth, which are three key elements of its control loop. This underestimation can exacerbate iteratively and exponentially over time, and can cause BBR's throughput to reduce to almost zero. We propose a BBR patch that captures the VM scheduling impact on BBR's model and improves its throughput in cloud networks. Our evaluation of the modified BBR on the testbed and EC2 shows a significant improvement in the throughput and bandwidth estimation accuracy over the original BBR in cloud networks with heavy VM scheduling.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130486662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}