Pub Date: 2021-07-01. DOI: 10.1109/ICDCS51616.2021.00021
Dyconits: Scaling Minecraft-like Services through Dynamically Managed Inconsistency
Jesse Donkervliet, J. Cuijpers, A. Iosup
Gaming is one of the most popular and lucrative entertainment industries. Minecraft alone exceeds 130 million monthly active players and sells millions of licenses annually; it is also provided as a (paid) service. Minecraft, and thousands of games like it, each provide a Modifiable Virtual Environment (MVE). However, Minecraft-like games scale only through isolated instances that support at most a few hundred players in the same virtual world, preventing their large player base from actually gaming together. When operated as a service, even fewer players can game together. Existing techniques for managing data in distributed systems do not scale for such games: they either do not work for high-density areas (e.g., village centers or other places where the MVE is modified often), or can introduce an unbounded amount of inconsistency that lowers the quality of experience. In this work, we propose Dyconits, a middleware that allows games to scale by bounding inconsistency in MVEs, optimistically and dynamically. Dyconits lets game developers partition the game world and its objects offline into units, each with its own bounds. The Dyconits system controls the creation of dyconits and the management of their bounds dynamically, driven by policies. Importantly, the Dyconits system is thin and reuses the existing game codebase, in particular the network stack. To demonstrate and evaluate Dyconits in practice, we modify an existing, open-source, Minecraft-like game and evaluate its effectiveness through real-world experiments. Our approach supports up to 40% more concurrent players and reduces network bandwidth by up to 85%, with only minor modifications to the game and without increasing game latency.
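The abstract's core mechanism, bounded optimistic inconsistency per unit of the game world, can be illustrated with a minimal sketch. The class, bound types, and flush behavior below are our assumptions for illustration, not the paper's actual API; the grounded idea is only that updates are queued per dyconit and flushed through the game's existing network stack once a bound is reached.

```python
import time

class Dyconit:
    """One unit of the game world with its own inconsistency bounds (hypothetical sketch)."""

    def __init__(self, numerical_bound, staleness_bound_s):
        self.numerical_bound = numerical_bound      # max total "weight" of unsent updates
        self.staleness_bound_s = staleness_bound_s  # max age of the oldest unsent update
        self.queue = []                             # (enqueue_time, update, weight)

    def enqueue(self, update, weight, send):
        """Buffer an update optimistically; flush only when a bound is exceeded."""
        self.queue.append((time.monotonic(), update, weight))
        if self._bounds_exceeded():
            self.flush(send)

    def _bounds_exceeded(self):
        total_weight = sum(w for _, _, w in self.queue)
        oldest_age = time.monotonic() - self.queue[0][0]
        return (total_weight > self.numerical_bound
                or oldest_age > self.staleness_bound_s)

    def flush(self, send):
        """Drain the queue through the game's existing network stack."""
        for _, update, _ in self.queue:
            send(update)
        self.queue.clear()
```

A policy component, not shown, would tighten the bounds of dyconits near many players (e.g., village centers) and relax them elsewhere, which is where the bandwidth savings would come from.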
{"title":"Dyconits: Scaling Minecraft-like Services through Dynamically Managed Inconsistency","authors":"Jesse Donkervliet, J. Cuijpers, A. Iosup","doi":"10.1109/ICDCS51616.2021.00021","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00021","url":null,"abstract":"Gaming is one of the most popular and lucrative entertainment industries. Minecraft alone exceeds 130 million active monthly players and sells millions of licenses annually; it is also provided as a (paid) service. Minecraft, and thousands of others, provide each a Modifiable Virtual Environment (MVE). However, Minecraft-like games only scale using isolated instances that support at most a few hundred players in the same virtual world, thus preventing their large player-base from actually gaming together. When operating as a service, even fewer players can game together. Existing techniques for managing data in distributed systems do not scale for such games: they either do not work for high-density areas (e.g., village centers or other places where the MVE is often modified), or can introduce an unbounded amount of inconsistency that can lower the quality of experience. In this work, we propose Dyconits, a middleware that allows games to scale, by bounding inconsistency in MVEs, optimistically and dynamically. Dyconits allow game developers to partition offline the game-world and its objects into units, each with its own bounds. The Dyconits system controls, dynamically and policy-based, the creation of dyconits and the management of their bounds. Importantly, the Dyconits system is thin, and reuses the existing game codebase and in particular the network stack. To demonstrate and evaluate Dyconits in practice, we modify an existing, open-source, Minecraft-like game, and evaluate its effectiveness through real-world experiments. Our approach supports up to 40% more concurrent players and reduces network bandwidth by up to 85%, with only minor modifications to the game and without increasing game latency.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"173 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114551902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-07-01. DOI: 10.1109/ICDCS51616.2021.00041
A Two-Stage Heavy Hitter Detection System Based on CPU Spikes at Cloud-Scale Gateways
Jianyuan Lu, Tian Pan, Shan He, Mao Miao, Guangzhe Zhou, Yining Qi, Biao Lyu, Shunmin Zhu
The cloud network provides shared resources for tens of thousands of tenants to achieve economies of scale. However, heavy hitters caused by a single tenant can interfere with the processing at the cloud gateways, undermining the predictable performance expected by other cloud tenants. To prevent this, heavy-hitter detection becomes a key concern at the performance-critical cloud gateways, but it faces a dilemma between fine granularity and low overhead. In this work, we present CloudSentry, a scalable two-stage heavy-hitter detection system for multi-tenant cloud gateways that resolves this dilemma. CloudSentry runs a lightweight coarse-grained detection stage 24/7 to localize infrequent CPU spikes. It then invokes a fine-grained detection stage to precisely dump and analyze the potential heavy-hitter packets during those spikes. Afterwards, a more comprehensive analysis associates the heavy hitters with cloud service scenarios and invokes a corresponding backpressure procedure. CloudSentry significantly reduces memory, computation, and storage overhead compared with existing approaches. It has been deployed worldwide in Alibaba Cloud for over one year, yielding rich deployment experience. In a gateway cluster with an average traffic throughput of 251 Gbps, CloudSentry consumes only 2%-5% CPU utilization and 8 KB of runtime memory, and produces only 10 MB of heavy-hitter logs per month.
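The two-stage design can be pictured as follows. This sketch assumes psutil as the CPU-usage source and an abstract packet-sampling hook; the threshold, the flow_key/size packet attributes, and TOP_K are illustrative values, not CloudSentry's implementation.

```python
import collections

import psutil  # assumed available; any CPU-usage source would do

CPU_SPIKE_THRESHOLD = 80.0  # percent; illustrative, not CloudSentry's setting
TOP_K = 10                  # number of heavy-hitter candidates to report

def coarse_stage():
    """Stage 1: cheap check that runs 24/7 and only watches for CPU spikes."""
    return psutil.cpu_percent(interval=1.0) > CPU_SPIKE_THRESHOLD

def fine_stage(sample_packets):
    """Stage 2: dump a short packet sample during a spike and rank flows by bytes."""
    bytes_per_flow = collections.Counter()
    for pkt in sample_packets():            # e.g., a brief capture at the gateway
        bytes_per_flow[pkt.flow_key] += pkt.size
    return bytes_per_flow.most_common(TOP_K)

def run_once(sample_packets, backpressure):
    """Fine-grained work (and any backpressure) happens only during spikes."""
    if coarse_stage():
        for flow, nbytes in fine_stage(sample_packets):
            backpressure(flow, nbytes)
```

Because stage 2 runs only during the infrequent spikes, memory and log volume stay small, consistent with the 8 KB and 10 MB figures reported above.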
{"title":"A Two-Stage Heavy Hitter Detection System Based on CPU Spikes at Cloud-Scale Gateways","authors":"Jianyuan Lu, Tian Pan, Shan He, Mao Miao, Guangzhe Zhou, Yining Qi, Biao Lyu, Shunmin Zhu","doi":"10.1109/ICDCS51616.2021.00041","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00041","url":null,"abstract":"The cloud network provides sharing resources for tens of thousands of tenants to achieve economics of scale. However, heavy hitters caused by a single tenant will probably interfere with the processing of the cloud gateways, undermining the predictable performance expected by other cloud tenants. To prevent it, heavy hitter detection becomes a key concern at the performance-critical cloud gateways but faces the dilemma between fine granularity and low overhead. In this work, we present CloudSentry, a scalable two-stage heavy hitter detection system dedicated to multi-tenant cloud gateways against such a dilemma. CloudSentry contains a lightweight coarse-grained detection running 24/7 to localize infrequent CPU spikes. Then it invokes a fine-grained detection to precisely dump and analyze the potential heavy-hitter packets at the CPU spikes. After that, a more comprehensive analysis is conducted to associate heavy hitters with the cloud service scenarios and invoke a corresponding backpressure procedure. CloudSentry significantly reduces memory, computation and storage overhead compared with existing approaches. Additionally, it has been deployed world-wide in Alibaba Cloud for over one year, with rich deployment experiences. In a gateway cluster under an average traffic throughput of of 251Gbps, CloudSentry consumes only a fraction of 2%-5% CPU utilization with 8KB run-time memory, producing only 10MB heavy hitter logs during one month.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"311 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115832526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-07-01. DOI: 10.1109/ICDCS51616.2021.00093
GTCP: Hybrid Congestion Control for Cross-Datacenter Networks
Shaojun Zou, Jiawei Huang, Jingling Liu, Tao Zhang, Ning Jiang, Jianxin Wang
To improve the quality of experience for worldwide users, an increasing number of service providers deploy their services on geographically dispersed data centers connected by a wide area network (WAN). In cross-datacenter networks, however, the intra- and inter-datacenter parts have different characteristics, including switch buffer depth, round-trip time, and bandwidth. Moreover, most intra-DC flows belong to interactive services that require low delay, while inter-DC flows typically need to achieve high throughput. Unfortunately, existing sender-based and receiver-driven transport protocols do not consider the network heterogeneity between inter- and intra-DC networks, and thus fail to simultaneously achieve low latency for intra-DC flows and high throughput for inter-DC flows. This paper proposes a general hybrid congestion control mechanism called GTCP to address this problem. When an inter-DC flow detects congestion inside a data center, it switches to the receiver-driven mode to avoid impacting intra-DC flows. Otherwise, it switches back to the sender-based mode to proactively explore the available bandwidth. In addition, intra-DC flows leverage a pausing mechanism to eliminate queue build-up. Through a series of testbed experiments and large-scale NS2 simulations, we demonstrate that GTCP reduces flow completion time by up to 79.3% compared with existing protocols.
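The hybrid switching logic reads naturally as a small state machine. The congestion signal below (ECN marks echoed in ACKs by intra-DC switches) and all attribute names are our assumptions for illustration; GTCP's actual detection mechanism and wire format may differ.

```python
class GTCPFlow:
    """Hypothetical sketch of GTCP's mode switch for one inter-DC flow."""

    SENDER_BASED = "sender-based"        # proactively probes for bandwidth
    RECEIVER_DRIVEN = "receiver-driven"  # receiver paces credits, avoids queue build-up

    def __init__(self):
        self.mode = self.SENDER_BASED

    def on_ack(self, ack):
        if ack.ecn_marked_by_intra_dc_switch:
            # Congestion inside the destination data center: back off to the
            # receiver-driven mode so latency-sensitive intra-DC flows win.
            self.mode = self.RECEIVER_DRIVEN
        elif ack.rtts_without_congestion >= 2:
            # Congestion has cleared: return to sender-based mode to
            # proactively explore the available WAN bandwidth.
            self.mode = self.SENDER_BASED
```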
{"title":"GTCP: Hybrid Congestion Control for Cross-Datacenter Networks","authors":"Shaojun Zou, Jiawei Huang, Jingling Liu, Tao Zhang, Ning Jiang, Jianxin Wang","doi":"10.1109/ICDCS51616.2021.00093","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00093","url":null,"abstract":"To improve the quality of experience for worldwide users, an increasing number of service providers deploy their services on geographically dispersed data centers, which are connected by wide area network (WAN). In the cross-datacenter networks, however, the intra- and inter-datacenter parts have different characteristics, including switch buffer depth, round-trip time and bandwidth. Besides, most of intra-DC flows belong to interactive services that require low delay while inter-DC flows typically need to achieve high throughput. Unfortunately, existing sender-based and receiver-driven transport protocols do not consider the network heterogeneity between inter- and intra- DC networks so that they fail to simultaneously achieve low latency for intra-DC flows and high throughput for inter-DC flows. This paper proposes a general hybrid congestion control mechanism called GTCP to address this problem. When the inter-DC flow detects congestion inside data center, it switches to the receiver-driven mode to avoid the impact on intra-DC flows. Otherwise, it switches back to the sender-based mode to proactively explore the available bandwidth. Besides, the intra-DC flow leverages the pausing mechanism to eliminate the queue build-up. Through a series of testbed experiments and large-scale NS2 simulations, we demonstrate that GTCP reduces flow completion time by up to 79.3% compared with existing protocols.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116764313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-07-01. DOI: 10.1109/ICDCS51616.2021.00051
ProgrammabilityMedic: Predictable Path Programmability Recovery under Multiple Controller Failures in SD-WANs
Songshi Dou, Zehua Guo, Yuanqing Xia
Software-Defined Networking (SDN) promises good network performance in Wide Area Networks (WANs) through logically centralized control over physically distributed controllers. In Software-Defined WANs (SD-WANs), maintaining path programmability, which enables flexible path changes on flows, is crucial for maintaining network performance under traffic variation. However, when controllers fail, existing solutions are essentially coarse-grained switch-controller mapping solutions and recover the path programmability of only a limited number of offline flows, i.e., flows that traverse offline switches controlled by failed controllers. In this paper, we propose ProgrammabilityMedic (PM) to provide predictable path programmability recovery under controller failures in SD-WANs. The key idea of PM is to approximately realize flow-controller mappings using the hybrid SDN/legacy routing supported by high-end commercial SDN switches. Using hybrid routing, we can recover programmability by selecting, at fine granularity, a routing mode for each offline flow at each offline switch to fit the control resources available from active controllers. Thus, PM can effectively map offline switches to active controllers to improve recovery efficiency. Simulation results show that PM outperforms existing switch-level solutions, maintaining balanced programmability and increasing the total programmability of recovered offline flows by up to 315% under two controller failures and 340% under three controller failures.
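One way to make the fine-grained mode selection concrete is a greedy budgeted assignment over (flow, switch) pairs. The gain/cost fields and the greedy rule below are our illustration, not PM's published algorithm; the grounded idea is that each offline flow gets either the programmable SDN mode or the legacy mode at each offline switch, subject to the control resources of active controllers.

```python
def assign_routing_modes(pairs, controller_capacity):
    """Greedy sketch. pairs: iterable of (flow, switch) objects with
    .gain (programmability recovered), .cost (control resource consumed),
    and .controller (the active controller that would manage the switch).
    controller_capacity: dict mapping controller -> remaining capacity."""
    assignment = {}
    # Spend control resources on the pairs with the best gain per unit cost.
    for pair in sorted(pairs, key=lambda p: p.gain / p.cost, reverse=True):
        if controller_capacity[pair.controller] >= pair.cost:
            controller_capacity[pair.controller] -= pair.cost
            assignment[pair] = "sdn"     # path stays programmable
        else:
            assignment[pair] = "legacy"  # forwarding still works, no path changes
    return assignment
```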
{"title":"ProgrammabilityMedic: Predictable Path Programmability Recovery under Multiple Controller Failures in SD-WANs","authors":"Songshi Dou, Zehua Guo, Yuanqing Xia","doi":"10.1109/ICDCS51616.2021.00051","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00051","url":null,"abstract":"Software-Defined Networking (SDN) promises good network performance in Wide Area Networks (WANs) with the logically centralized control using physically distributed controllers. In Software-Defined WANs (SD-WANs), maintaining path programmability, which enables flexible path change on flows, is crucial for maintaining network performance under traffic variation. However, when controllers fail, existing solutions are essentially coarse-grained switch-controller mapping solutions and only recover the path programmability of a limited number of offline flows, which traverse offline switches controlled by failed controllers. In this paper, we propose ProgrammabilityMedic (PM) to provide predictable path programmability recovery under controller failures in SD-WANs. The key idea of PM is to approximately realize flow-controller mappings using hybrid SDN/legacy routing supported by high-end commercial SDN switches. Using the hybrid routing, we can recover programmability by fine-grainedly selecting a routing mode for each offline flow at each offline switch to fit the given control resource from active controllers. Thus, PM can effectively map offline switches to active controllers to improve recovery efficiency. Simulation results show that PM outperforms existing switch-level solutions by maintaining balanced programmability and increasing the total programmability of recovered offline flows up to 315% under two controller failures and 340% under three controller failures.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116935753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-07-01. DOI: 10.1109/ICDCS51616.2021.00059
Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks
S. Shi, Lin Zhang, Bo Li
Distributed training with synchronous stochastic gradient descent (SGD) on GPU clusters has been widely used to accelerate the training of deep models. However, SGD utilizes only the first-order gradient in model parameter updates, so training may take days or weeks. Recent studies have successfully exploited approximate second-order information to speed up training, among which Kronecker-Factored Approximate Curvature (K-FAC) emerges as one of the most efficient approximation algorithms for training deep models. Yet, when leveraging GPU clusters to train models with distributed K-FAC (D-KFAC), each iteration incurs extensive computation and introduces extra communication. In this work, we propose smart-parallelism D-KFAC (SPD-KFAC), which overlaps computing and communication tasks to reduce the iteration time. Specifically, 1) we first characterize the performance bottlenecks of D-KFAC, 2) we design and implement a pipelining mechanism for Kronecker factor computation and communication with dynamic tensor fusion, and 3) we develop a load-balanced placement for inverting multiple matrices on GPU clusters. We conduct real-world experiments on a 64-GPU cluster with a 100 Gb/s InfiniBand interconnect. Experimental results show that our proposed SPD-KFAC training scheme achieves a 10%-35% improvement over state-of-the-art algorithms.
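For step 3, a standard way to balance the matrix inversions is the longest-processing-time heuristic: sort factors by size and always assign the next one to the least-loaded GPU, with inversion cost modeled as cubic in the factor's dimension. This sketch shows that heuristic under our cost assumption; the paper's exact placement strategy may differ.

```python
import heapq

def place_inversions(factor_dims, n_gpus):
    """factor_dims: {factor_id: matrix_dimension}. Returns {factor_id: gpu_id}."""
    loads = [(0.0, gpu) for gpu in range(n_gpus)]  # (accumulated cost, gpu id)
    heapq.heapify(loads)
    placement = {}
    # Largest factors first, each to the currently least-loaded GPU.
    for fid, dim in sorted(factor_dims.items(), key=lambda kv: -kv[1]):
        load, gpu = heapq.heappop(loads)
        placement[fid] = gpu
        heapq.heappush(loads, (load + float(dim) ** 3, gpu))  # ~O(dim^3) inversion
    return placement
```

For example, place_inversions({"conv1": 64, "conv2": 256, "fc": 1024}, 2) keeps the expensive fc inversion alone on one GPU and batches the two cheaper ones on the other.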
{"title":"Accelerating Distributed K-FAC with Smart Parallelism of Computing and Communication Tasks","authors":"S. Shi, Lin Zhang, Bo Li","doi":"10.1109/ICDCS51616.2021.00059","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00059","url":null,"abstract":"Distributed training with synchronous stochastic gradient descent (SGD) on GPU clusters has been widely used to accelerate the training process of deep models. However, SGD only utilizes the first-order gradient in model parameter updates, which may take days or weeks. Recent studies have successfully exploited approximate second-order information to speed up the training process, in which the Kronecker-Factored Approximate Curvature (KFAC) emerges as one of the most efficient approximation algorithms for training deep models. Yet, when leveraging GPU clusters to train models with distributed KFAC (D-KFAC), it incurs extensive computation as well as introduces extra communications during each iteration. In this work, we propose D-KFAC (SPD-KFAC) with smart parallelism of computing and communication tasks to reduce the iteration time. Specifically, 1) we first characterize the performance bottlenecks of D-KFAC, 2) we design and implement a pipelining mechanism for Kronecker factors computation and communication with dynamic tensor fusion, and 3) we develop a load balancing placement for inverting multiple matrices on GPU clusters. We conduct realworld experiments on a 64-GPU cluster with 100Gb/s InfiniBand interconnect. Experimental results show that our proposed SPD-KFAC training scheme can achieve 10%-35% improvement over state-of-the-art algorithms.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115210201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-07-01. DOI: 10.1109/ICDCS51616.2021.00120
Poster: Function Delivery Network: Extending Serverless to Heterogeneous Computing
Anshul Jindal, Mohak Chadha, M. Gerndt, Julian Frielinghaus, Vladimir Podolskiy, Pengfei Chen
Many of today's cloud applications are spread over heterogeneous connected computing resources and are highly dynamic in their structure and resource requirements. However, serverless computing and Function-as-a-Service (FaaS) platforms are limited to homogeneous clusters and homogeneous functions. We introduce the Function Delivery Network (FDN), an extension of FaaS to heterogeneous computing that supports heterogeneous functions through a network of distributed heterogeneous target platforms. A target platform is the combination of a cluster of homogeneous computing systems and a FaaS platform on top of it. FDN provides Function-Delivery-as-a-Service (FDaaS), delivering function invocations to the right target platform. Evaluating across five distributed target platforms, we showcase the opportunities FDN offers, such as collaborative execution between multiple target platforms and exploiting varied target-platform characteristics, in fulfilling two objectives when scheduling function invocations: meeting Service Level Objective (SLO) requirements and energy efficiency.
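A delivery policy balancing the two objectives could look like the sketch below; predicted_latency and energy_per_invocation are hypothetical per-platform models we introduce for illustration, not FDN's published interface.

```python
def deliver(invocation, platforms):
    """Sketch: route an invocation to the most energy-efficient target
    platform whose predicted latency still meets the invocation's SLO."""
    feasible = [p for p in platforms
                if p.predicted_latency(invocation) <= invocation.slo_latency]
    if not feasible:
        # No platform meets the SLO: fall back to the fastest one.
        return min(platforms, key=lambda p: p.predicted_latency(invocation))
    return min(feasible, key=lambda p: p.energy_per_invocation(invocation))
```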
{"title":"Poster: Function Delivery Network: Extending Serverless to Heterogeneous Computing","authors":"Anshul Jindal, Mohak Chadha, M. Gerndt, Julian Frielinghaus, Vladimir Podolskiy, Pengfei Chen","doi":"10.1109/ICDCS51616.2021.00120","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00120","url":null,"abstract":"Several of today's cloud applications are spread over heterogeneous connected computing resources and are highly dynamic in their structure and resource requirements. However, serverless computing and Function-as-a-Service (FaaS) platforms are limited to homogeneous clusters and homogeneous functions. We introduce an extension of FaaS to heterogeneous computing and to support heterogeneous functions through a network of distributed heterogeneous target platforms called Function Delivery Network (FDN). A target platform is a combination of a cluster of a homogeneous computing system and a FaaS platform on top of it. FDN provides Function-Delivery-as-a-Service (FDaaS), delivering the function invocations to the right target platform. We showcase the opportunities such as collaborative execution between multiple target platforms and varied target platform's characteristics that the FDN offers in fulfilling two objectives: Service Level Objective (SLO) requirements and energy efficiency when scheduling functions invocations by evaluating over five distributed target platforms.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127078884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-07-01. DOI: 10.1109/ICDCS51616.2021.00108
Demo: Application Monitoring as a Network Service
Mona Elsaadawy, Laetitia Fesselier, Bettina Kemme
The recent rise of cloud applications, which represent large, complex, modern distributed services, has made performance monitoring a major issue and a critical process for both cloud providers and cloud customers. Many different monitoring techniques are used, such as tracking resource consumption, performing application-specific measurements, or analyzing message exchanges. Typically, the collected data is logged at the host on which the application is deployed, then either analyzed locally or forwarded to a remote analysis host. In contrast, this demonstration paper presents a Monitoring-as-a-Service (MaaS) prototype that uses advances in Software-Defined Networking (SDN) to move some of the logging functionality into the network. The core of our MaaS is implemented as a virtual network function in which agents are co-located with software switches to extract performance metrics from the message flows between components in a non-intrusive manner and send the calculated measures to clients for visualization in near real-time. The MaaS is flexible in how it is deployed and does not require instrumenting software or platforms. In our demo, we show the tool in action, demonstrating how users can choose to monitor different service types and performance metrics in a user-friendly manner.
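As an illustration of non-intrusive metric extraction, an agent at a software switch could pair requests with responses on mirrored traffic to derive per-request response times. The packet attributes below (conn, is_request, ts) are hypothetical stand-ins for whatever the agent parses out of the flow.

```python
def response_times(packets):
    """Yield (connection, latency) pairs from a mirrored message flow,
    matching each request to the next response on the same connection."""
    pending = {}  # connection id -> timestamp of the outstanding request
    for pkt in packets:
        if pkt.is_request:
            pending[pkt.conn] = pkt.ts
        elif pkt.conn in pending:
            yield pkt.conn, pkt.ts - pending.pop(pkt.conn)
```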
{"title":"Demo: Application Monitoring as a Network Service","authors":"Mona Elsaadawy, Laetitia Fesselier, Bettina Kemme","doi":"10.1109/ICDCS51616.2021.00108","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00108","url":null,"abstract":"The recent rise of cloud applications, representing large complex modern distributed services, has made performance monitoring a major issue and a critical process for both cloud providers and cloud customers. Many different monitoring techniques are used such as tracking resource consumption, performing application-specific measures or analyzing message exchanges. Typically the collected data is logged at the host on which the application is deployed, then either analyzed locally or forwarded to a remote analysis host. In contrast, this demonstration paper presents a Monitoring as a Service prototype that uses the advances in Software Defined Networking (SDN) to move some of the logging functionality into the network. The core of our MaaS is implemented as a virtual network function where agents are co-located with software switches in order to extract performance metrics from the message flows between components in a non-intrusive manner and send the calculated measures to the clients for visualization in near real-time. The MaaS has a lot of flexibility in how it is deployed and does not require to instrument software or platforms. In our demo we show the tool in action demonstrating how users can choose to monitor different service types and performance metrics in a user-friendly manner.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123690065","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-07-01. DOI: 10.1109/ICDCS51616.2021.00064
Blockumulus: A Scalable Framework for Smart Contracts on the Cloud
Nikolay Ivanov, Qiben Yan, Qingyang Wang
Public blockchains have spurred the growing popularity of decentralized transactions and smart contracts, especially in financial markets. However, public blockchains are limited in transaction throughput, storage availability, and compute capacity. To avoid transaction gridlock, public blockchains impose large fees and per-block resource limits, making it difficult to accommodate the ever-growing transaction demand. Previous research has endeavored to improve the scalability and performance of blockchains through various technologies, such as side-chaining, sharding, secured off-chain computation, communication network optimizations, and efficient consensus protocols. However, these approaches have not attained widespread adoption due to their inability to deliver cloud-like performance in terms of scalability of transaction throughput, storage, and compute capacity. In this work, we determine that the major obstacle to public blockchain scalability is the underlying unstructured P2P network. We further show that a centralized network can support the deployment of decentralized smart contracts. We propose a novel approach for achieving scalable decentralization: instead of trying to make the blockchain scalable, we deliver decentralization to the already scalable cloud by using an Ethereum smart contract. We introduce Blockumulus, a framework that can deploy decentralized cloud smart contract environments using a novel technique called overlay consensus. Through experiments, we demonstrate that Blockumulus is scalable in all three dimensions: computation, data storage, and transaction throughput. Besides eliminating the current code execution and storage restrictions, Blockumulus delivers a transaction latency between 2 and 5 seconds under normal load. Moreover, a stress test of our prototype reveals the ability to execute 20,000 simultaneous transactions in under 26 seconds, which is on par with the average throughput of worldwide credit card transactions.
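The abstract does not spell out overlay consensus, but the anchoring idea it builds on, executing transactions on an overlay of cloud nodes and committing only a compact digest to Ethereum, can be caricatured as follows. The digest scheme and the commit interface are entirely our assumptions.

```python
import hashlib
import json

def state_digest(state):
    """Deterministic digest of the off-chain (cloud-side) state."""
    canonical = json.dumps(state, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()

def checkpoint(state, epoch, anchor):
    """Commit one epoch's digest on-chain; `anchor` is a stand-in for
    whatever Ethereum-contract interface the system actually exposes."""
    anchor.commit(epoch, state_digest(state))  # one cheap on-chain write per epoch
```

The point of such a split is that throughput-heavy execution happens in the scalable cloud, while the public chain only notarizes checkpoints.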
{"title":"Blockumulus: A Scalable Framework for Smart Contracts on the Cloud","authors":"Nikolay Ivanov, Qiben Yan, Qingyang Wang","doi":"10.1109/ICDCS51616.2021.00064","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00064","url":null,"abstract":"Public blockchains have spurred the growing popularity of decentralized transactions and smart contracts, especially on the financial market. However, public blockchains exhibit their limitations on the transaction throughput, storage availability, and compute capacity. To avoid transaction gridlock, public blockchains impose large fees and per-block resource limits, making it difficult to accommodate the ever-growing high transaction demand. Previous research endeavors to improve the scalability and performance of blockchain through various technologies, such as side-chaining, sharding, secured off-chain computation, communication network optimizations, and efficient consensus protocols. However, these approaches have not attained a widespread adoption due to their inability in delivering a cloud-like performance, in terms of the scalability in transaction throughput, storage, and compute capacity. In this work, we determine that the major obstacle to public blockchain scalability is their underlying unstructured P2P networks. We further show that a centralized network can support the deployment of decentralized smart contracts. We propose a novel approach for achieving scalable decentralization: instead of trying to make blockchain scalable, we deliver decentralization to already scalable cloud by using an Ethereum smart contract. We introduce Blockumulus, a framework that can deploy decentralized cloud smart contract environments using a novel technique called overlay consensus. Through experiments, we demonstrate that Blockumulus is scalable in all three dimensions: computation, data storage, and transaction throughput. Besides eliminating the current code execution and storage restrictions, Blockumulus delivers a transaction latency between 2 and 5 seconds under normal load. Moreover, the stress test of our prototype reveals the ability to execute 20,000 simultaneous transactions under 26 seconds, which is on par with the average throughput of worldwide credit card transactions.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122965448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-07-01. DOI: 10.1109/ICDCS51616.2021.00122
Poster: Quadratic-Time Algorithms for Optimal Min-Max Barrier Coverage with Mobile Sensors on the Plane
P. Yao, Longkun Guo, Jiguo Yu
Emerging applications give rise to the min-max line barrier coverage (LBC) problem, which aims to minimize the maximum movement of the sensors so as to balance energy consumption. In this paper, we devise an algorithm for LBC that finds an optimal solution within a runtime of $O(n^{2})$, improving on the previous state-of-the-art runtime of $O(n^{2}\log n)$ due to [7]. The key idea for accelerating the computation of optimal solutions is to use approximate solutions obtained by our approximation algorithm. Numerical experiments demonstrate that our algorithms outperform all other baselines, including the previous state-of-the-art algorithm.
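To make the problem concrete, here is a simplified parametric-search sketch: binary-search the movement budget and greedily test whether the sensors, taken left to right (an order-preserving simplification that does not hold in full generality), can cover the barrier $[0, L]$ on the x-axis. It illustrates the problem and the role of approximate solutions, not the paper's exact $O(n^{2})$ algorithm.

```python
import math

def feasible(sensors, L, r, lam):
    """Can sensors (x, y) with sensing radius r cover [0, L] on the x-axis
    when no sensor moves more than lam? Greedy, order-preserving test."""
    covered = 0.0
    for x, y in sorted(sensors):
        if lam < abs(y):
            continue                          # cannot even reach the barrier line
        slack = math.sqrt(lam * lam - y * y)  # horizontal slack once on the line
        if x - slack - r > covered:
            continue                          # placing it here would leave a gap
        p = min(x + slack, covered + r)       # rightmost center keeping contiguity
        covered = max(covered, p + r)
        if covered >= L:
            return True
    return covered >= L

def min_max_movement(sensors, L, r, eps=1e-6):
    """Binary search on the budget; returns a value within eps of optimal
    (for this simplified model), or None if coverage is impossible."""
    lo = 0.0
    hi = max(math.hypot(x, y) for x, y in sensors) + L  # reaches any barrier point
    if not feasible(sensors, L, r, hi):
        return None                           # too few sensors: 2*r*n < L
    while hi - lo > eps:
        mid = (lo + hi) / 2.0
        if feasible(sensors, L, r, mid):
            hi = mid
        else:
            lo = mid
    return hi
```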
{"title":"Poster: Quadratic-Time Algorithms for Optimal Min-Max Barrier Coverage with Mobile Sensors on the Plane","authors":"P. Yao, Longkun Guo, Jiguo Yu","doi":"10.1109/ICDCS51616.2021.00122","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00122","url":null,"abstract":"Emerging applications impose the min-max line barrier coverage (LBC) problem that aims to minimize the maximum movement of the sensors for the sake of balancing energy consumption. In the paper, we devise an algorithm for LBC that finds an optimal solution within a runtime $O(n^{2})$, improving the previous state-of-art runtime $o(n^{2}log n)$ due to [7]. The key idea to accelerating the computation of the optimum solutions is to use approximation solutions that are obtained by our devised approximation algorithm. Numerical experiments demonstrate our algorithms outperform all the other baselines including the previous state-of-art algorithm.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"2 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121014911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2021-07-01. DOI: 10.1109/ICDCS51616.2021.00077
Privacy-Preserving Neural Network Inference Framework via Homomorphic Encryption and SGX
Huizi Xiao, Qingyang Zhang, Qingqi Pei, Weisong Shi
Edge computing is a promising paradigm that pushes computing, storage, and energy to the network's edge. It utilizes data near the users to provide real-time, energy-efficient, and reliable services. Neural network inference in edge computing is a powerful tool for various applications. However, edge servers inevitably collect more sensitive personal information about users. Ensuring security and privacy while obtaining accurate inference results is the most basic requirement for users. Homomorphic encryption (HE) is a confidential-computing technology that performs mathematical operations directly on encrypted data, but it can carry out only limited addition and multiplication operations, with very low efficiency. Intel Software Guard Extensions (SGX) can provide a trusted, isolated space in the CPU to ensure the confidentiality and integrity of the code and data executed, but hardware design limitations make several of its defects hard to overcome when applying SGX to inference services. This paper proposes a hybrid framework that utilizes SGX to accelerate HE-based convolutional neural network (CNN) inference, eliminating the approximation operations in HE and thereby improving inference accuracy in theory. In addition, SGX serves as a built-in trusted third party to distribute keys, improving our framework's scalability and flexibility. We quantify the cost of various CNN operations in the respective cases of HE and SGX to provide guidance for practice. Taking connected and autonomous vehicles as an edge computing case study, we implement this hybrid framework on a CNN to verify its feasibility and advantages.
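The layer-wise split the abstract suggests can be sketched as follows: linear layers, which need only additions and multiplications, stay under HE on the untrusted host, while non-linear activations, which HE would otherwise approximate with polynomials, run exactly inside the enclave. The he and enclave objects are stand-ins for an HE library and an enclave interface, not the paper's actual code.

```python
def hybrid_infer(enc_x, layers, he, enclave):
    """Run one encrypted input through a CNN, splitting work between HE
    (linear layers) and SGX (exact non-linear layers). Sketch only."""
    for layer in layers:
        if layer.kind in ("conv", "fc"):
            # Additions/multiplications only: evaluate under HE, data stays
            # encrypted on the untrusted edge server.
            enc_x = he.linear(layer.weights, enc_x)
        else:
            # e.g., ReLU or max-pool: decrypt inside the enclave, apply the
            # exact function (no polynomial approximation), re-encrypt.
            x = enclave.decrypt(enc_x)
            enc_x = enclave.encrypt(layer.apply(x))
    return enc_x
```

Eliminating the polynomial approximations is what lets the hybrid keep the plaintext model's accuracy, at the price of an HE-to-SGX handoff per non-linear layer.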
{"title":"Privacy-Preserving Neural Network Inference Framework via Homomorphic Encryption and SGX","authors":"Huizi Xiao, Qingyang Zhang, Qingqi Pei, Weisong Shi","doi":"10.1109/ICDCS51616.2021.00077","DOIUrl":"https://doi.org/10.1109/ICDCS51616.2021.00077","url":null,"abstract":"Edge computing is a promising paradigm that pushes computing, storage, and energy to the networks' edge. It utilizes the data nearby the users to provide real-time, energy-efficient, and reliable services. Neural network inference in edge computing is a powerful tool for various applications. However, edge server will collect more personal sensitive information of users inevitably. It is the most basic requirement for users to ensure their security and privacy while obtaining accurate inference results. Homomorphic encryption (HE) technology is confidential computing that directly performs mathematical computing on encrypted data. But it only can carry out limited addition and multiplication operation with very low efficiency. Intel software guard extension (SGX) can provide a trusted isolation space in the CPU to ensure the confidentiality and integrity of code and data executed. But several defects are hard to overcome due to hardware design limitations when applying SGX in inference services. This paper proposes a hybrid framework utilizing SGX to accelerate the HE-based convolutional neural network (CNN) inference, eliminating the approximation operations in HE to improve inference accuracy in theory. Besides, SGX is also taken as a built-in trusted third party to distribute keys, thereby improving our framework's scalability and flexibility. We have quantified the various CNN operations in the respective cases of HE and SGX to provide the foresight practice. Taking the connected and autonomous vehicles as a case study in edge computing, we implemented this hybrid framework in CNN to verify its feasibility and advantage.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124982852","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}