We present a game-theoretic approach to power reduction in large-scale distributed storage systems. The key idea is to use distributed hash tables to dynamically migrate virtual nodes, thus skewing the workload towards a subset of physical disks without overloading them. To realize this idea autonomously (i.e., without any kind of central controller), virtual nodes are treated as selfish agents playing a game in which each node receives a payoff according to the workload of the disk on which it currently resides. We model this setting as a potential game, in which an increase in the payoff to a virtual node reduces the power consumption of the system. The game is defined by a pair of global and private utility functions derived using the Wonderful Life Utility technique: the former evaluates the state of the system, and the latter provides the migration criteria for each node. We evaluate the performance of our method through simulations and a prototype implementation. These evaluations show that our method reduces the time disks spend in active mode by 12.7-18.7%, with an overall average response time of 50-190 ms.
{"title":"Using a Potential Game for Power Reduction in Distributed Storage Systems","authors":"Koji Hasebe, Takumi Sawada, Kazuhiko Kato","doi":"10.1109/IC2E.2014.70","DOIUrl":"https://doi.org/10.1109/IC2E.2014.70","url":null,"abstract":"We present a game-theoretic approach for power reduction in large-scale distributed storage systems. The key idea is to use distributed hash tables to dynamically migrate virtual nodes, thus skewing the workload towards a subset of physical disks without overloading them. To realize this idea in an autonomous way (i.e., without any kind of central controller), virtual nodes are considered to be selfish agents playing a game in which each node receives a payoff according to the workload of the disk on which it currently resides. We model this setting as a potential game, where an increase in the payoff to a virtual node reduces the power of the system. This game consists of a pair of global and private utility functions, derived by means of the Wonderful Life Utility technique. The former function evaluates the state of the system, and the latter provides criteria for the migration of each node. The performance of our method is measured by simulations and a prototype implementation. From these evaluations, we find that our method reduces the running time of the disks in active mode by 12.7-18.7%, with an overall average response time of 50-190 ms.","PeriodicalId":273902,"journal":{"name":"2014 IEEE International Conference on Cloud Engineering","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131370217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mathew Ryden, Kwangsung Oh, A. Chandra, J. Weissman
Centralized cloud infrastructures have become the de facto platform for data-intensive computing today. However, they suffer from inefficient data mobility due to the centralization of cloud resources and are therefore poorly suited to dispersed-data-intensive applications, where the data may be spread across multiple geographical locations. In this paper, we present Nebula: a dispersed cloud infrastructure that uses voluntary edge resources for both computation and data storage. We describe the lightweight Nebula architecture, which enables distributed data-intensive computing through a number of optimizations including location-aware data and computation placement, replication, and recovery. We evaluate Nebula's performance on an emulated volunteer platform spanning over 50 PlanetLab nodes distributed across Europe, and show how a common data-intensive computing framework, MapReduce, can be easily deployed and run on Nebula. We show that Nebula MapReduce is robust to a wide array of failures and substantially outperforms other wide-area implementations based on a BOINC-like model.
{"title":"Nebula: Distributed Edge Cloud for Data Intensive Computing","authors":"Mathew Ryden, Kwangsung Oh, A. Chandra, J. Weissman","doi":"10.1109/IC2E.2014.34","DOIUrl":"https://doi.org/10.1109/IC2E.2014.34","url":null,"abstract":"Centralized cloud infrastructures have become the de-facto platform for data-intensive computing today. However, they suffer from inefficient data mobility due to the centralization of cloud resources, and hence, are highly unsuited for dispersed-data-intensive applications, where the data may be spread at multiple geographical locations. In this paper, we present Nebula: a dispersed cloud infrastructure that uses voluntary edge resources for both computation and data storage. We describe the lightweight Nebula architecture that enables distributed data-intensive computing through a number of optimizations including location-aware data and computation placement, replication, and recovery. We evaluate Nebula's performance on an emulated volunteer platform that spans over 50 PlanetLab nodes distributed across Europe, and show how a common data-intensive computing framework, MapReduce, can be easily deployed and run on Nebula. We show Nebula MapReduce is robust to a wide array of failures and substantially outperforms other wide-area versions based on a BOINC like model.","PeriodicalId":273902,"journal":{"name":"2014 IEEE International Conference on Cloud Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120933835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Software-Defined Networking (SDN) is attracting increasing attention from both researchers and industry. Most current SDN packet-processing approaches classify packets by matching a set of packet fields against a flow table and then applying an action to the packet. We argue that this mechanism can be simplified using single-field classification, reducing its overhead. We propose a tag-based packet classification architecture that reduces filtering and flow-management overhead, and we show how the freed-up capacity can be used to perform application-layer classification for different purposes. In this work-in-progress paper we present preliminary evaluation results that indicate the effectiveness of the proposal.
{"title":"Rethinking Flow Classification in SDN","authors":"H. Farhadi, A. Nakao","doi":"10.1109/IC2E.2014.24","DOIUrl":"https://doi.org/10.1109/IC2E.2014.24","url":null,"abstract":"Software-Defined Networking (SDN) increasingly attracts more researchers as well as industry attentions. Most of current SDN packet processing approaches classify packets based on matching a set of fields on the packet against a flow table and then applying an action on the packet. We argue we can simplify this mechanism using single-field classification and reduce the overhead. We propose a tag-based packet classification architecture to reduce filtering and flow management overhead. Then, we show how to use this extra capacity to perform application layer classification for different purposes. In this work-in-progress paper we demonstrate our preliminary evaluation results to indicate the effectiveness of the proposal.","PeriodicalId":273902,"journal":{"name":"2014 IEEE International Conference on Cloud Engineering","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115016434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper studies the problem of secret-message transmission over a wiretap channel with correlated sources in the presence of an eavesdropper who has no source observation. A coding scheme is proposed based on a careful combination of 1) Wyner-Ziv source coding to generate a secret key from the correlated sources at a certain cost on the channel, 2) a one-time pad to secure messages without additional cost, and 3) Wyner's secrecy coding to achieve secrecy based on the advantage of the legitimate receiver's channel over the eavesdropper's. The work sheds light on optimal strategies for practical code design in secure communication and storage systems.
{"title":"Wiretap Channel with Correlated Sources","authors":"Yanling Chen, N. Cai, A. Sezgin","doi":"10.1109/IC2E.2014.80","DOIUrl":"https://doi.org/10.1109/IC2E.2014.80","url":null,"abstract":"This paper studies the problem of secret-message transmission over a wiretap channel with correlated sources in the presence of an eavesdropper who has no source observation. A coding scheme is proposed based on a careful combination of 1) Wyner-Ziv's source coding to generate secret key from correlated sources based on a certain cost on the channel, 2) one-time pad to secure messages without additional cost, and 3) Wyner's secrecy coding to achieve secrecy based on the advantage of legitimate receiver's channel over the eavesdropper's. The work sheds light on optimal strategies for practical code design for secure communication/storage systems.","PeriodicalId":273902,"journal":{"name":"2014 IEEE International Conference on Cloud Engineering","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133610053","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
David Erickson, Brandon Heller, N. McKeown, M. Rosenblum
The scale and expense of modern data centers motivate running them as efficiently as possible. This paper explores how virtualized data center performance can be improved when network traffic and topology data inform VM placement. Our practical heuristics, tested on network-heavy, scale-out workloads in an 80-server cluster, improve overall performance by up to 70% compared to random placement in a multi-tenant configuration.
{"title":"Using Network Knowledge to Improve Workload Performance in Virtualized Data Centers","authors":"David Erickson, Brandon Heller, N. McKeown, M. Rosenblum","doi":"10.1109/IC2E.2014.81","DOIUrl":"https://doi.org/10.1109/IC2E.2014.81","url":null,"abstract":"The scale and expense of modern data centers motivates running them as efficiently as possible. This paper explores how virtualized data center performance can be improved when network traffic and topology data informs VM placement. Our practical heuristics, tested on network-heavy, scale-out workloads in an 80 server cluster, improve overall performance by up to 70% compared to random placement in a multi-tenant configuration.","PeriodicalId":273902,"journal":{"name":"2014 IEEE International Conference on Cloud Engineering","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134026863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Graph databases have become increasingly popular for a variety of uses ranging from modeling online code repositories to tracking software engineering dependencies. These areas use graph databases because many of their problems can be expressed in terms of graph traversals. Recent work has applied graph databases to virtualization management, noting that many IT questions can also be expressed as graph traversals. In this paper, we study another area in which graphs are valuable: reporting and auditing in cloud infrastructure. We first examine cloud infrastructure and map its data model to a graph. Building upon this model, we recast a number of reporting queries in terms of graph traversals. We then modify the model both for performance and for accommodating additional use cases related to cloud computing, including migration from private to hybrid clouds. Our results show that while a graph backend makes it straightforward to formulate certain kinds of queries, a naive mapping of graphs to a graph database can result in poor performance. Utilizing knowledge of the problem domain and restructuring the graph can provide dramatic gains in performance and make a graph database feasible for such queries.
{"title":"Applying Graph Databases to Cloud Management: An Exploration","authors":"V. Soundararajan, Shishir Kakaraddi","doi":"10.1109/IC2E.2014.47","DOIUrl":"https://doi.org/10.1109/IC2E.2014.47","url":null,"abstract":"Graph databases have become increasingly popular for a variety of uses ranging from modeling online code repositories to tracking software engineering dependencies. These areas use graph databases because many of their problems can be expressed in terms of graph traversals. Recent work has applied graph databases to virtualization management, noting that many IT questions can also be expressed as graph traversals. In this paper, we study another area in which graphs are valuable: reporting and auditing in cloud infrastructure. We first examine cloud infrastructure and map its data model to a graph. Building upon this model, we recast a number of reporting queries in terms of graph traversals. We then modify the model both for performance and for accommodating additional use cases related to cloud computing, including migration from private to hybrid clouds. Our results show that while a graph backend makes it straightforward to formulate certain kinds of queries, a naive mapping of graphs to a graph database can result in poor performance. Utilizing knowledge of the problem domain and restructuring the graph can provide dramatic gains in performance and make a graph database feasible for such queries.","PeriodicalId":273902,"journal":{"name":"2014 IEEE International Conference on Cloud Engineering","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121969868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conventional cloud computing architectures may seriously constrain computational throughput for high-performance computing (HPC) and high-performance data (HPD) applications. The traditional approach to circumventing such problems has been to map these applications onto specialized hardware and coprocessor architectures. This is expensive in both time and resources, and poses a challenge given rapidly rising demands for computation and data analytics. In this paper we report on progress toward an alternative, experimental software-defined cloud implementation that virtualizes the topology of a standard HPC computational architecture. This software-defined system rearranges access to the nodes and dynamically customizes the features of the HPC hardware architecture so that they map to the specifics of the computation and data-analysis application, allowing a cloud computing implementation to exploit the specialized infrastructure capabilities of an HPC system. We have created this type of user-reconfigurable architecture on an IBM Blue Gene/P supercomputing environment at the Department of Energy's Argonne Leadership Computing Facility (ALCF). This pilot configuration was implemented using an open-source cloud technology called VCL (Virtual Computing Laboratory) in combination with a provisioning module called Kittyhawk. Cloud security is addressed by configuring and running a root-less version of the VCL cloud system on the ALCF's Blue Gene/P login node.
{"title":"Toward Implementation of a Software Defined Cloud on a Supercomputer","authors":"P. Dreher, Georgy Kallumkal","doi":"10.1109/IC2E.2014.57","DOIUrl":"https://doi.org/10.1109/IC2E.2014.57","url":null,"abstract":"Conventional cloud computing architectures may seriously constrain computational throughput for high performance computing (HPC) and high-performance data (HPD) applications. The traditional approach to circumvent such problems has been to map these applications and problems onto other specialized hardware and coprocessor architectures. This is both time and resource expensive, and poses a challenge for rapidly rising demands for computation and data analytics. In this paper we report on progress to develop an alternative experimental software defined cloud implementation that virtualizes the topology of a standard HPC computational architecture. This software defined system re-arranges access to the nodes and dynamically customizes the features of the HPC hardware architecture so that they map to the specifics of the computation and data analysis application. This allows a cloud computing implementation to utilize the specialized infrastructure capabilities of an HPC system. We have created this type of user reconfigurable architecture on an IBM Blue Gene/P supercomputing environment at the Department of Energy's Argonne Leadership Computing Facility (ALCF). This pilot configuration was implemented using both an open source cloud technology called VCL (Virtual Computing Laboratory) in combination with a provisioning module called Kittyhawk. Cloud security is addressed by configuring and running a root-less version of the VCL cloud system on the ALCF's Blue Gene/P login node.","PeriodicalId":273902,"journal":{"name":"2014 IEEE International Conference on Cloud Engineering","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124818983","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Steffen Müller, David Bermbach, S. Tai, Frank Pallas
Cloud storage services and NoSQL systems are optimized for performance and availability. Enterprise-grade features such as security mechanisms are therefore typically neglected, even though the need for them grows as enterprises increasingly adopt the cloud. Often, only Transport Layer Security (TLS) is supported, and the standard TLS protocol offers many configuration options which are usually chosen more or less arbitrarily. We argue that in cloud database systems, configuration options should be chosen based on the degree of vulnerability to attacks and security threats as well as on the performance overhead of the respective algorithms. Our contributions are a benchmarking approach for transparent analysis of the performance impact of various TLS configuration options and a custom TLS socket implementation that offers more fine-grained control over the chosen configuration options. We also use our benchmarking approach to study the performance impact of TLS in Amazon DynamoDB and Apache Cassandra.
{"title":"Benchmarking the Performance Impact of Transport Layer Security in Cloud Database Systems","authors":"Steffen Müller, David Bermbach, S. Tai, Frank Pallas","doi":"10.1109/IC2E.2014.48","DOIUrl":"https://doi.org/10.1109/IC2E.2014.48","url":null,"abstract":"Cloud storage services and NoSQL systems are optimized for performance and availability. Hence, enterprise-grade features like security mechanisms are typically neglected even though there is a need for them with increased cloud adoption by enterprises. Only Transport Layer Security (TLS) is frequently supported. Furthermore, the standard Transport Layer Security (TLS) protocol offers many configuration options which are usually chosen purely based on chance. We argue that in cloud database systems, configuration options should be chosen based on the degree of vulnerability to attacks and security threats as well as on the performance overhead of the respective algorithms. Our contributions are a benchmarking approach for transparent analysis of the performance impact of various TLS configuration options and a custom TLS socket implementation which offers more fine-grained control over the configuration options chosen. We also use our benchmarking approach to study the performance impact of TLS in Amazon DynamoDB and Apache Cassandra.","PeriodicalId":273902,"journal":{"name":"2014 IEEE International Conference on Cloud Engineering","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129765509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Platform-as-a-Service (PaaS) offerings have emerged to help SaaS providers meet the challenges involved in guaranteeing the QoS of international SaaS applications. A well-structured, machine-readable SLA is critical for PaaS providers to automate InterCloud management and serve worldwide, elastic user demands. WS-Agreement is a widely accepted, extensible language for describing agreements. In this paper, we propose PSLA, a well-structured PaaS-level SLA description language based on WS-Agreement. We summarize and describe the semantic clauses that need to be considered in a PaaS-level SLA. In particular, PSLA takes into account specific characteristics of PaaS such as workload elasticity, undefined metric properties, and fuzzy value ranges.
{"title":"PSLA: A PaaS Level SLA Description Language","authors":"Ge Li, Frédéric Pourraz, P. Moreaux","doi":"10.1109/IC2E.2014.29","DOIUrl":"https://doi.org/10.1109/IC2E.2014.29","url":null,"abstract":"PaaSs emerge to help SaaS providers to conquer the challenges involved in guaranteeing the QoS of international SaaSs. A well structured machine readable SLA is critical for PaaS providers to automate InterCloud management to serve the worldwide elastic user demands. WS-Agreement is a widely accepted extendible language for describing agreements. In this paper, we propose PSLA, a well structured PaaS Level SLA description language based on WS-Agreement. We summarize and describe the semantic clauses needed to be considered in PaaS level SLA. In particular, some specific characteristics of PaaS are taken into account in PSLA, such as elasticity of workload, undefined metric properties and fuzzy value range.","PeriodicalId":273902,"journal":{"name":"2014 IEEE International Conference on Cloud Engineering","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115188075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Techniques for big data analytics should support the principles of elasticity inherent in the types of data and data resources being analyzed, in the computational models and computing units used for analyzing the data, and in the quality of results expected by the consumer. In this paper, we analyze and present these principles and their consequences for software-defined environments that support data analytics. We conceptualize software-defined elastic systems for data analytics and present a case study in smart city management, urban mobility, and energy systems using our elasticity support.
{"title":"Principles of Software-Defined Elastic Systems for Big Data Analytics","authors":"Hong Linh Truong, S. Dustdar","doi":"10.1109/IC2E.2014.67","DOIUrl":"https://doi.org/10.1109/IC2E.2014.67","url":null,"abstract":"Techniques for big data analytics should support principles of elasticity that are inherent in types of data and data resources being analyzed, computational models and computing units used for analyzing data, and the quality of results expected from the consumer. In this paper, we analyze and present these principles and their consequences for software-defined environments to support data analytics. We will conceptualize software-defined elastic systems for data analytics and present a case study in smart city management, urban mobility and energy systems with our elasticity supports.","PeriodicalId":273902,"journal":{"name":"2014 IEEE International Conference on Cloud Engineering","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115359687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}