
Latest Publications — 2016 16th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid)

OptCon: An Adaptable SLA-Aware Consistency Tuning Framework for Quorum-Based Stores
Subhajit Sidhanta, W. Golab, S. Mukhopadhyay, Saikat Basu
Users of distributed datastores that employ quorum-based replication are burdened with the choice of a suitable client-centric consistency setting for each storage operation. This choice is difficult to reason about, as it requires deliberating over the tradeoff between latency and staleness, i.e., how stale (old) the result is. The latency and staleness of a given operation depend on the client-centric consistency setting applied, as well as on dynamic parameters such as the current workload and network condition. We present OptCon, a novel machine-learning-based predictive framework that can automate the choice of client-centric consistency setting under user-specified latency and staleness thresholds given in the service level agreement (SLA). Under a given SLA, OptCon predicts a client-centric consistency setting that is matching, i.e., weak enough to satisfy the latency threshold while strong enough to satisfy the staleness threshold. While manually tuned consistency settings remain fixed unless explicitly reconfigured, OptCon tunes consistency settings on a per-operation basis with respect to changing workload and network state. Using decision tree learning, OptCon yields 0.14 cross-validation error in predicting matching consistency settings under the latency and staleness thresholds given in the SLA. We demonstrate experimentally that OptCon is at least as effective as any manually chosen consistency setting in adapting to the SLA thresholds for different use cases. We also demonstrate that OptCon adapts to variations in workload, whereas a given manually chosen fixed consistency setting satisfies the SLA only for a characteristic workload.
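The "matching" rule above can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: the level names, predicted values, and thresholds are invented, and a real deployment would obtain the per-level predictions from a learned model such as OptCon's decision tree.

```python
# Hypothetical sketch of OptCon-style consistency selection (not the paper's code).
# Given per-level predictions of latency and staleness, pick the weakest level
# (lowest latency) that still meets both SLA thresholds.

# Quorum consistency levels, ordered weakest -> strongest.
LEVELS = ["ONE", "QUORUM", "ALL"]

def choose_consistency(predictions, sla_latency_ms, sla_staleness_ms):
    """predictions: {level: (predicted_latency_ms, predicted_staleness_ms)}.
    Returns the weakest level whose predicted latency and staleness both
    satisfy the SLA, or None if no level matches."""
    for level in LEVELS:  # weakest first: prefer lower latency
        latency, staleness = predictions[level]
        if latency <= sla_latency_ms and staleness <= sla_staleness_ms:
            return level
    return None

# Invented example predictions for one operation under current workload.
preds = {"ONE": (5.0, 120.0), "QUORUM": (9.0, 40.0), "ALL": (20.0, 0.0)}
print(choose_consistency(preds, sla_latency_ms=15.0, sla_staleness_ms=50.0))  # QUORUM
```

Here ONE is fast but too stale, ALL is fresh but too slow, so QUORUM is the matching setting for this SLA.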
DOI: 10.1109/CCGrid.2016.9 · Published: 2016-03-25
Citations: 9
OptEx: A Deadline-Aware Cost Optimization Model for Spark
Subhajit Sidhanta, W. Golab, S. Mukhopadhyay
We present OptEx, a closed-form model of job execution on Apache Spark, a popular parallel processing engine. To the best of our knowledge, OptEx is the first work that analytically models job completion time on Spark. The model can be used to estimate the completion time of a given Spark job on a cloud with respect to the size of the input dataset, the number of iterations, and the number of nodes comprising the underlying cluster. Experimental results demonstrate that OptEx yields a mean relative error of 6% in estimating job completion time. Furthermore, the model can be applied to estimate the cost-optimal cluster composition for running a given Spark job on a cloud under a completion deadline specified in the SLO (i.e., Service Level Objective). We show experimentally that OptEx is able to correctly estimate the cost-optimal cluster composition for running a given Spark job under an SLO deadline with an accuracy of 98%.
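As a rough illustration of how a closed-form model supports both estimation and cost-optimal sizing, consider the toy sketch below. The model form (a fixed serial overhead plus perfectly parallelizable work) and all coefficients are assumptions for illustration, not OptEx's fitted model.

```python
# Toy sketch of an OptEx-style closed-form estimate. The model form and the
# coefficients (serial_s, per_gb_iter_s) are invented for illustration.

def estimate_completion_time(n_nodes, dataset_gb, iterations,
                             serial_s=30.0, per_gb_iter_s=8.0):
    # Fixed serial overhead plus work that parallelises across the cluster.
    return serial_s + per_gb_iter_s * dataset_gb * iterations / n_nodes

def cheapest_cluster(deadline_s, dataset_gb, iterations, max_nodes=256):
    # Under a flat per-node price, total cost grows with node count, so the
    # smallest cluster that meets the deadline is also the cheapest.
    for n in range(1, max_nodes + 1):
        if estimate_completion_time(n, dataset_gb, iterations) <= deadline_s:
            return n
    return None  # deadline infeasible within max_nodes

print(cheapest_cluster(deadline_s=200.0, dataset_gb=50, iterations=10))  # 24
```

With these invented coefficients, a 50 GB job with 10 iterations needs 24 nodes to finish inside a 200-second deadline.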
DOI: 10.1109/CCGrid.2016.10 · Published: 2016-03-25
Citations: 49
DocLite: A Docker-Based Lightweight Cloud Benchmarking Tool
B. Varghese, Lawan Thamsuhang Subba, Long Thai, A. Barker
Existing benchmarking methods are time-consuming processes, as they typically benchmark the entire Virtual Machine (VM) in order to generate accurate performance data, making them less suitable for real-time analytics. The research in this paper aims to surmount this challenge by presenting DocLite, a Docker container-based lightweight benchmarking tool. DocLite explores lightweight cloud benchmarking methods for rapidly executing benchmarks in near real-time. DocLite is built on Docker container technology, which allows a user-defined memory size and number of CPU cores of the VM to be benchmarked. The tool incorporates two benchmarking methods: the first, referred to as the native method, employs containers to benchmark a small portion of the VM and generate performance ranks; the second uses historic benchmark data along with the native method as a hybrid to generate VM ranks. The proposed methods are evaluated on three use cases and are observed to be up to 91 times faster than benchmarking the entire VM. In both methods, small containers provide the same quality of rankings as a large container. Compared against benchmarking the whole VM, the native method generates ranks with over 90% and 86% accuracy for sequential and parallel execution of an application, respectively. The hybrid method did not improve the quality of the rankings significantly.
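The ranking step of the native method can be illustrated with a toy sketch: benchmark a small container slice on each VM, combine the component scores, and order the VMs. The component weights and scores below are invented, not DocLite's actual scoring.

```python
# Toy sketch of DocLite-style VM ranking from container benchmark scores.
# The VM names, component scores, and weights are all invented.

def combined_score(metrics, weights):
    # Weighted sum of benchmark components; higher is better for each.
    return sum(metrics[k] * w for k, w in weights.items())

def rank_vms(results, weights):
    # results: {vm_name: {component: score}}; returns VM names, best first.
    return sorted(results,
                  key=lambda vm: combined_score(results[vm], weights),
                  reverse=True)

weights = {"cpu": 0.5, "memory": 0.3, "disk": 0.2}
results = {
    "m1.small":  {"cpu": 40, "memory": 55, "disk": 70},
    "m1.medium": {"cpu": 70, "memory": 60, "disk": 65},
    "m1.large":  {"cpu": 90, "memory": 80, "disk": 60},
}
print(rank_vms(results, weights))  # best VM first
```

A real run would fill `results` from benchmarks executed inside resource-limited containers (e.g. `docker run` with CPU and memory caps) rather than from hard-coded numbers.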
DOI: 10.1109/CCGrid.2016.14 · Published: 2016-03-23
Citations: 13
Checkpointing to Minimize Completion Time for Inter-Dependent Parallel Processes on Volunteer Grids
M. T. Rahman, Hien Nguyen, J. Subhlok, Gopal Pandurangan
Volunteer computing is being used successfully for large-scale scientific computations. This research is in the context of Volpex, a programming framework that supports communicating parallel processes in a volunteer environment. Redundancy and checkpointing are combined to ensure consistent forward progress with Volpex in this unique execution environment, characterized by heterogeneous failure-prone nodes and interdependent replicated processes. An important parameter for optimizing performance with Volpex is the frequency of checkpointing. The paper presents a mathematical model that minimizes the completion time of inter-dependent parallel processes running in a volunteer environment by finding a suitable checkpoint interval. Validation is performed with a sample real-world application running on a pool of distributed volunteer nodes. The results indicate that the performance with our predicted checkpoint interval is fairly close to the best performance obtained empirically by varying the checkpoint interval.
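The paper derives its own model for interdependent replicated processes; as a simpler point of reference, Young's classic first-order approximation for an independent process picks the checkpoint interval from the checkpoint cost and the node's mean time between failures (MTBF). The numbers below are invented:

```python
import math

# Young's first-order approximation for the optimal checkpoint interval
# (valid when the checkpoint cost C is much smaller than the MTBF).
# This is a classical baseline, not the paper's model for interdependent
# volunteer processes.

def young_interval(checkpoint_cost_s, mtbf_s):
    # Optimal interval ~ sqrt(2 * C * MTBF).
    return math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)

# Invented example: 30 s to write a checkpoint, 6 h MTBF per volunteer node.
print(young_interval(checkpoint_cost_s=30.0, mtbf_s=6 * 3600.0))
```

Intuitively, cheaper checkpoints or flakier nodes both push the optimal interval down, which matches the tradeoff the paper's model optimizes.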
DOI: 10.1109/CCGrid.2016.78 · Published: 2016-03-11
Citations: 9
A Distributed System for Storing and Processing Data from Earth-Observing Satellites: System Design and Performance Evaluation of the Visualisation Tool
M. Szuba, P. Ameri, U. Grabowski, Jörg Meyer, A. Streit
We present a distributed system for storage, processing, three-dimensional visualisation, and basic analysis of data from Earth-observing satellites. The database and the server have been designed for high performance and scalability, whereas the client is highly portable thanks to its design as an HTML5- and WebGL-based Web application. The system is based on the so-called MEAN stack, a modern replacement for LAMP which has steadily been gaining traction among high-performance Web applications. We demonstrate the performance of the system from the perspective of a user operating the client.
DOI: 10.1109/CCGrid.2016.19 · Published: 2015-11-24
Citations: 5
Medusa: An Efficient Cloud Fault-Tolerant MapReduce
Pedro Costa, Xiao Bai, Fernando M. V. Ramos, M. Correia
Applications such as web search and social networking have been moving from centralized to decentralized cloud architectures to improve their scalability. MapReduce, a programming framework for processing large amounts of data using thousands of machines in a single cloud, also needs to be scaled out to multiple clouds to adapt to this evolution. The challenge of building a multi-cloud distributed architecture is substantial. Notwithstanding, the ability to deal with the new types of faults introduced by such a setting, such as the outage of a whole datacenter or an arbitrary fault caused by a malicious cloud insider, increases the endeavor considerably. In this paper we propose Medusa, a platform that allows MapReduce computations to scale out to multiple clouds and tolerate several types of faults. Our solution fulfills four objectives. First, it is transparent to the user, who writes her typical MapReduce application without modification. Second, it does not require any modification to the widely used Hadoop framework. Third, the proposed system goes well beyond the fault tolerance offered by MapReduce to tolerate arbitrary faults, cloud outages, and even malicious faults caused by corrupt cloud insiders. Fourth, it achieves this increased level of fault tolerance at reasonable cost. We performed an extensive experimental evaluation in the ExoGENI testbed, demonstrating that our solution significantly reduces execution time when compared to traditional methods that achieve the same level of resilience.
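Tolerating an arbitrary or malicious fault generally requires comparing replicated outputs. The sketch below is a generic majority-vote illustration under assumed replication across clouds, not Medusa's actual protocol:

```python
from collections import Counter

# Generic majority-vote sketch (not Medusa's protocol): run each task on
# several clouds and accept an output once a strict majority of the replica
# results agree, masking a faulty or malicious cloud.

def majority_output(replica_outputs):
    # replica_outputs: one task result (e.g. an output digest) per cloud.
    value, votes = Counter(replica_outputs).most_common(1)[0]
    if votes > len(replica_outputs) // 2:
        return value
    return None  # no majority: re-execute the task on another cloud

print(majority_output(["digest-a", "digest-a", "digest-b"]))  # digest-a
```

Comparing compact digests of task outputs, rather than the outputs themselves, keeps the cross-cloud traffic for such voting small.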
web搜索和社交网络等应用程序已经从集中式云架构转向分散式云架构,以提高其可扩展性。MapReduce是一个在单个云中使用数千台机器处理大量数据的编程框架,它也需要扩展到多个云中以适应这种发展。构建多云分布式架构的挑战是巨大的。尽管如此,处理由这种设置引入的新类型故障的能力(例如整个数据中心的中断或由恶意云内部人员引起的任意故障)大大增加了工作量。在本文中,我们提出了Medusa,一个允许MapReduce计算向外扩展到多个云并容忍多种类型故障的平台。我们的解决方案实现了四个目标。首先,它对用户是透明的,用户无需修改即可编写典型的MapReduce应用程序。其次,它不需要对广泛使用的Hadoop框架进行任何修改。第三,所提出的系统远远超出了MapReduce提供的容错能力,可以容忍任意错误、云中断,甚至是由腐败的云内部人员造成的恶意错误。第四,它以合理的成本实现了这种更高级别的容错。我们在ExoGENI测试平台上进行了广泛的实验评估,证明与达到相同弹性水平的传统方法相比,我们的解决方案显着减少了执行时间。
DOI: 10.1109/CCGrid.2016.20 · Published: 2015-11-23
Citations: 18
De-Fragmenting the Cloud
Mayank Mishra, U. Bellur
Existing Virtual Machine (VM) placement schemes have looked to conserve either CPU and memory on the physical machine (PM) or network resources (bandwidth), but not both. However, real applications use all resource types to varying degrees. The result of applying existing placement schemes to VMs running real applications is a fragmented data center, where resources along one dimension become unusable, even though they are available, because of the unavailability of resources along other dimensions. An example of this fragmentation is unusable CPU caused by a bottlenecked network link from a PM that has available CPU. To date, evaluations of the efficacy of VM placement schemes have not recognized this fragmentation and its ill effects, let alone tried to measure and avoid it. In this paper, we first define the notion of what we term "relative resource fragmentation" and illustrate how it can be measured in a data center. The metric we put forth for capturing the degree of fragmentation is comprehensive and includes all key data center resource types. We then propose a VM placement scheme that minimizes this fragmentation and therefore maximizes the utility of data center resources. Results of empirical evaluations of our placement scheme compared to existing placement schemes show a reduction in fragmentation of as much as 15% and an increase in the number of successfully placed applications of as much as 20%.
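One way to make the idea of stranded capacity concrete is a toy per-PM measure: free capacity in one dimension is stranded when another dimension on the same PM is exhausted. The formula below is our own illustration, not the paper's metric:

```python
# Toy illustration of multi-dimensional resource fragmentation (the formula
# is invented for illustration, not the paper's "relative resource
# fragmentation" metric). A PM's usable balanced capacity is bounded by its
# scarcest free dimension; free capacity above that bound is stranded.

def pm_fragmentation(free_fractions):
    # free_fractions: fraction of each resource still free on one PM,
    # e.g. {"cpu": 0.6, "mem": 0.0, "net": 0.3}.
    scarcest = min(free_fractions.values())
    stranded = [f - scarcest for f in free_fractions.values()]
    return sum(stranded) / len(free_fractions)

def dc_fragmentation(pms):
    # Average the per-PM measure over the whole data center.
    return sum(pm_fragmentation(p) for p in pms) / len(pms)

balanced = {"cpu": 0.5, "mem": 0.5, "net": 0.5}   # nothing stranded
skewed   = {"cpu": 0.6, "mem": 0.0, "net": 0.3}   # cpu/net stranded by memory
print(pm_fragmentation(balanced), pm_fragmentation(skewed))
```

On the skewed PM, exhausted memory strands the free CPU and bandwidth, which is exactly the situation a fragmentation-aware placement scheme tries to avoid.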
DOI: 10.1109/CCGrid.2016.21 · Published: 2015-06-23
Citations: 5