Dual Scaling VMs and Queries: Cost-Effective Latency Curtailment

2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS). Pub Date: 2017-07-13. DOI: 10.1109/ICDCS.2017.231

Juan F. Pérez, R. Birke, Mathias Björkqvist, L. Chen


Citations: 7

Abstract

Wimpy virtual instances, equipped with few cores and little RAM, are popular public and private cloud offerings because of their low cost for hosting applications. The challenge is how to run latency-sensitive applications on such instances, which trade performance for cost. In this study, we show analytically and experimentally that simultaneously scaling resources at coarse granularity and workloads at fine granularity, i.e., submitting multiple clones of a query to different servers, can overcome the performance disadvantages of wimpy VM instances and achieve stringent latency targets that are even lower than the average execution times of wimpy servers. To this end, we first derive a closed-form analysis of the latency under any given VM provisioning and query replication level, considering both cloning policies that can terminate outstanding clones, at an overhead, and policies that cannot terminate them and thus incur no such overhead. Validated on trace-driven simulations, our analysis accurately predicts the latency and enables an efficient search for the optimal number of VMs and clones. Second, we develop a dual elastic scaler, DuoScale, that dynamically scales VMs and clones according to the workload dynamics so as to achieve the target latency in a cost-effective manner. The effectiveness of DuoScale rests on the observation that application performance scales only sub-linearly with increasing vertical or horizontal resource provisioning, i.e., resources per VM or number of VMs. We evaluate DuoScale against VM-only scaling strategies via extensive trace-driven simulations as well as experiments on a cloud test-bed. Our results show that DuoScale achieves the stringent target latency by using clones on wimpy VMs with cost savings of up to 50% compared to scaling brawny VMs, which offer better performance at a higher unit cost.
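To make the idea of jointly choosing VMs and clones concrete, the following is a minimal sketch under a deliberately simplified toy model. It is not the closed-form analysis derived in the paper, nor the DuoScale algorithm. The assumptions are: Poisson query arrivals, exponentially distributed execution times, each query cloned to distinct VMs chosen uniformly at random, each VM modeled as an independent M/M/1 queue, and clones that are never terminated. The function names `mean_latency` and `cheapest_config`, and all parameter values, are hypothetical.

```python
from typing import Optional, Tuple


def mean_latency(n_vms: int, clones: int,
                 arrival_rate: float, service_rate: float) -> Optional[float]:
    """Approximate mean query latency in the toy model.

    Returns None when the configuration is invalid or a VM queue is unstable.
    """
    if clones < 1 or clones > n_vms:
        return None
    # Every query sends `clones` copies, so each VM sees this clone arrival rate.
    per_vm_load = arrival_rate * clones / n_vms
    slack = service_rate - per_vm_load
    if slack <= 0:
        return None
    # M/M/1 sojourn time is exponential with rate `slack`. Treating the queues
    # as independent (an approximation), the fastest of `clones` copies
    # finishes after an exponential time with rate clones * slack.
    return 1.0 / (clones * slack)


def cheapest_config(arrival_rate: float, service_rate: float,
                    target_latency: float,
                    max_vms: int = 50) -> Optional[Tuple[int, int, float]]:
    """Find the fewest VMs (cost assumed proportional to VM count) that meet
    the target, together with the clone count giving the lowest latency."""
    for n_vms in range(1, max_vms + 1):
        best = None
        for clones in range(1, n_vms + 1):
            latency = mean_latency(n_vms, clones, arrival_rate, service_rate)
            if latency is None or latency > target_latency:
                continue
            if best is None or latency < best[2]:
                best = (n_vms, clones, latency)
        if best is not None:
            return best
    return None


if __name__ == "__main__":
    # Mean execution time is 1.0 (service_rate = 1.0); the target of 0.5 is
    # below it, so no number of VMs can meet it without cloning in this model.
    print(cheapest_config(arrival_rate=2.0, service_rate=1.0, target_latency=0.5))
    # → (16, 4, 0.5)
```

In this toy setting, a single copy per query always takes longer than the mean execution time of 1.0, whereas 4 clones on 16 VMs reach a mean latency of 0.5. The example also shows why the clone count has to be tuned: each extra clone speeds up the fastest copy but raises the load on every VM, so too many clones hurt latency again.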


