Auto-scaling of Web Applications in Clouds: A Tail Latency Evaluation

M. Aslanpour, A. Toosi, R. Gaire, M. A. Cheema
DOI: 10.1109/UCC48980.2020.00037
Published in: 2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC), December 2020
Citations: 7

Abstract

Mechanisms that dynamically add and remove Virtual Machines (VMs) to reduce cost while minimizing latency are called auto-scaling. Latency improvements are mainly achieved by minimizing "average" response times, while the unpredictable load fluctuations of Web applications, known as flash crowds, can cause very high latencies for users' requests. Requests affected by a flash crowd suffer long latencies, known as outliers. Such outliers are largely inevitable as long as auto-scaling solutions continue to improve the average, rather than the "tail", of latencies. In this paper, we study possible sources of tail latency in auto-scaling mechanisms for Web applications. Based on extensive evaluations on a real cloud platform, we identify the following sources of tail latency: 1) large, i.e. data-intensive, requests; 2) long scaling intervals; 3) instantaneous analysis of scaling parameters; 4) conservative, i.e. tight, threshold tuning; 5) load-unaware surplus-VM selection policies used when executing a scale-down decision; 6) the cooldown feature, although it is cost-effective; and 7) VM start-up delay. We also found that after auto-scaling mechanisms improve the average latency, the tail may behave differently, demanding dedicated tail-aware auto-scaling solutions.
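Several of the tail-latency sources listed above (scaling intervals, threshold tightness, cooldown, VM start-up delay) are knobs of a reactive threshold-based auto-scaler. The sketch below is a minimal, illustrative model of such a scaler, not the paper's implementation; all class names and parameter values are assumptions chosen for clarity.

```python
# Illustrative sketch of a reactive threshold-based auto-scaler. The knobs
# mirror the tail-latency sources named in the abstract: scaling interval
# (one call to step() per interval), threshold tuning, cooldown, and the
# start-up delay a newly added VM would still pay before serving traffic.
# Names and numbers are hypothetical, not taken from the paper.

from dataclasses import dataclass

@dataclass
class ScalerConfig:
    upper_threshold: float = 0.80   # scale out above 80% average CPU
    lower_threshold: float = 0.30   # scale in below 30% average CPU
    cooldown_intervals: int = 3     # intervals to skip after any action
    min_vms: int = 1

class ThresholdAutoscaler:
    def __init__(self, config: ScalerConfig, vms: int = 1):
        self.config = config
        self.vms = vms
        self.cooldown_left = 0      # remaining intervals in cooldown

    def step(self, avg_cpu: float) -> int:
        """Run one scaling interval; return the new VM count."""
        if self.cooldown_left > 0:
            # Cooldown suppresses oscillation and saves cost, but a flash
            # crowd arriving during it goes unanswered -- one of the
            # tail-latency sources the paper identifies.
            self.cooldown_left -= 1
            return self.vms
        if avg_cpu > self.config.upper_threshold:
            self.vms += 1           # the new VM still pays its start-up delay
            self.cooldown_left = self.config.cooldown_intervals
        elif (avg_cpu < self.config.lower_threshold
              and self.vms > self.config.min_vms):
            self.vms -= 1           # which surplus VM is removed matters for the tail
            self.cooldown_left = self.config.cooldown_intervals
        return self.vms

scaler = ThresholdAutoscaler(ScalerConfig())
loads = [0.5, 0.9, 0.95, 0.95, 0.95, 0.2]
history = [scaler.step(u) for u in loads]
print(history)  # → [1, 2, 2, 2, 2, 1]
```

Note how the sustained 0.95 load after the first scale-out triggers no further action: the cooldown holds the VM count flat, which is exactly the cost/tail-latency trade-off the paper evaluates.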