Auto-scaling of Web Applications in Clouds: A Tail Latency Evaluation

2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC). Pub Date: 2020-12-01. DOI: 10.1109/UCC48980.2020.00037

M. Aslanpour, A. Toosi, R. Gaire, M. A. Cheema


Citations: 7

Abstract


Mechanisms that dynamically add and remove Virtual Machines (VMs) to reduce cost while minimizing latency are called auto-scaling. Latency improvements are mainly achieved by minimizing the "average" response time, yet unpredictable fluctuations in Web application workloads, known as flash crowds, can result in very high latencies for users’ requests. Requests affected by a flash crowd suffer from long latencies, known as outliers. Such outliers are largely inevitable as long as auto-scaling solutions continue to improve the average, not the "tail", of latencies. In this paper, we study possible sources of tail latency in auto-scaling mechanisms for Web applications. Based on extensive evaluations on a real cloud platform, we identified the following sources of tail latency: 1) large requests, i.e. data-intensive ones; 2) long scaling intervals; 3) instant analysis of scaling parameters; 4) conservative, i.e. tight, threshold tuning; 5) load-unaware surplus VM selection policies used for executing a scale-down decision; 6) the cooldown feature, although it is cost-effective; and 7) VM start-up delay. We also found that after auto-scaling mechanisms improve the average latency, the tail may behave differently, which calls for dedicated tail-aware auto-scaling solutions.
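The distinction the abstract draws between the "average" and the "tail" of latencies can be illustrated with a minimal sketch. The sample values, the `percentile` helper, and the choice of the nearest-rank 99th percentile as the tail metric are assumptions made for illustration only; they are not taken from the paper.

```python
import math
import statistics


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at
    least p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in milliseconds (illustrative, not
# measured): 98 fast requests plus 2 slow ones, as might occur
# during a flash crowd.
latencies_ms = [100] * 98 + [5000] * 2

mean_ms = statistics.mean(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```

In this made-up sample the mean stays under 200 ms while the 99th percentile is 5000 ms, which is one way an auto-scaler tuned only on the average could overlook the requests that suffer most.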
