{"title":"云中Web应用程序的自动伸缩:尾部延迟评估","authors":"M. Aslanpour, A. Toosi, R. Gaire, M. A. Cheema","doi":"10.1109/UCC48980.2020.00037","DOIUrl":null,"url":null,"abstract":"Mechanisms for dynamically adding and removing Virtual Machines (VMs) to reduce cost while minimizing the latency are called auto-scaling. Latency improvements are mainly fulfilled through minimizing the \"average\" response times while unpredictabilities and fluctuations of the Web applications, aka flash crowds, can result in very high latencies for users’ requests. Requests influenced by flash crowd suffer from long latencies, known as outliers. Such outliers are inevitable to a large extent as auto-scaling solutions continue to improve the average, not the \"tail\" of latencies. In this paper, we study possible sources of tail latency in auto-scaling mechanisms for Web applications. Based on our extensive evaluations in a real cloud platform, we discovered sources of a tail latency as 1) large requests, i.e. those data-intensive; 2) long-term scaling intervals; 3) instant analysis of scaling parameters; 4) conservative, i.e. tight, threshold tuning; 5) load-unaware surplus VM selection policies used for executing a scale-down decision; 6) cooldown feature, although cost-effective; and 7) VM start-up delay. We also discovered that after improving the average latency by auto-scaling mechanisms, the tail may behave differently, demanding dedicated tail-aware solutions for auto-scaling mechanisms.","PeriodicalId":125849,"journal":{"name":"2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":"{\"title\":\"Auto-scaling of Web Applications in Clouds: A Tail Latency Evaluation\",\"authors\":\"M. Aslanpour, A. Toosi, R. Gaire, M. A. Cheema\",\"doi\":\"10.1109/UCC48980.2020.00037\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Mechanisms for dynamically adding and removing Virtual Machines (VMs) to reduce cost while minimizing the latency are called auto-scaling. Latency improvements are mainly fulfilled through minimizing the \\\"average\\\" response times while unpredictabilities and fluctuations of the Web applications, aka flash crowds, can result in very high latencies for users’ requests. Requests influenced by flash crowd suffer from long latencies, known as outliers. Such outliers are inevitable to a large extent as auto-scaling solutions continue to improve the average, not the \\\"tail\\\" of latencies. In this paper, we study possible sources of tail latency in auto-scaling mechanisms for Web applications. Based on our extensive evaluations in a real cloud platform, we discovered sources of a tail latency as 1) large requests, i.e. those data-intensive; 2) long-term scaling intervals; 3) instant analysis of scaling parameters; 4) conservative, i.e. tight, threshold tuning; 5) load-unaware surplus VM selection policies used for executing a scale-down decision; 6) cooldown feature, although cost-effective; and 7) VM start-up delay. We also discovered that after improving the average latency by auto-scaling mechanisms, the tail may behave differently, demanding dedicated tail-aware solutions for auto-scaling mechanisms.\",\"PeriodicalId\":125849,\"journal\":{\"name\":\"2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC)\",\"volume\":\"28 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"7\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/UCC48980.2020.00037\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/UCC48980.2020.00037","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Auto-scaling of Web Applications in Clouds: A Tail Latency Evaluation
Mechanisms for dynamically adding and removing Virtual Machines (VMs) to reduce cost while minimizing the latency are called auto-scaling. Latency improvements are mainly fulfilled through minimizing the "average" response times while unpredictabilities and fluctuations of the Web applications, aka flash crowds, can result in very high latencies for users’ requests. Requests influenced by flash crowd suffer from long latencies, known as outliers. Such outliers are inevitable to a large extent as auto-scaling solutions continue to improve the average, not the "tail" of latencies. In this paper, we study possible sources of tail latency in auto-scaling mechanisms for Web applications. Based on our extensive evaluations in a real cloud platform, we discovered sources of a tail latency as 1) large requests, i.e. those data-intensive; 2) long-term scaling intervals; 3) instant analysis of scaling parameters; 4) conservative, i.e. tight, threshold tuning; 5) load-unaware surplus VM selection policies used for executing a scale-down decision; 6) cooldown feature, although cost-effective; and 7) VM start-up delay. We also discovered that after improving the average latency by auto-scaling mechanisms, the tail may behave differently, demanding dedicated tail-aware solutions for auto-scaling mechanisms.