{"title":"慢点还是慢点?对云用户来说似乎是一样的","authors":"Laiping Zhao, Xiaobo Zhou","doi":"10.1145/3129457.3129496","DOIUrl":null,"url":null,"abstract":"Recent years have seen the rapidly growing cloud computing market. A massive enterprise applications, like social networking, e-commerce, video streaming, email, web search, mapreduce, spark, are moving to cloud systems. These applications often require tens or hundreds of tasks or micro-services to complete, and need to deal with billions of visits per day while handling unprecedented volumes of data. At the same time, these applications need to deliver quick and predictable response times to their users. However, performance predictability has always been one of the biggest challenges in cloud computing. Despite many optimizations and improvements on both hardware and software, the distribution of latencies for Google's back end services show that while majority of requests take around 50-60 ms, significant fraction of requests takes longer than 100 ms, with the largest difference being almost 600 times [10]. The great variance impacts the quality of experience (QoE) for users and directly leads to revenue losses as well as increases in operational costs. Google's study shows that if the response time increase from 0.4 second to 0.9 second, then traffic and ad revenues down 20% [1]. Amazon also reports that every 100 ms increase on the response time leads to sales down 1% [4]. According to Nielsen [14], (i) 0.1 second is about the limit for having the user feel that the system is reacting instantaneously. (ii) 1.0 second is about the limit for the user's flow of thought to stay uninterrupted, even though the user will notice the delay. (iii) 10 seconds is about the limit for keeping the user's attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish. In this sense, \"slow response\" and \"service unavailable\" seem to be the same for cloud users. 
Currently, major cloud providers like Amazon, Microsoft, and Google merely state the uptime availability guarantee in their Service Level Agreements (SLA), but never provide guarantee on QoE (e.g., response time). Since the traditional availability is defined based on the failure/repair behaviors of cloud services, this clearly cannot satisfy user's requirements on quick response time. The reason for this is that the complex and diverse uncertainty behaviors in cloud systems make performance predictability very difficult. In general, these uncertainties have two main characteristics: • Diversity: Uncertainties in cloud systems come from many diverse sources, including hardware layer (e.g., failures, system resource competition, network resource competition) and software layer (e.g., scheduling algorithm, software bugs, unexpected workload, loss of data) [9]. • Transmissibility: The uncertainties may not only affect a single service, but also degrade the performance of a chain of services or other co-loated applications. 
For example, the loss of a piece of intermediate data would require the re-generation of data from its parent tasks, and postpone the schedule of children tasks; A service experiencing bursty workloads may preempt more resources, or a machine failure will reduce the available resources in the pool, leading to performance slowdowns of other co-loated services.","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"75 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Slow or Down?: Seem to Be the Same for Cloud Users\",\"authors\":\"Laiping Zhao, Xiaobo Zhou\",\"doi\":\"10.1145/3129457.3129496\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent years have seen the rapidly growing cloud computing market. A massive enterprise applications, like social networking, e-commerce, video streaming, email, web search, mapreduce, spark, are moving to cloud systems. These applications often require tens or hundreds of tasks or micro-services to complete, and need to deal with billions of visits per day while handling unprecedented volumes of data. At the same time, these applications need to deliver quick and predictable response times to their users. However, performance predictability has always been one of the biggest challenges in cloud computing. Despite many optimizations and improvements on both hardware and software, the distribution of latencies for Google's back end services show that while majority of requests take around 50-60 ms, significant fraction of requests takes longer than 100 ms, with the largest difference being almost 600 times [10]. 
The great variance impacts the quality of experience (QoE) for users and directly leads to revenue losses as well as increases in operational costs. Google's study shows that if the response time increase from 0.4 second to 0.9 second, then traffic and ad revenues down 20% [1]. Amazon also reports that every 100 ms increase on the response time leads to sales down 1% [4]. According to Nielsen [14], (i) 0.1 second is about the limit for having the user feel that the system is reacting instantaneously. (ii) 1.0 second is about the limit for the user's flow of thought to stay uninterrupted, even though the user will notice the delay. (iii) 10 seconds is about the limit for keeping the user's attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish. In this sense, \\\"slow response\\\" and \\\"service unavailable\\\" seem to be the same for cloud users. Currently, major cloud providers like Amazon, Microsoft, and Google merely state the uptime availability guarantee in their Service Level Agreements (SLA), but never provide guarantee on QoE (e.g., response time). Since the traditional availability is defined based on the failure/repair behaviors of cloud services, this clearly cannot satisfy user's requirements on quick response time. The reason for this is that the complex and diverse uncertainty behaviors in cloud systems make performance predictability very difficult. In general, these uncertainties have two main characteristics: • Diversity: Uncertainties in cloud systems come from many diverse sources, including hardware layer (e.g., failures, system resource competition, network resource competition) and software layer (e.g., scheduling algorithm, software bugs, unexpected workload, loss of data) [9]. • Transmissibility: The uncertainties may not only affect a single service, but also degrade the performance of a chain of services or other co-loated applications. 
For example, the loss of a piece of intermediate data would require the re-generation of data from its parent tasks, and postpone the schedule of children tasks; A service experiencing bursty workloads may preempt more resources, or a machine failure will reduce the available resources in the pool, leading to performance slowdowns of other co-loated services.\",\"PeriodicalId\":345943,\"journal\":{\"name\":\"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters\",\"volume\":\"75 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-04-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3129457.3129496\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3129457.3129496","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Slow or Down?: Seem to Be the Same for Cloud Users
Recent years have seen a rapidly growing cloud computing market. Massive numbers of enterprise applications, such as social networking, e-commerce, video streaming, email, web search, MapReduce, and Spark, are moving to cloud systems. These applications often require tens or hundreds of tasks or micro-services to complete, and need to handle billions of visits per day while processing unprecedented volumes of data. At the same time, they need to deliver quick and predictable response times to their users. However, performance predictability has always been one of the biggest challenges in cloud computing. Despite many optimizations and improvements in both hardware and software, the distribution of latencies for Google's back-end services shows that while the majority of requests take around 50-60 ms, a significant fraction take longer than 100 ms, with the largest difference being almost 600 times [10]. This great variance hurts the quality of experience (QoE) for users and directly leads to revenue losses as well as increases in operational costs. Google's study shows that if the response time increases from 0.4 seconds to 0.9 seconds, traffic and ad revenues drop by 20% [1]. Amazon likewise reports that every 100 ms increase in response time reduces sales by 1% [4]. According to Nielsen [14]: (i) 0.1 seconds is about the limit for the user to feel that the system is reacting instantaneously; (ii) 1.0 second is about the limit for the user's flow of thought to stay uninterrupted, even though the user will notice the delay; and (iii) 10 seconds is about the limit for keeping the user's attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish. In this sense, "slow response" and "service unavailable" seem to be the same for cloud users.
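The gap between median and tail latency described above can be made concrete with a small sketch. The latency samples and the nearest-rank percentile helper below are illustrative assumptions, not Google's actual measurements; the point is only that a batch whose median is ~55 ms can still contain a straggler hundreds of times slower.

```python
# Sketch: quantify the median-vs-tail latency gap for a batch of requests.
# Latency samples are illustrative, not real measurements.

def percentile(samples, p):
    """Nearest-rank percentile of `samples` (0 < p <= 100)."""
    ordered = sorted(samples)
    rank = max(0, int(round(p / 100.0 * len(ordered))) - 1)
    return ordered[rank]

def tail_report(latencies_ms):
    """Return (median, p99, worst/median ratio) for latencies in milliseconds."""
    median = percentile(latencies_ms, 50)
    p99 = percentile(latencies_ms, 99)
    ratio = max(latencies_ms) / median
    return median, p99, ratio

# 1000 requests: most near 55 ms, a few slow, one extreme straggler.
latencies = [55] * 950 + [120] * 49 + [33000]
med, p99, ratio = tail_report(latencies)
print(med, p99, ratio)  # median stays at 55 ms, yet the worst request is 600x slower
```

A user who hits the straggler experiences the service as effectively down, even though the aggregate median looks healthy, which is exactly why tail percentiles rather than averages matter for QoE.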
Currently, major cloud providers such as Amazon, Microsoft, and Google merely state an uptime availability guarantee in their Service Level Agreements (SLAs), but provide no guarantee on QoE (e.g., response time). Since traditional availability is defined in terms of the failure/repair behavior of cloud services, it clearly cannot satisfy users' requirements for quick response times. The underlying reason is that the complex and diverse uncertain behaviors in cloud systems make performance predictability very difficult. In general, these uncertainties have two main characteristics:
• Diversity: Uncertainties in cloud systems come from many diverse sources, including the hardware layer (e.g., failures, system resource competition, network resource competition) and the software layer (e.g., scheduling algorithms, software bugs, unexpected workloads, loss of data) [9].
• Transmissibility: An uncertainty may not only affect a single service, but also degrade the performance of a chain of services or other co-located applications. For example, the loss of a piece of intermediate data requires re-generating it from the parent tasks and postpones the scheduling of child tasks; a service experiencing bursty workloads may preempt more resources, and a machine failure reduces the available resources in the pool, leading to performance slowdowns of other co-located services.
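The transmissibility point can be sketched with a toy task graph. The task names, run times, and the "re-run the parent on data loss" penalty below are hypothetical assumptions chosen for illustration; the sketch only shows how one lost intermediate output pushes back every downstream task.

```python
# Sketch: how losing one intermediate result delays downstream tasks.
# Task graph and run times are illustrative, not taken from the paper.

# Each task maps to (run_time, parent tasks whose output it consumes).
TASKS = {
    "extract":   (4, []),
    "transform": (6, ["extract"]),
    "load":      (2, ["transform"]),
}

def finish_times(tasks, lost_outputs=()):
    """Earliest finish time per task; a task whose output was lost must
    re-run, so its run time is paid twice before its children can start."""
    done = {}
    def finish(name):
        if name in done:
            return done[name]
        run, parents = tasks[name]
        start = max((finish(p) for p in parents), default=0)
        cost = run * 2 if name in lost_outputs else run  # re-generation penalty
        done[name] = start + cost
        return done[name]
    for task in tasks:
        finish(task)
    return done

normal = finish_times(TASKS)
degraded = finish_times(TASKS, lost_outputs={"transform"})
# Losing "transform"'s output postpones "load" even though "load" itself is healthy.
print(normal["load"], degraded["load"])
```

The delay lands on a task that experienced no fault of its own, which is the essence of transmissibility: uncertainty injected at one point in the service chain surfaces as a slowdown somewhere else.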