{"title":"慢点还是慢点?对云用户来说似乎是一样的","authors":"Laiping Zhao, Xiaobo Zhou","doi":"10.1145/3129457.3129496","DOIUrl":null,"url":null,"abstract":"Recent years have seen the rapidly growing cloud computing market. A massive enterprise applications, like social networking, e-commerce, video streaming, email, web search, mapreduce, spark, are moving to cloud systems. These applications often require tens or hundreds of tasks or micro-services to complete, and need to deal with billions of visits per day while handling unprecedented volumes of data. At the same time, these applications need to deliver quick and predictable response times to their users. However, performance predictability has always been one of the biggest challenges in cloud computing. Despite many optimizations and improvements on both hardware and software, the distribution of latencies for Google's back end services show that while majority of requests take around 50-60 ms, significant fraction of requests takes longer than 100 ms, with the largest difference being almost 600 times [10]. The great variance impacts the quality of experience (QoE) for users and directly leads to revenue losses as well as increases in operational costs. Google's study shows that if the response time increase from 0.4 second to 0.9 second, then traffic and ad revenues down 20% [1]. Amazon also reports that every 100 ms increase on the response time leads to sales down 1% [4]. According to Nielsen [14], (i) 0.1 second is about the limit for having the user feel that the system is reacting instantaneously. (ii) 1.0 second is about the limit for the user's flow of thought to stay uninterrupted, even though the user will notice the delay. (iii) 10 seconds is about the limit for keeping the user's attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish. In this sense, \"slow response\" and \"service unavailable\" seem to be the same for cloud users. 
Currently, major cloud providers like Amazon, Microsoft, and Google merely state the uptime availability guarantee in their Service Level Agreements (SLA), but never provide guarantee on QoE (e.g., response time). Since the traditional availability is defined based on the failure/repair behaviors of cloud services, this clearly cannot satisfy user's requirements on quick response time. The reason for this is that the complex and diverse uncertainty behaviors in cloud systems make performance predictability very difficult. In general, these uncertainties have two main characteristics: • Diversity: Uncertainties in cloud systems come from many diverse sources, including hardware layer (e.g., failures, system resource competition, network resource competition) and software layer (e.g., scheduling algorithm, software bugs, unexpected workload, loss of data) [9]. • Transmissibility: The uncertainties may not only affect a single service, but also degrade the performance of a chain of services or other co-loated applications. 
For example, the loss of a piece of intermediate data would require the re-generation of data from its parent tasks, and postpone the schedule of children tasks; A service experiencing bursty workloads may preempt more resources, or a machine failure will reduce the available resources in the pool, leading to performance slowdowns of other co-loated services.","PeriodicalId":345943,"journal":{"name":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","volume":"75 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Slow or Down?: Seem to Be the Same for Cloud Users\",\"authors\":\"Laiping Zhao, Xiaobo Zhou\",\"doi\":\"10.1145/3129457.3129496\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent years have seen the rapidly growing cloud computing market. A massive enterprise applications, like social networking, e-commerce, video streaming, email, web search, mapreduce, spark, are moving to cloud systems. These applications often require tens or hundreds of tasks or micro-services to complete, and need to deal with billions of visits per day while handling unprecedented volumes of data. At the same time, these applications need to deliver quick and predictable response times to their users. However, performance predictability has always been one of the biggest challenges in cloud computing. Despite many optimizations and improvements on both hardware and software, the distribution of latencies for Google's back end services show that while majority of requests take around 50-60 ms, significant fraction of requests takes longer than 100 ms, with the largest difference being almost 600 times [10]. 
The great variance impacts the quality of experience (QoE) for users and directly leads to revenue losses as well as increases in operational costs. Google's study shows that if the response time increase from 0.4 second to 0.9 second, then traffic and ad revenues down 20% [1]. Amazon also reports that every 100 ms increase on the response time leads to sales down 1% [4]. According to Nielsen [14], (i) 0.1 second is about the limit for having the user feel that the system is reacting instantaneously. (ii) 1.0 second is about the limit for the user's flow of thought to stay uninterrupted, even though the user will notice the delay. (iii) 10 seconds is about the limit for keeping the user's attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish. In this sense, \\\"slow response\\\" and \\\"service unavailable\\\" seem to be the same for cloud users. Currently, major cloud providers like Amazon, Microsoft, and Google merely state the uptime availability guarantee in their Service Level Agreements (SLA), but never provide guarantee on QoE (e.g., response time). Since the traditional availability is defined based on the failure/repair behaviors of cloud services, this clearly cannot satisfy user's requirements on quick response time. The reason for this is that the complex and diverse uncertainty behaviors in cloud systems make performance predictability very difficult. In general, these uncertainties have two main characteristics: • Diversity: Uncertainties in cloud systems come from many diverse sources, including hardware layer (e.g., failures, system resource competition, network resource competition) and software layer (e.g., scheduling algorithm, software bugs, unexpected workload, loss of data) [9]. • Transmissibility: The uncertainties may not only affect a single service, but also degrade the performance of a chain of services or other co-loated applications. 
For example, the loss of a piece of intermediate data would require the re-generation of data from its parent tasks, and postpone the schedule of children tasks; A service experiencing bursty workloads may preempt more resources, or a machine failure will reduce the available resources in the pool, leading to performance slowdowns of other co-loated services.\",\"PeriodicalId\":345943,\"journal\":{\"name\":\"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters\",\"volume\":\"75 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-04-08\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3129457.3129496\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3129457.3129496","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Slow or Down?: Seem to Be the Same for Cloud Users
Recent years have seen a rapidly growing cloud computing market. Massive numbers of enterprise applications, such as social networking, e-commerce, video streaming, email, web search, MapReduce, and Spark, are moving to cloud systems. These applications often require tens or hundreds of tasks or micro-services to complete, and need to handle billions of visits per day while processing unprecedented volumes of data. At the same time, they need to deliver quick and predictable response times to their users. However, performance predictability has always been one of the biggest challenges in cloud computing. Despite many optimizations and improvements in both hardware and software, the distribution of latencies for Google's back-end services shows that while the majority of requests take around 50-60 ms, a significant fraction take longer than 100 ms, with the largest difference being almost 600 times [10]. This great variance hurts the quality of experience (QoE) for users and directly leads to revenue losses as well as increases in operational costs. Google's study shows that if the response time increases from 0.4 seconds to 0.9 seconds, traffic and ad revenues drop by 20% [1]. Amazon likewise reports that every 100 ms increase in response time reduces sales by 1% [4]. According to Nielsen [14]: (i) 0.1 seconds is about the limit for the user to feel that the system is reacting instantaneously; (ii) 1.0 second is about the limit for the user's flow of thought to stay uninterrupted, even though the user will notice the delay; and (iii) 10 seconds is about the limit for keeping the user's attention focused on the dialogue. For longer delays, users will want to perform other tasks while waiting for the computer to finish. In this sense, "slow response" and "service unavailable" seem to be the same for cloud users.
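The gap between median and tail latency described above can be made concrete with a small sketch. The latency samples and the nearest-rank percentile helper below are illustrative assumptions, not Google's actual measurements; the point is only that a batch whose median is ~55 ms can still contain a straggler hundreds of times slower.

```python
# Sketch: quantify the median-vs-tail latency gap for a batch of requests.
# Latency samples are illustrative, not real measurements.

def percentile(samples, p):
    """Nearest-rank percentile of `samples` (0 < p <= 100)."""
    ordered = sorted(samples)
    rank = max(0, int(round(p / 100.0 * len(ordered))) - 1)
    return ordered[rank]

def tail_report(latencies_ms):
    """Return (median, p99, worst/median ratio) for latencies in milliseconds."""
    median = percentile(latencies_ms, 50)
    p99 = percentile(latencies_ms, 99)
    ratio = max(latencies_ms) / median
    return median, p99, ratio

# 1000 requests: most near 55 ms, a few slow, one extreme straggler.
latencies = [55] * 950 + [120] * 49 + [33000]
med, p99, ratio = tail_report(latencies)
print(med, p99, ratio)  # median stays at 55 ms, yet the worst request is 600x slower
```

A user who hits the straggler experiences the service as effectively down, even though the aggregate median looks healthy, which is exactly why tail percentiles rather than averages matter for QoE.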
Currently, major cloud providers such as Amazon, Microsoft, and Google merely state an uptime availability guarantee in their Service Level Agreements (SLAs), but provide no guarantee on QoE (e.g., response time). Since traditional availability is defined in terms of the failure/repair behavior of cloud services, it clearly cannot satisfy users' requirements for quick response times. The underlying reason is that the complex and diverse uncertain behaviors in cloud systems make performance predictability very difficult. In general, these uncertainties have two main characteristics:
• Diversity: Uncertainties in cloud systems come from many diverse sources, including the hardware layer (e.g., failures, system resource competition, network resource competition) and the software layer (e.g., scheduling algorithms, software bugs, unexpected workloads, loss of data) [9].
• Transmissibility: An uncertainty may not only affect a single service, but also degrade the performance of a chain of services or other co-located applications. For example, the loss of a piece of intermediate data requires re-generating it from the parent tasks and postpones the scheduling of child tasks; a service experiencing bursty workloads may preempt more resources, and a machine failure reduces the available resources in the pool, leading to performance slowdowns of other co-located services.
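The transmissibility point can be sketched with a toy task graph. The task names, run times, and the "re-run the parent on data loss" penalty below are hypothetical assumptions chosen for illustration; the sketch only shows how one lost intermediate output pushes back every downstream task.

```python
# Sketch: how losing one intermediate result delays downstream tasks.
# Task graph and run times are illustrative, not taken from the paper.

# Each task maps to (run_time, parent tasks whose output it consumes).
TASKS = {
    "extract":   (4, []),
    "transform": (6, ["extract"]),
    "load":      (2, ["transform"]),
}

def finish_times(tasks, lost_outputs=()):
    """Earliest finish time per task; a task whose output was lost must
    re-run, so its run time is paid twice before its children can start."""
    done = {}
    def finish(name):
        if name in done:
            return done[name]
        run, parents = tasks[name]
        start = max((finish(p) for p in parents), default=0)
        cost = run * 2 if name in lost_outputs else run  # re-generation penalty
        done[name] = start + cost
        return done[name]
    for task in tasks:
        finish(task)
    return done

normal = finish_times(TASKS)
degraded = finish_times(TASKS, lost_outputs={"transform"})
# Losing "transform"'s output postpones "load" even though "load" itself is healthy.
print(normal["load"], degraded["load"])
```

The delay lands on a task that experienced no fault of its own, which is the essence of transmissibility: uncertainty injected at one point in the service chain surfaces as a slowdown somewhere else.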