Variability-aware request replication for latency curtailment

IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications. Pub Date: 2016-04-10. DOI: 10.1109/INFOCOM.2016.7524365

Z. Qiu, Juan F. Pérez, P. Harrison


Citations: 10

Abstract

Processing time variability is commonplace in distributed systems, where resources display disparate performance due to, e.g., different workload levels, background processes, and contention in virtualized environments. However, it is paramount for service providers to keep variability in response time under control in order to offer responsive services. We investigate how request replication can be used to exploit processing time variability to reduce response times, considering not only mean values but also the tail of the response time distribution. We focus on the distributed setup, where replication is achieved by running copies of requests on multiple servers that otherwise evolve independently, and waiting for the first replica to complete service. We construct models that capture the evolution of a system with replicated requests using approximate methods, and observe that highly variable service times offer the best opportunities for replication, in particular for reducing the response time tail. Further, the effect of replication is non-uniform over the response time distribution: gains in one metric, e.g., the mean, can come at the cost of another, e.g., the tail percentiles. This is demonstrated in a wide range of numerical virtual experiments. Capturing service time variability is therefore key to both the evaluation and the design of latency tolerance strategies.
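The replication scheme described in the abstract (send copies of a request to several servers and take the first reply) can be illustrated with a small Monte Carlo sketch. This is not the authors' model: it ignores queueing and the extra load that replicas create, which the paper's approximate models do capture. It only compares the fastest of `replicas` independent service times. The two service-time distributions, the function names, and all parameter values are hypothetical choices made for illustration, both distributions having mean 1.0.

```python
import random


def replicated_latency(sample_service_time, replicas, rng):
    """Latency of one request: the fastest of `replicas` independent copies.

    Assumption: servers are idle and independent, so there is no queueing delay.
    """
    return min(sample_service_time(rng) for _ in range(replicas))


def summarize(sample_service_time, replicas, n=20000, seed=1):
    """Return (mean, 99th percentile) of the latency over n simulated requests."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    latencies = sorted(
        replicated_latency(sample_service_time, replicas, rng) for _ in range(n)
    )
    mean = sum(latencies) / n
    p99 = latencies[int(0.99 * n) - 1]
    return mean, p99


def low_variability(rng):
    """Hypothetical low-variability service time: uniform on [0.9, 1.1], mean 1.0."""
    return rng.uniform(0.9, 1.1)


def high_variability(rng):
    """Hypothetical high-variability service time (hyperexponential), mean 1.0.

    90% of requests are fast (exponential, mean 0.5) and
    10% are slow (exponential, mean 5.5): 0.9 * 0.5 + 0.1 * 5.5 = 1.0.
    """
    if rng.random() < 0.9:
        return rng.expovariate(1 / 0.5)
    return rng.expovariate(1 / 5.5)


if __name__ == "__main__":
    for name, dist in [("low", low_variability), ("high", high_variability)]:
        for replicas in (1, 2):
            mean, p99 = summarize(dist, replicas)
            print(f"{name:>4} variability, replicas={replicas}: "
                  f"mean={mean:.3f}, p99={p99:.3f}")
```

Under these assumptions, a second replica barely changes the low-variability case, while in the high-variability case it reduces both the mean and the 99th percentile substantially, because a request is slow only if both copies draw a slow service time. The trade-off between mean and tail noted in the abstract arises once the load added by replicas is modeled, which this sketch deliberately leaves out.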


