Variability-aware request replication for latency curtailment

Z. Qiu, Juan F. Pérez, P. Harrison
Published in: IEEE INFOCOM 2016 - The 35th Annual IEEE International Conference on Computer Communications
Publication date: 2016-04-10
DOI: 10.1109/INFOCOM.2016.7524365 (https://doi.org/10.1109/INFOCOM.2016.7524365)
Citations: 10

Abstract

Processing time variability is commonplace in distributed systems, where resources display disparate performance due to, e.g., different workload levels, background processes, and contention in virtualized environments. However, it is paramount for service providers to keep variability in response time under control in order to offer responsive services. We investigate how request replication can be used to exploit processing time variability to reduce response times, considering not only mean values but also the tail of the response time distribution. We focus on the distributed setup, where replication is achieved by running copies of requests on multiple servers that otherwise evolve independently, and waiting for the first replica to complete service. We construct models that capture the evolution of a system with replicated requests using approximate methods, and observe that highly variable service times offer the best opportunities for replication, reducing the response time tail in particular. Further, the effect of replication is non-uniform over the response time distribution: gains in one metric, e.g., the mean, can come at the cost of another, e.g., the tail percentiles. This is demonstrated in a wide range of numerical experiments. The results show that capturing service time variability is key to both the evaluation and the design of latency tolerance strategies.
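As a rough illustration of the "first replica to complete wins" idea described in the abstract, the Python sketch below draws service times from a low-variability and a high-variability distribution with the same mean, and compares the mean and 99th percentile of the completion time with and without replication. This is not the paper's model: it deliberately ignores queueing and the extra load that replicas place on the servers, which is the very effect that can make replication costly for some metrics. The function names (`exp_sample`, `hyperexp_sample`, `replicated_time`, `summarize`) and the hyperexponential parameters are assumptions chosen only for illustration.

```python
import random
import statistics


def exp_sample(rng, mean=1.0):
    """Exponential service time with the given mean (squared coeff. of variation = 1)."""
    return rng.expovariate(1.0 / mean)


def hyperexp_sample(rng, p_fast=0.9, mean_fast=0.5, mean_slow=5.5):
    """Two-phase hyperexponential service time: mostly fast, occasionally very slow.

    With the default (illustrative) parameters the overall mean is
    0.9 * 0.5 + 0.1 * 5.5 = 1.0, but the variability is much higher
    than for the exponential distribution with the same mean.
    """
    mean = mean_fast if rng.random() < p_fast else mean_slow
    return rng.expovariate(1.0 / mean)


def replicated_time(sampler, rng, d):
    """Completion time when d independent replicas run in parallel and the first wins.

    Assumption: servers are idle, so there is no queueing delay.
    """
    return min(sampler(rng) for _ in range(d))


def summarize(sampler, d, n=50_000, seed=1):
    """Estimate the mean and 99th percentile of the completion time by simulation."""
    rng = random.Random(seed)
    times = sorted(replicated_time(sampler, rng, d) for _ in range(n))
    p99_index = (99 * n) // 100 - 1
    return {"mean": statistics.fmean(times), "p99": times[p99_index]}


if __name__ == "__main__":
    for name, sampler in [("exponential", exp_sample), ("hyperexponential", hyperexp_sample)]:
        for d in (1, 2, 3):
            stats = summarize(sampler, d)
            print(f"{name:17s} d={d}  mean={stats['mean']:.3f}  p99={stats['p99']:.3f}")
```

Under these simplifying assumptions, the relative reduction in the 99th percentile from adding a second replica is larger for the hyperexponential case than for the exponential one, which is consistent with the abstract's observation that highly variable service times offer the best opportunities for replication. Capturing the trade-offs between metrics that the abstract mentions would require modelling the queues and the additional load, which this sketch does not attempt.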