Statistical Tail-Latency Bounded QoS Provisioning for Parallel and Distributed Data Centers

Xi Zhang, Qixuan Zhu
{"title":"Statistical Tail-Latency Bounded QoS Provisioning for Parallel and Distributed Data Centers","authors":"Xi Zhang, Qixuan Zhu","doi":"10.1109/ICDCS51616.2021.00078","DOIUrl":null,"url":null,"abstract":"The large-scale interactive services distribute clients' requests across a large number of physical machine in data center architectures to enhance the quality-of-service (QoS) performance. In parallel and distributed data center architecture, even a temporary spike in latency of any service component can significantly impact the end-to-end delay. Besides the average latency, tail-latency (i.e., worst case latency) of a service has also attracted a lot of research attentions. The tail-latency is a critical performance metric in data centers, where long tail latencies refer to the higher percentiles (such as 98th, 99th) of latency in comparison to the average latency time. While the statistical delay-bounded QoS provisioning theory has been shown to be a powerful technique and useful performance metric for supporting time-sensitive multimedia transmissions over mobile computing networks, how to efficiently extend and implement this technique/performance-metric for statistically bounding the tail-latency for data center networks has neither been well understood nor thoroughly studied. In this paper, we model and characterize the tail-latency distribution in a three-layer parallel and distributed data center architecture, where clients request different types of services and ten download their requested data packets from data center through a first-come-first-serve M/M/1 queueing system. We first define the statistical tail-latency bounded QoS, and investigate the tail-latency problem through generalized extreme value (GEV) theory and generalized Pareto distribution (GPD) theory. Then, we propose a scheme to identify the dominant sources of latency variance in a semantic context, so that we are able to optimize the instructions of those sources to reduce the latency tail. Finally, using numerical analyses we validate and evaluate our developed modeling techniques and schemes for characterizing the tail-latency QoS provisioning theories in supporting data center networks.","PeriodicalId":222376,"journal":{"name":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDCS51616.2021.00078","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The large-scale interactive services distribute clients' requests across a large number of physical machine in data center architectures to enhance the quality-of-service (QoS) performance. In parallel and distributed data center architecture, even a temporary spike in latency of any service component can significantly impact the end-to-end delay. Besides the average latency, tail-latency (i.e., worst case latency) of a service has also attracted a lot of research attentions. The tail-latency is a critical performance metric in data centers, where long tail latencies refer to the higher percentiles (such as 98th, 99th) of latency in comparison to the average latency time. While the statistical delay-bounded QoS provisioning theory has been shown to be a powerful technique and useful performance metric for supporting time-sensitive multimedia transmissions over mobile computing networks, how to efficiently extend and implement this technique/performance-metric for statistically bounding the tail-latency for data center networks has neither been well understood nor thoroughly studied. In this paper, we model and characterize the tail-latency distribution in a three-layer parallel and distributed data center architecture, where clients request different types of services and ten download their requested data packets from data center through a first-come-first-serve M/M/1 queueing system. We first define the statistical tail-latency bounded QoS, and investigate the tail-latency problem through generalized extreme value (GEV) theory and generalized Pareto distribution (GPD) theory. Then, we propose a scheme to identify the dominant sources of latency variance in a semantic context, so that we are able to optimize the instructions of those sources to reduce the latency tail. Finally, using numerical analyses we validate and evaluate our developed modeling techniques and schemes for characterizing the tail-latency QoS provisioning theories in supporting data center networks.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
并行和分布式数据中心的统计尾延迟有界QoS提供
大规模交互服务将客户端的请求分布在数据中心体系结构中的大量物理机器上,以提高服务质量(QoS)性能。在并行和分布式数据中心体系结构中,任何服务组件的延迟即使是暂时的峰值也会显著影响端到端延迟。除了平均延迟之外,服务的尾部延迟(即最坏情况延迟)也引起了很多研究的关注。尾延迟是数据中心中的一个关键性能指标,其中长尾延迟是指与平均延迟时间相比,延迟的较高百分位数(例如98、99)。虽然统计延迟限制QoS提供理论已被证明是支持移动计算网络上时间敏感的多媒体传输的强大技术和有用的性能指标,但如何有效地扩展和实现这一技术/性能指标,以统计限制数据中心网络的尾延迟,既没有得到很好的理解,也没有得到彻底的研究。在三层并行分布式数据中心体系结构中,客户端请求不同类型的服务,并通过先到先服务的M/M/1排队系统从数据中心下载其请求的数据包。首先定义了统计尾延迟有界QoS,并利用广义极值(GEV)理论和广义Pareto分布(GPD)理论研究了尾延迟问题。然后,我们提出了一种在语义上下文中识别延迟方差的主要来源的方案,以便我们能够优化这些来源的指令以减少延迟尾部。最后,使用数值分析,我们验证和评估了我们开发的建模技术和方案,用于描述支持数据中心网络的尾延迟QoS提供理论。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Practical Location Privacy Attacks and Defense on Point-of-interest Aggregates Hand-Key: Leveraging Multiple Hand Biometrics for Attack-Resilient User Authentication Using COTS RFID Recognizing 3D Orientation of a Two-RFID-Tag Labeled Object in Multipath Environments Using Deep Transfer Learning The Vertical Cuckoo Filters: A Family of Insertion-friendly Sketches for Online Applications Dyconits: Scaling Minecraft-like Services through Dynamically Managed Inconsistency
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1