Statistical Tail-Latency Bounded QoS Provisioning for Parallel and Distributed Data Centers

2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS). Pub Date: 2021-07-01. DOI: 10.1109/ICDCS51616.2021.00078

Xi Zhang, Qixuan Zhu



Abstract

Large-scale interactive services distribute clients' requests across a large number of physical machines in data center architectures to enhance quality-of-service (QoS) performance. In a parallel and distributed data center architecture, even a temporary latency spike in any service component can significantly impact the end-to-end delay. Besides the average latency, the tail latency (i.e., worst-case latency) of a service has also attracted considerable research attention. Tail latency is a critical performance metric in data centers, where long tail latencies refer to the higher percentiles (such as the 98th and 99th) of latency in comparison to the average latency. While statistical delay-bounded QoS provisioning theory has been shown to be a powerful technique and useful performance metric for supporting time-sensitive multimedia transmissions over mobile computing networks, how to efficiently extend and implement this technique and performance metric to statistically bound the tail latency in data center networks has been neither well understood nor thoroughly studied. In this paper, we model and characterize the tail-latency distribution in a three-layer parallel and distributed data center architecture, where clients request different types of services and then download their requested data packets from the data center through a first-come-first-served M/M/1 queueing system. We first define the statistical tail-latency bounded QoS and investigate the tail-latency problem through generalized extreme value (GEV) theory and generalized Pareto distribution (GPD) theory. Then, we propose a scheme to identify the dominant sources of latency variance in a semantic context, so that we are able to optimize the instructions of those sources to reduce the latency tail. Finally, using numerical analyses, we validate and evaluate our developed modeling techniques and schemes for characterizing the tail-latency QoS provisioning theories in supporting data center networks.
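To make the gap between average and tail latency concrete, the following is a minimal sketch based on the textbook result that the sojourn time of a stable first-come-first-served M/M/1 queue is exponentially distributed with rate mu - lam. It is not the authors' model or scheme. The function names, the example rates, and the assumption that a request fanned out to n servers sees independent, identical queues are all illustrative assumptions.

```python
import math


def _check_stable(lam: float, mu: float) -> None:
    # Hypothetical helper: the closed forms below only hold when 0 < lam < mu.
    if not 0 < lam < mu:
        raise ValueError("need 0 < lam < mu for a stable M/M/1 queue")


def mm1_tail_probability(t: float, lam: float, mu: float) -> float:
    """P(T > t) for the sojourn time T of a stable FCFS M/M/1 queue."""
    _check_stable(lam, mu)
    return math.exp(-(mu - lam) * t)


def mm1_percentile(p: float, lam: float, mu: float) -> float:
    """Latency t such that P(T <= t) = p, for 0 <= p < 1."""
    _check_stable(lam, mu)
    if not 0 <= p < 1:
        raise ValueError("p must be in [0, 1)")
    return -math.log(1.0 - p) / (mu - lam)


def fanout_percentile(p: float, n: int, lam: float, mu: float) -> float:
    """p-th percentile of the slowest of n parallel sub-requests.

    Assumes (for illustration only) n independent, identical M/M/1 queues,
    so P(max <= t) = (1 - exp(-(mu - lam) * t)) ** n.
    """
    _check_stable(lam, mu)
    if n < 1 or not 0 < p < 1:
        raise ValueError("need n >= 1 and 0 < p < 1")
    return -math.log(1.0 - p ** (1.0 / n)) / (mu - lam)


if __name__ == "__main__":
    lam, mu = 80.0, 100.0  # assumed arrival and service rates (requests/s)
    mean = 1.0 / (mu - lam)
    p99 = mm1_percentile(0.99, lam, mu)
    print(f"mean sojourn time: {mean:.4f} s")
    print(f"99th percentile:   {p99:.4f} s")
    print(f"99th percentile with fan-out of 100: "
          f"{fanout_percentile(0.99, 100, lam, mu):.4f} s")
```

Under these assumptions, a statistical bound of the form P(T > t) <= epsilon is satisfied for any t at or above `mm1_percentile(1 - epsilon, lam, mu)`, and the fan-out case shows how the slowest of many parallel components pushes the end-to-end percentile well above that of a single queue.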


