Fundamental Limits of Approximate Gradient Coding (doi:10.1145/3393691.3394188)
Sinong Wang, Jiashang Liu, N. Shroff

In the distributed gradient coding problem, it has been established that, to exactly recover the gradient under s slow machines, the minimum computation load (number of stored data partitions) of each worker must be at least linear in s, i.e., s+1, which incurs a large overhead when s is large [13]. In this paper, we focus on approximate gradient coding, which aims to recover the gradient with bounded error ε. Theoretically, our main contributions are three-fold: (i) we analyze the structure of optimal gradient codes and derive information-theoretic lower bounds on the minimum computation load d: d ≥ O(log(n)/log(n/s)) for ε = 0 and d ≥ O(log(1/ε)/log(n/s)) for ε > 0, where n is the number of workers and ε is the error in the gradient computation; (ii) we design two approximate gradient coding schemes, based on a random edge removal process, that exactly match these lower bounds; (iii) we implement our schemes and demonstrate their advantage over the fastest existing gradient coding strategies. The proposed schemes provide an order-wise improvement over the state of the art in terms of computation load, and are also optimal in terms of both computation load and latency.
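To make the load/error trade-off concrete, here is a minimal Python simulation — my own toy replication scheme, not the paper's random-edge-removal construction: each of n workers stores d uniformly random partitions, s workers straggle, and the master sums whatever partitions survive. Partitions covered by no surviving worker are exactly the source of the error ε, and increasing d shrinks that chance, mirroring the d ≥ O(log(1/ε)/log(n/s)) bound.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, s = 20, 3, 4                       # partitions/workers, load, stragglers
partition_grads = rng.normal(size=n)     # one (scalar) partial gradient per partition
true_grad = partition_grads.sum()

# Each worker stores d distinct partitions chosen uniformly at random.
assignment = [rng.choice(n, size=d, replace=False) for _ in range(n)]

# s random workers straggle; the master only hears from the rest.
stragglers = set(rng.choice(n, size=s, replace=False))
covered = set()
for i in range(n):
    if i not in stragglers:
        covered.update(assignment[i])

# Estimate the gradient from the covered partitions; uncovered ones are lost.
est_grad = sum(partition_grads[p] for p in covered)
print("relative error:", abs(est_grad - true_grad) / abs(true_grad))
```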
Achieving Efficient Routing in Reconfigurable DCNs (doi:10.1145/3393691.3394175)
Zhenjie Yang, Yong Cui, Shihan Xiao, Xin Wang, Minming Li, Chuming Li, Yadong Liu

With the fast growth of cloud services and network scales, heavy and highly dynamic traffic demands pose great challenges to efficient traffic engineering in today's data center networks (DCNs) [21]. DCN flows can be broadly classified into two main categories: delay-sensitive small flows (e.g., queries or real-time small messages) and throughput-sensitive large flows (e.g., backup traffic). In general, more than 80% of the flows in data centers are small flows, while the majority of the traffic volume is contributed by the top 10% of flows [3, 7]. To handle this mixed traffic, today's data centers [1, 14] generally adopt tree-based topologies (e.g., fat-tree) and load-agnostic routing strategies based on random path selection (e.g., ECMP) [14, 19]. Although random path selection works well for small flows, which are highly random, these strategies are likely to route several large flows through the same output link and cause long-lived congestion [2, 8]. With the limited switch buffer occupied by large flows for long periods, small flows are reported to experience an order of magnitude larger delay, which compromises the performance of DCNs and degrades the user experience [3].
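A toy illustration of the collision problem (the flows and the choice of hash function are hypothetical): hashing a handful of large flows onto k equal-cost uplinks routinely lands several of them on the same link while others stay idle.

```python
import hashlib

k = 4                                    # equal-cost uplinks

def ecmp_path(five_tuple, k=k):
    # Hash the flow's 5-tuple to pick a fixed path, as ECMP does.
    digest = hashlib.md5(repr(five_tuple).encode()).hexdigest()
    return int(digest, 16) % k

# Eight hypothetical large flows; each saturates whichever uplink it hashes to.
large_flows = [("10.0.0.%d" % i, "10.0.1.1", 5000 + i, 80, "tcp") for i in range(8)]
load = [0] * k
for ft in large_flows:
    load[ecmp_path(ft)] += 1
print("large flows per uplink:", load)   # uneven counts = long-lived hotspots
```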
I Know What You Did Last Summer: Network Monitoring using Interval Queries (doi:10.1145/3393691.3394193)
Nikita Ivkin, Ran Ben Basat, Zaoxing Liu, Gil Einziger, R. Friedman, V. Braverman
Modern telemetry systems require advanced analytic capabilities such as drill-down queries. These queries can be used to detect the beginning and end of a network anomaly by efficiently refining the search space. We present the first integral solution that (i) enables multiple measurement tasks inside the same data structure, (ii) supports specifying the time frame of interest as part of its queries, and (iii) is sketch-based and thus space-efficient. Namely, our approach allows the user to define both the measurement task (e.g., heavy hitters, entropy estimation, cardinality estimation) and the time frame of relevance (e.g., 5PM-6PM) at query time. Our approach provides accuracy guarantees and is the only space-efficient solution that offers such capabilities. Finally, we demonstrate how the algorithm can be used to accurately pinpoint the beginning of a realistic DDoS attack.
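A minimal sketch of the interval-query idea, assuming per-window summaries merged at query time: for clarity each summary below is an exact Counter, whereas a space-efficient deployment would use a mergeable sketch such as Count-Min in its place.

```python
from collections import Counter

WINDOW = 60                               # seconds per summary window
windows = {}                              # window index -> Counter(flow -> packets)

def record(t, flow):
    windows.setdefault(t // WINDOW, Counter())[flow] += 1

def heavy_hitters(t_start, t_end, threshold):
    # Merge only the summaries covering the queried time frame.
    merged = Counter()
    for w in range(t_start // WINDOW, t_end // WINDOW + 1):
        merged.update(windows.get(w, Counter()))
    return {f: c for f, c in merged.items() if c >= threshold}

# Simulate two hours of traffic with an anomaly during minutes 60-70.
for t in range(7200):
    record(t, "victim" if 3600 <= t < 4200 else "flow%d" % (t % 50))
print(heavy_hitters(3600, 4199, threshold=100))   # -> {'victim': 600}
```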
Fast Dimensional Analysis for Root Cause Investigation in a Large-Scale Service Environment (doi:10.1145/3393691.3394185)
F. Lin, Keyur Muzumdar, N. Laptev, M. Curelea, Seunghak Lee, S. Sankar
Root cause analysis in a large-scale production environment is challenging due to the complexity and scale of the services running across global data centers. It is often difficult to review the logs jointly for understanding production issues given the distributed nature of the system. Additionally, there could easily be millions of entities, each described by hundreds of features. In this paper we present a fast dimensional analysis framework that automates the root cause analysis on structured logs with improved scalability. We first explore item-sets, i.e., combinations of feature values, that could identify groups of samples with sufficient support for the target failures using the Apriori algorithm and a subsequent improvement, FP-Growth. These algorithms were designed for frequent item-set mining and association rule learning over transactional databases. After applying them on structured logs, we select the item-sets that are most unique to the target failures based on lift. We propose pre-processing steps that use a large-scale real-time database, together with post-processing techniques and parallelism, to further speed up the analysis and improve interpretability, and demonstrate that such optimization is necessary for handling large-scale production datasets. We have successfully rolled out this approach for root cause investigation purposes within Facebook's infrastructure. We also present the setup and results from multiple production use cases in this paper.
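The core support/lift computation can be hand-rolled in a few lines; the rows, features, and thresholds below are invented for illustration (a production pipeline would use real Apriori/FP-Growth implementations rather than brute-force enumeration).

```python
from itertools import combinations

# 100 synthetic structured log rows: two feature values co-occur with failures.
rows = [
    {"dc": "east", "kernel": "4.x", "model": "A", "failed": True},
    {"dc": "east", "kernel": "4.x", "model": "B", "failed": True},
    {"dc": "west", "kernel": "5.x", "model": "A", "failed": False},
    {"dc": "east", "kernel": "5.x", "model": "B", "failed": False},
] * 25

def itemsets(row, max_len=2):
    items = sorted((k, v) for k, v in row.items() if k != "failed")
    for r in range(1, max_len + 1):
        yield from combinations(items, r)

support, fail_support = {}, {}
n_fail = sum(r["failed"] for r in rows)
for row in rows:
    for s in itemsets(row):
        support[s] = support.get(s, 0) + 1
        if row["failed"]:
            fail_support[s] = fail_support.get(s, 0) + 1

# Keep item-sets frequent among failures; rank by lift vs. the whole dataset.
for s, cnt in sorted(fail_support.items(), key=lambda kv: -kv[1]):
    if cnt / n_fail < 0.5:               # minimum support among failures
        continue
    lift = (cnt / n_fail) / (support[s] / len(rows))
    print(s, "support=%.2f" % (cnt / n_fail), "lift=%.2f" % lift)
```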
The Great Internet TCP Congestion Control Census (doi:10.1145/3393691.3394221)
Ayush Mishra, Xiangpeng Sun, Atishya Jain, Sameer Pande, Raj Joshi, B. Leong

In 2016, Google proposed and deployed a new TCP variant called BBR. BBR represents a major departure from traditional congestion control as it uses estimates of bandwidth and round-trip delays to regulate its sending rate. BBR has since been introduced in the upstream Linux kernel and deployed by Google across its data centers. Since the last major study to identify TCP congestion control variants on the Internet was done before BBR, it is timely to conduct a new census to give us a sense of the current distribution of congestion control variants on the Internet. To this end, we designed and implemented Gordon, a tool that allows us to measure the congestion window (cwnd) corresponding to each successive RTT in the TCP connection response of a congestion control algorithm. To compare a measured flow to the known variants, we created a localized bottleneck and introduced a variety of network changes such as loss events and changes in bandwidth and delay, while normalizing all measurements by RTT. We built an offline classifier to identify the TCP variant based on the cwnd trace over time. Our results suggest that CUBIC is currently the dominant TCP variant on the Internet, and is deployed on about 36% of the websites in the Alexa Top 20,000 list. While BBR and its variant BBR G1.1 are currently in second place with a 22% share by website count, their present share of total Internet traffic volume is estimated to be larger than 40%. We also found that Akamai has deployed a unique loss-agnostic rate-based TCP variant on some 6% of the Alexa Top 20,000 websites, and there are likely other undocumented variants. Therefore, the traditional assumption that TCP variants "in the wild" will come from a small known set is not likely to be true anymore. Our results suggest that some variant of BBR seems poised to replace CUBIC as the next dominant TCP variant on the Internet.
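To give a flavor of the offline classification step — the reference traces below are invented toy shapes, not Gordon's actual fingerprints — one can normalize a measured cwnd-vs-RTT trace and assign it to the nearest known variant.

```python
import numpy as np

rtts = np.arange(1, 31)
references = {
    # Reno-style AIMD: cwnd grows by roughly one segment per RTT.
    "reno-like": rtts.astype(float),
    # CUBIC: cubic growth around the pre-loss window (toy parameters).
    "cubic-like": np.maximum(1.0, 10 + 0.4 * (rtts - 15) ** 3 / 100),
    # BBR-like: rate-based, cwnd pinned near 2x the bandwidth-delay product.
    "bbr-like": np.full(30, 20.0),
}

def classify(cwnd_trace):
    # Nearest-reference classification in Euclidean distance.
    trace = np.asarray(cwnd_trace, dtype=float)
    return min(references, key=lambda name: np.linalg.norm(trace - references[name]))

measured = 20.0 + np.random.default_rng(0).normal(0, 1, 30)   # noisy flat trace
print(classify(measured))                                     # -> bbr-like
```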
Heavy-traffic Analysis of the Generalized Switch under Multidimensional State Space Collapse (doi:10.1145/3393691.3394192)
Daniela Hurtado-Lange, S. T. Maguluri

Stochastic processing networks, which model wired and wireless networks and other queueing systems, have been studied in the heavy-traffic limit under the so-called Complete Resource Pooling (CRP) condition. Under the CRP condition, these systems behave like a single-server queue. When the CRP condition is not satisfied, heavy-traffic results are known only in the special cases of the input-queued switch and bandwidth-sharing networks. In this paper, we consider a very general queueing system called the 'generalized switch', which includes wireless networks under fading, data center networks, the input-queued switch, etc. The primary contribution of this paper is to present the exact value of the steady-state mean of certain linear combinations of queue lengths in the heavy-traffic limit under the MaxWeight scheduling algorithm. We do this using the Drift method, and we also present a negative result: it is not possible to obtain the remaining linear combinations (and consequently all the individual mean queue lengths) using this method. We show this by presenting an alternate view of the Drift method in terms of an (under-determined) system of linear equations. Finally, we use this system of equations to obtain upper and lower bounds on all linear combinations of queue lengths.
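For reference, the two objects the abstract leans on, in their textbook forms (notation mine, not necessarily the paper's):

```latex
% MaxWeight: with queue lengths q(t) and feasible service-rate vectors S(t)
% (which may depend on the fading/channel state), schedule
\[
  x(t) \in \arg\max_{s \in S(t)} \; \sum_i q_i(t)\, s_i .
\]
% Drift method: in steady state, the drift of any suitable test function V
% vanishes; expanding this for quadratic V yields linear equations in the
% mean queue lengths:
\[
  \mathbb{E}\!\left[ V\big(q(t+1)\big) - V\big(q(t)\big) \right] = 0 .
\]
```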
Optimal Bidding Strategies for Online Ad Auctions with Overlapping Targeting Criteria (doi:10.1145/3393691.3394210)
Erik Tillberg, P. Marbach, R. Mazumdar

We analyze the problem of how to optimally bid for ad spaces in online ad auctions. For this, we consider the general case of multiple ad campaigns with overlapping targeting criteria. In our analysis, we first characterize the structure of an optimal bidding strategy. In particular, we show that an optimal bidding strategy decomposes the problem into disjoint sets of campaigns and targeting groups. In addition, we show that pure bidding strategies, which use only a single bid value for each campaign, are not optimal when the supply curves are not continuous. For this case, we derive a lower bound on the optimal cost of any bidding strategy, as well as mixed bidding strategies that either achieve the lower bound or can get arbitrarily close to it.
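A toy numerical example of why a discontinuous supply curve forces mixing (all numbers, and the pay-your-bid-per-impression cost model, are my own invention, not the paper's auction model):

```python
# Supply curve with a jump at b = 2: bidding below 2 wins 10 impressions per
# period, bidding 2 or more wins 50.
def supply(b):
    return 10 if b < 2 else 50

target = 30                              # impressions needed per period

# No pure bid hits 30: b < 2 wins 10, b >= 2 wins 50. Mixing two bids that
# straddle the jump meets the target in expectation at lower cost than the
# cheapest pure bid that meets it.
p = (supply(2.0) - target) / (supply(2.0) - supply(1.99))   # P(bid low) = 0.5
cost_mixed = p * 1.99 * supply(1.99) + (1 - p) * 2.00 * supply(2.0)
cost_pure = 2.00 * supply(2.0)
print("mixed: %.2f  pure: %.2f" % (cost_mixed, cost_pure))  # 59.95 vs 100.00
```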
Simple Near-Optimal Scheduling for the M/G/1 (doi:10.1145/3393691.3394216)
Ziv Scully, Mor Harchol-Balter, Alan Scheller-Wolf
We consider the problem of preemptively scheduling jobs to minimize the mean response time of an M/G/1 queue. When we know each job's size, the shortest remaining processing time (SRPT) policy is optimal. Unfortunately, in many settings we do not have access to each job's size; instead, we know only the job size distribution. In this setting the Gittins policy is known to minimize mean response time, but its complex priority structure can be computationally intractable. A much simpler alternative to Gittins is the shortest expected remaining processing time (SERPT) policy. While SERPT is a natural extension of SRPT to unknown job sizes, it is unknown whether SERPT is close to optimal for mean response time. We present a new variant of SERPT called monotonic SERPT (M-SERPT), which is as simple as SERPT but has provably near-optimal mean response time at all loads for any job size distribution. Specifically, we prove that the mean response time ratio between M-SERPT and Gittins is at most 3 for load ρ ≤ 8/9 and at most 5 for any load. This makes M-SERPT the only non-Gittins scheduling policy known to have a constant-factor approximation ratio for mean response time.
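A short sketch of the rank (priority) functions involved, under my reading of the names above: SERPT serves the job with the smallest expected remaining size E[S - a | S > a] at attained service a, and M-SERPT replaces that rank with its running maximum over a, which makes it monotone. The two-point distribution is a toy example.

```python
import numpy as np

# Two-point job size distribution: mostly small jobs, a few large ones.
sizes = np.array([1.0, 10.0])
probs = np.array([0.9, 0.1])

def serpt_rank(age):
    # Expected remaining size of a job that has received `age` service.
    alive = sizes > age
    w = probs[alive] / probs[alive].sum()
    return float(((sizes[alive] - age) * w).sum())

ages = np.linspace(0.0, 9.9, 991)
serpt = np.array([serpt_rank(a) for a in ages])
mserpt = np.maximum.accumulate(serpt)      # M-SERPT: running max of SERPT

# SERPT's rank falls as a job nears age 1 (it is probably about to finish),
# then jumps to ~9 once only large jobs remain; M-SERPT never lets it fall.
i = np.searchsorted(ages, 0.95)
print("age 0.95: SERPT %.2f, M-SERPT %.2f" % (serpt[i], mserpt[i]))  # ~0.95 vs 1.90
```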
Uniform Loss Algorithms for Online Stochastic Decision-Making With Applications to Bin Packing (doi:10.1145/3393691.3394224)
Siddhartha Banerjee, Daniel Freund

We consider a general class of finite-horizon online decision-making problems where, in each period, a controller is presented with a stochastic arrival and must choose an action from a set of permissible actions, and the final objective depends only on the aggregate type-action counts. Such a framework encapsulates many online stochastic variants of common optimization problems, including bin packing, generalized assignment, and network revenue management. In such settings, we study a natural model-predictive control algorithm that, in each period, acts greedily based on an updated certainty-equivalent optimization problem. We introduce a simple yet general condition under which this algorithm obtains uniform additive loss (independent of the horizon) compared to an optimal solution with full knowledge of arrivals. Our condition is fulfilled by the above-mentioned problems, as well as more general settings involving piecewise-linear objectives and offline index policies, including an airline overbooking problem.
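A sketch of such a certainty-equivalent resolving policy on a toy multi-secretary-style instance (my example, not the paper's): each period we re-solve the fractional problem with the expected type counts among remaining arrivals and accept the current arrival iff the solution gives its type a positive allocation.

```python
import numpy as np

rng = np.random.default_rng(0)
rewards = np.array([3.0, 2.0, 1.0])      # reward of accepting each type
p = np.array([0.2, 0.3, 0.5])            # arrival probabilities of the types
T, B = 1000, 300                         # horizon and acceptance budget

def ce_accepts(j, periods_left, budget):
    # Certainty-equivalent step: fill the remaining budget greedily by reward
    # using the *expected* type counts among the remaining arrivals; accept
    # the current arrival iff its type receives a positive allocation.
    expected = periods_left * p
    for k in np.argsort(-rewards):
        take = min(expected[k], budget)
        if k == j:
            return take > 0
        budget -= take
    return False

budget, value = B, 0.0
for t in range(T):
    j = rng.choice(len(p), p=p)          # stochastic arrival of a random type
    if budget >= 1 and ce_accepts(j, T - t, budget):
        budget -= 1
        value += rewards[j]
print("value: %.0f, leftover budget: %d" % (value, budget))
```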
Characterizing Policies with Optimal Response Time Tails under Heavy-Tailed Job Sizes (doi:10.1145/3393691.3394179)
Ziv Scully, Lucas van Kreveld, O. Boxma, Jan-Pieter L. Dorsman, A. Wierman
We consider the tail behavior of the response time distribution in an M/G/1 queue with heavy-tailed job sizes, specifically those with intermediately regularly varying tails. In this setting, the response time tail of many individual policies has been characterized, and it is known that policies such as Shortest Remaining Processing Time (SRPT) and Foreground-Background (FB) have response time tails of the same order as the job size tail, and thus such policies are tail-optimal. Our goal in this work is to move beyond individual policies and characterize the set of policies that are tail-optimal. Toward that end, we use the recently introduced SOAP framework to derive sufficient conditions on the form of prioritization used by a scheduling policy that ensure the policy is tail-optimal. These conditions are general and lead to new results for important policies that have previously resisted analysis, including the Gittins policy, which minimizes mean response time among policies that do not have access to job size information. As a by-product of our analysis, we derive a general upper bound for fractional moments of M/G/1 busy periods, which is of independent interest.
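For reference, standard formalizations consistent with the phrasing above (the paper's precise statements may differ):

```latex
% Intermediately regularly varying job size tail, \bar F(x) = \Pr[S > x]:
\[
  \lim_{\epsilon \downarrow 0} \; \liminf_{x \to \infty}
    \frac{\bar F\big((1+\epsilon)x\big)}{\bar F(x)} \;=\; 1 .
\]
% Tail optimality: the stationary response time T has a tail of the same
% order as the job size tail (note T \ge S pathwise, so the ratio is \ge 1):
\[
  \limsup_{x \to \infty} \frac{\Pr[T > x]}{\Pr[S > x]} \;<\; \infty .
\]
```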