Fundamental Limits of Approximate Gradient Coding (doi:10.1145/3393691.3394188)
Sinong Wang, Jiashang Liu, N. Shroff

In the distributed gradient coding problem, it has been established that, to exactly recover the gradient under s slow machines, the minimum computation load (number of stored data partitions) of each worker must be at least linear in s, i.e., s+1, which incurs a large overhead when s is large [13]. In this paper, we focus on approximate gradient coding, which aims to recover the gradient with bounded error ε. Theoretically, our main contributions are three-fold: (i) we analyze the structure of optimal gradient codes and derive information-theoretic lower bounds on the minimum computation load d: d ≥ O(log(n)/log(n/s)) for ε = 0 and d ≥ O(log(1/ε)/log(n/s)) for ε > 0, where n is the number of workers and ε is the error in the gradient computation; (ii) we design two approximate gradient coding schemes, based on a random edge removal process, that exactly match these lower bounds; (iii) we implement our schemes and demonstrate their advantage over the fastest existing gradient coding strategies. The proposed schemes provide an order-wise improvement over the state of the art in terms of computation load, and are also optimal in terms of both computation load and latency.
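To make the load/error trade-off concrete, here is a minimal Python simulation — my own toy replication scheme, not the paper's random-edge-removal construction: each of n workers stores d uniformly random partitions, s workers straggle, and the master sums whatever partitions survive. Partitions covered by no surviving worker are exactly the source of the error ε, and increasing d shrinks that chance, mirroring the d ≥ O(log(1/ε)/log(n/s)) bound.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, s = 20, 3, 4                       # partitions/workers, load, stragglers
partition_grads = rng.normal(size=n)     # one (scalar) partial gradient per partition
true_grad = partition_grads.sum()

# Each worker stores d distinct partitions chosen uniformly at random.
assignment = [rng.choice(n, size=d, replace=False) for _ in range(n)]

# s random workers straggle; the master only hears from the rest.
stragglers = set(rng.choice(n, size=s, replace=False))
covered = set()
for i in range(n):
    if i not in stragglers:
        covered.update(assignment[i])

# Estimate the gradient from the covered partitions; uncovered ones are lost.
est_grad = sum(partition_grads[p] for p in covered)
print("relative error:", abs(est_grad - true_grad) / abs(true_grad))
```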
Achieving Efficient Routing in Reconfigurable DCNs (doi:10.1145/3393691.3394175)
Zhenjie Yang, Yong Cui, Shihan Xiao, Xin Wang, Minming Li, Chuming Li, Yadong Liu

With the fast growth of cloud services and network scales, heavy and highly dynamic traffic demands pose great challenges to efficient traffic engineering in today's data center networks (DCNs) [21]. DCN flows can be broadly classified into two main categories: delay-sensitive small flows (e.g., queries or real-time small messages) and throughput-sensitive large flows (e.g., backup traffic). In general, more than 80% of the flows in data centers are small flows, while the majority of the traffic volume is contributed by the top 10% of flows [3, 7]. To handle this mixed traffic, today's data centers [1, 14] generally adopt tree-based topologies (e.g., fat-tree) and load-agnostic routing strategies based on random path selection (e.g., ECMP) [14, 19]. Although random path selection works well for small flows, which are highly random, these strategies are likely to route several large flows through the same output link and cause long-lived congestion [2, 8]. With the limited switch buffer occupied by large flows for long periods, small flows are reported to experience an order of magnitude larger delay, which compromises the performance of DCNs and degrades the user experience [3].
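A toy illustration of the collision problem (the flows and the choice of hash function are hypothetical): hashing a handful of large flows onto k equal-cost uplinks routinely lands several of them on the same link while others stay idle.

```python
import hashlib

k = 4                                    # equal-cost uplinks

def ecmp_path(five_tuple, k=k):
    # Hash the flow's 5-tuple to pick a fixed path, as ECMP does.
    digest = hashlib.md5(repr(five_tuple).encode()).hexdigest()
    return int(digest, 16) % k

# Eight hypothetical large flows; each saturates whichever uplink it hashes to.
large_flows = [("10.0.0.%d" % i, "10.0.1.1", 5000 + i, 80, "tcp") for i in range(8)]
load = [0] * k
for ft in large_flows:
    load[ecmp_path(ft)] += 1
print("large flows per uplink:", load)   # uneven counts = long-lived hotspots
```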
I Know What You Did Last Summer: Network Monitoring using Interval Queries (doi:10.1145/3393691.3394193)
Nikita Ivkin, Ran Ben Basat, Zaoxing Liu, Gil Einziger, R. Friedman, V. Braverman
Modern telemetry systems require advanced analytic capabilities such as drill-down queries. These queries can be used to detect the beginning and end of a network anomaly by efficiently refining the search space. We present the first integral solution that (i) enables multiple measurement tasks inside the same data structure, (ii) supports specifying the time frame of interest as part of its queries, and (iii) is sketch-based and thus space-efficient. Namely, our approach allows the user to define both the measurement task (e.g., heavy hitters, entropy estimation, cardinality estimation) and the time frame of relevance (e.g., 5PM-6PM) at query time. Our approach provides accuracy guarantees and is the only space-efficient solution that offers such capabilities. Finally, we demonstrate how the algorithm can be used to accurately pinpoint the beginning of a realistic DDoS attack.
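A minimal sketch of the interval-query idea, assuming per-window summaries merged at query time: for clarity each summary below is an exact Counter, whereas a space-efficient deployment would use a mergeable sketch such as Count-Min in its place.

```python
from collections import Counter

WINDOW = 60                               # seconds per summary window
windows = {}                              # window index -> Counter(flow -> packets)

def record(t, flow):
    windows.setdefault(t // WINDOW, Counter())[flow] += 1

def heavy_hitters(t_start, t_end, threshold):
    # Merge only the summaries covering the queried time frame.
    merged = Counter()
    for w in range(t_start // WINDOW, t_end // WINDOW + 1):
        merged.update(windows.get(w, Counter()))
    return {f: c for f, c in merged.items() if c >= threshold}

# Simulate two hours of traffic with an anomaly during minutes 60-70.
for t in range(7200):
    record(t, "victim" if 3600 <= t < 4200 else "flow%d" % (t % 50))
print(heavy_hitters(3600, 4199, threshold=100))   # -> {'victim': 600}
```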
Fast Dimensional Analysis for Root Cause Investigation in a Large-Scale Service Environment (doi:10.1145/3393691.3394185)
F. Lin, Keyur Muzumdar, N. Laptev, M. Curelea, Seunghak Lee, S. Sankar
Root cause analysis in a large-scale production environment is challenging due to the complexity and scale of the services running across global data centers. It is often difficult to review the logs jointly for understanding production issues given the distributed nature of the system. Additionally, there could easily be millions of entities, each described by hundreds of features. In this paper we present a fast dimensional analysis framework that automates the root cause analysis on structured logs with improved scalability. We first explore item-sets, i.e., combinations of feature values, that could identify groups of samples with sufficient support for the target failures using the Apriori algorithm and a subsequent improvement, FP-Growth. These algorithms were designed for frequent item-set mining and association rule learning over transactional databases. After applying them on structured logs, we select the item-sets that are most unique to the target failures based on lift. We propose pre-processing steps that use a large-scale real-time database, together with post-processing techniques and parallelism, to further speed up the analysis and improve interpretability, and demonstrate that such optimization is necessary for handling large-scale production datasets. We have successfully rolled out this approach for root cause investigation purposes within Facebook's infrastructure. We also present the setup and results from multiple production use cases in this paper.
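The core support/lift computation can be hand-rolled in a few lines; the rows, features, and thresholds below are invented for illustration (a production pipeline would use real Apriori/FP-Growth implementations rather than brute-force enumeration).

```python
from itertools import combinations

# 100 synthetic structured log rows: two feature values co-occur with failures.
rows = [
    {"dc": "east", "kernel": "4.x", "model": "A", "failed": True},
    {"dc": "east", "kernel": "4.x", "model": "B", "failed": True},
    {"dc": "west", "kernel": "5.x", "model": "A", "failed": False},
    {"dc": "east", "kernel": "5.x", "model": "B", "failed": False},
] * 25

def itemsets(row, max_len=2):
    items = sorted((k, v) for k, v in row.items() if k != "failed")
    for r in range(1, max_len + 1):
        yield from combinations(items, r)

support, fail_support = {}, {}
n_fail = sum(r["failed"] for r in rows)
for row in rows:
    for s in itemsets(row):
        support[s] = support.get(s, 0) + 1
        if row["failed"]:
            fail_support[s] = fail_support.get(s, 0) + 1

# Keep item-sets frequent among failures; rank by lift vs. the whole dataset.
for s, cnt in sorted(fail_support.items(), key=lambda kv: -kv[1]):
    if cnt / n_fail < 0.5:               # minimum support among failures
        continue
    lift = (cnt / n_fail) / (support[s] / len(rows))
    print(s, "support=%.2f" % (cnt / n_fail), "lift=%.2f" % lift)
```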
The Great Internet TCP Congestion Control Census (doi:10.1145/3393691.3394221)
Ayush Mishra, Xiangpeng Sun, Atishya Jain, Sameer Pande, Raj Joshi, B. Leong

In 2016, Google proposed and deployed a new TCP variant called BBR. BBR represents a major departure from traditional congestion control as it uses estimates of bandwidth and round-trip delays to regulate its sending rate. BBR has since been introduced in the upstream Linux kernel and deployed by Google across its data centers. Since the last major study to identify TCP congestion control variants on the Internet was done before BBR, it is timely to conduct a new census to give us a sense of the current distribution of congestion control variants on the Internet. To this end, we designed and implemented Gordon, a tool that allows us to measure the congestion window (cwnd) corresponding to each successive RTT in the TCP connection response of a congestion control algorithm. To compare a measured flow to the known variants, we created a localized bottleneck and introduced a variety of network changes such as loss events and changes in bandwidth and delay, while normalizing all measurements by RTT. We built an offline classifier to identify the TCP variant based on the cwnd trace over time. Our results suggest that CUBIC is currently the dominant TCP variant on the Internet, and is deployed on about 36% of the websites in the Alexa Top 20,000 list. While BBR and its variant BBR G1.1 are currently in second place with a 22% share by website count, their present share of total Internet traffic volume is estimated to be larger than 40%. We also found that Akamai has deployed a unique loss-agnostic rate-based TCP variant on some 6% of the Alexa Top 20,000 websites, and there are likely other undocumented variants. Therefore, the traditional assumption that TCP variants "in the wild" will come from a small known set is not likely to be true anymore. Our results suggest that some variant of BBR seems poised to replace CUBIC as the next dominant TCP variant on the Internet.
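To give a flavor of the offline classification step — the reference traces below are invented toy shapes, not Gordon's actual fingerprints — one can normalize a measured cwnd-vs-RTT trace and assign it to the nearest known variant.

```python
import numpy as np

rtts = np.arange(1, 31)
references = {
    # Reno-style AIMD: cwnd grows by roughly one segment per RTT.
    "reno-like": rtts.astype(float),
    # CUBIC: cubic growth around the pre-loss window (toy parameters).
    "cubic-like": np.maximum(1.0, 10 + 0.4 * (rtts - 15) ** 3 / 100),
    # BBR-like: rate-based, cwnd pinned near 2x the bandwidth-delay product.
    "bbr-like": np.full(30, 20.0),
}

def classify(cwnd_trace):
    # Nearest-reference classification in Euclidean distance.
    trace = np.asarray(cwnd_trace, dtype=float)
    return min(references, key=lambda name: np.linalg.norm(trace - references[name]))

measured = 20.0 + np.random.default_rng(0).normal(0, 1, 30)   # noisy flat trace
print(classify(measured))                                     # -> bbr-like
```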
Heavy-traffic Analysis of the Generalized Switch under Multidimensional State Space Collapse (doi:10.1145/3393691.3394192)
Daniela Hurtado-Lange, S. T. Maguluri

Stochastic processing networks, which model wired and wireless networks and other queueing systems, have been studied in the heavy-traffic limit under the so-called Complete Resource Pooling (CRP) condition. Under the CRP condition, these systems behave like a single-server queue. When the CRP condition is not satisfied, heavy-traffic results are known only in the special cases of the input-queued switch and bandwidth-sharing networks. In this paper, we consider a very general queueing system called the 'generalized switch', which includes wireless networks under fading, data center networks, the input-queued switch, etc. The primary contribution of this paper is to present the exact value of the steady-state mean of certain linear combinations of queue lengths in the heavy-traffic limit under the MaxWeight scheduling algorithm. We do this using the Drift method, and we also present a negative result: it is not possible to obtain the remaining linear combinations (and consequently all the individual mean queue lengths) using this method. We show this by presenting an alternate view of the Drift method in terms of an (under-determined) system of linear equations. Finally, we use this system of equations to obtain upper and lower bounds on all linear combinations of queue lengths.
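For reference, the two objects the abstract leans on, in their textbook forms (notation mine, not necessarily the paper's):

```latex
% MaxWeight: with queue lengths q(t) and feasible service-rate vectors S(t)
% (which may depend on the fading/channel state), schedule
\[
  x(t) \in \arg\max_{s \in S(t)} \; \sum_i q_i(t)\, s_i .
\]
% Drift method: in steady state, the drift of any suitable test function V
% vanishes; expanding this for quadratic V yields linear equations in the
% mean queue lengths:
\[
  \mathbb{E}\!\left[ V\big(q(t+1)\big) - V\big(q(t)\big) \right] = 0 .
\]
```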
Optimal Bidding Strategies for Online Ad Auctions with Overlapping Targeting Criteria (doi:10.1145/3393691.3394210)
Erik Tillberg, P. Marbach, R. Mazumdar

We analyze the problem of how to optimally bid for ad spaces in online ad auctions. For this, we consider the general case of multiple ad campaigns with overlapping targeting criteria. In our analysis, we first characterize the structure of an optimal bidding strategy. In particular, we show that an optimal bidding strategy decomposes the problem into disjoint sets of campaigns and targeting groups. In addition, we show that pure bidding strategies, which use only a single bid value for each campaign, are not optimal when the supply curves are not continuous. For this case, we derive a lower bound on the optimal cost of any bidding strategy, as well as mixed bidding strategies that either achieve the lower bound or can get arbitrarily close to it.
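A toy numerical example of why a discontinuous supply curve forces mixing (all numbers, and the pay-your-bid-per-impression cost model, are my own invention, not the paper's auction model):

```python
# Supply curve with a jump at b = 2: bidding below 2 wins 10 impressions per
# period, bidding 2 or more wins 50.
def supply(b):
    return 10 if b < 2 else 50

target = 30                              # impressions needed per period

# No pure bid hits 30: b < 2 wins 10, b >= 2 wins 50. Mixing two bids that
# straddle the jump meets the target in expectation at lower cost than the
# cheapest pure bid that meets it.
p = (supply(2.0) - target) / (supply(2.0) - supply(1.99))   # P(bid low) = 0.5
cost_mixed = p * 1.99 * supply(1.99) + (1 - p) * 2.00 * supply(2.0)
cost_pure = 2.00 * supply(2.0)
print("mixed: %.2f  pure: %.2f" % (cost_mixed, cost_pure))  # 59.95 vs 100.00
```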
Simple Near-Optimal Scheduling for the M/G/1 (doi:10.1145/3393691.3394216)
Ziv Scully, Mor Harchol-Balter, Alan Scheller-Wolf
We consider the problem of preemptively scheduling jobs to minimize the mean response time of an M/G/1 queue. When we know each job's size, the shortest remaining processing time (SRPT) policy is optimal. Unfortunately, in many settings we do not have access to each job's size; instead, we know only the job size distribution. In this setting the Gittins policy is known to minimize mean response time, but its complex priority structure can be computationally intractable. A much simpler alternative to Gittins is the shortest expected remaining processing time (SERPT) policy. While SERPT is a natural extension of SRPT to unknown job sizes, it is unknown whether SERPT is close to optimal for mean response time. We present a new variant of SERPT called monotonic SERPT (M-SERPT), which is as simple as SERPT but has provably near-optimal mean response time at all loads for any job size distribution. Specifically, we prove that the mean response time ratio between M-SERPT and Gittins is at most 3 for load ρ ≤ 8/9 and at most 5 for any load. This makes M-SERPT the only non-Gittins scheduling policy known to have a constant-factor approximation ratio for mean response time.
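A short sketch of the rank (priority) functions involved, under my reading of the names above: SERPT serves the job with the smallest expected remaining size E[S - a | S > a] at attained service a, and M-SERPT replaces that rank with its running maximum over a, which makes it monotone. The two-point distribution is a toy example.

```python
import numpy as np

# Two-point job size distribution: mostly small jobs, a few large ones.
sizes = np.array([1.0, 10.0])
probs = np.array([0.9, 0.1])

def serpt_rank(age):
    # Expected remaining size of a job that has received `age` service.
    alive = sizes > age
    w = probs[alive] / probs[alive].sum()
    return float(((sizes[alive] - age) * w).sum())

ages = np.linspace(0.0, 9.9, 991)
serpt = np.array([serpt_rank(a) for a in ages])
mserpt = np.maximum.accumulate(serpt)      # M-SERPT: running max of SERPT

# SERPT's rank falls as a job nears age 1 (it is probably about to finish),
# then jumps to ~9 once only large jobs remain; M-SERPT never lets it fall.
i = np.searchsorted(ages, 0.95)
print("age 0.95: SERPT %.2f, M-SERPT %.2f" % (serpt[i], mserpt[i]))  # ~0.95 vs 1.90
```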
Uniform Loss Algorithms for Online Stochastic Decision-Making With Applications to Bin Packing (doi:10.1145/3393691.3394224)
Siddhartha Banerjee, Daniel Freund

We consider a general class of finite-horizon online decision-making problems where, in each period, a controller is presented with a stochastic arrival and must choose an action from a set of permissible actions, and the final objective depends only on the aggregate type-action counts. Such a framework encapsulates many online stochastic variants of common optimization problems, including bin packing, generalized assignment, and network revenue management. In such settings, we study a natural model-predictive control algorithm that, in each period, acts greedily based on an updated certainty-equivalent optimization problem. We introduce a simple yet general condition under which this algorithm obtains uniform additive loss (independent of the horizon) compared to an optimal solution with full knowledge of arrivals. Our condition is fulfilled by the above-mentioned problems, as well as more general settings involving piecewise-linear objectives and offline index policies, including an airline overbooking problem.
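A sketch of such a certainty-equivalent resolving policy on a toy multi-secretary-style instance (my example, not the paper's): each period we re-solve the fractional problem with the expected type counts among remaining arrivals and accept the current arrival iff the solution gives its type a positive allocation.

```python
import numpy as np

rng = np.random.default_rng(0)
rewards = np.array([3.0, 2.0, 1.0])      # reward of accepting each type
p = np.array([0.2, 0.3, 0.5])            # arrival probabilities of the types
T, B = 1000, 300                         # horizon and acceptance budget

def ce_accepts(j, periods_left, budget):
    # Certainty-equivalent step: fill the remaining budget greedily by reward
    # using the *expected* type counts among the remaining arrivals; accept
    # the current arrival iff its type receives a positive allocation.
    expected = periods_left * p
    for k in np.argsort(-rewards):
        take = min(expected[k], budget)
        if k == j:
            return take > 0
        budget -= take
    return False

budget, value = B, 0.0
for t in range(T):
    j = rng.choice(len(p), p=p)          # stochastic arrival of a random type
    if budget >= 1 and ce_accepts(j, T - t, budget):
        budget -= 1
        value += rewards[j]
print("value: %.0f, leftover budget: %d" % (value, budget))
```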
Characterizing Policies with Optimal Response Time Tails under Heavy-Tailed Job Sizes (doi:10.1145/3393691.3394179)
Ziv Scully, Lucas van Kreveld, O. Boxma, Jan-Pieter L. Dorsman, A. Wierman
We consider the tail behavior of the response time distribution in an M/G/1 queue with heavy-tailed job sizes, specifically those with intermediately regularly varying tails. In this setting, the response time tail of many individual policies has been characterized, and it is known that policies such as Shortest Remaining Processing Time (SRPT) and Foreground-Background (FB) have response time tails of the same order as the job size tail, and thus such policies are tail-optimal. Our goal in this work is to move beyond individual policies and characterize the set of policies that are tail-optimal. Toward that end, we use the recently introduced SOAP framework to derive sufficient conditions on the form of prioritization used by a scheduling policy that ensure the policy is tail-optimal. These conditions are general and lead to new results for important policies that have previously resisted analysis, including the Gittins policy, which minimizes mean response time among policies that do not have access to job size information. As a by-product of our analysis, we derive a general upper bound for fractional moments of M/G/1 busy periods, which is of independent interest.
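For reference, standard formalizations consistent with the phrasing above (the paper's precise statements may differ):

```latex
% Intermediately regularly varying job size tail, \bar F(x) = \Pr[S > x]:
\[
  \lim_{\epsilon \downarrow 0} \; \liminf_{x \to \infty}
    \frac{\bar F\big((1+\epsilon)x\big)}{\bar F(x)} \;=\; 1 .
\]
% Tail optimality: the stationary response time T has a tail of the same
% order as the job size tail (note T \ge S pathwise, so the ratio is \ge 1):
\[
  \limsup_{x \to \infty} \frac{\Pr[T > x]}{\Pr[S > x]} \;<\; \infty .
\]
```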