Lin Yang, M. Hajiesmaili, R. Sitaraman, A. Wierman, Enrique Mallada, W. Wong
This paper considers the problem of online linear optimization with inventory management constraints. Specifically, we consider an online scenario where a decision maker needs to satisfy her time-varying demand for some units of an asset, either from a market with a time-varying price or from her own inventory. In each time slot, the decision maker is presented with a (linear) price and must immediately decide the amount to purchase for covering the demand and/or for storing in the inventory for future use. The inventory has a limited capacity and can be used to buy and store assets at a low price and to cover the demand when the price is high. The ultimate goal of the decision maker is to cover the demand at each time slot while minimizing the cost of buying assets from the market. We propose ARP, an online algorithm for linear programming with inventory constraints, and ARPRate, an extended version that handles rate constraints to/from the inventory. Both ARP and ARPRate achieve optimal competitive ratios, meaning that no other online algorithm can achieve a better theoretical guarantee. To illustrate the results, we use the proposed algorithms in a case study focused on energy procurement and storage management strategies for data centers.
{"title":"Online Linear Optimization with Inventory Management Constraints","authors":"Lin Yang, M. Hajiesmaili, R. Sitaraman, A. Wierman, Enrique Mallada, W. Wong","doi":"10.1145/3393691.3394207","DOIUrl":"https://doi.org/10.1145/3393691.3394207","url":null,"abstract":"This paper considers the problem of online linear optimization with inventory management constraints. Specifically, we consider an online scenario where a decision maker needs to satisfy her timevarying demand for some units of an asset, either from a market with a time-varying price or from her own inventory. In each time slot, the decision maker is presented a (linear) price and must immediately decide the amount to purchase for covering the demand and/or for storing in the inventory for future use. The inventory has a limited capacity and can be used to buy and store assets at low price and cover the demand when the price is high. The ultimate goal of the decision maker is to cover the demand at each time slot while minimizing the cost of buying assets from the market. We propose ARP, an online algorithm for linear programming with inventory constraints, and ARPRate, an extended version that handles rate constraints to/from the inventory. Both ARP and ARPRate achieve optimal competitive ratios, meaning that no other online algorithm can achieve a better theoretical guarantee. To illustrate the results, we use the proposed algorithms in a case study focused on energy procurement and storage management strategies for data centers.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127623309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An important class of computer software, such as network servers, exhibits concurrency through many loosely coupled and potentially long-running communication sessions. For these applications, a long-standing open question is whether thread-per-session programming can deliver comparable performance to event-driven programming. This paper clearly demonstrates, for the first time, that it is possible to employ user-level threading for building thread-per-session applications without compromising functionality, efficiency, performance, or scalability. We present the design and implementation of a general-purpose, yet nimble, user-level M:N threading runtime that is built from scratch to accomplish these objectives. Its key components are efficient and effective load balancing and user-level I/O blocking. While no other runtime exists with comparable characteristics, an important fundamental finding of this work is that building this runtime does not require particularly intricate data structures or algorithms. The runtime is thus a straightforward existence proof for user-level threading without performance compromises and can serve as a reference platform for future research. It is evaluated in comparison to event-driven software, system-level threading, and several other user-level threading runtimes. An experimental evaluation is conducted using benchmark programs, as well as the popular Memcached application. We demonstrate that our user-level runtime outperforms other threading runtimes and enables thread-per-session programming at high levels of concurrency and hardware parallelism without sacrificing performance.
{"title":"User-level Threading: Have Your Cake and Eat It Too","authors":"M. Karsten, Saman Barghi","doi":"10.1145/3393691.3394226","DOIUrl":"https://doi.org/10.1145/3393691.3394226","url":null,"abstract":"An important class of computer software, such as network servers, exhibits concurrency through many loosely coupled and potentially long-running communication sessions. For these applications, a long-standing open question is whether thread-per-session programming can deliver comparable performance to event-driven programming. This paper clearly demonstrates, for the first time, that it is possible to employ user-level threading for building thread-per-session applications without compromising functionality, efficiency, performance, or scalability. We present the design and implementation of a general-purpose, yet nimble, user-level M:N threading runtime that is built from scratch to accomplish these objectives. Its key components are efficient and effective load balancing and user-level I/O blocking. While no other runtime exists with comparable characteristics, an important fundamental finding of this work is that building this runtime does not require particularly intricate data structures or algorithms. The runtime is thus a straightforward existence proof for user-level threading without performance compromises and can serve as a reference platform for future research. It is evaluated in comparison to event-driven software, system-level threading, and several other user-level threading runtimes. An experimental evaluation is conducted using benchmark programs, as well as the popular Memcached application. We demonstrate that our user-level runtime outperforms other threading runtimes and enables thread-per-session programming at high levels of concurrency and hardware parallelism without sacrificing performance.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127368936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a systematic approach to identify and quantify the types of structures featured by packet traces in communication networks. Our approach leverages an information-theoretic methodology, based on iterative randomization and compression of the packet trace, which allows us to systematically remove and measure dimensions of structure in the trace. In particular, we introduce the notion of trace complexity which approximates the entropy rate of a packet trace. Considering several real-world traces, we show that trace complexity can provide unique insights into the characteristics of various applications. Based on our approach, we also propose a traffic generator model able to produce a synthetic trace that matches the complexity levels of its corresponding real-world trace. Using a case study in the context of datacenters, we show that insights into the structure of packet traces can lead to improved demand-aware network designs: datacenter topologies that are optimized for specific traffic patterns.
{"title":"On the Complexity of Traffic Traces and Implications","authors":"C. Avin, M. Ghobadi, Chen Griner, S. Schmid","doi":"10.1145/3393691.3394205","DOIUrl":"https://doi.org/10.1145/3393691.3394205","url":null,"abstract":"This paper presents a systematic approach to identify and quantify the types of structures featured by packet traces in communication networks. Our approach leverages an information-theoretic methodology, based on iterative randomization and compression of the packet trace, which allows us to systematically remove and measure dimensions of structure in the trace. In particular, we introduce the notion of trace complexity which approximates the entropy rate of a packet trace. Considering several real-world traces, we show that trace complexity can provide unique insights into the characteristics of various applications. Based on our approach, we also propose a traffic generator model able to produce a synthetic trace that matches the complexity levels of its corresponding real-world trace. Using a case study in the context of datacenters, we show that insights into the structure of packet traces can lead to improved demand-aware network designs: datacenter topologies that are optimized for specific traffic patterns.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131125232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We study online optimization in a setting where an online learner seeks to optimize a per-round hitting cost, which may be non-convex, while incurring a movement cost when changing actions between rounds. We ask: under what general conditions is it possible for an online learner to leverage predictions of future cost functions in order to achieve near-optimal costs? Prior work has provided near-optimal online algorithms for specific combinations of assumptions about hitting and switching costs, but no general results are known. In this work, we give two general sufficient conditions that specify a relationship between the hitting and movement costs which guarantees that a new algorithm, Synchronized Fixed Horizon Control (SFHC), achieves a 1+O(1/w) competitive ratio, where w is the number of predictions available to the learner. Our conditions do not require the cost functions to be convex, and we also derive competitive ratio results for non-convex hitting and movement costs. Our results provide the first constant, dimension-free competitive ratio for online non-convex optimization with movement costs. We also give an example of a natural problem, Convex Body Chasing (CBC), where the sufficient conditions are not satisfied and prove that no online algorithm can have a competitive ratio that converges to 1.
{"title":"Online Optimization with Predictions and Non-convex Losses","authors":"Yiheng Lin, Gautam Goel, A. Wierman","doi":"10.1145/3393691.3394208","DOIUrl":"https://doi.org/10.1145/3393691.3394208","url":null,"abstract":"We study online optimization in a setting where an online learner seeks to optimize a per-round hitting cost, which may be non-convex, while incurring a movement cost when changing actions between rounds. We ask: under what general conditions is it possible for an online learner to leverage predictions of future cost functions in order to achieve near-optimal costs? Prior work has provided near-optimal online algorithms for specific combinations of assumptions about hitting and switching costs, but no general results are known. In this work, we give two general sufficient conditions that specify a relationship between the hitting and movement costs which guarantees that a new algorithm, Synchronized Fixed Horizon Control (SFHC), achieves a 1+O(1/w) competitive ratio, where w is the number of predictions available to the learner. Our conditions do not require the cost functions to be convex, and we also derive competitive ratio results for non-convex hitting and movement costs. Our results provide the first constant, dimension-free competitive ratio for online non-convex optimization with movement costs. We also give an example of a natural problem, Convex Body Chasing (CBC), where the sufficient conditions are not satisfied and prove that no online algorithm can have a competitive ratio that converges to 1.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134476320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Given the central role of webcams in monitoring physical surroundings, it behooves the research community to understand the characteristics of webcams' distribution and their privacy/security implications. In this paper, we conduct the first systematic study on live webcams from both aggregation sites and individual webcams (webpages/IP hosts). We propose a series of efficient, automated techniques for detecting and fingerprinting live webcams. In particular, we leverage distributed algorithms to detect aggregation sites and generate webcam fingerprints by utilizing the Graphical User Interface (GUI) of the built-in web server of a device. Overall, we observe 0.85 million webpages from aggregation sites hosting live webcams and 2.2 million live webcams in the public IPv4 space. Our study reveals that aggregation sites have a typical long-tail distribution in hosting live streams (5.8% of sites contain 90.44% of live streaming contents), and 85.4% of aggregation websites scrape webcams from others. Further, we observe that (1) 277,239 webcams from aggregation sites and IP hosts (11.7%) directly expose live streams to the public, (2) aggregation sites expose geolocation names for 187,897 webcams and more detailed longitude/latitude pairs for 23,083 webcams, (3) the default usernames and passwords of 38,942 webcams are visible on aggregation sites in plaintext, and (4) 1,237 webcams are detected as having been compromised to conduct malicious behaviors.
{"title":"Under the Concealing Surface: Detecting and Understanding Live Webcams in the Wild","authors":"Jinke Song, Qiang Li, Haining Wang, Limin Sun","doi":"10.1145/3393691.3394220","DOIUrl":"https://doi.org/10.1145/3393691.3394220","url":null,"abstract":"Given the central role of webcams in monitoring physical surroundings, it behooves the research community to understand the characteristics of webcams' distribution and their privacy/security implications. In this paper, we conduct the first systematic study on live webcams from both aggregation sites and individual webcams (webpages/IP hosts). We propose a series of efficient, automated techniques for detecting and fingerprinting live webcams. In particular, we leverage distributed algorithms to detect aggregation sites and generate webcam fingerprints by utilizing the Graphical User Interface (GUI) of the built-in web server of a device. Overall, we observe 0.85 million webpages from aggregation sites hosting live webcams and 2.2 million live webcams in the public IPv4 space. Our study reveals that aggregation sites have a typical long-tail distribution in hosting live streams (5.8% of sites contain 90.44% of live streaming contents), and 85.4% of aggregation websites scrape webcams from others. Further, we observe that (1) 277,239 webcams from aggregation sites and IP hosts (11.7%) directly expose live streams to the public, (2) aggregation sites expose 187,897 geolocation names and more detailed 23,083 longitude/latitude pairs of webcams, (3) the default usernames and passwords of 38,942 webcams are visible on aggregation sites in plaintext, and (4) 1,237 webcams are detected as having been compromised to conduct malicious behaviors.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130304660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
New memory technologies are blurring the previously distinctive performance characteristics of adjacent layers in the memory hierarchy. No longer are such layers orders of magnitude apart in request latency or capacity. Beyond the traditional single-layer view of caching, we now must re-cast the problem as a data placement challenge: which data should be cached in faster memory if it could instead be served directly from slower memory? We present CHOPT, an offline algorithm for data placement across multiple tiers of memory with asymmetric read and write costs. We show that CHOPT is optimal and can therefore serve as an upper bound on the performance gain of any data placement algorithm. We also demonstrate an approximation of CHOPT that uses spatial sampling of requests to make its execution time practical for long traces, incurring a small average error of 0.2% on representative workloads at a sampling ratio of 1%. Our evaluation of CHOPT on more than 30 production traces and benchmarks shows that optimal data placement decisions could improve average request latency by 8.2%-44.8% when compared with the long-established gold standard: Belady and Mattson's offline, evict-farthest-in-the-future optimal algorithms. Our results identify substantial improvement opportunities for future online memory management research.
{"title":"Optimal Data Placement for Heterogeneous Cache, Memory, and Storage Systems","authors":"Lei Zhang, Reza Karimi, I. Ahmad, Ymir Vigfusson","doi":"10.1145/3393691.3394229","DOIUrl":"https://doi.org/10.1145/3393691.3394229","url":null,"abstract":"New memory technologies are blurring the previously distinctive performance characteristics of adjacent layers in the memory hierarchy. No longer are such layers orders of magnitude different in request latency or capacity. Beyond the traditional single-layer view of caching, we now must re-cast the problem as a data placement challenge: which data should be cached in faster memory if it could instead be served directly from slower memory? We present CHOPT, an offline algorithm for data placement across multiple tiers of memory with asymmetric read and write costs. We show that CHOPT is optimal and can therefore serve as the upper bound of performance gain for any data placement algorithm. We also demonstrate an approximation of CHOPT which makes its execution time for long traces practical using spatial sampling of requests incurring a small 0.2% average error on representative workloads at a sampling ratio of 1%. Our evaluation of CHOPT on more than 30 production traces and benchmarks shows that optimal data placement decisions could improve average request latency by 8.2%-44.8% when compared with the long-established gold standard: Belady and Mattson's offline, evict-farthest-in-the-future optimal algorithms. Our results identify substantial improvement opportunities for future online memory management research.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123822483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider stochastic bandit problems with a continuous set of arms and where the expected reward is a continuous and unimodal function of the arm. For these problems, we propose the Stochastic Polychotomy (SP) algorithms, and derive finite-time upper bounds on their regret and optimization error. We show that, for a class of reward functions, the SP algorithm achieves a regret and an optimization error with optimal scalings, i.e., O(√T) and O(1/√T) (up to a logarithmic factor), respectively.
{"title":"Unimodal Bandits with Continuous Arms: Order-optimal Regret without Smoothness","authors":"Richard Combes, A. Proutière, A. Fauquette","doi":"10.1145/3393691.3394225","DOIUrl":"https://doi.org/10.1145/3393691.3394225","url":null,"abstract":"We consider stochastic bandit problems with a continuous set of arms and where the expected reward is a continuous and unimodal function of the arm. For these problems, we propose the Stochastic Polychotomy (SP) algorithms, and derive finite-time upper bounds on their regret and optimization error. We show that, for a class of reward functions, the SP algorithm achieves a regret and an optimization error with optimal scalings, i.e., O(√T) and O(1/√T) (up to a logarithmic factor), respectively.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130891941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We consider a large distributed service system consisting of n homogeneous servers with infinite capacity FIFO queues. Jobs arrive as a Poisson process of rate λn/k_n (for some positive constant λ and integer k_n). Each incoming job consists of k_n identical tasks that can be executed in parallel, and that can be encoded into at least k_n "replicas" of the same size (by introducing redundancy) so that the job is considered to be completed when any k_n replicas associated with it finish their service. Moreover, we assume that servers can experience random slowdowns in their processing rate so that the service time of a replica is the product of its size and a random slowdown. First, we assume that the server slowdowns are shifted exponential and independent of the replica sizes. In this setting we show that the delay of a typical job is asymptotically minimized (as n→∞) when the number of replicas per task is a constant that only depends on the arrival rate λ, and on the expected slowdown of servers. Second, we introduce a new model for the server slowdowns in which larger tasks experience less variable slowdowns than smaller tasks. In this setting we show that, under the class of policies where all replicas start their service at the same time, the delay of a typical job is asymptotically minimized (as n→∞) when the number of replicas per task is made to depend on the actual size of the tasks being replicated, with smaller tasks being replicated more than larger tasks.
{"title":"Delay-Optimal Policies in Partial Fork-Join Systems with Redundancy and Random Slowdowns","authors":"Martin Zubeldia","doi":"10.1145/3393691.3394181","DOIUrl":"https://doi.org/10.1145/3393691.3394181","url":null,"abstract":"We consider a large distributed service system consisting of n homogeneous servers with infinite capacity FIFO queues. Jobs arrive as a Poisson process of rate λ n/kn (for some positive constant λ and integer kn). Each incoming job consists of kn identical tasks that can be executed in parallel, and that can be encoded into at least kn \"replicas\" of the same size (by introducing redundancy) so that the job is considered to be completed when any kn replicas associated with it finish their service. Moreover, we assume that servers can experience random slowdowns in their processing rate so that the service time of a replica is the product of its size and a random slowdown. First, we assume that the server slowdowns are shifted exponential and independent of the replica sizes. In this setting we show that the delay of a typical job is asymptotically minimized (as n→∞) when the number of replicas per task is a constant that only depends on the arrival rate λ, and on the expected slowdown of servers. Second, we introduce a new model for the server slowdowns in which larger tasks experience less variable slowdowns than smaller tasks. In this setting we show that, under the class of policies where all replicas start their service at the same time, the delay of a typical job is asymptotically minimized (as n→∞) when the number of replicas per task is made to depend on the actual size of the tasks being replicated, with smaller tasks being replicated more than larger tasks.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122572888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this work, we consider the popular tree-based search strategy within the framework of reinforcement learning, the Monte Carlo Tree Search (MCTS), in the context of infinite-horizon discounted cost Markov Decision Process (MDP) with deterministic transitions. While MCTS is believed to provide an approximate value function for a given state with enough simulations, cf. [Kocsis and Szepesvari 2006; Kocsis et al. 2006], the claimed proof of this property is incomplete. This is due to the fact that the variant of MCTS, the Upper Confidence Bound for Trees (UCT), analyzed in prior works utilizes a "logarithmic" bonus term for balancing exploration and exploitation within the tree-based search, following the insights from the stochastic multi-arm bandit (MAB) literature, cf. [Agrawal 1995; Auer et al. 2002]. In effect, such an approach assumes that the regret of the underlying recursively dependent non-stationary MABs concentrates around their mean exponentially in the number of steps, which is unlikely to hold as pointed out in [Audibert et al. 2009], even for stationary MABs. As the key contribution of this work, we establish a polynomial concentration property of regret for a class of non-stationary multi-arm bandits. This in turn establishes that MCTS with an appropriate polynomial, rather than logarithmic, bonus term in UCB has the claimed property of [Kocsis and Szepesvari 2006; Kocsis et al. 2006]. Interestingly enough, empirically successful approaches (cf. [Silver et al. 2017]) utilize a similar polynomial form of MCTS as suggested by our result. Using this as a building block, we argue that MCTS, combined with nearest neighbor supervised learning, acts as a "policy improvement" operator, i.e., it iteratively improves the value function approximation for all states, due to combining with supervised learning, despite evaluating at only finitely many states. In effect, we establish that to learn an ε-approximation of the value function for deterministic MDPs with respect to the ℓ∞ norm, MCTS combined with nearest neighbor requires a sample size scaling as Õ(ε^{-(d+4)}), where d is the dimension of the state space. This is nearly optimal due to a minimax lower bound of Ω̃(ε^{-(d+2)}) [Shah and Xie 2018], suggesting the strength of the variant of MCTS we propose here and our resulting analysis.
{"title":"Non-Asymptotic Analysis of Monte Carlo Tree Search","authors":"D. Shah, Qiaomin Xie, Zhi Xu","doi":"10.1145/3393691.3394202","DOIUrl":"https://doi.org/10.1145/3393691.3394202","url":null,"abstract":"In this work, we consider the popular tree-based search strategy within the framework of reinforcement learning, the Monte Carlo Tree Search (MCTS), in the context of infinite-horizon discounted cost Markov Decision Process (MDP) with deterministic transitions. While MCTS is believed to provide an approximate value function for a given state with enough simulations, cf. [Kocsis and Szepesvari 2006; Kocsis et al. 2006], the claimed proof of this property is incomplete. This is due to the fact that the variant of MCTS, the Upper Confidence Bound for Trees (UCT), analyzed in prior works utilizes \"logarithmic\" bonus term for balancing exploration and exploitation within the tree-based search, following the insights from stochastic multi-arm bandit (MAB) literature, cf. [Agrawal 1995; Auer et al. 2002]. In effect, such an approach assumes that the regret of the underlying recursively dependent non-stationary MABs concentrates around their mean exponentially in the number of steps, which is unlikely to hold as pointed out in [Audibert et al. 2009], even for stationary MABs. As the key contribution of this work, we establish polynomial concentration property of regret for a class of non-stationary multi-arm bandits. This in turn establishes that the MCTS with appropriate polynomial rather than logarithmic bonus term in UCB has the claimed property of [Kocsis and Szepesvari 2006; Kocsis et al. 2006]. Interestingly enough, empirically successful approaches (cf. [Silver et al. 2017]) utilize a similar polynomial form of MCTS as suggested by our result. Using this as a building block, we argue that MCTS, combined with nearest neighbor supervised learning, acts as a \"policy improvement\" operator, i.e., it iteratively improves value function approximation for all states, due to combining with supervised learning, despite evaluating at only finitely many states. In effect, we establish that to learn an ε-approximation of the value function for deterministic MDPs with respect to ℓ∞ norm, MCTS combined with nearest neighbor requires a sample size scaling as Õ (ε-(d+4), where d is the dimension of the state space. This is nearly optimal due to a minimax lower bound of ∼Ω (ε-(d+2) [Shah and Xie 2018], suggesting the strength of the variant of MCTS we propose here and our resulting analysis.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"199 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132809674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sushil Mahavir Varma, Pornpawee Bumpensanti, S. T. Maguluri, He Wang
Motivated by diverse applications in the sharing economy and online marketplaces, we consider optimal pricing and matching control in a two-sided queueing system. We assume that heterogeneous customers and servers arrive to the system with price-dependent arrival rates. The compatibility between servers and customers is specified by a bipartite graph. Once a pair of customer and server are matched, they depart from the system instantaneously. The objective is to maximize the long-run average profits of the system while minimizing average waiting time. We first propose a static pricing and max-weight matching policy, which achieves O(√η) optimality rate when all of the arrival rates are scaled by η. We further show that a dynamic pricing and modified max-weight matching policy achieves an improved O(η^{1/3}) optimality rate. In addition, we propose a constraint generation algorithm that solves a value function approximation of the MDP and demonstrate strong numerical performance of this algorithm.
{"title":"Dynamic Pricing and Matching for Two-Sided Queues","authors":"Sushil Mahavir Varma, Pornpawee Bumpensanti, S. T. Maguluri, He Wang","doi":"10.1145/3393691.3394183","DOIUrl":"https://doi.org/10.1145/3393691.3394183","url":null,"abstract":"Motivated by diverse applications in sharing economy and online marketplaces, we consider optimal pricing and matching control in a two-sided queueing system. We assume that heterogeneous customers and servers arrive to the system with price-dependent arrival rates. The compatibility between servers and customers is specified by a bipartite graph. Once a pair of customer and server are matched, they depart from the system instantaneously. The objective is to maximize the long-run average profits of the system while minimizing average waiting time. We first propose a static pricing and max-weight matching policy, which achieves O(√η) optimality rate when all of the arrival rates are scaled by η. We further show that a dynamic pricing and modified max-weight matching policy achieves an improved O(η1/3) optimality rate. In addition, we propose a constraint generation algorithm that solves value function approximation of the MDP and demonstrate strong numerical performance of this algorithm.","PeriodicalId":188517,"journal":{"name":"Abstracts of the 2020 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-05-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131247804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}