Adversarial Deep Learning for Online Resource Allocation

ACM Transactions on Modeling and Performance Evaluation of Computing Systems (IF 1.6; Q4; Computer Science, Information Systems). Pub Date: 2021-11-19. DOI: 10.1145/3494526

Bingqian Du, Zhiyi Huang, Chuan Wu

Volume: 6 1; Pages: 1-25; Publication type: Journal Article; Indexed via: Semantic Scholar

Citations: 4

Abstract

Online algorithms are an important branch of algorithm design. Designing online algorithms with a bounded competitive ratio (a worst-case performance measure) can be hard and usually relies on problem-specific assumptions. Inspired by adversarial training in generative adversarial networks (GANs) and by the fact that the competitive ratio of an online algorithm is defined over worst-case input, we adopt deep neural networks (NNs) to learn an online algorithm for a resource allocation and pricing problem from scratch, with the goal of minimizing the performance gap between the offline optimum and the learned online algorithm on worst-case input. Specifically, we use two NNs as the algorithm and the adversary, respectively, and let them play a zero-sum game: the adversary generates worst-case input, while the algorithm learns the best strategy against the input provided by the adversary. To improve convergence of the algorithm network (to the desired online algorithm), we propose a novel per-round update method for sequential decision making that breaks the complex dependency among rounds, so that updates can be made for every possible action instead of only sampled actions. To the best of our knowledge, our work is the first to use deep NNs to design an online algorithm from the perspective of worst-case performance guarantees. Empirical studies show that our update methods ensure convergence to a Nash equilibrium and that the learned algorithm outperforms state-of-the-art online algorithms under various settings.
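The zero-sum game between an algorithm and an adversary can be illustrated on a toy problem. The sketch below is not the authors' neural-network method. It assumes a hypothetical single-item pricing game: the algorithm posts a price from `PRICES`, the adversary picks the highest bid from `BIDS`, and the payoff is the ratio of online revenue to the offline optimum. Both players are trained with multiplicative weights in place of NN training, and every action's weight is updated each round, which loosely mirrors the idea of updating all possible actions rather than only sampled ones. All names and grids here are assumptions made for illustration.

```python
import math

# Hypothetical toy setting (not the paper's problem): one unit of resource.
PRICES = [1, 2, 4, 8]  # algorithm's pure strategies (posted price)
BIDS = [1, 2, 4, 8]    # adversary's pure strategies (highest bid)


def ratio(price, bid):
    """Online revenue divided by the offline optimum (the bid itself)."""
    return price / bid if price <= bid else 0.0


PAYOFF = [[ratio(p, v) for v in BIDS] for p in PRICES]


def normalize(weights):
    total = sum(weights)
    return [w / total for w in weights]


def solve_zero_sum(payoff, rounds=5000):
    """Multiplicative weights for both players; returns time-averaged mixes."""
    n, m = len(payoff), len(payoff[0])
    eta = math.sqrt(8 * math.log(max(n, m)) / rounds)
    w_alg, w_adv = [1.0 / n] * n, [1.0 / m] * m
    avg_alg, avg_adv = [0.0] * n, [0.0] * m
    for _ in range(rounds):
        x, y = w_alg, w_adv
        avg_alg = [a + xi / rounds for a, xi in zip(avg_alg, x)]
        avg_adv = [a + yj / rounds for a, yj in zip(avg_adv, y)]
        # Expected payoff of every pure action against the opponent's mix.
        gain_alg = [sum(payoff[i][j] * y[j] for j in range(m)) for i in range(n)]
        loss_adv = [sum(payoff[i][j] * x[i] for i in range(n)) for j in range(m)]
        # The algorithm maximizes the ratio; the adversary minimizes it.
        w_alg = normalize([w * math.exp(eta * g) for w, g in zip(w_alg, gain_alg)])
        w_adv = normalize([w * math.exp(-eta * l) for w, l in zip(w_adv, loss_adv)])
    return avg_alg, avg_adv


def worst_case_ratio(payoff, x):
    """Ratio that mix x guarantees against the adversary's best response."""
    return min(sum(payoff[i][j] * x[i] for i in range(len(x)))
               for j in range(len(payoff[0])))


def best_response_ratio(payoff, y):
    """Best ratio any posted price achieves against adversary mix y."""
    return max(sum(payoff[i][j] * y[j] for j in range(len(y)))
               for i in range(len(payoff)))


alg_mix, adv_mix = solve_zero_sum(PAYOFF)
print("algorithm mix:", [round(p, 3) for p in alg_mix])
print("guaranteed ratio:", round(worst_case_ratio(PAYOFF, alg_mix), 3))
print("adversary cap:", round(best_response_ratio(PAYOFF, adv_mix), 3))
```

For this particular 4x4 game the equilibrium value works out to 0.4 by hand, with the algorithm mixing (0.4, 0.2, 0.2, 0.2) over the prices. The two printed quantities should bracket that value closely, which is the small-scale analogue of checking convergence to a Nash equilibrium.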




Source journal

ACM Transactions on Modeling and Performance Evaluation of Computing Systems (Computer Science, Information Systems)

CiteScore

2.10

Self-citation rate

0.00%

Articles published