The computational power of optimization in online learning

Proceedings of the forty-eighth annual ACM symposium on Theory of Computing
Pub Date: 2015-04-08
DOI: 10.1145/2897518.2897536

Elad Hazan, Tomer Koren


Citations: 52

Abstract

We consider the fundamental problem of prediction with expert advice where the experts are “optimizable”: there is a black-box optimization oracle that can be used to compute, in constant time, the leading expert in retrospect at any point in time. In this setting, we give a novel online algorithm that attains vanishing regret with respect to N experts in total Õ(√N) computation time. We also give a lower bound showing that this running time cannot be improved (up to log factors) in the oracle model, thereby exhibiting a quadratic speedup as compared to the standard, oracle-free setting where the required time for vanishing regret is Θ̃(N). These results demonstrate an exponential gap between the power of optimization in online learning and its power in statistical learning: in the latter, an optimization oracle (i.e., an efficient empirical risk minimizer) allows one to learn a finite hypothesis class of size N in time O(log N). We also study the implications of our results for learning in repeated zero-sum games, in a setting where the players have access to oracles that compute, in constant time, their best response to any mixed strategy of their opponent. We show that the runtime required for approximating the minimax value of the game in this setting is Θ̃(√N), yielding again a quadratic improvement upon the oracle-free setting, where Θ̃(N) is known to be tight.
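To make the setting concrete, the following is a minimal sketch of the standard oracle-free baseline the abstract compares against, not the algorithm of the paper. It runs the classical Hedge (multiplicative weights) learner, which touches all N weights in every round, and measures its regret against the leading expert in retrospect. The names `leader_oracle`, `hedge`, `regret`, and `demo` are hypothetical, and the oracle here is a naive O(N) stand-in for what the paper's model treats as a single constant-time black-box call.

```python
import math
import random

def leader_oracle(cumulative_losses):
    # Stand-in for the optimization oracle: index of the best expert so far.
    # Assumption: computed naively in O(N) here; the oracle model counts it
    # as one constant-time call.
    return min(range(len(cumulative_losses)), key=cumulative_losses.__getitem__)

def hedge(loss_rounds, eta):
    # Oracle-free baseline: keep one weight per expert and update all of
    # them every round. Returns the learner's expected total loss.
    n = len(loss_rounds[0])
    weights = [1.0] * n
    total = 0.0
    for losses in loss_rounds:
        z = sum(weights)
        # Expected loss when an expert is sampled proportionally to its weight.
        total += sum(w * l for w, l in zip(weights, losses)) / z
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    return total

def regret(loss_rounds, learner_loss):
    # Regret = learner's loss minus the loss of the best expert in hindsight.
    n = len(loss_rounds[0])
    cumulative = [sum(r[i] for r in loss_rounds) for i in range(n)]
    return learner_loss - cumulative[leader_oracle(cumulative)]

def demo(n_experts=16, horizon=2000, seed=0):
    # Random losses in [0, 1]; returns the average (per-round) regret.
    rng = random.Random(seed)
    rounds = [[rng.random() for _ in range(n_experts)] for _ in range(horizon)]
    eta = math.sqrt(8 * math.log(n_experts) / horizon)
    return regret(rounds, hedge(rounds, eta)) / horizon

if __name__ == "__main__":
    print(demo())
```

With this step size and losses in [0, 1], the textbook Hedge analysis bounds the average regret by √(ln N / (2T)), which vanishes as the horizon T grows. The cost is the per-round pass over all N weights, which is the linear dependence on N that the oracle model is shown to improve upon.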


