Pub Date : 2018-09-03DOI: 10.1109/ITA50056.2020.9244951
César A. Uribe, Soomin Lee, A. Gasnikov, A. Nedić
We study dual-based algorithms for distributed convex optimization problems over networks, where the objective is to minimize a sum $sumnolimits_{i = 1}^m {{f_i}left( z right)} $ of functions over in a network. We provide complexity bounds for four different cases, namely: each function fi is strongly convex and smooth, each function is either strongly convex or smooth, and when it is convex but neither strongly convex nor smooth. Our approach is based on the dual of an appropriately formulated primal problem, which includes a graph that models the communication restrictions. We propose distributed algorithms that achieve the same optimal rates as their centralized counterparts (up to constant and logarithmic factors), with an additional optimal cost related to the spectral properties of the network. Initially, we focus on functions for which we can explicitly minimize its Legendre–Fenchel conjugate, i.e., admissible or dual friendly functions. Then, we study distributed optimization algorithms for non-dual friendly functions, as well as a method to improve the dependency on the parameters of the functions involved. Numerical analysis of the proposed algorithms is also provided.
我们研究了网络上分布凸优化问题的双重算法,其目标是最小化网络中函数的和$sumnolimits_{i = 1}^m {{f_i}left( z right)} $。我们提供了四种不同情况下的复杂度界限,即:每个函数fi是强凸光滑的,每个函数要么是强凸要么是光滑的,当它是凸但既不是强凸也不是光滑的。我们的方法是基于一个适当表述的原始问题的对偶,其中包括一个模拟通信限制的图。我们提出分布式算法,实现与集中式算法相同的最优速率(高达常数和对数因子),并具有与网络频谱特性相关的额外最优成本。首先,我们关注的是可以显式最小化其legende - fenchel共轭的函数,即可容许函数或对偶友好函数。然后,我们研究了非对偶友好函数的分布式优化算法,以及一种改善函数对参数依赖性的方法。对所提出的算法进行了数值分析。
{"title":"A Dual Approach for Optimal Algorithms in Distributed Optimization over Networks","authors":"César A. Uribe, Soomin Lee, A. Gasnikov, A. Nedić","doi":"10.1109/ITA50056.2020.9244951","DOIUrl":"https://doi.org/10.1109/ITA50056.2020.9244951","url":null,"abstract":"We study dual-based algorithms for distributed convex optimization problems over networks, where the objective is to minimize a sum $sumnolimits_{i = 1}^m {{f_i}left( z right)} $ of functions over in a network. We provide complexity bounds for four different cases, namely: each function fi is strongly convex and smooth, each function is either strongly convex or smooth, and when it is convex but neither strongly convex nor smooth. Our approach is based on the dual of an appropriately formulated primal problem, which includes a graph that models the communication restrictions. We propose distributed algorithms that achieve the same optimal rates as their centralized counterparts (up to constant and logarithmic factors), with an additional optimal cost related to the spectral properties of the network. Initially, we focus on functions for which we can explicitly minimize its Legendre–Fenchel conjugate, i.e., admissible or dual friendly functions. Then, we study distributed optimization algorithms for non-dual friendly functions, as well as a method to improve the dependency on the parameters of the functions involved. Numerical analysis of the proposed algorithms is also provided.","PeriodicalId":137257,"journal":{"name":"2020 Information Theory and Applications Workshop (ITA)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-09-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123817152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2017-06-05DOI: 10.1109/ita50056.2020.9244938
Steve Hanneke
This work initiates a general study of learning and generalization without the i.i.d. assumption, starting from first principles. While the standard approach to statistical learning theory is based on assumptions chosen largely for their convenience (e.g., i.i.d. or stationary ergodic), in this work we are interested in developing a theory of learning based only on the most fundamental and natural assumptions implicit in the requirements of the learning problem itself. We specifically study universally consistent function learning, where the objective is to obtain low long-run average loss for any target function, when the data follow a given stochastic process. We are then interested in the question of whether there exist learning rules guaranteed to be universally consistent given only the assumption that universally consistent learning is possible for the given data process. The reasoning that motivates this criterion emanates from a kind of optimist’s decision theory, and so we refer to such learning rules as being optimistically universal. We study this question in three natural learning settings: inductive, self-adaptive, and online. Remarkably, as our strongest positive result, we find that optimistically universal learning rules do indeed exist in the self-adaptive learning setting. Establishing this fact requires us to develop new approaches to the design of learning algorithms. Along the way, we also identify concise characterizations of the family of processes under which universally consistent learning is possible in the inductive and self-adaptive settings. We additionally pose a number of enticing open problems, particularly for the online learning setting.
{"title":"Learning Whenever Learning is Possible: Universal Learning under General Stochastic Processes","authors":"Steve Hanneke","doi":"10.1109/ita50056.2020.9244938","DOIUrl":"https://doi.org/10.1109/ita50056.2020.9244938","url":null,"abstract":"This work initiates a general study of learning and generalization without the i.i.d. assumption, starting from first principles. While the standard approach to statistical learning theory is based on assumptions chosen largely for their convenience (e.g., i.i.d. or stationary ergodic), in this work we are interested in developing a theory of learning based only on the most fundamental and natural assumptions implicit in the requirements of the learning problem itself. We specifically study universally consistent function learning, where the objective is to obtain low long-run average loss for any target function, when the data follow a given stochastic process. We are then interested in the question of whether there exist learning rules guaranteed to be universally consistent given only the assumption that universally consistent learning is possible for the given data process. The reasoning that motivates this criterion emanates from a kind of optimist’s decision theory, and so we refer to such learning rules as being optimistically universal. We study this question in three natural learning settings: inductive, self-adaptive, and online. Remarkably, as our strongest positive result, we find that optimistically universal learning rules do indeed exist in the self-adaptive learning setting. Establishing this fact requires us to develop new approaches to the design of learning algorithms. Along the way, we also identify concise characterizations of the family of processes under which universally consistent learning is possible in the inductive and self-adaptive settings. We additionally pose a number of enticing open problems, particularly for the online learning setting.","PeriodicalId":137257,"journal":{"name":"2020 Information Theory and Applications Workshop (ITA)","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2017-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129839968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}