Tradeoffs Between Convergence Rate and Noise Amplification for Momentum-Based Accelerated Optimization Algorithms

Impact factor: 7; Zone 1, Computer Science; JCR Q1, Automation & Control Systems; IEEE Transactions on Automatic Control; Pub Date: 2024-09-03; DOI: 10.1109/TAC.2024.3453656
Hesameddin Mohammadi;Meisam Razaviyayn;Mihailo R. Jovanović
IEEE Transactions on Automatic Control, vol. 70, no. 2, pp. 889-904. https://ieeexplore.ieee.org/document/10663923/

Abstract

In this article, we study momentum-based first-order optimization algorithms in which the iterations utilize information from the two previous steps and are subject to additive white noise. This setup uses noise to account for uncertainty in either gradient evaluation or iteration updates, and it includes Polyak's heavy-ball and Nesterov's accelerated methods as special cases. For strongly convex quadratic problems, we use the steady-state variance of the error in the optimization variable to quantify noise amplification and identify fundamental stochastic performance tradeoffs. Our approach utilizes the Jury stability criterion to provide a novel geometric characterization of conditions for linear convergence, and it reveals the relation between noise amplification and convergence rate as well as their dependence on the condition number and the constant algorithmic parameters. This geometric insight leads to simple alternative proofs of standard convergence results and allows us to establish an “uncertainty principle” of strongly convex optimization: for the two-step momentum method with linear convergence rate, the lower bound on the product of the settling time and the noise amplification scales quadratically with the condition number. Our analysis also identifies a key difference between the gradient and iterate noise models: while the amplification of gradient noise can be made arbitrarily small by sufficiently decelerating the algorithm, the best achievable variance for the iterate noise model increases linearly with the settling time in the decelerating regime. Finally, we introduce two parameterized families of algorithms that strike a balance between noise amplification and settling time while preserving orderwise Pareto optimality for both noise models.
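The quantities named in the abstract can be illustrated numerically. The sketch below is not taken from the article; it assumes a common parameterization of the two-step momentum method, x_{t+2} = x_{t+1} + beta (x_{t+1} - x_t) - alpha grad f(x_{t+1} + gamma (x_{t+1} - x_t)) + noise, in which gamma = 0 gives heavy-ball, gamma = beta gives Nesterov's method, and beta = gamma = 0 gives gradient descent. For a quadratic f with Hessian eigenvalues `eigs`, each eigen-mode is a two-state linear system, so the convergence rate is a spectral radius and the steady-state variance solves a discrete Lyapunov equation. The function names, the `noise` flag, the convention that gradient noise enters scaled by alpha, and the settling-time definition 1 / (1 - rate) are all assumptions made for illustration.

```python
import numpy as np

def mode_matrix(lam, alpha, beta, gamma):
    """State matrix of one eigen-mode with state [x_t, x_{t+1}] (assumed form)."""
    return np.array([
        [0.0, 1.0],
        [-(beta - gamma * alpha * lam), 1.0 + beta - (1.0 + gamma) * alpha * lam],
    ])

def convergence_rate(eigs, alpha, beta, gamma):
    """Largest spectral radius over all modes; linear convergence needs < 1."""
    return max(
        float(np.max(np.abs(np.linalg.eigvals(mode_matrix(lam, alpha, beta, gamma)))))
        for lam in eigs
    )

def steady_state_variance(eigs, alpha, beta, gamma, sigma=1.0, noise="iterate"):
    """Sum over modes of the steady-state variance of x_t.

    Solves P = A P A^T + Q per mode via vec(P) = (I - kron(A, A))^{-1} vec(Q).
    Assumption: gradient noise enters the update multiplied by alpha,
    iterate noise enters unscaled.
    """
    scale = sigma * (alpha if noise == "gradient" else 1.0)
    Q = np.array([[0.0, 0.0], [0.0, scale ** 2]])
    total = 0.0
    for lam in eigs:
        A = mode_matrix(lam, alpha, beta, gamma)
        P = np.linalg.solve(np.eye(4) - np.kron(A, A), Q.flatten()).reshape(2, 2)
        total += P[0, 0]
    return float(total)

if __name__ == "__main__":
    m, L = 1.0, 100.0                      # strong convexity / smoothness constants
    eigs = np.linspace(m, L, 20)           # hypothetical Hessian spectrum
    kappa = L / m
    methods = {
        # standard textbook tunings for quadratics
        "gradient descent": (2.0 / (L + m), 0.0, 0.0),
        "heavy-ball": (
            4.0 / (np.sqrt(L) + np.sqrt(m)) ** 2,
            ((np.sqrt(kappa) - 1.0) / (np.sqrt(kappa) + 1.0)) ** 2,
            0.0,
        ),
    }
    for name, (alpha, beta, gamma) in methods.items():
        rho = convergence_rate(eigs, alpha, beta, gamma)
        settling = 1.0 / (1.0 - rho)       # assumed settling-time definition
        for model in ("iterate", "gradient"):
            J = steady_state_variance(eigs, alpha, beta, gamma, noise=model)
            print(f"{name:17s} {model:8s} rate={rho:.4f} "
                  f"T_s={settling:8.2f} J={J:10.4f} J*T_s={J * settling:12.4f}")
```

Running the script prints, for each method and noise model, the rate, the settling time, the variance J, and their product, which is the quantity the abstract's "uncertainty principle" bounds from below. Sweeping `alpha` downward for gradient descent is one way to explore the decelerating regime the abstract describes.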
Source journal
IEEE Transactions on Automatic Control (Engineering and Technology: Electrical and Electronic Engineering)
CiteScore: 11.30
Self-citation rate: 5.90%
Articles published: 824
Review time: 9 months
About the journal: In the IEEE Transactions on Automatic Control, the IEEE Control Systems Society publishes high-quality papers on the theory, design, and applications of control engineering. Two types of contributions are regularly considered: 1) Papers: Presentation of significant research, development, or application of control concepts. 2) Technical Notes and Correspondence: Brief technical notes, comments on published areas or established control topics, corrections to papers and notes published in the Transactions. In addition, special papers (tutorials, surveys, and perspectives on the theory and applications of control systems topics) are solicited.
Latest articles from this journal
IEEE Control Systems Society Publication Information
Optimal Control of Markov Decision Processes for Efficiency with Linear Temporal Logic Tasks
Observers for timed DESs with observation delays and losses
Adaptive Tracking Control for Quantized Linear Stochastic Regression Systems: Asymptotic Property under Non-Periodic Signals
Error-Aware Nonlinear Gain-Based PID Tracking Control: A Constructive and Self-Tuning Approach