Adaptive Biased Stochastic Optimization

IF 18.6 IEEE transactions on pattern analysis and machine intelligence Pub Date : 2025-01-10 DOI:10.1109/TPAMI.2025.3528193

Zhuang Yang

{"title":"Adaptive Biased Stochastic Optimization","authors":"Zhuang Yang","doi":"10.1109/TPAMI.2025.3528193","DOIUrl":null,"url":null,"abstract":"This work develops and analyzes a class of adaptive biased stochastic optimization (ABSO) algorithms from the perspective of the GEneralized Adaptive gRadient (GEAR) method that contains Adam, AdaGrad, RMSProp, etc. Particularly, two preferred biased stochastic optimization (BSO) algorithms, the biased stochastic variance reduction gradient (BSVRG) algorithm and the stochastic recursive gradient algorithm (SARAH), equipped with GEAR, are first considered in this work, leading to two ABSO algorithms: BSVRG-GEAR and SARAH-GEAR. We present a uniform analysis of ABSO algorithms for minimizing strongly convex (SC) and Polyak-Łojasiewicz (PŁ) composite objective functions. Second, we also use our framework to develop another novel BSO algorithm, adaptive biased stochastic conjugate gradient (coined BSCG-GEAR), which achieves the well-known oracle complexity. Specifically, under mild conditions, we prove that the resulting ABSO algorithms attain a linear convergence rate on both PŁ and SC cases. Moreover, we show that the complexity of the resulting ABSO algorithms is comparable to that of advanced stochastic gradient-based algorithms. Finally, we demonstrate the empirical superiority and the numerical stability of the resulting ABSO algorithms by conducting numerical experiments on different applications of machine learning.","PeriodicalId":94034,"journal":{"name":"IEEE transactions on pattern analysis and machine intelligence","volume":"47 4","pages":"3067-3078"},"PeriodicalIF":18.6000,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE transactions on pattern analysis and machine intelligence","FirstCategoryId":"1085","ListUrlMain":"https://ieeexplore.ieee.org/document/10836750/","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

Abstract

This work develops and analyzes a class of adaptive biased stochastic optimization (ABSO) algorithms from the perspective of the GEneralized Adaptive gRadient (GEAR) method that contains Adam, AdaGrad, RMSProp, etc. Particularly, two preferred biased stochastic optimization (BSO) algorithms, the biased stochastic variance reduction gradient (BSVRG) algorithm and the stochastic recursive gradient algorithm (SARAH), equipped with GEAR, are first considered in this work, leading to two ABSO algorithms: BSVRG-GEAR and SARAH-GEAR. We present a uniform analysis of ABSO algorithms for minimizing strongly convex (SC) and Polyak-Łojasiewicz (PŁ) composite objective functions. Second, we also use our framework to develop another novel BSO algorithm, adaptive biased stochastic conjugate gradient (coined BSCG-GEAR), which achieves the well-known oracle complexity. Specifically, under mild conditions, we prove that the resulting ABSO algorithms attain a linear convergence rate on both PŁ and SC cases. Moreover, we show that the complexity of the resulting ABSO algorithms is comparable to that of advanced stochastic gradient-based algorithms. Finally, we demonstrate the empirical superiority and the numerical stability of the resulting ABSO algorithms by conducting numerical experiments on different applications of machine learning.

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

自适应有偏随机优化

本文从广义自适应梯度（GEAR）方法的角度发展和分析了一类自适应偏置随机优化（ABSO）算法，该算法包含Adam、AdaGrad、RMSProp等。特别地，本文首先考虑了两种首选的有偏随机优化（BSO）算法，即带有GEAR的有偏随机方差减少梯度（BSVRG）算法和随机递归梯度算法（SARAH），得到了两种ABSO算法：BSVRG-GEAR和SARAH-GEAR。我们提出了最小化强凸（SC）和Polyak-Łojasiewicz （PŁ）复合目标函数的ABSO算法的统一分析。其次，我们还利用我们的框架开发了另一种新的BSO算法，自适应有偏差随机共轭梯度（BSCG-GEAR），该算法实现了众所周知的oracle复杂性。具体而言，在温和的条件下，我们证明了所得ABSO算法在PŁ和SC情况下都具有线性收敛率。此外，我们证明了所得到的ABSO算法的复杂性与先进的基于随机梯度的算法相当。最后，我们通过在不同的机器学习应用中进行数值实验，证明了所得ABSO算法的经验优势和数值稳定性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文去求助

来源期刊

IEEE transactions on pattern analysis and machine intelligence

自引率

0.00%

发文量

期刊最新文献

GrowSP++: Growing Superpoints and Primitives for Unsupervised 3D Semantic Segmentation. Unsupervised Gaze Representation Learning by Switching Features. H₂OT: Hierarchical Hourglass Tokenizer for Efficient Video Pose Transformers. MV2DFusion: Leveraging Modality-Specific Object Semantics for Multi-Modal 3D Detection. Parse Trees Guided LLM Prompt Compression.