A Full Adagrad algorithm with O(Nd) operations
Antoine Godichon-Baggioni (LPSM), Wei Lu (LMI), Bruno Portier (LMI)
arXiv - MATH - Statistics Theory, 2024-05-03, arXiv:2405.01908
Abstract
A novel approach is given to overcome the computational challenges of the
full-matrix Adaptive Gradient algorithm (Full AdaGrad) in stochastic
optimization. By developing a recursive method that estimates the inverse of
the square root of the covariance of the gradient, alongside a streaming
variant for parameter updates, the study offers efficient and practical
algorithms for large-scale applications. This innovative strategy significantly
reduces the complexity and resource demands typically associated with
full-matrix methods, enabling more effective optimization processes. Moreover,
the convergence rates of the proposed estimators and their asymptotic
efficiency are given. Their effectiveness is demonstrated through numerical
studies.
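
For orientation, the cost that the abstract refers to can be seen in a naive version of the full-matrix update. The sketch below is the textbook Full AdaGrad baseline and not the recursive estimator proposed in the paper: it recomputes the inverse square root of the averaged gradient outer products by eigendecomposition at every step, which costs on the order of d^3 operations per iteration. The function names `inv_sqrt_spd` and `full_adagrad_naive`, the `eps` regularization, and the `step / sqrt(n)` schedule are illustrative assumptions, not taken from the source.

```python
import numpy as np


def inv_sqrt_spd(mat, eps=1e-8):
    """Inverse square root of a symmetric positive semi-definite matrix.

    Uses an eigendecomposition, so the cost is cubic in the dimension.
    Eigenvalues are floored at `eps` (an assumed regularization) to avoid
    dividing by zero when the matrix is rank deficient.
    """
    eigvals, eigvecs = np.linalg.eigh(mat)
    eigvals = np.maximum(eigvals, eps)
    # V diag(1 / sqrt(lambda)) V^T
    return (eigvecs / np.sqrt(eigvals)) @ eigvecs.T


def full_adagrad_naive(grad_fn, theta0, samples, step=1.0, eps=1e-8):
    """Naive streaming Full AdaGrad (baseline sketch, assumed schedule).

    grad_fn(theta, sample) must return the stochastic gradient at theta.
    The preconditioner is the inverse square root of the running average
    of gradient outer products. With the step / sqrt(n) schedule used here,
    the update equals step * (sum of outer products)^(-1/2) @ gradient.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    dim = theta.size
    outer_sum = eps * np.eye(dim)  # running sum of g g^T, lightly regularized
    for n, sample in enumerate(samples, start=1):
        g = grad_fn(theta, sample)
        outer_sum += np.outer(g, g)
        precond = inv_sqrt_spd(outer_sum / n, eps)  # O(d^3) at every step
        theta -= (step / np.sqrt(n)) * (precond @ g)
    return theta
```

As a usage example, for streaming least squares one could pass `grad_fn = lambda theta, s: (s[0] @ theta - s[1]) * s[0]` with samples given as `(x, y)` pairs. The per-step eigendecomposition is the bottleneck that a recursive estimate of the inverse square root is meant to avoid.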