Complexity guarantees for nonconvex Newton-MR under inexact Hessian information

Impact factor: 2.4 | Zone 2 (Mathematics) | JCR Q1 (MATHEMATICS, APPLIED) | IMA Journal of Numerical Analysis | Pub Date: 2025-03-05 | DOI: 10.1093/imanum/drae110
Alexander Lim, Fred Roosta
Citations: 0

Abstract

We consider an extension of the Newton-MR algorithm for nonconvex unconstrained optimization to settings where Hessian information is approximated. Under a particular noise model on the Hessian matrix, we investigate the iteration and operation complexities of this variant to achieve appropriate sub-optimality criteria in several nonconvex settings. We do this by first considering functions that satisfy the (generalized) Polyak–Łojasiewicz condition, a special sub-class of nonconvex functions. We show that, under certain conditions, our algorithm achieves a global linear convergence rate. We then consider more general nonconvex settings, where the rate to obtain first-order sub-optimality is shown to be sub-linear. In all these settings, we show that our algorithm converges regardless of the degree of approximation of the Hessian as well as the accuracy of the solution to the sub-problem. Finally, we compare the performance of our algorithm with several alternatives on a few machine learning problems.
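To make the setting in the abstract concrete, the following is a minimal sketch of a minimum-residual Newton step computed from a perturbed Hessian. It is not the authors' algorithm. Several choices here are assumptions made only for illustration: the noise model (a symmetric perturbation whose spectral norm equals `eps`), the test function f(x) = sum_i (x_i^2 + 3 sin^2(x_i)), which is a commonly used nonconvex example satisfying the Polyak–Łojasiewicz condition, the fallback to the negative gradient, the Armijo line search, and all function names. The sub-problem is solved exactly with `numpy.linalg.lstsq`, whereas the abstract refers to inexact sub-problem solves.

```python
import numpy as np


def f(x):
    # Illustrative nonconvex test function (an assumption, not from the paper).
    return float(np.sum(x**2 + 3.0 * np.sin(x) ** 2))


def grad(x):
    # d/dx (x^2 + 3 sin^2 x) = 2x + 3 sin(2x), applied coordinate-wise.
    return 2.0 * x + 3.0 * np.sin(2.0 * x)


def hess(x):
    # Exact Hessian is diagonal: 2 + 6 cos(2x). It is indefinite away from 0.
    return np.diag(2.0 + 6.0 * np.cos(2.0 * x))


def noisy_hessian(x, eps, rng):
    # Hypothetical noise model: symmetric perturbation E with ||E||_2 = eps.
    n = x.size
    E = rng.standard_normal((n, n))
    E = 0.5 * (E + E.T)
    E *= eps / np.linalg.norm(E, 2)
    return hess(x) + E


def newton_mr_sketch(x0, eps=0.1, tol=1e-8, max_iter=200, c1=1e-4, seed=0):
    """Minimum-residual Newton iteration with an inexact Hessian (sketch)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for k in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            return x, k
        H = noisy_hessian(x, eps, rng)
        # Minimum-norm least-squares solution of min_p ||H p + g||_2.
        p = np.linalg.lstsq(H, -g, rcond=None)[0]
        # Safeguard: if p is not a usable descent direction, use -g instead.
        if g @ p > -1e-6 * np.linalg.norm(g) * np.linalg.norm(p):
            p = -g
        # Armijo backtracking line search on f.
        t, fx, slope = 1.0, f(x), float(g @ p)
        while t > 1e-12 and f(x + t * p) > fx + c1 * t * slope:
            t *= 0.5
        x = x + t * p
    return x, max_iter


if __name__ == "__main__":
    x_final, iters = newton_mr_sketch(np.array([2.0, -1.5, 1.0]))
    print("iterations:", iters, "gradient norm:", np.linalg.norm(grad(x_final)))
```

In this sketch the starting point lies in a region where the exact Hessian is negative definite, so the first iterations fall back to gradient steps before the minimum-residual direction takes over near the minimizer. Changing `eps` is one way to explore how the degree of Hessian approximation affects the iteration count on this toy problem.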
Source journal
IMA Journal of Numerical Analysis (Mathematics: Applied Mathematics)
CiteScore: 5.30
Self-citation rate: 4.80%
Publication volume: 79
Review time: 6-12 weeks
About the journal: The IMA Journal of Numerical Analysis (IMAJNA) publishes original contributions to all fields of numerical analysis. It accepts articles that treat the theory, development, or use of practical algorithms, and the interactions between these aspects. Occasional survey articles are also published.