Fast convergence to non-isolated minima: four equivalent conditions for \(\textrm{C}^2\) functions
Authors: Quentin Rebjock, Nicolas Boumal
Journal: Mathematical Programming (published 2024-09-19)
DOI: https://doi.org/10.1007/s10107-024-02136-6
Optimization algorithms can see their local convergence rates deteriorate when the Hessian at the optimum is singular. These singularities are inescapable when the optima are non-isolated. Yet, under the right circumstances, several algorithms preserve their favorable rates even when optima form a continuum (e.g., due to over-parameterization). This has been explained under various structural assumptions, including the Polyak–Łojasiewicz condition, Quadratic Growth and the Error Bound. We show that, for cost functions which are twice continuously differentiable (\(\textrm{C}^2\)), those three (local) properties are equivalent. Moreover, we show they are equivalent to the Morse–Bott property, that is, local minima form differentiable submanifolds, and the Hessian of the cost function is positive definite along its normal directions. We leverage this insight to improve local convergence guarantees for safe-guarded Newton-type methods under any (hence all) of the above assumptions. First, for adaptive cubic regularization, we secure quadratic convergence even with inexact subproblem solvers. Second, for trust-region methods, we argue capture can fail with an exact subproblem solver, then proceed to show linear convergence with an inexact one (Cauchy steps).
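To make the three growth-type conditions concrete, here is a minimal numerical sketch on a toy cost that is not taken from the paper: f(x, y) = (x² + y² − 1)², whose minimizers form the unit circle, so every minimum is non-isolated and the Hessian there is singular along the tangent direction. The function names, the step size, and the starting points are illustrative assumptions. The constant conventions used for the three inequalities (stated in the code comments) are the common textbook ones and may differ from the paper's exact statements. For this particular cost, the three ratios all approach 8 near the circle, which is the Hessian's eigenvalue in the normal (radial) direction.

```python
import math


def f(x, y):
    """Toy cost (an assumption for illustration): minimizers are the unit circle."""
    return (x * x + y * y - 1.0) ** 2


def grad(x, y):
    """Gradient of f: 4 (x^2 + y^2 - 1) * (x, y)."""
    s = 4.0 * (x * x + y * y - 1.0)
    return s * x, s * y


def dist_to_minima(x, y):
    """Euclidean distance from (x, y) to the unit circle S."""
    return abs(math.hypot(x, y) - 1.0)


def condition_ratios(x, y):
    """Ratios whose local lower bound plays the role of mu in each condition.

    Evaluate only at points off the circle, where the gap and distance are nonzero.
    """
    gx, gy = grad(x, y)
    g = math.hypot(gx, gy)
    d = dist_to_minima(x, y)
    gap = f(x, y)  # the optimal value f* is 0 for this cost
    return {
        "PL": g * g / (2.0 * gap),  # Polyak-Lojasiewicz: ||grad f||^2 >= 2 mu (f - f*)
        "EB": g / d,                # Error Bound:        ||grad f||   >= mu dist(x, S)
        "QG": 2.0 * gap / (d * d),  # Quadratic Growth:   f - f* >= (mu / 2) dist(x, S)^2
    }


def gradient_descent(x, y, step=0.05, iters=30):
    """Fixed-step gradient descent; returns the list of cost values visited."""
    costs = [f(x, y)]
    for _ in range(iters):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y - step * gy
        costs.append(f(x, y))
    return costs


if __name__ == "__main__":
    print(condition_ratios(1.05, 0.0))
    costs = gradient_descent(0.9, 0.6)
    print(costs[0], costs[10], costs[-1])
```

Running the script prints the three ratios at a point close to the circle, followed by a few cost values along a gradient descent run. The cost shrinks by a roughly constant factor per iteration, which is the linear rate one expects under these conditions even though the Hessian at every minimizer is singular. Plain gradient descent is used here only as the simplest illustration; it is not one of the Newton-type methods the abstract analyzes.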
About the journal:
Mathematical Programming publishes original articles dealing with every aspect of mathematical optimization; that is, everything of direct or indirect use concerning the problem of optimizing a function of many variables, often subject to a set of constraints. This involves theoretical and computational issues as well as application studies. Included, along with the standard topics of linear, nonlinear, integer, conic, stochastic and combinatorial optimization, are techniques for formulating and applying mathematical programming models, convex, nonsmooth and variational analysis, the theory of polyhedra, variational inequalities, and control and game theory viewed from the perspective of mathematical programming.