Single-Projection Procedure for Infinite Dimensional Convex Optimization Problems
Hoa T. Bui, Regina S. Burachik, Evgeni A. Nurminski, Matthew K. Tam
SIAM Journal on Optimization, Volume 34, Issue 2, Pages 1646-1678, June 2024. DOI: 10.1137/22m1530173.
Abstract. We consider a class of convex optimization problems in a Hilbert space that can be solved by performing a single projection, i.e., by projecting an infeasible point onto the feasible set. Our results improve those established for the linear programming setting in Nurminski (2015) by considering problems that (i) may have multiple solutions, (ii) do not satisfy strict complementarity conditions, and (iii) possess nonlinear convex constraints. As a by-product of our analysis, we provide a quantitative estimate on the required distance between the infeasible point and the feasible set in order for its projection to be a solution of the problem. Our analysis relies on a “sharpness” property of the constraint set, a new property we introduce here.
Exact Augmented Lagrangian Duality for Mixed Integer Convex Optimization
Avinash Bhardwaj, Vishnu Narayanan, Abhishek Pathapati
SIAM Journal on Optimization, Volume 34, Issue 2, Pages 1622-1645, June 2024. DOI: 10.1137/22m1526204.
Abstract. The augmented Lagrangian dual augments the classical Lagrangian dual with a nonnegative nonlinear penalty function of the violation of the relaxed/dualized constraints in order to reduce the duality gap. We investigate the cases in which mixed integer convex optimization problems have an exact penalty representation using sharp augmenting functions (norms as augmenting penalty functions). We present a generalizable constructive proof technique for proving the existence of exact penalty representations for mixed integer convex programs under specific conditions using the associated value functions. This generalizes the recent results for mixed integer linear programming [M. J. Feizollahi, S. Ahmed, and A. Sun, Math. Program., 161 (2017), pp. 365–387] and mixed integer quadratic programming [X. Gu, S. Ahmed, and S. S. Dey, SIAM J. Optim., 30 (2020), pp. 781–797], while also providing an alternative proof for the aforementioned results along with a quantification of the finite penalty parameter in these cases.
Frugal Splitting Operators: Representation, Minimal Lifting, and Convergence
Martin Morin, Sebastian Banert, Pontus Giselsson
SIAM Journal on Optimization, Volume 34, Issue 2, Pages 1595-1621, June 2024. DOI: 10.1137/22m1531105.
Abstract. We investigate frugal splitting operators for finite sum monotone inclusion problems. These operators utilize exactly one direct or resolvent evaluation of each operator of the sum, and the splitting operator’s output is dictated by linear combinations of these evaluations’ inputs and outputs. To facilitate analysis, we introduce a novel representation of frugal splitting operators via a generalized primal-dual resolvent. The representation is characterized by an index and four matrices, and we provide conditions on these that ensure equivalence between the classes of frugal splitting operators and generalized primal-dual resolvents. Our representation paves the way for new results regarding lifting numbers and the development of a unified convergence analysis for frugal splitting operator methods, contingent on the directly evaluated operators being cocoercive. The minimal lifting number is [math] where [math] is the number of monotone operators and [math] is the number of direct evaluations in the splitting. Notably, this lifting number is achievable only if the first and last operator evaluations are resolvent evaluations. These results generalize the minimal lifting results by Ryu and by Malitsky and Tam that consider frugal resolvent splittings. Building on our representation, we delineate a constructive method to design frugal splitting operators, exemplified in the design of a novel, convergent, and parallelizable frugal splitting operator with minimal lifting.
Graph and Distributed Extensions of the Douglas–Rachford Method
Kristian Bredies, Enis Chenchene, Emanuele Naldi
SIAM Journal on Optimization, Volume 34, Issue 2, Pages 1569-1594, June 2024. DOI: 10.1137/22m1535097.
Abstract. In this paper, we propose several graph-based extensions of the Douglas–Rachford splitting (DRS) method to solve monotone inclusion problems involving the sum of [math] maximal monotone operators. Our construction is based on the choice of two nested graphs, to which we associate a generalization of the DRS algorithm that presents a prescribed structure. The resulting schemes can be understood as unconditionally stable frugal resolvent splitting methods with minimal lifting in the sense of Ryu [Math. Program., 182 (2020), pp. 233–273] as well as instances of the (degenerate) preconditioned proximal point method, which provides robust convergence guarantees. We further describe how the graph-based extensions of the DRS method can be leveraged to design new fully distributed protocols. Applications to a congested optimal transport problem and to distributed support vector machines show interesting connections with the underlying graph topology and highly competitive performance compared with state-of-the-art distributed optimization approaches.
On Enhanced KKT Optimality Conditions for Smooth Nonlinear Optimization
Roberto Andreani, María L. Schuverdt, Leonardo D. Secchin
SIAM Journal on Optimization, Volume 34, Issue 2, Pages 1515-1539, June 2024. DOI: 10.1137/22m1539678.
Abstract. The Fritz John (FJ) and Karush–Kuhn–Tucker (KKT) conditions are fundamental tools for characterizing minimizers and form the basis of almost all methods for constrained optimization. Since the seminal works of Fritz John, Karush, Kuhn, and Tucker, FJ/KKT conditions have been enhanced by adding extra necessary conditions. Such an extension was initially proposed by Hestenes in the 1970s and later extensively studied by Bertsekas and collaborators. In this work, we revisit enhanced KKT stationarity for standard (smooth) nonlinear programming. We argue that every KKT point satisfies the usual enhanced versions found in the literature. Therefore, enhanced KKT stationarity only concerns the Lagrange multipliers. We then analyze some properties of the corresponding multipliers under the quasi-normality constraint qualification (QNCQ), showing in particular that the set of so-called quasinormal multipliers is compact under QNCQ. Also, we report some consequences of introducing an extra abstract constraint to the problem. Given that enhanced FJ/KKT concepts are obtained by aggregating sequential conditions to FJ/KKT, we discuss the relevance of our findings with respect to the well-known sequential optimality conditions, which have been crucial in generalizing the global convergence of a well-established safeguarded augmented Lagrangian method. Finally, we apply our theory to mathematical programs with complementarity constraints and multiobjective problems, improving and elucidating previous results in the literature.
Weighted Geometric Mean, Minimum Mediated Set, and Optimal Simple Second-Order Cone Representation
Jie Wang
SIAM Journal on Optimization, Volume 34, Issue 2, Pages 1490-1514, June 2024. DOI: 10.1137/22m1531257.
Abstract. We study optimal simple second-order cone representations (a particular subclass of second-order cone representations) for weighted geometric means, which turn out to be closely related to minimum mediated sets. Several lower bounds and upper bounds on the size of optimal simple second-order cone representations are proved. In the case of bivariate weighted geometric means (equivalently, one-dimensional mediated sets), we are able to prove the exact size of an optimal simple second-order cone representation and give an algorithm to compute one. In the general case, fast heuristic algorithms and traversal algorithms are proposed to compute an approximately optimal simple second-order cone representation. Finally, applications to polynomial optimization, matrix optimization, and quantum information are provided.
A Copositive Framework for Analysis of Hybrid Ising-Classical Algorithms
Robin Brown, David E. Bernal Neira, Davide Venturelli, Marco Pavone
SIAM Journal on Optimization, Volume 34, Issue 2, Pages 1455-1489, June 2024. DOI: 10.1137/22m1514581.
Abstract. Recent years have seen significant advances in quantum/quantum-inspired technologies capable of approximately searching for the ground state of Ising spin Hamiltonians. The promise of leveraging such technologies to accelerate the solution of difficult optimization problems has spurred an increased interest in exploring methods to integrate Ising problems as part of their solution process, with existing approaches ranging from direct transcription to hybrid quantum-classical approaches rooted in existing optimization algorithms. While it is widely acknowledged that quantum computers should augment classical computers, rather than replace them entirely, comparatively little attention has been directed toward deriving analytical characterizations of their interactions. In this paper, we present a formal analysis of hybrid algorithms in the context of solving mixed-binary quadratic programs (MBQP) via Ising solvers. By leveraging an existing completely positive reformulation of MBQPs, as well as a new strong-duality result, we show the exactness of the dual problem over the cone of copositive matrices, thus allowing the resulting reformulation to inherit the straightforward analysis of convex optimization. We propose to solve this reformulation with a hybrid quantum-classical cutting-plane algorithm. Using existing complexity results for convex cutting-plane algorithms, we deduce that the classical portion of this hybrid framework is guaranteed to be polynomial time. This suggests that when applied to NP-hard problems, the complexity of the solution is shifted onto the subroutine handled by the Ising solver.
Benign Landscapes of Low-Dimensional Relaxations for Orthogonal Synchronization on General Graphs
Andrew D. McRae, Nicolas Boumal
SIAM Journal on Optimization, Volume 34, Issue 2, Pages 1427-1454, June 2024. DOI: 10.1137/23m1584642.
Abstract. Orthogonal group synchronization is the problem of estimating [math] elements [math] from the [math] orthogonal group given some relative measurements [math]. The least-squares formulation is nonconvex. To avoid its local minima, a Shor-type convex relaxation squares the dimension of the optimization problem from [math] to [math]. Alternatively, Burer–Monteiro-type nonconvex relaxations have generic landscape guarantees at dimension [math]. For smaller relaxations, the problem structure matters. It has been observed in the robotics literature that, for simultaneous localization and mapping problems, it seems sufficient to increase the dimension by a small constant multiple over the original. We partially explain this. This also has implications for Kuramoto oscillators. Specifically, we minimize the least-squares cost function in terms of estimators [math]. For [math], each [math] is relaxed to the Stiefel manifold [math] of [math] matrices with orthonormal rows. The available measurements implicitly define a (connected) graph [math] on [math] vertices. In the noiseless case, we show that, for all connected graphs [math], second-order critical points are globally optimal as soon as [math]. (This implies that Kuramoto oscillators on [math] synchronize for all [math].) This result is the best possible for general graphs; the previous best known result requires [math]. For [math], our result is robust to modest amounts of noise (depending on [math] and [math]). Our proof uses a novel randomized choice of tangent direction to prove (near-)optimality of second-order critical points. Finally, we partially extend our noiseless landscape results to the complex case (unitary group); we show that there are no spurious local minima when [math].
A Gradient Complexity Analysis for Minimizing the Sum of Strongly Convex Functions with Varying Condition Numbers
Nuozhou Wang, Shuzhong Zhang
SIAM Journal on Optimization, Volume 34, Issue 2, Pages 1374-1401, June 2024. DOI: 10.1137/22m1503646.
Abstract. A popular approach to minimizing a finite sum of smooth convex functions is stochastic gradient descent (SGD) and its variants. Fundamental research questions associated with SGD include (i) how to find a lower bound on the number of times that the gradient oracle of each individual function must be assessed in order to find an [math]-minimizer of the overall objective; (ii) how to design algorithms that are guaranteed to find an [math]-minimizer of the overall objective in expectation while assessing the gradient oracle of each function no more than a certain number of times (in terms of [math]), i.e., an upper bound. If these two bounds are of the same order of magnitude, then the algorithms may be called optimal. Most existing results along this line of research typically assume that the functions in the objective share the same condition number. In this paper, the first model we study is the problem of minimizing the sum of finitely many strongly convex functions whose condition numbers are all different. We propose an SGD-based method for this model and show that it is optimal in gradient computations, up to a logarithmic factor. We then consider a constrained separate block optimization model and present lower and upper bounds for its gradient computation complexity. Next, we propose solving the Fenchel dual of the constrained block optimization model via the generalized SSNM introduced earlier in the paper, and show that it yields a lower iteration complexity than solving the original model by the ADMM-type approach. Finally, we extend the analysis to the general composite convex optimization model and obtain gradient-computation complexity results under certain conditions.
Stochastic Differential Equations for Modeling First Order Optimization Methods
M. Dambrine, Ch. Dossal, B. Puig, A. Rondepierre
SIAM Journal on Optimization, Volume 34, Issue 2, Pages 1402-1426, June 2024. DOI: 10.1137/21m1435665.
Abstract. In this article, a family of SDEs is derived as a tool to understand the behavior of numerical optimization methods under random evaluations of the gradient. Our objective is to transpose to the stochastic setting the approach of introducing continuous ODE versions of discrete optimization schemes in order to understand their asymptotic behavior. We consider a continuous version of the stochastic gradient scheme and of a stochastic inertial system. This article first studies the quality of the approximation of the discrete scheme by an SDE as the step size tends to 0. It then presents new asymptotic bounds on the values [math], where [math] is a solution of the SDE and [math], when [math] is convex and under integrability conditions on the noise. Results are provided under two sets of hypotheses: first considering [math] and convex functions, and then adding some geometrical properties of [math]. All of these results provide insight into the behavior of these inertial and perturbed algorithms in the setting of stochastic algorithms.