A Copositive Framework for Analysis of Hybrid Ising-Classical Algorithms
Robin Brown, David E. Bernal Neira, Davide Venturelli, Marco Pavone
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1455-1489, June 2024. DOI: 10.1137/22m1514581
Abstract. Recent years have seen significant advances in quantum/quantum-inspired technologies capable of approximately searching for the ground state of Ising spin Hamiltonians. The promise of leveraging such technologies to accelerate the solution of difficult optimization problems has spurred increased interest in methods that integrate Ising problems into the solution process, with existing approaches ranging from direct transcription to hybrid quantum-classical approaches rooted in existing optimization algorithms. While it is widely acknowledged that quantum computers should augment classical computers rather than replace them entirely, comparatively little attention has been directed toward deriving analytical characterizations of their interactions. In this paper, we present a formal analysis of hybrid algorithms in the context of solving mixed-binary quadratic programs (MBQPs) via Ising solvers. By leveraging an existing completely positive reformulation of MBQPs, as well as a new strong-duality result, we show the exactness of the dual problem over the cone of copositive matrices, allowing the resulting reformulation to inherit the straightforward analysis of convex optimization. We propose to solve this reformulation with a hybrid quantum-classical cutting-plane algorithm. Using existing complexity results for convex cutting-plane algorithms, we deduce that the classical portion of this hybrid framework is guaranteed to run in polynomial time. This suggests that, when applied to NP-hard problems, the complexity of the solution is shifted onto the subroutine handled by the Ising solver.
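The division of labor the abstract describes can be pictured with a toy: a polynomial-time classical outer loop (here a bisection over a scalar shift, standing in for the cutting-plane method) repeatedly queries an Ising-type oracle for the NP-hard separation step. This is a minimal sketch, not the paper's actual formulation; `ising_min` is a brute-force stand-in for an Ising solver, and the shift parametrization `Q - t*J` is purely illustrative.

```python
import itertools
import numpy as np

def ising_min(M):
    """Brute-force stand-in for an Ising solver: minimize x^T M x over
    nonzero binary x. In the hybrid framework, this NP-hard subroutine is
    the part delegated to quantum/quantum-inspired hardware."""
    n = M.shape[0]
    best_val, best_x = np.inf, None
    for bits in itertools.product([0, 1], repeat=n):
        if not any(bits):
            continue
        x = np.array(bits, dtype=float)
        val = x @ M @ x
        if val < best_val:
            best_val, best_x = val, x
    return best_val, best_x

def max_shift(Q, tol=1e-8):
    """Classical outer loop: largest t such that Q - t*J still passes the
    binary nonnegativity test (J = all-ones). The classical work is a plain
    bisection, i.e., polynomial time; all hardness sits inside ising_min."""
    n = Q.shape[0]
    J = np.ones((n, n))
    lo, hi = -np.abs(Q).sum(), np.abs(Q).sum()
    while hi - lo > tol:
        t = 0.5 * (lo + hi)
        val, _ = ising_min(Q - t * J)
        lo, hi = (t, hi) if val >= 0 else (lo, t)
    return lo

Q = np.array([[2.0, -1.0], [-1.0, 2.0]])
print(max_shift(Q))  # ~0.5 for this toy instance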
Benign Landscapes of Low-Dimensional Relaxations for Orthogonal Synchronization on General Graphs
Andrew D. McRae, Nicolas Boumal
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1427-1454, June 2024. DOI: 10.1137/23m1584642
Abstract. Orthogonal group synchronization is the problem of estimating [math] elements [math] from the [math] orthogonal group given some relative measurements [math]. The least-squares formulation is nonconvex. To avoid its local minima, a Shor-type convex relaxation squares the dimension of the optimization problem from [math] to [math]. Alternatively, Burer–Monteiro-type nonconvex relaxations have generic landscape guarantees at dimension [math]. For smaller relaxations, the problem structure matters. It has been observed in the robotics literature that, for simultaneous localization and mapping problems, it seems sufficient to increase the dimension by a small constant multiple over the original. We partially explain this. This also has implications for Kuramoto oscillators. Specifically, we minimize the least-squares cost function in terms of estimators [math]. For [math], each [math] is relaxed to the Stiefel manifold [math] of [math] matrices with orthonormal rows. The available measurements implicitly define a (connected) graph [math] on [math] vertices. In the noiseless case, we show that, for all connected graphs [math], second-order critical points are globally optimal as soon as [math]. (This implies that Kuramoto oscillators on [math] synchronize for all [math].) This result is the best possible for general graphs; the previous best known result requires [math]. For [math], our result is robust to modest amounts of noise (depending on [math] and [math]). Our proof uses a novel randomized choice of tangent direction to prove (near-)optimality of second-order critical points. Finally, we partially extend our noiseless landscape results to the complex case (unitary group); we show that there are no spurious local minima when [math].
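The least-squares problem studied here is easy to set up numerically. The sketch below is an assumed toy instantiation (the function `sync_stiefel`, the triangle graph, and the projected-gradient scheme are illustrative, not the paper's method): each estimator is a d x p matrix with orthonormal rows, with p > d giving the low-dimensional relaxation whose landscape the paper analyzes.

```python
import numpy as np

def polar_retract(Y):
    """Nearest matrix with orthonormal rows (polar factor)."""
    U, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return U @ Vt

def sync_stiefel(R, edges, n, d, p, iters=300, lr=0.05):
    """Minimize sum_{(i,j) in E} ||Y_i - R[(i,j)] @ Y_j||_F^2 over d x p
    matrices Y_i with orthonormal rows (p >= d), by projected gradient."""
    rng = np.random.default_rng(0)
    Y = [polar_retract(rng.standard_normal((d, p))) for _ in range(n)]
    for _ in range(iters):
        G = [np.zeros((d, p)) for _ in range(n)]
        for (i, j) in edges:
            diff = Y[i] - R[(i, j)] @ Y[j]
            G[i] += 2.0 * diff
            G[j] -= 2.0 * R[(i, j)].T @ diff
        Y = [polar_retract(Y[k] - lr * G[k]) for k in range(n)]
    return Y

# Noiseless toy instance on a triangle: R_ij = R_i R_j^T with R_k orthogonal
rng = np.random.default_rng(1)
d, p, n = 3, 5, 3
truth = [polar_retract(rng.standard_normal((d, d))) for _ in range(n)]
edges = [(0, 1), (1, 2), (2, 0)]
R = {e: truth[e[0]] @ truth[e[1]].T for e in edges}
Y = sync_stiefel(R, edges, n, d, p)
print(sum(np.linalg.norm(Y[i] - R[(i, j)] @ Y[j]) for (i, j) in edges))  # ~0 at a global optimum
```

The paper's landscape result says that, for p large enough relative to d, any second-order critical point such a local method finds in the noiseless case is in fact globally optimal.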
A Gradient Complexity Analysis for Minimizing the Sum of Strongly Convex Functions with Varying Condition Numbers
Nuozhou Wang, Shuzhong Zhang
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1374-1401, June 2024. DOI: 10.1137/22m1503646
Abstract. A popular approach to minimizing a finite sum of smooth convex functions is stochastic gradient descent (SGD) and its variants. Fundamental research questions associated with SGD include (i) finding a lower bound on the number of times that the gradient oracle of each individual function must be assessed in order to find an [math]-minimizer of the overall objective, and (ii) designing algorithms guaranteed to find an [math]-minimizer of the overall objective in expectation using no more than a certain number (in terms of [math]) of gradient-oracle assessments per function (i.e., an upper bound). If these two bounds are of the same order of magnitude, the algorithms may be called optimal. Most existing results along this line of research assume that the functions in the objective share the same condition number. The first model we study in this paper is the problem of minimizing the sum of finitely many strongly convex functions whose condition numbers are all different. We propose an SGD-based method for this model and show that it is optimal in gradient computations, up to a logarithmic factor. We then consider a constrained separate block optimization model and present lower and upper bounds for its gradient-computation complexity. Next, we propose solving the Fenchel dual of the constrained block optimization model via the generalized SSNM introduced earlier in the paper, and show that it yields a lower iteration complexity than solving the original model by an ADMM-type approach. Finally, we extend the analysis to the general composite convex optimization model and obtain gradient-computation complexity results under certain conditions.
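The setting of summands with different condition numbers is easiest to see on one-dimensional quadratics. The following is a hedged sketch of plain SGD with the classical importance-sampling device (draw component i with probability proportional to its smoothness constant and rescale for unbiasedness); it illustrates the problem class, not the paper's optimal method.

```python
import numpy as np

rng = np.random.default_rng(1)
a = np.array([1.0, 10.0, 100.0])       # three very different curvatures
b = np.array([0.0, 1.0, -1.0])
n = len(a)
# f(x) = (1/n) * sum_i 0.5 * a_i * (x - b_i)^2; each f_i is a_i-smooth
# and a_i-strongly convex, so the a_i play the role of varying condition data.
x_star = np.sum(a * b) / np.sum(a)     # exact minimizer, for reference

probs = a / a.sum()                    # sample summands proportionally to smoothness
x = 0.0
for k in range(20000):
    i = rng.choice(n, p=probs)
    g = a[i] * (x - b[i]) / (n * probs[i])    # unbiased estimate of f'(x)
    x -= g / (a.max() * np.sqrt(k + 1.0))     # decaying step size
print(x, x_star)                              # the iterate approaches x_star
```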
Stochastic Differential Equations for Modeling First Order Optimization Methods
M. Dambrine, Ch. Dossal, B. Puig, A. Rondepierre
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1402-1426, June 2024. DOI: 10.1137/21m1435665
Abstract. In this article, a family of SDEs is derived as a tool to understand the behavior of numerical optimization methods under random evaluations of the gradient. Our objective is to transpose to the stochastic setting the well-established approach of studying the asymptotic behavior of discrete optimization schemes through continuous-time ODE counterparts. We consider continuous versions of the stochastic gradient scheme and of a stochastic inertial system. The article first studies the quality of the approximation of the discrete scheme by an SDE as the step size tends to 0. It then presents new asymptotic bounds on the values [math], where [math] is a solution of the SDE and [math], when [math] is convex and under integrability conditions on the noise. Results are provided under two sets of hypotheses: first considering [math] and convex functions, and then adding some geometrical properties of [math]. All of these results provide insight on the behavior of these inertial and perturbed algorithms in the setting of stochastic algorithms.
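The correspondence in question can be simulated directly: SGD with small steps behaves like the Euler–Maruyama discretization of a gradient-flow SDE. A minimal sketch under assumed dynamics (quadratic objective, additive noise; the paper treats general convex functions and inertial systems as well):

```python
import numpy as np

rng = np.random.default_rng(0)
grad_f = lambda x: x                  # f(x) = x^2/2, convex with min f = 0
sigma, h, T = 0.5, 1e-3, 5.0

# Euler–Maruyama discretization of dX = -grad_f(X) dt + sigma dW,
# the continuous-time model of the stochastic gradient scheme.
X = np.full(2000, 3.0)                # 2000 independent sample paths
for _ in range(int(T / h)):
    X += -grad_f(X) * h + sigma * np.sqrt(h) * rng.standard_normal(X.shape)

# For this quadratic, E[f(X_t)] stabilizes near sigma^2/4 = 0.0625, the
# "noise floor" of the kind the abstract's asymptotic bounds control.
print(np.mean(0.5 * X**2))
```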
Certifying Optimality of Bell Inequality Violations: Noncommutative Polynomial Optimization through Semidefinite Programming and Local Optimization
Timotej Hrga, Igor Klep, Janez Povh
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1341-1373, June 2024. DOI: 10.1137/22m1473340
Abstract. Bell inequalities are pillars of quantum physics in that their violations imply that certain properties of quantum physics (e.g., entanglement) cannot be represented by any classical picture of physics. In this article, Bell inequalities and their violations are considered through the lens of noncommutative polynomial optimization. Optimality of these violations is certified for a large majority of a set of standard Bell inequalities, denoted A2–A89 in the literature. The main techniques used in the paper include the NPA hierarchy, i.e., the noncommutative version of the Lasserre semidefinite programming (SDP) hierarchies based on the Helton–McCullough Positivstellensatz; the Gelfand–Naimark–Segal (GNS) construction, with a novel use of Artin–Wedderburn theory for rounding and projecting; and nonlinear programming (NLP). A new "Newton chip"-like technique for reducing the sizes of the SDPs arising in the constructed polynomial optimization problems is presented. This technique is based on conditional expectations. Finally, noncommutative Gröbner bases are exploited to certify when an optimizer (a solution yielding the optimum violation) cannot be extracted from a dual SDP solution.
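The first level of the NPA hierarchy is small enough to write out completely for the most familiar Bell inequality, CHSH (used here for concreteness; the paper certifies a much larger family). This sketch assumes cvxpy with an SDP-capable solver is installed:

```python
import cvxpy as cp
import numpy as np

# Level-1 NPA moment matrix for CHSH, indexed by (1, A1, A2, B1, B2), where
# A1, A2 and B1, B2 are +/-1-valued observables of the two parties. The
# relaxation: M is PSD with unit diagonal; entries like M[1, 3] model <A1 B1>.
M = cp.Variable((5, 5), symmetric=True)
constraints = [M >> 0, cp.diag(M) == 1]
chsh = M[1, 3] + M[1, 4] + M[2, 3] - M[2, 4]   # <A1B1> + <A1B2> + <A2B1> - <A2B2>
cp.Problem(cp.Maximize(chsh), constraints).solve()
print(chsh.value, 2 * np.sqrt(2))   # level 1 is already tight for CHSH: Tsirelson's bound
```

For harder inequalities, higher hierarchy levels (and the size-reduction and extraction techniques the abstract describes) become necessary.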
Generalized Power Cones: Optimal Error Bounds and Automorphisms
Ying Lin, Scott B. Lindstrom, Bruno F. Lourenço, Ting Kei Pong
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1316-1340, June 2024. DOI: 10.1137/22m1542921
Abstract. Error bounds are a requisite for trusting or distrusting solutions in an informed way. Until recently, provable error bounds in the absence of constraint qualifications were unattainable for many classes of cones that do not admit projections with known succinct expressions. We build such error bounds for the generalized power cones, using the recently developed framework of one-step facial residual functions. We also show that our error bounds are tight in the sense of that framework. Besides their utility for understanding solution reliability, the error bounds we discover have additional applications to the algebraic structure of the underlying cone, which we describe. In particular, we use the error bounds to compute the automorphisms of the generalized power cones, and to identify a set of generalized power cones that are self-dual, irreducible, nonhomogeneous, and perfect.
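For reference, the cone in question admits a one-line membership test, which is also the natural starting point for writing down residuals. A sketch using the standard definition of the generalized power cone (the paper's error bounds concern points near, but outside, the cone):

```python
import numpy as np

def in_gen_power_cone(x, z, alpha, tol=1e-12):
    """Membership in P(alpha) = {(x, z) : x >= 0, ||z||_2 <= prod_i x_i^alpha_i},
    where alpha > 0 and sum(alpha) == 1 (standard definition)."""
    x = np.asarray(x, float)
    z = np.asarray(z, float)
    alpha = np.asarray(alpha, float)
    if (x < -tol).any():
        return False
    return np.linalg.norm(z) <= np.prod(x ** alpha) + tol

# Example: the 3-dimensional power cone with alpha = (0.5, 0.5)
print(in_gen_power_cone([1.0, 1.0], [0.9], [0.5, 0.5]))   # True
print(in_gen_power_cone([1.0, 1.0], [1.1], [0.5, 0.5]))   # False
```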
Parabolic Optimal Control Problems with Combinatorial Switching Constraints, Part II: Outer Approximation Algorithm
Christoph Buchheim, Alexandra Grütering, Christian Meyer
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1295-1315, June 2024. DOI: 10.1137/22m1490284
Abstract. We consider optimal control problems for partial differential equations where the controls take binary values but vary over the time horizon; they can thus be seen as dynamic switches. The switching patterns may be subject to combinatorial constraints, such as an upper bound on the total number of switchings or a lower bound on the time between two switchings. In a companion paper [C. Buchheim, A. Grütering, and C. Meyer, SIAM J. Optim., arXiv:2203.07121, 2024], we describe the [math]-closure of the convex hull of feasible switching patterns as the intersection of convex sets derived from finite-dimensional projections. In this paper, the resulting outer description is used to construct an outer approximation algorithm in function space, whose iterates are proven to converge strongly in [math] to the global minimizer of the convexified optimal control problem. The linear-quadratic subproblems arising in each iteration of the outer approximation algorithm are solved by means of a semismooth Newton method. A numerical example in two spatial dimensions illustrates the efficiency of the overall algorithm.
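The semismooth Newton subproblem solver mentioned at the end has a compact finite-dimensional analogue: solving a box-constrained quadratic program by applying Newton's method to the nonsmooth fixed-point equation of the projected-gradient map. A sketch under that simplification (the paper's version operates in function space on the PDE-constrained subproblems):

```python
import numpy as np

def semismooth_newton_box(A, b, lo=0.0, hi=1.0, iters=50, tol=1e-12):
    """Solve min 0.5 x'Ax - b'x s.t. lo <= x <= hi (A SPD) by semismooth
    Newton on the projected-gradient equation F(x) = x - clip(x - (Ax - b)) = 0."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(iters):
        g = A @ x - b
        y = x - g
        F = x - np.clip(y, lo, hi)
        if np.linalg.norm(F) < tol:
            break
        inactive = (y > lo) & (y < hi)     # one valid generalized derivative of clip
        J = np.eye(n)
        J[inactive, :] = A[inactive, :]    # on inactive rows F reduces to the gradient
        x = x - np.linalg.solve(J, F)
    return x

A = np.array([[2.0, -1.0], [-1.0, 2.0]])
b = np.array([3.0, -1.0])
print(semismooth_newton_box(A, b))   # [1. 0.]: upper bound active in the first coordinate
```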
Decomposition Methods for Global Solution of Mixed-Integer Linear Programs
Kaizhao Sun, Mou Sun, Wotao Yin
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1206-1235, June 2024. DOI: 10.1137/22m1487321
Abstract. This paper introduces two decomposition-based methods for two-block mixed-integer linear programs (MILPs), which aim to take advantage of separable structures of the original problem by solving a sequence of lower-dimensional MILPs. The first method is based on the [math]-augmented Lagrangian method, and the second one is based on a modified alternating direction method of multipliers. In the presence of certain block-angular structures, both methods create parallel subproblems in one block of variables and add nonconvex cuts to update the other block; they converge to globally optimal solutions of the original MILP under proper conditions. Numerical experiments on three classes of MILPs demonstrate the advantages of the proposed methods on structured problems over the state-of-the-art MILP solvers.
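The second method builds on the classical two-block ADMM template, which for a consensus-form problem min f(x) + g(z) s.t. x = z alternates two independent block solves and a dual update. Below is a sketch of the unmodified convex template on quadratics; the paper's contribution is the nonconvex cuts that make the scheme reach global MILP optima, which this sketch does not do.

```python
import numpy as np

def admm_consensus(P, p, Q, q, rho=1.0, iters=200):
    """Classical two-block ADMM for min 0.5 x'Px + p'x + 0.5 z'Qz + q'z
    subject to x = z (P, Q symmetric positive definite)."""
    n = len(p)
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)   # u is the scaled dual
    I = np.eye(n)
    for _ in range(iters):
        x = np.linalg.solve(P + rho * I, rho * (z - u) - p)   # block 1
        z = np.linalg.solve(Q + rho * I, rho * (x + u) - q)   # block 2
        u = u + x - z                                         # dual update
    return x, z

P, p = np.diag([1.0, 2.0]), np.array([-1.0, 0.0])
Q, q = np.diag([3.0, 1.0]), np.array([0.0, 2.0])
x, z = admm_consensus(P, p, Q, q)
print(x, z)   # both approach the coupled minimizer solve(P + Q, -(p + q))
```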
Nonasymptotic Upper Estimates for Errors of the Sample Average Approximation Method to Solve Risk-Averse Stochastic Programs
Volker Krätschmer
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1264-1294, June 2024. DOI: 10.1137/22m1535425
Abstract. We study statistical properties of the optimal value of the sample average approximation (SAA). The focus is on the tail function of the absolute error induced by the SAA, for which we derive upper estimates dependent on the sample size. These estimates immediately yield convergence rates for the optimal value of the SAA. Crucially, the investigations are based on new types of conditions from the theory of empirical processes which do not rely on pathwise analytical properties of the goal functions. In particular, continuity in the parameter is not imposed in advance, as is often done in the literature on the SAA method. It is also shown that the new condition is satisfied if the paths of the goal functions are Hölder continuous, so the main results carry over in this case. Moreover, the main results are applied to goal functions whose paths are piecewise Hölder continuous, as, e.g., in two-stage mixed-integer programs. The main results are shown for classical risk-neutral stochastic programs, but we also demonstrate how to apply them to the sample average approximation of risk-averse stochastic programs. In this respect, we consider stochastic programs expressed in terms of mean upper semideviations and divergence risk measures.
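The SAA scheme under study replaces an expectation by a sample average and minimizes that instead; the tail function concerns the absolute error between the two optimal values. A standard textbook example (newsvendor, not from the paper) makes the sample-size dependence visible, since both the SAA problem and the true problem solve in closed form via quantiles:

```python
import numpy as np

rng = np.random.default_rng(0)
c, r = 1.0, 3.0                        # unit cost and unit revenue
crit = (r - c) / r                     # critical ratio = optimal demand quantile

def saa_value(N):
    """Optimal value of one SAA instance of the newsvendor problem
    min_x E[c*x - r*min(x, D)], demand D ~ Exp(mean 10)."""
    D = rng.exponential(scale=10.0, size=N)
    x_hat = np.quantile(D, crit)       # closed-form SAA minimizer
    return c * x_hat - r * np.minimum(x_hat, D).mean()

x_star = 10.0 * np.log(3.0)                    # true minimizer (quantile of Exp(10))
v_star = c * x_star - r * 10.0 * (2.0 / 3.0)   # true optimal value
for N in (10, 100, 1000, 10000):
    err = np.mean([abs(saa_value(N) - v_star) for _ in range(200)])
    print(N, err)                      # mean absolute error decays roughly ~ 1/sqrt(N)
```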
Accelerated Forward-Backward Optimization Using Deep Learning
Sebastian Banert, Jevgenija Rudzusika, Ozan Öktem, Jonas Adler
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1236-1263, June 2024. DOI: 10.1137/22m1532548
Abstract. We propose several deep-learning accelerated optimization solvers with convergence guarantees. We use ideas from the analysis of accelerated forward-backward schemes like FISTA, but instead of the classical approach of proving convergence for a choice of parameters, such as a step size, we show convergence whenever the update is chosen in a specific set. Rather than picking a point in this set using some predefined method, we train a deep neural network to pick the best update within a given space. Finally, we show that the method is applicable to several cases of smooth and nonsmooth optimization and show results superior to established accelerated solvers.
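The safeguarding idea is easy to state in code: accept a learned proposal only if it passes a test under which the convergence proof goes through, and otherwise fall back to the classical forward-backward step. The sketch below uses a simple sufficient-decrease acceptance rule and a placeholder "network" (`proposal` is a hypothetical callable); the paper's admissible set and trained network are more refined than this.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def safeguarded_fb(A, b, lam, proposal, iters=200):
    """Forward-backward for min 0.5*||Ax - b||^2 + lam*||x||_1, where a
    learned proposal is accepted only if it does at least as well as the
    classical step -- an acceptance rule that preserves monotone descent."""
    L = np.linalg.norm(A, 2) ** 2               # Lipschitz constant of the smooth gradient
    f = lambda x: 0.5 * np.sum((A @ x - b) ** 2) + lam * np.abs(x).sum()
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        g = A.T @ (A @ x - b)
        fb = soft_threshold(x - g / L, lam / L)  # classical forward-backward step
        cand = proposal(x, g)                    # e.g., output of a trained network
        x = cand if f(cand) <= f(fb) else fb
    return x, f(x)

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))
b = rng.standard_normal(20)
# Placeholder proposal: an aggressive prox step (untrained, purely illustrative)
step = 2.0 / np.linalg.norm(A, 2) ** 2
x, val = safeguarded_fb(A, b, lam=0.1,
                        proposal=lambda x, g: soft_threshold(x - step * g, 0.1 * step))
print(val)
```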