Certifying Optimality of Bell Inequality Violations: Noncommutative Polynomial Optimization through Semidefinite Programming and Local Optimization
Timotej Hrga, Igor Klep, Janez Povh
DOI: https://doi.org/10.1137/22m1473340
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1341-1373, June 2024. Abstract. Bell inequalities are pillars of quantum physics in that their violations imply that certain properties of quantum physics (e.g., entanglement) cannot be represented by any classical picture of physics. In this article Bell inequalities and their violations are considered through the lens of noncommutative polynomial optimization. Optimality of these violations is certified for a large majority of a set of standard Bell inequalities, denoted A2–A89 in the literature. The main techniques used in the paper include the NPA hierarchy, i.e., the noncommutative version of the Lasserre semidefinite programming (SDP) hierarchies based on the Helton–McCullough Positivstellensatz, the Gelfand–Naimark–Segal (GNS) construction with a novel use of the Artin–Wedderburn theory for rounding and projecting, and nonlinear programming (NLP). A new “Newton chip”-like technique for reducing sizes of SDPs arising in the constructed polynomial optimization problems is presented. This technique is based on conditional expectations. Finally, noncommutative Gröbner bases are exploited to certify when an optimizer (a solution yielding optimum violation) cannot be extracted from a dual SDP solution.
Generalized Power Cones: Optimal Error Bounds and Automorphisms
Ying Lin, Scott B. Lindstrom, Bruno F. Lourenço, Ting Kei Pong
DOI: https://doi.org/10.1137/22m1542921
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1316-1340, June 2024. Abstract. Error bounds are a requisite for trusting or distrusting solutions in an informed way. Until recently, provable error bounds in the absence of constraint qualifications were unattainable for many classes of cones that do not admit projections with known succinct expressions. We build such error bounds for the generalized power cones, using the recently developed framework of one-step facial residual functions. We also show that our error bounds are tight in the sense of that framework. Besides their utility for understanding solution reliability, the error bounds we discover have additional applications to the algebraic structure of the underlying cone, which we describe. In particular, we use the error bounds to compute the automorphisms of the generalized power cones, and to identify a set of generalized power cones that are self-dual, irreducible, nonhomogeneous, and perfect.
Parabolic Optimal Control Problems with Combinatorial Switching Constraints, Part II: Outer Approximation Algorithm
Christoph Buchheim, Alexandra Grütering, Christian Meyer
DOI: https://doi.org/10.1137/22m1490284
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1295-1315, June 2024. Abstract. We consider optimal control problems for partial differential equations where the controls take binary values but vary over the time horizon; they can thus be seen as dynamic switches. The switching patterns may be subject to combinatorial constraints such as, e.g., an upper bound on the total number of switchings or a lower bound on the time between two switchings. In a companion paper [C. Buchheim, A. Grütering, and C. Meyer, SIAM J. Optim., arXiv:2203.07121, 2024], we describe the [math]-closure of the convex hull of feasible switching patterns as the intersection of convex sets derived from finite-dimensional projections. In this paper, the resulting outer description is used for the construction of an outer approximation algorithm in function space, whose iterates are proven to converge strongly in [math] to the global minimizer of the convexified optimal control problem. The linear-quadratic subproblems arising in each iteration of the outer approximation algorithm are solved by means of a semismooth Newton method. A numerical example in two spatial dimensions illustrates the efficiency of the overall algorithm.
Decomposition Methods for Global Solution of Mixed-Integer Linear Programs
Kaizhao Sun, Mou Sun, Wotao Yin
DOI: https://doi.org/10.1137/22m1487321
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1206-1235, June 2024. Abstract. This paper introduces two decomposition-based methods for two-block mixed-integer linear programs (MILPs), which aim to take advantage of separable structures of the original problem by solving a sequence of lower-dimensional MILPs. The first method is based on the [math]-augmented Lagrangian method, and the second one is based on a modified alternating direction method of multipliers. In the presence of certain block-angular structures, both methods create parallel subproblems in one block of variables and add nonconvex cuts to update the other block; they converge to globally optimal solutions of the original MILP under proper conditions. Numerical experiments on three classes of MILPs demonstrate the advantages of the proposed methods on structured problems over the state-of-the-art MILP solvers.
Nonasymptotic Upper Estimates for Errors of the Sample Average Approximation Method to Solve Risk-Averse Stochastic Programs
Volker Krätschmer
DOI: https://doi.org/10.1137/22m1535425
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1264-1294, June 2024. Abstract. We study statistical properties of the optimal value of the sample average approximation (SAA). The focus is on the tail function of the absolute error induced by the SAA, deriving upper estimates of its outcomes dependent on the sample size. The estimates allow one to immediately conclude convergence rates for the optimal value of the SAA. As a crucial point, the investigations are based on new types of conditions from the theory of empirical processes which do not rely on pathwise analytical properties of the goal functions. In particular, continuity in the parameter is not imposed in advance, as is often done in the literature on the SAA method. It is also shown that the new condition is satisfied if the paths of the goal functions are Hölder continuous, so that the main results carry over in this case. Moreover, the main results are applied to goal functions whose paths are piecewise Hölder continuous as, e.g., in two-stage mixed-integer programs. The main results are shown for classical risk-neutral stochastic programs, but we also demonstrate how to apply them to the sample average approximation of risk-averse stochastic programs. In this respect, we consider stochastic programs expressed in terms of mean upper semideviations and divergence risk measures.
Accelerated Forward-Backward Optimization Using Deep Learning
Sebastian Banert, Jevgenija Rudzusika, Ozan Öktem, Jonas Adler
DOI: https://doi.org/10.1137/22m1532548
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1236-1263, June 2024. Abstract. We propose several deep-learning accelerated optimization solvers with convergence guarantees. We use ideas from the analysis of accelerated forward-backward schemes like FISTA, but instead of the classical approach of proving convergence for a choice of parameters, such as a step-size, we show convergence whenever the update is chosen in a specific set. Rather than picking a point in this set using some predefined method, we train a deep neural network to pick the best update within a given space. Finally, we show that the method is applicable to several cases of smooth and nonsmooth optimization and show superior results to established accelerated solvers.
Parabolic Optimal Control Problems with Combinatorial Switching Constraints, Part I: Convex Relaxations
Christoph Buchheim, Alexandra Grütering, Christian Meyer
DOI: https://doi.org/10.1137/22m1490260
SIAM Journal on Optimization, Volume 34, Issue 2, Page 1187-1205, June 2024. Abstract. We consider optimal control problems for partial differential equations where the controls take binary values but vary over the time horizon; they can thus be seen as dynamic switches. The switching patterns may be subject to combinatorial constraints such as, e.g., an upper bound on the total number of switchings or a lower bound on the time between two switchings. While such combinatorial constraints are often seen as an additional complication that is treated in a heuristic postprocessing, the core of our approach is to investigate the convex hull of all feasible switching patterns in order to define a tight convex relaxation of the control problem. The convex relaxation is built by cutting planes derived from finite-dimensional projections, which can be studied by means of polyhedral combinatorics. A numerical example for the case of a bounded number of switchings shows that our approach can significantly improve the dual bounds given by the straightforward continuous relaxation, which is obtained by relaxing binarity constraints.
A Semismooth Newton Stochastic Proximal Point Algorithm with Variance Reduction
Andre Milzarek, Fabian Schaipp, Michael Ulbrich
DOI: https://doi.org/10.1137/22m1488181
SIAM Journal on Optimization, Volume 34, Issue 1, Page 1157-1185, March 2024. Abstract. We develop an implementable stochastic proximal point (SPP) method for a class of weakly convex, composite optimization problems. The proposed stochastic proximal point algorithm incorporates a variance reduction mechanism and the resulting SPP updates are solved using an inexact semismooth Newton framework. We establish detailed convergence results that take the inexactness of the SPP steps into account and that are in accordance with existing convergence guarantees of (proximal) stochastic variance-reduced gradient methods. Numerical experiments show that the proposed algorithm competes favorably with other state-of-the-art methods and achieves higher robustness with respect to the step size selection.
Provably Accelerated Decentralized Gradient Methods Over Unbalanced Directed Graphs
Zhuoqing Song, Lei Shi, Shi Pu, Ming Yan
DOI: https://doi.org/10.1137/22m148570x
SIAM Journal on Optimization, Volume 34, Issue 1, Page 1131-1156, March 2024. Abstract. We consider the decentralized optimization problem, where a network of [math] agents aims to collaboratively minimize the average of their individual smooth and convex objective functions through peer-to-peer communication in a directed graph. To tackle this problem, we propose two accelerated gradient tracking methods, namely Accelerated Push-DIGing (APD) and APD-SC, for non-strongly convex and strongly convex objective functions, respectively. We show that APD and APD-SC converge at the rates [math] and [math], respectively, up to constant factors depending only on the mixing matrix. APD and APD-SC are the first decentralized methods over unbalanced directed graphs that achieve the same provable acceleration as centralized methods. Numerical experiments demonstrate the effectiveness of both methods.
Robust Accelerated Primal-Dual Methods for Computing Saddle Points
Xuan Zhang, Necdet Serhat Aybat, Mert Gürbüzbalaban
DOI: https://doi.org/10.1137/21m1462775
SIAM Journal on Optimization, Volume 34, Issue 1, Page 1097-1130, March 2024. Abstract. We consider strongly-convex-strongly-concave saddle point problems assuming we have access to unbiased stochastic estimates of the gradients. We propose a stochastic accelerated primal-dual (SAPD) algorithm and show that the SAPD sequence, generated using constant primal-dual step sizes, linearly converges to a neighborhood of the unique saddle point. Interpreting the size of the neighborhood as a measure of robustness to gradient noise, we obtain explicit characterizations of robustness in terms of SAPD parameters and problem constants. Based on these characterizations, we develop computationally tractable techniques for optimizing the SAPD parameters, i.e., the primal and dual step sizes, and the momentum parameter, to achieve a desired trade-off between the convergence rate and robustness on the Pareto curve. This allows SAPD to enjoy fast convergence properties while being robust to noise as an accelerated method. SAPD admits convergence guarantees for the distance metric with a variance term optimal up to a logarithmic factor, which can be removed by employing a restarting strategy. We also discuss how convergence and robustness results extend to the merely-convex-merely-concave setting. Finally, we illustrate our framework on a distributionally robust logistic regression problem.