SIAM Journal on Optimization, Volume 34, Issue 1, Page 1071-1096, March 2024. Abstract. It is well known that adding integrality constraints to the semidefinite programming (SDP) relaxation of the max-cut problem yields an integer semidefinite program that is an exact formulation of the problem. In this paper we show similar results for a wide variety of discrete optimization problems for which SDP relaxations have been derived. Based on a comprehensive study of discrete positive semidefinite matrices, we introduce a generic approach to derive mixed-integer SDP (MISDP) formulations of binary quadratically constrained quadratic programs and binary quadratic matrix programs. Applying a problem-specific approach, we derive more compact MISDP formulations of several problems, such as the quadratic assignment problem, the graph partition problem, and the integer matrix completion problem. We also show that several structured problems allow for novel compact MISDP formulations through the notion of association schemes. Complementary to the recent advances on algorithmic aspects related to MISDP, this work opens new perspectives on solution approaches for the problems considered here.
{"title":"On Integrality in Semidefinite Programming for Discrete Optimization","authors":"Frank de Meijer, Renata Sotirov","doi":"10.1137/23m1580905","DOIUrl":"https://doi.org/10.1137/23m1580905","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 1071-1096, March 2024. Abstract. It is well known that adding integrality constraints to the semidefinite programming (SDP) relaxation of the max-cut problem yields an integer semidefinite program that is an exact formulation of the problem. In this paper we show similar results for a wide variety of discrete optimization problems for which SDP relaxations have been derived. Based on a comprehensive study of discrete positive semidefinite matrices, we introduce a generic approach to derive mixed-integer SDP (MISDP) formulations of binary quadratically constrained quadratic programs and binary quadratic matrix programs. Applying a problem-specific approach, we derive more compact MISDP formulations of several problems, such as the quadratic assignment problem, the graph partition problem, and the integer matrix completion problem. We also show that several structured problems allow for novel compact MISDP formulations through the notion of association schemes. Complementary to the recent advances on algorithmic aspects related to MISDP, this work opens new perspectives on solution approaches for the problems considered here.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":"47 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140153146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
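As a concrete illustration of the well-known max-cut case mentioned in the abstract, the integer semidefinite program can be written out as below. This is a sketch in standard notation that does not appear in the abstract: $L$ is assumed to denote the Laplacian matrix of the weighted graph on $n$ vertices, and $X$ is the matrix variable.

```latex
\begin{align*}
\max_{X} \quad & \tfrac{1}{4}\langle L, X\rangle \\
\text{s.t.} \quad & \operatorname{diag}(X) = \mathbf{1}, \\
& X \succeq 0, \\
& X \in \{-1, 1\}^{n \times n}.
\end{align*}
```

Dropping the last constraint gives the usual SDP relaxation. With it, every feasible $X$ equals $xx^\top$ for some $x \in \{-1,1\}^n$, because nonnegativity of the $3 \times 3$ principal minors forces $X_{ik} = X_{ij}X_{jk}$; the objective $\tfrac{1}{4}x^\top L x$ is then the weight of the cut induced by $x$.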
SIAM Journal on Optimization, Volume 34, Issue 1, Page 1045-1070, March 2024. Abstract. The Douglas–Rachford (DR) method is widely used for finding a point in the intersection of two closed convex sets (feasibility problem). However, the method converges weakly, and the associated rate of convergence is hard to analyze in general. In addition, the direct extension of the DR method for solving more-than-two-sets feasibility problems, called the [math]-sets-DR method, is not necessarily convergent. Introducing randomization and momentum techniques to improve the efficiency of optimization algorithms has attracted increasing attention. In this paper, we propose the randomized [math]-sets-DR (RrDR) method for solving the feasibility problem derived from linear systems, showing the benefit of the randomization as it brings linear convergence in expectation to the otherwise divergent [math]-sets-DR method. Furthermore, the convergence rate does not depend on the dimension of the coefficient matrix. We also study RrDR with heavy ball momentum and establish its accelerated rate. Numerical experiments are provided to confirm our results and demonstrate the notable improvements in accuracy and efficiency of the DR method brought by the randomization and the momentum technique.
{"title":"Randomized Douglas–Rachford Methods for Linear Systems: Improved Accuracy and Efficiency","authors":"Deren Han, Yansheng Su, Jiaxin Xie","doi":"10.1137/23m1567503","DOIUrl":"https://doi.org/10.1137/23m1567503","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 1045-1070, March 2024. Abstract. The Douglas–Rachford (DR) method is widely used for finding a point in the intersection of two closed convex sets (feasibility problem). However, the method converges weakly, and the associated rate of convergence is hard to analyze in general. In addition, the direct extension of the DR method for solving more-than-two-sets feasibility problems, called the [math]-sets-DR method, is not necessarily convergent. Introducing randomization and momentum techniques to improve the efficiency of optimization algorithms has attracted increasing attention. In this paper, we propose the randomized [math]-sets-DR (RrDR) method for solving the feasibility problem derived from linear systems, showing the benefit of the randomization as it brings linear convergence in expectation to the otherwise divergent [math]-sets-DR method. Furthermore, the convergence rate does not depend on the dimension of the coefficient matrix. We also study RrDR with heavy ball momentum and establish its accelerated rate. Numerical experiments are provided to confirm our results and demonstrate the notable improvements in accuracy and efficiency of the DR method brought by the randomization and the momentum technique.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":"69 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140152967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
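For orientation, the classical two-set DR iteration that the abstract starts from can be sketched on a tiny linear system. This is not the RrDR method of the paper: it is the textbook iteration x <- (x + R_B(R_A(x))) / 2 applied to two hyperplanes, and the function names `reflect` and `douglas_rachford` as well as the example data are assumptions made for illustration.

```python
import numpy as np

def reflect(x, a, b):
    """Reflect x across the hyperplane {z : a @ z = b}."""
    return x - 2.0 * (a @ x - b) / (a @ a) * a

def douglas_rachford(x0, a1, b1, a2, b2, iters=200):
    """Classical two-set DR: x <- (x + R_B(R_A(x))) / 2."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = 0.5 * (x + reflect(reflect(x, a1, b1), a2, b2))
    return x

# Illustrative 2x2 system (an assumption): x = 1 and x + y = 3, solution (1, 2).
a1, b1 = np.array([1.0, 0.0]), 1.0
a2, b2 = np.array([1.0, 1.0]), 3.0
x_dr = douglas_rachford([5.0, -4.0], a1, b1, a2, b2)
```

In this two-line example the composition of the two reflections is a rotation about the intersection point, so the averaged iterates contract toward the solution of the system.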
SIAM Journal on Optimization, Volume 34, Issue 1, Page 1006-1044, March 2024. Abstract. Minimax problems have recently attracted considerable research interest. A few efforts have been made to solve decentralized nonconvex strongly-concave (NCSC) minimax-structured optimization problems; however, all of them focus on smooth problems with at most a constraint on the maximization variable. In this paper, we make the first attempt at solving composite NCSC minimax problems that can have convex nonsmooth terms on both minimization and maximization variables. Our algorithm is designed based on a novel reformulation of the decentralized minimax problem that introduces a multiplier to absorb the dual consensus constraint. The removal of the dual consensus constraint enables the most aggressive (i.e., local maximization instead of a gradient ascent step) dual update that leads to the benefit of taking a larger primal stepsize and better complexity results. In addition, the decoupling of the nonsmoothness and consensus on the dual variable eases the analysis of a decentralized algorithm; thus our reformulation creates a new way for interested researchers to design new (and possibly more efficient) decentralized methods for solving NCSC minimax problems. We show a global convergence result of the proposed algorithm and an iteration complexity result to produce a (near) stationary point of the reformulation. Moreover, a relation is established between the (near) stationarity of the reformulation and that of the original formulation. With this relation, we show that when the dual regularizer is smooth, our algorithm can have lower complexity results (with reduced dependence on a condition number) than existing ones to produce a near-stationary point of the original formulation. Numerical experiments are conducted on a distributionally robust logistic regression to demonstrate the performance of the proposed algorithm.
{"title":"Decentralized Gradient Descent Maximization Method for Composite Nonconvex Strongly-Concave Minimax Problems","authors":"Yangyang Xu","doi":"10.1137/23m1558677","DOIUrl":"https://doi.org/10.1137/23m1558677","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 1006-1044, March 2024. Abstract. Minimax problems have recently attracted considerable research interest. A few efforts have been made to solve decentralized nonconvex strongly-concave (NCSC) minimax-structured optimization problems; however, all of them focus on smooth problems with at most a constraint on the maximization variable. In this paper, we make the first attempt at solving composite NCSC minimax problems that can have convex nonsmooth terms on both minimization and maximization variables. Our algorithm is designed based on a novel reformulation of the decentralized minimax problem that introduces a multiplier to absorb the dual consensus constraint. The removal of the dual consensus constraint enables the most aggressive (i.e., local maximization instead of a gradient ascent step) dual update that leads to the benefit of taking a larger primal stepsize and better complexity results. In addition, the decoupling of the nonsmoothness and consensus on the dual variable eases the analysis of a decentralized algorithm; thus our reformulation creates a new way for interested researchers to design new (and possibly more efficient) decentralized methods for solving NCSC minimax problems. We show a global convergence result of the proposed algorithm and an iteration complexity result to produce a (near) stationary point of the reformulation. Moreover, a relation is established between the (near) stationarity of the reformulation and that of the original formulation. With this relation, we show that when the dual regularizer is smooth, our algorithm can have lower complexity results (with reduced dependence on a condition number) than existing ones to produce a near-stationary point of the original formulation. Numerical experiments are conducted on a distributionally robust logistic regression to demonstrate the performance of the proposed algorithm.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":"4 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140117269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Optimization, Volume 34, Issue 1, Page 977-1005, March 2024. Abstract. A striking pathology of semidefinite programs (SDPs) is illustrated by a classical example of Khachiyan: feasible solutions in SDPs may need exponential space even to write down. Such exponential size solutions are the main obstacle to solving a long-standing, fundamental open problem: can we decide feasibility of SDPs in polynomial time? The consensus seems to be that SDPs with large size solutions are rare. However, here we prove that they are actually quite common: a linear change of variables transforms every strictly feasible SDP into a Khachiyan type SDP, in which the leading variables are large. As to “how large,” that depends on the singularity degree of a dual problem. Further, we present some SDPs coming from sum-of-squares proofs, in which large solutions appear naturally, without any change of variables. We also partially answer the question of how to represent such large solutions in polynomial space.
{"title":"How Do Exponential Size Solutions Arise in Semidefinite Programming?","authors":"Gábor Pataki, Aleksandr Touzov","doi":"10.1137/21m1434945","DOIUrl":"https://doi.org/10.1137/21m1434945","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 977-1005, March 2024. Abstract. A striking pathology of semidefinite programs (SDPs) is illustrated by a classical example of Khachiyan: feasible solutions in SDPs may need exponential space even to write down. Such exponential size solutions are the main obstacle to solving a long-standing, fundamental open problem: can we decide feasibility of SDPs in polynomial time? The consensus seems to be that SDPs with large size solutions are rare. However, here we prove that they are actually quite common: a linear change of variables transforms every strictly feasible SDP into a Khachiyan type SDP, in which the leading variables are large. As to “how large,” that depends on the singularity degree of a dual problem. Further, we present some SDPs coming from sum-of-squares proofs, in which large solutions appear naturally, without any change of variables. We also partially answer the question of how to represent such large solutions in polynomial space.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":"65 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140072290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Optimization, Volume 34, Issue 1, Page 946-976, March 2024. Abstract. We study a new two-time-scale stochastic gradient method for solving optimization problems, where the gradients are computed with the aid of an auxiliary variable under samples generated by time-varying Markov random processes controlled by the underlying optimization variable. These time-varying samples make gradient directions in our update biased and dependent, which can potentially lead to the divergence of the iterates. In our two-time-scale approach, one scale estimates the true gradient from these samples, and this estimate is then used to update the estimate of the optimal solution. While these two iterates are implemented simultaneously, the former is updated “faster” (using bigger step sizes) than the latter (using smaller step sizes). Our first contribution is to characterize the finite-time complexity of the proposed two-time-scale stochastic gradient method. In particular, we provide explicit formulas for the convergence rates of this method under different structural assumptions, namely, strong convexity, the Polyak–Łojasiewicz condition, and general nonconvexity. We apply our framework to policy optimization problems in control and reinforcement learning. First, we look at the infinite-horizon average-reward Markov decision process with finite state and action spaces and derive a convergence rate of [math] for the online actor-critic algorithm under function approximation, which recovers the best known rate derived specifically for this problem. Second, we study the linear-quadratic regulator and show that an online actor-critic method converges with rate [math]. Third, we use the actor-critic algorithm to solve the policy optimization problem in an entropy regularized Markov decision process, where we also establish a convergence rate of [math]. The results we derive for both the second and third problems are new to the literature. Finally, we briefly present the application of our framework to gradient-based policy evaluation algorithms in reinforcement learning.
{"title":"A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning","authors":"Sihan Zeng, Thinh T. Doan, Justin Romberg","doi":"10.1137/22m150277x","DOIUrl":"https://doi.org/10.1137/22m150277x","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 946-976, March 2024. Abstract. We study a new two-time-scale stochastic gradient method for solving optimization problems, where the gradients are computed with the aid of an auxiliary variable under samples generated by time-varying Markov random processes controlled by the underlying optimization variable. These time-varying samples make gradient directions in our update biased and dependent, which can potentially lead to the divergence of the iterates. In our two-time-scale approach, one scale estimates the true gradient from these samples, and this estimate is then used to update the estimate of the optimal solution. While these two iterates are implemented simultaneously, the former is updated “faster” (using bigger step sizes) than the latter (using smaller step sizes). Our first contribution is to characterize the finite-time complexity of the proposed two-time-scale stochastic gradient method. In particular, we provide explicit formulas for the convergence rates of this method under different structural assumptions, namely, strong convexity, the Polyak–Łojasiewicz condition, and general nonconvexity. We apply our framework to policy optimization problems in control and reinforcement learning. First, we look at the infinite-horizon average-reward Markov decision process with finite state and action spaces and derive a convergence rate of [math] for the online actor-critic algorithm under function approximation, which recovers the best known rate derived specifically for this problem. Second, we study the linear-quadratic regulator and show that an online actor-critic method converges with rate [math]. Third, we use the actor-critic algorithm to solve the policy optimization problem in an entropy regularized Markov decision process, where we also establish a convergence rate of [math]. The results we derive for both the second and third problems are new to the literature. Finally, we briefly present the application of our framework to gradient-based policy evaluation algorithms in reinforcement learning.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":"64 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140072202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Optimization, Volume 34, Issue 1, Page 918-945, March 2024. Abstract. The presence of second-order smoothness for objective functions of optimization problems can provide valuable information about their stability properties and help us design efficient numerical algorithms for solving these problems. Such second-order information, however, cannot be expected in various constrained and composite optimization problems since we often have to express their objective functions in terms of extended-real-valued functions for which the classical second derivative may not exist. One powerful geometrical tool to use for dealing with such functions is the concept of twice epi-differentiability. In this paper, we study a stronger version of this concept, called strict twice epi-differentiability. We characterize this concept for certain composite functions and use it to establish the equivalence of metric regularity and strong metric regularity for a class of generalized equations at their nondegenerate solutions. Finally, we present a characterization of continuous differentiability of the proximal mapping of our composite functions.
{"title":"A Chain Rule for Strict Twice Epi-Differentiability and Its Applications","authors":"Nguyen T. V. Hang, M. Ebrahim Sarabi","doi":"10.1137/22m1520025","DOIUrl":"https://doi.org/10.1137/22m1520025","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 918-945, March 2024. Abstract. The presence of second-order smoothness for objective functions of optimization problems can provide valuable information about their stability properties and help us design efficient numerical algorithms for solving these problems. Such second-order information, however, cannot be expected in various constrained and composite optimization problems since we often have to express their objective functions in terms of extended-real-valued functions for which the classical second derivative may not exist. One powerful geometrical tool to use for dealing with such functions is the concept of twice epi-differentiability. In this paper, we study a stronger version of this concept, called strict twice epi-differentiability. We characterize this concept for certain composite functions and use it to establish the equivalence of metric regularity and strong metric regularity for a class of generalized equations at their nondegenerate solutions. Finally, we present a characterization of continuous differentiability of the proximal mapping of our composite functions.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":"69 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140004161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Optimization, Volume 34, Issue 1, Page 893-917, March 2024. Abstract. Quasi-Newton methods employ an update rule that gradually improves the Hessian approximation using the already available gradient evaluations. We propose higher-order secant updates which generalize this idea to higher-order derivatives, approximating, for example, third derivatives (which are tensors) from given Hessian evaluations. Our generalization is based on the observation that quasi-Newton updates are least-change updates satisfying the secant equation, with different methods using different norms to measure the size of the change. We present a full characterization for least-change updates in weighted Frobenius norms (satisfying an analogue of the secant equation) for derivatives of arbitrary order. Moreover, we establish convergence of the approximations to the true derivative under standard assumptions and explore the quality of the generated approximations in numerical experiments.
{"title":"Approximating Higher-Order Derivative Tensors Using Secant Updates","authors":"Karl Welzel, Raphael A. Hauser","doi":"10.1137/23m1549687","DOIUrl":"https://doi.org/10.1137/23m1549687","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 893-917, March 2024. Abstract. Quasi-Newton methods employ an update rule that gradually improves the Hessian approximation using the already available gradient evaluations. We propose higher-order secant updates which generalize this idea to higher-order derivatives, approximating, for example, third derivatives (which are tensors) from given Hessian evaluations. Our generalization is based on the observation that quasi-Newton updates are least-change updates satisfying the secant equation, with different methods using different norms to measure the size of the change. We present a full characterization for least-change updates in weighted Frobenius norms (satisfying an analogue of the secant equation) for derivatives of arbitrary order. Moreover, we establish convergence of the approximations to the true derivative under standard assumptions and explore the quality of the generated approximations in numerical experiments.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":"77 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140004159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
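The first-order idea that this abstract generalizes can be made concrete with a small example. A minimal sketch, assuming the unweighted Frobenius norm and no symmetry requirement: in that setting the least-change update satisfying the secant equation B_new @ s = y is Broyden's rank-one update. The function name `broyden_update` and the quadratic test data are illustrative and not taken from the paper, which treats weighted norms and derivatives of arbitrary order.

```python
import numpy as np

def broyden_update(B, s, y):
    """Smallest Frobenius-norm change to B such that B_new @ s == y."""
    return B + np.outer(y - B @ s, s) / (s @ s)

# Illustrative data (an assumption): f(x) = 0.5 * x @ H @ x has gradient H @ x.
H = np.array([[2.0, 1.0], [1.0, 3.0]])
x_old, x_new = np.array([0.0, 0.0]), np.array([1.0, -1.0])
s = x_new - x_old              # step between the two points
y = H @ x_new - H @ x_old      # difference of the two gradients
B_new = broyden_update(np.eye(2), s, y)
```

The update changes the approximation only along the step direction s: on vectors orthogonal to s, B_new acts exactly as the previous approximation did.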
SIAM Journal on Optimization, Volume 34, Issue 1, Page 870-892, March 2024. Abstract. This paper studies the existence of a (Lipschitz) continuous (single-valued) solution function of parametric variational inequalities under functional and constraint perturbations. At the most elementary level, this issue can be explained from classical parametric linear programming and its resolution by the parametric simplex method, which computes a solution trajectory of the problem when the objective coefficients and the right-hand sides of the constraints are parameterized by a single scalar parameter. The computed optimal solution vector (and not the optimal objective value) is a continuous piecewise affine function in the parameter when the objective coefficients are kept constant, whereas the computed solution vector can be discontinuous when the right-hand sides of the constraints are kept fixed and there is a basis change at a critical value of the parameter in the objective. We investigate this issue more broadly first in the context of an affine variational inequality (AVI) and obtain results that go beyond those pertaining to the lower semicontinuity of the solution map with joint vector perturbations; the latter property is closely tied to a stability theory of a parametric AVI and in particular to Robinson’s seminal concept of strong regularity. Extensions to nonlinear variational inequalities are also investigated without requiring solution uniqueness (and are therefore applicable to nonstrongly regular problems). The role of solution uniqueness in this issue of continuous single-valued solution selection is further clarified.
{"title":"Continuous Selections of Solutions to Parametric Variational Inequalities","authors":"Shaoning Han, Jong-Shi Pang","doi":"10.1137/22m1514982","DOIUrl":"https://doi.org/10.1137/22m1514982","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 870-892, March 2024. Abstract. This paper studies the existence of a (Lipschitz) continuous (single-valued) solution function of parametric variational inequalities under functional and constraint perturbations. At the most elementary level, this issue can be explained from classical parametric linear programming and its resolution by the parametric simplex method, which computes a solution trajectory of the problem when the objective coefficients and the right-hand sides of the constraints are parameterized by a single scalar parameter. The computed optimal solution vector (and not the optimal objective value) is a continuous piecewise affine function in the parameter when the objective coefficients are kept constant, whereas the computed solution vector can be discontinuous when the right-hand sides of the constraints are kept fixed and there is a basis change at a critical value of the parameter in the objective. We investigate this issue more broadly first in the context of an affine variational inequality (AVI) and obtain results that go beyond those pertaining to the lower semicontinuity of the solution map with joint vector perturbations; the latter property is closely tied to a stability theory of a parametric AVI and in particular to Robinson’s seminal concept of strong regularity. Extensions to nonlinear variational inequalities are also investigated without requiring solution uniqueness (and are therefore applicable to nonstrongly regular problems). The role of solution uniqueness in this issue of continuous single-valued solution selection is further clarified.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":"12 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140004166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Optimization, Volume 34, Issue 1, Page 844-869, March 2024. Abstract. The sample average approximation (SAA) approach is applied to risk-neutral optimization problems governed by semilinear elliptic partial differential equations with random inputs. After constructing a compact set that contains the SAA critical points, we derive nonasymptotic sample size estimates for SAA critical points using the covering number approach. Thereby, we derive upper bounds on the number of samples needed to obtain accurate critical points of the risk-neutral PDE-constrained optimization problem through SAA critical points. We quantify accuracy using expectation and exponential tail bounds. Numerical illustrations are presented.
{"title":"Sample Size Estimates for Risk-Neutral Semilinear PDE-Constrained Optimization","authors":"Johannes Milz, Michael Ulbrich","doi":"10.1137/22m1512636","DOIUrl":"https://doi.org/10.1137/22m1512636","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 844-869, March 2024. Abstract. The sample average approximation (SAA) approach is applied to risk-neutral optimization problems governed by semilinear elliptic partial differential equations with random inputs. After constructing a compact set that contains the SAA critical points, we derive nonasymptotic sample size estimates for SAA critical points using the covering number approach. Thereby, we derive upper bounds on the number of samples needed to obtain accurate critical points of the risk-neutral PDE-constrained optimization problem through SAA critical points. We quantify accuracy using expectation and exponential tail bounds. Numerical illustrations are presented.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":"255 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139951949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
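The SAA idea itself can be illustrated on a toy problem that is far simpler than the PDE-constrained setting of the paper. A minimal sketch, assuming the scalar risk-neutral objective E[0.5 * (x - xi)**2] with xi drawn from a normal distribution with mean 2: the true minimizer is then 2, and the SAA minimizer is the sample mean. The function name `saa_minimizer`, the distribution, and the sample sizes are all assumptions chosen for illustration.

```python
import numpy as np

def saa_minimizer(samples):
    """Minimizer of the sample average of 0.5 * (x - xi)**2, i.e. the sample mean."""
    return float(np.mean(samples))

rng = np.random.default_rng(0)   # fixed seed so the run is reproducible
true_minimizer = 2.0             # minimizer of the expected objective
errors = {}
for n in (10, 1_000, 100_000):
    xi = rng.normal(loc=true_minimizer, scale=1.0, size=n)
    errors[n] = abs(saa_minimizer(xi) - true_minimizer)
```

For this toy objective the error of the SAA solution typically shrinks like one over the square root of the sample size; sample size estimates of the kind described in the abstract bound how many samples are needed for a prescribed accuracy in a much more general setting.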
SIAM Journal on Optimization, Volume 34, Issue 1, Page 817-843, March 2024. Abstract. We study the cone of factor-width-[math] matrices, where the factor width of a positive semidefinite matrix is defined as the smallest number [math] allowing it to be expressed as a sum of positive semidefinite matrices that are nonzero only on a single [math] principal submatrix. Two hierarchies of approximations are proposed for this cone. Some theoretical bounds to assess the quality of the new approximations are derived. We also use these approximations to build convex conic relaxations for the subset selection problem where one has to minimize [math] under the constraint that [math] has at most [math] nonzero components. Several numerical experiments are performed showing that some of these relaxations provide a good compromise between tightness and computational complexity and rank well compared to perspective-type relaxations.
{"title":"Subset Selection and the Cone of Factor-Width-k Matrices","authors":"Walid Ben-Ameur","doi":"10.1137/23m1549444","DOIUrl":"https://doi.org/10.1137/23m1549444","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 817-843, March 2024. Abstract. We study the cone of factor-width-[math] matrices, where the factor width of a positive semidefinite matrix is defined as the smallest number [math] allowing it to be expressed as a sum of positive semidefinite matrices that are nonzero only on a single [math] principal submatrix. Two hierarchies of approximations are proposed for this cone. Some theoretical bounds to assess the quality of the new approximations are derived. We also use these approximations to build convex conic relaxations for the subset selection problem where one has to minimize [math] under the constraint that [math] has at most [math] nonzero components. Several numerical experiments are performed showing that some of these relaxations provide a good compromise between tightness and computational complexity and rank well compared to perspective-type relaxations.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":"48 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139952160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}