SIAM Journal on Optimization, Volume 34, Issue 1, Page 893-917, March 2024. Abstract. Quasi-Newton methods employ an update rule that gradually improves the Hessian approximation using the already available gradient evaluations. We propose higher-order secant updates which generalize this idea to higher-order derivatives, approximating, for example, third derivatives (which are tensors) from given Hessian evaluations. Our generalization is based on the observation that quasi-Newton updates are least-change updates satisfying the secant equation, with different methods using different norms to measure the size of the change. We present a full characterization for least-change updates in weighted Frobenius norms (satisfying an analogue of the secant equation) for derivatives of arbitrary order. Moreover, we establish convergence of the approximations to the true derivative under standard assumptions and explore the quality of the generated approximations in numerical experiments.
{"title":"Approximating Higher-Order Derivative Tensors Using Secant Updates","authors":"Karl Welzel, Raphael A. Hauser","doi":"10.1137/23m1549687","DOIUrl":"https://doi.org/10.1137/23m1549687","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 893-917, March 2024. <br/> Abstract. Quasi-Newton methods employ an update rule that gradually improves the Hessian approximation using the already available gradient evaluations. We propose higher-order secant updates which generalize this idea to higher-order derivatives, approximating, for example, third derivatives (which are tensors) from given Hessian evaluations. Our generalization is based on the observation that quasi-Newton updates are least-change updates satisfying the secant equation, with different methods using different norms to measure the size of the change. We present a full characterization for least-change updates in weighted Frobenius norms (satisfying an analogue of the secant equation) for derivatives of arbitrary order. Moreover, we establish convergence of the approximations to the true derivative under standard assumptions and explore the quality of the generated approximations in numerical experiments.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140004159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Optimization, Volume 34, Issue 1, Page 870-892, March 2024. Abstract. This paper studies the existence of a (Lipschitz) continuous (single-valued) solution function of parametric variational inequalities under functional and constraint perturbations. At the most elementary level, this issue can be explained from classical parametric linear programming and its resolution by the parametric simplex method, which computes a solution trajectory of the problem when the objective coefficients and the right-hand sides of the constraints are parameterized by a single scalar parameter. The computed optimal solution vector (and not the optimal objective value) is a continuous piecewise affine function of the parameter when the objective coefficients are kept constant, whereas the computed solution vector can be discontinuous when the right-hand constraint coefficients are kept fixed and there is a basis change at a critical value of the parameter in the objective. We first investigate this issue more broadly in the context of an affine variational inequality (AVI) and obtain results that go beyond those pertaining to the lower semicontinuity of the solution map with joint vector perturbations; the latter property is closely tied to a stability theory of a parametric AVI and in particular to Robinson’s seminal concept of strong regularity. Extensions to nonlinear variational inequalities are also investigated without requiring solution uniqueness (and are therefore applicable to nonstrongly regular problems). The role of solution uniqueness in this issue of continuous single-valued solution selection is further clarified.
{"title":"Continuous Selections of Solutions to Parametric Variational Inequalities","authors":"Shaoning Han, Jong-Shi Pang","doi":"10.1137/22m1514982","DOIUrl":"https://doi.org/10.1137/22m1514982","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 870-892, March 2024. <br/> Abstract. This paper studies the existence of a (Lipschitz) continuous (single-valued) solution function of parametric variational inequalities under functional and constraint perturbations. At the most elementary level, this issue can be explained from classical parametric linear programming and its resolution by the parametric simplex method, which computes a solution trajectory of the problem when the objective coefficients and the right-hand sides of the constraints are parameterized by a single scalar parameter. The computed optimal solution vector (and not the optimal objective value) is a continuous piecewise affine function in the parameter when the objective coefficients are kept constant, whereas the computed solution vector can be discontinuous when the right-hand constraint coefficients are kept fixed and there is a basis change at a critical value of the parameter in the objective. We investigate this issue more broadly first in the context of an affine variational inequality (AVI) and obtain results that go beyond those pertaining to the lower semicontinuity of the solution map with joint vector perturbations; the latter property is closely tied to a stability theory of a parametric AVI and in particular to Robinson’s seminal concept of strong regularity. Extensions to nonlinear variational inequalities is also investigated without requiring solution uniqueness (and therefore applicable to nonstrongly regular problems). The role of solution uniqueness in this issue of continuous single-valued solution selection is further clarified.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140004166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Optimization, Volume 34, Issue 1, Page 844-869, March 2024. Abstract. The sample average approximation (SAA) approach is applied to risk-neutral optimization problems governed by semilinear elliptic partial differential equations with random inputs. After constructing a compact set that contains the SAA critical points, we derive nonasymptotic sample size estimates for SAA critical points using the covering number approach. Thereby, we derive upper bounds on the number of samples needed to obtain accurate critical points of the risk-neutral PDE-constrained optimization problem through SAA critical points. We quantify accuracy using expectation and exponential tail bounds. Numerical illustrations are presented.
{"title":"Sample Size Estimates for Risk-Neutral Semilinear PDE-Constrained Optimization","authors":"Johannes Milz, Michael Ulbrich","doi":"10.1137/22m1512636","DOIUrl":"https://doi.org/10.1137/22m1512636","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 844-869, March 2024. <br/> Abstract. The sample average approximation (SAA) approach is applied to risk-neutral optimization problems governed by semilinear elliptic partial differential equations with random inputs. After constructing a compact set that contains the SAA critical points, we derive nonasymptotic sample size estimates for SAA critical points using the covering number approach. Thereby, we derive upper bounds on the number of samples needed to obtain accurate critical points of the risk-neutral PDE-constrained optimization problem through SAA critical points. We quantify accuracy using expectation and exponential tail bounds. Numerical illustrations are presented.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139951949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Optimization, Volume 34, Issue 1, Page 817-843, March 2024. Abstract. We study the cone of factor-width-[math] matrices, where the factor width of a positive semidefinite matrix is defined as the smallest number [math] allowing it to be expressed as a sum of positive semidefinite matrices that are nonzero only on a single [math] principal submatrix. Two hierarchies of approximations are proposed for this cone. Some theoretical bounds to assess the quality of the new approximations are derived. We also use these approximations to build convex conic relaxations for the subset selection problem where one has to minimize [math] under the constraint that [math] has at most [math] nonzero components. Several numerical experiments are performed showing that some of these relaxations provide a good compromise between tightness and computational complexity and rank well compared to perspective-type relaxations.
{"title":"Subset Selection and the Cone of Factor-Width-k Matrices","authors":"Walid Ben-Ameur","doi":"10.1137/23m1549444","DOIUrl":"https://doi.org/10.1137/23m1549444","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 817-843, March 2024. <br/> Abstract. We study the cone of factor-width-[math] matrices, where the factor width of a positive semidefinite matrix is defined as the smallest number [math] allowing it to be expressed as a sum of positive semidefinite matrices that are nonzero only on a single [math] principal submatrix. Two hierarchies of approximations are proposed for this cone. Some theoretical bounds to assess the quality of the new approximations are derived. We also use these approximations to build convex conic relaxations for the subset selection problem where one has to minimize [math] under the constraint that [math] has at most [math] nonzero components. Several numerical experiments are performed showing that some of these relaxations provide a good compromise between tightness and computational complexity and rank well compared to perspective-type relaxations.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139952160","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Optimization, Volume 34, Issue 1, Page 790-816, March 2024. Abstract. This paper proposes a path-based approach for the minimization of a continuously differentiable function over sparse symmetric sets, which is a hard problem that exhibits a restrictiveness-hierarchy of necessary optimality conditions. To achieve the more restrictive conditions in the hierarchy, state-of-the-art algorithms require a support optimization oracle that must exactly solve the problem in smaller dimensions. The path-based approach developed in this study produces a path-based optimality condition, which is placed well in the restrictiveness-hierarchy, and a method to achieve it that does not require a support optimization oracle and, moreover, is projection-free. In the development process, new results are derived for the regularized linear minimization problem over sparse symmetric sets, which give additional means to identify optimal solutions for convex and concave objective functions. We complement our results with numerical examples.
{"title":"A Path-Based Approach to Constrained Sparse Optimization","authors":"Nadav Hallak","doi":"10.1137/22m1535498","DOIUrl":"https://doi.org/10.1137/22m1535498","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 790-816, March 2024. <br/> Abstract. This paper proposes a path-based approach for the minimization of a continuously differentiable function over sparse symmetric sets, which is a hard problem that exhibits a restrictiveness-hierarchy of necessary optimality conditions. To achieve the more restrictive conditions in the hierarchy, state-of-the-art algorithms require a support optimization oracle that must exactly solve the problem in smaller dimensions. The path-based approach developed in this study produces a path-based optimality condition, which is placed well in the restrictiveness-hierarchy, and a method to achieve it that does not require a support optimization oracle and, moreover, is projection-free. In the development process, new results are derived for the regularized linear minimization problem over sparse symmetric sets, which give additional means to identify optimal solutions for convex and concave objective functions. We complement our results with numerical examples.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139919219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haoya Li, Hsiang-Fu Yu, Lexing Ying, Inderjit S. Dhillon
SIAM Journal on Optimization, Volume 34, Issue 1, Page 764-789, March 2024. Abstract. Entropy-regularized Markov decision processes have been widely used in reinforcement learning. This paper is concerned with the primal-dual formulation of entropy-regularized problems. Standard first-order methods suffer from slow convergence due to the lack of strict convexity and concavity. To address this issue, we first introduce a new quadratically convexified primal-dual formulation. Natural gradient ascent-descent applied to the new formulation enjoys a global convergence guarantee and an exponential convergence rate. We also propose a new interpolating metric that further accelerates convergence significantly. Numerical results demonstrate the performance of the proposed methods under multiple settings.
{"title":"Accelerating Primal-Dual Methods for Regularized Markov Decision Processes","authors":"Haoya Li, Hsiang-Fu Yu, Lexing Ying, Inderjit S. Dhillon","doi":"10.1137/21m1468851","DOIUrl":"https://doi.org/10.1137/21m1468851","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 764-789, March 2024. <br/> Abstract. Entropy regularized Markov decision processes have been widely used in reinforcement learning. This paper is concerned with the primal-dual formulation of the entropy regularized problems. Standard first-order methods suffer from slow convergence due to the lack of strict convexity and concavity. To address this issue, we first introduce a new quadratically convexified primal-dual formulation. The natural gradient ascent descent of the new formulation enjoys global convergence guarantee and exponential convergence rate. We also propose a new interpolating metric that further accelerates the convergence significantly. Numerical results are provided to demonstrate the performance of the proposed methods under multiple settings.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139919220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Optimization, Volume 34, Issue 1, Page 742-763, March 2024. Abstract. This paper is concerned with the exact solution of mixed-integer programs (MIPs) over the rational numbers, i.e., without any roundoff errors or error tolerances. Here, one computational bottleneck that should be avoided whenever possible is large-scale symbolic computation. Instead, it is often possible to use safe directed rounding methods, e.g., to generate provably correct dual bounds. In this work, we continue to leverage this paradigm and extend an exact branch-and-bound framework by separation routines for safe cutting planes, based on the approach first introduced by Cook, Dash, Fukasawa, and Goycoolea in 2009 [INFORMS J. Comput., 21 (2009), pp. 641–649]. Constraints are aggregated safely using approximate dual multipliers from an LP solve, followed by mixed-integer rounding to generate provably valid, albeit slightly weaker, inequalities. We generalize this approach to problem data that are not representable in floating-point arithmetic, add routines for controlling the encoding length of the resulting cutting planes, and show how these cutting planes can be verified according to the VIPR certificate standard. Furthermore, we analyze the performance impact of these cutting planes in the context of an exact MIP framework, showing that we can solve 21.5% more instances to exact optimality and reduce solving times by 26.8% on the MIPLIB 2017 benchmark test set.
{"title":"Safe and Verified Gomory Mixed-Integer Cuts in a Rational Mixed-Integer Program Framework","authors":"Leon Eifler, Ambros Gleixner","doi":"10.1137/23m156046x","DOIUrl":"https://doi.org/10.1137/23m156046x","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 742-763, March 2024. <br/> Abstract. This paper is concerned with the exact solution of mixed-integer programs (MIPs) over the rational numbers, i.e., without any roundoff errors and error tolerances. Here, one computational bottleneck that should be avoided whenever possible is to employ large-scale symbolic computations. Instead it is often possible to use safe directed rounding methods, e.g., to generate provably correct dual bounds. In this work, we continue to leverage this paradigm and extend an exact branch-and-bound framework by separation routines for safe cutting planes, based on the approach first introduced by Cook, Dash, Fukasawa, and Goycoolea in 2009 [INFORMS J. Comput., 21 (2009), pp. 641–649]. Constraints are aggregated safely using approximate dual multipliers from an LP solve, followed by mixed-integer rounding to generate provably valid, although slightly weaker inequalities. We generalize this approach to problem data that is not representable in floating-point arithmetic, add routines for controlling the encoding length of the resulting cutting planes, and show how these cutting planes can be verified according to the VIPR certificate standard. Furthermore, we analyze the performance impact of these cutting planes in the context of an exact MIP framework, showing that we can solve 21.5% more instances to exact optimality and reduce solving times by 26.8% on the MIPLIB 2017 benchmark test set.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139765298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIAM Journal on Optimization, Volume 34, Issue 1, Page 718-741, March 2024. Abstract. Linear programming on the Stiefel manifold (LPS) is studied for the first time. It aims at minimizing a linear objective function over the set of all [math]-tuples of orthonormal vectors in [math] satisfying [math] additional linear constraints. Despite the classical polynomial-time solvable case [math], general (LPS) is NP-hard. According to the Shapiro–Barvinok–Pataki theorem, (LPS) admits an exact semidefinite programming relaxation when [math], which is tight when [math]. Surprisingly, we can greatly strengthen this sufficient exactness condition to [math], which covers the classical cases [math] and [math]. Viewing (LPS) as a smooth nonlinear programming problem, we show that, under the linear independence constraint qualification, the standard first- and second-order local necessary optimality conditions are sufficient for global optimality when [math].
{"title":"Linear Programming on the Stiefel Manifold","authors":"Mengmeng Song, Yong Xia","doi":"10.1137/23m1552243","DOIUrl":"https://doi.org/10.1137/23m1552243","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 718-741, March 2024. <br/> Abstract. Linear programming on the Stiefel manifold (LPS) is studied for the first time. It aims at minimizing a linear objective function over the set of all [math]-tuples of orthonormal vectors in [math] satisfying [math] additional linear constraints. Despite the classical polynomial-time solvable case [math], general (LPS) is NP-hard. According to the Shapiro–Barvinok–Pataki theorem, (LPS) admits an exact semidefinite programming relaxation when [math], which is tight when [math]. Surprisingly, we can greatly strengthen this sufficient exactness condition to [math], which covers the classical case [math] and [math]. Regarding (LPS) as a smooth nonlinear programming problem, we reveal a nice property that under the linear independence constraint qualification, the standard first- and second-order local necessary optimality conditions are sufficient for global optimality when [math].","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139765313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Güzin Bayraksan, Francesca Maggioni, Daniel Faccini, Ming Yang
SIAM Journal on Optimization, Volume 34, Issue 1, Page 682-717, March 2024. Abstract. Multistage mixed-integer distributionally robust optimization (DRO) forms a class of extremely challenging problems since their size grows exponentially with the number of stages. One way to model the uncertainty in multistage DRO is by creating sets of conditional distributions (the so-called conditional ambiguity sets) on a finite scenario tree and requiring that such distributions remain close to nominal conditional distributions according to some measure of similarity/distance (e.g., [math]-divergences or Wasserstein distance). In this paper, new bounding criteria for this class of difficult decision problems are provided through scenario grouping using the ambiguity sets associated with various commonly used [math]-divergences and the Wasserstein distance. Our approach does not require any special problem structure such as linearity, convexity, stagewise independence, and so forth. Therefore, while we focus on multistage mixed-integer DRO, our bounds can be applied to a wide range of DRO problems including two-stage and multistage, with or without integer variables, convex or nonconvex, and nested or nonnested formulations. Numerical results on a multistage mixed-integer production problem show the efficiency of the proposed approach through different choices of partition strategies, ambiguity sets, and levels of robustness.
{"title":"Bounds for Multistage Mixed-Integer Distributionally Robust Optimization","authors":"Güzin Bayraksan, Francesca Maggioni, Daniel Faccini, Ming Yang","doi":"10.1137/22m147178x","DOIUrl":"https://doi.org/10.1137/22m147178x","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 682-717, March 2024. <br/> Abstract. Multistage mixed-integer distributionally robust optimization (DRO) forms a class of extremely challenging problems since their size grows exponentially with the number of stages. One way to model the uncertainty in multistage DRO is by creating sets of conditional distributions (the so-called conditional ambiguity sets) on a finite scenario tree and requiring that such distributions remain close to nominal conditional distributions according to some measure of similarity/distance (e.g., [math]-divergences or Wasserstein distance). In this paper, new bounding criteria for this class of difficult decision problems are provided through scenario grouping using the ambiguity sets associated with various commonly used [math]-divergences and the Wasserstein distance. Our approach does not require any special problem structure such as linearity, convexity, stagewise independence, and so forth. Therefore, while we focus on multistage mixed-integer DRO, our bounds can be applied to a wide range of DRO problems including two-stage and multistage, with or without integer variables, convex or nonconvex, and nested or nonnested formulations. Numerical results on a multistage mixed-integer production problem show the efficiency of the proposed approach through different choices of partition strategies, ambiguity sets, and levels of robustness.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139765300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wutao Si, P.-A. Absil, Wen Huang, Rujun Jiang, Simon Vary
SIAM Journal on Optimization, Volume 34, Issue 1, Page 654-681, March 2024. Abstract. In recent years, the proximal gradient method and its variants have been generalized to Riemannian manifolds for solving optimization problems with an additively separable structure, i.e., [math], where [math] is continuously differentiable and [math] may be nonsmooth but convex with a computationally reasonable proximal mapping. In this paper, we generalize the proximal Newton method to embedded submanifolds for solving this type of problem with [math]. The generalization relies on the Weingarten map and semismooth analysis. It is shown that the Riemannian proximal Newton method has a local superlinear convergence rate under certain reasonable assumptions. Moreover, a hybrid version is given by concatenating a Riemannian proximal gradient method and the Riemannian proximal Newton method. It is shown that if the switch parameter is chosen appropriately, then the hybrid method converges globally and also has a local superlinear convergence rate. Numerical experiments on random and synthetic data demonstrate the performance of the proposed methods.
{"title":"A Riemannian Proximal Newton Method","authors":"Wutao Si, P.-A. Absil, Wen Huang, Rujun Jiang, Simon Vary","doi":"10.1137/23m1565097","DOIUrl":"https://doi.org/10.1137/23m1565097","url":null,"abstract":"SIAM Journal on Optimization, Volume 34, Issue 1, Page 654-681, March 2024. <br/> Abstract. In recent years, the proximal gradient method and its variants have been generalized to Riemannian manifolds for solving optimization problems with an additively separable structure, i.e., [math], where [math] is continuously differentiable, and [math] may be nonsmooth but convex with computationally reasonable proximal mapping. In this paper, we generalize the proximal Newton method to embedded submanifolds for solving the type of problem with [math]. The generalization relies on the Weingarten and semismooth analysis. It is shown that the Riemannian proximal Newton method has a local superlinear convergence rate under certain reasonable assumptions. Moreover, a hybrid version is given by concatenating a Riemannian proximal gradient method and the Riemannian proximal Newton method. It is shown that if the switch parameter is chosen appropriately, then the hybrid method converges globally and also has a local superlinear convergence rate. Numerical experiments on random and synthetic data are used to demonstrate the performance of the proposed methods.","PeriodicalId":49529,"journal":{"name":"SIAM Journal on Optimization","volume":null,"pages":null},"PeriodicalIF":3.1,"publicationDate":"2024-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139765335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}