Parabolic Optimal Control Problems with Combinatorial Switching Constraints, Part I: Convex Relaxations
Christoph Buchheim, Alexandra Grütering, Christian Meyer
SIAM Journal on Optimization, Volume 34, Issue 2, Pages 1187-1205, June 2024. DOI: 10.1137/22m1490260
Abstract. We consider optimal control problems for partial differential equations where the controls take binary values but vary over the time horizon; they can thus be seen as dynamic switches. The switching patterns may be subject to combinatorial constraints, such as an upper bound on the total number of switchings or a lower bound on the time between two switchings. While such combinatorial constraints are often seen as an additional complication that is treated in a heuristic postprocessing step, the core of our approach is to investigate the convex hull of all feasible switching patterns in order to define a tight convex relaxation of the control problem. The convex relaxation is built from cutting planes derived from finite-dimensional projections, which can be studied by means of polyhedral combinatorics. A numerical example for the case of a bounded number of switchings shows that our approach can significantly improve the dual bounds given by the straightforward continuous relaxation, which is obtained by relaxing the binarity constraints.
A Semismooth Newton Stochastic Proximal Point Algorithm with Variance Reduction
Andre Milzarek, Fabian Schaipp, Michael Ulbrich
SIAM Journal on Optimization, Volume 34, Issue 1, Pages 1157-1185, March 2024. DOI: 10.1137/22m1488181
Abstract. We develop an implementable stochastic proximal point (SPP) method for a class of weakly convex, composite optimization problems. The proposed stochastic proximal point algorithm incorporates a variance reduction mechanism and the resulting SPP updates are solved using an inexact semismooth Newton framework. We establish detailed convergence results that take the inexactness of the SPP steps into account and that are in accordance with existing convergence guarantees of (proximal) stochastic variance-reduced gradient methods. Numerical experiments show that the proposed algorithm competes favorably with other state-of-the-art methods and achieves higher robustness with respect to the step size selection.
Provably Accelerated Decentralized Gradient Methods Over Unbalanced Directed Graphs
Zhuoqing Song, Lei Shi, Shi Pu, Ming Yan
SIAM Journal on Optimization, Volume 34, Issue 1, Pages 1131-1156, March 2024. DOI: 10.1137/22m148570x
Abstract. We consider the decentralized optimization problem, where a network of [math] agents aims to collaboratively minimize the average of their individual smooth and convex objective functions through peer-to-peer communication in a directed graph. To tackle this problem, we propose two accelerated gradient tracking methods, namely Accelerated Push-DIGing (APD) and APD-SC, for non-strongly convex and strongly convex objective functions, respectively. We show that APD and APD-SC converge at the rates [math] and [math], respectively, up to constant factors depending only on the mixing matrix. APD and APD-SC are the first decentralized methods over unbalanced directed graphs that achieve the same provable acceleration as centralized methods. Numerical experiments demonstrate the effectiveness of both methods.
Robust Accelerated Primal-Dual Methods for Computing Saddle Points
Xuan Zhang, Necdet Serhat Aybat, Mert Gürbüzbalaban
SIAM Journal on Optimization, Volume 34, Issue 1, Pages 1097-1130, March 2024. DOI: 10.1137/21m1462775
Abstract. We consider strongly-convex-strongly-concave saddle point problems assuming we have access to unbiased stochastic estimates of the gradients. We propose a stochastic accelerated primal-dual (SAPD) algorithm and show that the SAPD sequence, generated using constant primal-dual step sizes, linearly converges to a neighborhood of the unique saddle point. Interpreting the size of the neighborhood as a measure of robustness to gradient noise, we obtain explicit characterizations of robustness in terms of SAPD parameters and problem constants. Based on these characterizations, we develop computationally tractable techniques for optimizing the SAPD parameters, i.e., the primal and dual step sizes and the momentum parameter, to achieve a desired trade-off between the convergence rate and robustness on the Pareto curve. This allows SAPD to enjoy the fast convergence of an accelerated method while remaining robust to noise. SAPD admits convergence guarantees for the distance metric with a variance term optimal up to a logarithmic factor, which can be removed by employing a restarting strategy. We also discuss how the convergence and robustness results extend to the merely-convex-merely-concave setting. Finally, we illustrate our framework on a distributionally robust logistic regression problem.
On Integrality in Semidefinite Programming for Discrete Optimization
Frank de Meijer, Renata Sotirov
SIAM Journal on Optimization, Volume 34, Issue 1, Pages 1071-1096, March 2024. DOI: 10.1137/23m1580905
Abstract. It is well known that by adding integrality constraints to the semidefinite programming (SDP) relaxation of the max-cut problem, the resulting integer semidefinite program is an exact formulation of the problem. In this paper we show similar results for a wide variety of discrete optimization problems for which SDP relaxations have been derived. Based on a comprehensive study of discrete positive semidefinite matrices, we introduce a generic approach to derive mixed-integer SDP (MISDP) formulations of binary quadratically constrained quadratic programs and binary quadratic matrix programs. Applying a problem-specific approach, we derive more compact MISDP formulations of several problems, such as the quadratic assignment problem, the graph partition problem, and the integer matrix completion problem. We also show that several structured problems allow for novel compact MISDP formulations through the notion of association schemes. Complementary to recent advances on algorithmic aspects of MISDP, this work opens new perspectives on solution approaches for the problems considered here.
Randomized Douglas–Rachford Methods for Linear Systems: Improved Accuracy and Efficiency
Deren Han, Yansheng Su, Jiaxin Xie
SIAM Journal on Optimization, Volume 34, Issue 1, Pages 1045-1070, March 2024. DOI: 10.1137/23m1567503
Abstract. The Douglas–Rachford (DR) method is a widely used method for finding a point in the intersection of two closed convex sets (the feasibility problem). However, the method converges weakly, and the associated rate of convergence is hard to analyze in general. In addition, the direct extension of the DR method to feasibility problems with more than two sets, called the [math]-sets-DR method, is not necessarily convergent. Randomization and momentum techniques have attracted increasing attention as ways to improve the efficiency of such algorithms. In this paper, we propose the randomized [math]-sets-DR (RrDR) method for solving the feasibility problem derived from linear systems, showing the benefit of randomization: it brings linear convergence in expectation to the otherwise divergent [math]-sets-DR method. Furthermore, the convergence rate does not depend on the dimension of the coefficient matrix. We also study RrDR with heavy ball momentum and establish its accelerated rate. Numerical experiments confirm our results and demonstrate the notable improvements in accuracy and efficiency that randomization and momentum bring to the DR method.
Decentralized Gradient Descent Maximization Method for Composite Nonconvex Strongly-Concave Minimax Problems
Yangyang Xu
SIAM Journal on Optimization, Volume 34, Issue 1, Pages 1006-1044, March 2024. DOI: 10.1137/23m1558677
Abstract. Minimax problems have recently attracted a lot of research interest. A few efforts have been made to solve decentralized nonconvex strongly-concave (NCSC) minimax-structured optimization; however, all of them focus on smooth problems with at most a constraint on the maximization variable. In this paper, we make the first attempt at solving composite NCSC minimax problems that can have convex nonsmooth terms on both the minimization and maximization variables. Our algorithm is based on a novel reformulation of the decentralized minimax problem that introduces a multiplier to absorb the dual consensus constraint. Removing the dual consensus constraint enables the most aggressive dual update (a local maximization instead of a gradient ascent step), which in turn allows a larger primal stepsize and yields better complexity results. In addition, decoupling the nonsmoothness from the consensus on the dual variable eases the analysis of a decentralized algorithm; our reformulation thus creates a new way for interested researchers to design new (and possibly more efficient) decentralized methods for NCSC minimax problems. We show a global convergence result for the proposed algorithm and an iteration complexity result for producing a (near) stationary point of the reformulation. Moreover, we establish a relation between the (near) stationarity of the reformulation and that of the original formulation. With this relation, we show that when the dual regularizer is smooth, our algorithm can achieve lower complexity results (with reduced dependence on a condition number) than existing ones for producing a near-stationary point of the original formulation. Numerical experiments on distributionally robust logistic regression demonstrate the performance of the proposed algorithm.
How Do Exponential Size Solutions Arise in Semidefinite Programming?
Gábor Pataki, Aleksandr Touzov
SIAM Journal on Optimization, Volume 34, Issue 1, Pages 977-1005, March 2024. DOI: 10.1137/21m1434945
Abstract. A striking pathology of semidefinite programs (SDPs) is illustrated by a classical example of Khachiyan: feasible solutions of SDPs may need exponential space even to write down. Such exponential size solutions are the main obstacle to solving a long-standing, fundamental open problem: can we decide feasibility of SDPs in polynomial time? The consensus seems to be that SDPs with large solutions are rare. However, here we prove that they are actually quite common: a linear change of variables transforms every strictly feasible SDP into a Khachiyan-type SDP in which the leading variables are large. How large depends on the singularity degree of a dual problem. Further, we present some SDPs coming from sum-of-squares proofs in which large solutions appear naturally, without any change of variables. We also partially answer the question of how such large solutions can be represented in polynomial space.
A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning
Sihan Zeng, Thinh T. Doan, Justin Romberg
SIAM Journal on Optimization, Volume 34, Issue 1, Pages 946-976, March 2024. DOI: 10.1137/22m150277x
Abstract. We study a new two-time-scale stochastic gradient method for solving optimization problems where the gradients are computed with the aid of an auxiliary variable, under samples generated by time-varying Markov random processes controlled by the underlying optimization variable. These time-varying samples make the gradient directions in our update biased and dependent, which can potentially lead to divergence of the iterates. In our two-time-scale approach, one scale estimates the true gradient from these samples, and the other uses that estimate to update the iterate toward the optimal solution. The two iterates are implemented simultaneously, but the former is updated “faster” (using bigger step sizes) than the latter (using smaller step sizes). Our first contribution is to characterize the finite-time complexity of the proposed two-time-scale stochastic gradient method. In particular, we provide explicit formulas for the convergence rates of this method under different structural assumptions, namely, strong convexity, the Polyak–Łojasiewicz condition, and general nonconvexity. We apply our framework to policy optimization problems in control and reinforcement learning. First, we look at the infinite-horizon average-reward Markov decision process with finite state and action spaces and derive a convergence rate of [math] for the online actor-critic algorithm under function approximation, which recovers the best known rate derived specifically for this problem. Second, we study the linear-quadratic regulator and show that an online actor-critic method converges at rate [math]. Third, we use the actor-critic algorithm to solve the policy optimization problem in an entropy-regularized Markov decision process, where we also establish a convergence rate of [math]. The results we derive for the second and third problems are novel and previously unknown in the literature. Finally, we briefly present the application of our framework to gradient-based policy evaluation algorithms in reinforcement learning.
A Chain Rule for Strict Twice Epi-Differentiability and Its Applications
Nguyen T. V. Hang, M. Ebrahim Sarabi
SIAM Journal on Optimization, Volume 34, Issue 1, Pages 918-945, March 2024. DOI: 10.1137/22m1520025
Abstract. The presence of second-order smoothness in the objective functions of optimization problems can provide valuable information about their stability properties and help us design efficient numerical algorithms for solving these problems. Such second-order information, however, cannot be expected in various constrained and composite optimization problems, since we often have to express their objective functions as extended-real-valued functions for which the classical second derivative may not exist. One powerful geometric tool for dealing with such functions is the concept of twice epi-differentiability. In this paper, we study a stronger version of this concept, called strict twice epi-differentiability. We characterize this concept for certain composite functions and use it to establish the equivalence of metric regularity and strong metric regularity for a class of generalized equations at their nondegenerate solutions. Finally, we present a characterization of continuous differentiability of the proximal mapping of our composite functions.