Many machine learning and optimization algorithms can be cast as instances of stochastic approximation (SA). The convergence rate of these algorithms is known to be slow, with the optimal mean squared error (MSE) of order $O(n^{-1})$. Prior work showed that MSE bounds approaching $O(n^{-4})$ can be achieved through the framework of quasi-stochastic approximation (QSA): essentially SA with a careful choice of deterministic exploration. These results are extended to two-timescale algorithms, as found in policy gradient methods of reinforcement learning and extremum seeking control. The extensions rest in part on a new approach to analysis that interprets two-timescale algorithms as instances of single-timescale QSA, enabled by the theory of negative Lyapunov exponents for QSA. The general theory is illustrated with applications to extremum seeking control (ESC).
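The core idea of deterministic exploration can be illustrated with a minimal extremum seeking sketch: probe the objective along a sinusoidal signal and demodulate the measurement, so the update behaves on average like gradient descent. This is a generic single-timescale ESC sketch, not the paper's algorithm; the gains, probing amplitude, frequency, and the quadratic test objective are all illustrative assumptions.

```python
import math

def esc_minimize(f, theta0, k=0.5, a=0.5, omega=50.0, dt=1e-3, n_steps=20000):
    """Euler-discretized extremum seeking: probe f along a sinusoidal
    (deterministic) exploration signal and demodulate the measurement
    to form a quasi-stochastic gradient estimate."""
    theta = theta0
    for n in range(n_steps):
        t = n * dt
        probe = a * math.sin(omega * t)
        # (2/a)*sin(omega*t)*f(theta + probe) averages to f'(theta)
        # over one period of the probing signal
        grad_est = (2.0 / a) * math.sin(omega * t) * f(theta + probe)
        theta -= k * dt * grad_est
    return theta
```

For example, applied to $f(\theta) = (\theta - 2)^2$ from $\theta_0 = 0$, the iterate settles near the minimizer $\theta^* = 2$, since the averaged dynamics reduce to the gradient flow $\dot{\theta} = -k f'(\theta)$.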
{"title":"Markovian Foundations for Quasi-Stochastic Approximation in Two Timescales: Extended Version","authors":"Caio Kalil Lauand, Sean Meyn","doi":"arxiv-2409.07842","DOIUrl":"https://doi.org/arxiv-2409.07842","url":null,"abstract":"Many machine learning and optimization algorithms can be cast as instances of\u0000stochastic approximation (SA). The convergence rate of these algorithms is\u0000known to be slow, with the optimal mean squared error (MSE) of order\u0000$O(n^{-1})$. In prior work it was shown that MSE bounds approaching $O(n^{-4})$\u0000can be achieved through the framework of quasi-stochastic approximation (QSA);\u0000essentially SA with careful choice of deterministic exploration. These results\u0000are extended to two time-scale algorithms, as found in policy gradient methods\u0000of reinforcement learning and extremum seeking control. The extensions are made\u0000possible in part by a new approach to analysis, allowing for the interpretation\u0000of two timescale algorithms as instances of single timescale QSA, made possible\u0000by the theory of negative Lyapunov exponents for QSA. The general theory is\u0000illustrated with applications to extremum seeking control (ESC).","PeriodicalId":501286,"journal":{"name":"arXiv - MATH - Optimization and Control","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142212366","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Paul Häusner, Aleix Nieto Juscafresa, Jens Sjölund
In this paper, we develop a data-driven approach to generate incomplete LU factorizations of large-scale sparse matrices. The learned approximate factorization is utilized as a preconditioner for the corresponding linear system in the GMRES method. Incomplete factorization methods are among the most commonly applied algebraic preconditioners for sparse linear systems and are able to speed up the convergence of Krylov subspace methods. However, they are sensitive to hyperparameters and may suffer from numerical breakdown or lead to slow convergence when not properly applied. We replace the typically hand-engineered algorithms with a graph neural network based approach that is trained on data to predict an approximate factorization. This allows us to learn preconditioners tailored to a specific problem distribution. We analyze and empirically evaluate different loss functions for training the learned preconditioners and demonstrate their effectiveness in decreasing the number of GMRES iterations and improving the spectral properties on our synthetic dataset. The code is available at https://github.com/paulhausner/neural-incomplete-factorization.
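For context, the classical pipeline that the learned preconditioner slots into can be sketched with SciPy: factor the matrix with a hand-engineered incomplete LU and pass the factorization to GMRES as a preconditioner. The 1D Laplacian test matrix and the `drop_tol`/`fill_factor` settings below are illustrative assumptions; in the paper, a graph neural network replaces the `spilu` step.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Sparse test system: a 1D Laplacian (tridiagonal, so ILU is nearly exact here)
n = 500
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# Hand-engineered incomplete LU factorization; the paper learns this step
ilu = spla.spilu(A, drop_tol=1e-4, fill_factor=10)

# Wrap the triangular solves as a linear operator approximating A^{-1}
M = spla.LinearOperator((n, n), matvec=ilu.solve)

# Preconditioned GMRES; info == 0 signals convergence
x, info = spla.gmres(A, b, M=M)
res = np.linalg.norm(b - A @ x) / np.linalg.norm(b)
```

With a good preconditioner the preconditioned system is close to the identity, so GMRES needs only a handful of iterations; this is exactly the iteration count the learned factorizations aim to reduce on harder problem distributions.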
{"title":"Learning incomplete factorization preconditioners for GMRES","authors":"Paul Häusner, Aleix Nieto Juscafresa, Jens Sjölund","doi":"arxiv-2409.08262","DOIUrl":"https://doi.org/arxiv-2409.08262","url":null,"abstract":"In this paper, we develop a data-driven approach to generate incomplete LU\u0000factorizations of large-scale sparse matrices. The learned approximate\u0000factorization is utilized as a preconditioner for the corresponding linear\u0000equation system in the GMRES method. Incomplete factorization methods are one\u0000of the most commonly applied algebraic preconditioners for sparse linear\u0000equation systems and are able to speed up the convergence of Krylov subspace\u0000methods. However, they are sensitive to hyper-parameters and might suffer from\u0000numerical breakdown or lead to slow convergence when not properly applied. We\u0000replace the typically hand-engineered algorithms with a graph neural network\u0000based approach that is trained against data to predict an approximate\u0000factorization. This allows us to learn preconditioners tailored for a specific\u0000problem distribution. We analyze and empirically evaluate different loss\u0000functions to train the learned preconditioners and show their effectiveness to\u0000decrease the number of GMRES iterations and improve the spectral properties on\u0000our synthetic dataset. 
The code is available at\u0000https://github.com/paulhausner/neural-incomplete-factorization.","PeriodicalId":501286,"journal":{"name":"arXiv - MATH - Optimization and Control","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142212371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Obtaining the solution of constrained optimization problems as a function of parameters is important in a multitude of applications, such as control and planning. Solving such parametric optimization problems in real time can present significant challenges, particularly when it is necessary to obtain highly accurate solutions or batches of solutions. To address these challenges, we propose a learning-based iterative solver for constrained optimization which can obtain very fast and accurate solutions by customizing the solver to a specific parametric optimization problem. For a given set of parameters of the constrained optimization problem, we propose a first step with a neural network predictor that outputs primal-dual solutions of a reasonable degree of accuracy. This primal-dual solution is then improved to a very high degree of accuracy in a second step by a learned iterative solver in the form of a neural network. A novel loss function based on the Karush-Kuhn-Tucker conditions of optimality is introduced, enabling fully self-supervised training of both neural networks without the necessity of prior sampling of optimizer solutions. The evaluation of a variety of quadratic and nonlinear parametric test problems demonstrates that the predictor alone is already competitive with recent self-supervised schemes for approximating optimal solutions. The second step of our proposed learning-based iterative constrained optimizer achieves solutions with orders of magnitude better accuracy than other learning-based approaches, while being faster to evaluate than state-of-the-art solvers and natively allowing for GPU parallelization.
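The idea of a KKT-based self-supervised loss can be sketched for an inequality-constrained quadratic program: penalize the squared residuals of stationarity, primal and dual feasibility, and complementary slackness, which all vanish exactly at a primal-dual optimum. This is a minimal sketch under assumed QP structure, not the paper's loss; the function name and the penalty form are illustrative.

```python
import numpy as np

def kkt_loss(x, lam, Q, c, A, b):
    """Self-supervised training signal: squared KKT residuals for
    min 0.5 x^T Q x + c^T x  s.t.  A x <= b.
    Zero if and only if (x, lam) is a primal-dual optimal pair;
    no presampled optimizer solutions are required."""
    stationarity = Q @ x + c + A.T @ lam   # gradient of the Lagrangian
    primal = np.maximum(A @ x - b, 0.0)    # penalize constraint violation only
    dual = np.maximum(-lam, 0.0)           # multipliers must be nonnegative
    comp = lam * (A @ x - b)               # complementary slackness
    return sum(np.sum(r ** 2) for r in (stationarity, primal, dual, comp))

# Worked 1-D example: min x^2 - 2x s.t. x <= 0.5; the constraint is active,
# so the optimum is x* = 0.5 with multiplier lam* = 1, where the loss is zero.
Q = np.array([[2.0]]); c = np.array([-2.0])
A = np.array([[1.0]]); b = np.array([0.5])
loss_opt = kkt_loss(np.array([0.5]), np.array([1.0]), Q, c, A, b)
loss_bad = kkt_loss(np.array([0.0]), np.array([0.0]), Q, c, A, b)
```

Because the loss depends only on the problem data $(Q, c, A, b)$ and the network's own primal-dual output, it can be minimized by gradient descent on the network parameters without ever calling an optimization solver to generate labels.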
{"title":"Self-Supervised Learning of Iterative Solvers for Constrained Optimization","authors":"Lukas Lüken, Sergio Lucia","doi":"arxiv-2409.08066","DOIUrl":"https://doi.org/arxiv-2409.08066","url":null,"abstract":"Obtaining the solution of constrained optimization problems as a function of\u0000parameters is very important in a multitude of applications, such as control\u0000and planning. Solving such parametric optimization problems in real time can\u0000present significant challenges, particularly when it is necessary to obtain\u0000highly accurate solutions or batches of solutions. To solve these challenges,\u0000we propose a learning-based iterative solver for constrained optimization which\u0000can obtain very fast and accurate solutions by customizing the solver to a\u0000specific parametric optimization problem. For a given set of parameters of the\u0000constrained optimization problem, we propose a first step with a neural network\u0000predictor that outputs primal-dual solutions of a reasonable degree of\u0000accuracy. This primal-dual solution is then improved to a very high degree of\u0000accuracy in a second step by a learned iterative solver in the form of a neural\u0000network. A novel loss function based on the Karush-Kuhn-Tucker conditions of\u0000optimality is introduced, enabling fully self-supervised training of both\u0000neural networks without the necessity of prior sampling of optimizer solutions.\u0000The evaluation of a variety of quadratic and nonlinear parametric test problems\u0000demonstrates that the predictor alone is already competitive with recent\u0000self-supervised schemes for approximating optimal solutions. 
The second step of\u0000our proposed learning-based iterative constrained optimizer achieves solutions\u0000with orders of magnitude better accuracy than other learning-based approaches,\u0000while being faster to evaluate than state-of-the-art solvers and natively\u0000allowing for GPU parallelization.","PeriodicalId":501286,"journal":{"name":"arXiv - MATH - Optimization and Control","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142212372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wanru Chen, Rolf N. van Lieshout, Dezhi Zhang, Tom Van Woensel
This paper addresses a multi-period line planning problem in an integrated passenger-freight railway system, aiming to maximize profit while serving passengers and freight using a combination of dedicated passenger trains, dedicated freight trains, and mixed trains. To accommodate demand with different time sensitivities, we develop a period-extended change&go-network