arXiv - MATH - Optimization and Control: Latest Papers

Markovian Foundations for Quasi-Stochastic Approximation in Two Timescales: Extended Version
Pub Date : 2024-09-12 DOI: arxiv-2409.07842
Caio Kalil Lauand, Sean Meyn
Many machine learning and optimization algorithms can be cast as instances of stochastic approximation (SA). The convergence rate of these algorithms is known to be slow, with the optimal mean squared error (MSE) of order $O(n^{-1})$. In prior work it was shown that MSE bounds approaching $O(n^{-4})$ can be achieved through the framework of quasi-stochastic approximation (QSA); essentially SA with careful choice of deterministic exploration. These results are extended to two time-scale algorithms, as found in policy gradient methods of reinforcement learning and extremum seeking control. The extensions are made possible in part by a new approach to analysis, allowing for the interpretation of two timescale algorithms as instances of single timescale QSA, made possible by the theory of negative Lyapunov exponents for QSA. The general theory is illustrated with applications to extremum seeking control (ESC).
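To make the contrast between random and deterministic exploration concrete, the toy sketch below runs the same scalar recursion twice, once driven by Gaussian noise (SA) and once by a sinusoidal probing signal (QSA). Everything in it is an illustrative assumption rather than material from the paper: the names `run`, `qsa_probe` and `sa_noise`, the linear target with root 1, the step size $a_n = 1/(n+1)$, and the probing frequency. It shows only the basic gap between the two kinds of exploration in this simple case, not the two-timescale algorithms or the $O(n^{-4})$ bounds.

```python
import math
import random


def run(noise, n_steps, theta0=0.0):
    """Scalar recursion theta_{n+1} = theta_n + a_n * (-(theta_n - 1) + noise(n))."""
    theta = theta0
    for n in range(n_steps):
        a_n = 1.0 / (n + 1)  # assumed step size; gives theta_N = 1 + mean(noise)
        theta += a_n * (-(theta - 1.0) + noise(n))
    return theta


def qsa_probe(n):
    """Deterministic exploration: a sinusoid with an assumed frequency of 1 rad/step."""
    return math.sin(float(n))


rng = random.Random(0)


def sa_noise(n):
    """Random exploration: i.i.d. standard Gaussian noise."""
    return rng.gauss(0.0, 1.0)


N = 10_000
qsa_err = abs(run(qsa_probe, N) - 1.0)
sa_err = abs(run(sa_noise, N) - 1.0)
print(f"QSA error: {qsa_err:.2e}   SA error: {sa_err:.2e}")
```

With this step size the iterate equals 1 plus the running average of the exploration signal. The partial sums of $\sin(n)$ are bounded by $1/\sin(1/2) \approx 2.09$, so the QSA error is at most about $2.09/N$, whereas the average of Gaussian noise has standard deviation $1/\sqrt{N}$.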
Citations: 0
Learning incomplete factorization preconditioners for GMRES
Pub Date : 2024-09-12 DOI: arxiv-2409.08262
Paul Häusner, Aleix Nieto Juscafresa, Jens Sjölund
In this paper, we develop a data-driven approach to generate incomplete LU factorizations of large-scale sparse matrices. The learned approximate factorization is utilized as a preconditioner for the corresponding linear equation system in the GMRES method. Incomplete factorization methods are among the most commonly applied algebraic preconditioners for sparse linear equation systems and are able to speed up the convergence of Krylov subspace methods. However, they are sensitive to hyper-parameters and might suffer from numerical breakdown or lead to slow convergence when not properly applied. We replace the typically hand-engineered algorithms with a graph neural network-based approach that is trained against data to predict an approximate factorization. This allows us to learn preconditioners tailored for a specific problem distribution. We analyze and empirically evaluate different loss functions to train the learned preconditioners and show their effectiveness in decreasing the number of GMRES iterations and improving the spectral properties on our synthetic dataset. The code is available at https://github.com/paulhausner/neural-incomplete-factorization.
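For orientation, the sketch below implements the classical zero-fill incomplete LU factorization, ILU(0), which is one of the hand-engineered baselines this abstract refers to. It is not the learned, graph-neural-network method of the paper. The function names `ilu0` and `multiply_factors` and the small test matrix are assumptions made for illustration, and dense lists are used only to keep the example short.

```python
def ilu0(A):
    """Return the ILU(0) factors of A packed in one matrix (IKJ variant).

    Entries below the diagonal hold L (unit diagonal implied); the rest hold U.
    Updates that would create nonzeros outside A's sparsity pattern are dropped.
    """
    n = len(A)
    pattern = [[A[i][j] != 0.0 for j in range(n)] for i in range(n)]
    LU = [row[:] for row in A]
    for i in range(1, n):
        for k in range(i):
            if not pattern[i][k]:
                continue
            LU[i][k] /= LU[k][k]  # numerical breakdown occurs if a pivot is zero
            for j in range(k + 1, n):
                if pattern[i][j]:
                    LU[i][j] -= LU[i][k] * LU[k][j]
    return LU


def multiply_factors(LU):
    """Expand the packed factors and return the dense product L @ U."""
    n = len(LU)
    L = [[LU[i][j] if j < i else float(i == j) for j in range(n)] for i in range(n)]
    U = [[LU[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return [[sum(L[i][k] * U[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Small symmetric test matrix whose exact LU factors would have fill-in.
A = [[4.0, -1.0, -1.0, 0.0],
     [-1.0, 4.0, 0.0, -1.0],
     [-1.0, 0.0, 4.0, -1.0],
     [0.0, -1.0, -1.0, 4.0]]
P = multiply_factors(ilu0(A))
for row in P:
    print([round(v, 4) for v in row])
```

On the sparsity pattern of `A` the product `P` reproduces `A`; the dropped fill-in shows up as the nonzero entries `P[1][2]` and `P[2][1]`, which is the approximation error a preconditioner of this kind accepts.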
Citations: 0
Self-Supervised Learning of Iterative Solvers for Constrained Optimization
Pub Date : 2024-09-12 DOI: arxiv-2409.08066
Lukas Lüken, Sergio Lucia
Obtaining the solution of constrained optimization problems as a function of parameters is very important in a multitude of applications, such as control and planning. Solving such parametric optimization problems in real time can present significant challenges, particularly when it is necessary to obtain highly accurate solutions or batches of solutions. To solve these challenges, we propose a learning-based iterative solver for constrained optimization which can obtain very fast and accurate solutions by customizing the solver to a specific parametric optimization problem. For a given set of parameters of the constrained optimization problem, we propose a first step with a neural network predictor that outputs primal-dual solutions of a reasonable degree of accuracy. This primal-dual solution is then improved to a very high degree of accuracy in a second step by a learned iterative solver in the form of a neural network. A novel loss function based on the Karush-Kuhn-Tucker conditions of optimality is introduced, enabling fully self-supervised training of both neural networks without the necessity of prior sampling of optimizer solutions. The evaluation of a variety of quadratic and nonlinear parametric test problems demonstrates that the predictor alone is already competitive with recent self-supervised schemes for approximating optimal solutions. The second step of our proposed learning-based iterative constrained optimizer achieves solutions with orders of magnitude better accuracy than other learning-based approaches, while being faster to evaluate than state-of-the-art solvers and natively allowing for GPU parallelization.
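The idea of a loss built from the Karush-Kuhn-Tucker conditions can be sketched for an inequality-constrained quadratic program, minimize $\tfrac12 x^\top Q x + c^\top x$ subject to $A x \le b$. The function below sums the squared residuals of stationarity, primal feasibility, dual feasibility, and complementarity. This is a generic KKT residual written for illustration; the function name `kkt_loss`, the equal weighting of the four terms, and the one-dimensional example are assumptions, and the loss actually used in the paper may differ.

```python
def kkt_loss(Q, c, A, b, x, lam):
    """Sum of squared KKT residuals for min 0.5 x'Qx + c'x subject to Ax <= b."""
    n, m = len(x), len(lam)
    # Stationarity: Qx + c + A' lam = 0
    stationarity = [
        sum(Q[i][j] * x[j] for j in range(n)) + c[i]
        + sum(A[k][i] * lam[k] for k in range(m))
        for i in range(n)
    ]
    # Constraint values Ax - b (non-positive when feasible)
    slack = [sum(A[k][j] * x[j] for j in range(n)) - b[k] for k in range(m)]
    primal = [max(s, 0.0) for s in slack]          # primal feasibility violation
    dual = [min(l, 0.0) for l in lam]              # dual feasibility violation
    comp = [l * s for l, s in zip(lam, slack)]     # complementary slackness
    return sum(r * r for r in stationarity + primal + dual + comp)


# Example: minimize 0.5*x^2 - 2*x subject to x <= 1.
Q, c, A, b = [[1.0]], [-2.0], [[1.0]], [1.0]
loss_at_optimum = kkt_loss(Q, c, A, b, x=[1.0], lam=[1.0])
loss_unconstrained = kkt_loss(Q, c, A, b, x=[2.0], lam=[0.0])
print(loss_at_optimum, loss_unconstrained)
```

The loss vanishes at the primal-dual optimum $(x, \lambda) = (1, 1)$ and is positive at the infeasible unconstrained minimizer $x = 2$, so it can be evaluated on a network's output without a precomputed reference solution.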
Citations: 0
Multi-period railway line planning for integrated passenger-freight transportation
Pub Date : 2024-09-12 DOI: arxiv-2409.08256
Wanru Chen, Rolf N. van Lieshout, Dezhi Zhang, Tom Van Woensel
This paper addresses a multi-period line planning problem in an integrated passenger-freight railway system, aiming to maximize profit while serving passengers and freight using a combination of dedicated passenger trains, dedicated freight trains, and mixed trains. To accommodate demand with different time sensitivities, we develop a period-extended change&go-network that tracks the paths taken by passengers and freight. The problem is formulated as a path-based mixed integer programming model, with the linear relaxation solved using column generation. Paths for passengers and freight are dynamically generated by solving pricing problems defined as elementary shortest-path problems with duration constraints. We propose two heuristic approaches: price-and-branch and a diving heuristic, with acceleration strategies, to find integer feasible solutions efficiently. Computational experiments on the Chinese high-speed railway network demonstrate that the diving heuristic outperforms the price-and-branch heuristic in both computational time and solution quality. Additionally, the experiments highlight the benefits of integrating freight, the advantages of multi-period line planning, and the impact of different demand patterns on line operations.
Citations: 0
Optimal Consumption for Recursive Preferences with Local Substitution under Risk
Pub Date : 2024-09-12 DOI: arxiv-2409.07799
Hanwu Li, Frank Riedel
We explore intertemporal preferences that are recursive and account for local intertemporal substitution. First, we establish a rigorous foundation for these preferences and analyze their properties. Next, we examine the associated optimal consumption problem, proving the existence and uniqueness of the optimal consumption plan. We present an infinite-dimensional version of the Kuhn-Tucker theorem, which provides the necessary and sufficient conditions for optimality. Additionally, we investigate quantitative properties and the construction of the optimal consumption plan. Finally, we offer a detailed description of the structure of optimal consumption within a geometric Poisson framework.
Citations: 0
How can the tragedy of the commons be prevented?: Introducing Linear Quadratic Mixed Mean Field Games
Pub Date : 2024-09-12 DOI: arxiv-2409.08235
Gokce Dayanikli, Mathieu Lauriere
In a regular mean field game (MFG), the agents are assumed to be insignificant: they do not realize their effect on the population level, and this may result in a phenomenon that economists call the Tragedy of the Commons. However, in real life this phenomenon is often avoided thanks to the underlying altruistic behavior of (all or some of the) agents. Motivated by this observation, we introduce and analyze two different mean field models to include altruism in the decision making of agents. In the first model, mixed individual MFGs, there are infinitely many agents who are partially altruistic (i.e., they behave partially cooperatively) and partially non-cooperative. In the second model, mixed population MFGs, one part of the population behaves cooperatively and the remaining agents behave non-cooperatively. Both models are introduced in a general linear quadratic framework for which we characterize the equilibrium via forward backward stochastic differential equations. Furthermore, we give explicit solutions in terms of ordinary differential equations, and prove the existence and uniqueness results.
Citations: 0
Duality theory in linear optimization and its extensions -- formally verified
Pub Date : 2024-09-12 DOI: arxiv-2409.08119
Martin Dvorak, Vladimir Kolmogorov
Farkas established that a system of linear inequalities has a solution if and only if we cannot obtain a contradiction by taking a linear combination of the inequalities. We state and formally prove several Farkas-like theorems over linearly ordered fields in Lean 4. Furthermore, we extend duality theory to the case when some coefficients are allowed to take "infinite values".
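For readers who want the statement in symbols, one standard textbook form of the result for systems of inequalities is written out below, followed by a two-inequality example. The notation ($F$, $A$, $b$, $x$, $y$) is chosen here for illustration, and the theorems formalized in the paper may be stated differently.

```latex
\textbf{Farkas-type alternative (inequality form).}
Let $F$ be a linearly ordered field, $A \in F^{m \times n}$ and $b \in F^{m}$.
Exactly one of the following holds:
\begin{enumerate}
  \item there exists $x \in F^{n}$ with $A x \le b$;
  \item there exists $y \in F^{m}$ with $y \ge 0$, $A^{\top} y = 0$ and $b^{\top} y < 0$.
\end{enumerate}

\textbf{Example.} The system $x \le 1$, $-x \le -2$ has
$A = (1, -1)^{\top}$ and $b = (1, -2)^{\top}$.
Taking $y = (1, 1)^{\top}$ gives $A^{\top} y = 1 - 1 = 0$ and $b^{\top} y = 1 - 2 = -1 < 0$,
so adding the two inequalities yields the contradiction $0 \le -1$ and the system has no solution.
```

The vector $y$ is the nonnegative linear combination of the inequalities that produces the contradiction mentioned in the abstract.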
Citations: 0
Accelerated Multi-Time-Scale Stochastic Approximation: Optimal Complexity and Applications in Reinforcement Learning and Multi-Agent Games
Pub Date : 2024-09-12 DOI: arxiv-2409.07767
Sihan Zeng, Thinh T. Doan
Multi-time-scale stochastic approximation is an iterative algorithm for finding the fixed point of a set of $N$ coupled operators given their noisy samples. It has been observed that due to the coupling between the decision variables and noisy samples of the operators, the performance of this method decays as $N$ increases. In this work, we develop a new accelerated variant of multi-time-scale stochastic approximation, which significantly improves the convergence rates of its standard counterpart. Our key idea is to introduce auxiliary variables to dynamically estimate the operators from their samples, which are then used to update the decision variables. These auxiliary variables help not only to control the variance of the operator estimates but also to decouple the sampling noise and the decision variables. This allows us to select more aggressive step sizes to achieve an optimal convergence rate. Specifically, under a strong monotonicity condition, we show that for any value of $N$ the $t^{\text{th}}$ iterate of the proposed algorithm converges to the desired solution at a rate $\widetilde{O}(1/t)$ when the operator samples are generated from a single Markov process trajectory. A second contribution of this work is to demonstrate that the objective of a range of problems in reinforcement learning and multi-agent games can be expressed as a system of fixed-point equations. As such, the proposed approach can be used to design new learning algorithms for solving these problems. We illustrate this observation with numerical simulations in a multi-agent game and show the advantage of the proposed method over the standard multi-time-scale stochastic approximation algorithm.
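The mechanism described here, auxiliary variables that average the noisy operator samples and are then used in place of the raw samples, can be sketched on a toy problem with two coupled linear operators. The sketch is only a reading of the abstract: the operators, the i.i.d. Gaussian noise (instead of Markovian samples), the step sizes `alpha`, `beta`, `lam`, and the function name `accelerated_sa` are all assumptions, and the paper's actual algorithm and step-size rules may differ.

```python
import random


def accelerated_sa(n_steps, seed=0):
    """Solve F1(x, y) = x - 0.5*y - 1 = 0 and F2(x, y) = y - 0.5*x = 0
    from noisy samples; the exact solution is (x, y) = (4/3, 2/3)."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    f1_hat, f2_hat = 0.0, 0.0  # auxiliary running estimates of F1 and F2
    for k in range(n_steps):
        alpha = 1.0 / (k + 10)       # assumed step size for x
        beta = 2.0 / (k + 10)        # assumed (larger) step size for y
        lam = 1.0 / (k + 1) ** 0.5   # assumed averaging weight for the estimates
        # Noisy operator samples at the current iterate
        s1 = (x - 0.5 * y - 1.0) + rng.gauss(0.0, 1.0)
        s2 = (y - 0.5 * x) + rng.gauss(0.0, 1.0)
        # Auxiliary variables track the operators by exponential averaging
        f1_hat = (1.0 - lam) * f1_hat + lam * s1
        f2_hat = (1.0 - lam) * f2_hat + lam * s2
        # Decision variables move along the averaged estimates, not the raw samples
        x -= alpha * f1_hat
        y -= beta * f2_hat
    return x, y


x_final, y_final = accelerated_sa(50_000)
print(f"x = {x_final:.3f} (target 1.333), y = {y_final:.3f} (target 0.667)")
```

Because the decision variables see only the averaged estimates `f1_hat` and `f2_hat`, the per-step noise entering `x` and `y` is damped, which is the decoupling effect the abstract attributes to the auxiliary variables.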
Citations: 0
On an optimization model for firefighting helicopter planning
Pub Date : 2024-09-12 DOI: arxiv-2409.07937
Marta Rodríguez Barreiro, María José Ginzo Villamayor, Fernando Pérez Porras, María Luisa Carpente Rodríguez, Silvia María Lorenzo Freire
During a wildfire, the work of the aerial coordinator is crucial for the control of the wildfire and the minimization of the burned area and the damage caused. Since it could be very useful for the coordinator to have decision-making tools at his/her disposal, this framework deals with an optimization model to obtain the optimal planning of firefighting helicopters, deciding the points where the aircraft should load water, the areas of the wildfire where they should work, and the rest bases to which each helicopter should be assigned. A Mixed Integer Linear Programming model was developed, which takes into account the configuration of helicopters in closed circuits as well as the aerial flight regulations in Spain. Due to the complexity of the model, two algorithms are developed, based on the Simulated Annealing and Iterated Local Search metaheuristic techniques. Both algorithms are tested with real data instances, obtaining very promising results for future application in the planning of aircraft throughout a wildfire evolution.
Citations: 0
Optimal control for coupled sweeping processes under minimal assumptions
Pub Date : 2024-09-12 DOI: arxiv-2409.07722
Samara Chamoun, Vera Zeidan
In this paper, the study of nonsmooth optimal control problems (P) involving a controlled sweeping process with three main characteristics is launched. First, the sweeping sets C(t) are nonsmooth, unbounded, time-dependent, uniformly prox-regular, and satisfy minimal assumptions. Second, the sweeping process is coupled with a controlled differential equation. Third, a joint-state endpoint constraint set S, including periodic conditions, is present. The existence and uniqueness of a Lipschitz solution for our dynamic are established, the existence of an optimal solution for our general form of optimal control is obtained, and the full form of the nonsmooth Pontryagin maximum principle for strong local minimizers in (P) is derived under minimal hypotheses. One of the novelties of this paper is the idea of working with a well-constructed problem, corresponding to truncated sweeping sets and joint endpoint constraints, that shares the same strong local minimizer as (P) and for which the exponential-penalty approximation technique can be developed using only the assumptions on (P).
Citations: 0