
SIAM Journal on Control and Optimization: Latest Articles

On Borkar and Young Relaxed Control Topologies and Continuous Dependence of Invariant Measures on Control Policy
IF 2.2; Zone 2 (Mathematics); Q2, Automation & Control Systems; Pub Date: 2024-08-12; DOI: 10.1137/23m1571940
Serdar Yüksel
SIAM Journal on Control and Optimization, Volume 62, Issue 4, Page 2367-2386, August 2024.
Abstract. In deterministic and stochastic control theory, relaxed or randomized control policies allow for versatile mathematical analysis (on continuity, compactness, convexity, and approximations) to be applicable with no artificial restrictions on the classes of control policies considered, leading to very general existence results on optimal measurable policies under various setups and information structures. On relaxed controls, two studied topologies are the Young and Borkar (weak*) topologies on spaces of functions from a state/measurement space to the space of probability measures on control action spaces; the former via a weak convergence topology on probability measures on a product space with a fixed marginal on the input (state) space, and the latter via a weak* topology on randomized policies viewed as maps from states/measurements to the space of signed measures with bounded variation. We establish implication and equivalence conditions between the Young and Borkar topologies on control policies. We then show that, under some conditions, for a controlled Markov chain with standard Borel spaces the invariant measure is weakly continuous on the space of stationary control policies defined by either of these topologies. An implication is near-optimality of quantized stationary policies in state and actions or continuous stationary and deterministic policies for average cost control under two sets of continuity conditions (with either weak continuity in the state-action pair or strong continuity in the action for each state) on transition kernels.
Citations: 0
Constituting an Extension of Lyapunov’s Direct Method
IF 2.2; Zone 2 (Mathematics); Q2, Automation & Control Systems; Pub Date: 2024-08-12; DOI: 10.1137/23m1595242
M. Akbarian, N. Pariz, A. Heydari
SIAM Journal on Control and Optimization, Volume 62, Issue 4, Page 2346-2366, August 2024.
Abstract. This paper investigates new sufficient conditions for the stability, asymptotic stability, and global asymptotic stability of nonlinear autonomous systems, specifically in cases where the first derivative of the Lyapunov function candidate may have both positive and negative values on its domain. The main contribution of this approach is the introduction of a new auxiliary function that relaxes the stability conditions, allowing the first derivative of the Lyapunov function candidate to be less than or equal to a nonnegative function. The suggested auxiliary function should be integrable within our first theorem. Meanwhile, our first corollary presents a technique that simplifies the task by establishing specific conditions related to differential inequalities. This weaker condition in the proposed results enables the establishment of stability properties in cases where the Lyapunov function candidate is not well chosen or finding a Lyapunov function is not straightforward. Additionally, it is proven that the original Lyapunov method for autonomous systems is a special case of our first theorem. Furthermore, it is demonstrated that assumptions in previous studies, such as Matrosov’s theorem or results on higher-order derivatives of the Lyapunov function, guarantee the existence of our auxiliary function. Finally, lemmas are provided to construct these auxiliary functions, and examples are presented to demonstrate the effectiveness of this approach. This work will contribute to the development of stability analysis techniques for nonlinear autonomous systems.
Citations: 0
The Unconditional Consensus Control through Leadership for the Delayed Hegselmann–Krause Model
IF 2.2; Zone 2 (Mathematics); Q2, Automation & Control Systems; Pub Date: 2024-08-08; DOI: 10.1137/23m1588858
Linglong Du, Jianwen Zhu, Feng Xie
Citations: 0
Sharp Equilibria for Time-Inconsistent Mean-Field Stopping Games
IF 2.2; Zone 2 (Mathematics); Q2, Automation & Control Systems; Pub Date: 2024-08-08; DOI: 10.1137/23m1625512
Ziyuan Wang, Zhou Zhou
SIAM Journal on Control and Optimization, Volume 62, Issue 4, Page 2319-2345, August 2024.
Abstract. We investigate time-inconsistent mean-field stopping games under nonexponential discounting in discrete time. At the intrapersonal level, each player plays against her future selves as a result of the time inconsistency caused by nonexponential discounting. At the interpersonal level, she plays against other players due to players’ interaction via the proportion of players that have stopped. We look for sharp mean-field equilibria (MFEs), such that given other players’ stopping policies, the representative player’s strategy not only is an intrapersonal equilibrium, but also an optimal one among all such intrapersonal equilibria. We analyze two classes of examples. The first one is on time-inconsistent bank-run models, and we construct an (optimal) sharp MFE by a monotone iteration scheme. The second one has a Markovian setup and no common noise, and we show the existence of a sharp MFE based on the Tikhonov fixed-point theorem.
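To illustrate the time inconsistency that nonexponential discounting can cause, the following minimal sketch compares a hyperbolic discount function with an exponential one on a single stop-now-or-wait choice. It is not taken from the paper: the hyperbolic form, the parameters `k` and `beta`, the helper `prefers_later`, and the payoffs are all hypothetical choices made for illustration.

```python
def hyperbolic(delay, k=1.0):
    """Hyperbolic discount factor 1 / (1 + k * delay); an assumed example
    of nonexponential discounting."""
    return 1.0 / (1.0 + k * delay)


def exponential(delay, beta=0.9):
    """Standard exponential discount factor beta ** delay."""
    return beta ** delay


def prefers_later(discount, now, early=(5, 10.0), late=(6, 12.0)):
    """Return True if, judged at time `now`, the later payoff is worth more.

    `early` and `late` are hypothetical (time, payoff) pairs.
    """
    (t_early, r_early), (t_late, r_late) = early, late
    value_early = discount(t_early - now) * r_early
    value_late = discount(t_late - now) * r_late
    return value_late > value_early


for name, discount in [("hyperbolic", hyperbolic), ("exponential", exponential)]:
    plan_at_0 = prefers_later(discount, now=0)   # plan made in advance
    choice_at_5 = prefers_later(discount, now=5)  # decision when time 5 arrives
    print(f"{name}: waits at t=0? {plan_at_0}; waits at t=5? {choice_at_5}")
```

With these assumed numbers, the hyperbolic agent plans at time 0 to wait but prefers to stop once time 5 arrives, whereas the exponential agent makes the same choice at both dates. This disagreement between current and future selves is the kind of intrapersonal conflict the abstract refers to.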
Citations: 0
An Observer for Pipeline Flow with Hydrogen Blending in Gas Networks: Exponential Synchronization
IF 2.2; Zone 2 (Mathematics); Q2, Automation & Control Systems; Pub Date: 2024-08-07; DOI: 10.1137/23m1563840
Martin Gugat, Jan Giesselmann
SIAM Journal on Control and Optimization, Volume 62, Issue 4, Page 2273-2296, August 2024.
Abstract. We consider a state estimation problem for gas flows in pipeline networks where hydrogen is blended into the natural gas. The flow is modeled by the quasi-linear isothermal Euler equations coupled to an advection equation on a graph. The flow through the vertices where the pipes are connected is governed by algebraic node conditions. The state is approximated by an observer system that uses nodal measurements. We prove that the state of the observer system converges to the original system state exponentially fast in the [math]-norm if the measurements are exact. If measurement errors are present we show that the observer state approximates the original system state up to an error that is proportional to the maximal measurement error. The proof of the synchronization result uses Lyapunov functions with exponential weights.
Citations: 0
Dual-Gain Function Based Prescribed-Time Output Feedback Control Nonlinear Time-Delay Systems
IF 2.2; Zone 2 (Mathematics); Q2, Automation & Control Systems; Pub Date: 2024-08-05; DOI: 10.1137/23m1556496
Pengju Ning, Sergey N. Dashkovskiy, Changchun Hua, Kuo Li
SIAM Journal on Control and Optimization, Volume 62, Issue 4, Page 2254-2272, August 2024.
Abstract. This paper investigates the prescribed-time output feedback stabilization problem for a class of nonlinear time-delay systems. First, a novel dual-gain function is put forward by exploiting the dynamic gain and the time-varying gain function to design the reduced-order observer for reconstructing unavailable states. Then, by utilizing the Lyapunov–Krasovskii functional and state variables of the reduced-order observer, a new prescribed-time controller is presented based on the nonscaling design framework. Since no state scaling is required in controller design process under this framework, our control strategy is simpler and can greatly reduce the computational burden. Further, compared with the previous prescribed-time stabilization results, our designed controller acts on the entire time domain, not just a limited time interval. Based on our proposed stability criterion, it is proved that the controller can render that all system state variables converge to the origin within the prescribed time. Finally, a numerical example is provided to illustrate the effectiveness of the proposed control strategy.
Citations: 0
Deep Relaxation of Controlled Stochastic Gradient Descent via Singular Perturbations
IF 2.2; Zone 2 (Mathematics); Q2, Automation & Control Systems; Pub Date: 2024-07-24; DOI: 10.1137/23m1544878
Martino Bardi, Hicham Kouhkouh
SIAM Journal on Control and Optimization, Volume 62, Issue 4, Page 2229-2253, August 2024.
Abstract. We consider a singularly perturbed system of stochastic differential equations proposed by Chaudhari et al. (Res. Math. Sci. 2018) to approximate the entropic gradient descent in the optimization of deep neural networks via homogenization. We embed it in a much larger class of two-scale stochastic control problems and rely on convergence results for Hamilton–Jacobi–Bellman equations with unbounded data proved recently by ourselves (ESAIM Control Optim. Calc. Var. 2023). We show that the limit of the value functions is itself the value function of an effective control problem with extended controls and that the trajectories of the perturbed system converge in a suitable sense to the trajectories of the limiting effective control system. These rigorous results improve the understanding of the convergence of the algorithms used by Chaudhari et al., as well as of their possible extensions where some tuning parameters are modeled as dynamic controls.
Citations: 0
Zero-Sum Stopper Versus Singular-Controller Games with Constrained Control Directions
IF 2.2; Zone 2 (Mathematics); Q2, Automation & Control Systems; Pub Date: 2024-07-19; DOI: 10.1137/23m1579558
Andrea Bovo, Tiziano De Angelis, Jan Palczewski
SIAM Journal on Control and Optimization, Volume 62, Issue 4, Page 2203-2228, August 2024.
Abstract. We consider a class of zero-sum stopper versus singular-controller games in which the controller can only act on a subset [math] of the [math] coordinates of a controlled diffusion. Due to the constraint on the control directions these games fall outside the framework of recently studied variational methods. In this paper we develop an approximation procedure, based on [math]-stability estimates for the controlled diffusion process and almost sure convergence of suitable stopping times. That allows us to prove existence of the game’s value and to obtain an optimal strategy for the stopper under continuity and growth conditions on the payoff functions. This class of games is a natural extension of (single-agent) singular control problems, studied in the literature, with similar constraints on the admissible controls.
Citations: 0
On Markov Perfect Equilibria in Discounted Stochastic ARAT Games
IF 2.2; Zone 2 (Mathematics); Q2, Automation & Control Systems; Pub Date: 2024-07-16; DOI: 10.1137/23m1592365
Anna Jaśkiewicz, Andrzej S. Nowak
Citations: 0
Graph-Structured Tensor Optimization for Nonlinear Density Control and Mean Field Games
IF 2.2; Zone 2 (Mathematics); Q2, Automation & Control Systems; Pub Date: 2024-07-16; DOI: 10.1137/23m1571587
Axel Ringh, Isabel Haasler, Yongxin Chen, Johan Karlsson
SIAM Journal on Control and Optimization, Volume 62, Issue 4, Page 2176-2202, August 2024.
Abstract. In this work we develop a numerical method for solving a type of convex graph-structured tensor optimization problem. This type of problem, which can be seen as a generalization of multimarginal optimal transport problems with graph-structured costs, appears in many applications. Examples are unbalanced optimal transport and multispecies potential mean field games, where the latter is a class of nonlinear density control problems. The method we develop is based on coordinate ascent in a Lagrangian dual, and under mild assumptions we prove that the algorithm converges globally. Moreover, under a set of stricter assumptions, the algorithm converges R-linearly. To perform the coordinate ascent steps one has to compute projections of the tensor, and doing so by brute force is in general not computationally feasible. Nevertheless, for certain graph structures it is possible to derive efficient methods for computing these projections, and here we specifically consider the graph structure that occurs in multispecies potential mean field games. We also illustrate the methodology on a numerical example from this problem class.
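As a rough illustration of coordinate ascent in a Lagrangian dual, the sketch below runs Sinkhorn iterations for a two-marginal, entropy-regularized optimal transport problem. This is a swapped-in textbook special case, not the paper's algorithm: the abstract does not state that entropy regularization is used, and the function `sinkhorn`, the regularization weight `eps`, and the toy data are assumptions made here. Each update maximizes the dual exactly in one block of variables while the other is held fixed.

```python
import numpy as np


def sinkhorn(cost, mu, nu, eps=0.5, iters=200):
    """Entropy-regularized optimal transport between histograms mu and nu.

    Assumed simplification: two marginals only. With u = exp(alpha / eps)
    and v = exp(beta / eps), each update below is an exact block
    maximization of the dual in alpha (resp. beta).
    """
    kernel = np.exp(-cost / eps)      # Gibbs kernel built from the cost matrix
    u = np.ones_like(mu)
    v = np.ones_like(nu)
    for _ in range(iters):
        u = mu / (kernel @ v)         # ascent step in the first dual block
        v = nu / (kernel.T @ u)       # ascent step in the second dual block
    # Recover the primal transport plan from the scaled dual variables.
    return u[:, None] * kernel * v[None, :]


# Toy data (hypothetical): five points on [0, 1] with squared-distance cost.
x = np.linspace(0.0, 1.0, 5)
cost = (x[:, None] - x[None, :]) ** 2
mu = np.full(5, 0.2)
nu = np.array([0.1, 0.2, 0.4, 0.2, 0.1])

plan = sinkhorn(cost, mu, nu)
print("row marginals:   ", plan.sum(axis=1))
print("column marginals:", plan.sum(axis=0))
```

In this two-marginal case the marginal computations are plain matrix-vector products. In the multimarginal setting the abstract describes, the analogous step is a projection of a high-order tensor, which is why exploiting graph structure matters for cost.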
Citations: 0