SIAM Journal on Control and Optimization: Latest Articles

Verification Methods for the Lyapunov–Krasovskii Functional Inequalities
IF 2.2 | Zone 2 (Mathematics) | Q1 Mathematics | Pub Date: 2024-03-04 | DOI: 10.1137/22m1542167
Ali Diab, Giorgio Valmorbida, William Pasillas-Lépine
SIAM Journal on Control and Optimization, Volume 62, Issue 2, Page 877-902, April 2024.
Abstract. We study parameterizations of Lyapunov–Krasovskii functionals (LKFs) to analyze the stability of linear time-delay systems. We discuss the solution to the delay Lyapunov matrix, which constructs an LKF associated with a prescribed time derivative, and relate it to the approaches commonly used in the numerical computation of LKFs. We then compare two approaches for the stability analysis of time-delay systems based on semidefinite programming, namely the method based on integral inequalities and the method based on sum-of-squares programming, which have recently emerged as optimization-based methods to compute LKFs. We discuss their main assumptions and establish connections between both methods. Finally, we formulate a projection-based method allowing us to use general sets of functions to parameterize LKFs, thus encompassing the sets of polynomial functions in the literature. The solutions of the proposed stability conditions and the construction of the corresponding LKFs as stability certificates are illustrated with numerical examples.
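For orientation, the textbook setting behind this abstract can be written out as follows. The notation ($A_0$, $A_1$, $h$, $P$, $Q$, $R$, $S$, $\varepsilon$) is assumed here for illustration and is not taken from the paper, whose parameterizations may differ.

```latex
% Linear system with a single constant delay h > 0 (assumed notation)
\[
  \dot{x}(t) = A_0\, x(t) + A_1\, x(t-h), \qquad
  x_t(\theta) = x(t+\theta), \quad \theta \in [-h, 0].
\]

% A general quadratic Lyapunov--Krasovskii functional with kernels P, Q, R, S
\[
  V(x_t) = x(t)^{\top} P\, x(t)
         + 2\, x(t)^{\top} \int_{-h}^{0} Q(\theta)\, x(t+\theta)\, d\theta
         + \int_{-h}^{0}\!\int_{-h}^{0} x(t+\theta)^{\top} R(\theta,\eta)\, x(t+\eta)\, d\eta\, d\theta
         + \int_{-h}^{0} x(t+\theta)^{\top} S(\theta)\, x(t+\theta)\, d\theta .
\]

% The two functional inequalities to be verified, for some \varepsilon > 0
\[
  V(\varphi) \ge \varepsilon\, \|\varphi(0)\|^{2}, \qquad
  \dot{V}(\varphi) \le -\varepsilon\, \|\varphi(0)\|^{2} .
\]
```

Roughly speaking, the optimization-based methods mentioned in the abstract restrict the kernels $Q$, $R$, $S$ to a finite-dimensional family (for example polynomials) so that both inequalities can be checked by semidefinite programming.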
Citations: 0
On Global Approximate Controllability of a Quantum Particle in a Box by Moving Walls
IF 2.2 | Zone 2 (Mathematics) | Q1 Mathematics | Pub Date: 2024-03-04 | DOI: 10.1137/22m1518980
Aitor Balmaseda, Davide Lonigro, Juan Manuel Pérez-Pardo
SIAM Journal on Control and Optimization, Volume 62, Issue 2, Page 826-852, April 2024.
Abstract. We study a system composed of a free quantum particle trapped in a box whose walls can change their position. We prove the global approximate controllability of the system: any initial pure state can be driven arbitrarily close to any target pure state in the Hilbert space of the free particle with a predetermined final position of the box. To this purpose we consider weak solutions of the Schrödinger equation and use a stability theorem for the time-dependent Schrödinger equation.
Citations: 0
Long-Run Impulse Control with Generalized Discounting
IF 2.2 | Zone 2 (Mathematics) | Q1 Mathematics | Pub Date: 2024-03-04 | DOI: 10.1137/23m1582539
Damian Jelito, Łukasz Stettner
SIAM Journal on Control and Optimization, Volume 62, Issue 2, Page 853-876, April 2024.
Abstract. In this paper, we investigate the effects of applying generalized (nonexponential) discounting on a long-run impulse control problem for a Feller–Markov process. We show that the optimal value of the discounted problem is the same as the optimal value of its undiscounted version. Next, we prove that an optimal strategy for the undiscounted discrete-time functional is also optimal for the discrete-time discounted criterion and nearly optimal for the continuous-time discounted one. This shows that the discounted problem, being time-inconsistent in nature, admits a time-consistent solution. Also, instead of a complex time-dependent Bellman equation, one may consider its simpler time-independent version.
Citations: 0
Learning Stationary Nash Equilibrium Policies in [math]-Player Stochastic Games with Independent Chains
IF 2.2 | Zone 2 (Mathematics) | Q1 Mathematics | Pub Date: 2024-03-01 | DOI: 10.1137/22m1512880
S. Rasoul Etesami
SIAM Journal on Control and Optimization, Volume 62, Issue 2, Page 799-825, April 2024.
Abstract. We consider a subclass of [math]-player stochastic games, in which players have their own internal state/action spaces while they are coupled through their payoff functions. It is assumed that players’ internal chains are driven by independent transition probabilities. Moreover, players can receive only realizations of their payoffs, not the actual functions, and cannot observe each other’s states/actions. For this class of games, we first show that finding a stationary Nash equilibrium (NE) policy without any assumption on the reward functions is intractable. However, for general reward functions, we develop polynomial-time learning algorithms based on dual averaging and dual mirror descent, which converge in terms of the averaged Nikaido–Isoda distance to the set of [math]-NE policies almost surely or in expectation. In particular, under extra assumptions on the reward functions such as social concavity, we derive polynomial upper bounds on the number of iterates to achieve an [math]-NE policy with high probability. Finally, we evaluate the effectiveness of the proposed algorithms in learning [math]-NE policies using numerical experiments for energy management in smart grids.
Citations: 0
Spectral Factorization of Rank-Deficient Rational Densities
IF 2.2 | Zone 2 (Mathematics) | Q1 Mathematics | Pub Date: 2024-02-20 | DOI: 10.1137/23m1546622
Wenqi Cao, Anders Lindquist
SIAM Journal on Control and Optimization, Volume 62, Issue 1, Page 776-798, February 2024.
Abstract. Though there are hundreds of papers on rational spectral factorization, most of them are concerned with full-rank spectral densities. In this paper we propose a novel approach for spectral factorization of a rank-deficient spectral density, leading to a minimum-phase full-rank spectral factor, in both the discrete-time and continuous-time cases. Compared with several approaches to low-rank spectral factorization, our approach exploits a deterministic relation inside the factor, leading to high computational efficiency. In addition, we show that this method is easily used in identification of low-rank processes and in Wiener filtering.
Citations: 0
Leader-Following Rendezvous Control for Generalized Cucker-Smale Model on Riemannian Manifolds
IF 2.2 | Zone 2 (Mathematics) | Q1 Mathematics | Pub Date: 2024-02-16 | DOI: 10.1137/23m1545811
Xiaoyu Li, Yuhu Wu, Lining Ru
SIAM Journal on Control and Optimization, Volume 62, Issue 1, Page 724-751, February 2024.
Abstract. This paper studies a leader-following rendezvous problem for the generalized Cucker–Smale model, a double-integrator multiagent system, on some Riemannian manifolds. By using intrinsic properties of the covariant derivative, logarithmic map, and parallel transport on the Riemannian manifolds, we design a feedback control law and prove that this feedback control law enables all followers to track the trajectory of the moving leader when the Riemannian manifold is compact or flat. As concrete examples, we consider the leader-following rendezvous problem on the unit sphere, in Euclidean space, on the unit circle, and on the infinite cylinder and present the corresponding feedback control laws. Meanwhile, numerical examples are given for the aforementioned Riemannian manifolds to illustrate and verify the theoretical results.
Citations: 0
Turnpike Properties for Mean-Field Linear-Quadratic Optimal Control Problems
IF 2.2 | Zone 2 (Mathematics) | Q1 Mathematics | Pub Date: 2024-02-16 | DOI: 10.1137/22m1524187
Jingrui Sun, Jiongmin Yong
SIAM Journal on Control and Optimization, Volume 62, Issue 1, Page 752-775, February 2024.
Abstract. This paper is concerned with an optimal control problem for a mean-field linear stochastic differential equation with a quadratic functional in the infinite time horizon. Under suitable conditions, including the stabilizability, the (strong) exponential, integral, and mean-square turnpike properties for the optimal pair are established. The keys are to correctly formulate the corresponding static optimization problem and find the equations determining the correction processes. These have revealed the main feature of the stochastic problems which are significantly different from the deterministic version of the theory.
Citations: 0
On the Modeling of Impulse Control with Random Effects for Continuous Markov Processes
IF 2.2 | Zone 2 (Mathematics) | Q1 Mathematics | Pub Date: 2024-02-15 | DOI: 10.1137/19m1286967
Kurt L. Helmes, Richard H. Stockbridge, Chao Zhu
SIAM Journal on Control and Optimization, Volume 62, Issue 1, Page 699-723, February 2024.
Abstract. The use of coordinate processes for the modeling of impulse control for general Markov processes typically involves the construction of a probability measure on a countable product of copies of the path space. In addition, admissibility of an impulse control policy requires that the random times of the interventions be stopping times with respect to different filtrations arising from the different component coordinate processes. When the underlying Markov process has continuous paths, however, a simpler model can be developed which takes the single path space as its probability space and uses the natural filtration with respect to which the intervention times must be stopping times. Moreover, this model construction allows for impulse control with random effects whereby the decision maker selects a distribution of the new state. This paper gives the construction of the probability measure on the path space for an admissible intervention policy subject to a randomized impulse mechanism. In addition, a class of policies is defined for which the paths between interventions are independent and a further subclass for which the cycles following the initial cycle are identically distributed. A benefit of this smaller subclass of policies is that one is allowed to use classical renewal arguments to analyze long-term average control problems. Further, the paper defines a class of stationary impulse policies for which the family of models gives a Markov family. The decision to use an [math] ordering policy in inventory management provides an example of an impulse policy for which the process has independent and identically distributed cycles and the family of models forms a Markov family.
Citations: 0
Optimal Control Duality and the Douglas–Rachford Algorithm
IF 2.2 | Zone 2 (Mathematics) | Q1 Mathematics | Pub Date: 2024-02-12 | DOI: 10.1137/23m1558549
Regina S. Burachik, Bethany I. Caldwell, C. Yalçin Kaya, Walaa M. Moursi
SIAM Journal on Control and Optimization, Volume 62, Issue 1, Page 680-698, February 2024.
Abstract. We explore the relationship between the dual of a weighted minimum-energy control problem, a special case of linear-quadratic optimal control problems, and the Douglas–Rachford (DR) algorithm. We obtain an expression for the fixed point of the DR operator as applied to solving the optimal control problem, which in turn devises a certificate of optimality that can be employed for numerical verification. The fixed point and the optimality check are illustrated in two example optimal control problems.
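As a rough illustration of the Douglas–Rachford iteration named in this abstract, the sketch below applies it to a finite-dimensional stand-in: minimize a weighted energy 0.5 * sum(w_i * u_i^2) subject to a single linear constraint a·u = b. This toy problem, the function names, and the parameters `gamma` and `iters` are assumptions made for illustration; the paper itself works with continuous-time optimal control problems and its formulation may differ.

```python
def douglas_rachford_min_energy(weights, a, b, gamma=1.0, iters=500):
    """Douglas-Rachford sketch for: min 0.5*sum(w_i*u_i^2)  s.t.  sum(a_i*u_i) = b.

    f(u) = weighted energy, g(u) = indicator of the hyperplane {u : a.u = b}.
    Hypothetical helper for illustration only.
    """
    a_norm_sq = sum(ai * ai for ai in a)
    x = [0.0] * len(weights)  # governing sequence of the DR operator
    for _ in range(iters):
        # Proximal step on gamma*f: componentwise shrinkage x_i / (1 + gamma*w_i).
        y = [xi / (1.0 + gamma * wi) for xi, wi in zip(x, weights)]
        # Reflect x through y, then project the reflection onto the hyperplane.
        r = [2.0 * yi - xi for yi, xi in zip(y, x)]
        shift = (sum(ai * ri for ai, ri in zip(a, r)) - b) / a_norm_sq
        z = [ri - shift * ai for ri, ai in zip(r, a)]
        # DR update: x <- x + z - y.
        x = [xi + zi - yi for xi, zi, yi in zip(x, z, y)]
    # The "shadow" point prox(x) approximates the minimizer.
    return [xi / (1.0 + gamma * wi) for xi, wi in zip(x, weights)]


def closed_form_min_energy(weights, a, b):
    """KKT solution of the same toy problem: u_i = lam * a_i / w_i."""
    lam = b / sum(ai * ai / wi for ai, wi in zip(a, weights))
    return [lam * ai / wi for ai, wi in zip(a, weights)]
```

At a fixed point of the iteration the projected point `z` and the proximal point `y` coincide, which is what makes a numerical optimality check possible in this toy setting; comparing against `closed_form_min_energy` plays that role here.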
Citations: 0
Primal-Dual Regression Approach for Markov Decision Processes with General State and Action Spaces
IF 2.2 | Zone 2 (Mathematics) | Q1 Mathematics | Pub Date: 2024-02-12 | DOI: 10.1137/22m1526010
Denis Belomestny, John Schoenmakers
SIAM Journal on Control and Optimization, Volume 62, Issue 1, Page 650-679, February 2024.
Abstract. We develop a regression-based primal-dual martingale approach for solving discrete time, finite-horizon MDPs. The state and action spaces may be finite or infinite (but regular enough) subsets of Euclidean space. Consequently, our method allows for the construction of tight upper and lower-biased approximations of the value functions, providing precise estimates of the optimal policy. Importantly, we prove error bounds for the estimated duality gap featuring polynomial dependence on the time horizon. Additionally, we observe sublinear dependence of the stochastic part of the error on the cardinality/dimension of the state and action spaces. From a computational perspective, our proposed method is efficient. Unlike typical duality-based methods for optimal control problems in the literature, the Monte Carlo procedures involved here do not require nested simulations.
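For context, the problem class in this abstract (discrete-time, finite-horizon MDPs) can be solved exactly by backward induction when the state and action sets are small and finite. The sketch below shows that plain dynamic-programming baseline; it is not the regression-based primal-dual martingale method of the paper, and the function name and the nested-list data layout are assumptions made for illustration.

```python
def backward_induction(reward, transition, horizon):
    """Exact finite-horizon dynamic programming on a small finite MDP.

    reward[s][a]         : immediate reward for action a in state s
    transition[s][a][s2] : probability of moving from s to s2 under a
    Returns (values, policy) with values[t][s] and policy[t][s];
    the terminal value values[horizon][s] is taken to be 0.
    """
    n_states = len(reward)
    n_actions = len(reward[0])
    values = [[0.0] * n_states for _ in range(horizon + 1)]
    policy = [[0] * n_states for _ in range(horizon)]
    for t in range(horizon - 1, -1, -1):
        for s in range(n_states):
            best_action, best_q = 0, float("-inf")
            for a in range(n_actions):
                # Bellman backup: reward plus expected next-step value.
                q = reward[s][a] + sum(
                    transition[s][a][s2] * values[t + 1][s2]
                    for s2 in range(n_states)
                )
                if q > best_q:
                    best_action, best_q = a, q
            values[t][s] = best_q
            policy[t][s] = best_action
    return values, policy
```

This exact recursion scales with the number of states and actions, which is why approximation schemes such as the one described in the abstract are of interest for large or continuous spaces.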
Citations: 0