
Systems & Control Letters: Latest Articles

Stability analysis of systems with time-varying delays for conservatism and complexity reduction
IF 2.1, Zone 3 (Computer Science), Q3 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2024-10-23, DOI: 10.1016/j.sysconle.2024.105948
Yu-Long Fan, Chuan-Ke Zhang, Yun-Fan Liu, Yong He, Qing-Guo Wang
This paper is concerned with the stability analysis of systems with time-varying delays via the Lyapunov–Krasovskii functional (LKF) method. Unlike most existing works, which focus primarily on conservatism reduction, this paper aims to establish stability criteria with less conservatism as well as low complexity, based on a relatively simple LKF with improved derivative treatments. For this purpose, a fragmented-component-based integral inequality is developed through matrix-separation and mixed estimation of the augmented integral term, which tightens the estimation gap and contributes to conservatism reduction; and a novel linearized transformation method is proposed by stripping-simplification and matrix-injection, which handles nonlinear delay-itself-related terms at a low complexity cost. Then, a novel stability criterion as well as several comparative criteria are obtained for linear time-delay systems. Finally, the superiority of the proposed methods is demonstrated via two benchmark examples and a load frequency control system.
Systems & Control Letters, vol. 193, Article 105948.
Citations: 0
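For orientation on the LKF method named in the abstract on time-varying delays (Fan et al.), the block below writes out a textbook setting. It is only a sketch under assumed standard notation: the symbols A, A_d, h, P, Q and R are not taken from the paper, and neither the functional nor the bound shown is the paper's fragmented-component-based inequality.

```latex
% Linear system with a bounded time-varying delay d(t) (assumed notation)
\dot{x}(t) = A x(t) + A_d\, x(t - d(t)), \qquad 0 \le d(t) \le h

% A simple Lyapunov--Krasovskii functional with P, Q, R \succ 0
V(x_t) = x^\top(t) P x(t)
       + \int_{t-h}^{t} x^\top(s) Q x(s)\, ds
       + h \int_{-h}^{0} \int_{t+\theta}^{t} \dot{x}^\top(s) R\, \dot{x}(s)\, ds\, d\theta

% Differentiating the double integral leaves the term
% -h \int_{t-h}^{t} \dot{x}^\top(s) R \dot{x}(s) ds, which Jensen's inequality bounds by
-h \int_{t-h}^{t} \dot{x}^\top(s) R\, \dot{x}(s)\, ds
  \le -\bigl[x(t) - x(t-h)\bigr]^\top R \bigl[x(t) - x(t-h)\bigr]
```

The gap between the two sides of the last line is the kind of estimation gap that tighter integral inequalities aim to reduce, at the price of more decision variables in the resulting matrix inequalities.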
Near optimality of Lipschitz and smooth policies in controlled diffusions
IF 2.1, Zone 3 (Computer Science), Q3 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2024-10-23, DOI: 10.1016/j.sysconle.2024.105943
Somnath Pradhan, Serdar Yüksel
For optimal control of diffusions under several criteria, due to computational or analytical reasons, many studies have a priori assumed control policies to be Lipschitz or smooth, often with no rigorous analysis on whether this restriction entails loss. While optimality of Markov/stationary Markov policies for expected finite horizon/infinite horizon (discounted/ergodic) cost and cost-up-to-exit time optimal control problems can be established under certain technical conditions, an optimal solution is typically only measurable in the state (and time, if the horizon is finite) with no a priori additional structural properties. In this paper, building on our recent work (Pradhan and Yüksel, 2024) establishing the regularity of optimal cost on the space of control policies under the Borkar control topology for a general class of controlled diffusions in R^d, we establish near optimality of smooth or Lipschitz continuous policies for optimal control under expected finite horizon, infinite horizon discounted, infinite horizon average, and up-to-exit time cost criteria. Under mild assumptions, we first show that smooth/Lipschitz continuous policies are dense in the space of Markov/stationary Markov policies under the Borkar topology. Then utilizing the continuity of optimal costs as a function of policies on the space of Markov/stationary policies under the Borkar topology, we establish that optimal policies can be approximated by smooth/Lipschitz continuous policies with arbitrary precision. While our results are extensions of our recent work, the practical significance of an explicit statement and accessible presentation dedicated to Lipschitz and smooth policies, given their prominence in the literature, motivates our current paper.
Systems & Control Letters, vol. 193, Article 105943.
Citations: 0
Periodic event-triggered data-driven control for networked control systems with time-varying delays
IF 2.1, Zone 3 (Computer Science), Q3 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2024-10-23, DOI: 10.1016/j.sysconle.2024.105951
Zi-Jie Wei, Kun-Zhi Liu, Yan-Wei Wang, Zhuo-Rui Pan, Si-Xin Wen, Xi-Ming Sun
This article focuses on data-driven analysis and controller design for networked control systems (NCSs) with network-induced delays. The study considers a linear time-invariant (LTI) system controlled through a periodic event-triggering mechanism. First, by leveraging data-based representations, we establish data-based stability conditions for NCSs with time-varying delays. Furthermore, we propose a data-based method for co-designing the controller and the periodic event-triggering scheme. In addition, we present novel data-based conditions for verifying dissipativity properties of NCSs. The effectiveness of our proposed methods is validated through a simulation and a turbofan engine hardware-in-the-loop (HIL) experiment.
Systems & Control Letters, vol. 193, Article 105951.
Citations: 0
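To make the periodic event-triggering mechanism in the abstract by Wei et al. concrete, here is a minimal sketch of a generic relative-threshold trigger checked at every sampling instant. Everything in it is an assumption for illustration: the scalar plant, the gain, the threshold `sigma`, and the function name `simulate_petc` are hypothetical, the model is taken as known, and no network delay is simulated, so this is not the paper's data-driven design.

```python
def simulate_petc(a, b, gain, sigma, x0, steps):
    """Simulate x[k+1] = a*x[k] + b*u[k] with u[k] = -gain * x_hat.

    x_hat is the last transmitted state. At every sampling instant the
    sensor transmits only if |x - x_hat| > sigma * |x| (assumed trigger rule).
    Returns the final state and the number of transmissions.
    """
    x = x0
    x_hat = x0          # the initial state is sent once at k = 0
    transmissions = 1
    for _ in range(steps):
        # Periodic check: the trigger is evaluated only at sampling instants.
        if abs(x - x_hat) > sigma * abs(x):
            x_hat = x
            transmissions += 1
        u = -gain * x_hat       # controller uses the held sample
        x = a * x + b * u       # plant update over one sampling period
    return x, transmissions


# Hypothetical numbers: a mildly unstable plant with a small stabilizing gain.
final_state, sent = simulate_petc(a=1.02, b=1.0, gain=0.07, sigma=0.2, x0=1.0, steps=200)
print(f"final |x| = {abs(final_state):.2e}, transmissions = {sent} of 200 samples")
```

With these numbers the held-sample error is at most 20% of the current state, so each step contracts the state by a factor of at most 0.95 + 0.07 * 0.2 = 0.964 while some samples are never transmitted.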
Joint identification of system parameter and noise parameters in quantized systems
IF 2.1, Zone 3 (Computer Science), Q3 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2024-10-21, DOI: 10.1016/j.sysconle.2024.105941
Jieming Ke, Yanlong Zhao, Ji-Feng Zhang
This paper investigates the joint identification problem of unknown system parameter and noise parameters in quantized systems when the noises involved are Gaussian with unknown variance and mean value. Under such noises, previous investigations show that the unknown system parameter and noise parameters are not jointly identifiable in the single-threshold quantizer case. The joint identifiability in the multi-threshold quantizer case still remains an open problem. This paper proves that the unknown system parameter, the noise variance and the mean value are jointly identifiable if and only if there are at least two thresholds. Then, a decomposition-recombination identification algorithm is proposed to jointly identify the unknown system parameter and noise parameters. Firstly, a technique is designed to convert the identification problem with unknown noise parameters into an extended parameter identification problem with standard Gaussian noises. Secondly, the extended parameter is identified by a stochastic approximation method for quantized systems. To establish effectiveness, this paper obtains the strong consistency and the L^p convergence of the algorithm under non-persistently exciting inputs and without any a priori knowledge on the range of the unknown system parameter. The almost sure convergence rate is also obtained. Furthermore, when the mean value is known, the unknown system parameter and noise variance can be jointly identified under weaker conditions on the inputs and the quantizer. Finally, the effectiveness of the proposed algorithm is demonstrated by simulation.
Systems & Control Letters, vol. 193, Article 105941.
Citations: 0
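The role of the second threshold in the abstract by Ke, Zhao and Zhang can be illustrated with a static toy case. The sketch below assumes a Gaussian signal with unknown mean `m` and standard deviation `sigma` observed only through threshold crossings, and assumes the crossing probabilities are known exactly. The helper name `recover_mean_and_std` and all numbers are hypothetical, and this is not the paper's decomposition-recombination algorithm (in particular, with a constant input only the combined mean is recovered here).

```python
from statistics import NormalDist


def recover_mean_and_std(c1, c2, p1, p2):
    """Recover (m, sigma) from P(y <= c1) = p1 and P(y <= c2) = p2.

    Each threshold gives one equation (c_i - m) / sigma = z_i, where z_i is
    the standard normal quantile of p_i; two thresholds give two equations.
    """
    std_normal = NormalDist()
    z1 = std_normal.inv_cdf(p1)
    z2 = std_normal.inv_cdf(p2)
    sigma = (c2 - c1) / (z2 - z1)
    m = c1 - sigma * z1
    return m, sigma


# One threshold: two different (mean, std) pairs give the same crossing probability.
p_a = NormalDist(mu=1.5, sigma=2.0).cdf(0.0)
p_b = NormalDist(mu=3.0, sigma=4.0).cdf(0.0)
print("single threshold cannot separate them:", abs(p_a - p_b) < 1e-12)

# Two thresholds: the pair is pinned down.
truth = NormalDist(mu=1.5, sigma=2.0)
m_hat, sigma_hat = recover_mean_and_std(0.0, 3.0, truth.cdf(0.0), truth.cdf(3.0))
print("recovered mean and std:", m_hat, sigma_hat)
```

In practice the probabilities would be estimated from quantized data rather than known, which is where a recursive scheme such as stochastic approximation comes in.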
Robust control of time-delayed stochastic switched systems with dwell
IF 2.1, Zone 3 (Computer Science), Q3 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2024-10-14, DOI: 10.1016/j.sysconle.2024.105934
E. Gershon, L.I. Allerhand, U. Shaked
Linear, state-delayed, discrete-time, stochastic, switched systems are considered, where the problems of stochastic l_2-gain and state-feedback control designs are treated and solved. We first develop a special version of a bounded real lemma for the said systems for the nominal case.
Based on this lemma, we derive state-feedback gains for nominal systems. In our solution method, each subsystem of the switched system is assigned a Lyapunov function that is non-increasing at the switching instants, and a dwell-time constraint is imposed on the system. The assigned Lyapunov function is allowed to vary piecewise linearly in time, starting at the end of the previous switch instant, and it becomes time-invariant after the dwell. Based on the solution of the state-feedback control for nominal systems and exploiting the fact that this solution is affine in the system matrices, a state-feedback control is derived for the polytopic case. We present a numerical example that demonstrates the solvability and tractability of our solution method.
Systems & Control Letters, vol. 193, Article 105934.
Citations: 0
Data-driven control of nonlinear systems: An online sequential approach
IF 2.1, Zone 3 (Computer Science), Q3 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2024-10-10, DOI: 10.1016/j.sysconle.2024.105932
Minh Vu, Yunshen Huang, Shen Zeng
While data-driven control has shown its potential for solving complex tasks, current algorithms such as reinforcement learning are still data-intensive and often limited to simulated environments. Model-based learning is a promising approach to reducing the amount of data required in practical implementations, yet it suffers from a critical issue known as model exploitation. In this paper, we present a sequential approach to model-based learning that avoids model exploitation and achieves stable system behaviors during learning with minimal exploration. The advocated control design utilizes estimates of the system’s local dynamics to step-by-step improve the control. During the process, when additional data is required, the program pauses the control synthesis to collect data in the surrounding area and updates the model accordingly. The local and sequential nature of this approach is the key component to regulating the system’s exploration in the state–action space and, at the same time, avoiding the issue of model exploitation, which are the main challenges in model-based learning control. Through simulated examples and physical experiments, we demonstrate that the proposed approach can quickly learn a desirable control from scratch, with just a small number of trials.
Systems & Control Letters, vol. 193, Article 105932.
Citations: 0
Optimal impulse control problems with time delays: An illustrative example
IF 2.1, Zone 3 (Computer Science), Q3 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2024-10-04, DOI: 10.1016/j.sysconle.2024.105940
Giovanni Fusco, Monica Motta, Richard Vinter
For impulse control systems described by a measure driven differential equation, depending linearly on the measure, it is customary to interpret the state trajectory corresponding to an impulse control, specified by a measure, as the limit of state trajectories associated with some sequence of conventional controls approximating the measure. It is known that, when the measure is vector valued, it is possible that different choices of approximating sequences for the measure give rise to different limiting state trajectories. If the measure is scalar valued, however, there is a unique limiting trajectory. Now consider impulse control systems, in which the right side of the measure driven differential equation depends on both the current and delayed states. In recent work by the authors it has been shown that, for such impulse control systems with time delay, the state trajectory corresponding to a given measure may be non-unique, even when the measure is scalar valued. It was also shown that each limiting state trajectory can be identified with the unique state trajectory associated with some measure together with a family of ‘attached controls’. (The attached controls capture the nature of the measure approximation.) The authors also derived a maximum principle governing minimizers for a general class of impulse optimal control problems with time delay, in which the domain of the optimization problem comprises measures coupled with a family of ‘attached controls’. The purpose of this paper is both to illustrate, by means of an example, this newly discovered non-uniqueness phenomenon and to provide the first application of the new maximum principle, to investigate minimizers for scalar input impulse optimal control problems with time delay, in circumstances when limiting state trajectories associated with a given measure control are not unique. 
The example is an optimal control problem, for which the underlying control system is a forced harmonic oscillator, with scalar impulse control, in which the control gain is a nonlinear function of the current and delayed states.
Systems & Control Letters, vol. 193, Article 105940.
Citations: 0
Inverse reinforcement learning methods for linear differential games
IF 2.1, Zone 3 (Computer Science), Q3 AUTOMATION & CONTROL SYSTEMS, Pub Date: 2024-10-04, DOI: 10.1016/j.sysconle.2024.105936
Hamed Jabbari Asl, Eiji Uchibe
In this study, we considered the problem of inverse reinforcement learning or estimating the cost function of expert players in multi-player differential games. We proposed two online data-driven solutions for linear–quadratic games that are applicable to systems that fulfill a specific dimension criterion or whose unknown matrices in the cost function conform to a diagonal condition. The first method, which is partially model-free, utilizes the trajectories of expert agents to solve the problem. The second method is entirely model-free and employs the trajectories of both expert and learner agents. We determined the conditions under which the solutions are applicable and identified the necessary requirements for the collected data. We conducted numerical simulations to establish the effectiveness of the proposed methods.
在本研究中,我们考虑了多人差分博弈中的反强化学习或专家玩家成本函数估计问题。我们针对线性-二次方博弈提出了两种在线数据驱动解决方案,适用于满足特定维度标准或成本函数中的未知矩阵符合对角线条件的系统。第一种方法部分不需要模型,利用专家代理的轨迹来解决问题。第二种方法完全不需要模型,同时利用专家和学习者的轨迹。我们确定了解决方案的适用条件,并确定了对所收集数据的必要要求。我们进行了数值模拟,以确定所提方法的有效性。
{"title":"Inverse reinforcement learning methods for linear differential games","authors":"Hamed Jabbari Asl,&nbsp;Eiji Uchibe","doi":"10.1016/j.sysconle.2024.105936","DOIUrl":"10.1016/j.sysconle.2024.105936","url":null,"abstract":"<div><div>In this study, we considered the problem of inverse reinforcement learning or estimating the cost function of expert players in multi-player differential games. We proposed two online data-driven solutions for linear–quadratic games that are applicable to systems that fulfill a specific dimension criterion or whose unknown matrices in the cost function conform to a diagonal condition. The first method, which is partially model-free, utilizes the trajectories of expert agents to solve the problem. The second method is entirely model-free and employs the trajectories of both expert and learner agents. We determined the conditions under which the solutions are applicable and identified the necessary requirements for the collected data. We conducted numerical simulations to establish the effectiveness of the proposed methods.</div></div>","PeriodicalId":49450,"journal":{"name":"Systems & Control Letters","volume":"193 ","pages":"Article 105936"},"PeriodicalIF":2.1,"publicationDate":"2024-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142426249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Frequency domain identification of passive local modules in linear dynamic networks
IF 2.1 Zone 3 Computer Science Q3 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-10-03 DOI: 10.1016/j.sysconle.2024.105937
Lucas F.M. Rodrigues, Gustavo H.C. Oliveira, Lucas P.R.K. Ihlenfeld, Ricardo Schumacher, Paul M.J. Van den Hof
We develop a novel frequency-domain approach to the important open problem of estimating passive local modules within dynamic networks. The method has two stages, one non-parametric and one parametric. The parametric stage extends the vector fitting technique by incorporating energy consistency conditions as a fundamental component of the identification procedure, forming a path of the passive model in the Sanathanan–Koerner iterations. The approach includes a formulation via linear matrix inequalities that enforces energy-balance conditions, resulting in a convex optimization problem. The approach remains practical even under weak assumptions on the noise, which enables real-world applications. Numerical simulations illustrate the potential of the developed method to effectively estimate local passive modules in dynamic networks.
Citations: 0
Min–max group consensus of discrete-time multi-agent systems under directed random networks
IF 2.1 Zone 3 Computer Science Q3 AUTOMATION & CONTROL SYSTEMS Pub Date: 2024-10-03 DOI: 10.1016/j.sysconle.2024.105938
Jianing Yang, Liqi Zhou, Jian Liu, Jianxiang Xi, Yuanshi Zheng
This paper studies the min–max group consensus of discrete-time multi-agent systems under a directed random graph, where each directed edge is present with a given probability, independently of the other edges. Firstly, we propose a min–max consensus protocol without memory, and give necessary and sufficient conditions ensuring that the multi-agent system achieves min–max group consensus in the almost sure and mean square senses, respectively. Secondly, we design a novel consensus protocol with memory and a behavior mechanism. Using stochastic analysis theory and extremal algebra, some necessary and sufficient conditions are obtained for achieving min–max group consensus in the almost sure and mean square senses, respectively. It is shown that the protocol with memory solves the problem of losing the maximum and minimum initial states. Finally, the effectiveness of the two group consensus protocols and the behavior mechanism is verified by four numerical simulations.
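To make the setting in this abstract concrete, the following is a minimal sketch of a generic max (or min) consensus iteration on a directed random graph in which every edge appears independently with the same probability at each step. It is not the protocol from the paper: the function name `simulate_extremum_consensus`, the parameter `edge_prob`, the single-group setup, and the choice to let each agent keep its own state in the comparison are all assumptions made for illustration.

```python
import random


def simulate_extremum_consensus(x0, edge_prob, steps, pick=max, seed=0):
    """Iterate x_i <- pick(x_i, x_j for every in-neighbour j active this step).

    Hypothetical illustration only: each directed edge j -> i is redrawn at
    every step and is present independently with probability edge_prob.
    Use pick=max for max consensus and pick=min for min consensus.
    """
    rng = random.Random(seed)  # local generator, so runs are reproducible
    n = len(x0)
    x = list(x0)
    for _ in range(steps):
        new_x = list(x)  # synchronous update: read from x, write to new_x
        for i in range(n):
            for j in range(n):
                if i != j and rng.random() < edge_prob:
                    new_x[i] = pick(new_x[i], x[j])
        x = new_x
    return x


if __name__ == "__main__":
    initial = [3.0, -1.0, 7.5, 0.2]
    print(simulate_extremum_consensus(initial, edge_prob=0.3, steps=50))
    print(simulate_extremum_consensus(initial, edge_prob=0.3, steps=50, pick=min))
```

Because every agent compares against its own current value in this sketch, the largest (or smallest) initial state can never disappear from the network. A variant that does not retain the agent's own state would not have this guarantee.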
Citations: 0