Pub Date: 2024-10-23; DOI: 10.1016/j.sysconle.2024.105948
Yu-Long Fan , Chuan-Ke Zhang , Yun-Fan Liu , Yong He , Qing-Guo Wang
This paper is concerned with the stability analysis of systems with time-varying delays via the Lyapunov–Krasovskii functional (LKF) method. Unlike most existing works, which focus primarily on conservatism reduction, this paper aims to establish stability criteria with less conservatism as well as low complexity, based on a relatively simple LKF with improved derivative treatments. For this purpose, a fragmented-component-based integral inequality is developed through matrix-separation and mixed estimation of the augmented integral term, which tightens the estimation gap and contributes to conservatism reduction; and a novel linearized transformation method is proposed by stripping-simplification and matrix-injection, which handles nonlinear delay-itself-related terms at a low complexity cost. Then, a novel stability criterion as well as several comparative criteria are obtained for linear time-delay systems. Finally, the superiority of the proposed methods is demonstrated via two benchmark examples and a load frequency control system.
Title: Stability analysis of systems with time-varying delays for conservatism and complexity reduction. Systems & Control Letters, Volume 193, Article 105948.
Pub Date: 2024-10-23; DOI: 10.1016/j.sysconle.2024.105943
Somnath Pradhan , Serdar Yüksel
For optimal control of diffusions under several criteria, due to computational or analytical reasons, many studies have a priori assumed control policies to be Lipschitz or smooth, often with no rigorous analysis on whether this restriction entails loss. While optimality of Markov/stationary Markov policies for expected finite horizon/infinite horizon (discounted/ergodic) cost and cost-up-to-exit time optimal control problems can be established under certain technical conditions, an optimal solution is typically only measurable in the state (and time, if the horizon is finite) with no a priori additional structural properties. In this paper, building on our recent work (Pradhan and Yüksel, 2024) establishing the regularity of optimal cost on the space of control policies under the Borkar control topology for a general class of controlled diffusions in R^d, we establish near optimality of smooth or Lipschitz continuous policies for optimal control under expected finite horizon, infinite horizon discounted, infinite horizon average, and up-to-exit time cost criteria. Under mild assumptions, we first show that smooth/Lipschitz continuous policies are dense in the space of Markov/stationary Markov policies under the Borkar topology. Then, utilizing the continuity of optimal costs as a function of policies on the space of Markov/stationary policies under the Borkar topology, we establish that optimal policies can be approximated by smooth/Lipschitz continuous policies with arbitrary precision. While our results are extensions of our recent work, the practical significance of an explicit statement and accessible presentation dedicated to Lipschitz and smooth policies, given their prominence in the literature, motivates our current paper.
Title: Near optimality of Lipschitz and smooth policies in controlled diffusions. Systems & Control Letters, Volume 193, Article 105943.
Pub Date: 2024-10-23; DOI: 10.1016/j.sysconle.2024.105951
Zi-Jie Wei , Kun-Zhi Liu , Yan-Wei Wang , Zhuo-Rui Pan , Si-Xin Wen , Xi-Ming Sun
This article focuses on data-driven analysis and controller design for networked control systems (NCSs) with network-induced delays. The study considers a linear time-invariant (LTI) system controlled through a periodic event-triggering mechanism. First, by leveraging data-based representations, we establish data-based stability conditions for NCSs with time-varying delays. Furthermore, we propose a data-based method for co-designing the controller and the periodic event-triggering scheme. In addition, we present novel data-based conditions for verifying dissipativity properties of NCSs. The effectiveness of our proposed methods is validated through a simulation and a turbofan engine hardware-in-the-loop (HIL) experiment.
Title: Periodic event-triggered data-driven control for networked control systems with time-varying delays. Systems & Control Letters, Volume 193, Article 105951.
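To make the idea of a periodic event-triggering mechanism concrete, the following is a minimal sketch and not the method of the article. It assumes a hypothetical scalar discrete-time plant x[k+1] = a*x[k] + b*u[k] with a known model, no network delay, and a relative-error trigger that is checked only at sampling instants. The names `simulate_petc`, `k_gain`, and `sigma` are illustrative assumptions.

```python
def simulate_petc(a=1.02, b=1.0, k_gain=0.12, sigma=0.3, x0=1.0, steps=200):
    """Simulate a scalar plant under periodic event-triggered state feedback.

    Assumed toy setting (not taken from the article): the controller only
    knows the last transmitted state x_sent and applies u = -k_gain * x_sent.
    """
    x = x0
    x_sent = x0          # the initial state is transmitted once at k = 0
    transmissions = 1
    history = [x]
    for _ in range(steps):
        # The trigger is evaluated only at sampling instants: transmit when
        # the held value deviates too much from the current state.
        if abs(x - x_sent) > sigma * abs(x):
            x_sent = x
            transmissions += 1
        u = -k_gain * x_sent
        x = a * x + b * u
        history.append(x)
    return history, transmissions
```

With these assumed defaults, the held-state error satisfies |x_sent - x| <= sigma*|x| after every check, so |x[k+1]| <= (|a - b*k_gain| + |b*k_gain|*sigma)*|x[k]| = 0.936*|x[k]|, and the state decays even though some samples are not transmitted. Calling `simulate_petc()` returns the state history and the number of transmissions used.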
Pub Date: 2024-10-21; DOI: 10.1016/j.sysconle.2024.105941
Jieming Ke, Yanlong Zhao, Ji-Feng Zhang
This paper investigates the joint identification problem of unknown system parameter and noise parameters in quantized systems when the noises involved are Gaussian with unknown variance and mean value. Under such noises, previous investigations show that the unknown system parameter and noise parameters are not jointly identifiable in the single-threshold quantizer case. The joint identifiability in the multi-threshold quantizer case still remains an open problem. This paper proves that the unknown system parameter, the noise variance and the mean value are jointly identifiable if and only if there are at least two thresholds. Then, a decomposition-recombination identification algorithm is proposed to jointly identify the unknown system parameter and noise parameters. Firstly, a technique is designed to convert the identification problem with unknown noise parameters into an extended parameter identification problem with standard Gaussian noises. Secondly, the extended parameter is identified by a stochastic approximation method for quantized systems. Regarding effectiveness, this paper establishes the strong consistency and the L^p convergence of the algorithm under non-persistently exciting inputs and without any a priori knowledge on the range of the unknown system parameter. The almost sure convergence rate is also obtained. Furthermore, when the mean value is known, the unknown system parameter and noise variance can be jointly identified under weaker conditions on the inputs and the quantizer. Finally, the effectiveness of the proposed algorithm is demonstrated by simulation.
Title: Joint identification of system parameter and noise parameters in quantized systems. Systems & Control Letters, Volume 193, Article 105941.
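The single-threshold versus two-threshold distinction in the abstract above can be illustrated with a small sketch. It assumes a hypothetical scalar observation model y_k = phi_k*theta + d_k with d_k ~ N(mu, sigma^2), where only the events {y_k <= C} are observed; this model and all function names are assumptions for illustration and are not taken from the paper. Under this model P(y_k <= C) = Phi((C - phi_k*theta - mu)/sigma), so with one threshold the rescaled triple (c*theta, C - c*(C - mu), c*sigma) yields the same probabilities for every c > 0, while two distinct thresholds pin down sigma, mu, and theta.

```python
from statistics import NormalDist

STD = NormalDist()  # standard Gaussian, used for Phi and its inverse


def prob_below(threshold, phi, theta, mu, sigma):
    """P(phi*theta + d <= threshold) for d ~ N(mu, sigma^2) (assumed model)."""
    return STD.cdf((threshold - phi * theta - mu) / sigma)


def rescaled(theta, mu, sigma, threshold, c):
    """A different parameter triple with identical single-threshold statistics."""
    return c * theta, threshold - c * (threshold - mu), c * sigma


def recover_from_two_thresholds(p, c1, c2):
    """Recover (theta, mu, sigma) from exact probabilities p[(threshold, phi)].

    Uses inputs phi = 0 and phi = 1 and thresholds c1 != c2.
    """
    a1 = STD.inv_cdf(p[(c1, 0.0)])   # (c1 - mu) / sigma
    a2 = STD.inv_cdf(p[(c2, 0.0)])   # (c2 - mu) / sigma
    sigma = (c1 - c2) / (a1 - a2)
    mu = c1 - sigma * a1
    b1 = STD.inv_cdf(p[(c1, 1.0)])   # (c1 - theta - mu) / sigma
    theta = sigma * (a1 - b1)
    return theta, mu, sigma
```

This sketch works with exact probabilities rather than data. In practice the probabilities would have to be estimated from quantized observations, which is the role of the identification algorithm described in the abstract.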
Pub Date: 2024-10-14; DOI: 10.1016/j.sysconle.2024.105934
E. Gershon , L.I. Allerhand , U. Shaked
Linear, state-delayed, discrete-time, stochastic, switched systems are considered, where the problems of stochastic l_2-gain and state-feedback control designs are treated and solved. We first develop a special version of a bounded real lemma for the said systems for the nominal case.
Based on this lemma, we derive state-feedback gains for nominal systems. In our solution method, each subsystem of the switched system is assigned a Lyapunov function that is non-increasing at the switching instants, and a dwell-time constraint is imposed on the system. The assigned Lyapunov function is allowed to vary piecewise linearly in time, starting at the end of the previous switch instant, and it becomes time-invariant after the dwell. Based on the solution of the state-feedback control for nominal systems and exploiting the fact that this solution is affine in the system matrices, a state-feedback control is derived for the polytopic case. We present a numerical example that demonstrates the solvability and tractability of our solution method.
Title: Robust control of time-delayed stochastic switched systems with dwell. Systems & Control Letters, Volume 193, Article 105934.
Pub Date: 2024-10-10; DOI: 10.1016/j.sysconle.2024.105932
Minh Vu , Yunshen Huang , Shen Zeng
While data-driven control has shown its potential for solving complex tasks, current algorithms such as reinforcement learning are still data-intensive and often limited to simulated environments. Model-based learning is a promising approach to reducing the amount of data required in practical implementations, yet it suffers from a critical issue known as model exploitation. In this paper, we present a sequential approach to model-based learning that avoids model exploitation and achieves stable system behaviors during learning with minimal exploration. The advocated control design utilizes estimates of the system’s local dynamics to improve the control step by step. During the process, when additional data is required, the program pauses the control synthesis to collect data in the surrounding area and updates the model accordingly. The local and sequential nature of this approach is the key component to regulating the system’s exploration in the state–action space and, at the same time, avoiding the issue of model exploitation, which are the main challenges in model-based learning control. Through simulated examples and physical experiments, we demonstrate that the proposed approach can quickly learn a desirable control from scratch, with just a small number of trials.
Title: Data-driven control of nonlinear systems: An online sequential approach. Systems & Control Letters, Volume 193, Article 105932.
Pub Date: 2024-10-04; DOI: 10.1016/j.sysconle.2024.105940
Giovanni Fusco , Monica Motta , Richard Vinter
For impulse control systems described by a measure driven differential equation, depending linearly on the measure, it is customary to interpret the state trajectory corresponding to an impulse control, specified by a measure, as the limit of state trajectories associated with some sequence of conventional controls approximating the measure. It is known that, when the measure is vector valued, it is possible that different choices of approximating sequences for the measure give rise to different limiting state trajectories. If the measure is scalar valued, however, there is a unique limiting trajectory. Now consider impulse control systems, in which the right side of the measure driven differential equation depends on both the current and delayed states. In recent work by the authors it has been shown that, for such impulse control systems with time delay, the state trajectory corresponding to a given measure may be non-unique, even when the measure is scalar valued. It was also shown that each limiting state trajectory can be identified with the unique state trajectory associated with some measure together with a family of ‘attached controls’. (The attached controls capture the nature of the measure approximation.) The authors also derived a maximum principle governing minimizers for a general class of impulse optimal control problems with time delay, in which the domain of the optimization problem comprises measures coupled with a family of ‘attached controls’. The purpose of this paper is both to illustrate, by means of an example, this newly discovered non-uniqueness phenomenon and to provide the first application of the new maximum principle, to investigate minimizers for scalar input impulse optimal control problems with time delay, in circumstances when limiting state trajectories associated with a given measure control are not unique. 
The example is an optimal control problem, for which the underlying control system is a forced harmonic oscillator, with scalar impulse control, in which the control gain is a nonlinear function of the current and delayed states.
Title: Optimal impulse control problems with time delays: An illustrative example. Systems & Control Letters, Volume 193, Article 105940.
Pub Date: 2024-10-04; DOI: 10.1016/j.sysconle.2024.105936
Hamed Jabbari Asl, Eiji Uchibe
In this study, we considered the problem of inverse reinforcement learning or estimating the cost function of expert players in multi-player differential games. We proposed two online data-driven solutions for linear–quadratic games that are applicable to systems that fulfill a specific dimension criterion or whose unknown matrices in the cost function conform to a diagonal condition. The first method, which is partially model-free, utilizes the trajectories of expert agents to solve the problem. The second method is entirely model-free and employs the trajectories of both expert and learner agents. We determined the conditions under which the solutions are applicable and identified the necessary requirements for the collected data. We conducted numerical simulations to establish the effectiveness of the proposed methods.
Title: Inverse reinforcement learning methods for linear differential games. Systems & Control Letters, Volume 193, Article 105936.
Pub Date: 2024-10-03; DOI: 10.1016/j.sysconle.2024.105937
Lucas F.M. Rodrigues , Gustavo H.C. Oliveira , Lucas P.R.K. Ihlenfeld , Ricardo Schumacher , Paul M.J. Van den Hof
We develop a novel frequency-domain approach to address the important open issue of estimating passive local modules within dynamic networks. The method applies an approach based on two stages, a non-parametric and a parametric one. The parametric stage is an extension of the vector fitting technique that incorporates energy consistency conditions as a fundamental component of the identification procedure, forming a path of the passive model in the Sanathanan–Koerner iterations. The approach includes a formulation via linear matrix inequalities to enforce energy-balance conditions resulting in a convex optimization problem. The approach is practical even under weak assumptions on noise, enabling real-world applications. Numerical simulations illustrate the potential of the developed method to effectively estimate local passive modules in dynamic networks.
Title: Frequency domain identification of passive local modules in linear dynamic networks. Systems & Control Letters, Volume 193, Article 105937.
Pub Date: 2024-10-03; DOI: 10.1016/j.sysconle.2024.105938
Jianing Yang , Liqi Zhou , Jian Liu , Jianxiang Xi , Yuanshi Zheng
This paper studies the min–max group consensus of discrete-time multi-agent systems under a directed random graph, where the presence of each directed edge is randomly determined by a probability and independent of the presence of other edges. Firstly, we propose a min–max consensus protocol without memory, and give the necessary and sufficient conditions to ensure that the multi-agent system can achieve the min–max group consensus in the almost sure and mean-square senses, respectively. Secondly, we design a novel consensus protocol with memory and a behavior mechanism. Using stochastic analysis theory and extremal algebra, some necessary and sufficient conditions are obtained for achieving the min–max group consensus in the almost sure and mean-square senses, respectively. It is shown that the protocol with memory can solve the loss problem of the maximum and minimum initial states. Finally, the effectiveness of the two group consensus protocols and the behavior mechanism is verified by four numerical simulations.
Title: Min–max group consensus of discrete-time multi-agent systems under directed random networks. Systems & Control Letters, Volume 193, Article 105938.
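As a rough illustration of what a min–max group consensus under a random directed graph can look like, here is a toy sketch. It is one possible reading of the setting and not the protocol of the paper: agents are split into a hypothetical "max" group and a "min" group, each directed edge appears independently with probability `edge_prob` at every step, and each agent combines its own state with whatever it receives using max or min. All names and parameters are assumptions.

```python
import random


def consensus_step(states, groups, edge_prob, rng):
    """One synchronous update over a freshly sampled random directed graph."""
    n = len(states)
    new_states = []
    for i in range(n):
        # Edge j -> i is present with probability edge_prob, independently.
        received = [states[j] for j in range(n)
                    if j != i and rng.random() < edge_prob]
        candidates = received + [states[i]]  # the agent also keeps its own state
        if groups[i] == "max":
            new_states.append(max(candidates))
        else:
            new_states.append(min(candidates))
    return new_states


def run_consensus(states, groups, edge_prob=0.5, steps=200, seed=0):
    """Iterate the toy protocol with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    states = list(states)
    for _ in range(steps):
        states = consensus_step(states, groups, edge_prob, rng)
    return states
```

In this toy version the largest initial state is assumed to sit in the "max" group and the smallest in the "min" group, so each extreme value is retained by its holder and spreads through its group whenever a suitable edge appears. For example, `run_consensus([5.0, 1.0, 3.0, -2.0, 4.0, 0.0], ["max", "max", "max", "min", "min", "min"])` drives the first three agents toward the maximum and the last three toward the minimum.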