IEEE Open Journal of Control Systems: Latest Articles

Distributed Anytime-Feasible Resource Allocation Subject to Heterogeneous Time-Varying Delays
Pub Date : 2022-09-28 DOI: 10.1109/OJCSYS.2022.3210453
Mohammadreza Doostmohammadian;Alireza Aghasi;Apostolos I. Rikos;Andreas Grammenos;Evangelia Kalyvianaki;Christoforos N. Hadjicostis;Karl H. Johansson;Themistoklis Charalambous
This paper considers distributed allocation strategies, formulated as a distributed sum-preserving (fixed-sum) allocation of resources over a multi-agent network in the presence of heterogeneous arbitrary time-varying delays. We propose a double time-scale scenario for unknown delays and a faster single time-scale scenario for known delays. Further, the links among the nodes are considered subject to certain nonlinearities (e.g., quantization and saturation/clipping). We discuss different models for nonlinearities and how they may affect the convergence, sum-preserving feasibility constraint, and solution optimality over general weight-balanced uniformly strongly connected networks and, further, time-delayed undirected networks. Our proposed scheme works in a variety of applications with general non-quadratic strongly convex smooth objective functions. The non-quadratic part, for example, can be due to additive convex penalty or barrier functions to address the local box constraints. The network can change over time and need not be connected at all times; it is only assumed to be uniformly connected. The novelty of this work is to address all-time feasible Laplacian gradient solutions in the presence of nonlinearities, switching digraph topology (not necessarily all-time connected), and heterogeneous time-varying delays.
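The sum-preserving property of a Laplacian-gradient update can be illustrated with a minimal sketch. It assumes a fixed undirected ring of four agents, no delays or link nonlinearities, and quadratic local costs f_i(x) = a_i x^2 / 2; the names `laplacian_gradient_step`, `neighbors`, and `step` are illustrative and are not taken from the paper.

```python
def laplacian_gradient_step(x, grads, neighbors, step):
    """One synchronous update: each agent moves against the gradient gap to its neighbors."""
    g = [grad(xi) for grad, xi in zip(grads, x)]
    return [
        xi - step * sum(g[i] - g[j] for j in neighbors[i])
        for i, xi in enumerate(x)
    ]

# Hypothetical setup: 4 agents on an undirected ring with unit link weights.
a = [1.0, 2.0, 3.0, 4.0]
grads = [lambda xi, ai=ai: ai * xi for ai in a]  # gradient of a_i * x^2 / 2
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}

x = [10.0, 0.0, 0.0, 0.0]   # initial allocation; the total resource is 10
totals = [sum(x)]           # record the total after every iteration
for _ in range(500):
    x = laplacian_gradient_step(x, grads, neighbors, step=0.05)
    totals.append(sum(x))

final_grads = [grad(xi) for grad, xi in zip(grads, x)]
print(x, final_grads)
```

Each link contributes equal and opposite terms to its two endpoints, so the total stays at its initial value at every iteration while the local gradients are driven to a common value. Delays, quantization, and switching topologies, which are the subject of the paper, are outside this sketch.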
IEEE Open Journal of Control Systems, vol. 1, pp. 255-267. PDF: https://ieeexplore.ieee.org/iel7/9552933/9683993/09904851.pdf
Citations: 6
Reinforcement Learning With Safety and Stability Guarantees During Exploration For Linear Systems
Pub Date : 2022-09-28 DOI: 10.1109/OJCSYS.2022.3209945
Zahra Marvi;Bahare Kiumarsi
The satisfaction of the safety and stability properties of reinforcement learning (RL) algorithms has been a long-standing challenge. These properties must be satisfied even during learning, for which exploration is required to collect rich data. However, satisfying the safety of actions when little is known about the system dynamics is a daunting challenge. After all, predicting the consequence of RL actions requires knowing the system dynamics. This paper presents a novel RL scheme that ensures the safety and stability of linear systems during the exploration and exploitation phases. To do so, a fast and data-efficient model-learning scheme with a convergence guarantee is employed simultaneously with an off-policy RL scheme to find the optimal controller. An accurate bound on the model-learning error is derived, and its characteristics are employed in the formation of a novel adaptive robustified control barrier function (ARCBF), which guarantees that the states of the system remain in the safe set even when the learning is incomplete. Therefore, after satisfaction of a mild rank condition, the noisy input in the exploratory data collection phase and the optimal controller in the exploitation phase are minimally altered such that the ARCBF criterion is satisfied and, therefore, safety is guaranteed in both phases. It is shown that under the proposed RL framework, the model learning error is a vanishing perturbation to the original system. Therefore, a stability guarantee is also provided even during exploration, when noisy random inputs are applied to the system.
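The idea of minimally altering an exploratory input so that a barrier condition holds can be sketched for a scalar system. This is a generic discrete-time control barrier function filter under the assumption of a known model x_next = a * x + b * u with b > 0; it is not the paper's ARCBF, which additionally accounts for the model-learning error. All names and numbers below are hypothetical.

```python
import random

def safe_input(x, u_nom, a, b, x_max, gamma):
    """Minimally alter u_nom so that h(x_next) >= (1 - gamma) * h(x), with h(x) = x_max - x.

    Assumes the scalar model x_next = a * x + b * u with b > 0 and 0 < gamma <= 1.
    """
    # Largest input that still satisfies the barrier condition at state x.
    u_bound = (gamma * x_max + (1.0 - gamma - a) * x) / b
    return min(u_nom, u_bound)

a, b = 0.9, 0.5          # hypothetical scalar linear system
x_max, gamma = 1.0, 0.3  # safe set {x <= x_max} and barrier decay rate
rng = random.Random(0)   # fixed seed so the run is repeatable

x = 0.0
trajectory = [x]
for _ in range(200):
    u_explore = rng.uniform(-2.0, 2.0)  # noisy exploratory input
    u = safe_input(x, u_explore, a, b, x_max, gamma)
    x = a * x + b * u
    trajectory.append(x)

print(max(trajectory))
```

Inputs that already satisfy the condition pass through unchanged; only those that would push the state toward the boundary too quickly are clipped, so the state never leaves the safe set in this sketch.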
IEEE Open Journal of Control Systems, vol. 1, pp. 322-334. PDF: https://ieeexplore.ieee.org/iel7/9552933/9683993/09904857.pdf
Citations: 1
Real Time Li-Ion Battery Bank Parameters Estimation via Universal Adaptive Stabilization
Pub Date : 2022-09-15 DOI: 10.1109/OJCSYS.2022.3206710
Shayok Mukhopadhyay;Hafiz M. Usman;Habibur Rehman
This paper proposes an accurate and efficient Universal Adaptive Stabilizer (UAS) based online parameter estimation technique for a 400 V Li-ion battery bank. The battery open circuit voltage, the parameters modeling the transient response, and the series resistance are all estimated in a single real-time test. In contrast to earlier UAS based work on individual battery packs, this work does not require prior offline experimentation or any post-processing. Fast real-time convergence of the parameter estimates with minimal experimental effort enables the battery parameters to be updated during run-time. The proposed strategy is mathematically validated and its performance is demonstrated on a 400 V, 6.6 Ah Li-ion battery bank powering an induction motor driven prototype electric vehicle (EV) traction system.
IEEE Open Journal of Control Systems, vol. 1, pp. 268-293. PDF: https://ieeexplore.ieee.org/iel7/9552933/9683993/09893763.pdf
Citations: 5
Differentially Private Algorithms for Statistical Verification of Cyber-Physical Systems
Pub Date : 2022-09-15 DOI: 10.1109/OJCSYS.2022.3207108
Yu Wang;Hussein Sibai;Mark Yen;Sayan Mitra;Geir E. Dullerud
Statistical model checking is a class of sequential algorithms that can verify specifications of interest on an ensemble of cyber-physical systems (e.g., whether 99% of cars from a batch meet a requirement on their functionality). These algorithms infer the probability that given specifications are satisfied by the systems with provable statistical guarantees by drawing sufficient numbers of independent and identically distributed samples. During the process of statistical model checking, the values of the samples (e.g., a user's car trajectory) may be inferred by intruders, causing privacy concerns in consumer-level applications (e.g., automobiles and medical devices). This paper addresses the privacy of statistical model checking algorithms from the point of view of differential privacy. These algorithms are sequential, drawing samples until a condition on their values is met. We show that revealing the number of samples drawn can violate privacy. We also show that the standard exponential mechanism that randomizes the output of an algorithm to achieve differential privacy fails to do so in the context of sequential algorithms. Instead, we relax the conservative requirement in differential privacy that the sensitivity of the output of the algorithm should be bounded to any perturbation for any data set. We propose a new notion of differential privacy which we call expected differential privacy (EDP). Then, we propose a novel expected sensitivity analysis for the sequential algorithm and propose a corresponding exponential mechanism that randomizes the termination time to achieve the EDP. We apply the proposed exponential mechanism to statistical model checking algorithms to preserve the privacy of the samples they draw. The utility of the proposed algorithm is demonstrated in a case study.
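To make the sequential nature of such algorithms concrete, here is a minimal sketch of Wald's sequential probability ratio test for Bernoulli samples, one standard sequential test. It is an illustration only, not the specific algorithm analyzed in the paper; the function name `sprt` and all parameter values are hypothetical.

```python
import math

def sprt(samples, p0, p1, alpha, beta):
    """Wald's sequential probability ratio test for Bernoulli samples.

    Tests H0: p = p0 against H1: p = p1 (with p1 < p0) and returns
    (decision, number_of_samples_used); decision is None if the data run out.
    """
    accept_h1 = math.log((1.0 - beta) / alpha)
    accept_h0 = math.log(beta / (1.0 - alpha))
    llr = 0.0  # running log-likelihood ratio of H1 against H0
    n = 0
    for n, ok in enumerate(samples, start=1):
        llr += math.log(p1 / p0) if ok else math.log((1.0 - p1) / (1.0 - p0))
        if llr >= accept_h1:
            return "H1", n
        if llr <= accept_h0:
            return "H0", n
    return None, n

# Two hypothetical data sets that differ in a single sample.
clean = [True] * 100
one_failure = [False] + [True] * 99

decision_a, n_a = sprt(clean, p0=0.9, p1=0.8, alpha=0.05, beta=0.05)
decision_b, n_b = sprt(one_failure, p0=0.9, p1=0.8, alpha=0.05, beta=0.05)
print(decision_a, n_a, decision_b, n_b)
```

Both runs reach the same decision, but the run containing one failing sample needs more samples before it stops. The stopping time therefore depends on the individual sample values, which is why revealing it can leak information about them.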
IEEE Open Journal of Control Systems, vol. 1, pp. 294-305. PDF: https://ieeexplore.ieee.org/iel7/9552933/9683993/09893303.pdf
Citations: 3
Online Optimization of Dynamical Systems With Deep Learning Perception
Pub Date : 2022-09-14 DOI: 10.1109/OJCSYS.2022.3205871
Liliaokeawawa Cothren;Gianluca Bianchin;Emiliano Dall'Anese
This paper considers the problem of controlling a dynamical system when the state cannot be directly measured and the control performance metrics are unknown or only partially known. In particular, we focus on the design of data-driven controllers to regulate a dynamical system to the solution of a constrained convex optimization problem where: i) the state must be estimated from nonlinear and possibly high-dimensional data; and ii) the cost of the optimization problem – which models control objectives associated with inputs and states of the system – is not available and must be learned from data. We propose a data-driven feedback controller that is based on adaptations of a projected gradient-flow method; the controller includes neural networks as integral components for the estimation of the unknown functions. Leveraging stability theory for perturbed systems, we derive sufficient conditions to guarantee exponential input-to-state stability (ISS) of the control loop. In particular, we show that the interconnected system is ISS with respect to the approximation errors of the neural network and unknown disturbances affecting the system. The transient bounds combine the universal approximation property of deep neural networks with the ISS characterization. Illustrative numerical results are presented in the context of robotics and control of epidemics.
IEEE Open Journal of Control Systems, vol. 1, pp. 306-321. PDF: https://ieeexplore.ieee.org/iel7/9552933/9683993/09891838.pdf
Citations: 5
Neural Network Optimal Feedback Control With Guaranteed Local Stability
Pub Date : 2022-09-12 DOI: 10.1109/OJCSYS.2022.3205863
Tenavi Nakamura-Zimmerer;Qi Gong;Wei Kang
Recent research shows that supervised learning can be an effective tool for designing near-optimal feedback controllers for high-dimensional nonlinear dynamic systems. But the behavior of neural network controllers is still not well understood. In particular, some neural networks with high test accuracy can fail to even locally stabilize the dynamic system. To address this challenge we propose several novel neural network architectures, which we show guarantee local asymptotic stability while retaining the approximation capacity to learn the optimal feedback policy semi-globally. The proposed architectures are compared against standard neural network feedback controllers through numerical simulations of two high-dimensional nonlinear optimal control problems: stabilization of an unstable Burgers-type partial differential equation, and altitude and course tracking for an unmanned aerial vehicle. The simulations demonstrate that standard neural networks can fail to stabilize the dynamics even when trained well, while the proposed architectures are always at least locally stabilizing and can achieve near-optimal performance.
IEEE Open Journal of Control Systems, vol. 1, pp. 210-222. PDF: https://ieeexplore.ieee.org/iel7/9552933/9683993/09887885.pdf
Citations: 3
Efficient Learning of Hyperrectangular Invariant Sets Using Gaussian Processes
Pub Date : 2022-09-12 DOI: 10.1109/OJCSYS.2022.3206083
Michael Enqi Cao;Matthieu Bloch;Samuel Coogan
We present a method for efficiently computing reachable sets and forward invariant sets for continuous-time systems with dynamics that include unknown components. Our main assumption is that, given any hyperrectangle of states, lower and upper bounds for the unknown components are available. With this assumption, the theory of mixed monotone systems allows us to formulate an efficient method for computing a hyperrectangular set that over-approximates the reachable set of the system. We then show a related approach that leads to sufficient conditions for identifying hyperrectangular sets that are forward invariant for the dynamics. We additionally show that set estimates tighten as the bounds on the unknown behavior tighten. Finally, we derive a method for satisfying our main assumption by modeling the unknown components as state-dependent Gaussian processes, providing bounds that are correct with high probability. A key benefit of our approach is to enable tractable computations for systems up to moderately high dimension that are subject to low dimensional uncertainty modeled as Gaussian processes, a class of systems that often appears in practice. We demonstrate our results on several examples, including a case study of a planar multirotor aerial vehicle.
IEEE Open Journal of Control Systems, vol. 1, pp. 223-236. PDF: https://ieeexplore.ieee.org/iel7/9552933/9683993/09888053.pdf
Citations: 3
Finite Sample Identification of Low-Order LTI Systems via Nuclear Norm Regularization
Pub Date : 2022-08-31 DOI: 10.1109/OJCSYS.2022.3200015
Yue Sun;Samet Oymak;Maryam Fazel
This paper studies the problem of identifying low-order linear time-invariant systems via Hankel nuclear norm (HNN) regularization. This regularization encourages the Hankel matrix to be low-rank, which corresponds to the dynamical system being of low order. We provide novel statistical analysis for this regularization, and contrast it with the unregularized ordinary least-squares (OLS) estimator. Our analysis leads to new finite-sample error bounds on estimating the impulse response and the Hankel matrix associated with the linear system using HNN regularization. We design a suitable input excitation, and show that we can recover the system using a number of observations that scales optimally with the true system order and achieves strong statistical estimation rates. Complementing these, we also demonstrate that the input design indeed matters by proving that intuitive choices, such as i.i.d. Gaussian input, lead to sub-optimal sample complexity. To better understand the benefits of regularization, we also revisit the OLS estimator. Besides refining existing bounds, we experimentally identify when HNN regularization improves over OLS: (1) for low-order systems with slow impulse-response decay, the OLS method performs poorly in terms of sample complexity; (2) the Hankel matrix returned by regularization has a clearer singular value gap, which makes determining the system order easier; (3) HNN regularization is less sensitive to hyperparameter choice. To choose the regularization parameter, we also outline a simple joint train-validation procedure.
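The link between Hankel rank and system order can be checked on a small example. The sketch below builds a Hankel matrix from a hypothetical impulse response with two poles and computes its exact rank with rational arithmetic; the helper names `hankel` and `rank` are illustrative. The nuclear norm itself, which acts as a convex surrogate for rank, is not computed here.

```python
from fractions import Fraction

def hankel(h, rows, cols):
    """Hankel matrix with entry (i, j) equal to h[i + j]."""
    return [[h[i + j] for j in range(cols)] for i in range(rows)]

def rank(matrix):
    """Exact rank by Gaussian elimination over rational numbers."""
    m = [row[:] for row in matrix]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

# Hypothetical impulse response of a second-order system with poles 1/2 and 1/4.
h = [Fraction(1, 2) ** k + Fraction(1, 4) ** k for k in range(1, 10)]
H = hankel(h, 5, 5)
print(rank(H))
```

Although the matrix is 5 by 5, its rank equals the number of poles in the impulse response. With noisy data the measured Hankel matrix would generally be full rank, which is the situation that a low-rank-promoting regularizer is meant to address.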
Citations: 2
Characterization of Multilayer Piezoelectric Stacks Down to 100K
Pub Date : 2022-08-29 DOI: 10.1117/12.2634962
S. Sherrit, M. Badescu, John B. Steeves, William E. Krieger, Clifford A. Klein, Otto R. Polanco, C. Weisberg, D. Van Buren, J. Sauvageau, K. Coste
A variety of applications require precision control at cryogenic temperatures. Next-generation space telescopes aim to increase aperture size and extend observations into the mid- through far-infrared regions, enabling new science ranging from exoplanet characterization to precision astronomical observations that further refine astrophysics models. Concepts include segmented telescopes capable of observations in the UV through IR bands, which drives the need for UV surface performance at cryogenic temperatures. The segments of these telescopes will require actuators for controlled surface displacements that can operate at cryogenic temperatures ($\le 150\,\text{K}$). The work reported in this paper is directed at understanding piezoelectric stack actuator operation down to cryogenic temperatures (100 K), which will give actuator designers the information needed to model and predict performance. The data reported down to 100 K include: resonance data, displacement voltage (S vs E) and capacitor voltage (D vs E) curves, stiffness, hysteresis, blocking force, DC resistance measurements, thermal strains, and the coefficients of thermal expansion as a function of the electrical boundary conditions. Open-loop control drive strategies and errors are also reported. We apply this data to a surface parallel actuator mirror design.
Citations: 1
Stable Reinforcement Learning for Optimal Frequency Control: A Distributed Averaging-Based Integral Approach
Pub Date : 2022-08-29 DOI: 10.1109/OJCSYS.2022.3202202
Yan Jiang;Wenqi Cui;Baosen Zhang;Jorge Cortés
Frequency control plays a pivotal role in reliable power system operations. It is conventionally performed in a hierarchical way that first rapidly stabilizes the frequency deviations and then slowly recovers the nominal frequency. However, as the generation mix shifts from synchronous generators to renewable resources, power systems experience larger and faster frequency fluctuations due to the loss of inertia, which adversely impacts frequency stability. This has motivated active research on algorithms that jointly address frequency degradation and economic efficiency on a fast timescale. A notable example is distributed averaging-based integral (DAI) control, which sets controllable power injections directly proportional to the integrals of frequency deviation and economic inefficiency signals. Nevertheless, DAI does not typically consider the transient performance of the system following power disturbances and has been restricted to quadratic operational cost functions. This paper aims to leverage nonlinear optimal controllers to simultaneously achieve optimal transient frequency control and find the most economical power dispatch for frequency restoration. To this end, we integrate reinforcement learning (RL) into the classic DAI, which results in RL-DAI control. Specifically, we use RL to learn a neural network-based control policy that maps the integral variables of DAI to the controllable power injections and provides optimal transient frequency control, while DAI inherently ensures frequency restoration and optimal economic dispatch. Compared to existing methods, we provide provable guarantees on the stability of the learned controllers and extend the set of allowable cost functions to a much larger class. Simulations on the 39-bus New England system illustrate our results.
Citations: 6