
Latest Publications in IEEE Open Journal of Control Systems

Distributionally Robust Policy and Lyapunov-Certificate Learning
Pub Date: 2024-08-07 DOI: 10.1109/OJCSYS.2024.3440051
Kehan Long;Jorge Cortés;Nikolay Atanasov
This article presents novel methods for synthesizing distributionally robust stabilizing neural controllers and certificates for control systems under model uncertainty. A key challenge in designing controllers with stability guarantees for uncertain systems is the accurate determination of and adaptation to shifts in model parametric uncertainty during online deployment. We tackle this with a novel distributionally robust formulation of the Lyapunov derivative chance constraint ensuring a monotonic decrease of the Lyapunov certificate. To avoid the computational complexity involved in dealing with the space of probability measures, we identify a sufficient condition in the form of deterministic convex constraints that ensures the Lyapunov derivative constraint is satisfied. We integrate this condition into a loss function for training a neural network-based controller and show that, for the resulting closed-loop system, the global asymptotic stability of its equilibrium can be certified with high confidence, even with Out-of-Distribution (OoD) model uncertainties. To demonstrate the efficacy and efficiency of the proposed methodology, we compare it with an uncertainty-agnostic baseline approach and several reinforcement learning approaches in two control problems in simulation. Open-source implementations of the examples are available at https://github.com/KehanLong/DR_Stabilizing_Policy.
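As a rough illustration of how a Lyapunov-decrease condition can be folded into a training loss, consider the minimal PyTorch sketch below. The dynamics `f`, network sizes, sampling box, and penalty weights are hypothetical stand-ins, and the paper's distributionally robust convex reformulation of the chance constraint is replaced here by a plain pointwise hinge penalty.

```python
import torch
import torch.nn as nn

# Hypothetical 2-D dynamics x_dot = f(x, u); a placeholder, not the paper's model.
def f(x, u):
    return torch.stack([x[:, 1], -x[:, 0] + u.squeeze(-1)], dim=1)

policy = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
V_net = nn.Sequential(nn.Linear(2, 32), nn.Tanh(), nn.Linear(32, 1))
opt = torch.optim.Adam(list(policy.parameters()) + list(V_net.parameters()), lr=1e-3)
alpha = 0.1  # target decrease rate in V_dot <= -alpha * V

for step in range(2000):
    x = (torch.rand(256, 2) - 0.5) * 4.0          # sample states from a box
    x.requires_grad_(True)
    V = V_net(x)
    gradV = torch.autograd.grad(V.sum(), x, create_graph=True)[0]
    V_dot = (gradV * f(x, policy(x))).sum(dim=1, keepdim=True)
    # Hinge penalties: V positive away from the origin, V_dot <= -alpha * V.
    loss_pos = torch.relu(1e-3 * x.norm(dim=1, keepdim=True) ** 2 - V).mean()
    loss_dec = torch.relu(V_dot + alpha * V).mean()
    loss = loss_pos + loss_dec
    opt.zero_grad(); loss.backward(); opt.step()
```

In the paper, the decrease penalty would instead encode the deterministic convex sufficient condition evaluated against an ambiguity set of model parameters, which is what yields the high-confidence certificate under OoD uncertainty.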
Citations: 0
Global Multi-Phase Path Planning Through High-Level Reinforcement Learning
Pub Date: 2024-07-29 DOI: 10.1109/OJCSYS.2024.3435080
Babak Salamat;Sebastian-Sven Olzem;Gerhard Elsbacher;Andrea M. Tonello
In this paper, we introduce the Global Multi-Phase Path Planning ($GMP^{3}$) algorithm for planner problems, which computes fast and feasible trajectories in environments with obstacles while accounting for physical and kinematic constraints. Our approach utilizes a Markov Decision Process (MDP) framework and high-level reinforcement learning techniques to ensure trajectory smoothness, continuity, and compliance with constraints. Through extensive simulations, we demonstrate the algorithm's effectiveness and efficiency across various scenarios. We highlight existing path planning challenges, particularly in integrating dynamic adaptability and computational efficiency. The results validate our method's convergence guarantees using Lyapunov's stability theorem and underscore its computational advantages.
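For a concrete feel of the MDP framing, here is a minimal value-iteration sketch on a toy occupancy grid. The grid, rewards, and discount factor are invented for illustration; the actual $GMP^{3}$ phases, smoothness terms, and kinematic constraints are not modeled.

```python
import numpy as np

# Toy 2-D grid: 0 = free, 1 = obstacle; reaching the goal pays 1, each step costs a little.
grid = np.zeros((10, 10)); grid[4, 2:8] = 1
goal, gamma = (9, 9), 0.95
moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]

V = np.zeros_like(grid, dtype=float)
for _ in range(200):                           # value iteration to (near) convergence
    V_new = V.copy()
    for i in range(10):
        for j in range(10):
            if grid[i, j] == 1 or (i, j) == goal:
                continue
            best = -np.inf
            for di, dj in moves:
                ni, nj = min(max(i + di, 0), 9), min(max(j + dj, 0), 9)
                if grid[ni, nj] == 1:
                    ni, nj = i, j              # bumping an obstacle: stay put
                r = 1.0 if (ni, nj) == goal else -0.01
                best = max(best, r + gamma * V[ni, nj])
            V_new[i, j] = best
    V = V_new
# A greedy rollout over the converged values then recovers an obstacle-free path,
# which a multi-phase planner would further smooth against kinematic constraints.
```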
Citations: 0
Risk-Aware Stochastic MPC for Chance-Constrained Linear Systems
Pub Date: 2024-07-01 DOI: 10.1109/OJCSYS.2024.3421372
Pouria Tooranjipour;Bahare Kiumarsi;Hamidreza Modares
This paper presents a fully risk-aware model predictive control (MPC) framework for chance-constrained discrete-time linear control systems with process noise. Conditional value-at-risk (CVaR) as a popular coherent risk measure is incorporated in both the constraints and the cost function of the MPC framework. This allows the system to navigate the entire spectrum of risk assessments, from worst-case to risk-neutral scenarios, ensuring both constraint satisfaction and performance optimization in stochastic environments. The recursive feasibility and risk-aware exponential stability of the resulting risk-aware MPC are demonstrated through rigorous theoretical analysis by considering the disturbance feedback policy parameterization. In the end, two numerical examples are given to elucidate the efficacy of the proposed method.
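As background, the sample-based CVaR used in such formulations is straightforward to compute; a minimal sketch follows (the substantive part of the paper is embedding CVaR in the MPC constraints and cost with a disturbance-feedback parameterization, which is not shown here).

```python
import numpy as np

def cvar(samples, alpha):
    """Empirical CVaR_alpha: average of the worst (1 - alpha) fraction of costs."""
    var = np.quantile(samples, alpha)        # empirical value-at-risk threshold
    return samples[samples >= var].mean()

rng = np.random.default_rng(0)
costs = rng.normal(1.0, 0.5, size=10_000)    # hypothetical stage-cost samples
print(cvar(costs, 0.95))                     # mean of roughly the worst 5% of outcomes
print(cvar(costs, 0.0))                      # alpha -> 0 recovers the plain mean
```

Sweeping alpha from 0 toward 1 is exactly the "entire spectrum" the abstract mentions: alpha = 0 is the risk-neutral expectation, while alpha -> 1 approaches the worst-case cost.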
Citations: 0
Leveraging the Turnpike Effect for Mean Field Games Numerics
Pub Date: 2024-06-26 DOI: 10.1109/OJCSYS.2024.3419642
René A. Carmona;Claire Zeng
Recently, a deep-learning algorithm referred to as the Deep Galerkin Method (DGM) has gained a lot of attention among those trying to numerically solve Mean Field Games with finite horizon, even though its performance seems to degrade significantly with increasing horizon. On the other hand, it has been proven that some specific classes of Mean Field Games enjoy a form of the turnpike property identified over seven decades ago by economists. The gist of this phenomenon is a proof that the solution of an optimal control problem over a long time interval spends most of its time near the stationary solution of the ergodic version of the corresponding infinite-horizon optimization problem. After reviewing the implementation of DGM for finite-horizon Mean Field Games, we introduce a "turnpike-accelerated" version that incorporates the turnpike estimates in the loss function to be optimized, and we perform a comparative numerical analysis to show the advantages of this accelerated version over the baseline DGM algorithm. We demonstrate the approach on Mean Field Game models with local couplings known to have the turnpike property, as well as on a new class of linear-quadratic models for which we derive explicit turnpike estimates.
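To convey the flavor of the "turnpike-accelerated" loss, the sketch below augments a DGM-style residual loss with a penalty pulling the interior-time solution toward a learned stationary profile. Everything here is a placeholder under stated assumptions: `pde_residual` stands in for the true coupled HJB/Fokker-Planck residuals, and the interior window and weight are invented rather than the paper's turnpike estimates.

```python
import torch
import torch.nn as nn

u_net = nn.Sequential(nn.Linear(2, 64), nn.Tanh(), nn.Linear(64, 1))  # u(t, x)
u_bar = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))  # stationary profile u_bar(x)

def pde_residual(tx):
    # Placeholder for the residual of the coupled MFG PDE system at (t, x).
    return u_net(tx)

T, lam = 1.0, 10.0
opt = torch.optim.Adam(list(u_net.parameters()) + list(u_bar.parameters()), lr=1e-3)
for step in range(1000):
    t = torch.rand(512, 1) * T
    x = torch.randn(512, 1)
    tx = torch.cat([t, x], dim=1)
    loss_pde = pde_residual(tx).pow(2).mean()
    # Turnpike penalty: away from t = 0 and t = T, u(t, x) should hug u_bar(x).
    interior = ((t > 0.2 * T) & (t < 0.8 * T)).float()
    loss_turnpike = (interior * (u_net(tx) - u_bar(x)).pow(2)).mean()
    loss = loss_pde + lam * loss_turnpike
    opt.zero_grad(); loss.backward(); opt.step()
```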
Citations: 0
Concurrent Learning of Control Policy and Unknown Safety Specifications in Reinforcement Learning
Pub Date: 2024-06-24 DOI: 10.1109/OJCSYS.2024.3418306
Lunet Yifru;Ali Baheri
Reinforcement learning (RL) has revolutionized decision-making across a wide range of domains over the past few decades. Yet, deploying RL policies in real-world scenarios presents the crucial challenge of ensuring safety. Traditional safe RL approaches have predominantly focused on incorporating predefined safety constraints into the policy learning process. However, this reliance on predefined safety constraints poses limitations in dynamic and unpredictable real-world settings where such constraints may not be available or sufficiently adaptable. Bridging this gap, we propose a novel approach that concurrently learns a safe RL control policy and identifies the unknown safety constraint parameters of a given environment. Initializing with a parametric signal temporal logic (pSTL) safety specification and a small initial labeled dataset, we frame the problem as a bilevel optimization task, intricately integrating constrained policy optimization, using a Lagrangian-variant of the twin delayed deep deterministic policy gradient (TD3) algorithm, with Bayesian optimization for optimizing parameters for the given pSTL safety specification. Through experimentation in comprehensive case studies, we validate the efficacy of this approach across varying forms of environmental constraints, consistently yielding safe RL policies with high returns. Furthermore, our findings indicate successful learning of STL safety constraint parameters, exhibiting a high degree of conformity with true environmental safety constraints. The performance of our model closely mirrors that of an ideal scenario that possesses complete prior knowledge of safety constraints, demonstrating its proficiency in accurately identifying environmental safety constraints and learning safe policies that adhere to those constraints. A Python implementation of the algorithm can be found at https://github.com/SAILRIT/Concurrent-Learning-of-Control-Policy-and-Unknown-Constraints-in-Reinforcement-Learning.git.
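Schematically, the bilevel structure can be illustrated on a deliberately tiny toy problem, sketched below: the "policy" is a scalar gain, the unknown pSTL parameter is a single safety threshold, and the outer Bayesian optimization is replaced by random search. The paper's actual TD3-Lagrangian inner loop and pSTL machinery are far richer; every name and constant here is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
TRUE_THRESHOLD = 0.6               # unknown environment safety parameter

def rollout_is_safe(gain):
    # Toy labeled data: a rollout is "safe" if its peak state stays below the threshold.
    peak = 0.5 * gain + rng.normal(0, 0.05)
    return peak <= TRUE_THRESHOLD

def inner_policy_opt(theta):
    # Inner loop stand-in: highest-return gain whose predicted peak satisfies the spec.
    gains = np.linspace(0.0, 2.0, 201)
    feasible = gains[0.5 * gains <= theta]
    return feasible[-1] if feasible.size else 0.0   # return grows with the gain

best_theta, best_score = -np.inf, -np.inf
for theta in rng.uniform(0.1, 1.5, size=50):        # outer loop (stand-in for BO)
    gain = inner_policy_opt(theta)
    labels = np.array([rollout_is_safe(gain) for _ in range(20)])
    predicted = 0.5 * gain <= theta                  # what the candidate spec predicts
    score = np.mean(labels == predicted)             # agreement with observed labels
    if score > best_score or (score == best_score and theta > best_theta):
        best_theta, best_score = theta, score        # tie-break toward the least conservative spec
print(best_theta, best_score)  # a slightly conservative estimate of the true threshold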
Citations: 0
Solving Decision-Dependent Games by Learning From Feedback
Pub Date: 2024-06-19 DOI: 10.1109/OJCSYS.2024.3416768
Killian Wood;Ahmed S. Zamzam;Emiliano Dall'Anese
This paper tackles the problem of solving stochastic optimization problems with a decision-dependent distribution in the setting of stochastic strongly-monotone games and when the distributional dependence is unknown. A two-stage approach is proposed, which initially involves estimating the distributional dependence on decision variables, and subsequently optimizing over the estimated distributional map. The paper presents guarantees for the approximation of the cost of each agent. Furthermore, a stochastic gradient-based algorithm is developed and analyzed for finding the Nash equilibrium in a distributed fashion. Numerical simulations are provided for a novel electric vehicle charging market formulation using real-world data.
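The two-stage idea can be illustrated on a scalar toy (a single-player special case rather than a game): stage one fits a linear decision-to-distribution map from probe samples, and stage two runs stochastic gradient descent on the decision-dependent cost using the fitted map. The map, cost, and step sizes below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
A_TRUE = -0.8        # unknown sensitivity of the sample distribution to the decision

# Stage 1: probe decisions, observe y ~ N(A*x, 1), fit A by least squares.
xs = rng.uniform(-1.0, 1.0, 200)
ys = A_TRUE * xs + rng.normal(0.0, 1.0, 200)
A_hat = (xs @ ys) / (xs @ xs)

# Stage 2: stochastic gradient descent on J(x) = E_{y ~ D(x)}[(x - y)^2],
# sampling y from the true distribution but using the fitted map in the gradient.
x = 1.0
for k in range(1, 5001):
    y = A_TRUE * x + rng.normal()
    grad = 2.0 * (x - y) * (1.0 - A_hat)   # plug-in gradient via the estimated map
    x -= grad / (10.0 + k)                 # diminishing step size
print(A_hat, x)                            # A_hat ~ -0.8, x close to the optimizer 0.0
```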
Citations: 0
Sorta Solving the OPF by Not Solving the OPF: DAE Control Theory and the Price of Realtime Regulation
Pub Date: 2024-06-13 DOI: 10.1109/OJCSYS.2024.3414221
Muhammad Nadeem;Ahmad F. Taha
This paper presents a new approach to approximate the AC optimal power flow (ACOPF). By eliminating the need to solve the ACOPF every few minutes, the paper showcases how a realtime feedback controller can be utilized in lieu of ACOPF and its variants. By i) forming the grid dynamics as a system of differential-algebraic equations (DAE) that naturally encode the non-convex OPF power flow constraints, ii) utilizing DAE-Lyapunov theory, and iii) designing a feedback controller that captures realtime uncertainty while being uncertainty-unaware, the presented approach demonstrates promises of obtaining solutions that are close to the OPF ones without needing to solve the OPF. The proposed controller responds in realtime to deviations in renewables generation and loads, guaranteeing improvements in system transient stability, while always yielding approximate solutions of the ACOPF with no constraint violations. As the studied approach herein yields slightly more expensive realtime generator controls, the corresponding price of realtime control and regulation is examined. Cost comparisons with the traditional ACOPF are also showcased—all via case studies on standard power networks.
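To fix ideas on what controlling a DAE directly looks like, here is a minimal semi-explicit sketch: a fixed linear state feedback is applied while the algebraic constraint is re-solved at each integration step, so no optimization problem is solved online. The dynamics, constraint, and gain are invented and bear no relation to actual grid models or the paper's Lyapunov-based design.

```python
import numpy as np
from scipy.optimize import fsolve

# Semi-explicit DAE: x_dot = f(x, z, u), 0 = g(z, x).
def f(x, z, u):
    return np.array([x[1], -x[0] - 0.5 * x[1] + z[0] + u])

def g(z, x):
    return np.array([z[0] + 0.3 * x[0] - 0.1])   # invented algebraic coupling

K = np.array([2.0, 1.0])                          # invented feedback gain
x, z, dt = np.array([1.0, 0.0]), np.array([0.0]), 0.01
for _ in range(1000):                             # forward Euler on the ODE part
    z = fsolve(g, z, args=(x,))                   # re-solve the algebraic equation
    u = -K @ x                                    # realtime feedback, no OPF solve
    x = x + dt * f(x, z, u)
```

The trade-off the paper prices out is visible even here: the feedback reacts instantly to deviations, but a fixed control law is generally slightly more expensive than the true optimizer the OPF would compute.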
Citations: 0
Regional PID Control of Switched Positive Systems With Multiple Equilibrium Points
Pub Date: 2024-04-18 DOI: 10.1109/OJCSYS.2024.3391001
Pei Zhang;Junfeng Zhang;Xuan Jia
This paper investigates the regional control problem of switched positive systems with multiple equilibrium points. A proportional-integral-derivative controller is designed by combining the output, the error between the state and the equilibrium point, and the difference of the output. A cone is introduced to design the final stable region. Two classes of copositive Lyapunov functions are constructed to achieve the stability and regional stability of the subsystems and the whole system, respectively. Then, a novel class of observers with multiple equilibrium points is proposed using a matrix decomposition approach. The observer-based proportional-integral-derivative control problem is thus solved, and all states are driven to the designed cone region under the designed controller. All conditions are formulated in the form of linear programming. The novelties of this paper are threefold: (i) a proportional-integral-derivative control framework is introduced for the considered systems, (ii) a Luenberger observer is developed for the case of multiple equilibrium points, and (iii) copositive Lyapunov functions and linear programming are employed for the analysis and design of the controller and observer. Finally, the effectiveness of the proposed design is verified via two examples.
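In discrete time, a controller of the described structure reduces to a familiar PID loop; below is a minimal scalar sketch with an invented first-order plant and gains. For simplicity the proportional term here acts on the tracking error, whereas the paper's design combines the output, the state-to-equilibrium error, and the output difference, with cone constraints and copositive analysis on top.

```python
Kp, Ki, Kd, dt = 1.2, 0.5, 0.1, 0.01
x_eq = 1.0                       # equilibrium point targeted by this subsystem
x, integ, y_prev = 0.0, 0.0, 0.0

for k in range(3000):
    y = x                        # measured output (here simply the state)
    err = x_eq - x               # error between state and equilibrium point
    integ += err * dt            # integral of the error
    deriv = (y - y_prev) / dt    # difference of the output
    u = Kp * err + Ki * integ - Kd * deriv
    y_prev = y
    x = x + dt * (-x + u)        # toy stable positive first-order plant
print(x)                         # settles near x_eq = 1.0
```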
Citations: 0
Novel Bounds for Incremental Hessian Estimation With Application to Zeroth-Order Federated Learning
Pub Date: 2024-04-15 DOI: 10.1109/OJCSYS.2024.3388374
Alessio Maritan;Luca Schenato;Subhrakanti Dey
The Hessian matrix conveys important information about the curvature, spectrum and partial derivatives of a function, and is required in a variety of tasks. However, computing the exact Hessian is prohibitively expensive for high-dimensional input spaces, and is simply impossible in zeroth-order optimization, where the objective function is a black-box of which only input-output pairs are known. In this work we address this relevant problem by providing a rigorous analysis of a Hessian estimator available in the literature, allowing it to be used as a provably accurate replacement of the true Hessian matrix. The Hessian estimator is randomized and incremental, and its computation requires only point function evaluations. We provide non-asymptotic convergence bounds on the estimation error and derive the minimum number of function queries needed to achieve a desired accuracy with arbitrarily high probability. In the second part of the paper we show a practical application of our results, introducing a novel optimization algorithm suitable for non-convex and black-box federated learning. The algorithm only requires clients to evaluate their local functions at certain input points, and builds a sufficiently accurate estimate of the global Hessian matrix in a distributed way. The algorithm exploits inexact cubic regularization to escape saddle points and guarantees convergence with optimal iteration complexity and high probability. Numerical results show that the proposed algorithm outperforms existing zeroth-order federated algorithms in both convex and non-convex problems. Furthermore, we achieve similar performance to state-of-the-art algorithms for federated convex optimization that use exact gradients and Hessian matrices.
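The numpy sketch below shows one generic member of this family of randomized, incremental, evaluation-only Hessian estimators, built on the Gaussian-smoothing identity E[(uᵀHu)(uuᵀ − I)] = 2H for u ~ N(0, I). It illustrates the idea only; it is not necessarily the specific estimator analyzed in the paper.

```python
import numpy as np

def zo_hessian(f, x, n_queries=2000, delta=1e-3, rng=None):
    """Incremental randomized Hessian estimate from point evaluations only.

    Each iteration costs two extra function queries; the running average is
    updated incrementally so no samples need to be stored.
    """
    rng = rng or np.random.default_rng(0)
    d = x.size
    H = np.zeros((d, d))
    fx = f(x)
    for k in range(1, n_queries + 1):
        u = rng.standard_normal(d)
        quad = (f(x + delta * u) + f(x - delta * u) - 2 * fx) / delta**2  # ~ u^T H u
        sample = 0.5 * quad * (np.outer(u, u) - np.eye(d))               # unbiased for H
        H += (sample - H) / k                     # incremental running average
    return 0.5 * (H + H.T)                        # symmetrize

# Quick check on a quadratic with known Hessian.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
f = lambda x: 0.5 * x @ A @ x
print(zo_hessian(f, np.array([0.3, -0.7])))       # ~ A, up to Monte Carlo error
```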
Citations: 0
Learning Robust Output Control Barrier Functions From Safe Expert Demonstrations
Pub Date: 2024-04-04 DOI: 10.1109/OJCSYS.2024.3385348
Lars Lindemann;Alexander Robey;Lejun Jiang;Satyajeet Das;Stephen Tu;Nikolai Matni
This paper addresses learning safe output feedback control laws from partial observations of expert demonstrations. We assume that a model of the system dynamics and a state estimator are available along with corresponding error bounds, e.g., estimated from data in practice. We first propose robust output control barrier functions (ROCBFs) as a means to guarantee safety, as defined through controlled forward invariance of a safe set. We then formulate an optimization problem to learn ROCBFs from expert demonstrations that exhibit safe system behavior, e.g., data collected from a human operator or an expert controller. When the parametrization of the ROCBF is linear, then we show that, under mild assumptions, the optimization problem is convex. Along with the optimization problem, we provide verifiable conditions in terms of the density of the data, smoothness of the system model and state estimator, and the size of the error bounds that guarantee validity of the obtained ROCBF. Towards obtaining a practical control algorithm, we propose an algorithmic implementation of our theoretical framework that accounts for assumptions made in our framework in practice. We validate our algorithm in the autonomous driving simulator CARLA and demonstrate how to learn safe control laws from simulated RGB camera images.
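When the barrier is linear in its parameters, B_θ(x) = θᵀφ(x), the learning problem becomes convex under mild assumptions, as the abstract notes. The sketch below conveys only the separation part of that idea on invented data: plain hinge penalties push B_θ positive on demonstrated-safe states and negative elsewhere. The decrease condition along the dynamics, the output-feedback structure, and the robustness margins for estimator error are all omitted.

```python
import numpy as np

rng = np.random.default_rng(3)

def phi(x):                      # invented feature map for B_theta(x) = theta @ phi(x)
    return np.array([1.0, x[0], x[1], x[0]**2, x[1]**2])

# Toy "expert" data: safe states inside a box near the origin, unsafe states far away.
X_safe = rng.uniform(-0.7, 0.7, (200, 2))
X_unsafe = rng.uniform(1.1, 2.0, (200, 2)) * rng.choice([-1, 1], (200, 2))

theta = np.zeros(5)
lr, margin = 0.05, 0.1
for _ in range(2000):            # subgradient descent on the convex hinge losses
    g = np.zeros_like(theta)
    for x in X_safe:             # want B(x) >= margin on safe demonstrations
        if theta @ phi(x) < margin:
            g -= phi(x)
    for x in X_unsafe:           # want B(x) <= -margin outside the safe set
        if theta @ phi(x) > -margin:
            g += phi(x)
    theta -= lr * g / (len(X_safe) + len(X_unsafe))
```

The zero level set of the learned B_θ then delimits the estimated safe set; the paper's verifiable conditions tie its validity to the data density and the model and estimator error bounds.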
Citations: 0