IEEE Open Journal of Control Systems: Latest Articles

VisioPath: Vision-Language Enhanced Model Predictive Control for Safe Autonomous Navigation in Mixed Traffic
Pub Date : 2025-10-10 DOI: 10.1109/OJCSYS.2025.3620149
Shanting Wang;Panagiotis Typaldos;Chenjun Li;Andreas A. Malikopoulos
In this paper, we introduce VisioPath, a novel framework combining vision-language models (VLMs) with model predictive control (MPC) to enable safe autonomous driving in dynamic traffic environments. The proposed approach leverages a bird's-eye view video processing pipeline and zero-shot VLM capabilities to obtain structured information about surrounding vehicles, including their positions, dimensions, and velocities, while providing semantically-informed initial trajectory guesses that warm-start the optimizer and enable contextually-aware navigation decisions (e.g., yielding to emergency vehicles). Using this rich perception output, we shape elliptical collision-avoidance potential fields around other traffic participants, which are seamlessly integrated into a finite-horizon optimal control problem for trajectory planning. The resulting trajectory optimization is solved via differential dynamic programming and is embedded in an event-triggered MPC loop. To ensure collision-free motion, a safety verification layer is incorporated in the framework that provides an assessment of potentially unsafe trajectories. Extensive simulations in SUMO and CARLA simulators demonstrate that VisioPath outperforms other baseline approaches, such as conventional MPC, A*, RRT and CBF methods, across multiple metrics. By combining modern AI-driven perception with the rigorous foundation of optimal control, VisioPath represents a significant step forward in safe trajectory planning for complex traffic systems.
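The elliptical collision-avoidance potential field mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the functional form, so the Gaussian-shaped field below, the function name `elliptical_potential`, and the parameters `semi_axes` and `gain` are assumptions for illustration, not the authors' formulation.

```python
import math


def elliptical_potential(ego_xy, obs_xy, obs_heading, semi_axes, gain=1.0):
    """Repulsive potential around one obstacle (assumed Gaussian-like form).

    The ego offset is rotated into the obstacle's body frame so the ellipse
    is aligned with the obstacle's heading. semi_axes = (a, b) are the
    longitudinal and lateral semi-axes of the ellipse.
    """
    dx = ego_xy[0] - obs_xy[0]
    dy = ego_xy[1] - obs_xy[1]
    c, s = math.cos(obs_heading), math.sin(obs_heading)
    lon = c * dx + s * dy      # offset along the obstacle's heading
    lat = -s * dx + c * dy     # offset perpendicular to the heading
    a, b = semi_axes
    return gain * math.exp(-((lon / a) ** 2 + (lat / b) ** 2))


# Example: obstacle at the origin heading along +x, 4 m by 1.5 m semi-axes.
at_center = elliptical_potential((0.0, 0.0), (0.0, 0.0), 0.0, (4.0, 1.5))
ahead = elliptical_potential((3.0, 0.0), (0.0, 0.0), 0.0, (4.0, 1.5))
beside = elliptical_potential((0.0, 3.0), (0.0, 0.0), 0.0, (4.0, 1.5))
```

With the longer semi-axis aligned with the obstacle's heading, a point 3 m ahead of the obstacle is penalized more strongly than a point 3 m beside it, which is the usual motivation for an elliptical rather than circular field in lane-based traffic.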
IEEE Open Journal of Control Systems, vol. 4, pp. 562-580. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11199901
Citations: 0
Norm-Bounded Model Predictive Control Allocation Strategy for an Over-Actuated Aircraft
Pub Date : 2025-10-09 DOI: 10.1109/OJCSYS.2025.3619810
V. Scordamaglia;M. Mattei;G. Franzé
In this paper, a novel solution for addressing the control allocation problem for over-actuated autonomous aircraft is presented. In particular, a detailed High Altitude Performance Demonstrator (HAPD) is used to show the effectiveness of the control/allocation architecture. The novelty of the proposed solution consists of designing a model predictive controller that complies with input saturations, geometric constraints, and model uncertainties, and that offers tracking capabilities. The controller is used during online operation to adapt the nominal allocation unit to the time-varying conditions arising from the nonlinear aircraft dynamics. To make this approach viable, the state trajectories of the nonlinear envelope are formally embedded into those pertaining to a norm-bounded linear description. Then the allocation task is addressed by defining an online reference generator in charge of providing a feasible reference trajectory compatible with time-varying flight conditions. Finally, the nominal allocation is adapted online by exploiting state prediction features of the model predictive controller. A simulation campaign, including comparisons with a well-known competitor, highlights the effectiveness of the proposed approach in fulfilling constraints, ensuring accurate trajectory tracking, and optimally allocating the control effort.
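For readers unfamiliar with control allocation, the following sketch shows the basic problem the abstract refers to: an over-actuated vehicle has more actuators than commanded quantities, so the actuator commands that realize a desired virtual command are not unique. The code is a generic minimum-norm (pseudo-inverse) allocator for two virtual commands, not the MPC-based scheme of the paper; the effectiveness matrix `B`, the limits, and the function names are hypothetical.

```python
def min_norm_allocation(B, v):
    """Return u = B^T (B B^T)^{-1} v for a 2 x n effectiveness matrix B.

    Among all u with B u = v, this is the one with the smallest Euclidean norm.
    """
    n = len(B[0])
    # Entries of the 2 x 2 Gram matrix B B^T.
    g00 = sum(B[0][j] * B[0][j] for j in range(n))
    g01 = sum(B[0][j] * B[1][j] for j in range(n))
    g11 = sum(B[1][j] * B[1][j] for j in range(n))
    det = g00 * g11 - g01 * g01
    if abs(det) < 1e-12:
        raise ValueError("rows of B are linearly dependent")
    # lam = (B B^T)^{-1} v via the closed-form 2 x 2 inverse.
    lam0 = (g11 * v[0] - g01 * v[1]) / det
    lam1 = (-g01 * v[0] + g00 * v[1]) / det
    return [B[0][j] * lam0 + B[1][j] * lam1 for j in range(n)]


def within_limits(u, lower, upper):
    """Check actuator saturation limits element-wise."""
    return all(lo <= ui <= hi for ui, lo, hi in zip(u, lower, upper))


# Hypothetical example: three actuators, two virtual commands.
B = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
v = [1.0, 1.0]
u = min_norm_allocation(B, v)
achieved = [sum(B[i][j] * u[j] for j in range(3)) for i in range(2)]
feasible = within_limits(u, [-1.0] * 3, [1.0] * 3)
```

A static allocator of this kind ignores saturation until after the fact, which is one reason predictive, constraint-aware allocation schemes such as the one described in the abstract are of interest.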
IEEE Open Journal of Control Systems, vol. 4, pp. 518-530. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11197644
Citations: 0
Constrained Path Planning for Soft Continuum Robots With Bernstein Surfaces
Pub Date : 2025-10-02 DOI: 10.1109/OJCSYS.2025.3617288
Maxwell Hammond;Ean Lovett;Vincenzo Pugliese;Venanzio Cichella;Caterina Lamuta
This manuscript presents a framework for trajectory generation for soft continuum robots using principles from optimal control. The problem is constrained over the partial differential kinematic equations of the Cosserat rod model, capturing all modes of deformation which soft continuum systems can achieve. The derived optimal control problem is transformed to a nonlinear programming problem which can be solved using the Bernstein polynomial basis. Non-unit quaternions are used to discretely apply rotational transformations to values approximated over Bernstein polynomials, allowing individual components of strain to be constrained separately. Included within this manuscript are numerical results as well as validation through experimental results.
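The Bernstein polynomial basis named in the abstract has two properties that make it convenient for constrained trajectory problems: the polynomial interpolates its first and last coefficients, and its values stay between the smallest and largest coefficient. The sketch below evaluates a one-dimensional Bernstein polynomial on [0, 1]; the paper works with Bernstein surfaces (two parameters), so this is a simplified illustration and the function name is an assumption.

```python
from math import comb


def bernstein_eval(coeffs, t):
    """Evaluate sum_k c_k * C(n, k) * t^k * (1 - t)^(n - k) for t in [0, 1]."""
    n = len(coeffs) - 1
    return sum(
        c * comb(n, k) * t ** k * (1.0 - t) ** (n - k)
        for k, c in enumerate(coeffs)
    )


# Cubic example with illustrative coefficients.
coeffs = [0.0, 2.0, -1.0, 1.0]
start = bernstein_eval(coeffs, 0.0)   # equals coeffs[0]
end = bernstein_eval(coeffs, 1.0)     # equals coeffs[-1]
mid = bernstein_eval(coeffs, 0.5)
samples = [bernstein_eval(coeffs, i / 20) for i in range(21)]
```

Because the values are bounded by the coefficients, a bound imposed on the coefficients is sufficient (though conservative) for the same bound to hold along the whole curve, which lets an optimizer enforce constraints on a finite set of decision variables.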
IEEE Open Journal of Control Systems, vol. 4, pp. 618-628. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11190071
Citations: 0
Deep Operator Neural Network Model Predictive Control
Pub Date : 2025-09-26 DOI: 10.1109/OJCSYS.2025.3614875
Thomas O. de Jong;Khemraj Shukla;Mircea Lazar
In this paper, we consider the design of model predictive control (MPC) algorithms based on deep operator neural networks (DeepONets) (Lu et al. 2021). These neural networks are capable of accurately approximating real- and complex-valued solutions (Jiang et al. 2024) of continuous-time nonlinear systems without relying on recurrent architectures. The DeepONet architecture is made up of two feedforward neural networks: the branch network, which encodes the input function space, and the trunk network, which represents dependencies on temporal variables or initial conditions. Utilizing the original DeepONet architecture (Lu et al. 2021) as a predictor within MPC for Multi-Input Multi-Output (MIMO) systems requires multiple branch networks, one for each input, to generate multi-output predictions. Moreover, to predict multiple time steps into the future, the network has to be evaluated multiple times. Motivated by this, we introduce a multi-step DeepONet (MS-DeepONet) architecture that computes multi-step predictions of system outputs from multi-step input sequences in one shot, which is better suited for MPC. We prove that the MS-DeepONet is a universal approximator in terms of multi-step sequence prediction. Additionally, we develop automated hyperparameter selection strategies and implement MPC frameworks using both the standard DeepONet and the proposed MS-DeepONet architectures in PyTorch. We compare MS-DeepONet, standard DeepONet, and LSTM-based controllers on learning and predictive control tasks for the Van der Pol oscillator and the quadruple tank process. The MS-DeepONet is also evaluated on a challenging cart–pendulum system, where it successfully learns swing-up and stabilization policies. Across the examples, MS-DeepONet outperforms standard DeepONet in prediction accuracy and control performance, and achieves significantly lower computation times than Long Short-Term Memory (LSTM) based MPC.
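The branch/trunk structure described in the abstract can be sketched in a few lines. The example below is a forward pass of a standard single-output DeepONet with untrained random weights, written in plain Python rather than PyTorch so it has no dependencies; the layer sizes, the initialization, and all function names are assumptions, and it does not implement the MS-DeepONet proposed in the paper.

```python
import math
import random


def dense(x, W, b, activation=None):
    """One fully connected layer: activation(W x + b)."""
    out = [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]
    return [activation(o) for o in out] if activation else out


def make_mlp(sizes, rng):
    """Random weights and zero biases for layer widths given in `sizes`."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = [[rng.uniform(-1.0, 1.0) / math.sqrt(n_in) for _ in range(n_in)]
             for _ in range(n_out)]
        layers.append((W, [0.0] * n_out))
    return layers


def mlp(x, layers):
    """tanh hidden layers followed by a linear output layer."""
    for i, (W, b) in enumerate(layers):
        is_last = i == len(layers) - 1
        x = dense(x, W, b, None if is_last else math.tanh)
    return x


def deeponet(u_samples, t, branch, trunk, bias=0.0):
    """Prediction at query time t: inner product of branch and trunk features."""
    b_feat = mlp(u_samples, branch)   # encodes the sampled input function
    t_feat = mlp([t], trunk)          # encodes the query time
    return sum(bk * tk for bk, tk in zip(b_feat, t_feat)) + bias


rng = random.Random(0)
branch = make_mlp([3, 8, 4], rng)     # 3 input samples -> 4 features
trunk = make_mlp([1, 8, 4], rng)      # scalar time -> 4 features
y = deeponet([0.1, 0.2, 0.3], 0.5, branch, trunk)
y_neg = deeponet([-0.1, -0.2, -0.3], 0.5, branch, trunk)
```

Predicting several future steps with this structure requires one evaluation per query time, which is the cost the multi-step architecture in the abstract is designed to avoid.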
IEEE Open Journal of Control Systems, vol. 4, pp. 501-517. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11181185
Citations: 0
Collaborative Safety-Critical Control in Coupled Networked Systems
Pub Date : 2025-09-24 DOI: 10.1109/OJCSYS.2025.3614070
Brooks A. Butler;Philip E. Paré
As modern systems become increasingly connected with complex dynamic coupling relationships, developing safe control methods for such interconnected systems becomes paramount. In this paper, we explore the relationship of node-level safety definitions for individual agents to local neighborhood dynamics. We define a collaborative control barrier function and provide conditions under which sets defined by these functions will be forward invariant. We use collaborative control barrier functions to construct a novel decentralized algorithm for the safe control of collaborating network agents and provide conditions under which the algorithm is guaranteed to return a viable set of safe control actions for all agents. We then illustrate these results on a networked susceptible-infected-susceptible (SIS) model.
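To make the networked SIS setting concrete, the sketch below simulates the standard networked SIS dynamics, dx_i/dt = -delta_i x_i + (1 - x_i) sum_j beta_ij x_j, with a forward-Euler step and evaluates a node-level barrier value h_i(x) = x_max_i - x_i. The threshold form of the barrier, the parameter values, and the function names are assumptions for illustration; the collaborative control barrier function and the decentralized algorithm of the paper are not reproduced here.

```python
def sis_step(x, beta, delta, dt):
    """One forward-Euler step of the networked SIS model.

    x[i] is the infected fraction at node i, beta[i][j] the infection rate
    from node j to node i, and delta[i] the healing rate at node i.
    """
    n = len(x)
    return [
        x[i] + dt * (-delta[i] * x[i]
                     + (1.0 - x[i]) * sum(beta[i][j] * x[j] for j in range(n)))
        for i in range(n)
    ]


def barrier(x, x_max):
    """Node-level safety margins; node i is inside its safe set when h_i >= 0."""
    return [xm - xi for xi, xm in zip(x, x_max)]


# Two coupled nodes with healing rates that dominate infection rates.
beta = [[0.0, 0.2], [0.2, 0.0]]
delta = [1.0, 1.0]
x = [0.5, 0.1]
for _ in range(1000):                 # simulate 10 time units with dt = 0.01
    x = sis_step(x, beta, delta, 0.01)
margins = barrier([0.2, 0.1], [0.3, 0.3])
```

Because each node's infection rate depends on its neighbors' states, whether node i stays below its threshold depends on the neighbors' behavior as well as its own, which is the coupling that motivates a collaborative notion of safety.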
IEEE Open Journal of Control Systems, vol. 4, pp. 433-446. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11176994
Citations: 0
Cautious Optimization via Data Informativity
Pub Date : 2025-09-22 DOI: 10.1109/OJCSYS.2025.3612784
Jaap Eising;Jorge Cortés
This paper deals with the problem of accurately determining guaranteed suboptimal values of an unknown cost function on the basis of noisy measurements. We consider a set-valued variant to regression where, instead of finding a best estimate of the cost function, we reason over all functions compatible with the measurements and apply robust methods explicitly in terms of the data. Our treatment provides data-based conditions under which closed-form expressions of upper bounds of the unknown function can be obtained, and regularity properties like convexity and Lipschitzness can be established. These results allow us to perform point- and set-wise verification of suboptimality, and tackle the cautious optimization of the unknown function in both one-shot and online scenarios. We showcase the versatility of the proposed methods in two control-relevant problems: data-driven contraction analysis of unknown nonlinear systems and suboptimal regulation with unknown dynamics and cost. Simulations illustrate our results.
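The idea of reasoning over all functions compatible with the data, rather than over a single best fit, can be shown with a deliberately simpler setting than the one in the paper. The sketch below assumes the unknown scalar function is Lipschitz with a known constant and that measurement noise is bounded in magnitude; both assumptions, and the function names, are introduced here for illustration and are not taken from the paper.

```python
def upper_bound(x, data, lipschitz, noise):
    """Largest value any data-compatible function can take at x.

    Each sample (xi, yi) implies f(x) <= yi + noise + lipschitz * |x - xi|;
    the tightest of these bounds holds for every compatible function.
    """
    return min(yi + noise + lipschitz * abs(x - xi) for xi, yi in data)


def cautious_minimizer(candidates, data, lipschitz, noise):
    """Pick the candidate whose guaranteed upper bound is smallest."""
    return min(candidates, key=lambda x: upper_bound(x, data, lipschitz, noise))


# Samples of f(x) = |x - 1|, which has Lipschitz constant 1.
data = [(0.0, 1.0), (2.0, 1.0), (1.0, 0.0)]
candidates = [0.0, 0.5, 1.0, 1.5, 2.0]
best = cautious_minimizer(candidates, data, lipschitz=1.0, noise=0.1)
guaranteed = upper_bound(best, data, lipschitz=1.0, noise=0.1)
```

The returned value `guaranteed` is a certificate: whatever the true function is, as long as it is consistent with the data and the stated assumptions, its value at `best` does not exceed that number.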
IEEE Open Journal of Control Systems, vol. 4, pp. 400-417. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11175185
Citations: 0
Mutual Support by Sensor-Attacker Team for a Passive Target
Pub Date : 2025-09-19 DOI: 10.1109/OJCSYS.2025.3612246
Prajakta Surve;Shaunak D. Bopardikar;Alexander Von Moll;Isaac Weintraub;David W. Casbeer
We introduce a pursuit game played between a team of a sensor and an attacker and a mobile target in the unbounded Euclidean plane. The target is faster than the sensor, but slower than the attacker. The sensor’s objective is to keep the target within a sensing radius so that the attacker can capture the target, whereas the target seeks to escape by reaching beyond the sensing radius from the sensor without getting captured by the attacker. We assume that as long as the target is within the sensing radius from the sensor, the sensor-attacker team is able to measure the target’s instantaneous position and velocity. We pose and solve this problem as a Game of Kind in which the target uses an open-loop strategy (passive target). Aside from the novel formulation, our contributions are four-fold. First, we present optimal strategies for both the sensor and the attacker, according to their respective objectives. Specifically, we design a sensor strategy that maximizes the duration for which the target remains within its sensing range, while the attacker uses proportional navigation to capture the target. Second, we characterize the sensable region – the region in the plane in which the target remains within the sensing radius of the sensor during the game – and show that capture is guaranteed if and only if the Apollonius circle between the attacker and the target is fully contained within this region. Third, we derive a lower bound on the target’s speed below which capture is guaranteed, and an upper bound on the target speed above which there exists an escape strategy for the target, from an arbitrary initial orientation between the agents. Fourth, for a given initial orientation between the agents, we present a sharper upper bound on the target speed above which there exists an escape strategy for the target.
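The Apollonius circle used in the capture condition has a closed form. For an attacker at A with speed v_A and a slower target at T with speed v_T, let mu = v_T / v_A < 1; the circle is the set of points P with |PT| = mu |PA|, and its interior contains the points the target can reach before the attacker under straight-line, constant-speed motion. The sketch below computes its center and radius; the function name and the example numbers are assumptions, and the sensable-region containment test from the paper is not implemented.

```python
import math


def apollonius_circle(attacker, target, v_attacker, v_target):
    """Center and radius of the set {P : |P - target| = mu * |P - attacker|}."""
    mu = v_target / v_attacker
    if not 0.0 < mu < 1.0:
        raise ValueError("requires 0 < v_target < v_attacker")
    k = mu * mu
    # Center: (T - mu^2 A) / (1 - mu^2); radius: mu |A - T| / (1 - mu^2).
    cx = (target[0] - k * attacker[0]) / (1.0 - k)
    cy = (target[1] - k * attacker[1]) / (1.0 - k)
    radius = mu * math.dist(attacker, target) / (1.0 - k)
    return (cx, cy), radius


# Attacker at the origin, target 3 units away, attacker twice as fast.
center, radius = apollonius_circle((0.0, 0.0), (3.0, 0.0), 2.0, 1.0)
# A point on the circle should satisfy the defining distance ratio.
p = (center[0] - radius, center[1])
ratio = math.dist(p, (3.0, 0.0)) / math.dist(p, (0.0, 0.0))
```

The circle shrinks toward the target as the speed ratio decreases, which matches the intuition that a slower target has fewer places it can reach first.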
IEEE Open Journal of Control Systems, vol. 4, pp. 418-432. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11173712
Citations: 0
Learning-Enabled Iterative Convex Optimization for Safety-Critical Model Predictive Control
Pub Date : 2025-09-19 DOI: 10.1109/OJCSYS.2025.3612245
Shuo Liu;Zhe Huang;Jun Zeng;Koushil Sreenath;Calin A. Belta
Safety remains a central challenge in control of dynamical systems, particularly when the boundaries of unsafe sets are complex (e.g., nonconvex, nonsmooth) or unknown. This paper proposes a learning-enabled framework for safety-critical Model Predictive Control (MPC) that integrates Discrete-Time High-Order Control Barrier Functions (DHOCBFs) with iterative convex optimization. Unlike existing methods that primarily address CBFs of relative degree one with fully known unsafe set boundaries, our approach generalizes to arbitrary relative degrees and addresses scenarios where only samples are available for the unsafe set boundaries. We extract pixels from unsafe set boundaries and train a neural network to approximate local linearizations. The learned models are incorporated into the linearized DHOCBF constraints at each time step within the MPC framework. An iterative convex optimization procedure is developed to accelerate computation while maintaining formal safety guarantees. The benefits of computational performance and safe avoidance of obstacles with diverse shapes are examined and confirmed through numerical results. By bridging model-based control with learning-based environment modeling, this framework advances safe autonomy for discrete-time systems operating in complex and partially known settings.
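A discrete-time high-order control barrier function is usually built by a recursion on a safety function h: psi_0 = h and psi_i(x_t) = psi_{i-1}(x_{t+1}) - psi_{i-1}(x_t) + gamma_i psi_{i-1}(x_t) with 0 < gamma_i <= 1, and safety is enforced by requiring the highest-order term to be nonnegative. The sketch below evaluates this recursion along a given sequence of h values. The recursion is the form assumed here from the general DHOCBF literature; the abstract does not state it, and the function name and numbers are illustrative.

```python
def dhocbf_sequence(h_values, gammas):
    """Return [psi_0, psi_1, ..., psi_m] evaluated along a trajectory.

    h_values[t] is h(x_t). Each higher-order sequence is one element shorter
    because psi_i(x_t) needs psi_{i-1} at both t and t + 1.
    """
    psis = [list(h_values)]
    for gamma in gammas:
        prev = psis[-1]
        psis.append([prev[t + 1] - prev[t] + gamma * prev[t]
                     for t in range(len(prev) - 1)])
    return psis


# h halves at every step, which exactly meets the condition for gamma = 0.5.
h_values = [1.0, 0.5, 0.25, 0.125]
psi0, psi1, psi2 = dhocbf_sequence(h_values, [0.5, 0.5])
satisfied = all(v >= 0.0 for v in psi1) and all(v >= 0.0 for v in psi2)
```

For a single order, psi_1(x_t) >= 0 is equivalent to h(x_{t+1}) >= (1 - gamma_1) h(x_t), so h may decay toward the boundary of the safe set at most geometrically and never crosses it.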
IEEE Open Journal of Control Systems, vol. 4, pp. 482-500. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11174009
Citations: 0
Compositional Shield Synthesis for Safe Reinforcement Learning in Partial Observability
Pub Date : 2025-09-18 DOI: 10.1109/OJCSYS.2025.3611725
Steven Carr;Georgios Bakirtzis;Ufuk Topcu
Agents controlled by the output of reinforcement learning (RL) algorithms often transition to unsafe states, particularly in uncertain and partially observable environments. Partially observable Markov decision processes (POMDPs) provide a natural setting for studying such scenarios with limited sensing. Shields filter undesirable actions to ensure safe RL by preserving safety requirements in the agents’ policy. However, synthesizing holistic shields is computationally expensive in complex deployment scenarios. We propose the compositional synthesis of shields by modeling safety requirements by parts, thereby improving scalability. In particular, problem formulations in the form of POMDPs using RL algorithms illustrate that an RL agent equipped with the resulting compositional shielding, beyond being safe, converges to higher values of expected reward. By using subproblem formulations, we preserve and improve the ability of shielded agents to require fewer training episodes than unshielded agents, especially in sparse-reward settings. Concretely, we find that compositional shield synthesis allows an RL agent to remain safe in environments two orders of magnitude larger than other state-of-the-art model-based approaches.
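The shielding idea in the abstract above can be sketched in a few lines of Python. This is only a toy under stated assumptions: a shield is modeled as a function from a fully observed grid state to the set of allowed actions, and sub-shields are composed by intersecting their allowed sets. The abstract does not specify how the authors compose shields or how partial observability is handled (shields for POMDPs typically act on beliefs or belief supports), so the types, the function names, and the intersection rule are all hypothetical.

```python
# Hypothetical example: action filtering with composed shields on a grid.
# Names and the intersection-based composition are illustrative assumptions.
from typing import Callable, FrozenSet, Iterable, Sequence, Tuple

State = Tuple[int, int]
Action = str
Shield = Callable[[State], FrozenSet[Action]]

MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}
ALL_ACTIONS: FrozenSet[Action] = frozenset(MOVES)


def make_avoid_shield(unsafe_cells: Iterable[State]) -> Shield:
    """Sub-shield for one safety requirement: never step into the given cells."""
    unsafe = frozenset(unsafe_cells)

    def shield(state: State) -> FrozenSet[Action]:
        x, y = state
        return frozenset(
            a for a, (dx, dy) in MOVES.items() if (x + dx, y + dy) not in unsafe
        )

    return shield


def compose(shields: Sequence[Shield]) -> Shield:
    """Combine sub-shields: an action is allowed only if every sub-shield allows it."""

    def composed(state: State) -> FrozenSet[Action]:
        allowed = ALL_ACTIONS
        for sub_shield in shields:
            allowed = allowed & sub_shield(state)
        return allowed

    return composed


def shielded_action(preferences: Sequence[Action], shield: Shield, state: State) -> Action:
    """Return the agent's most-preferred action that the shield permits."""
    allowed = shield(state)
    for action in preferences:
        if action in allowed:
            return action
    raise RuntimeError("shield blocks every action in this state")


if __name__ == "__main__":
    pit_shield = make_avoid_shield({(1, 0)})    # requirement 1: avoid the pit
    lava_shield = make_avoid_shield({(0, 1)})   # requirement 2: avoid the lava
    shield = compose([pit_shield, lava_shield])
    # The agent ranks actions by preference; the shield overrides unsafe ones.
    print(shielded_action(["right", "up", "left", "down"], shield, (0, 0)))
```

Each sub-shield is built from one part of the safety requirement, so no single object has to encode the full specification. That is the scalability argument the abstract makes, shown here only in miniature.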
IEEE open journal of control systems, vol. 4, pp. 373-384. Published 2025-09-18. Full text: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11172329
Citations: 0
Deception in Turret Defense Game: Information Limiting Strategy to Induce Dilemma
Pub Date : 2025-09-17 DOI: 10.1109/OJCSYS.2025.3611726
Daigo Shishika;Alexander Von Moll;Dipankar Maity;Michael Dorothy
Can deception exist in differential games? We provide a case study for a Turret-Attacker differential game, where two Attackers seek to score points by reaching a target region while a Turret tries to minimize the score by aligning itself with the Attackers before they reach the target. In contrast to the original problem solved with complete information, we assume that the Turret only has partial information about the maximum speed of the Attackers. We investigate whether there is any incentive for the Attackers to move slower than their maximum speed in order to “deceive” the Turret into taking suboptimal actions. We first describe the existence of a dilemma that the Turret may face. Then we derive a set of initial conditions from which the Attackers can force the Turret into a situation where it must take a guess.
IEEE open journal of control systems, vol. 4, pp. 385-399. Published 2025-09-17. Full text: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11169496
Citations: 0