Pub Date : 2025-08-21 DOI: 10.1109/OJCSYS.2025.3601435
Caio Fabio Oliveira da Silva;Azita Dabiri;Bart De Schutter
This work proposes an approach that integrates reinforcement learning (RL) and model predictive control (MPC) to solve finite-horizon optimal control problems in mixed-logical dynamical systems efficiently. Optimization-based control of such systems with discrete and continuous decision variables entails the online solution of mixed-integer linear programs, which suffer from the curse of dimensionality. In the proposed approach, by repeated interaction with a simulator of the system, a reinforcement learning agent is trained to provide a policy for the discrete decision variables. During online operation, the RL policy simplifies the online optimization problem of the MPC controller from a mixed-integer linear program to a linear program, significantly reducing the computation time. A fundamental contribution of this work is the definition of the decoupled Q-function, which plays a crucial role in making the learning problem tractable in a combinatorial action space. We motivate the use of recurrent neural networks to approximate the decoupled Q-function and show how they can be employed in a reinforcement learning setting. A microgrid system is used as an illustrative example where real-world data is used to demonstrate that the proposed method substantially reduces the maximum online computation time of MPC (up to $20\times$) while maintaining high feasibility and an average optimality gap lower than 1.1%.
{"title":"Integrating Reinforcement Learning and Model Predictive Control for Mixed-Logical Dynamical Systems","authors":"Caio Fabio Oliveira da Silva;Azita Dabiri;Bart De Schutter","doi":"10.1109/OJCSYS.2025.3601435","DOIUrl":"https://doi.org/10.1109/OJCSYS.2025.3601435","url":null,"PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"4 ","pages":"316-331"},"PeriodicalIF":0.0,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11134093","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145027930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-08-20 DOI: 10.1109/OJCSYS.2025.3600925
Xue-Fang Wang;Jingjing Jiang;Wen-Hua Chen
This paper presents a novel solution for optimal high-level decision-making in autonomous overtaking on two-lane roads, considering both opposite-direction and same-direction traffic. The proposed solution accounts for key factors such as safety and optimality, while also ensuring recursive feasibility and stability. To safely complete overtaking maneuvers, the solution is built on a constrained Markov decision process (MDP) that generates optimal decisions for path planners. By combining MDP with model predictive control (MPC), the approach guarantees recursive feasibility and stability through a baseline control policy that calculates the terminal cost and is incorporated into a constructed Lyapunov function. The proposed solution is validated through five simulated driving scenarios, demonstrating its robustness in handling diverse interactions within dynamic and complex traffic conditions.
{"title":"MDP-Based High-Level Decision-Making for Combining Safety and Optimality: Autonomous Overtaking","authors":"Xue-Fang Wang;Jingjing Jiang;Wen-Hua Chen","doi":"10.1109/OJCSYS.2025.3600925","DOIUrl":"https://doi.org/10.1109/OJCSYS.2025.3600925","url":null,"PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"4 ","pages":"299-315"},"PeriodicalIF":0.0,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11130904","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145021272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this article, we utilize the concept of average controllability in graphs, along with a novel rank encoding method, to enhance the performance of Graph Neural Networks (GNNs) in social network classification tasks. GNNs have proven highly effective in various network-based learning applications and require some form of node features to function. However, their performance is heavily influenced by the expressiveness of these features. In social networks, node features are often unavailable due to privacy constraints or the absence of inherent attributes, making it challenging for GNNs to achieve optimal performance. To address this limitation, we propose two strategies for constructing expressive node features. First, we introduce average controllability along with other centrality metrics (denoted as NCT-EFA) as node-level metrics that capture critical aspects of network topology. Building on this, we develop a rank encoding method that transforms average controllability—or any other graph-theoretic metric—into a fixed-dimensional feature space, thereby improving feature representation. We conduct extensive numerical evaluations using six benchmark GNN models across four social network datasets to compare different node feature construction methods. Our results demonstrate that incorporating average controllability into the feature space significantly improves GNN performance. Moreover, the proposed rank encoding method outperforms traditional one-hot degree encoding, improving the ROC AUC from 68.7% to 73.9% using GraphSAGE on the GitHub Stargazers dataset, underscoring its effectiveness in generating expressive and efficient node representations.
{"title":"Feature Construction Using Network Control Theory and Rank Encoding for Graph Machine Learning","authors":"Anwar Said;Yifan Wei;Obaid Ullah Ahmad;Mudassir Shabbir;Waseem Abbas;Xenofon Koutsoukos","doi":"10.1109/OJCSYS.2025.3599371","DOIUrl":"https://doi.org/10.1109/OJCSYS.2025.3599371","url":null,"PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"4 ","pages":"288-298"},"PeriodicalIF":0.0,"publicationDate":"2025-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11126872","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145021271","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
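The abstract above does not spell out how the rank encoding is built, so the following is only a rough sketch of one plausible reading: nodes are ranked by a graph-theoretic score (for example, average controllability), and each rank is mapped to a one-hot vector over a fixed number of buckets. The function name `rank_encode`, the bucket rule, and the example scores are all assumptions made for illustration and are not taken from the paper.

```python
def rank_encode(scores, dim):
    """Map each node's score to a one-hot vector over `dim` rank buckets.

    Hypothetical illustration: the bucketing rule here is an assumption,
    not the encoding defined in the paper.
    """
    n = len(scores)
    # Order node indices from highest to lowest score; ties keep index order.
    order = sorted(range(n), key=lambda i: -scores[i])
    features = [[0] * dim for _ in range(n)]
    for rank, node in enumerate(order):
        bucket = rank * dim // n  # always in 0..dim-1 because rank < n
        features[node][bucket] = 1
    return features


# Four nodes with made-up scores, encoded into a 2-dimensional feature space.
example = rank_encode([0.9, 0.1, 0.5, 0.7], dim=2)
print(example)
```

The feature dimension depends only on `dim`, not on the number of nodes or the maximum degree, which is the property the abstract contrasts with one-hot degree encoding.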
Pub Date : 2025-08-14 DOI: 10.1109/OJCSYS.2025.3599473
Eloy Garcia;David W. Casbeer
This paper analyzes the classic game of capture-the-flag, modeled as a conflict between an Attacker and a Defender. The game unfolds in distinct phases with changing objectives: first, the Attacker tries to capture a flag while the Defender attempts to intercept; second, if successful, the Attacker tries to reach a safe zone while the Defender again seeks interception. We mathematically derive the optimal state-feedback strategies for both players and the associated Value function for each phase, rigorously proving their correctness. A key contribution is introducing the transition phase, where we analyze the Defender’s optimal repositioning strategy when flag capture becomes inevitable, preparing it for the game’s second phase. This novel transition connects the game’s stages, critically enabling us to solve the overall Game of Kind – determining the winner from any starting condition – and define the precise circumstances under which the Attacker can both capture the flag and successfully escape to the safe zone.
{"title":"The Capture-the-Flag Differential Game: Attack, Transition and Retreat","authors":"Eloy Garcia;David W. Casbeer","doi":"10.1109/OJCSYS.2025.3599473","DOIUrl":"https://doi.org/10.1109/OJCSYS.2025.3599473","url":null,"PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"4 ","pages":"271-287"},"PeriodicalIF":0.0,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11125922","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144998218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-08-13 DOI: 10.1109/OJCSYS.2025.3598673
Mia Scoblic;Camilla Tabasso;Venanzio Cichella;Isaac Kaminer
Collision avoidance is a fundamental aspect of many applications involving autonomous vehicles. Solving this problem becomes especially challenging when the agents involved cannot communicate. In these scenarios, onboard sensors are essential for detecting and avoiding other vehicles or obstacles. However, in many practical applications, sensors have limited range and measurements may be intermittent due to external factors. With this in mind, in this work, we present a novel decentralized vision-based collision avoidance algorithm which does not require communication among the agents and has mild assumptions on the sensing capabilities of the vehicles. Once a collision is detected, the agents replan their trajectories to follow a circular path centered at the point of collision. A feedback control law is designed so that the vehicles can maintain a predefined phase shift along this circle and therefore are able to avoid collisions. A Lyapunov analysis is performed to provide performance bounds and the efficacy of the proposed method is demonstrated through both simulated and experimental results.
{"title":"Vision-Based Collision Avoidance for Multi-Agent Systems With Intermittent Measurements","authors":"Mia Scoblic;Camilla Tabasso;Venanzio Cichella;Isaac Kaminer","doi":"10.1109/OJCSYS.2025.3598673","DOIUrl":"https://doi.org/10.1109/OJCSYS.2025.3598673","url":null,"PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"4 ","pages":"349-359"},"PeriodicalIF":0.0,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11123838","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778185","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-08-13 DOI: 10.1109/OJCSYS.2025.3598626
Christopher I. Calle;Shaunak D. Bopardikar
In this work, we apply concentration-based results to the problem of sensor selection for state estimation to provide us with meaningful guarantees on the properties of our selection. We consider a selection of sensors that is randomly chosen with replacement for a stochastic linear dynamical system, and we utilize the Kalman filter to perform state estimation. Our main contributions are four-fold. First, we derive novel matrix concentration inequalities (CIs) for a sum of positive semi-definite random matrices. Second, we provide two algorithms for specifying the parameters required to apply our matrix CIs, a novel statistical tool. Third, we propose two avenues for improving the sample complexity of this statistical tool. Fourth, we provide a procedure for optimizing the semi-definite bounds of our matrix CIs. When our matrix CIs are applied to the problem of sensor selection for state estimation, our final contribution is a procedure for optimizing the filtered state estimation error covariance matrix of the Kalman filter. Finally, we show through simulations that our bounds significantly outperform those of an existing matrix CI and are applicable for a larger parameter regime. Also, we demonstrate the applicability of our matrix CIs for the state estimation of nonlinear dynamical systems.
{"title":"Generalized Concentration-Based Performance Guarantees on Sensor Selection for State Estimation","authors":"Christopher I. Calle;Shaunak D. Bopardikar","doi":"10.1109/OJCSYS.2025.3598626","DOIUrl":"https://doi.org/10.1109/OJCSYS.2025.3598626","url":null,"PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"4 ","pages":"250-270"},"PeriodicalIF":0.0,"publicationDate":"2025-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11123730","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144990018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-07-10 DOI: 10.1109/OJCSYS.2025.3587537
Aleena Thomas;Abhijith Ajayakumar;Raju K. George
In this paper, controllability and observability of a heterogeneous networked system with Linear Time Invariant (LTI) nodal systems having Multiple-Inputs and Multiple-Outputs (MIMO) aligned in a weighted and directed network topology are studied. Apart from the heterogeneity in nodal dynamics, the inner-coupling matrices that quantify the interactions among nodes are also different. In contrast to the existing literature, the system under consideration has distinct node dimensions, which adds to the generality. Necessary and sufficient conditions for controllability and observability as well as certain necessary conditions for controllability of a class of networked systems are established. These conditions show the dependence of network controllability and observability on various node and network-specific factors. As a practical application, a three-sector economy is modelled as a heterogeneous networked system with distinct node dimensions and its controllability is analysed. The computational cost of the proposed methods, measured in floating point operations (flops), is estimated, which indicates their efficiency in comparison with the classical conditions. This is illustrated by a computational comparison of the existing and proposed schemes, applied to a randomly generated networked system. Also, the robustness of the proposed schemes is analysed with the example of randomly generated networked systems. All the results are supported with illustrative numerical examples.
{"title":"Controllability and Observability of Heterogeneous Networked Systems With Non-Uniform Node Dimensions and Distinct Inner-Coupling Matrices","authors":"Aleena Thomas;Abhijith Ajayakumar;Raju K. George","doi":"10.1109/OJCSYS.2025.3587537","DOIUrl":"https://doi.org/10.1109/OJCSYS.2025.3587537","url":null,"PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"4 ","pages":"219-235"},"PeriodicalIF":0.0,"publicationDate":"2025-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11075535","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144831949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
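For context, the "classical conditions" that the abstract compares against presumably include the Kalman rank test applied to the full network state matrix. The sketch below shows that textbook test only, not the paper's proposed conditions: a pair (A, B) with n states is controllable when the matrix [B, AB, ..., A^(n-1)B] has rank n. The helper names and the two-state example are illustrative assumptions, and exact rational arithmetic is used so the rank is not affected by rounding.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rank(matrix):
    """Rank via exact Gaussian elimination on Fractions."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    n_cols = len(rows[0]) if rows else 0
    for c in range(n_cols):
        # Find a row at or below position r with a nonzero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r


def is_controllable(A, B):
    """Kalman test: rank [B, AB, ..., A^(n-1)B] == n."""
    n = len(A)
    blocks, current = [], B
    for _ in range(n):
        blocks.append(current)
        current = mat_mul(A, current)
    # Place the blocks side by side to form the controllability matrix.
    ctrb = [sum((blk[i] for blk in blocks), []) for i in range(n)]
    return rank(ctrb) == n


# Double integrator: controllable from the second state, not from the first.
A = [[0, 1], [0, 0]]
print(is_controllable(A, [[0], [1]]))
print(is_controllable(A, [[1], [0]]))
```

For a network of many nodes, the controllability matrix grows with the total state dimension, which is the cost that motivates node-level conditions such as those studied in the paper.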
Pub Date : 2025-07-02 DOI: 10.1109/OJCSYS.2025.3585427
L. van de Kamp;B. Hunnekens;T. Oomen;N. van de Wouw
Safe deployment of neural networks to classify time series in safety-critical applications relies on the ability of the classifier to detect data that does not originate from the same distribution as the training data. The aim of this paper is to propose a framework for detecting whether time-series data is sampled from a different distribution than the training data, known as the problem of out-of-distribution (OOD) detection. We propose a novel distance-based OOD method for time-series data using a hierarchical clustering method together with dynamic time-warping to measure the difference between a new data instance and the training set. The method is evaluated in the context of mechanical ventilation, a safety critical application, using both simulated and clinical datasets. Results of the mechanical ventilation use case demonstrate that the proposed approach effectively detects out-of-distribution data and improves classification performance in diverse settings.
{"title":"Time-Series Out-of-Distribution Data Detection in Mechanical Ventilation","authors":"L. van de Kamp;B. Hunnekens;T. Oomen;N. van de Wouw","doi":"10.1109/OJCSYS.2025.3585427","DOIUrl":"https://doi.org/10.1109/OJCSYS.2025.3585427","url":null,"PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"4 ","pages":"236-249"},"PeriodicalIF":0.0,"publicationDate":"2025-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11066264","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144914214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
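To make the distance-based idea in the abstract above concrete, here is a minimal sketch of out-of-distribution flagging with dynamic time warping (DTW). It is a simplification under stated assumptions: it compares a new series against every training series and thresholds the nearest DTW distance, and it omits the hierarchical clustering step the paper describes. The function names, the toy series, and the threshold value are hypothetical.

```python
from math import inf


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] is the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def is_out_of_distribution(sample, training_set, threshold):
    """Flag a sample whose nearest training series is farther than `threshold`.

    Simplified stand-in: the paper pairs DTW with hierarchical clustering,
    which this nearest-neighbour check does not reproduce.
    """
    nearest = min(dtw_distance(sample, ref) for ref in training_set)
    return nearest > threshold


# Toy data: two similar "training" waveforms of different lengths.
training = [[0, 1, 2, 1, 0], [0, 0, 1, 2, 1, 0]]
print(is_out_of_distribution([0, 1, 2, 2, 1, 0], training, threshold=1.0))
print(is_out_of_distribution([5, 5, 5, 5, 5], training, threshold=1.0))
```

DTW tolerates local stretching in time, so the first sample, which only holds the peak one step longer, stays close to the training set, while the flat series at a different level is flagged.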
The stochastic linear bandit problem has emerged as a fundamental building-block in machine learning and control, and a realistic model for many applications. By equipping this classical problem with safety constraints, the safe linear bandit problem further broadens its relevance to safety-critical applications. However, most existing algorithms for safe linear bandits only consider linear constraints, making them inadequate for many real-world applications, which often have non-linear constraints. To alleviate this limitation, we study the problem of safe linear bandits under general (non-linear) constraints. Under a novel constraint regularity condition that is weaker than convexity, we give two algorithms with $\tilde{\mathcal{O}}(d \sqrt{T})$ regret. We then give efficient implementations of these algorithms for several specific settings. Lastly, we give simulation results demonstrating the effectiveness of our algorithms in choosing dynamic pricing signals for a demand response problem under distribution power flow constraints.
{"title":"Optimistic Algorithms for Safe Linear Bandits Under General Constraints","authors":"Spencer Hutchinson;Arghavan Zibaie;Ramtin Pedarsani;Mahnoosh Alizadeh","doi":"10.1109/OJCSYS.2025.3558118","DOIUrl":"https://doi.org/10.1109/OJCSYS.2025.3558118","url":null,"PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"4 ","pages":"103-116"},"PeriodicalIF":0.0,"publicationDate":"2025-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10950393","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143896165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-03-30 DOI: 10.1109/OJCSYS.2025.3575305
Bo Chen;Baike She;Calvin Hawkins;Philip E. Paré;Matthew T. Hale
Reproduction numbers are widely used to analyze epidemic spreading processes over networks. However, conventional reproduction numbers of an overall network, which require spreading information from the entire network, do not indicate where an epidemic is spreading. To address this limitation, we first propose a novel class of local distributed reproduction numbers that capture spreading behaviors at the level of individual nodes. We demonstrate how to compute these values in a distributed way and use them to derive new threshold conditions for network spreading analysis. Because epidemic data are often collected at multiple geographic or administrative scales, we then define a class of cluster distributed reproduction numbers to describe the spread between groups of nodes such as communities, cities, or states. We further show that the local distributed reproduction numbers can be aggregated to form the cluster distributed reproduction numbers. Unlike conventional network-level reproduction numbers, these distributed measures reveal fine-grained interaction patterns that may raise privacy concerns by exposing the frequency or intensity of interactions across regions. To address this issue, we propose a privacy-enhanced distributed reproduction number framework that implements differential privacy. This framework enables scalable and privacy-preserving analysis of epidemic spreading processes in networked populations through the calculation of privacy-preserving distributed reproduction numbers. Numerical experiments show that the private distributed reproduction numbers maintain differential privacy while yielding accurate estimates of epidemic spread and offering more insight than conventional reproduction numbers.
{"title":"Scalable Distributed Reproduction Numbers of Network Epidemics With Differential Privacy","authors":"Bo Chen;Baike She;Calvin Hawkins;Philip E. Paré;Matthew T. Hale","doi":"10.1109/OJCSYS.2025.3575305","DOIUrl":"https://doi.org/10.1109/OJCSYS.2025.3575305","url":null,"abstract":"Reproduction numbers are widely used to analyze epidemic spreading processes over networks. However, conventional reproduction numbers of an overall network, which require spreading information from the entire network, do not indicate where an epidemic is spreading. To address this limitation, we first propose a novel class of local distributed reproduction numbers that capture spreading behaviors at the level of individual nodes. We demonstrate how to compute these values in a distributed way and use them to derive new threshold conditions for network spreading analysis. Because epidemic data are often collected at multiple geographic or administrative scales, we then define a class of cluster distributed reproduction numbers to describe the spread between groups of nodes such as communities, cities, or states. We further show that the local distributed reproduction numbers can be aggregated to form the cluster distributed reproduction numbers. Unlike conventional network-level reproduction numbers, these distributed measures reveal fine-grained interaction patterns that may raise privacy concerns by exposing the frequency or intensity of interactions across regions. To address this issue, we propose a privacy-enhanced distributed reproduction number framework that implements differential privacy. This framework enables scalable and privacy-preserving analysis of epidemic spreading processes in networked populations through the calculation of privacy-preserving distributed reproduction numbers. 
Numerical experiments show that the private distributed reproduction numbers maintain differential privacy while yielding accurate estimates of epidemic spread and offering more insight than conventional reproduction numbers.","PeriodicalId":73299,"journal":{"name":"IEEE open journal of control systems","volume":"4 ","pages":"199-218"},"PeriodicalIF":0.0,"publicationDate":"2025-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11018355","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144524373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}