IEEE Transactions on Cybernetics: Latest Articles

Secure Q-Learning of Fuzzy Markov Jump Systems Under Malicious Attacks: A Homotopic Scheme.
IF 11.8 | Zone 1, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-02 | DOI: 10.1109/tcyb.2025.3634171
Hao Shen,Zheng Huang,Jiacheng Wu,Jing Wang,Michael V Basin
This article proposes a novel reinforcement learning (RL)-based secure control policy for nonlinear Markov jump systems (MJSs) subject to false data injection attacks (FDIAs). First, the Takagi-Sugeno (T-S) fuzzy model is applied to describe the nonlinear MJS. A min-max strategy and an off-policy homotopic Q-learning (HQ) scheme are then introduced to design a secure control policy without requiring knowledge of the system dynamics. The proposed approach offers two main advantages: it does not require an initial stabilizing control gain, and it guarantees unbiased learning under persistently excited conditions. Furthermore, a rigorous stability analysis of the overall closed-loop system under FDIAs is presented. Finally, the effectiveness of the proposed approach is demonstrated using a tunnel diode circuit.
Citations: 0
Low-Complexity Double-Layered Iterative Learning Control for Nonlinear MIMO System Under Cyberattacks.
IF 11.8 | Zone 1, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-02 | DOI: 10.1109/tcyb.2025.3634747
Dong Liu,Yu-Kun Wang,Xin Wang,Wei-Wei Che,Zheng-Guang Wu
In this article, the double-layered iterative learning control (DLILC) approach is adopted to investigate the tracking control problem of repetitive nonlinear multiple-input-multiple-output (MIMO) systems under false data injection (FDI) attacks. Based on historical data, two control loops in the scheme are devised to improve tracking accuracy. More specifically, an outer loop adaptive set-point tuning mechanism is developed, which is independent of the inner-loop controller. Such a mechanism dynamically optimizes learning gains by leveraging historical data and significantly reduces reliance on preset system parameters. In the inner loop, a proportional-derivative controller is employed to form the feedback circuit. Furthermore, the double dynamic linearization technique is adopted to transform complex nonlinearities, coupling effects, and unknown uncertainties into a set of linearly estimable parameters. To address FDI attacks, an output observer-based real-time compensator is constructed, which is capable of promptly mitigating the impact of such attacks on system outputs. Simulation results demonstrate that the proposed scheme ensures high-precision tracking, substantially reduces computational burden, and exhibits superior resilience against attacks. The approach thus provides a new pathway toward secure and efficient iterative learning control of nonlinear systems.
Citations: 0
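To make the basic idea of iterative learning control in the abstract above concrete, here is a minimal sketch of a plain P-type ILC update on a hypothetical scalar plant. It is not the DLILC scheme itself: there is no outer-loop set-point tuning, no dynamic linearization, and no attack compensator. The plant `y(t+1) = a*y(t) + b*u(t)`, the learning gain, and the sine reference are all assumptions chosen only for illustration.

```python
import math

def run_trial(u, a=0.5, b=1.0):
    """Simulate one repetition of the assumed plant y(t+1) = a*y(t) + b*u(t), y(0) = 0."""
    y = [0.0]
    for u_t in u:
        y.append(a * y[-1] + b * u_t)
    return y

def p_type_ilc(reference, trials=30, gain=0.8):
    """P-type ILC: u_{k+1}(t) = u_k(t) + gain * e_k(t+1), repeated over trials."""
    horizon = len(reference) - 1
    u = [0.0] * horizon
    max_errors = []
    for _ in range(trials):
        y = run_trial(u)
        e = [reference[t] - y[t] for t in range(horizon + 1)]
        # Record the worst tracking error of this trial (t = 0 is fixed by y(0)).
        max_errors.append(max(abs(v) for v in e[1:]))
        # Learn from this trial: correct each input with the error it caused one step later.
        u = [u[t] + gain * e[t + 1] for t in range(horizon)]
    return u, max_errors

reference = [math.sin(0.3 * t) for t in range(21)]  # reference[0] = 0 matches y(0)
_, max_errors = p_type_ilc(reference)
print(max_errors[0], max_errors[-1])
```

With these assumed values the learning gain satisfies |1 - gain*b| < 1, which is the usual sufficient condition for the trial-to-trial error of this simple update to shrink.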
Data-Driven Distributed Kalman Filter-Based Sensor Fault Isolation and Estimation for Large-Scale Interconnected Systems.
IF 11.8 | Zone 1, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-02 | DOI: 10.1109/tcyb.2025.3635766
Shuyu Ding,Haoran Ma,Zhengen Zhao,Steven X Ding,Ying Yang
This article proposes a data-driven distributed Kalman filter (DKF)-based sensor fault isolation and estimation scheme for large-scale interconnected dynamic systems, composed of heterogeneous subsystems coupled through a directed topological graph. A local diagnosis unit (LDU) is established for each subsystem, where the data-driven DKF-based residual generator is constructed using local and neighboring process data, effectively decoupling the totally unknown interaction component. Subsequently, fully distributed sensor fault isolation is realized at the subsystem and element levels in simultaneous-fault cases. Both local and neighboring sensor fault isolation can be realized in the LDU, allowing the global system sensor fault isolation with only several key LDUs. Then, the data-driven DKF-based estimator is built in each LDU to estimate sensor faults occurring in multiple subsystems. The distributed Kalman gain is computed in a fully distributed manner, with stability analysis performed locally without overall system knowledge. Finally, the effectiveness and performance of the proposed scheme are validated through case studies on the power network system.
Citations: 0
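The residual-generation idea mentioned in the abstract above can be illustrated with a textbook scalar Kalman filter. This is a model-based, single-sensor sketch, not the data-driven distributed filter of the paper: the dynamics `x(t+1) = a*x(t) + w`, the noise variances, the bias fault of size 2.0, and the 5-sigma alarm threshold are all assumptions made for the example.

```python
import random

def kalman_residuals(measurements, a=0.9, q=0.0025, r=0.0025):
    """Scalar Kalman filter for x(t+1) = a*x(t) + w, y(t) = x(t) + v.

    Returns one (residual, innovation_std) pair per measurement.
    """
    x_hat, p = 0.0, 0.0  # assumed: the initial state is known exactly
    out = []
    for y in measurements:
        # Predict one step ahead.
        x_pred = a * x_hat
        p_pred = a * p * a + q
        # Residual (innovation) and its variance under fault-free operation.
        residual = y - x_pred
        s = p_pred + r
        out.append((residual, s ** 0.5))
        # Measurement update.
        k = p_pred / s
        x_hat = x_pred + k * residual
        p = (1.0 - k) * p_pred
    return out

def simulate(n=120, fault_start=60, fault_size=2.0, a=0.9, seed=0):
    """Generate measurements with an additive sensor bias from fault_start onward."""
    rng = random.Random(seed)
    x, ys = 0.0, []
    for t in range(n):
        x = a * x + rng.gauss(0.0, 0.05)
        bias = fault_size if t >= fault_start else 0.0
        ys.append(x + rng.gauss(0.0, 0.05) + bias)
    return ys

ys = simulate()
alarms = [t for t, (res, std) in enumerate(kalman_residuals(ys))
          if abs(res) > 5.0 * std]
print(alarms[:3])
```

A residual that leaves its fault-free confidence band signals a sensor fault; isolating which sensor or subsystem is faulty, as the paper does, needs additional structure that this sketch does not attempt.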
Adaptive Neural Network Iterative Learning PI Control of Fractional-Order Nonlinear Systems Using Generalized Barrier Lyapunov Function.
IF 11.8 | Zone 1, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-02 | DOI: 10.1109/tcyb.2025.3634145
Xingyue Yang,Xiulan Zhang,Jinde Cao,Heng Liu
Existing barrier Lyapunov function (BLF) designs require the specified function to be smooth and convex, a precondition that is relatively harsh for most models. In this article, built on proportional-integral (PI) theory, an adaptive neural network (ANN) PI iterative learning tracking control method for fractional-order nonlinear systems (FONSs) with full-state constraints is presented. A new type of BLF is constructed that only requires the derivative of the function to be monotonic under the fractional Lyapunov direct method. To reduce computational complexity and data volume, the designed PI-based backstepping controller consists of a series of constant gains and dynamic variables with basic linkage relationships. Moreover, it incorporates an iterative learning algorithm that can achieve continuous or discontinuous self-learning and updating. The results indicate that all closed-loop signals of the FONSs are semi-globally uniformly ultimately bounded and that the constraints are not violated. Theoretical analysis and numerical simulation support the validity of the proposed method.
Citations: 0
Generative AI Empower Addiction-Related Brain Circuits Detection via Graph Diffusion-Infused Adversarial Learning.
IF 11.8 | Zone 1, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-02 | DOI: 10.1109/tcyb.2025.3634735
Changhong Jing,Baiying Lei,Shanshan Wang,Yan Liu,Feng Liu,C L Philip Chen,Shuqiang Wang
The study of the nicotine addiction mechanism is of great significance in both nicotine withdrawal and brain science. The detection of addiction-related brain circuitry using functional magnetic resonance imaging (fMRI) is a critical step in studying this mechanism. However, it is challenging to accurately estimate addiction-related brain circuitry due to the low signal-to-noise ratio of fMRI and the issue of small sample size. In this work, a graph diffusion-infused adversarial learning (GDAL) network is proposed to capture addiction-related brain circuitry accurately. The GDAL combines the graph convolution method with the diffusion model so that the model can fully capture addiction-related brain circuitry in non-Euclidean space. The diffusion reconstruction module (DRM) is designed to reconstruct the brain network to maintain the consistency of sample distribution in the latent space so that the brain circuitry can be detected more accurately. The proposed model reduces the search space by improving the conditional guidance of the DRM so that the model can better understand the latent distribution for the issue of small sample size. The experimental results demonstrate the effectiveness of the proposed method.
Citations: 0
Guiding Multiagent Multitask Reinforcement Learning by a Hierarchical Framework With Logical Reward Shaping.
IF 10.5 | Zone 1, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-02 | DOI: 10.1109/TCYB.2025.3631239
Chanjuan Liu, Jinmiao Cong, Bingcai Chen, Yaochu Jin, Enqiang Zhu

Multiagent hierarchical reinforcement learning (MAHRL) has been studied as an effective means to solve intelligent decision problems in complex and large-scale environments. However, most current MAHRL algorithms follow the traditional way of using reward functions in reinforcement learning (RL), which limits their use to a single task. This study aims to design a multiagent cooperative algorithm with logic reward shaping (LRS), which uses a more flexible way of setting the rewards, allowing for the effective completion of multitasks. LRS uses linear-time temporal logic (LTL) to express the internal logic relation of subtasks within a complex task. Then, it evaluates whether the subformulas of the LTL expressions are satisfied based on a designed reward structure. This helps agents to learn to effectively complete tasks by adhering to the LTL expressions, thus enhancing the interpretability and credibility of their decisions. To enhance coordination and cooperation among multiple agents, a value iteration technique is designed to evaluate the actions taken by each agent. Based on this evaluation, a reward function is shaped for coordination, which enables each agent to evaluate its status and complete the remaining subtasks through experiential learning. Experiments have been conducted on various types of tasks in the Minecraft World and Office World. The results demonstrate that the proposed algorithm can improve the performance of multiagents when learning to complete multitasks.

Citations: 0
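As a rough illustration of rewarding progress through the subformulas of a temporal-logic task, as described in the abstract above, the sketch below tracks a purely sequential task of the form "eventually g1, and after that eventually g2". It is a single-agent toy, not the LRS algorithm of the paper; the subgoal names `wood` and `workbench` and the reward values 0.1 and 1.0 are hypothetical.

```python
def make_sequence_task(subgoals):
    """Build a progress tracker for visiting the given subgoals in order."""
    def step(progress, labels):
        """Advance by at most one subgoal; return (new_progress, shaped_reward)."""
        if progress < len(subgoals) and subgoals[progress] in labels:
            progress += 1
            # Small bonus for each satisfied subformula, full reward on completion.
            reward = 1.0 if progress == len(subgoals) else 0.1
            return progress, reward
        return progress, 0.0
    return step

step = make_sequence_task(["wood", "workbench"])
# Labels observed at each time step; the early workbench visit earns nothing
# because the first subgoal has not been satisfied yet.
trace = [set(), {"workbench"}, {"wood"}, set(), {"workbench"}]
progress, total = 0, 0.0
for labels in trace:
    progress, reward = step(progress, labels)
    total += reward
print(progress, total)
```

The integer `progress` plays the role of a task-automaton state, so the same environment can serve different tasks by swapping the subgoal list instead of rewriting the reward function.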
Data-Driven Model-Free Adaptive Dynamic Programming Resilient Control for Nonlinear Networked Control Systems Under DoS Attacks.
IF 10.5 | Zone 1, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-01 | DOI: 10.1109/TCYB.2025.3594793 | Pages: 5700-5713
Mei Zhong, Jiancheng Zhang, Gang Zheng, Heng Liu

Enhancing system security under denial-of-service (DoS) attacks requires robust compensation mechanisms. However, existing compensation solutions based on model-free adaptive control are limited to constant reference signals and neglect control optimization, resulting in insufficient tracking performance under dynamic attacks. This study develops a data-driven adaptive dynamic programming (ADP) resilient control scheme for networked control systems under aperiodic DoS attacks. An ADP method with a modified performance index is proposed to derive a globally optimal controller, while a dynamic penalty factor is introduced to accelerate error convergence. Leveraging ADP technology and the latest available control increments, a compensation mechanism for time-varying reference signals is designed to reduce performance degradation. Finally, theoretical proofs ensure error convergence, and comparative simulations verify the strategy's superiority.

Citations: 0
Toward Foundational Model for Sleep Analysis Using a Multimodal Hybrid-Self-Supervised Learning Framework.
IF 10.5 | Zone 1, Computer Science | Q1 AUTOMATION & CONTROL SYSTEMS | Pub Date: 2025-12-01 | DOI: 10.1109/TCYB.2025.3603608 | Pages: 5619-5632
Cheol-Hui Lee, Hakseung Kim, Byung Chul Yoon, Dong-Joo Kim

Sleep is essential for maintaining human health and quality of life. Analyzing physiological signals during sleep is critical in assessing sleep quality and diagnosing sleep disorders. However, manual diagnoses by clinicians are time-intensive and subjective. Despite advances in deep learning that have enhanced automation, these approaches remain heavily dependent on large-scale labeled datasets. This study introduces SynthSleepNet, a multimodal hybrid-self-supervised learning (SSL) framework designed for analyzing polysomnography (PSG) data. SynthSleepNet effectively integrates masked prediction and contrastive learning to leverage complementary features across multiple modalities, including electroencephalogram (EEG), electrooculography (EOG), electromyography (EMG), and electrocardiogram (ECG). This approach enables the model to learn highly expressive representations of PSG data. Furthermore, a temporal context module based on Mamba was developed to efficiently capture contextual information across signals. SynthSleepNet achieved superior performance compared to state-of-the-art methods across three downstream tasks: sleep-stage classification, apnea detection, and hypopnea detection, with accuracies of 89.89%, 99.75%, and 89.60%, respectively. The model demonstrated robust performance in a semi-SSL environment with limited labels, achieving accuracies of 87.98%, 99.37%, and 77.52% in the same tasks. These results underscore the potential of the model as a foundational tool for the comprehensive analysis of PSG data. SynthSleepNet outperforms other methodologies across multiple downstream tasks and is expected to set a new standard for sleep disorder monitoring and diagnostic systems. The source code is available at https://github.com/dlcjfgmlnasa/SynthSleepNet.

Citations: 0
An Improved Jump Model for Two-Dimensional Markov Jump Roesser Systems and Its H∞ Control.
IF 10.5 Zone 1 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-12-01 DOI: 10.1109/TCYB.2025.3592848
Yue-Yue Tao, Zheng-Guang Wu, Gang Feng

In this study, an improved jump model is proposed for the Roesser-type 2-D Markov jump systems (MJSs). We use two independent Markov chains that propagate along the horizontal and vertical directions, respectively, to characterize the switching of system dynamics in those two directions. Compared with the conventional jump model, which uses only one Markov chain to characterize the switching of system dynamics in both directions, the newly proposed 2-D jump model shows better modeling capabilities for real-world applications with abrupt changes while inherently avoiding the mode ambiguity phenomenon. Based on the proposed jump model, we then propose a dual-mode-dependent state feedback control law to stabilize the concerned 2-D MJS. A sufficient criterion, whose feasibility is enhanced via a dual-mode-dependent Lyapunov functional technique, is obtained to ensure the asymptotic mean square stability and $H_\infty$ disturbance attenuation level of the resulting closed-loop system. Subsequently, resorting to a novel nonconservative separation principle, two equivalent conditions with one of them in the form of linear matrix inequalities (LMIs) are developed. Finally, a convex optimization algorithm which is formulated by the obtained LMIs is proposed to design the control law. An example of the Darboux equation with Markov switching parameters is presented to validate the effectiveness of the obtained results.

Citations: 0
2025 Index IEEE Transactions on Cybernetics
IF 10.5 Zone 1 Computer Science Q1 AUTOMATION & CONTROL SYSTEMS Pub Date: 2025-11-27 DOI: 10.1109/TCYB.2025.3638081
{"title":"2025 Index IEEE Transactions on Cybernetics","authors":"","doi":"10.1109/TCYB.2025.3638081","DOIUrl":"10.1109/TCYB.2025.3638081","url":null,"abstract":"","PeriodicalId":13112,"journal":{"name":"IEEE Transactions on Cybernetics","volume":"55 12","pages":"6013-6130"},"PeriodicalIF":10.5,"publicationDate":"2025-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11270972","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145611025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0