IEEE Transactions on Systems Man Cybernetics-Systems: Latest Articles, Page 7

Observer-Based Adaptive Prescribed-Time Asymptotic Tracking Control for Flexible-Joint Manipulators

IF 8.7, Zone 1 Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems

Pub Date : 2025-09-30 DOI: 10.1109/TSMC.2025.3612585

Yu Gao;Wei Sun;Ning Sun

This study addresses adaptive prescribed-time tracking control for n-link flexible-joint (FJ) manipulators with unmeasurable state variables. First, auxiliary signals are constructed from measurable variables. Based on these auxiliary signals, the observer is designed directly to estimate the system states, which allows the observer dynamics to incorporate unknown terms. Owing to the uniqueness of the designed observer, the observation errors converge to zero. Furthermore, to enhance control efficiency, a prescribed-time scale function is introduced into the controllers, and the unknown terms are handled with a fuzzy logic system (FLS), so that the tracking error of the FJ manipulators converges to a specified range within the prescribed time. Meanwhile, with the help of positive integrable time-varying functions, asymptotic tracking is further achieved. Throughout the control design, tuning functions are adopted to avoid overparameterization. In theory, it is ensured that all signals in the closed-loop system are bounded, and that the tracking error converges to a small neighborhood of zero within the specified time and then asymptotically converges to zero. Finally, a simulation example confirms the feasibility of the control design.
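The abstract does not state the prescribed-time scale function itself. As a minimal sketch, assuming a polynomial form that is common in the prescribed-time literature (not taken from this paper), a bound that shrinks from an initial width to a final width exactly at a user-chosen time could look like the following. The names `prescribed_time_bound`, `T`, `rho0`, `rho_inf`, and `k` are illustrative assumptions.

```python
def prescribed_time_bound(t, T=2.0, rho0=1.0, rho_inf=0.05, k=2):
    """Hypothetical error bound (assumed form, not the paper's function).

    Decays from rho0 at t = 0 to rho_inf at the prescribed time T,
    then stays at rho_inf for all t >= T.
    """
    if t >= T:
        return rho_inf
    # Polynomial decay; k controls how quickly the bound tightens.
    return (rho0 - rho_inf) * ((T - t) / T) ** k + rho_inf


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 1.5, 2.0, 3.0):
        print(f"t={t:.1f}  bound={prescribed_time_bound(t):.4f}")
```

The settling time `T` is a design parameter here and does not depend on the initial error, which is the property the term "prescribed time" refers to.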

Volume 55, Issue 12, pp. 9165-9174

Citations: 0

A New Encryption Algorithm for Secure Aviation Communications

IF 8.7, Zone 1 Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems

Pub Date : 2025-09-30 DOI: 10.1109/TSMC.2025.3613057

Moatsum Alawida

As an integral application within the aerospace and electronic engineering systems, the aviation system relies heavily on secure communication protocols to safeguard sensitive information exchanged during travel and operations. However, many existing communication systems lack adequate security, leaving them vulnerable to adversarial attacks and data exfiltration. While effective in many security protocols, classical encryption algorithms are susceptible to various attacks, including statistical, differential, and side-channel attacks. To address these vulnerabilities, a new cryptographic encryption algorithm is proposed based on the extended Feistel network structure and binary tree structure. This algorithm enhances security measures by segmenting plaintext into blocks of 1024 bits, further divided into left and right halves across four distinct levels. Encryption begins at Level 64, employing a summation-based algorithm and XOR operations to ensure diffusion and confusion properties. The parallel implementation of encryption enhances processing speed while deriving subkeys from a sensitive secret key enhances security against single-bit changes. Experimental assessments and security analyses demonstrate the robustness of the proposed cipher against various attacks. The proposed cipher offers high-quality encryption capabilities, making it an ideal candidate for inclusion in the secure communication protocols of aviation systems.
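For readers unfamiliar with the Feistel structure the cipher builds on, the following is a minimal sketch of a classical balanced Feistel network. It is not the proposed algorithm: the block size (64 bits instead of 1024), the round function `round_fn`, and the subkey list are all illustrative assumptions, and the toy round function offers no real security. It only shows how a modular sum and XOR can be combined so that decryption reuses the same round function with the subkeys reversed.

```python
MASK32 = 0xFFFFFFFF


def round_fn(half, subkey):
    """Toy round function (assumption): modular sum, 5-bit rotation, XOR."""
    s = (half + subkey) & MASK32
    rotated = ((s << 5) | (s >> 27)) & MASK32
    return rotated ^ subkey


def feistel_encrypt(block, subkeys):
    """Encrypt a 64-bit integer block with one Feistel round per subkey."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for k in subkeys:
        left, right = right, left ^ round_fn(right, k)
    return (left << 32) | right


def feistel_decrypt(block, subkeys):
    """Invert feistel_encrypt by undoing the rounds in reverse order."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for k in reversed(subkeys):
        left, right = right ^ round_fn(left, k), left
    return (left << 32) | right


if __name__ == "__main__":
    keys = [0x1234, 0xABCDEF, 0xDEADBEEF, 0x42]  # hypothetical subkeys
    plaintext = 0x0123456789ABCDEF
    ciphertext = feistel_encrypt(plaintext, keys)
    print(hex(ciphertext), feistel_decrypt(ciphertext, keys) == plaintext)
```

The round function never needs to be invertible, which is why Feistel designs are a convenient base for new ciphers.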

Volume 55, Issue 12, pp. 9186-9200

Citations: 0

Learning-Based Prescribed-Time Fuzzy Optimal Quantized Control for Large-Scale Systems With Bridge-Hole Constraint

IF 8.7, Zone 1 Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems

Pub Date : 2025-09-29 DOI: 10.1109/TSMC.2025.3612400

Shiyu Xie;Wei Sun;Yuqiang Wu

This study presents an advanced adaptive fuzzy optimal bridge-hole constraint control method for large-scale interconnected systems under quantized input. To address the conflict in constraint ranges caused by the combined effect of both results in the bridge-hole and performance constraints, a new prescribed time function with parameter requirements is proposed, which bridges the balance between them and keeps the tracking error within a desired zone in a prescribed time. Meanwhile, output constraint is realized by building a new bridge-hole constraint function, which ensures the time interval for the constraint behavior to occur by the flexible setting of the switching time. Unlike traditional optimal control schemes, the designed optimal controller is further quantized by a hysteresis quantizer, which minimizes energy cost and saves bandwidth. Besides, a reinforcement learning (RL) scheme based on an actor–critic-identifier fuzzy logic system (FLS) structure is designed; its overall control idea is to optimize the entire backstepping control system by using all virtual and actual backstepping control as the optimal solution of their respective subsystems. Finally, the effectiveness of the proposed scheme is confirmed by simulation experiments.

Volume 55, Issue 12, pp. 9097-9108

Citations: 0

A Deep Multiobject Detection Model for Passenger Escalator Safety

IF 8.7, Zone 1 Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems

Pub Date : 2025-09-29 DOI: 10.1109/TSMC.2025.3612901

Yo-Ping Huang;Satchidanand Kshetrimayum;Haobijam Basanta;Frode Eika Sandnes

Accidents involving escalators in mass rapid transit (MRT) systems pose a serious risk to public safety, often resulting from clothing or footwear getting caught, or large items toppling during movement. Despite the availability of passive warnings, such as signage and audio announcements, these methods often go unnoticed by commuters and lack the ability to adapt to real-time risks. Existing computer vision solutions are either too computationally intensive for deployment on edge devices or lack sufficient accuracy for practical use. To address these challenges, this study proposes a real-time, lightweight object detection system using a pruned YOLOv7-Tiny model, optimized for deployment on the NVIDIA Jetson Nano edge computing platform. The system is designed to identify safety-critical items, such as general footwear, high heels, long skirts, suitcases, strollers, and shopping trolleys, in real-time. Upon detection, it issues visual and auditory alerts, and in cases involving large items, sends email notifications to station personnel. Model pruning significantly reduces computational overhead while maintaining high accuracy. Experimental results demonstrate that the system achieves a mean average precision (mAP) of 94.69%, outperforming conventional detection models while maintaining real-time performance. These results highlight the system’s potential for enhancing passenger safety and operational efficiency in resource-constrained public transit environments.

Volume 55, Issue 12, pp. 9109-9119

Citations: 0

Constrained Sampling-Based MPC Using Path Integral for Collision-Free Robot Manipulation

IF 8.7, Zone 1 Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems

Pub Date : 2025-09-26 DOI: 10.1109/TSMC.2025.3611922

Xingfang Wang;Hui Li;Dong Wang;Xiao Huang;Zhihong Jiang

The dynamic and unknown human behaviors in human–robot interaction make it challenging for collision-free robot manipulation. Although sampling-based model predictive control (MPC) has achieved real-time control in the above scenarios, it is hard to handle equality hard constraints, such as working along a specified trajectory, due to sampling disturbances. To improve manipulation performance under multiple constraints, this article presents a novel constrained sampling-based MPC (CSMPC) method using path integral. First, hierarchical optimization combining policy sampling projection and the Lagrange multiplier method is used to handle equality hard constraints for high-precision manipulation tasks. Second, collision avoidance and smooth motion are modeled as inequality soft constraints, where collision detection and time series prediction are used to ensure the safety and smoothness of dynamic interaction. Finally, an adaptive noise method is built to improve the stability of physical robot manipulation. The simulation and experiment results demonstrate that the proposed method enables a 7-DOF robot manipulator to achieve precise manipulation while avoiding dynamic obstacles.
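As background for the path-integral part, the sketch below shows the standard cost-weighted update used by sampling-based path-integral controllers: sampled control perturbations are averaged with weights proportional to the exponential of their negative rollout cost. It does not include the projection, Lagrange-multiplier, or adaptive-noise steps described in the abstract, and the names `path_integral_update` and `temperature` are assumptions made for illustration.

```python
import math


def path_integral_update(nominal, perturbations, costs, temperature=1.0):
    """Return the nominal control sequence plus a cost-weighted perturbation.

    nominal:       list of controls over the horizon
    perturbations: one sampled perturbation sequence per rollout
    costs:         total cost of each rollout (lower is better)
    """
    best = min(costs)
    # Subtracting the best cost keeps the exponentials numerically stable.
    weights = [math.exp(-(c - best) / temperature) for c in costs]
    total = sum(weights)
    weights = [w / total for w in weights]
    return [
        nominal[t] + sum(w * eps[t] for w, eps in zip(weights, perturbations))
        for t in range(len(nominal))
    ]


if __name__ == "__main__":
    nominal = [0.0, 0.0]
    samples = [[1.0, 1.0], [-1.0, -1.0]]
    print(path_integral_update(nominal, samples, costs=[0.5, 3.0]))
```

Because the update is a random average, it generally does not satisfy an equality constraint exactly, which is the difficulty with hard constraints that the abstract points to.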

Volume 55, Issue 11, pp. 8701-8714

Citations: 0

Graph-Based Dual-Agent Deep Reinforcement Learning for Dynamic Human–Machine Hybrid Reconfiguration Manufacturing Scheduling

IF 8.7, Zone 1 Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems

Pub Date : 2025-09-26 DOI: 10.1109/TSMC.2025.3612300

Yuxin Li;Qihao Liu;Chunjiang Zhang;Xinyu Li;Liang Gao

Human–machine hybrid reconfiguration manufacturing is an emerging paradigm in the field of precision equipment production and can greatly improve the production capability of the workshop. However, numerous complex constraints and a dynamic environment make reasonable scheduling very difficult. To this end, this article studies the dynamic human–machine hybrid reconfiguration manufacturing scheduling problem (DHMRSP) and proposes a novel deep reinforcement learning (DRL) scheduling method. Specifically, a dual-agent Markov decision process (MDP) is established, which can handle seven complex constraints and three disturbance events. Then, a heterogeneous competition graph attention network (HCGAN) is designed, where the meta-path-based subgraph conversion reflects the resource-operation competition, and three modules use node-level attention and semantic-level attention to realize important information embedding. Afterward, a dual proximal policy optimization (PPO) algorithm with HCGAN and mixed action space (HM-DPPO) is proposed, where the allocation agent and reconfiguration agent achieve collaborative learning by taking joint action and sharing graph embeddings and reward. Experimental results prove that the proposed approach outperforms rules, genetic programming (GP), and three DRL methods on different instances and can effectively handle various disturbance events.

Volume 55, Issue 11, pp. 8729-8741

Citations: 0

Global Regulation of Time-Varying Stochastic Nonlinear Systems via Output Feedback and Its Application in One-Link Manipulator

IF 8.7, Zone 1 Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems

Pub Date : 2025-09-26 DOI: 10.1109/TSMC.2025.3611915

Xian-Long Yin;Zong-Yao Sun;Changyun Wen;Chih-Chiang Chen

This study focuses on addressing the challenge of global output feedback control problem for a class of time-varying stochastic nonlinear systems subject to multiple uncertainties. The primary challenge concerns how to construct time-varying functions to counteract the effects of unmeasurable error coming from system output as well as the persistently increasing nonlinearities. By employing a full-order state observer and the dual gain approach, we design an output feedback regulator over the entire time domain to guarantee the existence and uniqueness of the closed-loop system’s solution and the almost sure asymptotic convergence of the state. This methodology achieves both the domination of the unknown growth rate and the unified system design, irrespective of sensor sensitivity. Finally, practical and numerical simulation examples demonstrate the feasibility of the presented approach.

Citations: 0

Polynomial-Based Gain-Scheduling Mechanism of Fuzzy Markov Jump System With Incomplete Transition Probability Information With Experimental Validation

IF 8.7, Zone 1 Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems

Pub Date : 2025-09-24 DOI: 10.1109/TSMC.2025.3611824

Xingchen Shao;Lipo Mo;Xiangpeng Xie

This article investigates the stabilization problem of fuzzy Markov jump systems (F-MJSs) with incomplete transition probability (TP) information. Existing methods for handling partially unknown TPs often introduce excessive conservatism using scaling techniques that may violate the fundamental stochastic constraints. To address this issue, we propose a novel polynomial-based gain-scheduling control framework that integrates a polytopic probability reconstruction strategy. This strategy rigorously preserves the stochastic completeness of TP matrices (TPMs) while reducing conservatism in controller design. By leveraging homogeneous polynomial theory, we further establish a codesign methodology for both polynomial Lyapunov functions and fuzzy controllers, significantly expanding the feasible solution space. Theoretical analysis demonstrates that the proposed method achieves substantially reduced conservatism compared with conventional aggregated approximation approaches. Numerical simulations reveal the improvement compared with classical aggregated treatment approaches. Hardware-in-the-loop (HIL) experiments on active suspension systems validate the effectiveness and robustness of the designed control strategy; in particular, $\gamma_{\min}$ is reduced by 87.5%.
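The "stochastic completeness" constraint means every row of a TP matrix is nonnegative and sums to one. As a rough illustration of why a polytopic description fits partially unknown rows, the sketch below lists the vertices of the set of valid rows when some entries are unknown: all remaining probability mass is assigned to one unknown entry at a time. This is an assumption-laden toy, not the paper's reconstruction strategy; `row_vertices` and the use of `None` for unknown entries are hypothetical conventions.

```python
def row_vertices(row):
    """List the vertex rows of a partially known transition-probability row.

    Known entries are floats; unknown entries are None. Every returned row
    is nonnegative and sums to 1, so any convex combination is also valid.
    """
    unknown = [j for j, p in enumerate(row) if p is None]
    if not unknown:
        return [list(row)]
    remaining = 1.0 - sum(p for p in row if p is not None)
    vertices = []
    for j in unknown:
        vertex = [0.0 if p is None else p for p in row]
        vertex[j] = remaining  # all unassigned mass goes to entry j
        vertices.append(vertex)
    return vertices


if __name__ == "__main__":
    for v in row_vertices([0.25, None, None]):
        print(v, sum(v))
```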

Volume 55, Issue 11, pp. 8742-8754

Citations: 0

Event-Triggered Control for Linear Positive Discrete-Time Singular Systems With Time Delay

IF 8.7, Zone 1 Computer Science, Q1 AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems

Pub Date : 2025-09-24 DOI: 10.1109/TSMC.2025.3608211

Nguyen Huu Sau;Mai Viet Thuan

This study addresses the stabilization of discrete-time singular systems with delays by employing an event-triggered control (ETC) method. In particular, we propose an innovative triggering mechanism that compares the measurement-error coordinates with the system state coordinates, thereby preserving system positivity even under time delays. A novel algorithm is introduced, which relies on comparing the coordinates of measurement errors and system states to maintain the positivity of the system. This article establishes sufficient conditions to ensure that the closed-loop system remains regular, causal, positive, and exponentially stable, building upon this newly formulated triggering approach and leveraging advanced matrix properties such as nonnegative matrices. To illustrate the efficacy and nontrivial nature of these conditions, we provide an algorithmic diagram and a diverse set of examples, including both simulations and a practical case study. The ETC mechanism, characterized by the sequence of event occurrences, demonstrates substantial nontrivial properties. These conditions are easily verifiable using MATLAB tools. This article also includes a range of examples, featuring both numerical simulations and a practical case study, to validate the effectiveness of the theoretical findings.
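The abstract describes the trigger only as a coordinate-wise comparison of measurement errors with system states. A minimal sketch of such a rule, under the assumption that an event fires when any error coordinate exceeds a fixed fraction `eta` of the matching state coordinate, is given below. The threshold form, the scalar decay model in `simulate`, and all names are illustrative and not taken from the paper; the toy model only shows how the event sequence arises and does not feed the transmitted state back into the dynamics.

```python
def should_trigger(last_sent, current, eta=0.2):
    """True when any |last_sent[i] - current[i]| exceeds eta * current[i].

    Assumes a positive system, so every current[i] is nonnegative.
    """
    return any(abs(s - x) > eta * x for s, x in zip(last_sent, current))


def simulate(x0, steps, a=0.9, eta=0.2):
    """Toy scalar system x[k+1] = a * x[k]; returns the event time steps."""
    x, sent, events = x0, x0, [0]  # the initial state is always transmitted
    for k in range(1, steps + 1):
        x = a * x
        if should_trigger([sent], [x], eta):
            sent = x
            events.append(k)
    return events


if __name__ == "__main__":
    print(simulate(1.0, 6))
```

With these defaults the state is transmitted on every second step rather than every step, which is the communication saving an event-triggered scheme aims for.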

Volume 55, Issue 11, pp. 8625-8637

Citations: 0

Contrastive-Learning-Based Decision Making for Dynamic Time-Linkage Optimization

IF 8.7 Region 1 (Computer Science) Q1 AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Systems Man Cybernetics-Systems

Pub Date : 2025-09-24 DOI: 10.1109/TSMC.2025.3611797

Xiao-Fang Liu;Meng Gao;Yongchun Fang;Zhi-Hui Zhan;Jun Zhang

In dynamic time-linkage optimization, current decisions influence the future state of the environment. To make decisions that benefit future states, existing methods usually build a model that predicts the future rewards of solutions. However, these prediction models have low accuracy because the available decision data are insufficient to train such a complex model. To address this issue, this article proposes a contrastive-learning-based decision making (CLDM) method, which builds a contrastive model that learns the relative relationship between solutions rather than their absolute rewards and adopts a quick decision strategy to select solutions. In CLDM, a clustering-based time-linkage detection (CD) strategy measures the intensity of the time linkage, which determines whether decisions should take future rewards into account. To represent the relative relationship between solutions, a large number of contrastive samples are constructed from the limited historical decisions. A contrastive model is then trained to compare solutions by the combination of current fitness and future rewards. Candidate solutions are clustered into multiple groups to filter out poor ones, and the few that remain are ranked using the contrastive model. The winner is taken as the decision solution. Integrating CLDM into particle swarm optimization (PSO) yields a new algorithm named contrastive-learning-based PSO (CL-PSO). Experimental results on multiple dynamic time-linkage optimization instances demonstrate that CL-PSO outperforms state-of-the-art algorithms in terms of solution quality. CL-PSO also performs well on the mobile robot path planning problem.
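
The step of turning a few historical decisions into many contrastive samples, and then ranking candidates by pairwise comparison, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not specify the contrastive model, so a plain logistic regression on feature differences is swapped in here, the clustering filter is omitted, and every function name and data value is hypothetical.

```python
import itertools
import numpy as np

def build_contrastive_pairs(solutions, scores):
    """Turn n scored decisions into n*(n-1) ordered pairs.
    Each sample is the difference of two solutions; the label is 1.0
    when the first solution of the pair has the higher score."""
    diffs, labels = [], []
    for i, j in itertools.permutations(range(len(solutions)), 2):
        diffs.append(solutions[i] - solutions[j])
        labels.append(1.0 if scores[i] > scores[j] else 0.0)
    return np.array(diffs), np.array(labels)

def train_pairwise_model(diffs, labels, lr=0.5, epochs=500):
    """Assumed stand-in model: logistic regression on difference vectors,
    fitted by plain gradient descent (no bias term)."""
    w = np.zeros(diffs.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(diffs @ w)))
        w -= lr * diffs.T @ (p - labels) / len(labels)
    return w

def pick_winner(candidates, w):
    """Round-robin comparison: index of the candidate with the most wins."""
    wins = np.zeros(len(candidates), dtype=int)
    for i, j in itertools.permutations(range(len(candidates)), 2):
        if (candidates[i] - candidates[j]) @ w > 0:
            wins[i] += 1
    return int(np.argmax(wins))

# Hypothetical history: 8 past decisions whose score (standing in for
# current fitness plus future reward) is a hidden linear function.
rng = np.random.default_rng(0)
history = rng.normal(size=(8, 2))
hidden_scores = history @ np.array([2.0, -1.0])

diffs, labels = build_contrastive_pairs(history, hidden_scores)
w = train_pairwise_model(diffs, labels)
candidates = np.array([[0.0, 0.0], [2.0, -1.0], [-2.0, 1.0]])
print(diffs.shape, pick_winner(candidates, w))
```

Eight decisions give 56 ordered pairs, which shows how pairing enlarges a small data set. With a linear stand-in model the round-robin ranking is equivalent to sorting by a single score, so the pairwise loop matters only when a nonlinear comparison model is used.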

Volume 55, Issue 11, pp. 8661-8674

Citations: 0