The topside separation system is an important device installed on offshore oil exploration platforms for the treatment of produced water. Because it operates in high-moisture, salt-laden environments, the system is susceptible to valve malfunctions. Additionally, strong couplings and slugging disturbances in the system further complicate the development of fault-tolerant control (FTC). To address these challenges, this article investigates the fault-tolerant $H_{\infty}$ control problem in the topside separation system. To recover control performance against actuator faults while reducing disturbance sensitivity, the fault-tolerant $H_{\infty}$ control problem is formulated for the topside separation system and expressed as a two-player differential game. A Nash equilibrium solution to the fault-tolerant $H_{\infty}$ control problem is derived by solving the game algebraic Riccati equation (GARE). Considering the tailor-made nature of such systems and the difficulty of full-state sensing in industry, an output-feedback reinforcement learning (RL) algorithm is proposed to implement the fault-tolerant $H_{\infty}$ control method without requiring knowledge of the system dynamics. Simulation studies are performed to verify the effectiveness of the proposed algorithm.
{"title":"Fault-Tolerant H ∞ Control for Topside Separation Systems via Output-Feedback Reinforcement Learning","authors":"Yuguang Zhang;Xiaoyuan Luo;Shaobao Li;Juan Wang;Zhenyu Yang;Xinping Guan","doi":"10.1109/TSMC.2024.3523904","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3523904","url":null,"PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2795-2805"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
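As a point of reference for the GARE mentioned in the entry above, the following sketch gives the textbook form of the $H_{\infty}$ zero-sum differential game for a linear system. The model $\dot{x} = Ax + Bu + Dw$, the weights $Q$ and $R$, and the attenuation level $\gamma$ are assumptions chosen for illustration; the abstract does not state the article's fault model or its exact equation.

```latex
\begin{align}
J(u,w) &= \int_0^\infty \left( x^\top Q x + u^\top R u - \gamma^2 w^\top w \right) dt, \\
0 &= A^\top P + P A + Q - P B R^{-1} B^\top P + \gamma^{-2} P D D^\top P, \\
u^\ast &= -R^{-1} B^\top P x, \qquad w^\ast = \gamma^{-2} D^\top P x .
\end{align}
```

In this generic setting the control $u$ minimizes $J$ while the disturbance $w$ maximizes it, and the pair $(u^\ast, w^\ast)$ is the saddle-point (Nash) solution whenever a stabilizing $P \ge 0$ exists.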
In recent years, pretrained language-image models (PLIMs) have delivered advances in video captioning. However, existing PLIMs primarily focus on extracting global feature representations from still images and text sequences, while neglecting fine-grained semantic alignment and temporal variations between vision and text pairs. To this end, we propose a global-local alignment module and a temporal parsing module to reflect the detailed correspondence and temporal perception between the two modalities, respectively. In particular, the global-local alignment module enables cross-modal registration at two levels, i.e., the sentence-video level and the word-frame level, to obtain mixed-granularity semantic video features. The temporal parsing module is a dedicated self-attention structure that highlights temporal order cues along video frames, compensating for the limited temporal capacity of PLIMs. In addition, an adaptive two-stage gating structure is designed to leverage the linguistic predictions further. The linguistic information derived from the first stage prediction is dynamically routed through an adaptive decision gate, allowing for quality assessment of whether the information should proceed to the second stage. This structure can effectively reduce the computational burden for easy samples and further improve the accuracy of the prediction results. The experimental results obtained on several benchmark datasets demonstrate the effectiveness of the proposed solution, with improved performance compared to state-of-the-art methods.
{"title":"ATMNet: Adaptive Two-Stage Modular Network for Accurate Video Captioning","authors":"Tianyang Xu;Yunjie Zhang;Xiaoning Song;Zheng-Hua Feng;Xiao-Jun Wu","doi":"10.1109/TSMC.2024.3524682","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3524682","url":null,"PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2821-2833"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667487","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-22 DOI: 10.1109/TSMC.2024.3523705
Li Wang;Huaicheng Yan;Xiao Hu;Zhichen Li;Meng Wang
This article investigates secure bipartite containment control with an event-triggered mechanism (ETM) for nonlinear heterogeneous multiagent systems (MASs) via feedback control in fixed time. A novel attack detection algorithm and an updated label strategy are designed to determine whether denial-of-service (DoS) attacks have occurred. This design can search for new multiple directed spanning trees for MASs with multiple leaders to reduce the negative influence of attacks. In this case, a dynamic ETM is adopted that adjusts the triggering threshold online to reduce resource consumption. State-feedback control or output-feedback control is selected according to whether the state of the follower is available. The settling time is further relaxed by fixed-time stability theory and can be preset in advance. Through theoretical analysis, conditions for achieving bipartite containment control are derived using Lyapunov functions. Finally, the effectiveness of the established control method is verified by simulations.
{"title":"Fixed-Time Bipartite Containment Control for Heterogeneous Multiagent Systems Under DoS Attacks: An Event-Triggered Mechanism","authors":"Li Wang;Huaicheng Yan;Xiao Hu;Zhichen Li;Meng Wang","doi":"10.1109/TSMC.2024.3523705","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3523705","url":null,"PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2782-2794"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667579","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
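The statement in the entry above that the settling time can be preset rests on fixed-time stability theory. A commonly used lemma of this kind is sketched below; the Lyapunov function $V$, the gains $\alpha, \beta$, and the exponents $p, q$ are generic symbols assumed for illustration and are not necessarily the conditions derived in the article.

```latex
\dot{V}(x) \le -\alpha V^{p}(x) - \beta V^{q}(x), \quad \alpha,\beta > 0,\ 0 < p < 1 < q
\;\Longrightarrow\;
T(x_0) \le T_{\max} = \frac{1}{\alpha(1-p)} + \frac{1}{\beta(q-1)} .
```

The bound $T_{\max}$ does not depend on the initial state $x_0$: the $V^{q}$ term drives $V$ down to 1 within $1/(\beta(q-1))$, and the $V^{p}$ term then drives it to zero within $1/(\alpha(1-p))$. This independence is what lets the settling time be fixed through the choice of gains.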
Pub Date: 2025-01-22 DOI: 10.1109/TSMC.2024.3525296
Jie Su;Yongduan Song
Achieving full-state regulation within a prescribed time for uncertain nonlinear systems under any initial condition is challenging yet highly desirable, and most existing prescribed-time control results are contingent upon infinite feedback gain at the equilibrium. This article presents a new prescribed-time control design method that ensures prescribed-time stability with bounded feedback gain and bounded control action throughout system operation, circumventing the infinite feedback gain problem. Because a nonscaling-based method with structural adaptation is utilized, the proposed control scheme regulates all the states to zero well before the prescribed time, even in the presence of time-varying and mismatched structural uncertainties, while substantially reducing the numerical computational complexity induced by scaling-based methods. The theoretical results are supported by two numerical simulations.
{"title":"Prescribed-Time Control With Bounded Feedback Gain: A Nonscaling and Structural Adaptation-Based Approach","authors":"Jie Su;Yongduan Song","doi":"10.1109/TSMC.2024.3525296","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3525296","url":null,"PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2580-2589"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Online social networks (OSNs) facilitate the rapid and extensive spreading of rumors. While most existing methods for debunking rumors consider a solitary debunker, they overlook that rumor-mongering and debunking are interdependent and confrontational behaviors. In reality, a debunker must consider the impact of rumor-mongering behavior when making decisions. Moreover, a single rumor-debunking strategy is ineffective in addressing the complexity of the rumor environment in networks. Therefore, this article proposes a hybrid rumor-debunking approach that combines truth dissemination and regulatory measures based on the differential game theory under adversarial behaviors of rumor-mongering and debunking. Toward this end, we first establish a rumor propagation model using node-based modeling techniques that can be applied to any network structure. Next, we mathematically describe and analyze the processes of rumor-mongering and debunking. Finally, we validate the theoretical results of the proposed method through various comparative experiments, including comparisons with a random strategy, a uniform strategy, and single strategy models on real-world datasets collected from Facebook, Twitter, and YouTube. Furthermore, we harness two actual rumor events to estimate parameters and predict rumor propagation, thereby affirming the veracity and effectiveness of our rumor propagation model.
{"title":"Hybrid Rumor Debunking in Online Social Networks: A Differential Game Approach","authors":"Chenquan Gan;Wei Yang;Qingyi Zhu;Meng Li;Deepak Kumar Jain;Vitomir Štruc;Da-Wen Huang","doi":"10.1109/TSMC.2025.3526734","DOIUrl":"https://doi.org/10.1109/TSMC.2025.3526734","url":null,"PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2513-2527"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10849987","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this article, a novel visionary policy iteration (VPI) framework is proposed to address continuous-action reinforcement learning (RL) tasks. In VPI, a visionary Q-function is constructed by incorporating the successor state into the standard Q-function. Because it includes the successor state, the proposed visionary Q-function captures information about state transitions within the Markov decision process (MDP), thereby providing a forward-looking perspective that enables a more accurate and foresighted evaluation of potential action outcomes. The relationship between the visionary Q-function and the standard Q-function is analyzed. Subsequently, both the policy evaluation and policy improvement rules in VPI are designed based on the proposed visionary Q-function. A convergence proof for VPI is provided, ensuring that the iterative policy sequence in VPI converges to the optimal policy. By combining the VPI framework with the twin delayed deep deterministic policy gradient (TD3) algorithm, a visionary TD3 (VTD3) algorithm is developed. VTD3 is evaluated on multiple continuous-action control tasks from the MuJoCo and OpenAI Gym platforms. The results of comparative experiments demonstrate that VTD3 can achieve more competitive performance than other state-of-the-art (SOTA) RL approaches. Additionally, the experimental results indicate that VPI enhances decision-making capability, reduces Q-function estimation bias, and improves sample efficiency, thereby boosting the performance of existing RL algorithms.
{"title":"Visionary Policy Iteration for Continuous Control","authors":"Botao Dong;Longyang Huang;Xiwen Ma;Hongtian Chen;Weidong Zhang","doi":"10.1109/TSMC.2025.3525473","DOIUrl":"https://doi.org/10.1109/TSMC.2025.3525473","url":null,"PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2707-2720"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143654926","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Amid the global push for sustainable development, rising market demands have necessitated a multiregional, multiobjective, and flexible production model. Against this backdrop, this article investigates the multiobjective distributed flow shop group scheduling problem by formulating a mathematical model and introducing an advanced memetic algorithm integrated with reinforcement learning (RLMA). The RLMA involves a novel cooperative crossover operation in conjunction with the nature of the coupled problems to extensively explore the solution space. Additionally, the Sarsa algorithm enhanced with eligibility traces guides the selection of optimal schemes during the local enhancement phase. To ensure a balance between convergence and diversity, a solution selection strategy based on penalty-based boundary intersection decomposition is utilized. Furthermore, the increasing-efficiency and reducing-consumption strategies integrating a rapid evaluation mechanism are designed by dynamically changing the machine speed to balance economic and sustainability metrics. Comprehensive numerical experiments and comparative analyses demonstrate that the proposed RLMA surpasses existing state-of-the-art algorithms in addressing this complex problem.
{"title":"Reinforcement Learning-Assisted Memetic Algorithm for Sustainability-Oriented Multiobjective Distributed Flow Shop Group Scheduling","authors":"Yuhang Wang;Yuyan Han;Yuting Wang;Xianpeng Wang;Yiping Liu;Kaizhou Gao","doi":"10.1109/TSMC.2024.3518625","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3518625","url":null,"PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2399-2413"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
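The entry above mentions a solution selection strategy based on penalty-based boundary intersection (PBI) decomposition. The sketch below shows the standard PBI scalarization for a minimization problem as it is usually defined in decomposition-based multiobjective optimization. The function name `pbi`, the default penalty `theta=5.0`, and the sample inputs are assumptions for illustration; the abstract does not describe how the article embeds PBI in its selection step.

```python
import math


def pbi(objectives, weight, ideal, theta=5.0):
    """Penalty-based boundary intersection value (lower is better).

    objectives: objective vector of a candidate solution (minimization)
    weight:     weight (reference direction) vector of the subproblem
    ideal:      ideal point, i.e. best value seen for each objective
    theta:      penalty factor balancing convergence (d1) and diversity (d2)
    """
    # Offset of the solution from the ideal point.
    diff = [f - z for f, z in zip(objectives, ideal)]
    norm_w = math.sqrt(sum(w * w for w in weight))
    # d1: length of the projection of diff onto the weight direction.
    d1 = abs(sum(d * w for d, w in zip(diff, weight))) / norm_w
    # d2: perpendicular distance from diff to the weight direction.
    proj = [d1 * w / norm_w for w in weight]
    d2 = math.sqrt(sum((d - p) ** 2 for d, p in zip(diff, proj)))
    return d1 + theta * d2


if __name__ == "__main__":
    # A point lying on the weight direction is not penalized (d2 = 0).
    print(pbi((2.0, 2.0), (1.0, 1.0), (0.0, 0.0)))
    # A point off the direction pays theta times its perpendicular distance.
    print(pbi((3.0, 4.0), (1.0, 0.0), (0.0, 0.0)))
```

A larger `theta` favors solutions close to their reference direction (diversity), while a smaller one favors solutions close to the ideal point (convergence), which is the balance the abstract refers to.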
Pub Date: 2025-01-22 DOI: 10.1109/TSMC.2024.3524795
Peng Wang;Yingxin Fu;Peide Liu
The graph model for conflict resolution (GMCR) is an effective tool for resolving conflicts: it determines the feasible states by modeling the conflict and then analyzes the behavior of decision-makers (DMs) through stability analysis to find a resolution. This article studies composite DMs (CDMs) and the heterogeneous behaviors of opponents in GMCR. Based on the social relationships between DMs, a social network is applied to analyze the individuals in CDMs and to identify the types of heterogeneous behaviors of DMs. By combining the social network with an aggregation operator, this article unifies the preferences of individuals in a CDM. Subsequently, an identification mechanism is designed to determine the type of the opponents’ heterogeneous behaviors. Then, the mixed stabilities are extended to hesitant fuzzy mixed general meta-rationality (HFMGMR) and hesitant fuzzy mixed symmetric meta-rationality (HFMSMR). Matrix representations of the two stabilities are developed to analyze the equilibria of conflicts. Finally, a conflict over pollution rectification in industrial enterprises is analyzed to demonstrate how social networks can be applied to GMCR with CDMs and heterogeneous opponents. The hesitant fuzzy mixed stability analysis reveals the influence of heterogeneous behaviors in GMCR and shows that different types of DM behavior lead to different equilibrium results.
{"title":"Graph Model for Conflict Resolution Considering Heterogeneous Behavior Based on Hesitant Fuzzy Preference and Social Network Analysis","authors":"Peng Wang;Yingxin Fu;Peide Liu","doi":"10.1109/TSMC.2024.3524795","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3524795","url":null,"PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2644-2658"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-22 DOI: 10.1109/TSMC.2024.3525038
Tao Dong;Rui He;Huaqing Li;Wenjie Hu;Tingwen Huang
Phase-change memory (PCM) is a novel type of nonvolatile memory and offers low power consumption, high integration, and significant plasticity, making it suitable for neural synapses. In this article, we investigate the global exponential stabilization (GES) of phase-change inertial neural networks (PCINNs) with discrete and distributed time-varying delays. Initially, a piecewise equation is established to model the electrical conductivity of PCM. Based on this, we use PCM to simulate neural synapses, and a class of PCINNs with discrete and distributed time-varying delays is formulated. A continuous state feedback controller is designed to obtain the $\rho$th ($\rho \ge 1$) moment GES conditions of PCINNs in the Filippov sense by using differential inclusion theory, comparison strategies, and inequality techniques. Additionally, the global exponential stability conditions of phase-change Hopfield neural networks are obtained, expressed in the form of an M-matrix. Finally, three simulation examples are provided to verify the effectiveness of the theoretical results.
{"title":"Exponential Stabilization of Phase-Change Inertial Neural Networks With Time-Varying Delays","authors":"Tao Dong;Rui He;Huaqing Li;Wenjie Hu;Tingwen Huang","doi":"10.1109/TSMC.2024.3525038","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3525038","url":null,"PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2659-2669"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143654929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-22 DOI: 10.1109/TSMC.2025.3526357
Jingwei Lu;Qinglai Wei;Fei-Yue Wang
This article utilizes parallel control to investigate the problem of continuous-time (CT) nonzero-sum games (NZSGs) for completely unknown nonlinear systems via reinforcement learning (RL), and a parallel control-based NZSG (PNZSG) method is developed without reconstructing unknown dynamics or employing off-policy integral RL (IRL). First, novel dynamic control policies (DCPs) are developed for NZSGs by introducing controls into feedback, and an augmented system with augmented performance indices is constructed to derive the DCPs. Then, we theoretically analyze the effect of the DCPs on the control stability and performance indices, and the optimality of PNZSG is proven to be equivalent to the optimality of the original NZSGs. Subsequently, an IRL technique is employed to achieve the developed PNZSG method, and we show that no prior knowledge of the dynamics of NZSGs is needed to deploy the developed PNZSG method because of the augmented system and performance indices. Finally, numerical examples, including cooperative adaptive cruise control (CACC) of a vehicular platoon, demonstrate the correctness of the developed PNZSG method. The associated code is available at: https://github.com/lujingweihh/Adaptive-dynamic-programming-algorithms/tree/main/model_free_nonzero_sum_games.
{"title":"Parallel Control for Nonzero-Sum Games With Completely Unknown Nonlinear Dynamics via Reinforcement Learning","authors":"Jingwei Lu;Qinglai Wei;Fei-Yue Wang","doi":"10.1109/TSMC.2025.3526357","DOIUrl":"https://doi.org/10.1109/TSMC.2025.3526357","url":null,"PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2884-2896"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143667507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}