This article investigates a novel policy-adjustable Q-learning (PA-QL) algorithm aimed at addressing the optimal tracking control (OTC) problem for nonlinear discrete-time (DT) systems with enhanced adaptability and flexibility. A novel iteration scheme is developed that integrates the control weights into the augmented neural network (NN) input, thereby reformulating the learning process to explicitly characterize the optimal policy as a function of the adjustable weights. Consequently, the learned control policy is not constrained by predetermined weights, allowing for dynamic adjustment after offline training is completed. Moreover, such adjustments can be performed online seamlessly, offering substantially greater flexibility in adapting to changes in operating conditions or control objectives. Finally, the effectiveness of the proposed algorithm is established through rigorous theoretical analysis and further validated by simulation studies.
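The core mechanism described here, feeding the adjustable cost weights into the critic alongside the state, can be illustrated with a minimal numerical sketch. All dimensions, the one-hidden-layer critic, and the quadratic stage cost below are illustrative assumptions, not the article's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def q_network(aug_input, W1, W2):
    # One-hidden-layer critic: tanh hidden units, linear output.
    return np.tanh(aug_input @ W1) @ W2

def augmented_input(x, ref, u, weights):
    # Key idea of PA-QL as the abstract describes it: the cost weights
    # are part of the network input, so one trained critic covers a
    # whole family of cost functions.
    return np.concatenate([x, ref, u, weights])

# Hypothetical dimensions: 2-D state, 2-D reference, 1-D control, and
# 2 adjustable weights (state-error weight q, control weight r).
dim_in, dim_h = 2 + 2 + 1 + 2, 16
W1 = rng.normal(scale=0.1, size=(dim_in, dim_h))
W2 = rng.normal(scale=0.1, size=(dim_h, 1))

x, ref, u = np.array([0.5, -0.2]), np.array([1.0, 0.0]), np.array([0.1])
for weights in (np.array([1.0, 0.1]), np.array([1.0, 10.0])):
    q, r = weights
    stage_cost = q * np.sum((x - ref) ** 2) + r * np.sum(u ** 2)
    gamma = 0.95
    # Bellman-style target for the tracking Q-function; the
    # continuation value is stubbed with the same critic evaluated at
    # the current augmented input, purely for brevity.
    target = stage_cost + gamma * q_network(
        augmented_input(x, ref, u, weights), W1, W2)[0]
    print(weights, float(target))
```

Changing `weights` after training requires no retraining in this scheme; only the critic's input changes, which is the adjustability the abstract emphasizes.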
Title: Policy-Adjustable Q-Learning for Data-Driven Nonlinear Optimal Tracking Control
Authors: Jiaoyuan Chen, Dawei Gong, Yuyang Zhao, Shijie Song, Minglei Zhu
DOI: 10.1109/tnnls.2026.3672136
Journal: IEEE Transactions on Neural Networks and Learning Systems (IF 10.4)
Pub Date: 2026-03-16
Pub Date: 2026-03-16 | DOI: 10.1109/tnnls.2026.3670269
Qifen Yang,Yuhui Deng,Jiande Huang,Lijuan Lu,Peng Zhou,Geyong Min
Ensuring model fairness by preventing potential biases based on any sensitive attribute is crucial for the societal acceptance of artificial intelligence in critical applications. Among various fairness concepts, counterfactual fairness has gained prominence as it is grounded in causal inference. This concept requires that an individual's prediction in the original world remains consistent with that in the counterfactual world where the sensitive feature value is modified. In this article, we aim to mitigate counterfactual biases of the model through causal intervention. Specifically, we first achieve effective causal intervention and counterfactual generation by proposing the causal inference tabular generative adversarial network (CITGAN) architecture. Unlike prior approaches based on variational autoencoders (VAEs) that inherently lack structural causal model (SCM) fidelity due to simultaneous generation, CITGAN strictly enforces causal consistency via an end-to-end topological generation process. By integrating exogenous variable inference with sequential generation, CITGAN ensures that functional dependencies are structurally preserved. Building on CITGAN, we propose the CIRCUS framework, a causal intervention-based framework designed to intuitively enhance the counterfactual fairness of trained classifiers. CIRCUS generates counterfactually discriminatory samples (CDSs) via causal intervention, guided by gradients and feature contributions, and subsequently applies bias correction preprocessing to their labels for classifier retraining. Experimental results demonstrate that CIRCUS effectively enhances counterfactual fairness while maintaining robust classification performance. Specifically, for the deep neural network (DNN) model, the $\text{MMD}_{\text{L}}$ and $\text{MMD}_{\text{K}}$ values are reduced by averages of 39.7% and 40.4%, respectively, compared with the second-best result. For the residual network (ResNet) model, these reductions amount to 56.7% and 54.5%, respectively.
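As a rough illustration of the MMD metrics reported above, the sketch below computes linear- and RBF-kernel maximum mean discrepancy estimates between model predictions on factual and counterfactual inputs. The sample data, kernel bandwidth, and estimator form are invented for illustration and are not the article's exact evaluation protocol:

```python
import numpy as np

def mmd_linear(p, q):
    # Linear-kernel MMD between two 1-D prediction samples reduces to
    # the absolute difference of sample means.
    return float(np.abs(p.mean() - q.mean()))

def mmd_rbf(p, q, sigma=1.0):
    # Biased RBF-kernel MMD^2 estimate.
    def k(a, b):
        return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * sigma ** 2))
    return float(k(p, p).mean() + k(q, q).mean() - 2 * k(p, q).mean())

rng = np.random.default_rng(1)
# Hypothetical model outputs on factual vs. counterfactual inputs
# (sensitive attribute flipped); a counterfactually fair model drives
# both metrics toward zero.
pred_factual = rng.normal(0.6, 0.1, 500)
pred_counterfactual = rng.normal(0.4, 0.1, 500)
print(mmd_linear(pred_factual, pred_counterfactual))
print(mmd_rbf(pred_factual, pred_counterfactual))
```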
Title: CIRCUS: A Causal Intervention-Based Framework for Enhancing Counterfactual Fairness in Trained Classifiers
Pub Date: 2026-03-13 | DOI: 10.1109/tnnls.2026.3672118
Junpei Yang,Zhan Li,Weibing Li
Time-varying quadratic programming (TVQP) problems pose a challenging issue in a wide range of engineering applications, frequently incorporating equality, inequality, and bound constraints. By integrating a nonlinear complementarity problem (NCP) formulation, zeroing neural networks (ZNNs) can be generalized to solve TVQP problems effectively with such a full set of constraints. However, three issues may limit this approach's computational efficiency and practical applicability: 1) the inherent formulation of NCP functions for handling equality/inequality and boundary constraints substantially expands the dimensions of the coefficient matrices/vectors and solution-space variables; 2) the conventional ZNN (CZNN) framework inevitably necessitates matrix inversion operations for real-time solutions; and 3) insufficient robustness against noise interference compromises solution accuracy in practical implementations. To overcome these issues and improve solution performance, this article develops an enhanced lower dimension NCP low-computational-complexity (LCC) ZNN (ELNCP-LCCZNN) model for solving TVQP problems with time-varying equality, inequality, and variable boundary constraints. An ELNCP function is designed to reduce the matrix/vector coefficients of the model, and the LCCZNN model is utilized to construct a new dynamic model that eliminates the need for matrix inversion during solution. Furthermore, a nonlinear activation function is incorporated to guarantee predefined-time convergence and strengthen robustness against noise. The theoretical properties of the proposed ELNCP-LCCZNN model are validated through numerical simulations and experiments on robotic manipulator kinematic control. The results corroborate the analysis and demonstrate improved computational efficiency, enhanced noise robustness, and practical implementability compared with existing approaches.
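For context, the conventional ZNN design that the article improves upon can be sketched as follows for a time-varying linear system A(t)x = b(t): with error e = Ax - b and design rule de/dt = -gamma*e, the state update requires exactly the matrix solve that the proposed LCCZNN model is said to avoid. The specific A(t), b(t), gain, and Euler discretization are illustrative choices:

```python
import numpy as np

def cznn_step(x, t, dt, gamma=10.0):
    # Conventional ZNN for A(t) x = b(t): from de/dt = -gamma * e one
    # obtains A dx/dt = -dA/dt x + db/dt - gamma e, solved here with a
    # linear solve per step (the inversion burden the abstract notes).
    A = np.array([[2 + np.sin(t), 0.2], [0.2, 2 + np.cos(t)]])
    dA = np.array([[np.cos(t), 0.0], [0.0, -np.sin(t)]])
    b = np.array([np.sin(t), np.cos(t)])
    db = np.array([np.cos(t), -np.sin(t)])
    e = A @ x - b
    dx = np.linalg.solve(A, -dA @ x + db - gamma * e)
    return x + dt * dx, np.linalg.norm(e)

x, dt = np.zeros(2), 1e-3
for k in range(5000):
    x, err = cznn_step(x, k * dt, dt)
print(err)  # residual norm after 5 s of simulated time
```

The exponential error decay enforced by the design rule is what predefined-time activation functions, as in the article, replace with convergence in a user-specified time.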
Title: An Enhanced Low-Computational-Complexity Predefined-Time Convergent Zeroing Neural Network for Constrained Time-Varying Quadratic Programming With Kinematic Control of Robotic Manipulator
Pub Date: 2026-03-13 | DOI: 10.1109/tnnls.2026.3670947
Qinglong Chen,Fei Zhu
Most existing adversarial training methods in reinforcement learning (RL) offer limited robustness and remain vulnerable to novel attacks. To address this limitation, we propose robust RL via leveraging historically optimal policy with regulation of performance (HORP), an approach that enhances policy robustness by using the historically optimal policy to guide policy optimization and by generating diverse adversarial perturbations. Unlike other approaches that rely solely on trial-and-error interactions, HORP constructs a guidance value function by simultaneously considering value gaps and policy distribution divergence, thereby focusing prioritized learning on promising action spaces. It also incorporates an adaptive performance-aware optimization mechanism that triggers timely corrections, preventing the agent from deviating from optimal performance. Furthermore, HORP dynamically modulates perturbation entropy through controlled uncertainty injection, thereby improving the agent's generalized defensive capabilities. Experiments demonstrate that HORP achieves superior performance in most cases regarding both natural performance and robustness against various state attacks.
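One plausible reading of the guidance value function described above, combining a value gap with a policy-distribution divergence, can be sketched as follows. The functional form, the choice of KL divergence, and all numbers are assumptions for illustration, not HORP's actual definition:

```python
import numpy as np

def guidance_value(q_current, q_best, pi_current, pi_best, beta=1.0):
    # Sketch of a guidance signal in the spirit of the abstract:
    # combine the value gap to the historically best policy with the
    # divergence between current and best action distributions.
    value_gap = q_best - q_current
    kl = np.sum(pi_best * np.log((pi_best + 1e-12) / (pi_current + 1e-12)))
    return value_gap + beta * kl

# Hypothetical 3-action distributions and Q-value estimates.
pi_best = np.array([0.7, 0.2, 0.1])
pi_current = np.array([0.3, 0.4, 0.3])
print(guidance_value(q_current=1.2, q_best=1.5,
                     pi_current=pi_current, pi_best=pi_best))
```

A signal of this shape is large when the current policy both underperforms and disagrees with the historical optimum, which matches the abstract's stated intent of steering learning toward promising action spaces.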
Title: Robust Reinforcement Learning via Leveraging Historically Optimal Policy With Regulation of Performance
Pub Date: 2026-03-12 | DOI: 10.1109/tnnls.2026.3670013
Yan Zhong,Lei Ma,Xiaomeng Yan
Decentralized federated learning (DFL) has gained significant attention as a framework for analyzing large-scale data distributed across multiple sites, where communication between sites is constrained by a decentralized graph structure. Due to privacy concerns and high communication costs, reducing the number of communication rounds in DFL has become an important area of research. This article investigates the effects of the decentralized graph's topology on the convergence rate of DFL algorithms and introduces a novel tensor-based multiple-gossip-steps (T-MGS) method to optimize communication efficiency from the topology perspective. The core idea of this method is to use gossip tensors to guide the information flow between sites and enable dynamic adjustments to the transmitted content at each communication step without increasing its volume. The proposed method minimizes the second-largest absolute eigenvalue of the equivalent gossip matrix, a key factor influencing convergence speed. Experimental results on both simulated and real datasets demonstrate that the proposed T-MGS outperforms existing strategies in terms of communication efficiency, reducing the number of communication rounds without compromising model accuracy.
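The quantity being minimized, the second-largest absolute eigenvalue of the gossip matrix, is straightforward to compute. The sketch below evaluates it for a uniform-weight ring topology as a baseline; the topology and weights are illustrative and are not the article's T-MGS construction:

```python
import numpy as np

def second_largest_abs_eigenvalue(W):
    # W: symmetric doubly stochastic gossip matrix; the magnitude of
    # its second-largest eigenvalue governs gossip convergence speed.
    eigvals = np.sort(np.abs(np.linalg.eigvalsh(W)))[::-1]
    return eigvals[1]

def ring_gossip_matrix(n, w=1 / 3):
    # Uniform-weight ring: each node averages with its two neighbors.
    W = np.eye(n) * (1 - 2 * w)
    for i in range(n):
        W[i, (i + 1) % n] += w
        W[i, (i - 1) % n] += w
    return W

W = ring_gossip_matrix(8)
lam2 = second_largest_abs_eigenvalue(W)
# Running k gossip steps shrinks the consensus error by a factor of
# lam2 per step, so k steps act like one matrix with eigenvalue lam2**k.
print(lam2, lam2 ** 3)
```

This makes concrete why shrinking this eigenvalue (here, by multiple gossip steps or by a better topology) directly reduces the number of communication rounds needed for a given accuracy.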
Title: Topology-Optimal Multiple Gossip Steps for Decentralized Federated Learning via Gossip Tensor
Pub Date: 2026-03-12 | DOI: 10.1109/tnnls.2026.3670396
Jun Fu,Yupeng Liu,Qi Wang,Meiting Pan,Tonglei Cheng
Stimulated Raman scattering (SRS) plays a pivotal role in applications such as optical communications, fiber optic sensing, and spectral analysis. However, traditional modeling methods like the split-step Fourier method (SSFM) are computationally demanding. In response to these challenges, we propose a novel deep learning framework based on a hybrid neural network, specifically architected to capture the complex spatio-temporal dependencies inherent in nonlinear pulse propagation. Our model offers rapid and precise predictions of SRS behavior, alleviating the need for computationally expensive simulations like SSFM. To validate the model's performance, we conducted experiments using chalcogenide microstructured optical fibers (MOFs), which are attracting attention due to their high Raman gain coefficient and wide spectral range in the mid-infrared (MIR) region. Specifically, we demonstrate the first successful generation of MIR SRS in a 2-$\mu$m direct-pumped suspended-core As$_2$S$_3$ MOF, which provides a crucial real-world dataset for model validation. The results demonstrate that our hybrid neural network is 116 times faster on a GPU and 44 times faster on a CPU compared to SSFM while maintaining accuracy and generalization. This significant acceleration paves the way for real-time analysis and inverse design of nonlinear photonic systems, tasks previously intractable with traditional methods.
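To see why SSFM is costly, here is a minimal single step of the symmetric split-step Fourier method for the scalar nonlinear Schrödinger equation, alternating dispersion in the frequency domain with Kerr nonlinearity in the time domain; thousands of such steps per pulse are typical. A faithful SRS model would add a Raman response term, and all fiber parameters below are placeholders, not values from the article:

```python
import numpy as np

def ssfm_step(A, dz, dt, beta2, gamma_nl):
    # One symmetric split-step: half dispersion, full nonlinearity,
    # half dispersion. Both operators are unimodular phases, so the
    # pulse energy is conserved to machine precision.
    w = 2 * np.pi * np.fft.fftfreq(A.size, d=dt)
    half_disp = np.exp(0.5j * (beta2 / 2) * w ** 2 * dz)
    A = np.fft.ifft(half_disp * np.fft.fft(A))
    A = A * np.exp(1j * gamma_nl * np.abs(A) ** 2 * dz)
    A = np.fft.ifft(half_disp * np.fft.fft(A))
    return A

# Hypothetical normalized units: sech input pulse, anomalous dispersion.
t = np.linspace(-10, 10, 1024)
A = 1.0 / np.cosh(t)
for _ in range(100):
    A = ssfm_step(A, dz=1e-3, dt=t[1] - t[0], beta2=-1.0, gamma_nl=1.0)
print(np.max(np.abs(A)))
```

Each step costs two FFT pairs on the full time grid, which is the per-step work the article's neural surrogate amortizes away.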
Title: A Deep Learning Approach for Dynamic Modeling of Stimulated Raman Scattering in Chalcogenide Microstructured Optical Fibers
Sharpness-aware minimization (SAM) enhances generalization by minimizing max-sharpness (MaxS). Despite its practical success, we empirically found that the MaxS behind SAM's generalization enhancements faces the "flatness indicator problem" (FIP), where SAM only considers the flatness in the direction of gradient ascent. This leads to high Hessian eigenvalues for the deep neural network (DNN), indicating insufficient flatness in the solution region. A better flatness indicator (FI) would lower these Hessian eigenvalues, resulting in a flatter minimum and improved generalization of the network, but SAM, being inherently a greedy search method, cannot provide one. In this article, we propose to utilize the difference between the training loss and the minimum loss over the neighborhood surrounding the current weight, which we denote as min-sharpness (MinS). By merging MaxS and MinS, we create a better FI that indicates a flatter direction during optimization. Specifically, we combine this FI with SAM into the proposed bilateral SAM (BSAM), which finds a flatter minimum than SAM. The theoretical analysis demonstrates that BSAM converges to a local minimum. Extensive experiments demonstrate that BSAM offers superior generalization performance and robustness compared to vanilla SAM across various tasks, i.e., classification, transfer learning, human pose estimation, semantic segmentation, and network quantization.
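The MaxS/MinS decomposition can be made concrete with a toy sketch on a 2-D quadratic. The loss, the one-sided approximation of MinS along the negative-gradient direction, and the SAM-style update are simplifying assumptions, not the article's exact BSAM algorithm:

```python
import numpy as np

def loss(w):
    # Toy 2-D loss: mild curvature in w[0], sharp curvature in w[1].
    return 0.5 * w[0] ** 2 + 5.0 * w[1] ** 2

def grad(w):
    return np.array([w[0], 10.0 * w[1]])

def bsam_step(w, lr=0.05, rho=0.1):
    g = grad(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Max-sharpness (SAM): loss rise at the adversarial point w + eps.
    max_s = loss(w + eps) - loss(w)
    # Min-sharpness (sketch of the article's MinS): loss drop toward
    # the best nearby point, crudely approximated along -eps.
    min_s = loss(w) - loss(w - eps)
    # Bilateral flatness indicator MaxS + MinS; the parameter update
    # itself follows SAM (gradient taken at the ascent point).
    w_new = w - lr * grad(w + eps)
    return w_new, max_s + min_s

w = np.array([2.0, 2.0])
for _ in range(200):
    w, fi = bsam_step(w)
print(loss(w), fi)
```

On this surface the bilateral indicator `fi` is dominated by the sharp w[1] direction, illustrating how a two-sided probe detects curvature that a single ascent direction can miss.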
Title: Bilateral Sharpness-Aware Minimization for Flatter Minima
Authors: Jiaxin Deng, Junbiao Pang, Baochang Zhang, Qingming Huang
DOI: 10.1109/tnnls.2026.3671361
Pub Date: 2026-03-12