
IEEE Transactions on Neural Networks and Learning Systems: Latest Articles

Adaptive Niching-Based Gradient-Accelerated Differential Evolution for High-Dimensional Nonconvex Optimization
IF 10.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-17 DOI: 10.1109/tnnls.2026.3671634
Qi Yu, Xijun Liang, Jinmeng Liu, Pu Tian, Ling Jian
Citations: 0
Rethinking Spectral Graph Neural Networks With Spatially Adaptive Filtering
IF 10.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-17 DOI: 10.1109/tnnls.2026.3673098
Jingwei Guo, Kaizhu Huang, Xinping Yi, Zixian Su, Rui Zhang
Citations: 0
Policy-Adjustable Q-Learning for Data-Driven Nonlinear Optimal Tracking Control
IF 10.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-16 DOI: 10.1109/tnnls.2026.3672136
Jiaoyuan Chen, Dawei Gong, Yuyang Zhao, Shijie Song, Minglei Zhu
This article investigates a novel policy-adjustable Q-learning (PA-QL) algorithm aimed at addressing the optimal tracking control (OTC) problem for nonlinear discrete-time (DT) systems with enhanced adaptability and flexibility. A novel iteration scheme is developed that integrates the control weights into the augmented neural network (NN) input, thereby reformulating the learning process to explicitly characterize the optimal policy as a function of the adjustable weights. Consequently, the learned control policy is not constrained by predetermined weights, allowing for dynamic adjustment after offline training is completed. Moreover, such adjustments can be performed online seamlessly, offering substantially greater flexibility in adapting to changes in operating conditions or control objectives. Finally, the effectiveness of the proposed algorithm is established through rigorous theoretical analysis and further validated by simulation studies.
Citations: 0
CIRCUS: A Causal Intervention-Based Framework for Enhancing Counterfactual Fairness in Trained Classifiers
IF 10.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-16 DOI: 10.1109/tnnls.2026.3670269
Qifen Yang, Yuhui Deng, Jiande Huang, Lijuan Lu, Peng Zhou, Geyong Min
Ensuring model fairness for preventing potential biases based on any sensitive attribute is crucial for the societal acceptance of artificial intelligence in critical applications. Among various fairness concepts, counterfactual fairness has gained prominence as it is grounded in causal inference. This concept requires that an individual's prediction in the original world remains consistent with that in the counterfactual world where the sensitive feature value is modified. In this article, we aim to mitigate counterfactual biases of the model through causal intervention. Specifically, we first achieve effective causal intervention and counterfactual generation by proposing the causal inference tabular generative adversarial network (CITGAN) architecture. Unlike prior approaches based on variational autoencoders (VAEs) that inherently lack structural causal model (SCM) fidelity due to simultaneous generation, CITGAN strictly enforces causal consistency via an end-to-end topological generation process. By integrating exogenous variable inference with sequential generation, CITGAN ensures that functional dependencies are structurally preserved. Building on CITGAN, we propose the CIRCUS framework, a causal intervention-based framework designed to intuitively enhance the counterfactual fairness in trained classifiers. CIRCUS generates counterfactually discriminatory samples (CDSs) via causal intervention, guided by gradients and feature contributions, and subsequently applies bias correction preprocessing to their labels for classifier retraining. Experimental results demonstrate that CIRCUS effectively enhances counterfactual fairness while maintaining robust classification performance. Specifically, for the deep neural network (DNN) model, the $\text{MMD}_{\text{L}}$ and $\text{MMD}_{\text{K}}$ values are reduced by averages of 39.7% and 40.4%, respectively, compared with the second-best result. For the residual network (ResNet) model, these reductions amount to 56.7% and 54.5%, respectively.
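To make the counterfactual-generation idea in the abstract above concrete, here is a minimal sketch of abduction, intervention, and sequential (topological) regeneration on a toy linear structural causal model. It is not CITGAN or CIRCUS: the three-variable graph A → X → Y, the coefficients `ALPHA` and `BETA`, and all function names are hypothetical, chosen only to illustrate how inferring exogenous noise and regenerating variables in causal order keeps functional dependencies intact.

```python
import numpy as np

# Toy linear SCM (hypothetical, for illustration only):
#   A := given              (sensitive attribute)
#   X := ALPHA * A + u_x    (feature caused by A)
#   Y := BETA * X + u_y     (outcome caused by X)
ALPHA, BETA = 2.0, 0.5


def generate(a, u_x, u_y):
    """Generate (X, Y) in topological order from A and the exogenous noise."""
    x = ALPHA * a + u_x
    y = BETA * x + u_y
    return x, y


def abduct(a, x, y):
    """Recover the exogenous noise terms consistent with an observed sample."""
    u_x = x - ALPHA * a
    u_y = y - BETA * x
    return u_x, u_y


def counterfactual(a, x, y, a_cf):
    """Abduction, then intervention do(A = a_cf), then sequential regeneration."""
    u_x, u_y = abduct(a, x, y)
    return generate(a_cf, u_x, u_y)


rng = np.random.default_rng(0)
a_obs = 1.0
x_obs, y_obs = generate(a_obs, rng.normal(), rng.normal())

# Intervening with the observed value must reproduce the observation.
x_same, y_same = counterfactual(a_obs, x_obs, y_obs, a_cf=1.0)
# Flipping the sensitive attribute propagates through X and then Y.
x_flip, y_flip = counterfactual(a_obs, x_obs, y_obs, a_cf=0.0)

print("observed:      ", x_obs, y_obs)
print("counterfactual:", x_flip, y_flip)
```

In this toy setting, a classifier would be counterfactually fair on the sample if its prediction for `x_obs` matches its prediction for `x_flip`.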
Citations: 0
An Enhanced Low-Computational-Complexity Predefined-Time Convergent Zeroing Neural Network for Constrained Time-Varying Quadratic Programming With Kinematic Control of Robotic Manipulator
IF 10.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-13 DOI: 10.1109/tnnls.2026.3672118
Junpei Yang, Zhan Li, Weibing Li
Time-varying quadratic programming (TVQP) problems can be regarded as a challenging issue in a wide range of engineering applications, frequently incorporating equality, inequality, and bound constraints. By integrating a nonlinear complementarity problem (NCP) formulation, zeroing neural networks (ZNNs) can be generalized to solve TVQP problems effectively with such a full set of constraints. However, three issues may limit its computational efficiency and practical applications: 1) the inherent formulation of NCP functions for handling equality/inequality and boundary constraints substantially expands dimensions of coefficient matrices/vectors and solution-space variables; 2) the conventional ZNN (CZNN) framework inevitably necessitates matrix inversion operations for real-time solutions; and 3) insufficient robustness against noise interference compromises solution accuracy in practical implementations. To overcome these issues and promote solution performance, this article develops an enhanced lower dimension NCP low-computational-complexity (LCC) ZNN (ELNCP-LCCZNN) model for solving TVQP problems with time-varying equality, inequality, and variable boundary constraints. An ELNCP function is designed to reduce the matrix/vector coefficients of the model, and the LCCZNN model is utilized to construct a new dynamic model that eliminates the necessity for matrix inversion during solution. Furthermore, a nonlinear activation function is involved to guarantee predefined-time convergence and strengthen robustness against noise. The theoretical properties of the proposed ELNCP-LCCZNN model are validated through numerical simulations and experiments on robotic manipulator kinematic control. The results corroborate the analysis and demonstrate improved computational efficiency, enhanced noise robustness, and practical implementability compared with existing approaches.
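The matrix-inversion burden named in issue 2) above can be seen in a minimal sketch of a conventional ZNN. The example is not the ELNCP-LCCZNN model and handles no inequality or bound constraints: it only tracks the solution of a time-varying linear system A(t)x(t) = b(t) using the standard ZNN design rule de/dt = -γe with a linear activation and forward-Euler integration. The matrices, the gain `gamma`, and the step size `dt` are assumptions made for illustration.

```python
import numpy as np


def A(t):
    # Hypothetical time-varying coefficient matrix; det = 4 + cos(2t) >= 3.
    return np.array([[2 + np.sin(t), np.cos(t)],
                     [-np.cos(t), 2 - np.sin(t)]])


def A_dot(t):
    # Element-wise time derivative of A(t).
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t), -np.cos(t)]])


def b(t):
    return np.array([np.sin(t), np.cos(t)])


def b_dot(t):
    return np.array([np.cos(t), -np.sin(t)])


def cznn_solve(gamma=10.0, dt=1e-3, t_end=5.0):
    """Forward-Euler integration of a conventional ZNN for A(t) x = b(t)."""
    x = np.zeros(2)
    t = 0.0
    for _ in range(int(round(t_end / dt))):
        e = A(t) @ x - b(t)  # error function to be driven to zero
        # Imposing de/dt = -gamma * e gives A x_dot = -A_dot x + b_dot - gamma e.
        rhs = -A_dot(t) @ x + b_dot(t) - gamma * e
        # This linear solve at every step is the inversion cost in question.
        x_dot = np.linalg.solve(A(t), rhs)
        x = x + dt * x_dot
        t += dt
    return t, x


t_final, x_final = cznn_solve()
residual = np.linalg.norm(A(t_final) @ x_final - b(t_final))
print(f"residual at t = {t_final:.1f}: {residual:.2e}")
```

Replacing the linear activation with a nonlinear one is how finite-time or predefined-time variants are usually obtained; that step is omitted here.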
Citations: 0
Spatiotemporal Decoupled Learning for Spiking Neural Networks
IF 10.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-13 DOI: 10.1109/tnnls.2026.3671461
Chenxiang Ma, Xinyi Chen, Kay Chen Tan, Jibin Wu
Citations: 0
Robust Reinforcement Learning via Leveraging Historically Optimal Policy With Regulation of Performance
IF 10.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-13 DOI: 10.1109/tnnls.2026.3670947
Qinglong Chen, Fei Zhu
Most existing adversarial training methods in reinforcement learning (RL) offer limited robustness and remain vulnerable to novel attacks. To address this limitation, an approach that enhances policy robustness by leveraging the historically optimal policy to guide policy optimization and generating diverse adversarial perturbations, termed robust RL via leveraging historically optimal policy with regulation of performance (HORP), is proposed. Unlike other approaches that rely solely on trial-and-error interactions, HORP constructs a guidance value function by simultaneously considering value gaps and policy distribution divergence, thereby focusing on prioritized learning in promising action spaces. It also incorporates an adaptive performance-aware optimization mechanism to trigger timely corrections, preventing the agent from deviating from optimal performance. Furthermore, HORP dynamically modulates perturbation entropy through controlled uncertainty injection, thereby improving the agent's generalized defensive capabilities. Experiments demonstrate that HORP achieves superior performance in most cases regarding both natural performance and robustness against various state attacks.
Citations: 0
Topology-Optimal Multiple Gossip Steps for Decentralized Federated Learning via Gossip Tensor
IF 10.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-12 DOI: 10.1109/tnnls.2026.3670013
Yan Zhong, Lei Ma, Xiaomeng Yan
Decentralized federated learning (DFL) has gained significant attention as a framework for analyzing large-scale data distributed across multiple sites, where communication between sites is constrained by a decentralized graph structure. Due to privacy concerns and high communication costs, reducing the number of communication rounds in DFL has become an important area of research. This article investigates the effects of the decentralized graph's topology on the convergence rate of DFL algorithms and introduces a novel tensor-based multiple-gossip-steps (T-MGS) method to optimize communication efficiency from the topology aspect. The core idea of this method is to use gossip tensors to guide the information flow between sites and enable dynamic adjustments to the transmitted content at each communication step without increasing its volume. The proposed method minimizes the second-largest absolute eigenvalue of the equivalent gossip matrix, a key factor influencing convergence speed. Experimental results on both simulated and real datasets demonstrate that the proposed T-MGS outperforms existing strategies in terms of communication efficiency, reducing the number of communication rounds without compromising model accuracy.
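The role of the second-largest absolute eigenvalue mentioned in the abstract above can be illustrated with a fixed gossip matrix. The sketch below is not T-MGS (it uses no gossip tensor and performs no optimization); it assumes a hypothetical 8-node ring with uniform weights of 1/3 and only shows how k plain repetitions of the same gossip step shrink that eigenvalue to its k-th power.

```python
import numpy as np


def ring_gossip_matrix(n):
    """Symmetric, doubly stochastic gossip matrix for an n-node ring."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, i] = 1 / 3            # weight kept on the node's own model
        W[i, (i - 1) % n] = 1 / 3  # weight on the left neighbor
        W[i, (i + 1) % n] = 1 / 3  # weight on the right neighbor
    return W


def slem(W):
    """Second-largest absolute eigenvalue of a symmetric gossip matrix."""
    eigvals = np.linalg.eigvalsh(W)
    return np.sort(np.abs(eigvals))[-2]


W = ring_gossip_matrix(8)
lam = slem(W)

k = 3  # number of gossip steps per communication round (assumed)
lam_k = slem(np.linalg.matrix_power(W, k))

print(f"one gossip step: {lam:.4f}")
print(f"{k} gossip steps: {lam_k:.4f}")
```

A smaller value means the nodes' models contract toward their average faster per round, which is why this eigenvalue is a natural optimization target.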
Citations: 0
A Deep Learning Approach for Dynamic Modeling of Stimulated Raman Scattering in Chalcogenide Microstructured Optical Fibers
IF 10.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-12 DOI: 10.1109/tnnls.2026.3670396
Jun Fu, Yupeng Liu, Qi Wang, Meiting Pan, Tonglei Cheng
Stimulated Raman scattering (SRS) plays a pivotal role in applications such as optical communications, fiber optic sensing, and spectral analysis. However, traditional modeling methods like the split-step Fourier method (SSFM) are computationally demanding. In response to these challenges, we propose a novel deep learning framework based on a hybrid neural network, specifically architected to capture the complex spatio-temporal dependencies inherent in nonlinear pulse propagation. Our model offers rapid and precise predictions of SRS behavior, alleviating the need for computationally expensive simulations like SSFM. To validate the model's performance, we conducted experiments using chalcogenide microstructured optical fibers (MOFs), which are attracting attention due to their high Raman gain coefficient and wide spectral range in the mid-infrared (MIR) region. Specifically, we demonstrate the first successful generation of MIR SRS in a 2-$\mu$m direct-pumped suspended-core As$_2$S$_3$ MOF, which provides a crucial real-world dataset for model validation. The results demonstrate that our hybrid neural network is 116 times faster on a GPU and 44 times faster on a CPU compared to SSFM while maintaining accuracy and generalization. This significant acceleration paves the way for real-time analysis and inverse design of nonlinear photonic systems, tasks previously intractable with traditional methods.
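For readers unfamiliar with the SSFM baseline, the following is a minimal sketch of a symmetric split-step Fourier solver. It is an assumption-laden simplification, not the simulation used in the article: it integrates only the lossless nonlinear Schrödinger equation with second-order dispersion and Kerr nonlinearity, omits the Raman response entirely, and uses normalized units with hypothetical parameter values.

```python
import numpy as np


def ssfm(a0, t, beta2, gamma, z_end, n_steps):
    """Symmetric split-step Fourier propagation of the lossless NLSE
    dA/dz = -i*(beta2/2)*d2A/dt2 + i*gamma*|A|^2*A  (no Raman term)."""
    dz = z_end / n_steps
    omega = 2 * np.pi * np.fft.fftfreq(t.size, d=t[1] - t[0])
    # Dispersion acts as a pure phase in the frequency domain (half step).
    half_linear = np.exp(1j * (beta2 / 2) * omega**2 * (dz / 2))
    a = a0.astype(complex)
    for _ in range(n_steps):
        a = np.fft.ifft(half_linear * np.fft.fft(a))     # half dispersion step
        a = a * np.exp(1j * gamma * np.abs(a)**2 * dz)   # full nonlinear step
        a = np.fft.ifft(half_linear * np.fft.fft(a))     # half dispersion step
    return a


# Fundamental soliton in normalized units (beta2 = -1, gamma = 1).
t = np.linspace(-20, 20, 1024, endpoint=False)
a_in = 1 / np.cosh(t)
a_out = ssfm(a_in, t, beta2=-1.0, gamma=1.0, z_end=1.0, n_steps=100)

dt = t[1] - t[0]
energy_in = np.sum(np.abs(a_in)**2) * dt
energy_out = np.sum(np.abs(a_out)**2) * dt
peak_out = np.max(np.abs(a_out))
print(energy_in, energy_out, peak_out)
```

Each propagation step costs two FFT pairs over the whole time grid, and the step size must stay small for accuracy, which is the computational expense a learned surrogate aims to avoid.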
Citations: 0
Bilateral Sharpness-Aware Minimization for Flatter Minima
IF 10.4 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-03-12 DOI: 10.1109/tnnls.2026.3671361
Jiaxin Deng, Junbiao Pang, Baochang Zhang, Qingming Huang
Sharpness-aware minimization (SAM) enhances generalization by minimizing max-sharpness (MaxS). Despite its practical success, we empirically found that the MaxS behind SAM's generalization enhancements faces the "flatness indicator problem" (FIP), where SAM only considers the flatness in the direction of gradient ascent. This leads to high Hessian eigenvalues for the deep neural network (DNN), indicating insufficient flatness in the solution region. A better flatness indicator (FI) would lower these Hessian eigenvalues, resulting in a flatter minimum and improved generalization of the network. SAM is inherently a greedy search method. In this article, we propose to utilize the difference between the training loss and the minimum loss over the neighborhood surrounding the current weight, which we denote as min-sharpness (MinS). By merging MaxS and MinS, we create a better FI that indicates a flatter direction during the optimization. Specifically, we combine this FI with SAM into the proposed bilateral SAM (BSAM), which finds a flatter minimum than SAM. The theoretical analysis demonstrates that BSAM converges to a local minimum. Extensive experiments demonstrate that BSAM offers superior generalization performance and robustness compared to vanilla SAM across various tasks, i.e., classification, transfer learning, human pose estimation, semantic segmentation, and network quantization.
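The two sharpness quantities described in the abstract above can be illustrated numerically. The sketch below is not the BSAM algorithm: it assumes a hypothetical quadratic loss, approximates the neighborhood maximum with SAM's usual one-step gradient-ascent perturbation, and approximates the neighborhood minimum with a mirrored gradient-descent perturbation, which is an assumption about how MinS might be estimated. Summing the two terms is likewise only one plausible way to merge them.

```python
import numpy as np

# Hypothetical Hessian of a toy quadratic loss L(w) = 0.5 * w^T H w.
H = np.diag([10.0, 1.0])


def loss(w):
    return 0.5 * w @ H @ w


def grad(w):
    return H @ w


def sharpness_terms(w, rho=0.05):
    """One-step estimates of max-sharpness and min-sharpness at radius rho."""
    g = grad(w)
    eps = rho * g / np.linalg.norm(g)  # SAM-style normalized perturbation
    max_s = loss(w + eps) - loss(w)    # loss increase along gradient ascent
    min_s = loss(w) - loss(w - eps)    # loss decrease along gradient descent
    return max_s, min_s


w = np.array([1.0, 1.0])
rho = 0.05
max_s, min_s = sharpness_terms(w, rho)
print(f"MaxS ~ {max_s:.4f}, MinS ~ {min_s:.4f}, sum ~ {max_s + min_s:.4f}")
```

On a quadratic loss, the gap between the two estimates equals the curvature term along the normalized gradient direction, so a large gap signals a sharp direction that the ascent-only view measures from one side.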
Citations: 0