
Neural Networks: Latest Publications

Transforming tabular data into images for deep learning models
IF 6.3 · CAS Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-02-10 · DOI: 10.1016/j.neunet.2026.108715
Abdullah Elen , Emre Avuçlu
Deep learning (DL) has achieved remarkable success in processing unstructured data such as images, text, and audio, yet its application to tabular numerical datasets remains challenging due to the lack of inherent spatial structure. In this study, we present a novel approach for transforming numerical tabular data into grayscale image representations, enabling the effective use of convolutional neural networks and other DL architectures on traditionally numerical datasets. The method normalizes features, organizes them into square image matrices, and generates labeled images for classification. Experiments were conducted on four publicly available datasets: Rice MSC Dataset (RMSCD), Optical Recognition of Handwritten Digits (Optdigits), TUNADROMD, and Spambase. Transformed datasets were evaluated using Residual Network (ResNet-18) and Directed Acyclic Graph Neural Network (DAG-Net) models with 5-fold cross-validation. The DAG-Net model achieved accuracies of 99.91% on RMSCD, 99.77% on Optdigits, 98.84% on TUNADROMD, and 93.06% on Spambase, demonstrating the efficacy of the proposed transformation. Additional ablation studies and efficiency analyses highlight improvements in training performance and computational cost. The results indicate that the proposed image-based transformation provides a practical and efficient strategy for integrating numerical datasets into deep learning workflows, broadening the applicability of DL techniques across diverse domains. The implementation is released as open-source software to facilitate reproducibility and further research.
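The transformation the abstract describes (normalize features, arrange them into a square matrix, emit a grayscale image) can be sketched as below. The zero-padding of the leftover cells and the 8-bit scaling are illustrative assumptions, not necessarily the authors' exact procedure.

```python
import numpy as np

def tabular_to_image(row, col_min, col_max):
    """Map one numeric record to a square grayscale image (illustrative sketch)."""
    # Min-max normalize each feature to [0, 1] using per-column statistics.
    x = (np.asarray(row, dtype=float) - col_min) / (col_max - col_min + 1e-12)
    # Zero-pad up to the next perfect square so the vector reshapes cleanly.
    side = int(np.ceil(np.sqrt(x.size)))
    padded = np.zeros(side * side)
    padded[: x.size] = x
    # Scale to 8-bit grayscale and reshape into a side x side image.
    return (padded.reshape(side, side) * 255).astype(np.uint8)

# Example: a 7-feature record becomes a 3x3 image with 2 zero-padded pixels.
img = tabular_to_image([1, 2, 3, 4, 5, 6, 7],
                       col_min=np.zeros(7), col_max=np.full(7, 10.0))
```

Each such image can then be labeled with the record's class and fed to a CNN such as ResNet-18, as in the paper's experiments.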
Neural Networks, Volume 199, Article 108715.
Citations: 0
Gradient-informed neural networks: Embedding prior beliefs for learning in low-data scenarios
IF 6.3 · CAS Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-02-02 · DOI: 10.1016/j.neunet.2026.108681
Filippo Aglietti , Francesco Della Santa , Andrea Piano , Virginia Aglietti
We propose Gradient-Informed Neural Networks (GradINNs), a methodology that can be used to efficiently approximate a wide range of functions in low-data regimes, when only general prior beliefs are available, a condition that is often encountered in complex engineering problems.
GradINNs incorporate prior beliefs about the first-order derivatives of the target function to constrain the behavior of its gradient, thus implicitly shaping it, without requiring explicit access to the target function’s derivatives. This is achieved by using two Neural Networks: one modeling the target function and a second, auxiliary network expressing the prior beliefs about the first-order derivatives (e.g., smoothness, oscillations, etc.). A customized loss function enables the training of the first network while enforcing gradient constraints derived from the auxiliary network; at the same time, it allows these constraints to be relaxed in accordance with the training data. Numerical experiments demonstrate the advantages of GradINNs, particularly in low-data regimes, with results showing strong performance compared to standard Neural Networks across the tested scenarios, including synthetic benchmark functions and real-world engineering tasks.
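The composite objective described above (a data-fit term plus a term pushing the model's gradient toward the auxiliary prior) can be sketched as below. The finite-difference gradient estimate and the fixed weight `lam` are simplifying assumptions for illustration; the paper trains both networks jointly with a customized loss that can relax the constraint near the data.

```python
import numpy as np

def gradinn_loss(f, g, x_data, y_data, x_coll, lam=1.0, eps=1e-4):
    """GradINN-style composite loss (illustrative sketch).

    f : model of the target function.
    g : auxiliary model encoding prior beliefs about f's first derivative.
    The data term fits f to the labels; the gradient term pushes a
    finite-difference estimate of f' toward g on collocation points.
    """
    data_term = np.mean((f(x_data) - y_data) ** 2)
    df = (f(x_coll + eps) - f(x_coll - eps)) / (2 * eps)  # central difference
    grad_term = np.mean((df - g(x_coll)) ** 2)
    return data_term + lam * grad_term
```

For example, with `f = sin` and the exact derivative prior `g = cos`, the gradient term vanishes and the loss reduces to the data term; a mismatched prior inflates the loss, which is the signal that shapes the model's gradient during training.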
Neural Networks, Volume 199, Article 108681.
Citations: 0
NG-SNN: A neurogenesis-inspired dynamic adaptive framework for efficient spike classification
IF 6.3 · CAS Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-01-29 · DOI: 10.1016/j.neunet.2026.108656
Jing Tang , Depeng Li , Zhenyu Zhang , Zhigang Zeng
Spiking neural networks (SNNs) are designed for low-power neuromorphic computing. A widely adopted hybrid paradigm decouples feature extraction from classification to improve biological plausibility and modularity. However, this decoupling concentrates decision making in the downstream classifier, which in many systems becomes the limiting factor for both accuracy and efficiency. Hand-preset, fixed topologies risk either redundancy or insufficient capacity, and surrogate-gradient training remains computationally costly. Biological neurogenesis is the brain’s mechanism for adaptively adding new neurons to build efficient, task-specific circuits. Inspired by this process, we propose the neurogenesis-inspired spiking neural network (NG-SNN), a dynamic adaptive framework that uses two key innovations to address these challenges. Specifically, we first introduce a supervised incremental construction mechanism that dynamically grows a task-optimal structure by selectively integrating neurons under a contribution criterion. Second, we devise an activity-dependent analytical learning method that replaces iterative optimization with single-shot and adaptive weight computation for each structural update, drastically improving training efficiency. Therefore, NG-SNN uniquely integrates dynamic structural adaptation with efficient non-iterative learning, forming a self-organizing and rapidly converging classification system. Moreover, this neurogenesis-driven process endows NG-SNN with a highly compact structure that requires significantly fewer parameters. Extensive experiments demonstrate that our NG-SNN matches or outperforms its competitors on diverse datasets, without the overhead of iterative training and manual architecture tuning.
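The two innovations above (incremental growth under a contribution criterion, plus analytical rather than iterative weight computation) can be caricatured in a few lines. This is a hypothetical stand-in, not NG-SNN itself: candidate units are plain tanh features instead of spiking neurons, the contribution criterion is a simple error-reduction threshold `tol`, and the "analytical learning" is a least-squares solve.

```python
import numpy as np

def grow_classifier(feats, labels_1hot, max_neurons=32, tol=1e-3, seed=0):
    """Incremental, non-iterative classifier growth (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    H = np.ones((feats.shape[0], 1))            # start with a bias column only
    err = np.inf
    for _ in range(max_neurons):
        w = rng.standard_normal(feats.shape[1])
        cand = np.tanh(feats @ w)[:, None]      # one candidate hidden unit
        H_try = np.hstack([H, cand])
        # Analytical output weights: single least-squares solve, no iterations.
        W, *_ = np.linalg.lstsq(H_try, labels_1hot, rcond=None)
        new_err = np.mean((H_try @ W - labels_1hot) ** 2)
        if err - new_err > tol:                 # keep only contributing units
            H, err = H_try, new_err
    return H.shape[1] - 1, err                  # (#units kept, final MSE)
```

The point of the sketch is the control flow: structure grows only when a unit demonstrably contributes, and every structural update re-solves the output weights in closed form instead of running gradient descent.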
Neural Networks, Volume 199, Article 108656.
Citations: 0
Trainable-parameter-free structural-diversity message passing for graph neural networks
IF 6.3 · CAS Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-02-10 · DOI: 10.1016/j.neunet.2026.108711
Mingyue Kong, Yinglong Zhang, Chengda Xu, Xuewen Xia, Xing Xu
Graph Neural Networks (GNNs) have achieved strong performance in structured data modeling such as node classification. However, real-world graphs often exhibit heterogeneous neighborhoods and complex feature distributions, while mainstream approaches rely on many learnable parameters and apply uniform aggregation to all neighbors. This lack of explicit modeling for structural diversity often leads to representation homogenization, semantic degradation, and poor adaptability under challenging conditions such as low supervision or class imbalance. To address these limitations, we propose a trainable-parameter-free graph neural network framework, termed the Structural-Diversity Graph Neural Network (SDGNN), which operationalizes structural diversity in message passing. At its core, the Structural-Diversity Message Passing (SDMP) mechanism performs within-group statistics followed by cross-group selection, thereby capturing neighborhood heterogeneity while stabilizing feature semantics. SDGNN further incorporates complementary structure-driven and feature-driven partitioning strategies, together with a normalized-propagation-based global structural enhancer, to enhance adaptability across diverse graphs. Extensive experiments on nine public benchmark datasets and an interdisciplinary PubMed citation network demonstrate that SDGNN consistently outperforms mainstream GNNs, especially under low supervision, class imbalance, and cross-domain transfer. The full implementation, including code and configurations, is publicly available at: https://github.com/mingyue15694/SGDNN/tree/main.
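The core SDMP idea (within-group statistics followed by cross-group selection, with no trainable parameters) can be sketched as below. The per-group mean and the dot-product selection rule are assumptions for illustration; the paper's exact statistics, selection, and partitioning strategies may differ.

```python
import numpy as np

def sdmp_aggregate(h_v, neighbor_feats, groups):
    """One structural-diversity message-passing step (illustrative sketch).

    h_v            : feature vector of the receiving node.
    neighbor_feats : (num_neighbors, d) array of neighbor features.
    groups         : partition of neighbor indices, e.g. [[0, 1], [2, 3]].
    """
    # Within-group statistics: summarize each group (here: the group mean).
    stats = np.stack([neighbor_feats[g].mean(axis=0) for g in groups])
    # Cross-group selection: pick the summary most similar to the node,
    # so heterogeneous neighborhoods are not averaged into one blur.
    scores = stats @ h_v
    return stats[int(np.argmax(scores))]
```

Note there is nothing to train here: the aggregation is fully determined by the graph partition and the fixed statistic, which is what "trainable-parameter-free" refers to in the title.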
Neural Networks, Volume 199, Article 108711.
Citations: 0
CGLK-GNN: A connectome generation network with large kernels for GNN based Alzheimer’s disease analysis
IF 6.3 · CAS Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-02-07 · DOI: 10.1016/j.neunet.2026.108689
Wenqi Zhu , Zhong Yin , Yinghua Fu , Alzheimer's Disease Neuroimaging Initiative
Alzheimer’s disease (AD) is a currently incurable neurodegenerative disease, and early detection is a high research priority. AD is characterized by progressive cognitive decline accompanied by alterations in brain functional connectivity. Because such connectivity data are naturally structured as graphs, graph neural networks (GNNs) have emerged as important methods for brain function analysis and disease prediction in recent years. However, most GNN methods are limited by information loss caused by traditional functional connectivity calculation as well as common noise issues in functional magnetic resonance imaging (fMRI) data. This paper proposes a graph-generation-based AD classification model using resting-state fMRI to address this issue. The connectome generation network with large kernels for GNN-based AD analysis (CGLK-GNN) contains a graph generation block and a GNN prediction block. The graph generation block employs decoupled convolutional networks with large kernels to extract comprehensive temporal features while preserving sequential dependencies, contrasting with previous generative GNN approaches. This module constructs the connectome graph by encoding both edge-wise correlations and node-embedded temporal features, thereby utilizing the generated graph more effectively. The subsequent GNN prediction block adopts an efficient architecture to learn these enhanced representations and perform final AD stage classification. Through independent cohort validations, CGLK-GNN outperforms state-of-the-art GNN and rsfMRI-based AD classifiers in differentiating AD status.
Neural Networks, Volume 199, Article 108689.
Citations: 0
SCAD: A self-constrained solution to automate context-guided zero-shot image anomaly detection
IF 6.3 · CAS Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-01-19 · DOI: 10.1016/j.neunet.2026.108577
Siqi Wang , Guangpu Wang , Xinwang Liu , Jie Liu , Jiyuan Liu , Siwei Wang
Image anomaly detection (IAD) usually requires a separate training set to build an inductive model, which then infers on the test set. However, the cost of collecting and labeling training images has inspired zero-shot IAD (ZS-IAD), which directly processes the test set without any training set. Most ZS-IAD methods resort to pre-trained foundation models (e.g., CLIP), which rely on external prompts and lack adaptation to the target IAD scene. By contrast, context-guided ZS-IAD methods have recently attracted growing interest: they not only avoid external prompts by exploiting scene-specific context clues within unlabeled images, but also achieve superior performance to prior ZS-IAD counterparts. Unfortunately, existing context-guided ZS-IAD methods suffer from two vital flaws: the absence of a training set forces them to set key hyperparameters blindly, which leads to unreliable performance, and they do not actively handle mixed anomalies that disturb the learning process. To this end, we propose to automate context-guided ZS-IAD with a novel Self-Constrained Anomaly Detector (SCAD), which makes the following contributions: (1) We propose a novel self-constrained mechanism that can automatically determine proper values for key hyperparameters. (2) We design a new online self-constrained sampler that terminates the time-consuming sampling process at a proper stopping point, which can significantly reduce the computational cost. (3) We develop self-constrained normality refinement strategies that can actively constrain anomalies’ impact and automatically rectify the stopping threshold. To the best of our knowledge, this is also the first work that addresses hyperparameter selection in the IAD realm. Experiments show that SCAD not only yields comparable performance to classic IAD solutions, but also matches ZS-IAD solutions enhanced by hindsight knowledge (i.e., hyperparameters validated on the test set).
Neural Networks, Volume 199, Article 108577.
Citations: 0
Efficient semantic segmentation via logit-guided feature distillation
IF 6.3 · CAS Tier 1, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2026-07-01 · Epub Date: 2026-01-29 · DOI: 10.1016/j.neunet.2026.108663
Xuyi Yu , Shang Lou , Yinghai Zhao , Huipeng Zhang , Kuizhi Mei
Knowledge Distillation (KD) is a critical technique for model compression, facilitating the transfer of implicit knowledge from a teacher model to a more compact, deployable student model. KD can generally be divided into two categories: logit distillation and feature distillation. Feature distillation has been predominant in achieving state-of-the-art (SOTA) performance, but recent advances in logit distillation have begun to narrow the gap. We propose a Logit-guided Feature Distillation (LFD) framework that combines the strengths of both logit and feature distillation to enhance the efficacy of knowledge transfer, particularly leveraging the rich classification information inherent in logits for semantic segmentation tasks. Furthermore, it is observed that Deep Neural Networks (DNNs) only manifest task-relevant characteristics at sufficient depths, which may be a limiting factor in achieving higher accuracy. In this work, we introduce a collaborative distillation method that preemptively focuses on critical pixels and categories in the early stage. We employ logits from deep layers to generate fine-grained spatial masks that are directly conveyed to the feature distillation stage, thereby inducing spatial gradient disparities. Additionally, we generate class masks that dynamically modulate the weights of shallow auxiliary heads, ensuring that class-relevant features can be calibrated by the primary head. A novel shared auxiliary head distillation approach is also presented. Experiments on the Cityscapes, Pascal VOC, and CamVid datasets show that the proposed method achieves competitive performance while maintaining low memory usage. Our code will be released at https://github.com/fate2715/LFD.
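The idea of deriving a fine-grained spatial mask from deep-layer logits can be sketched as below. Using prediction entropy as the per-pixel weight is an assumption for illustration (it emphasizes pixels where the teacher is uncertain); LFD's actual mask construction may differ.

```python
import numpy as np

def logit_spatial_mask(teacher_logits):
    """Per-pixel weight map from a (C, H, W) teacher logit map (sketch).

    Returns an (H, W) array in [0, 1]: high where the teacher's class
    distribution is uncertain, low where it is confident. Such a map can
    reweight a per-pixel feature-distillation loss.
    """
    # Numerically stable softmax over the class axis.
    z = teacher_logits - teacher_logits.max(axis=0, keepdims=True)
    p = np.exp(z) / np.exp(z).sum(axis=0, keepdims=True)
    # Per-pixel entropy, normalized by log(C) so the mask lies in [0, 1].
    entropy = -(p * np.log(p + 1e-12)).sum(axis=0)
    return entropy / np.log(teacher_logits.shape[0])
```

A distillation loss could then be `mean(mask * (f_student - f_teacher) ** 2)` over aligned feature maps, which is one way logits can induce the spatial gradient disparities the abstract mentions.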
Neural Networks, vol. 199, Article 108663.
Citations: 0
Resolving ambiguity in code refinement via conidfine: A conversationally-Aware framework with disambiguation and targeted retrieval
IF 6.3 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-07-01 Epub Date: 2026-01-29 DOI: 10.1016/j.neunet.2026.108650
Aoyu Song , Afizan Azman , Shanzhi Gu , Fangjian Jiang , Jianchi Du , Tailong Wu , Mingyang Geng , Jia Li
Code refinement is a vital aspect of software development, involving the review and enhancement of code contributions made by developers. A critical challenge in this process arises from unclear or ambiguous review comments, which can hinder developers’ understanding of the required changes. Our preliminary study reveals that conversations between developers and reviewers often contain valuable information that can help resolve such ambiguous review suggestions. However, leveraging conversational data to address this issue poses two key challenges: (1) enabling the model to autonomously determine whether a review suggestion is ambiguous, and (2) effectively extracting the relevant segments from the conversation that can aid in resolving the ambiguity.
In this paper, we propose a novel method for addressing ambiguous review suggestions by leveraging conversations between reviewers and developers. To tackle the above two challenges, we introduce an Ambiguous Discriminator that uses multi-task learning to classify ambiguity and generate type-aware confusion points from a GPT-4-labeled dataset. These confusion points guide a Type-Driven Multi-Strategy Retrieval Framework that applies targeted strategies based on categories like Inaccurate Localization, Unclear Expression, and Lack of Specific Guidance to extract actionable information from the conversation context. To support this, we construct a semantic auxiliary instruction library containing spatial indicators, clarification patterns, and action-oriented verbs, enabling precise alignment between review suggestions and informative conversation segments. Our method is evaluated on two widely-used code refinement datasets CodeReview and CodeReview-New, where we demonstrate that our method significantly enhances the performance of various state-of-the-art models, including TransReview, T5-Review, CodeT5, CodeReviewer and ChatGPT. Furthermore, we explore in depth how conversational information improves the model’s ability to address fine-grained situations, and we conduct human evaluations to assess the accuracy of ambiguity detection and the correctness of generated confusion points. We are the first to introduce the issue of ambiguous review suggestions in the code refinement domain and propose a solution that not only addresses these challenges but also sets the foundation for future research. Our method provides valuable insights into improving the clarity and effectiveness of review suggestions, offering a promising direction for advancing code refinement techniques.
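The type-driven retrieval idea can be illustrated with a toy cue-matching router. The cue lists below stand in for the paper's "semantic auxiliary instruction library", and the category names, cues, and scoring are assumptions for illustration, not the actual ConIDFine implementation.

```python
# Illustrative cue lists standing in for the semantic auxiliary
# instruction library described in the abstract (the real library
# and its matching logic are richer than this sketch).
CUES = {
    "inaccurate_localization": ["line", "above", "below", "in function"],
    "unclear_expression": ["i mean", "to clarify", "in other words", "specifically"],
    "lack_of_guidance": ["rename", "extract", "replace", "move", "remove"],
}

def retrieve_segments(conversation, confusion_type, top_k=2):
    """Rank conversation turns by how many type-specific cues they contain.

    conversation: list of turn strings; confusion_type: one of the
    CUES keys. Returns up to top_k turns that matched at least one cue.
    """
    cues = CUES[confusion_type]
    scored = []
    for turn in conversation:
        low = turn.lower()
        score = sum(1 for c in cues if c in low)
        if score:
            scored.append((score, turn))
    scored.sort(key=lambda t: -t[0])
    return [turn for _, turn in scored[:top_k]]

conversation = [
    "This part looks off.",
    "I mean, specifically, please rename the helper above line 40.",
]
hits = retrieve_segments(conversation, "unclear_expression")
```

The sketch only shows the routing shape: different confusion types activate different cue sets, so the same conversation yields different evidence depending on why the review comment was judged ambiguous.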
Neural Networks, vol. 199, Article 108650.
Citations: 0
Warm-start or cold-start? A comparison of generalizability in gradient-based hyperparameter tuning
IF 6.3 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-07-01 Epub Date: 2026-01-29 DOI: 10.1016/j.neunet.2026.108647
Yubo Zhou , Jun Shu , Chengli Tan , Haishan Ye , Quanziang Wang , Junmin Liu , Deyu Meng , Ivor Tsang , Guang Dai
Bilevel optimization (BO) has garnered increasing attention in hyperparameter tuning. BO methods are commonly employed with two distinct strategies for the inner-level: cold-start, which uses a fixed initialization, and warm-start, which uses the last inner approximation solution as the starting point for the inner solver each time, respectively. Previous studies mainly stated that warm-start exhibits better convergence properties, while we provide a detailed comparison of these two strategies from a generalization perspective. Our findings indicate that, compared to the cold-start strategy, warm-start strategy exhibits worse generalization performance, such as more severe overfitting on the validation set. To explain this, we establish generalization bounds for the two strategies. We reveal that warm-start strategy produces a worse generalization upper bound due to its closer interaction with the inner-level dynamics, naturally leading to poor generalization performance. Inspired by the theoretical results, we propose several approaches to enhance the generalization capability of warm-start strategy and narrow its gap with cold-start, especially a novel random perturbation initialization method. Experiments validate the soundness of our theoretical analysis and the effectiveness of the proposed approaches.
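The two inner-level strategies can be contrasted on a toy ridge-regression bilevel problem. This is a minimal sketch under stated assumptions: the finite-difference outer step, the quadratic inner problem, and all names are illustrative, not the methods analyzed in the paper.

```python
import numpy as np

def inner_steps(w, lam, X, y, k=20, lr=0.1):
    """k gradient steps on the ridge training loss for a fixed lam."""
    for _ in range(k):
        grad = X.T @ (X @ w - y) / len(y) + lam * w
        w = w - lr * grad
    return w

def tune(lam, X_tr, y_tr, X_val, y_val, outer_iters=30, warm=True, outer_lr=0.05):
    """Toy outer loop tuning lam via a finite-difference step.

    warm=True reuses the previous inner solution as the inner solver's
    starting point; warm=False restarts from zero every outer iteration.
    """
    w = np.zeros(X_tr.shape[1])
    for _ in range(outer_iters):
        start = w if warm else np.zeros(X_tr.shape[1])
        w = inner_steps(start, lam, X_tr, y_tr)
        w_p = inner_steps(start, lam + 1e-3, X_tr, y_tr)  # perturbed lam
        val = lambda wv: float(np.mean((X_val @ wv - y_val) ** 2))
        g = (val(w_p) - val(w)) / 1e-3     # d(val loss)/d(lam) estimate
        lam = max(1e-6, lam - outer_lr * g)
    return lam, val(w)

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 5))
y = X @ rng.normal(size=5) + 0.1 * rng.normal(size=40)
X_tr, y_tr, X_val, y_val = X[:30], y[:30], X[30:], y[30:]
lam_w, loss_w = tune(0.5, X_tr, y_tr, X_val, y_val, warm=True)
lam_c, loss_c = tune(0.5, X_tr, y_tr, X_val, y_val, warm=False)
```

The only difference between the two runs is the `start` variable, which is exactly the warm-start versus cold-start distinction the abstract discusses; on real models the paper's finding is that the warm-start variant tends to overfit the validation set more.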
Neural Networks, vol. 199, Article 108647.
Citations: 0
SPD-Net: A semantic partitioned transformer with dynamic graph network for improved skeleton-based gait recognition
IF 6.3 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2026-07-01 Epub Date: 2026-02-03 DOI: 10.1016/j.neunet.2026.108679
Priyanka D, Mala T
Gait recognition has gained prominence as a biometric modality owing to its unobtrusive and non-invasive nature. Existing methods primarily rely on silhouette-based representations, making them sensitive to variations in clothing, occlusion, and background noise. In contrast, model-based approaches utilize skeleton sequences to capture motion dynamics through joint connectivity, thereby reducing dependence on visual appearance. However, these approaches often rely on physically connected joints, limiting their ability to model semantically meaningful joint relationships. Transformer-based models mitigate this limitation by capturing long-range dependencies, but at the expense of substantial computational overhead. To address these challenges, this work proposes the Semantic Partitioned transformer with Dynamic Graph Network (SPD-Net) for robust gait recognition. SPD-Net integrates Dynamic Graph Convolutional Network (DGCN), Temporal Convolutional Network (TCN), and Semantic Partitioned Multi-head Self-Attention (SP-MSA) to enhance the representation of gait features. DGCN dynamically learns spatial correlations between joints, while TCN captures temporal dependencies. Furthermore, SP-MSA introduces a semantic partitioning strategy that selectively focuses on key joints and frames, significantly reducing computational complexity while preserving crucial gait patterns. This approach effectively models both physically neighboring and distant joint relationships, along with intra- and inter-frame correlations. Finally, a Joint-Part Mapping (JPM) module enhances the discriminative power of gait representations by capturing hierarchical joint relationships across multiple scales. Experimental evaluations on benchmark gait datasets show that SPD-Net surpasses prior state-of-the-art approaches, achieving improved robustness and accuracy across diverse gait recognition challenges.
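The complexity benefit of partitioned attention can be shown with a stripped-down sketch: attending only within semantic groups of joints costs O(p^2) per group instead of O(J^2) over all joints. This is single-head attention with no learned projections, an illustrative simplification; the partition choice and all names are assumptions, not the paper's SP-MSA.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # numerically stable
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def partitioned_self_attention(X, partitions):
    """Self-attention computed independently inside each joint partition.

    X: (J, D) per-joint features; partitions: list of index lists that
    cover the joints. Each group only attends to itself, so the cost is
    the sum of the squared group sizes rather than J squared.
    """
    out = np.zeros_like(X)
    for idx in partitions:
        sub = X[idx]                                   # (p, D)
        attn = softmax(sub @ sub.T / np.sqrt(X.shape[1]))
        out[idx] = attn @ sub                          # convex mix per group
    return out

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))             # 6 joints, 4-dim features
upper, lower = [0, 1, 2], [3, 4, 5]     # e.g. arm joints vs leg joints
out = partitioned_self_attention(X, [upper, lower])
```

Because each attention row is a convex combination of that group's features, a group whose joints carry identical features is returned unchanged, which makes the mechanism easy to sanity-check.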
Neural Networks, vol. 199, Article 108679.
Citations: 0