Pub Date: 2026-07-01 | Epub Date: 2026-02-10 | DOI: 10.1016/j.neunet.2026.108715
Abdullah Elen , Emre Avuçlu
Deep learning (DL) has achieved remarkable success in processing unstructured data such as images, text, and audio, yet its application to tabular numerical datasets remains challenging due to the lack of inherent spatial structure. In this study, we present a novel approach for transforming numerical tabular data into grayscale image representations, enabling the effective use of convolutional neural networks and other DL architectures on traditionally numerical datasets. The method normalizes features, organizes them into square image matrices, and generates labeled images for classification. Experiments were conducted on four publicly available datasets: Rice MSC Dataset (RMSCD), Optical Recognition of Handwritten Digits (Optdigits), TUNADROMD, and Spambase. Transformed datasets were evaluated using Residual Network (ResNet-18) and Directed Acyclic Graph Neural Network (DAG-Net) models with 5-fold cross-validation. The DAG-Net model achieved accuracies of 99.91% on RMSCD, 99.77% on Optdigits, 98.84% on TUNADROMD, and 93.06% on Spambase, demonstrating the efficacy of the proposed transformation. Additional ablation studies and efficiency analyses highlight improvements in training performance and computational cost. The results indicate that the proposed image-based transformation provides a practical and efficient strategy for integrating numerical datasets into deep learning workflows, broadening the applicability of DL techniques across diverse domains. The implementation is released as open-source software to facilitate reproducibility and further research.
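The core transformation (normalize features, arrange them into a square matrix, scale to grayscale) can be sketched as follows. This is a minimal illustration of the general idea, not the authors' exact algorithm: the normalization granularity (per-row here; the paper may normalize per feature over the whole dataset) and the zero-padding scheme are assumptions.

```python
import math

def row_to_grayscale(features, lo=None, hi=None):
    """Map one tabular row to a square grayscale image (list of rows, values 0-255).

    Assumes min-max normalization over the given row; pass dataset-wide
    `lo`/`hi` bounds to normalize per feature instead.
    """
    lo = min(features) if lo is None else lo
    hi = max(features) if hi is None else hi
    span = (hi - lo) or 1.0                       # avoid division by zero
    pixels = [round(255 * (v - lo) / span) for v in features]
    side = math.ceil(math.sqrt(len(pixels)))      # smallest square that fits
    pixels += [0] * (side * side - len(pixels))   # zero-pad the remainder
    return [pixels[r * side:(r + 1) * side] for r in range(side)]

# 5 features -> a 3x3 image, with four zero-padded pixels
img = row_to_grayscale([0.1, 0.5, 0.9, 0.3, 0.7])
```

Each labeled image produced this way can then be fed to a standard CNN such as ResNet-18.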
Transforming tabular data into images for deep learning models. Neural Networks 199 (2026), Article 108715.
Pub Date: 2026-07-01 | Epub Date: 2026-02-02 | DOI: 10.1016/j.neunet.2026.108681
Filippo Aglietti , Francesco Della Santa , Andrea Piano , Virginia Aglietti
We propose Gradient-Informed Neural Networks (GradINNs), a methodology that can be used to efficiently approximate a wide range of functions in low-data regimes, when only general prior beliefs are available, a condition that is often encountered in complex engineering problems.
GradINNs incorporate prior beliefs about the first-order derivatives of the target function to constrain the behavior of its gradient, thus implicitly shaping it, without requiring explicit access to the target function’s derivatives. This is achieved by using two neural networks: one modeling the target function and a second, auxiliary network expressing the prior beliefs about the first-order derivatives (e.g., smoothness, oscillations). A customized loss function enables the training of the first network while enforcing gradient constraints derived from the auxiliary network; at the same time, it allows these constraints to be relaxed in accordance with the training data. Numerical experiments demonstrate the advantages of GradINNs, particularly in low-data regimes, with results showing strong performance compared to standard neural networks across the tested scenarios, including synthetic benchmark functions and real-world engineering tasks.
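The two-network loss can be illustrated with a minimal sketch: a data-fit term plus a term that pulls the model's first derivative toward the auxiliary prior. The paper trains both terms through backpropagation; here, purely for illustration, the derivative of the candidate model is estimated by central finite differences, and `f`/`g` are plain scalar callables rather than neural networks.

```python
def gradinn_loss(f, g, xs, ys, lam=1.0, eps=1e-4):
    """GradINN-style composite loss (illustrative sketch, not the paper's exact form).

    f: candidate model of the target function (scalar -> scalar)
    g: auxiliary model encoding the prior belief about f's first derivative
    The data term fits f to (xs, ys); the gradient term pushes f' toward g,
    with f' estimated by central finite differences instead of autograd.
    """
    data = sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
    grad = sum(
        ((f(x + eps) - f(x - eps)) / (2 * eps) - g(x)) ** 2 for x in xs
    ) / len(xs)
    return data + lam * grad

# f(x) = x^2 fits y = x^2 exactly, and its derivative 2x matches the
# prior g(x) = 2x, so both terms are (numerically) near zero.
xs = [0.0, 0.5, 1.0]
ys = [x * x for x in xs]
loss = gradinn_loss(lambda x: x * x, lambda x: 2 * x, xs, ys)
```

Raising `lam` tightens the gradient constraint; lowering it lets the training data override the prior, mirroring the relaxation behavior described above.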
Gradient-informed neural networks: Embedding prior beliefs for learning in low-data scenarios. Neural Networks 199 (2026), Article 108681.
Spiking neural networks (SNNs) are designed for low-power neuromorphic computing. A widely adopted hybrid paradigm decouples feature extraction from classification to improve biological plausibility and modularity. However, this decoupling concentrates decision making in the downstream classifier, which in many systems becomes the limiting factor for both accuracy and efficiency. Hand-preset, fixed topologies risk either redundancy or insufficient capacity, and surrogate-gradient training remains computationally costly. Biological neurogenesis is the brain’s mechanism for adaptively adding new neurons to build efficient, task-specific circuits. Inspired by this process, we propose the neurogenesis-inspired spiking neural network (NG-SNN), a dynamic adaptive framework that uses two key innovations to address these challenges. Specifically, we first introduce a supervised incremental construction mechanism that dynamically grows a task-optimal structure by selectively integrating neurons under a contribution criterion. Second, we devise an activity-dependent analytical learning method that replaces iterative optimization with single-shot and adaptive weight computation for each structural update, drastically improving training efficiency. Therefore, NG-SNN uniquely integrates dynamic structural adaptation with efficient non-iterative learning, forming a self-organizing and rapidly converging classification system. Moreover, this neurogenesis-driven process endows NG-SNN with a highly compact structure that requires significantly fewer parameters. Extensive experiments demonstrate that our NG-SNN matches or outperforms its competitors on diverse datasets, without the overhead of iterative training and manual architecture tuning.
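The two mechanisms, growth under a contribution criterion and single-shot analytical weights, can be sketched in a toy stagewise-regression form. This is not NG-SNN itself (no spiking dynamics, single output); the gain score and stopping rule are illustrative stand-ins for the paper's contribution criterion.

```python
def grow_network(X, y, candidates, tol=1e-3, max_neurons=10):
    """Neurogenesis-style incremental construction (toy sketch, not NG-SNN itself).

    At each step, every candidate hidden unit h is scored by how much of the
    current residual it can explain; its output weight is computed in closed
    form (a 1-D least-squares projection), not by iterative training. A
    candidate is integrated only if its contribution exceeds `tol`.
    """
    residual = list(y)
    network = []  # list of (unit, output_weight) pairs
    for _ in range(max_neurons):
        best = None
        for h in candidates:
            acts = [h(x) for x in X]
            hh = sum(a * a for a in acts) or 1e-12
            w = sum(a * r for a, r in zip(acts, residual)) / hh  # analytical weight
            gain = w * w * hh  # squared-error reduction if this unit is integrated
            if best is None or gain > best[0]:
                best = (gain, h, w, acts)
        gain, h, w, acts = best
        if gain < tol:  # contribution criterion: stop growing the structure
            break
        network.append((h, w))
        residual = [r - w * a for r, a in zip(residual, acts)]
    return network, sum(r * r for r in residual)
```

On data generated by one of the candidates, the sketch integrates exactly that unit and then stops, yielding the compact structure the abstract describes.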
NG-SNN: A neurogenesis-inspired dynamic adaptive framework for efficient spike classification. Jing Tang, Depeng Li, Zhenyu Zhang, Zhigang Zeng. Neural Networks 199 (2026-07-01), Article 108656. DOI: 10.1016/j.neunet.2026.108656.
Graph Neural Networks (GNNs) have achieved strong performance in structured data modeling such as node classification. However, real-world graphs often exhibit heterogeneous neighborhoods and complex feature distributions, while mainstream approaches rely on many learnable parameters and apply uniform aggregation to all neighbors. This lack of explicit modeling for structural diversity often leads to representation homogenization, semantic degradation, and poor adaptability under challenging conditions such as low supervision or class imbalance. To address these limitations, we propose a trainable-parameter-free graph neural network framework, termed the Structural-Diversity Graph Neural Network (SDGNN), which operationalizes structural diversity in message passing. At its core, the Structural-Diversity Message Passing (SDMP) mechanism performs within-group statistics followed by cross-group selection, thereby capturing neighborhood heterogeneity while stabilizing feature semantics. SDGNN further incorporates complementary structure-driven and feature-driven partitioning strategies, together with a normalized-propagation-based global structural enhancer, to enhance adaptability across diverse graphs. Extensive experiments on nine public benchmark datasets and an interdisciplinary PubMed citation network demonstrate that SDGNN consistently outperforms mainstream GNNs, especially under low supervision, class imbalance, and cross-domain transfer. The full implementation, including code and configurations, is publicly available at: https://github.com/mingyue15694/SGDNN/tree/main.
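A minimal sketch of the "within-group statistics, then cross-group selection" pattern, with no learnable parameters, might look as follows. The partition function and the alignment-based selection rule here are assumptions for illustration; the paper's structure-driven and feature-driven partitioning and its selection criterion are more elaborate.

```python
def sdmp_update(node_feat, neigh_feats, group_of):
    """Structural-diversity-style message passing, trainable-parameter-free (toy sketch).

    Neighbors are partitioned by `group_of` (a stand-in for the paper's
    partitioning strategies); each group is summarized by its mean
    (within-group statistics), then one summary is selected across groups
    (here: the one most aligned with the node's own feature) and averaged
    with the node feature. No learnable weights are involved.
    """
    groups = {}
    for f in neigh_feats:
        groups.setdefault(group_of(f), []).append(f)
    if not groups:
        return list(node_feat)
    means = [[sum(col) / len(fs) for col in zip(*fs)] for fs in groups.values()]
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    chosen = max(means, key=lambda m: dot(m, node_feat))  # cross-group selection
    return [(a + b) / 2 for a, b in zip(node_feat, chosen)]
```

Because dissimilar neighbor groups are summarized separately before selection, a heterogeneous neighborhood does not average away the signal the way uniform aggregation would.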
Trainable-parameter-free structural-diversity message passing for graph neural networks. Mingyue Kong, Yinglong Zhang, Chengda Xu, Xuewen Xia, Xing Xu. Neural Networks 199 (2026-07-01), Article 108711. DOI: 10.1016/j.neunet.2026.108711.
Alzheimer’s disease (AD) is a currently incurable neurodegenerative disease, with early detection representing a high research priority. AD is characterized by progressive cognitive decline accompanied by alterations in brain functional connectivity. Because functional connectivity data naturally take a graph-like structure, graph neural networks (GNNs) have emerged as important methods for brain function analysis and disease prediction in recent years. However, most GNN methods are limited by the information loss caused by traditional functional connectivity calculation, as well as by the noise common in functional magnetic resonance imaging (fMRI) data. This paper proposes a graph-generation-based AD classification model using resting-state fMRI to address these issues. The connectome generation network with large kernels for GNN-based AD analysis (CGLK-GNN) contains a graph generation block and a GNN prediction block. The graph generation block employs decoupled convolutional networks with large kernels to extract comprehensive temporal features while preserving sequential dependencies, in contrast to previous generative GNN approaches. This module constructs the connectome graph by encoding both edge-wise correlations and node-embedded temporal features, thereby utilizing the generated graph more effectively. The subsequent GNN prediction block adopts an efficient architecture to learn these enhanced representations and perform final AD stage classification. Through independent cohort validations, CGLK-GNN outperforms state-of-the-art GNN and rsfMRI-based AD classifiers in differentiating AD status. Furthermore, CGLK-GNN demonstrates high clinical value by learning clinically relevant connectome node and connectivity features from two independent datasets.
CGLK-GNN: A connectome generation network with large kernels for GNN-based Alzheimer’s disease analysis. Wenqi Zhu, Zhong Yin, Yinghua Fu, Alzheimer’s Disease Neuroimaging Initiative. Neural Networks 199 (2026-07-01), Article 108689. DOI: 10.1016/j.neunet.2026.108689.
Pub Date: 2026-07-01 | Epub Date: 2026-01-19 | DOI: 10.1016/j.neunet.2026.108577
Siqi Wang , Guangpu Wang , Xinwang Liu , Jie Liu , Jiyuan Liu , Siwei Wang
Image anomaly detection (IAD) usually requires a separate training set to build an inductive model, which then infers on the test set. However, the cost of collecting and labeling training images has inspired zero-shot IAD (ZS-IAD), which directly processes the test set without any training set. Most ZS-IAD methods resort to pre-trained foundation models (e.g., CLIP), which rely on external prompts and lack adaptation to the target IAD scene. By contrast, context-guided ZS-IAD methods have recently attracted growing interest: they not only avoid using external prompts by exploiting scene-specific context clues within unlabeled images, but also achieve superior performance to prior ZS-IAD counterparts. Unfortunately, existing context-guided ZS-IAD methods suffer from two vital flaws: the absence of a training set forces them to set key hyperparameters blindly, which leads to unreliable performance, and they do not actively handle mixed anomalies that disturb the learning process. To this end, we propose to automate context-guided ZS-IAD with a novel Self-Constrained Anomaly Detector (SCAD), which makes the following contributions: (1) We propose a novel self-constrained mechanism that can automatically determine proper values for key hyperparameters. (2) We design a new online self-constrained sampler that terminates the time-consuming sampling process at a proper stopping point, significantly reducing the computational cost. (3) We develop self-constrained normality refinement strategies that can actively constrain anomalies’ impact and automatically rectify the stopping threshold. To the best of our knowledge, this is also the first work that addresses hyperparameter selection in the IAD realm. Experiments show that SCAD not only yields comparable performance to classic IAD solutions, but also matches ZS-IAD solutions enhanced by hindsight knowledge (i.e., hyperparameters validated on the test set).
SCAD: A self-constrained solution to automate context-guided zero-shot image anomaly detection. Neural Networks 199 (2026), Article 108577.
Knowledge Distillation (KD) is a critical technique for model compression, facilitating the transfer of implicit knowledge from a teacher model to a more compact, deployable student model. KD can generally be divided into two categories: logit distillation and feature distillation. Feature distillation has been predominant in achieving state-of-the-art (SOTA) performance, but recent advances in logit distillation have begun to narrow the gap. We propose a Logit-guided Feature Distillation (LFD) framework that combines the strengths of both logit and feature distillation to enhance the efficacy of knowledge transfer, particularly leveraging the rich classification information inherent in logits for semantic segmentation tasks. Furthermore, it is observed that Deep Neural Networks (DNNs) only manifest task-relevant characteristics at sufficient depths, which may be a limiting factor in achieving higher accuracy. In this work, we introduce a collaborative distillation method that preemptively focuses on critical pixels and categories in the early stage. We employ logits from deep layers to generate fine-grained spatial masks that are directly conveyed to the feature distillation stage, thereby inducing spatial gradient disparities. Additionally, we generate class masks that dynamically modulate the weights of shallow auxiliary heads, ensuring that class-relevant features can be calibrated by the primary head. A novel shared auxiliary head distillation approach is also presented. Experiments on the Cityscapes, Pascal VOC, and CamVid datasets show that the proposed method achieves competitive performance while maintaining low memory usage. Our code will be released at https://github.com/fate2715/LFD.
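The logit-guided spatial masking can be sketched as a per-pixel weight derived from teacher confidence. The specific mask definition (1 minus the max softmax probability, up-weighting uncertain pixels) is an illustrative assumption; the paper's fine-grained masks are not necessarily this form.

```python
import math

def logit_guided_feature_loss(t_logits, t_feats, s_feats):
    """Logit-guided feature distillation loss (illustrative sketch of the idea).

    Per pixel, teacher logits yield a confidence score; the mask
    1 - max(softmax) up-weights uncertain (hard) pixels when matching
    student features to teacher features, inducing spatial gradient
    disparities across the image.
    """
    def softmax(z):
        m = max(z)
        e = [math.exp(v - m) for v in z]  # shift for numerical stability
        s = sum(e)
        return [v / s for v in e]

    loss, total = 0.0, 0.0
    for logits, tf, sf in zip(t_logits, t_feats, s_feats):
        mask = 1.0 - max(softmax(logits))  # hard pixels get larger weight
        loss += mask * sum((a - b) ** 2 for a, b in zip(tf, sf))
        total += mask
    return loss / (total or 1.0)
```

Under this weighting, a student error on a pixel the teacher is unsure about costs far more than the same error on a confidently classified pixel.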
Efficient semantic segmentation via logit-guided feature distillation. Xuyi Yu, Shang Lou, Yinghai Zhao, Huipeng Zhang, Kuizhi Mei. Neural Networks 199 (2026-07-01), Article 108663. DOI: 10.1016/j.neunet.2026.108663.
Pub Date: 2026-07-01 | Epub Date: 2026-01-29 | DOI: 10.1016/j.neunet.2026.108650
Aoyu Song , Afizan Azman , Shanzhi Gu , Fangjian Jiang , Jianchi Du , Tailong Wu , Mingyang Geng , Jia Li
Code refinement is a vital aspect of software development, involving the review and enhancement of code contributions made by developers. A critical challenge in this process arises from unclear or ambiguous review comments, which can hinder developers’ understanding of the required changes. Our preliminary study reveals that conversations between developers and reviewers often contain valuable information that can help resolve such ambiguous review suggestions. However, leveraging conversational data to address this issue poses two key challenges: (1) enabling the model to autonomously determine whether a review suggestion is ambiguous, and (2) effectively extracting the relevant segments from the conversation that can aid in resolving the ambiguity.
In this paper, we propose a novel method for addressing ambiguous review suggestions by leveraging conversations between reviewers and developers. To tackle the above two challenges, we introduce an Ambiguous Discriminator that uses multi-task learning to classify ambiguity and generate type-aware confusion points from a GPT-4-labeled dataset. These confusion points guide a Type-Driven Multi-Strategy Retrieval Framework that applies targeted strategies based on categories such as Inaccurate Localization, Unclear Expression, and Lack of Specific Guidance to extract actionable information from the conversation context. To support this, we construct a semantic auxiliary instruction library containing spatial indicators, clarification patterns, and action-oriented verbs, enabling precise alignment between review suggestions and informative conversation segments. Our method is evaluated on two widely used code refinement datasets, CodeReview and CodeReview-New, where we demonstrate that it significantly enhances the performance of various state-of-the-art models, including TransReview, T5-Review, CodeT5, CodeReviewer, and ChatGPT. Furthermore, we explore in depth how conversational information improves the model’s ability to address fine-grained situations, and we conduct human evaluations to assess the accuracy of ambiguity detection and the correctness of generated confusion points. We are the first to introduce the issue of ambiguous review suggestions in the code refinement domain and propose a solution that not only addresses these challenges but also lays the foundation for future research. Our method provides valuable insights into improving the clarity and effectiveness of review suggestions, offering a promising direction for advancing code refinement techniques.
Resolving ambiguity in code refinement via conidfine: A conversationally-aware framework with disambiguation and targeted retrieval
Aoyu Song , Afizan Azman , Shanzhi Gu , Fangjian Jiang , Jianchi Du , Tailong Wu , Mingyang Geng , Jia Li
Neural Networks, Volume 199, Article 108650. DOI: 10.1016/j.neunet.2026.108650. Pub Date: 2026-07-01.
Pub Date: 2026-07-01 · Epub Date: 2026-01-29 · DOI: 10.1016/j.neunet.2026.108647
Yubo Zhou , Jun Shu , Chengli Tan , Haishan Ye , Quanziang Wang , Junmin Liu , Deyu Meng , Ivor Tsang , Guang Dai
Bilevel optimization (BO) has garnered increasing attention in hyperparameter tuning. BO methods are commonly employed with one of two strategies for the inner level: cold-start, which uses a fixed initialization for the inner solver at every outer step, and warm-start, which uses the last inner approximate solution as the starting point. Previous studies mainly argued that warm-start exhibits better convergence properties, whereas we provide a detailed comparison of the two strategies from a generalization perspective. Our findings indicate that, compared to the cold-start strategy, the warm-start strategy exhibits worse generalization performance, such as more severe overfitting on the validation set. To explain this, we establish generalization bounds for the two strategies and reveal that warm-start yields a worse generalization upper bound due to its closer interaction with the inner-level dynamics, naturally leading to poor generalization performance. Inspired by these theoretical results, we propose several approaches, most notably a novel random perturbation initialization method, to enhance the generalization capability of the warm-start strategy and narrow its gap with cold-start. Experiments validate the soundness of our theoretical analysis and the effectiveness of the proposed approaches.
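The cold-start/warm-start distinction can be made concrete on a toy bilevel problem. The sketch below is illustrative only: it assumes a one-dimensional quadratic inner objective, a hand-rolled outer update toward a fixed validation target, and a Gaussian perturbation as just one plausible reading of the paper's random perturbation initialization; none of this is the paper's actual setup.

```python
import random

def inner_solve(lam, w0, steps=20, lr=0.1):
    """Gradient descent on a toy inner objective f(w) = (w - lam)**2,
    whose exact minimizer is w* = lam."""
    w = w0
    for _ in range(steps):
        w -= lr * 2.0 * (w - lam)
    return w

def tune(strategy, outer_steps=10, seed=0):
    """Outer loop over hyperparameter lam, using one of three
    inner-initialization strategies."""
    rng = random.Random(seed)
    lam, w = 2.0, 0.0
    for _ in range(outer_steps):
        if strategy == "cold":
            w0 = 0.0                      # fixed initialization each time
        elif strategy == "warm":
            w0 = w                        # reuse last inner solution
        else:                             # "warm+perturb": warm-start plus
            w0 = w + rng.gauss(0.0, 0.1)  # random noise (assumed Gaussian)
        w = inner_solve(lam, w0)
        # Toy outer update: descend g(lam) = (w*(lam) - 1)^2 with w* ~ w,
        # i.e. pull lam toward a validation target of 1.0.
        lam -= 0.5 * 2.0 * (w - 1.0)
    return lam, w

for s in ("cold", "warm", "warm+perturb"):
    lam, w = tune(s)
    print(s, round(lam, 3), round(w, 3))
```

All three variants reach the same toy optimum here; the generalization gap the paper studies only shows up with a finite validation set that can be overfit, which this one-parameter example deliberately omits.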
Warm-start or cold-start? A comparison of generalizability in gradient-based hyperparameter tuning
Neural Networks, Volume 199, Article 108647.
Pub Date: 2026-07-01 · Epub Date: 2026-02-03 · DOI: 10.1016/j.neunet.2026.108679
Priyanka D, Mala T
Gait recognition has gained prominence as a biometric modality owing to its unobtrusive and non-invasive nature. Existing methods primarily rely on silhouette-based representations, making them sensitive to variations in clothing, occlusion, and background noise. In contrast, model-based approaches utilize skeleton sequences to capture motion dynamics through joint connectivity, thereby reducing dependence on visual appearance. However, these approaches often rely on physically connected joints, limiting their ability to model semantically meaningful joint relationships. Transformer-based models mitigate this limitation by capturing long-range dependencies, but at the expense of substantial computational overhead. To address these challenges, this work proposes the Semantic Partitioned transformer with Dynamic Graph Network (SPD-Net) for robust gait recognition. SPD-Net integrates Dynamic Graph Convolutional Network (DGCN), Temporal Convolutional Network (TCN), and Semantic Partitioned Multi-head Self-Attention (SP-MSA) to enhance the representation of gait features. DGCN dynamically learns spatial correlations between joints, while TCN captures temporal dependencies. Furthermore, SP-MSA introduces a semantic partitioning strategy that selectively focuses on key joints and frames, significantly reducing computational complexity while preserving crucial gait patterns. This approach effectively models both physically neighboring and distant joint relationships, along with intra- and inter-frame correlations. Finally, a Joint-Part Mapping (JPM) module enhances the discriminative power of gait representations by capturing hierarchical joint relationships across multiple scales. Experimental evaluations on benchmark gait datasets show that SPD-Net surpasses prior state-of-the-art approaches, achieving improved robustness and accuracy across diverse gait recognition challenges.
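The computational saving from semantic partitioning can be illustrated with a minimal attention sketch: if joints attend only within their own group, the number of pairwise attention scores drops from N² to the sum of squared group sizes. The 2-D toy features, the two-group partition, and all function names below are assumptions for illustration, not the paper's SP-MSA.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def self_attention(feats):
    """Plain dot-product self-attention over a list of feature vectors."""
    out = []
    for q in feats:
        scores = softmax([sum(a * b for a, b in zip(q, k)) for k in feats])
        out.append([sum(w * k[d] for w, k in zip(scores, feats))
                    for d in range(len(q))])
    return out

def partitioned_attention(feats, partitions):
    """Run self-attention independently inside each semantic partition:
    a joint only attends to joints in the same group, so pairwise
    scores fall from N**2 to sum(len(g)**2 for g in partitions)."""
    out = [None] * len(feats)
    for group in partitions:
        sub = self_attention([feats[i] for i in group])
        for i, v in zip(group, sub):
            out[i] = v
    return out

# Toy example: 4 joints with 2-D features, split into two hypothetical
# semantic groups (e.g. upper body vs lower body).
joints = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
groups = [[0, 1], [2, 3]]
print(partitioned_attention(joints, groups))
```

With 4 joints, full attention computes 16 pairwise scores while this partition computes 8; the real SP-MSA additionally selects key joints and frames and operates across time, which this sketch omits.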
SPD-Net: A semantic partitioned transformer with dynamic graph network for improved skeleton-based gait recognition
Neural Networks, Volume 199, Article 108679.