Pub Date: 2026-01-30 | DOI: 10.1016/j.neunet.2026.108670
Tianyu Wang , Maite Zhang , Mingxuan Lu , Mian Li
In real-world applications, tabular datasets often evolve over time, leading to temporal shift that degrades neural network performance over long horizons. Most existing temporal encoding or adaptation solutions treat time cues as fixed auxiliary variables at a single scale. Motivated by the multi-horizon nature of temporal shifts with heterogeneous temporal dynamics, this paper presents TARS (Temporal Abstraction with Routed Scales), a novel plug-and-play method for robust tabular learning under temporal shift, applicable to various deep learning backbones. First, an explicit temporal encoder decomposes timestamps into short-term recency, mid-term periodicity, and long-term contextual embeddings with structured memory. Next, an implicit drift encoder tracks higher-order distributional statistics at the same aligned timescales, producing drift signals that reflect ongoing temporal dynamics. These signals drive a drift-aware routing mechanism that adaptively weights the explicit temporal pathways, emphasizing the timescales most relevant under current conditions. Finally, a feature-temporal fusion layer integrates the routed temporal representation with the original features, injecting a context-aware bias. Extensive experiments on eight real-world datasets from the TabReD benchmark show that TARS consistently outperforms competitive baseline methods across various backbone models, achieving average relative improvements of up to +2.38% on MLP and +4.08% on DCNv2, among others. Ablation studies verify the complementary contributions of all four modules. These results highlight the effectiveness of TARS for improving the temporal robustness of existing deep tabular models.
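The drift-aware routing described above can be sketched as a softmax gate over the three timescale embeddings, with routing logits produced from the drift signal. This is an illustrative sketch only; the function names, shapes, and the linear map `W` are assumptions, not the authors' implementation.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def route_timescales(short, mid, long_, drift, W):
    """Fuse short/mid/long-term embeddings with drift-driven weights.
    W (hypothetical) maps the drift signal to one routing logit per timescale."""
    logits = W @ drift                       # shape (3,)
    w = softmax(logits)                      # routing weights, sum to 1
    fused = w[0] * short + w[1] * mid + w[2] * long_
    return fused, w

rng = np.random.default_rng(0)
d, k = 8, 4                                  # embedding dim, drift-signal dim (assumed)
short, mid, long_ = (rng.normal(size=d) for _ in range(3))
drift = rng.normal(size=k)
W = rng.normal(size=(3, k))
fused, w = route_timescales(short, mid, long_, drift, W)
```

The fused representation would then be passed to the feature-temporal fusion layer alongside the raw features.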
Title: Multi-timescale representation with adaptive routing for deep tabular learning under temporal shift (Neural Networks, vol. 199, Article 108670)
Knowledge Distillation (KD) is a critical technique for model compression, facilitating the transfer of implicit knowledge from a teacher model to a more compact, deployable student model. KD can be generally divided into two categories: logit distillation and feature distillation. Feature distillation has been predominant in achieving state-of-the-art (SOTA) performance, but recent advances in logit distillation have begun to narrow the gap. We propose a Logit-guided Feature Distillation (LFD) framework that combines the strengths of both logit and feature distillation to enhance the efficacy of knowledge transfer, particularly leveraging the rich classification information inherent in logits for semantic segmentation tasks. Furthermore, it is observed that Deep Neural Networks (DNNs) only manifest task-relevant characteristics at sufficient depths, which may be a limiting factor in achieving higher accuracy. In this work, we introduce a collaborative distillation method that preemptively focuses on critical pixels and categories in the early stage. We employ logits from deep layers to generate fine-grained spatial masks that are directly conveyed to the feature distillation stage, thereby inducing spatial gradient disparities. Additionally, we generate class masks that dynamically modulate the weights of shallow auxiliary heads, ensuring that class-relevant features can be calibrated by the primary head. A novel shared auxiliary head distillation approach is also presented. Experiments on the Cityscapes, Pascal VOC, and CamVid datasets show that the proposed method achieves competitive performance while maintaining low memory usage. Our code will be released at https://github.com/fate2715/LFD.
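One simple way to realize a logit-guided spatial mask is to weight per-pixel feature distillation by the teacher's classification confidence (here, one minus normalized entropy). This is a minimal sketch under assumed shapes, not the paper's exact mask construction.

```python
import numpy as np

def confidence_mask(logits):
    """Per-pixel mask from teacher logits: 1 - normalized entropy,
    so confidently classified pixels get more distillation weight.
    (Illustrative; the paper's fine-grained masks may differ.)"""
    p = np.exp(logits - logits.max(axis=0, keepdims=True))
    p /= p.sum(axis=0, keepdims=True)
    ent = -(p * np.log(p + 1e-12)).sum(axis=0)
    return 1.0 - ent / np.log(logits.shape[0])

def masked_feature_loss(f_student, f_teacher, mask):
    # Weight the per-pixel feature MSE by the spatial mask.
    return float((mask * ((f_student - f_teacher) ** 2).mean(axis=0)).mean())

rng = np.random.default_rng(1)
C, H, W = 5, 4, 4                       # classes, spatial dims (assumed)
logits = rng.normal(size=(C, H, W))
logits[:, 0, 0] = 0.0                   # a maximally uncertain (uniform) pixel
logits[0, 0, 1] = 10.0                  # a highly confident pixel
mask = confidence_mask(logits)
loss = masked_feature_loss(rng.normal(size=(8, H, W)),
                           rng.normal(size=(8, H, W)), mask)
```

Uncertain pixels receive mask values near zero, so the feature loss concentrates gradient on pixels the teacher classifies decisively.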
Title: Efficient semantic segmentation via logit-guided feature distillation
Authors: Xuyi Yu, Shang Lou, Yinghai Zhao, Huipeng Zhang, Kuizhi Mei
Pub Date: 2026-01-29 | DOI: 10.1016/j.neunet.2026.108663 (Neural Networks, vol. 199, Article 108663)
Pub Date: 2026-01-29 | DOI: 10.1016/j.neunet.2026.108650
Aoyu Song , Afizan Azman , Shanzhi Gu , Fangjian Jiang , Jianchi Du , Tailong Wu , Mingyang Geng , Jia Li
Code refinement is a vital aspect of software development, involving the review and enhancement of code contributions made by developers. A critical challenge in this process arises from unclear or ambiguous review comments, which can hinder developers’ understanding of the required changes. Our preliminary study reveals that conversations between developers and reviewers often contain valuable information that can help resolve such ambiguous review suggestions. However, leveraging conversational data to address this issue poses two key challenges: (1) enabling the model to autonomously determine whether a review suggestion is ambiguous, and (2) effectively extracting the relevant segments from the conversation that can aid in resolving the ambiguity.
In this paper, we propose a novel method for addressing ambiguous review suggestions by leveraging conversations between reviewers and developers. To tackle the above two challenges, we introduce an Ambiguous Discriminator that uses multi-task learning to classify ambiguity and generate type-aware confusion points from a GPT-4-labeled dataset. These confusion points guide a Type-Driven Multi-Strategy Retrieval Framework that applies targeted strategies based on categories such as Inaccurate Localization, Unclear Expression, and Lack of Specific Guidance to extract actionable information from the conversation context. To support this, we construct a semantic auxiliary instruction library containing spatial indicators, clarification patterns, and action-oriented verbs, enabling precise alignment between review suggestions and informative conversation segments. Our method is evaluated on two widely used code refinement datasets, CodeReview and CodeReview-New, where we demonstrate that it significantly enhances the performance of various state-of-the-art models, including TransReview, T5-Review, CodeT5, CodeReviewer, and ChatGPT. Furthermore, we explore in depth how conversational information improves the model’s ability to address fine-grained situations, and we conduct human evaluations to assess the accuracy of ambiguity detection and the correctness of generated confusion points. We are the first to introduce the issue of ambiguous review suggestions in the code refinement domain and propose a solution that not only addresses these challenges but also sets the foundation for future research. Our method provides valuable insights into improving the clarity and effectiveness of review suggestions, offering a promising direction for advancing code refinement techniques.
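The category-specific retrieval idea can be sketched as keyword scoring against a small indicator library. The terms below are hypothetical stand-ins; the paper's semantic auxiliary instruction library is far richer than this toy dictionary.

```python
# Hypothetical indicator terms per ambiguity category (illustrative only).
LIBRARY = {
    "Inaccurate Localization": ["line", "function", "above", "below"],
    "Unclear Expression": ["mean", "clarify", "in other words"],
    "Lack of Specific Guidance": ["should", "instead", "replace", "use"],
}

def retrieve(segments, category):
    """Rank conversation segments by how many category-specific
    indicator terms they contain; return only matching segments."""
    terms = LIBRARY[category]
    scored = sorted(
        ((sum(t in s.lower() for t in terms), s) for s in segments),
        key=lambda pair: -pair[0],
    )
    return [s for score, s in scored if score > 0]

conversation = [
    "I think the bug is in the parse function, a few lines above.",
    "Thanks for the quick review!",
    "You should use a dict lookup instead of the nested loops.",
]
hits = retrieve(conversation, "Lack of Specific Guidance")
```

A real system would replace substring matching with the learned, type-driven strategies the paper describes, but the dispatch-by-category structure is the same.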
Title: Resolving ambiguity in code refinement via conidfine: A conversationally-aware framework with disambiguation and targeted retrieval (Neural Networks, vol. 199, Article 108650)
Pub Date: 2026-01-29 | DOI: 10.1016/j.neunet.2026.108647
Yubo Zhou , Jun Shu , Chengli Tan , Haishan Ye , Quanziang Wang , Junmin Liu , Deyu Meng , Ivor Tsang , Guang Dai
Bilevel optimization (BO) has garnered increasing attention in hyperparameter tuning. BO methods are commonly employed with two distinct strategies for the inner level: cold-start, which uses a fixed initialization, and warm-start, which uses the last inner approximate solution as the starting point for the inner solver each time. Previous studies mainly stated that warm-start exhibits better convergence properties, whereas we provide a detailed comparison of the two strategies from a generalization perspective. Our findings indicate that, compared to the cold-start strategy, the warm-start strategy exhibits worse generalization performance, such as more severe overfitting on the validation set. To explain this, we establish generalization bounds for the two strategies. We reveal that the warm-start strategy produces a worse generalization upper bound due to its closer interaction with the inner-level dynamics, naturally leading to poorer generalization performance. Inspired by these theoretical results, we propose several approaches to enhance the generalization capability of the warm-start strategy and narrow its gap with cold-start, in particular a novel random perturbation initialization method. Experiments validate the soundness of our theoretical analysis and the effectiveness of the proposed approaches.
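The two inner-level strategies can be illustrated on a toy quadratic inner problem, min_w (w - a)^2 + lam*w^2, solved by gradient descent across a sweep of hyperparameter values. The loss, step counts, and sweep are assumptions for illustration; both strategies converge, and warm-start (reusing the previous solution) starts each solve closer to the new optimum.

```python
def inner_grad(w, lam, a=1.0):
    # Gradient of the toy inner loss (w - a)^2 + lam * w^2.
    return 2.0 * (w - a) + 2.0 * lam * w

def solve_inner(w0, lam, steps=20, lr=0.1):
    w = w0
    for _ in range(steps):
        w -= lr * inner_grad(w, lam)
    return w

def sweep(strategy, lams):
    """Solve the inner problem for each hyperparameter value, either
    restarting from 0 (cold) or reusing the last solution (warm)."""
    w, sols = 0.0, []
    for lam in lams:
        w0 = 0.0 if strategy == "cold" else w
        w = solve_inner(w0, lam)
        sols.append(w)
    return sols

lams = [0.1, 0.2, 0.5]
cold = sweep("cold", lams)
warm = sweep("warm", lams)
# Analytic inner solution for this loss: w*(lam) = 1 / (1 + lam).
```

This only demonstrates the convergence advantage the prior literature attributes to warm-start; the paper's point is that this closer coupling to the inner dynamics is exactly what worsens the generalization bound.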
Title: Warm-start or cold-start? A comparison of generalizability in gradient-based hyperparameter tuning (Neural Networks, vol. 199, Article 108647)
Spiking neural networks (SNNs) are designed for low-power neuromorphic computing. A widely adopted hybrid paradigm decouples feature extraction from classification to improve biological plausibility and modularity. However, this decoupling concentrates decision making in the downstream classifier, which in many systems becomes the limiting factor for both accuracy and efficiency. Hand-preset, fixed topologies risk either redundancy or insufficient capacity, and surrogate-gradient training remains computationally costly. Biological neurogenesis is the brain’s mechanism for adaptively adding new neurons to build efficient, task-specific circuits. Inspired by this process, we propose the neurogenesis-inspired spiking neural network (NG-SNN), a dynamic adaptive framework that uses two key innovations to address these challenges. Specifically, we first introduce a supervised incremental construction mechanism that dynamically grows a task-optimal structure by selectively integrating neurons under a contribution criterion. Second, we devise an activity-dependent analytical learning method that replaces iterative optimization with single-shot and adaptive weight computation for each structural update, drastically improving training efficiency. Therefore, NG-SNN uniquely integrates dynamic structural adaptation with efficient non-iterative learning, forming a self-organizing and rapidly converging classification system. Moreover, this neurogenesis-driven process endows NG-SNN with a highly compact structure that requires significantly fewer parameters. Extensive experiments demonstrate that our NG-SNN matches or outperforms its competitors on diverse datasets, without the overhead of iterative training and manual architecture tuning.
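The supervised incremental construction plus analytical learning loop can be sketched as: propose a candidate unit, refit the readout in closed form (least squares here), and keep the unit only if it reduces error by more than a threshold. This is a toy sketch with assumed unit generation and a least-squares readout; the paper's contribution criterion and activity-dependent analytical rule differ.

```python
import numpy as np

def fit_readout(H, y):
    # Single-shot analytical (least-squares) output weights.
    W, *_ = np.linalg.lstsq(H, y, rcond=None)
    return W

def grow_network(X, y, max_units=20, tol=1e-4, seed=0):
    """Add random tanh units one at a time, keeping a unit only if it
    reduces training MSE by more than `tol` (a toy contribution criterion)."""
    rng = np.random.default_rng(seed)
    H = np.ones((X.shape[0], 1))                      # bias column only
    err = float(np.mean((H @ fit_readout(H, y) - y) ** 2))
    for _ in range(max_units):
        h = np.tanh(X @ rng.normal(size=(X.shape[1], 1)))
        H_cand = np.hstack([H, h])
        e_cand = float(np.mean((H_cand @ fit_readout(H_cand, y) - y) ** 2))
        if err - e_cand > tol:                        # keep only helpful units
            H, err = H_cand, e_cand
    return H.shape[1] - 1, err

rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))
y = np.tanh(X @ np.array([[1.0], [-2.0], [0.5]]))
baseline = float(np.var(y))                           # MSE of bias-only model
units, err = grow_network(X, y)
```

Because each structural update is a single linear solve rather than iterative backpropagation, training cost stays low and the network stays as small as the acceptance criterion allows.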
Title: NG-SNN: A neurogenesis-inspired dynamic adaptive framework for efficient spike classification
Authors: Jing Tang, Depeng Li, Zhenyu Zhang, Zhigang Zeng
Pub Date: 2026-01-29 | DOI: 10.1016/j.neunet.2026.108656 (Neural Networks, vol. 199, Article 108656)
Pub Date: 2026-01-29 | DOI: 10.1016/j.neunet.2026.108668
Yiping Song , Juhua Zhang , Zhiliang Tian , Taishu Sheng , Yuxin Yang , Minlie Huang , Xinwang Liu , Dongsheng Li
Data augmentation (DA) is a widely adopted approach for mitigating data insufficiency. Conducting DA in private domains requires privacy-preserving text generation, including anonymization or perturbation applied to sensitive textual data; these methods, however, lack formal protection guarantees. Existing Differential Privacy (DP) learning methods provide theoretical guarantees by adding calibrated noise to models or outputs. However, the large output space and model scales in text generation require substantial noise, which severely degrades synthesis quality. In this paper, we recast DP-based synthetic sample generation as DP-based sample discrimination. Specifically, we propose a DP-based DA framework with a large language model (LLM) and a DP-based discriminator for private-domain text generation. Our key idea is to (1) leverage LLMs to generate large-scale high-quality samples, (2) select synthesized samples fitting the private domain, and (3) align the label distribution with the private domain. To achieve this, we use knowledge distillation to construct a DP-based discriminator: teacher models, accessing private data, guide a student model to select samples under calibrated noise. A DP-based tutor further constrains the label distribution of synthesized samples with a low privacy budget.
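The teacher-guided selection under calibrated noise resembles PATE-style noisy vote aggregation: teachers trained on disjoint private shards vote keep/discard on each synthesized sample, Laplace noise is added to the counts, and the noisy majority decides. This is a sketch of that general mechanism, not the paper's exact discriminator or its privacy accounting.

```python
import numpy as np

def dp_select(teacher_votes, epsilon, rng):
    """Aggregate binary keep/discard votes with Laplace noise of scale
    1/epsilon on the vote counts (PATE-style noisy-max over two options).
    teacher_votes: (n_teachers, n_samples) array of 0/1 keep votes."""
    counts = teacher_votes.sum(axis=0).astype(float)
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)
    return noisy > teacher_votes.shape[0] / 2.0      # noisy majority keeps

rng = np.random.default_rng(7)
votes = np.array([[1, 0, 1]] * 10)                   # 10 teachers, 3 candidates
kept = dp_select(votes, epsilon=1e6, rng=rng)        # near-zero-noise regime
```

Only the aggregate vote, not any individual private record, influences each selection, which is what makes the calibrated-noise guarantee possible.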
Title: Differentially private data augmentation via LLM generation with discriminative and distribution-aligned filtering (Neural Networks, vol. 199, Article 108668)
Pub Date: 2026-01-29 | DOI: 10.1016/j.neunet.2026.108664
Shiji Qiu , Zuoqi Hu , Tiange Zhang , Zhi Liu , Junyu Dong , Qing Cai
Dense 3D reconstruction using forward-looking sonar (FLS) is essential for ocean exploration. Recent advancements in FLS-based 3D reconstruction using neural radiance fields have emerged, demonstrating promising results. However, their excessively slow reconstruction speed significantly impacts their application in real-world scenarios, primarily due to two reasons: (1) the reliance on MLPs for scene representation leads to slow training, often requiring several hours for reconstruction; and (2) the uniform sampling strategy along the elevation arc is inefficient, greatly hindering both training speed and reconstruction quality. To address these challenges, we propose a voxel-based efficient neural implicit surface reconstruction approach using FLS, featuring three key innovations: 1) Replacing MLPs with voxel grids for scene representation, utilizing a signed distance function (SDF) voxel grid to model geometry and a feature voxel grid to capture appearance. 2) Introducing a hierarchical sampling strategy along the elevation arc to improve sampling efficiency. 3) Applying SDF Gaussian convolution to the SDF voxel grid, effectively reducing noise and surface roughness. Extensive experiments demonstrate that our method significantly outperforms existing unsupervised dense FLS reconstruction techniques. Notably, our approach achieves the same reconstruction quality in just 10 minutes of training that previously required 4 hours with state-of-the-art methods, while also delivering superior results. We will open-source our code upon paper acceptance.
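The hierarchical sampling idea can be illustrated with a two-level scheme along the elevation arc: a coarse uniform pass estimates where the surface contribution is, then fine samples are allocated in proportion to those weights. The arc range, weight function, and sample budgets below are assumptions; the paper's strategy may differ in detail.

```python
import numpy as np

def hierarchical_arc_samples(weight_fn, n_coarse=8, n_fine=32, arc=(-0.2, 0.2)):
    """Two-level sampling along the elevation arc (radians):
    coarse uniform probe, then fine samples allocated by coarse weight."""
    lo, hi = arc
    coarse = np.linspace(lo, hi, n_coarse)
    w = np.maximum(weight_fn(coarse), 1e-12)
    counts = np.round(w / w.sum() * n_fine).astype(int)
    half = (hi - lo) / (n_coarse - 1) / 2.0          # half-width of each cell
    fine = [np.linspace(c - half, c + half, k)
            for c, k in zip(coarse, counts) if k > 0]
    return np.sort(np.concatenate([coarse] + fine))

# Hypothetical weight profile: surface contribution peaked near 0.1 rad.
peak = lambda x: np.exp(-((x - 0.1) ** 2) / (2 * 0.03 ** 2))
samples = hierarchical_arc_samples(peak)
```

Compared with uniform sampling, most of the budget lands near the elevation angles that actually contribute to the sonar return, which is the source of the training-speed gain.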
Title: Sonar-neus: voxel-based efficient neural implicit surface reconstruction for forward-looking sonar (Neural Networks, vol. 199, Article 108664)
Pub Date : 2026-01-29DOI: 10.1016/j.neunet.2026.108660
Xiu Yin , Xiyu Liu , Shulei Chang , Bosheng Song , Guanzhong Gong , Jiaxing Yin , Dengwang Li , Jie Xue
Current neural-like P systems use “point neurons” as the computing entities, and the computations in these neurons are simplified, ignoring the fact that, in organisms, subcellular compartments (such as neuronal dendrites) can also perform operations as independent computing units in addition to computing at the individual neuron level. The nervous system has a strong capacity for optimization and learning. Therefore, we propose learnable dendrite neural P (LDNP) systems with new plasticity rules, in which the dendrite structure and learning function can be adaptively changed when solving different application problems. Specifically, the dendrites of neurons are designed as dendritic trees composed of multiple dendritic branches, each of which serves as an independent computing unit. The multilevel complex topological structure of dendrites provides powerful computing capabilities for neurons. A model for predicting the overall survival of glioblastoma (GBM) patients was developed based on LDNP systems and validated on the GBM cohort from the Cancer Genome Atlas. Compared with thirteen state-of-the-art methods, the LDNP system achieves the best performance.
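The structural idea of dendritic branches as independent computing units can be sketched with a toy model. The code below is hypothetical and is not the LDNP formulation or its plasticity rules: the branch count, weight scales, and tanh nonlinearity are invented for illustration. It only shows branches computing locally before a soma-level aggregation.

```python
import numpy as np

class DendriticNeuron:
    """Toy neuron whose dendritic tree is a set of independent branch units.

    Each branch applies its own weighted sum and nonlinearity to the input;
    the soma then aggregates the branch outputs. Illustrative sketch only.
    """

    def __init__(self, n_inputs: int, n_branches: int, rng: np.random.Generator):
        self.branch_weights = rng.standard_normal((n_branches, n_inputs)) * 0.1
        self.soma_weights = rng.standard_normal(n_branches) * 0.1

    def forward(self, x: np.ndarray) -> float:
        branch_out = np.tanh(self.branch_weights @ x)  # per-branch local computation
        return float(self.soma_weights @ branch_out)   # soma-level aggregation

rng = np.random.default_rng(1)
neuron = DendriticNeuron(n_inputs=8, n_branches=4, rng=rng)
y = neuron.forward(rng.standard_normal(8))
```

In contrast, a "point neuron" would collapse this into a single weighted sum, discarding the branch-level intermediate computations that the abstract argues are biologically meaningful.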
{"title":"Learnable dendrite neural P systems and applications in survival prediction of glioblastoma patients","authors":"Xiu Yin , Xiyu Liu , Shulei Chang , Bosheng Song , Guanzhong Gong , Jiaxing Yin , Dengwang Li , Jie Xue","doi":"10.1016/j.neunet.2026.108660","DOIUrl":"10.1016/j.neunet.2026.108660","url":null,"abstract":"<div><div>Current neural-like P systems use “point neurons” as the computing entities, and the computations in these neurons are simplified, ignoring the fact that, in organisms, subcellular compartments (such as neuronal dendrites) can also perform operations as independent computing units in addition to computing at the individual neuron level. The nervous system has a strong capacity for optimization and learning. Therefore, we propose learnable dendrite neural P (LDNP) systems with new plasticity rules, in which the dendrite structure and learning function can be adaptively changed when solving different application problems. Specifically, the dendrites of neurons are designed as dendritic trees composed of multiple dendritic branches, each of which serves as an independent computing unit. The multilevel complex topological structure of dendrites provides powerful computing capabilities for neurons. A model for predicting the overall survival of glioblastoma (GBM) patients was developed based on LDNP systems and validated on the GBM cohort from the Cancer Genome Atlas. Compared with thirteen state-of-the-art methods, the LDNP system achieves the best performance.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"199 ","pages":"Article 108660"},"PeriodicalIF":6.3,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146127098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bundle recommendation is designed to suggest a set of correlated items to a user in a holistic manner rather than recommending these items separately. Recent methods introduce contrastive learning (CL) to refine the node representations learned from different graphs (generally termed the item and bundle views) for better recommendation performance. Unfortunately, these methods have two deficiencies. Firstly, few of them explicitly model the user-user and bundle-bundle relationships simultaneously from both the item and bundle views, leading to the underutilization of high-order relationships between users (bundles). Secondly, they use InfoNCE as the contrastive loss, which overlooks the graph structure as supervised signals in defining positive (negative) samples, resulting in anchor-like nodes being treated as negative samples. To tackle these deficiencies, an approach of cross-view contrastive representation learning (CCRL) on meta-path induced graphs with node features is proposed for bundle recommendation. First, we introduce meta-paths to model the user-user and bundle-bundle relationships as meta-path induced graphs with node features from both the item and bundle views. Second, we perform graph representation learning on the meta-path induced graphs with node features to obtain the user (bundle) representations and introduce a contrastive loss that supports multiple positive samples to build a cross-view graph CL mechanism for refining the learned user (bundle) representations. Finally, the model is trained with a joint optimization objective. Experiments on the benchmark datasets show that our approach surpasses the baselines in bundle recommendation.
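The abstract does not specify the form of its contrastive loss that "supports multiple positive samples"; one common construction averages the InfoNCE log-ratio over the positive set, as in supervised contrastive learning. Below is a minimal numpy sketch under that assumption; the temperature value and cosine normalization are illustrative choices, not details from the paper.

```python
import numpy as np

def multi_positive_nce(anchor, candidates, positive_idx, temperature=0.2):
    """InfoNCE-style loss averaged over multiple positives.

    anchor: (d,) embedding; candidates: (n, d) embeddings (positives and
    negatives together); positive_idx: indices into `candidates` marking
    the positives. Embeddings are L2-normalized so the dot product is
    cosine similarity.
    """
    a = anchor / np.linalg.norm(anchor)
    c = candidates / np.linalg.norm(candidates, axis=1, keepdims=True)
    logits = (c @ a) / temperature                     # (n,) similarity logits
    m = logits.max()                                   # stable log-sum-exp
    log_denom = m + np.log(np.exp(logits - m).sum())
    # Average the log-ratio over the positive set; each term is <= 0,
    # so the loss is always nonnegative.
    return float(-np.mean(logits[positive_idx] - log_denom))

# Toy usage: two near-duplicate positives and eight random negatives.
rng = np.random.default_rng(0)
anchor = rng.standard_normal(16)
positives = anchor + 0.01 * rng.standard_normal((2, 16))
negatives = rng.standard_normal((8, 16))
loss = multi_positive_nce(anchor, np.vstack([positives, negatives]), [0, 1])
```

Unlike plain InfoNCE with a single positive, this form lets graph-structure-derived neighbors all count as positives for one anchor, which is the deficiency the abstract highlights.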
{"title":"Cross-view contrastive representation learning on meta-path induced graphs with node features for bundle recommendation","authors":"Peng Zhang , Zhendong Niu , Ru Ma , Shunpan Liang , Fuzhi Zhang","doi":"10.1016/j.neunet.2026.108669","DOIUrl":"10.1016/j.neunet.2026.108669","url":null,"abstract":"<div><div>Bundle recommendation is designed to suggest a set of correlated items to a user in a holistic manner rather than recommending these items separately. Recent methods introduce <u>c</u>ontrastive <u>l</u>earning (CL) to refine the node representations learned from different graphs (generally termed the item and bundle views) for better recommendation performance. Unfortunately, these methods have two deficiencies. Firstly, few of them explicitly model the user-user and bundle-bundle relationships simultaneously from both the item and bundle views, leading to the underutilization of high-order relationships between users (bundles). Secondly, they use InfoNCE as the contrastive loss, which overlooks the graph structure as supervised signals in defining positive (negative) samples, resulting in anchor-like nodes being treated as negative samples. To tackle these deficiencies, an approach of <u>c</u>ross-view <u>c</u>ontrastive <u>r</u>epresentation <u>l</u>earning (CCRL) on meta-path induced graphs with node features is proposed for bundle recommendation. First, we introduce meta-paths to model the user-user and bundle-bundle relationships as meta-path induced graphs with node features from both the item and bundle views. Second, we perform graph representation learning on the meta-path induced graphs with node features to obtain the user (bundle) representations and introduce a contrastive loss that supports multiple positive samples to build a cross-view graph CL mechanism for refining the learned user (bundle) representations. Finally, the model is trained with a joint optimization objective. Experiments on the benchmark datasets show that our approach surpasses the baselines in bundle recommendation.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"199 ","pages":"Article 108669"},"PeriodicalIF":6.3,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146114722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-29DOI: 10.1016/j.neunet.2026.108655
Suheng Peng , Jiacai Liao , Libo Cao
Deep neural networks excel in road garbage segmentation but require costly pixel-level annotations. Balancing accuracy and annotation costs is a key bottleneck in urban garbage management. Semi-supervised learning (SSL) reduces the dependence on annotations by utilizing large amounts of unlabeled data. However, existing methods face a key challenge: under extreme annotation imbalance, the scarce labeled data often lacks diversity. This leads to repeated reuse of the same samples during training, preventing the model from fully exploiting the available information and causing performance stagnation. To address this, we introduce the Dynamic Bidirectional Data Recomposition (DBDR) mechanism, which dynamically adjusts the bidirectional information interaction between labeled and unlabeled data to solve the problem of representation stagnation. Early training: the labeled data is integrated into the unlabeled data stream according to confidence levels, guiding the model to prioritize capturing and stabilizing basic semantic prototypes. Mid-training: a dynamic memory queue is constructed to quantify the evolution of model confidence states over time. We use dynamic thresholds and dual validation to trigger a reverse flow of knowledge from unlabeled to labeled supervision. This breaks local optima in the encoder and reshapes the semantic decision boundaries. DBDR can be integrated into any current mainstream SSL framework. On a real-world road garbage dataset, DBDR delivers a significant performance boost across all five state-of-the-art baseline models. Ablation experiments validate its key improvements in the segmentation of confusing targets (e.g., plastic, paper). This research provides an economically feasible solution for future smart city waste management technologies.
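The dynamic memory queue and its trigger are described only at a high level. One plausible reading, sketched below with an invented window size and tolerance: keep a bounded queue of recent mean-confidence values and trigger the reverse (unlabeled-to-labeled) knowledge flow once confidence has plateaued. This is a hypothetical illustration, not DBDR's actual dual-validation criterion.

```python
from collections import deque

def make_confidence_queue(window: int = 5) -> deque:
    """Bounded memory queue holding the most recent mean-confidence values."""
    return deque(maxlen=window)

def should_reverse_flow(queue: deque, mean_confidence: float,
                        tolerance: float = 0.01) -> bool:
    """Trigger the unlabeled-to-labeled knowledge flow on a confidence plateau.

    Returns True once the queue is full and the spread of recent mean
    confidences falls below `tolerance`, i.e. the model has stopped
    improving and may be stuck in a local optimum.
    """
    queue.append(mean_confidence)
    if len(queue) < queue.maxlen:
        return False  # not enough history yet
    return max(queue) - min(queue) < tolerance

# Toy run: confidence first rises steadily, then plateaus.
q = make_confidence_queue(window=5)
rising = [should_reverse_flow(q, c) for c in (0.50, 0.58, 0.66, 0.74, 0.82)]
flat = [should_reverse_flow(q, c) for c in (0.83, 0.831, 0.832, 0.831, 0.832)]
```

The `deque(maxlen=...)` makes the queue self-evicting, so the trigger always reflects only the most recent training window.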
{"title":"Dynamic bidirectional data recomposition for efficient road garbage segmentation in semi-supervised learning","authors":"Suheng Peng , Jiacai Liao , Libo Cao","doi":"10.1016/j.neunet.2026.108655","DOIUrl":"10.1016/j.neunet.2026.108655","url":null,"abstract":"<div><div>Deep neural networks excel in road garbage segmentation but require costly pixel-level annotations. Balancing accuracy and annotation costs is a key bottleneck in urban garbage management. Semi-supervised learning (SSL) reduces the dependence on annotations by utilizing large amounts of unlabeled data. However, existing methods face a key challenge: under extreme annotation imbalance, the scarce labeled data often lacks diversity. This leads to repeated reuse of the same samples during training, preventing the model from fully exploiting the available information and causing performance stagnation. To address this, we introduce the Dynamic Bidirectional Data Recomposition (DBDR) mechanism, which dynamically adjusts the bidirectional information interaction between labeled and unlabeled data to solve the problem of representation stagnation. Early training: the labeled data is integrated into the unlabeled data stream according to confidence levels, guiding the model to prioritize capturing and stabilizing basic semantic prototypes. Mid-training: a dynamic memory queue is constructed to quantify the evolution of model confidence states over time. We use dynamic thresholds and dual validation to trigger a reverse flow of knowledge from unlabeled to labeled supervision. This breaks local optima in the encoder and reshapes the semantic decision boundaries. DBDR can be integrated into any current mainstream SSL framework. On a real-world road garbage dataset, DBDR delivers a significant performance boost across all five state-of-the-art baseline models. Ablation experiments validate its key improvements in the segmentation of confusing targets (e.g., plastic, paper). This research provides an economically feasible solution for future smart city waste management technologies.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"199 ","pages":"Article 108655"},"PeriodicalIF":6.3,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120838","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}