Joint generative and alignment adversarial learning for robust incomplete multi-view clustering
Pub Date: 2026-02-01. Epub Date: 2025-10-03. DOI: 10.1016/j.neunet.2025.108141
Yueyao Li, Bin Wu
Incomplete multi-view clustering (IMVC) has become an area of increasing focus due to the frequent occurrence of missing views in real-world multi-view datasets. Traditional methods often address this by attempting to recover the missing views before clustering. However, these methods face two main limitations: (1) inadequate modeling of cross-view consistency, which weakens the relationships between views, especially at high missing rates, and (2) limited capacity to generate realistic and diverse missing views, leading to suboptimal clustering results. To tackle these issues, we propose a novel framework, Joint Generative and Alignment adversarial learning for Incomplete Multi-View Clustering (JGA-IMVC). Our framework leverages adversarial learning to simultaneously generate missing views and enforce consistency alignment across views, ensuring effective reconstruction of incomplete data while preserving underlying structural relationships. Extensive experiments on benchmark datasets with varying missing rates demonstrate that JGA-IMVC consistently outperforms current state-of-the-art methods, achieving improvements of 3% to 5% in key clustering metrics such as Accuracy, Normalized Mutual Information (NMI), and Adjusted Rand Index (ARI). JGA-IMVC excels under high missing-rate conditions, confirming its robustness and generalization capabilities and providing a practical solution for incomplete multi-view clustering scenarios.
{"title":"Joint generative and alignment adversarial learning for robust incomplete multi-view clustering.","authors":"Yueyao Li, Bin Wu","doi":"10.1016/j.neunet.2025.108141","DOIUrl":"10.1016/j.neunet.2025.108141","url":null,"abstract":"<p><p>Incomplete multi-view clustering (IMVC) has become an area of increasing focus due to the frequent occurrence of missing views in real-world multi-view datasets. Traditional methods often address this by attempting to recover the missing views before clustering. However, these methods face two main limitations: (1) inadequate modeling of cross-view consistency, which weakens the relationships between views, especially with a high missing rate, and (2) limited capacity to generate realistic and diverse missing views, leading to suboptimal clustering results. To tackle these issues, we propose a novel framework, Joint Generative Adversarial Network and Alignment Adversarial (JGA-IMVC). Our framework leverages adversarial learning to simultaneously generate missing views and enforce consistency alignment across views, ensuring effective reconstruction of incomplete data while preserving underlying structural relationships. Extensive experiments on benchmark datasets with varying missing rates demonstrate that JGA-IMVC consistently outperforms current state-of-the-art methods. The model achieves improvements of 3 % to 5 % in key clustering metrics such as Accuracy, Normalized Mutual Information (NMI), and Adjusted Rand Index (ARI). JGA-IMVC excels under high missing conditions, confirming its robustness and generalization capabilities, providing a practical solution for incomplete multi-view clustering scenarios.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"108141"},"PeriodicalIF":6.3,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145309821","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive dendritic plasticity in brain-inspired dynamic neural networks for enhanced multi-timescale feature extraction
Pub Date: 2026-02-01. Epub Date: 2025-10-08. DOI: 10.1016/j.neunet.2025.108191
Jiayi Mao, Hanle Zheng, Huifeng Yin, Hanxiao Fan, Lingrui Mei, Hao Guo, Yao Li, Jibin Wu, Jing Pei, Lei Deng
Brain-inspired neural networks, drawing insights from biological neural systems, have emerged as a promising paradigm for temporal information processing due to their inherent neural dynamics. Among existing brain-inspired neural models, Spiking Neural Networks (SNNs) have gained extensive attention. However, they often struggle to capture multi-timescale temporal features because of their static parameters across time steps and low-precision spike activities. To this end, we propose a dynamic SNN with enhanced dendritic heterogeneity to strengthen multi-timescale feature extraction. We design a Leaky Integrate Modulation neuron model with Dendritic Heterogeneity (DH-LIM) that replaces traditional spike activities with a continuous modulation mechanism, preserving nonlinear behaviors while enhancing feature expression capability. We also introduce an Adaptive Dendritic Plasticity (ADP) mechanism that dynamically adjusts dendritic timing factors based on the frequency-domain information of input signals, enabling the model to capture both rapidly and slowly changing temporal patterns. Extensive experiments on multiple datasets with rich temporal features demonstrate that our proposed method achieves excellent performance in processing complex temporal signals. These designs offer fresh solutions for improving the multi-timescale feature extraction capability of SNNs and showcase the method's broad application potential.
{"title":"Adaptive dendritic plasticity in brain-inspired dynamic neural networks for enhanced multi-timescale feature extraction.","authors":"Jiayi Mao, Hanle Zheng, Huifeng Yin, Hanxiao Fan, Lingrui Mei, Hao Guo, Yao Li, Jibin Wu, Jing Pei, Lei Deng","doi":"10.1016/j.neunet.2025.108191","DOIUrl":"10.1016/j.neunet.2025.108191","url":null,"abstract":"<p><p>Brain-inspired neural networks, drawing insights from biological neural systems, have emerged as a promising paradigm for temporal information processing due to their inherent neural dynamics. Spiking Neural Networks (SNNs) have gained extensive attention among existing brain-inspired neural models. However, they often struggle with capturing multi-timescale temporal features due to the static parameters across time steps and the low-precision spike activities. To this end, we propose a dynamic SNN with enhanced dendritic heterogeneity to enhance the multi-timescale feature extraction capability. We design a Leaky Integrate Modulation neuron model with Dendritic Heterogeneity (DH-LIM) that replaces traditional spike activities with a continuous modulation mechanism for preserving the nonlinear behaviors while enhancing the feature expression capability. We also introduce an Adaptive Dendritic Plasticity (ADP) mechanism that dynamically adjusts dendritic timing factors based on the frequency domain information of input signals, enabling the model to capture both rapid- and slow-changing temporal patterns. Extensive experiments on multiple datasets with rich temporal features demonstrate that our proposed method achieves excellent performance in processing complex temporal signals. These optimizations provide fresh solutions for optimizing the multi-timescale feature extraction capability of SNNs, showcasing its broad application potential.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"108191"},"PeriodicalIF":6.3,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145287548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NaturalL2S: End-to-end high-quality multispeaker lip-to-speech synthesis with differential digital signal processing
Pub Date: 2026-02-01. DOI: 10.1016/j.neunet.2025.108163
Yifan Liang, Fangkun Liu, Andong Li, Xiaodong Li, Chengyou Lei, Chengshi Zheng
Recent advancements in visual speech recognition (VSR) have promoted progress in lip-to-speech synthesis, where pre-trained VSR models enhance the intelligibility of synthesized speech by providing valuable semantic information. The success of cascade frameworks, which combine pseudo-VSR with pseudo-text-to-speech (TTS) or implicitly utilize the transcribed text, highlights the benefits of leveraging VSR models. However, these methods typically rely on mel-spectrograms as an intermediate representation, which may introduce a key bottleneck: the domain gap between synthetic mel-spectrograms, generated from inherently error-prone lip-to-speech mappings, and the real mel-spectrograms used to train vocoders. This mismatch inevitably degrades synthesis quality. To bridge this gap, we propose Natural Lip-to-Speech (NaturalL2S), an end-to-end framework that jointly trains the vocoder with acoustic inductive priors. Specifically, our architecture introduces a fundamental frequency (F0) predictor to explicitly model prosodic variations, where the predicted F0 contour drives a differentiable digital signal processing (DDSP) synthesizer to provide acoustic priors for subsequent refinement. Notably, the proposed system achieves satisfactory performance on speaker similarity without requiring explicit speaker embeddings. Both objective metrics and subjective listening tests demonstrate that NaturalL2S significantly enhances synthesized speech quality compared to existing state-of-the-art methods. Audio samples are available on our demonstration page: https://yifan-liang.github.io/NaturalL2S/.
Emotion-aware multimodal deepfake detection
Pub Date: 2026-01-31. DOI: 10.1016/j.neunet.2026.108675
Teng Zhang, Gen Li, Yanhui Xiao, Huawei Tian, Yun Cao
With the continuous advancement of Deepfake techniques, traditional unimodal detection methods struggle to address the challenges posed by multimodal manipulations. Most existing approaches rely on large-scale training data, which limits their generalization to unseen identities or different manipulation types in few-shot settings. In this paper, we propose an emotion-aware multimodal Deepfake detection method that exploits emotion signals for forgery detection. Specifically, we design an emotion embedding extractor (Emoencoder) to capture emotion representations within each modality. We then employ Emotion-Aware Contrastive Learning and Cross-Modal Contrastive Learning to capture cross-modal inconsistencies and enhance modality feature extraction. Furthermore, we propose a Text-Guided Semantic Fusion module, in which the text modality serves as a semantic anchor to guide audio-visual feature interactions for multimodal feature fusion. To validate our approach under data-limited conditions and unseen identities, we employ a cross-identity few-shot training strategy on benchmark datasets. Experimental results show that our method outperforms state-of-the-art approaches and generalizes better to both unseen identities and manipulation types.
{"title":"Emotion-Aware multimodal deepfake detection","authors":"Teng Zhang , Gen Li , Yanhui Xiao , Huawei Tian , Yun Cao","doi":"10.1016/j.neunet.2026.108675","DOIUrl":"10.1016/j.neunet.2026.108675","url":null,"abstract":"<div><div>With the continuous advancement of Deepfake techniques, traditional unimodal detection methods struggle to address the challenges posed by multimodal manipulations. Most existing approaches rely on large-scale training data, which limits their generalization to unseen identities or different manipulation types in few-shot settings. In this paper, we propose an emotion-aware multimodal Deepfake detection method that exploits emotion signals for forgery detection. Specifically, we design an emotion embedding extractor (Emoencoder) to capture emotion representations within modalities. Then, we employ Emotion-Aware Contrastive Learning and Cross-Modal Contrastive Learning to capture cross-modal inconsistencies and enhance modality feature extraction. Furthermore, we propose a Text-Guided Semantic Fusion module, where the text modality serves as a semantic anchor to guide audio-visual feature interactions for multimodal feature fusion. To validate our approach under data-limited conditions and unseen identities, we employ a cross-identity few-shot training strategy on benchmark datasets. Experimental results demonstrate that our method outperforms SOTAs and demonstrates superior generalization to both unseen identities and manipulation types.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"199 ","pages":"Article 108675"},"PeriodicalIF":6.3,"publicationDate":"2026-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146133401","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Event-triggered decentralized adaptive critic learning control for interconnected systems with nonlinear inequality state constraints
Pub Date: 2026-01-31. DOI: 10.1016/j.neunet.2026.108646
Wenqian Du, Mingduo Lin, Guoling Yuan, Bo Zhao
In this paper, an event-triggered decentralized adaptive critic learning (ACL) control method is proposed for interconnected systems with nonlinear inequality state constraints. First, by introducing a slack function, the nonlinear inequality state constraints of each original isolated subsystem are transformed into equality forms, and the original isolated subsystem is then augmented into an unconstrained one. Next, by establishing a cost function with discount factors for each isolated subsystem, a local policy iteration-based decentralized control law is developed by solving the Hamilton–Jacobi–Bellman equation with the help of a local critic neural network (NN) for each isolated subsystem. By developing a novel event-triggering mechanism for each isolated subsystem, the decentralized control policy is updated at the triggering instants only, which helps to save computational and communication resources. On this basis, the event-triggered decentralized control law of each isolated subsystem is derived, and the overall optimal control for the entire interconnected system is obtained by assembling the developed event-triggered decentralized control laws. Furthermore, the closed-loop nonlinear interconnected system and the weight estimation errors of the local critic NNs are guaranteed to be uniformly ultimately bounded. Finally, the effectiveness of the proposed method is validated through two comparative simulation examples.
{"title":"Event-triggered decentralized adaptive critic learning control for interconnected systems with nonlinear inequality state constraints","authors":"Wenqian Du , Mingduo Lin , Guoling Yuan , Bo Zhao","doi":"10.1016/j.neunet.2026.108646","DOIUrl":"10.1016/j.neunet.2026.108646","url":null,"abstract":"<div><div>In this paper, an event-triggered decentralized adaptive critic learning (ACL) control method is proposed for interconnected systems with nonlinear inequality state constraints. First, by introducing a slack function, the nonlinear inequality state constraints of original isolated subsystem are transformed into equality forms, and then the original isolated subsystem is augmented to an unconstrained one. Then, by establishing a cost function with discount factors for each isolated subsystem, a local policy iteration-based decentralized control law is developed by solving the Hamilton–Jacobi–Bellman equation with the help of a local critic neural network (NN) for each isolated subsystem. Through developing a novel event-triggering mechanism for each isolated subsystem, the decentralized control policy is updated at the triggering instants only, which assists to save the computational and communication resources. Hereafter, the event-triggered decentralized control law of isolated subsystem is derived. Then, the overall optimal control for the entire interconnected system is derived by constituting an array of developed event-triggered decentralized control laws. Furthermore, the closed-loop nonlinear interconnected system and the weight estimation errors of local critic NNs are guaranteed to be uniformly ultimately bounded. Finally, the effectiveness of the proposed method is validated through two comparative simulation examples.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"199 ","pages":"Article 108646"},"PeriodicalIF":6.3,"publicationDate":"2026-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146174716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Adaptive sample repulsion against class-specific counterfactuals for explainable imbalanced classification
Pub Date: 2026-01-30. DOI: 10.1016/j.neunet.2026.108652
Yu Hao, Xin Gao, Xinping Diao, Yuan Li, Yukun Lin, Tianyang Chen, Qiangwei Li, Jiawen Lu
Enhancing model classification capability for samples within overlapping regions of complex feature spaces remains a key challenge in imbalanced classification research. Existing mainstream methods at the data level and algorithm level primarily rely on original sample distribution information to reduce the impact of overlap, without deeply modeling the causal relationship between features and labels. Furthermore, these approaches often overlook instance-level explanations that could guide the mining of deep discriminative information for samples of different classes in overlapping regions, so the improvement in classification performance and model credibility may be constrained. This paper proposes an explainable imbalanced classification framework with adaptive sample repulsion against class-specific counterfactuals (CSCF-SR), forming a closed loop between explanation generation and classification decisions by dynamically regulating the feature-space distribution through generated counterfactual samples. Two core phases are jointly optimized. (1) Counterfactual searching: a class-specific dual-actor architecture based on reinforcement learning decouples perturbation policy learning for the majority and minority classes. A multi-step dynamic perturbation mechanism is designed to control counterfactual search behavior more precisely and smoothly, effectively generating reliable counterfactual samples. (2) Adaptive sample repulsion against counterfactuals: exploiting the inter-class discriminative information in the displacement vectors between counterfactual and original samples, each original sample is adaptively perturbed along the direction opposite to its counterfactual. This fine-grained regulation gradually displaces samples from the overlapping region and clarifies class boundaries. Experiments on 50 imbalanced datasets demonstrate that CSCF-SR outperforms 27 representative imbalanced classification methods on both F1-score and G-mean, with more pronounced improvements on the 25 datasets with severe class overlap.
{"title":"Adaptive sample repulsion against class-specific counterfactuals for explainable imbalanced classification","authors":"Yu Hao , Xin Gao , Xinping Diao , Yuan Li , Yukun Lin , Tianyang Chen , Qiangwei Li , Jiawen Lu","doi":"10.1016/j.neunet.2026.108652","DOIUrl":"10.1016/j.neunet.2026.108652","url":null,"abstract":"<div><div>Enhancing model classification capability for samples within overlapping regions in complex feature spaces remains a key challenge in imbalanced classification research. Existing mainstream methods at the data-level and algorithm-level primarily rely on original sample distribution information to reduce overlap impact, without deeply modeling the causal relationship between features and labels. Furthermore, these approaches often overlook instance-level explanations that could guide deep discriminative information mining for samples of different classes in overlapping regions, thus the improvement on classification performance and model credibility may be constrained. This paper proposes an explainable imbalanced classification framework with adaptive sample repulsion against class-specific counterfactuals (CSCF-SR), forming a closed-loop between explanation generation and classification decisions by dynamically regulating the feature-space distribution through generated counterfactual samples. Two core phases are jointly optimized. (1) Counterfactual searching: a class-specific dual-actor architecture based on reinforcement learning decouples perturbation policy learning for majority and minority classes. A multi-step dynamic perturbation mechanism is designed to control counterfactual search behavior more precisely and smoothly, effectively generating reliable counterfactual samples. (2) Adaptive sample repulsion against counterfactuals: exploiting the inter-class discriminative information in displacement vectors between counterfactual and original samples, each original sample is adaptively perturbed along the direction opposite to its counterfactual. This fine-grained regulation gradually displaces samples from the overlapping region and clarifies class boundaries. Experiments on 50 imbalanced datasets demonstrate that CSCF-SR has a performance advantage over 27 typical imbalanced classification methods on both F1-score and G-mean, with more pronounced improvements on 25 datasets with severe class overlap.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"199 ","pages":"Article 108652"},"PeriodicalIF":6.3,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multi-timescale representation with adaptive routing for deep tabular learning under temporal shift
Pub Date: 2026-01-30. DOI: 10.1016/j.neunet.2026.108670
Tianyu Wang, Maite Zhang, Mingxuan Lu, Mian Li
In real-world applications, tabular datasets often evolve over time, leading to temporal shift that degrades neural network performance over long horizons. Most existing temporal encoding or adaptation solutions treat time cues as fixed auxiliary variables at a single scale. Motivated by the multi-horizon nature of temporal shifts with heterogeneous temporal dynamics, this paper presents TARS (Temporal Abstraction with Routed Scales), a novel plug-and-play method for robust tabular learning under temporal shift that is applicable to various deep learning backbones. First, an explicit temporal encoder decomposes timestamps into short-term recency, mid-term periodicity, and long-term contextual embeddings with structured memory. Next, an implicit drift encoder tracks higher-order distributional statistics at the same aligned timescales, producing drift signals that reflect ongoing temporal dynamics. These signals drive a drift-aware routing mechanism that adaptively weights the explicit temporal pathways, emphasizing the most relevant timescales under current conditions. Finally, a feature-temporal fusion layer integrates the routed temporal representation with the original features, injecting context-aware bias. Extensive experiments on eight real-world datasets from the TabReD benchmark show that TARS consistently outperforms competitive baseline methods across various backbone models, achieving average relative improvements of up to +2.38% on MLP and +4.08% on DCNv2, among other backbones. Ablation studies verify the complementary contributions of all four modules. These results highlight the effectiveness of TARS in improving the temporal robustness of existing deep tabular models.
{"title":"Multi-timescale representation with adaptive routing for deep tabular learning under temporal shift","authors":"Tianyu Wang , Maite Zhang , Mingxuan Lu , Mian Li","doi":"10.1016/j.neunet.2026.108670","DOIUrl":"10.1016/j.neunet.2026.108670","url":null,"abstract":"<div><div>In real-world applications, tabular datasets often evolve over time, leading to temporal shift that degrades the long-range neural network performance. Most existing temporal encoding or adaptation solutions treat time cues as fixed auxiliary variables at a single scale. Motivated by the multi-horizon nature of temporal shifts with heterogeneous temporal dynamics, this paper presents TARS (Temporal Abstraction with Routed Scales), a novel plug-and-play method for robust tabular learning under temporal shift, applicable to various deep learning model backbones. First, an explicit temporal encoder decomposes timestamps into short-term recency, mid-term periodicity, and long-term contextual embeddings with structured memory. Next, an implicit drift encoder tracks higher-order distributional statistics at the same aligned timescales, producing drift signals that reflect ongoing temporal dynamics. These signals drive a drift-aware routing mechanism that adaptively weights the explicit temporal pathways, emphasizing the most relevant timescales under current conditions. Finally, a feature-temporal fusion layer integrates the routed temporal representation with original features, injecting context-aware bias. Extensive experiments on eight real-world datasets from the TabReD benchmark show that TARS consistently outperforms the competitive compared methods across various backbone models, achieving up to +2.38% average relative improvement on MLP, +4.08% on DCNv2, etc. Ablation studies verify the complementary contributions of all four modules. These results highlight the effectiveness of TARS for improving the temporal robustness of existing deep tabular models.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"199 ","pages":"Article 108670"},"PeriodicalIF":6.3,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120930","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Efficient semantic segmentation via logit-guided feature distillation
Pub Date: 2026-01-29. DOI: 10.1016/j.neunet.2026.108663
Xuyi Yu, Shang Lou, Yinghai Zhao, Huipeng Zhang, Kuizhi Mei
Knowledge Distillation (KD) is a critical technique for model compression, facilitating the transfer of implicit knowledge from a teacher model to a more compact, deployable student model. KD can generally be divided into two categories: logit distillation and feature distillation. Feature distillation has been predominant in achieving state-of-the-art (SOTA) performance, but recent advances in logit distillation have begun to narrow the gap. We propose a Logit-guided Feature Distillation (LFD) framework that combines the strengths of both logit and feature distillation to enhance the efficacy of knowledge transfer, particularly leveraging the rich classification information inherent in logits for semantic segmentation tasks. Furthermore, it is observed that Deep Neural Networks (DNNs) manifest task-relevant characteristics only at sufficient depths, which may limit achievable accuracy. In this work, we introduce a collaborative distillation method that preemptively focuses on critical pixels and categories at an early stage. We employ logits from deep layers to generate fine-grained spatial masks that are directly conveyed to the feature distillation stage, thereby inducing spatial gradient disparities. Additionally, we generate class masks that dynamically modulate the weights of shallow auxiliary heads, ensuring that class-relevant features can be calibrated by the primary head. A novel shared auxiliary head distillation approach is also presented. Experiments on the Cityscapes, Pascal VOC, and CamVid datasets show that the proposed method achieves competitive performance while maintaining low memory usage. Our code will be released at https://github.com/fate2715/LFD.
Resolving ambiguity in code refinement via Conidfine: a conversationally-aware framework with disambiguation and targeted retrieval
Pub Date: 2026-01-29. DOI: 10.1016/j.neunet.2026.108650
Aoyu Song, Afizan Azman, Shanzhi Gu, Fangjian Jiang, Jianchi Du, Tailong Wu, Mingyang Geng, Jia Li
Code refinement is a vital aspect of software development, involving the review and enhancement of code contributions made by developers. A critical challenge in this process arises from unclear or ambiguous review comments, which can hinder developers’ understanding of the required changes. Our preliminary study reveals that conversations between developers and reviewers often contain valuable information that can help resolve such ambiguous review suggestions. However, leveraging conversational data to address this issue poses two key challenges: (1) enabling the model to autonomously determine whether a review suggestion is ambiguous, and (2) effectively extracting the relevant segments from the conversation that can aid in resolving the ambiguity.
In this paper, we propose a novel method for addressing ambiguous review suggestions by leveraging conversations between reviewers and developers. To tackle the above two challenges, we introduce an Ambiguous Discriminator that uses multi-task learning to classify ambiguity and generate type-aware confusion points from a GPT-4-labeled dataset. These confusion points guide a Type-Driven Multi-Strategy Retrieval Framework that applies targeted strategies based on categories such as Inaccurate Localization, Unclear Expression, and Lack of Specific Guidance to extract actionable information from the conversation context. To support this, we construct a semantic auxiliary instruction library containing spatial indicators, clarification patterns, and action-oriented verbs, enabling precise alignment between review suggestions and informative conversation segments. Our method is evaluated on two widely used code refinement datasets, CodeReview and CodeReview-New, where it significantly enhances the performance of various state-of-the-art models, including TransReview, T5-Review, CodeT5, CodeReviewer, and ChatGPT. Furthermore, we explore in depth how conversational information improves the model's ability to handle fine-grained situations, and we conduct human evaluations to assess the accuracy of ambiguity detection and the correctness of the generated confusion points. We are the first to raise the issue of ambiguous review suggestions in the code refinement domain and to propose a solution that not only addresses these challenges but also lays the foundation for future research. Our method provides valuable insights into improving the clarity and effectiveness of review suggestions, offering a promising direction for advancing code refinement techniques.
{"title":"Resolving ambiguity in code refinement via conidfine: A conversationally-Aware framework with disambiguation and targeted retrieval","authors":"Aoyu Song , Afizan Azman , Shanzhi Gu , Fangjian Jiang , Jianchi Du , Tailong Wu , Mingyang Geng , Jia Li","doi":"10.1016/j.neunet.2026.108650","DOIUrl":"10.1016/j.neunet.2026.108650","url":null,"abstract":"<div><div>Code refinement is a vital aspect of software development, involving the review and enhancement of code contributions made by developers. A critical challenge in this process arises from unclear or ambiguous review comments, which can hinder developers’ understanding of the required changes. Our preliminary study reveals that conversations between developers and reviewers often contain valuable information that can help resolve such ambiguous review suggestions. However, leveraging conversational data to address this issue poses two key challenges: (1) enabling the model to autonomously determine whether a review suggestion is ambiguous, and (2) effectively extracting the relevant segments from the conversation that can aid in resolving the ambiguity.</div><div>In this paper, we propose a novel method for addressing ambiguous review suggestions by leveraging conversations between reviewers and developers. To tackle the above two challenges, we introduce an <strong>Ambiguous Discriminator</strong> that uses multi-task learning to classify ambiguity and generate type-aware confusion points from a GPT-4-labeled dataset. These confusion points guide a <strong>Type-Driven Multi-Strategy Retrieval Framework</strong> that applies targeted strategies based on categories like <em>Inaccurate Localization, Unclear Expression</em>, and <em>Lack of Specific Guidance</em> to extract actionable information from the conversation context. To support this, we construct a semantic auxiliary instruction library containing spatial indicators, clarification patterns, and action-oriented verbs, enabling precise alignment between review suggestions and informative conversation segments. Our method is evaluated on two widely-used code refinement datasets CodeReview and CodeReview-New, where we demonstrate that our method significantly enhances the performance of various state-of-the-art models, including TransReview, T5-Review, CodeT5, CodeReviewer and ChatGPT. Furthermore, we explore in depth how conversational information improves the model’s ability to address fine-grained situations, and we conduct human evaluations to assess the accuracy of ambiguity detection and the correctness of generated confusion points. We are the first to introduce the issue of ambiguous review suggestions in the code refinement domain and propose a solution that not only addresses these challenges but also sets the foundation for future research. 
Our method provides valuable insights into improving the clarity and effectiveness of review suggestions, offering a promising direction for advancing code refinement techniques.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"199 ","pages":"Article 108650"},"PeriodicalIF":6.3,"publicationDate":"2026-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
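The Ambiguous Discriminator's multi-task setup can be sketched as a shared comment encoder with two heads, one for the binary ambiguous/clear decision and one for the ambiguity type. The bag-of-words encoder below is a toy stand-in for the pretrained encoder a real system would use; all names and sizes are hypothetical.

```python
# Toy multi-task ambiguity discriminator: shared encoder, two classification heads.
import torch
import torch.nn as nn

class AmbiguityDiscriminator(nn.Module):
    def __init__(self, vocab=5000, dim=64, n_types=3):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab, dim)       # toy bag-of-words comment encoder
        self.is_ambiguous = nn.Linear(dim, 2)          # task 1: ambiguous vs. clear
        self.ambiguity_type = nn.Linear(dim, n_types)  # task 2: confusion category

    def forward(self, token_ids):                      # token_ids: (batch, seq_len)
        h = self.embed(token_ids)
        return self.is_ambiguous(h), self.ambiguity_type(h)

model = AmbiguityDiscriminator()
logits_amb, logits_type = model(torch.randint(0, 5000, (4, 12)))
# joint multi-task loss over labeled (ambiguity, type) pairs
loss = (nn.functional.cross_entropy(logits_amb, torch.randint(0, 2, (4,))) +
        nn.functional.cross_entropy(logits_type, torch.randint(0, 3, (4,))))
```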