Pub Date: 2026-01-10 | DOI: 10.1016/j.neunet.2026.108569
Jiawen Gong , Beihao Xia , Qinmu Peng , Bin Zou , Xinge You
Data heterogeneity is a common yet complex challenge in distributed machine learning scenarios. However, current Distributed Support Vector Machines (DSVMs) lack effective mechanisms to identify suitable support vectors across diverse data structures, limiting their ability to dynamically adjust decision boundaries. To address this issue, we propose a Distributed Hybrid learning approach based on Support Vector Machines (DH-SVM). This approach leverages global pre-learning to capture data structure information, which then guides local learning processes to identify higher-quality support vectors and adaptively refine decision boundaries. Furthermore, considering the computational overhead inherent in distributed learning, we enhance our algorithm by incorporating a Markov sampling technique (DH-MSVM). Theoretically, we derive the generalization bound of the algorithm based on uniformly ergodic Markov chain samples and establish a fast learning rate, demonstrating the robustness and scalability of DH-MSVM. Empirically, extensive experiments on real-world datasets validate the superior performance of the proposed algorithms.
Title: DH-MSVM: A hybrid algorithm for seeking quality support vectors in distributed learning (Neural Networks, vol. 198, Article 108569)
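The global-pre-learning-then-local-refinement idea can be sketched in a few lines. The toy below is illustrative only (the paper's DH-SVM and its Markov sampling are more involved): a coarse linear SVM fit on a global subsample scores all points, each simulated node keeps only points near the global decision boundary as candidate support vectors, and the local models are combined by averaging. The Pegasos-style trainer is a stand-in for any SVM solver.

```python
import numpy as np

def fit_linear_svm(X, y, lam=0.01, epochs=30, seed=0):
    """Pegasos-style stochastic subgradient SVM trainer; y must be in {-1, +1}."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1] + 1)            # last entry acts as the bias weight
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            xi = np.append(X[i], 1.0)
            w *= (1.0 - eta * lam)          # regularization shrinkage
            if y[i] * (w @ xi) < 1.0:       # hinge-loss subgradient step
                w += eta * y[i] * xi
    return w

def decision(w, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ w

rng = np.random.default_rng(0)
X = rng.normal(size=(1200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1, -1)

# Global pre-learning on a subsample captures coarse data structure.
idx = rng.choice(len(X), size=200, replace=False)
w_global = fit_linear_svm(X[idx], y[idx])

# Each simulated node keeps only points near the global boundary
# (candidate support vectors) before fitting its local model.
local_ws = []
for part in np.array_split(rng.permutation(len(X)), 4):
    margin = np.abs(decision(w_global, X[part]))
    keep = part[margin < np.quantile(margin, 0.5)]
    local_ws.append(fit_linear_svm(X[keep], y[keep]))

# Combine local models by averaging their decision values.
scores = np.mean([decision(w, X) for w in local_ws], axis=0)
accuracy = np.mean(np.sign(scores) == y)
print(f"ensemble accuracy: {accuracy:.3f}")
```

Filtering by distance to the global margin is one simple way to realize "higher-quality support vectors"; the paper's guidance mechanism is richer than this threshold rule.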
Pub Date: 2026-01-10 | DOI: 10.1016/j.neunet.2026.108585
Xiao Lin , Zeyu Rong , Yan Li , Qizhe Yang , Ping Li , Wei Huang
Long-tailed data is ubiquitous in real-world applications, posing significant challenges due to imbalanced class distributions and high levels of label noise. Previous methods for handling long-tailed data with label noise often incur high computational and manual costs. To address these challenges, we propose a novel two-stage active label cleaning strategy that extends active learning beyond traditional label acquisition to efficiently identify and correct mislabeled samples while minimizing annotation cost. In the first stage, we propose Balanced Class-Centered Contrastive Learning (BCCL) to enhance feature representation quality and identify potential label noise within long-tailed datasets. BCCL achieves this through a novel loss function that integrates contrastive learning with a weighted average of class centers. The second stage employs uncertainty-based active-learning sorted sampling to retrain on samples flagged as potential label noise, focusing on high-uncertainty instances to determine which samples ultimately need to be relabeled. Our two-stage strategy minimizes the amount of data requiring re-annotation and ultimately improves classification performance through iterative re-labeling, while optimizing the use of annotation resources and reducing the annotation workload. Experimental results demonstrate the robustness of the proposed method across varying noise ratios and imbalance levels, effectively enhancing discriminative capability on noisy data across multiple datasets and achieving superior classification performance on long-tailed data, particularly in high-noise scenarios. In experiments on the CIFAR10-LT dataset under imbalance ratio 10 and symmetric noise 0.6, we significantly outperform the state-of-the-art PCSE with a relative improvement of 5.17%.
In addition, on the real-noise long-tail dataset Red Mini-ImageNet under imbalance ratio 100 and noise ratio 0.4, we achieve an accuracy of 38.37%, surpassing existing baselines.
Title: A two-stage active cleaning strategy for long-tail label noise (Neural Networks, vol. 198, Article 108585)
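The second-stage idea of ranking by uncertainty can be sketched as follows. This is an illustrative stand-in, not the paper's exact sorted-sampling procedure: rank samples by predictive entropy and select the top-k most uncertain ones as candidates for re-annotation.

```python
import numpy as np

def entropy(probs):
    """Predictive entropy per sample; probs has shape (n, n_classes)."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def select_for_relabeling(probs, k):
    """Return indices of the k highest-entropy (most uncertain) samples."""
    return np.argsort(entropy(probs))[::-1][:k]

probs = np.array([
    [0.98, 0.01, 0.01],   # confident prediction -> keep the label
    [0.34, 0.33, 0.33],   # near-uniform -> likely noisy, re-annotate
    [0.70, 0.20, 0.10],
])
print(select_for_relabeling(probs, 1))  # the near-uniform row ranks first
```

Any uncertainty measure (margin, least-confidence) could replace entropy here; the budget `k` is what caps the annotation workload.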
Pub Date: 2026-01-10 | DOI: 10.1016/j.neunet.2026.108571
Heng Hu, Hao-Zhe Wang, Si-Bao Chen, Jin Tang
With the development of deep learning methods, the performance of object detection has greatly improved. However, the high resolution of remotely sensed images, complex backgrounds, and the uneven spatial distribution and uneven counts of objects lead to unsatisfactory results from existing detectors. Facing these challenges, we propose YOFOR (You Only Focus on Object Regions), an adaptive local sensing enhancement network. It contains three components: an adaptive local sensing module, a fuzzy enhancement module, and a class balance module. The adaptive local sensing module adaptively localizes dense object regions and dynamically crops them from the view, effectively addressing the uneven distribution of objects. The fuzzy enhancement module further enhances object regions by weakening background interference, thus improving detection performance. The class balance module analyzes the dataset to obtain the distribution of long-tailed classes, takes into account the direction of tailed classes and the distance around each object, and operates on tailed classes within a certain range to alleviate the long-tail problem and further improve detection performance. All three components are unsupervised and can be easily inserted into existing networks. Extensive experiments on the VisDrone, DOTA, and AI-TOD datasets demonstrate the effectiveness and adaptability of the method.
Title: YOFOR: You only focus on object regions for tiny object detection in aerial images (Neural Networks, vol. 198, Article 108571)
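The dense-region localization step can be illustrated with a simple density heuristic. This is a hypothetical sketch, not the released YOFOR code: bin detected object centers on a coarse grid and crop the densest cell, so a detector can refocus on crowded regions.

```python
import numpy as np

def densest_crop(centers, img_w, img_h, grid=4):
    """centers: (n, 2) array of (x, y) object centers in pixels.
    Returns (x0, y0, x1, y1) of the grid cell holding the most centers."""
    gx = np.clip((centers[:, 0] / img_w * grid).astype(int), 0, grid - 1)
    gy = np.clip((centers[:, 1] / img_h * grid).astype(int), 0, grid - 1)
    counts = np.zeros((grid, grid), dtype=int)
    np.add.at(counts, (gy, gx), 1)          # unbuffered histogram accumulation
    cy, cx = np.unravel_index(counts.argmax(), counts.shape)
    cw, ch = img_w / grid, img_h / grid
    return (cx * cw, cy * ch, (cx + 1) * cw, (cy + 1) * ch)

# Three objects clustered near (100, 100) plus one outlier.
centers = np.array([[100, 100], [110, 105], [120, 95], [900, 800]])
print(densest_crop(centers, 1024, 1024))    # -> (0.0, 0.0, 256.0, 256.0)
```

A real adaptive module would size and merge crops dynamically rather than use a fixed grid, but the density-then-crop loop is the same shape.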
Pub Date: 2026-01-10 | DOI: 10.1016/j.neunet.2026.108576
Haihua Luo , Xuming Ran , Zhengji Li , Huiyan Xue , Tingting Jiang , Jiangrong Shen , Tommi Kärkkäinen , Qi Xu , Fengyu Cong
Continual learning aims to enable models to acquire new knowledge while retaining previously learned information. Prompt-based methods have shown remarkable performance in this domain; however, they typically rely on key-value pairing, which can introduce inter-task interference and hinder scalability. To overcome these limitations, we propose a novel approach employing task-specific Prompt-Prototype (ProP), thereby eliminating the need for key-value pairs. In our method, task-specific prompts facilitate more effective feature learning for the current task, while corresponding prototypes capture the representative features of the input. During inference, predictions are generated by binding each task-specific prompt with its associated prototype. Additionally, we introduce regularization constraints during prompt initialization to penalize excessively large values, thereby enhancing stability. Experiments on several widely used datasets demonstrate the effectiveness of the proposed method. In contrast to mainstream prompt-based approaches, our framework removes the dependency on key-value pairs, offering a fresh perspective for future continual learning research.
Title: Key-value pair-free continual learner via task-specific prompt-prototype (Neural Networks, vol. 198, Article 108576)
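The binding of task-specific prompts to prototypes at inference can be sketched as nearest-prototype matching. The names and structure below are illustrative assumptions, not ProP's implementation: each task stores class prototypes (mean features), and prediction picks the most similar prototype by cosine similarity, with no key-value lookup.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def predict(feature, task_prototypes):
    """task_prototypes: {task_id: {class_id: prototype_vector}}.
    Returns the (task_id, class_id) of the most similar prototype."""
    best, best_sim = None, -np.inf
    for task, protos in task_prototypes.items():
        for cls, proto in protos.items():
            sim = cosine(feature, proto)
            if sim > best_sim:
                best, best_sim = (task, cls), sim
    return best

protos = {
    "task0": {"cat": np.array([1.0, 0.0]), "dog": np.array([0.0, 1.0])},
    "task1": {"car": np.array([1.0, 1.0])},
}
print(predict(np.array([0.9, 0.1]), protos))  # -> ('task0', 'cat')
```

In the full method the winning task's prompt would then condition the backbone; here only the prototype-matching step is shown.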
Pub Date: 2026-01-09 | DOI: 10.1016/j.neunet.2026.108560
Jangseop Park , Namwoo Kang
Nonlinear structural analyses in engineering often require extensive finite element simulations, limiting their applicability in design optimization and real-time control. Conventional deep learning surrogates often struggle with complex, non-parametric three-dimensional (3D) geometries and directionally varying loads. This work presents Point-DeepONet, an operator-learning-based surrogate that integrates PointNet into the DeepONet framework to learn a mapping from non-parametric geometries and variable load conditions to physical response fields. By leveraging PointNet to learn a geometric representation from raw point clouds, our model circumvents the need for manual parameterization. This geometric embedding is then synergistically fused with load conditions within the DeepONet architecture to accurately predict three-dimensional displacement and von Mises stress fields. Trained on a large-scale dataset, Point-DeepONet demonstrates high fidelity, achieving a coefficient of determination (R²) reaching 0.987 for displacement and 0.923 for von Mises stress. Furthermore, to rigorously validate its generalization capabilities, we conducted additional experiments on unseen, randomly oriented load directions, where the model maintained exceptional accuracy. Compared to nonlinear finite element analyses that require about 19.32 minutes per case, Point-DeepONet provides predictions in mere seconds (approximately 400 times faster) while maintaining excellent scalability. These findings, validated through extensive experiments and ablation studies, highlight the potential of Point-DeepONet to enable rapid, high-fidelity structural analyses for complex engineering workflows.
Title: Point-DeepONet: Predicting nonlinear fields on non-parametric geometries under variable load conditions (Neural Networks, vol. 198, Article 108560)
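The operator-learning core of a DeepONet fits in a few lines. The sketch below shows only the output rule, with random stand-in weights: a branch net encodes the sampled input function into p coefficients, a trunk net encodes a query point into p basis values, and the prediction is their dot product. A real Point-DeepONet replaces the branch with a PointNet over the geometry's point cloud and trains both nets end to end.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 8                                   # number of shared basis functions

W_b = rng.normal(size=(16, p))          # stand-in branch weights
W_t = rng.normal(size=(3, p))           # stand-in trunk weights

def branch(u):
    """Encode the input function (e.g. a load sampled at 16 sensors)."""
    return np.tanh(u @ W_b)

def trunk(y):
    """Encode a 3D query coordinate into p basis values."""
    return np.cos(y @ W_t)

def deeponet(u, y):
    """DeepONet output rule: G(u)(y) = sum_k branch_k(u) * trunk_k(y)."""
    return float(branch(u) @ trunk(y))

u = rng.normal(size=16)                 # sampled load condition
y = np.array([0.2, 0.5, 0.1])           # query point
print(deeponet(u, y))                   # predicted field value at y
```

Because the trunk is evaluated per query point, the same trained model predicts the field at arbitrary locations without re-meshing, which is where the speedup over per-case FEM runs comes from.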
Pub Date: 2026-01-09 | DOI: 10.1016/j.neunet.2026.108564
Xiaoru Gao , Guoyan Zheng
Unsupervised domain adaptation (UDA) in medical image segmentation presents significant challenges due to substantial cross-domain disparities and the inherent absence of target domain annotations. In this study, to address these challenges, we propose an end-to-end progressive domain bridging framework based on representation disentanglement and triple-level consistency-driven feature alignment, referred to as ReTri, that synergistically integrates a representation disentanglement-based image alignment (RDIA) module with a novel triple-level consistency-driven feature alignment (TCFA) module. In particular, the RDIA module aims to establish an initial domain bridge by decoupling and aligning fundamental visual disparities through disentangled representation learning, while the novel TCFA module hierarchically bridges remaining cross-domain semantic discrepancies and feature distribution disparities via two novel consistency-driven alignment mechanisms: 1) attention-guided semantics-level consistency alignment, where we purposely design a bi-attentive semantic feature extraction (BSFE) component coupled with an attention-adaptive semantic consistency (ASC) loss function, facilitating dynamic alignment of high-level semantic representations across domains, and 2) multi-view dual-level mixing consistency alignment, consisting of Feature-Cut consistent self-ensembling (FCCS) and Trans-Cut consistent self-ensembling (TCCS) components. These two components operate within intermediate mixing spaces to ensure robust knowledge transfer through complementary feature- and prediction-level consistency regularization. Extensive experimental evaluations are conducted on four challenging datasets (Lumbar Spine CT-MR, Cardiac CT-MR, Cross-domain Echocardiography, and Multi-center Prostate MR) across seven UDA-based segmentation scenarios and two external validation scenarios. 
Our framework achieves superior performance over the best state-of-the-art (SOTA) methods on the following UDA-based segmentation scenarios: +2.9% DSC for spine CT → MR segmentation, +3.6% and +2.4% DSC for bidirectional cardiac CT↔MR segmentation, +1.7% and +2.3% DSC for bidirectional cross-center cross-vendor echocardiography (CAMUS↔EchoNet-Dynamic) segmentation, and +12.2% and +12.0% DSC for bidirectional multi-center prostate MR segmentation. The source code and the datasets are publicly available at https://github.com/xiaorugao999/ReTri.
Title: ReTri: Progressive domain bridging via representation disentanglement and triple-level consistency-driven feature alignment for unsupervised domain adaptive medical image segmentation (Neural Networks, vol. 198, Article 108564)
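Prediction-level consistency regularization, the common core of self-ensembling components like those described above, reduces to a small loss term. This is an illustrative sketch of the generic technique, not the ReTri loss: penalize disagreement between the softened predictions of two views (or of a student and a teacher) with a mean-squared error.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))   # numerically stable
    return e / e.sum(axis=1, keepdims=True)

def consistency_loss(logits_a, logits_b):
    """MSE between two softened prediction maps of the same inputs."""
    return float(np.mean((softmax(logits_a) - softmax(logits_b)) ** 2))

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 3))                   # 4 pixels, 3 classes
noisy = logits + 0.1 * rng.normal(size=logits.shape)
print(consistency_loss(logits, noisy))             # small: views agree
print(consistency_loss(logits, -logits))           # larger: views conflict
```

In a full UDA pipeline this term is weighted against the supervised loss on source-domain labels; the mixing-space variants in the paper apply it to feature- and prediction-level blends rather than raw views.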
Pub Date: 2026-01-09 | DOI: 10.1016/j.neunet.2026.108541
Wei Zhou , Yining Xie , Fengjiao Wang , Jing Zhao , Jiayi Ma
Accurately identifying the mediastinal regions where metastatic lymph nodes are located is critical for the staging diagnosis of lung cancer. This identification task involves two distinct detection dimensions: mediastinal region identification and lymph node metastasis assessment. Traditional single-task image classification algorithms struggle to manage the interference between different classification dimensions within a single task. Existing multi-task learning methods struggle to balance the relationship between shared and task-specific features, and often fail to effectively fit the underlying data distributions and task characteristics during gradient adjustment. To address these challenges, we propose a Multi-Task Information Decoupling Strategy (MT-IDS). MT-IDS decomposes the main task into multiple auxiliary tasks along different feature dimensions, forming a unified multi-task system to optimize detection performance across tasks. A Dual-control Branch Routing Gate Mechanism (DBR) is employed in MT-IDS to compute the weighting of shared and task-specific features, thereby enabling more precise expert selection and feature extraction for each task. Additionally, a Dual-Dimensional Gradient Balancing Algorithm (DD-GB) is introduced in MT-IDS, whereby gradient balance is achieved through alignment of gradient directions and dynamic scaling of magnitudes, while the distribution of inter-task gradient characteristics is maintained. The significant advantages demonstrated by MT-IDS in both ablation and comparative experiments indicate its potential as an innovative solution for multi-dimensional medical image classification problems.
Title: MT-IDS: A multi-task information decoupling strategy for identifying lymph node metastasis in the mediastinal region (Neural Networks, vol. 198, Article 108541)
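The two ingredients named above, alignment of gradient directions and dynamic scaling of magnitudes, can be sketched with a PCGrad-style projection. This is illustrative only (DD-GB's exact update is defined in the paper): when two task gradients conflict, project one onto the normal plane of the other, then rescale both to a common norm.

```python
import numpy as np

def balance(g1, g2):
    """Balance two task gradients; assumes both are nonzero vectors."""
    # Direction alignment: if the gradients conflict (negative dot product),
    # remove from g1 the component that opposes g2.
    if g1 @ g2 < 0:
        g1 = g1 - (g1 @ g2) / (g2 @ g2) * g2
    # Magnitude scaling: rescale both to the mean of their norms so neither
    # task dominates the shared update.
    n1, n2 = np.linalg.norm(g1), np.linalg.norm(g2)
    target = 0.5 * (n1 + n2)
    return g1 * target / n1, g2 * target / n2

g1 = np.array([1.0, 0.0])
g2 = np.array([-1.0, 0.1])          # conflicting direction
b1, b2 = balance(g1, g2)
print(b1 @ b2)                      # no longer negative after projection
```

A production implementation would also preserve the distribution of inter-task gradient statistics, as the abstract notes; this sketch shows only the per-step geometry.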
Pub Date: 2026-01-09 | DOI: 10.1016/j.neunet.2026.108580
Long Tang , Xin Si , Yingjie Tian , Panos M Pardalos
As a novel paradigm for learning with inexact supervision, Pcomp classification reduces the annotation costs of training a binary classifier by using ordered pairwise samples without requiring precise labels. However, existing methods fail to fully account for sign differences in empirical risk at the level of individual sample pairs, resulting in polarized fitting where the risks of overfitting and underfitting coexist. In fact, positive and negative empirical risks indicate varying degrees of training difficulty, necessitating differentiated treatment. In this work, we propose a BLINEX-Pcomp model that employs a bounded linear-exponential function to impose distinct penalties on positive and negative risks for each sample pair. The BLINEX-Pcomp model dynamically shifts the training focus toward challenging sample pairs, balancing the pairwise-level risks of overfitting and underfitting. Additionally, a multi-view version of BLINEX-Pcomp (MV-BLINEX-Pcomp) is developed to further enhance performance by integrating multi-view features. We theoretically verify that MV-BLINEX-Pcomp reduces to BLINEX-Pcomp when only a single view of features is available. A dual-stage solver is designed to train the MV-BLINEX-Pcomp model. Numerical results from comparative experiments validate the effectiveness of our methods in tackling Pcomp classification.
{"title":"Taming polarized fitting: BLINEX-Pcomp with asymmetric risk penalty for robust Pcomp classification","authors":"Long Tang , Xin Si , Yingjie Tian , Panos M Pardalos","doi":"10.1016/j.neunet.2026.108580","DOIUrl":"10.1016/j.neunet.2026.108580","url":null,"abstract":"<div><div>As a novel paradigm for learning with inexact supervision, Pcomp classification reduces the annotation costs of training a binary classifier by using ordered pairwise samples without requiring precise labels. However, existing methods fail to fully account for sign differences in empirical risk at the level of individual sample pairs, resulting in polarized fitting where the risks of overfitting and underfitting coexist. Actually, positive and negative empirical risks indicate varying degrees of training difficulty, necessitating differentiated treatments. In this work, we propose a BLINEX-Pcomp model that employs a bounded linear-exponential function to impose distinct penalties on positive and negative risks for each sample pair. The BLINEX-Pcomp model dynamically shifts the training focus toward challenging sample pairs, well balancing pairwise-level risks of overfitting and underfitting. Additionally, a multi-view version of BLINEX-Pcomp (MV-BLINEX-Pcomp) is developed to further enhance performance by integrating multi-view features. We have theoretically verified that MV-BLINEX-Pcomp degrades to BLINEX-Pcomp when only a single view of features is available. A dual-stage solver is designed to train the MV-BLINEX-Pcomp model. 
Numerical results from comparative experiments validate the effectiveness of our methods on Pcomp classification.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"198 ","pages":"Article 108580"},"PeriodicalIF":6.3,"publicationDate":"2026-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145981714","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2026-01-09DOI: 10.1016/j.neunet.2026.108581
Zengbiao Yang, Yihua Tan
Deep multi-view clustering has developed rapidly in recent years, leveraging the powerful representation capabilities of deep neural networks. In particular, instance-level feature contrastive learning is widely used in deep multi-view clustering to align the embedded features of the same samples across views. However, it overlooks the constraint of clustering-structure consistency across views. Drawing inspiration from instance-level feature contrastive learning, we propose deep multi-view clustering based on instance-level adaptive structural contrastive learning. First, considering that different views may have varying impacts on the clustering results of different multi-view samples, we use a Transformer Pooling module to adaptively fuse the views of each sample, obtaining a fused view from which the final clustering results are derived. Second, we propose a clue-consistency-based approach to identify sample pairs that exhibit consistent clustering structures between each view and the fused view, forming local and global consistency clustering information. Incorporating the global consistency clustering information, we construct adjacency matrices for each view and the fused view. Since the adjacency matrices record the clustering structure of each sample with respect to the others, we propose instance-level adaptive structural contrastive learning, which leverages the local consistency information to align the overall clustering structures of the same samples across the individual views and the fused view. By comparing the proposed method with several state-of-the-art methods on multiple multi-view datasets, we demonstrate the superiority of the proposed approach.
{"title":"Deep multi-view clustering based on instance-level adaptive structural contrastive learning","authors":"Zengbiao Yang, Yihua Tan","doi":"10.1016/j.neunet.2026.108581","DOIUrl":"10.1016/j.neunet.2026.108581","url":null,"abstract":"<div><div>Deep multi-view clustering has rapidly developed in recent years, leveraging the powerful representation capabilities of deep neural networks. Among them, instance-level feature contrastive learning is widely used in deep multi-view clustering to align the embedded features of the same samples across views. However, it overlooks the constraint on the clustering structure consistency across views. Drawing inspiration from the instance-level feature contrastive learning mentioned above, we propose deep multi-view clustering based on instance-level adaptive structural contrastive learning. First, considering that different views may have varying impacts on the clustering results of different multi-view samples, we utilize Transformer Pooling module to adaptively fuse different views of different samples, obtaining the fused view. The final clustering results are derived from the fused view as well. Secondly, we propose a clue-consistency-based approach to identify sample pairs that exhibit consistent clustering structures between each view and the fused view, forming local and global consistency clustering information. Incorporating global consistency clustering information, we construct adjacency matrices for each view and the fused view. Since adjacency matrices record the clustering structure of each sample with others, we propose the instance-level adaptive structural contrastive learning, leveraging the above local consistency information to align the overall clustering structures of the same samples across different views and the fused view. 
By comparing the proposed method with several state-of-the-art methods on multiple multi-view datasets, we demonstrate the superiority of the proposed approach.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"198 ","pages":"Article 108581"},"PeriodicalIF":6.3,"publicationDate":"2026-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145981629","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
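The instance-level feature contrastive objective the abstract builds on (not the paper's structural variant) can be sketched as a cross-view InfoNCE loss: the two views of the same sample form the positive pair, and all other samples' views are negatives. The temperature `t` and cosine similarity are standard assumptions, not details from the paper.

```python
import math

def cos(u, v):
    # Cosine similarity between two embedding vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def cross_view_contrastive(z1, z2, t=0.5):
    """Average InfoNCE loss pulling together the two views of each
    sample i while pushing apart the views of different samples."""
    n = len(z1)
    total = 0.0
    for i in range(n):
        logits = [cos(z1[i], z2[j]) / t for j in range(n)]
        m = max(logits)  # log-sum-exp with the max subtracted for stability
        log_den = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_den - logits[i]  # -log softmax at the positive pair
    return total / n
```

When the two views' embeddings of each sample agree, the loss is small; permuting one view's embeddings (so positives no longer match) raises it, which is the alignment pressure the paper extends from features to clustering structures.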
Pub Date : 2026-01-09DOI: 10.1016/j.neunet.2026.108592
Meihua Zhou , Tianlong Zheng , Baihua Wang , Xinyu Tong , Wai kin Fung , Li Yang
Deep clustering of single-cell RNA-seq data faces significant challenges due to extreme sparsity and noise. We present DAGCL (Dynamic Attention-enhanced Graph Embedding with Curriculum Learning), a dynamic graph embedding framework that reframes representation learning as a coarse-to-fine evolutionary process. Unlike conventional static paradigms, DAGCL employs a curriculum-guided scheduling mechanism that actively modulates both attention intensity and supervision stringency throughout training. This strategy aligns model complexity with feature maturity, effectively mitigating early-stage confirmation bias. To further stabilize optimization, we incorporate an entropy-regularized Sinkhorn projection that enforces globally balanced soft assignments. Extensive experiments on 27 benchmarks demonstrate that DAGCL consistently outperforms baselines in clustering accuracy and robustness. Our work establishes a principled strategy for unsupervised learning where structural constraints and supervisory pressure co-evolve with learned representations.
{"title":"Curriculum-guided divergence scheduling improves single-cell clustering robustness","authors":"Meihua Zhou , Tianlong Zheng , Baihua Wang , Xinyu Tong , Wai kin Fung , Li Yang","doi":"10.1016/j.neunet.2026.108592","DOIUrl":"10.1016/j.neunet.2026.108592","url":null,"abstract":"<div><div>Deep clustering of single-cell RNA-seq data faces significant challenges due to extreme sparsity and noise. We present DAGCL (Dynamic Attention-enhanced Graph Embedding with Curriculum Learning), a dynamic graph embedding framework that reframes representation learning as a coarse-to-fine evolutionary process. Unlike conventional static paradigms, DAGCL employs a curriculum-guided scheduling mechanism that actively modulates both attention intensity and supervision stringency throughout training. This strategy aligns model complexity with feature maturity, effectively mitigating early-stage confirmation bias. To further stabilize optimization, we incorporate an entropy-regularized Sinkhorn projection that enforces globally balanced soft assignments. Extensive experiments on 27 benchmarks demonstrate that DAGCL consistently outperforms baselines in clustering accuracy and robustness. Our work establishes a principled strategy for unsupervised learning where structural constraints and supervisory pressure co-evolve with learned representations.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"198 ","pages":"Article 108592"},"PeriodicalIF":6.3,"publicationDate":"2026-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145981728","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
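The entropy-regularized Sinkhorn projection the DAGCL abstract mentions can be sketched as the standard Sinkhorn-Knopp iteration on sample-to-cluster scores: rows are normalized to valid per-sample distributions and columns to a uniform cluster marginal, yielding globally balanced soft assignments. The regularization strength `eps`, the uniform target marginal, and the iteration count are assumptions for illustration, not the paper's settings.

```python
import math

def sinkhorn_balanced(scores, eps=0.05, n_iter=200):
    """Entropy-regularized Sinkhorn projection: turns raw sample-to-cluster
    scores into soft assignments whose rows sum to 1 (one distribution per
    sample) and whose columns are approximately balanced (each cluster
    receives about n/k total mass)."""
    n, k = len(scores), len(scores[0])
    # Gibbs kernel: smaller eps -> sharper assignments.
    q = [[math.exp(s / eps) for s in row] for row in scores]
    for _ in range(n_iter):
        # Scale columns toward the uniform cluster marginal 1/k ...
        col = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [[q[i][j] / (k * col[j]) for j in range(k)] for i in range(n)]
        # ... then rows to 1, so each sample keeps a valid distribution.
        for i in range(n):
            r = sum(q[i])
            q[i] = [v / r for v in q[i]]
    return q
```

Balancing the columns is what prevents the degenerate solution in which every sample collapses into one cluster, which is the stabilizing role the abstract attributes to this projection.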