
Latest Publications in Neural Networks

DropNaE: Alleviating irregularity for large-scale graph representation learning.
IF 6.0 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-03-01 Epub Date: 2024-12-06 DOI: 10.1016/j.neunet.2024.106930
Xin Liu, Xunbin Xiong, Mingyu Yan, Runzhen Xue, Shirui Pan, Songwen Pei, Lei Deng, Xiaochun Ye, Dongrui Fan

Large-scale graphs are prevalent in various real-world scenarios and can be effectively processed using Graph Neural Networks (GNNs) on GPUs to derive meaningful representations. However, the inherent irregularity of real-world graphs makes it hard to exploit the single-instruction multiple-data execution mode of GPUs, leading to inefficiencies in GNN training. In this paper, we alleviate this irregularity at its origin: the irregular graph data itself. To this end, we propose DropNaE, which alleviates the irregularity of large-scale graphs by conditionally dropping nodes and edges before GNN training. Specifically, we first present a metric to quantify the neighbor heterophily of all nodes in a graph. Based on this metric, we then propose two variants of DropNaE that transform the irregular degree distribution of a large-scale graph into a uniform one. Experiments show that DropNaE is highly compatible and can be integrated into popular GNNs, improving both their training efficiency and accuracy. DropNaE is performed offline and requires no online computing resources, benefiting current and future state-of-the-art GNNs to a significant extent.
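
Neither the heterophily metric nor the exact drop rule is given in the abstract, so the following is only a plausible sketch: heterophily is approximated as the fraction of a node's neighbors carrying a different label, and edges of high-degree, high-heterophily nodes are subsampled down to a degree cap before training. The names drop_nae, degree_cap, and heterophily_thresh are hypothetical.

```python
import numpy as np

def drop_nae(edge_index: np.ndarray, labels: np.ndarray,
             degree_cap: int, heterophily_thresh: float = 0.5) -> np.ndarray:
    """Offline pass: subsample edges of high-degree, high-heterophily nodes,
    flattening the degree distribution before GNN training (sketch only)."""
    src, dst = edge_index                      # edge_index has shape (2, E)
    num_nodes = labels.shape[0]
    degree = np.bincount(src, minlength=num_nodes)

    # Per-node heterophily: share of neighbors with a different label.
    diff = (labels[src] != labels[dst]).astype(float)
    het = np.zeros(num_nodes)
    np.add.at(het, src, diff)
    het /= np.maximum(degree, 1)

    keep = np.ones(src.shape[0], dtype=bool)
    for v in np.where((degree > degree_cap) & (het > heterophily_thresh))[0]:
        edges_v = np.flatnonzero(src == v)
        # Randomly keep only degree_cap outgoing edges of this node.
        drop = np.random.choice(edges_v, size=edges_v.size - degree_cap,
                                replace=False)
        keep[drop] = False
    return edge_index[:, keep]
```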

{"title":"DropNaE: Alleviating irregularity for large-scale graph representation learning.","authors":"Xin Liu, Xunbin Xiong, Mingyu Yan, Runzhen Xue, Shirui Pan, Songwen Pei, Lei Deng, Xiaochun Ye, Dongrui Fan","doi":"10.1016/j.neunet.2024.106930","DOIUrl":"10.1016/j.neunet.2024.106930","url":null,"abstract":"<p><p>Large-scale graphs are prevalent in various real-world scenarios and can be effectively processed using Graph Neural Networks (GNNs) on GPUs to derive meaningful representations. However, the inherent irregularity found in real-world graphs poses challenges for leveraging the single-instruction multiple-data execution mode of GPUs, leading to inefficiencies in GNN training. In this paper, we try to alleviate this irregularity at its origin-the irregular graph data itself. To this end, we propose DropNaE to alleviate the irregularity in large-scale graphs by conditionally dropping nodes and edges before GNN training. Specifically, we first present a metric to quantify the neighbor heterophily of all nodes in a graph. Then, we propose DropNaE containing two variants to transform the irregular degree distribution of the large-scale graph to a uniform one, based on the proposed metric. Experiments show that DropNaE is highly compatible and can be integrated into popular GNNs to promote both training efficiency and accuracy of used GNNs. DropNaE is offline performed and requires no online computing resources, benefiting the state-of-the-art GNNs in the present and future to a significant extent.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"183 ","pages":"106930"},"PeriodicalIF":6.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142819907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Improving forward compatibility in class incremental learning by increasing representation rank and feature richness.
IF 6.0 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-03-01 Epub Date: 2024-12-02 DOI: 10.1016/j.neunet.2024.106969
Jaeill Kim, Wonseok Lee, Moonjung Eo, Wonjong Rhee

Class Incremental Learning (CIL) constitutes a pivotal subfield within continual learning, aimed at enabling models to progressively learn new classification tasks while retaining knowledge obtained from prior tasks. Although previous studies have predominantly focused on backward compatible approaches to mitigate catastrophic forgetting, recent investigations have introduced forward compatible methods that enhance performance on novel tasks and complement existing backward compatible methods. In this study, we introduce the effective-Rank based Feature Richness enhancement (RFR) method, designed to improve forward compatibility. Specifically, the method increases the effective rank of representations during the base session, thereby facilitating the incorporation of informative features pertinent to unseen novel tasks. Consequently, RFR achieves dual objectives in backward and forward compatibility: minimizing feature extractor modifications and enhancing novel-task performance, respectively. To validate the efficacy of our approach, we establish a theoretical connection between effective rank and the Shannon entropy of representations. Subsequently, we conduct comprehensive experiments by integrating RFR into eleven well-known CIL methods. Our results demonstrate the effectiveness of our approach in enhancing novel-task performance while mitigating catastrophic forgetting. Furthermore, our method notably improves the average incremental accuracy across all eleven cases examined.
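
The stated link between effective rank and Shannon entropy suggests the usual definition erank(Z) = exp(H(p)), where p is the normalized singular-value spectrum of the feature matrix. Below is a minimal PyTorch sketch of an auxiliary loss that rewards a higher effective rank during the base session; the actual RFR objective and its weighting are not given in the abstract.

```python
import torch

def effective_rank_loss(features: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Negative effective rank of a feature batch (N x D); minimizing it
    pushes representations toward a richer, higher-rank spectrum."""
    z = features - features.mean(dim=0, keepdim=True)  # center the batch
    s = torch.linalg.svdvals(z)                        # singular values
    p = s / (s.sum() + eps)                            # normalized spectrum
    entropy = -(p * torch.log(p + eps)).sum()          # Shannon entropy H(p)
    return -torch.exp(entropy)                         # erank = exp(H(p))
```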

{"title":"Improving forward compatibility in class incremental learning by increasing representation rank and feature richness.","authors":"Jaeill Kim, Wonseok Lee, Moonjung Eo, Wonjong Rhee","doi":"10.1016/j.neunet.2024.106969","DOIUrl":"10.1016/j.neunet.2024.106969","url":null,"abstract":"<p><p>Class Incremental Learning (CIL) constitutes a pivotal subfield within continual learning, aimed at enabling models to progressively learn new classification tasks while retaining knowledge obtained from prior tasks. Although previous studies have predominantly focused on backward compatible approaches to mitigate catastrophic forgetting, recent investigations have introduced forward compatible methods to enhance performance on novel tasks and complement existing backward compatible methods. In this study, we introduce effective-Rank based Feature Richness enhancement (RFR) method that is designed for improving forward compatibility. Specifically, this method increases the effective rank of representations during the base session, thereby facilitating the incorporation of more informative features pertinent to unseen novel tasks. Consequently, RFR achieves dual objectives in backward and forward compatibility: minimizing feature extractor modifications and enhancing novel task performance, respectively. To validate the efficacy of our approach, we establish a theoretical connection between effective rank and the Shannon entropy of representations. Subsequently, we conduct comprehensive experiments by integrating RFR into eleven well-known CIL methods. Our results demonstrate the effectiveness of our approach in enhancing novel-task performance while mitigating catastrophic forgetting. Furthermore, our method notably improves the average incremental accuracy across all eleven cases examined.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"183 ","pages":"106969"},"PeriodicalIF":6.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142796518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
BPEN: Brain Posterior Evidential Network for trustworthy brain imaging analysis.
IF 6.0 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-03-01 Epub Date: 2024-11-26 DOI: 10.1016/j.neunet.2024.106943
Kai Ye, Haoteng Tang, Siyuan Dai, Igor Fortel, Paul M Thompson, R Scott Mackin, Alex Leow, Heng Huang, Liang Zhan

The application of deep learning techniques to analyze brain functional magnetic resonance imaging (fMRI) data has led to significant advancements in identifying prospective biomarkers associated with various clinical phenotypes and neurological conditions. Despite these achievements, the aspect of prediction uncertainty has been relatively underexplored in brain fMRI data analysis. Accurate uncertainty estimation is essential for trustworthy learning, given the challenges associated with brain fMRI data acquisition and the potential diagnostic implications for patients. To address this gap, we introduce a novel posterior evidential network, named the Brain Posterior Evidential Network (BPEN), designed to capture both aleatoric and epistemic uncertainty in the analysis of brain fMRI data. We conducted comprehensive experiments using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and ADNI-depression (ADNI-D) cohorts, focusing on predictions for mild cognitive impairment (MCI) and depression across various diagnostic groups. Our experiments not only unequivocally demonstrate the superior predictive performance of our BPEN model compared to existing state-of-the-art methods but also underscore the importance of uncertainty estimation in predictive models.
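
The abstract does not spell out BPEN's posterior formulation. As a reference point only, a standard evidential classification head (in the style of evidential deep learning) maps logits to Dirichlet concentrations and reads epistemic uncertainty off the total evidence; aleatoric uncertainty can then be derived from the expected predictive entropy.

```python
import torch
import torch.nn.functional as F

def dirichlet_uncertainty(logits: torch.Tensor):
    """Generic evidential head (not BPEN itself): Dirichlet parameters from
    non-negative evidence, expected class probabilities, and vacuity K/S."""
    evidence = F.softplus(logits)                # non-negative evidence
    alpha = evidence + 1.0                       # Dirichlet concentration
    strength = alpha.sum(dim=-1, keepdim=True)   # total evidence S
    probs = alpha / strength                     # expected probabilities
    epistemic = logits.shape[-1] / strength      # vacuity: K / S
    return probs, epistemic
```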

{"title":"BPEN: Brain Posterior Evidential Network for trustworthy brain imaging analysis.","authors":"Kai Ye, Haoteng Tang, Siyuan Dai, Igor Fortel, Paul M Thompson, R Scott Mackin, Alex Leow, Heng Huang, Liang Zhan","doi":"10.1016/j.neunet.2024.106943","DOIUrl":"10.1016/j.neunet.2024.106943","url":null,"abstract":"<p><p>The application of deep learning techniques to analyze brain functional magnetic resonance imaging (fMRI) data has led to significant advancements in identifying prospective biomarkers associated with various clinical phenotypes and neurological conditions. Despite these achievements, the aspect of prediction uncertainty has been relatively underexplored in brain fMRI data analysis. Accurate uncertainty estimation is essential for trustworthy learning, given the challenges associated with brain fMRI data acquisition and the potential diagnostic implications for patients. To address this gap, we introduce a novel posterior evidential network, named the Brain Posterior Evidential Network (BPEN), designed to capture both aleatoric and epistemic uncertainty in the analysis of brain fMRI data. We conducted comprehensive experiments using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and ADNI-depression (ADNI-D) cohorts, focusing on predictions for mild cognitive impairment (MCI) and depression across various diagnostic groups. Our experiments not only unequivocally demonstrate the superior predictive performance of our BPEN model compared to existing state-of-the-art methods but also underscore the importance of uncertainty estimation in predictive models.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"183 ","pages":"106943"},"PeriodicalIF":6.0,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11750605/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142808452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Graph anomaly detection based on hybrid node representation learning.
IF 6.0 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-01-20 DOI: 10.1016/j.neunet.2025.107169
Xiang Wang, Hao Dou, Dibo Dong, Zhenyu Meng

Anomaly detection on graph data has garnered significant interest from both academia and industry. In recent years, fueled by the rapid development of Graph Neural Networks (GNNs), various GNN-based anomaly detection methods have been proposed and have achieved good results. However, GNN-based methods assume that connected nodes have similar classes and features, leading to issues of class inconsistency and semantic inconsistency in graph anomaly detection. Existing methods have yet to adequately address these issues, which limits detection performance. We therefore propose an anomaly detection method consisting of a semantic fusion-based node representation module and an attention mechanism-based node representation module, which resolve these two issues respectively. The main highlights of this study are as follows. First, a novel framework is developed to better resolve class inconsistency and semantic inconsistency in graph anomaly detection. Second, we propose a semantic fusion-based node representation module built on Chebyshev polynomial graph filtering, which effectively captures the high-frequency and low-frequency components of graph signals. Third, to overcome semantic inconsistency in graph data, we devise an attention mechanism-based node representation module that adaptively learns the importance of graph nodes, yielding a significant improvement in model performance. Finally, experiments on five real-world anomaly detection datasets show that the proposed method outperforms state-of-the-art methods.
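
Chebyshev polynomial graph filtering, on which the semantic-fusion module is built, evaluates sum_k c_k T_k(L~) x through the three-term recurrence T_k = 2 L~ T_{k-1} - T_{k-2}, so no eigendecomposition of the Laplacian is required. A dense-matrix sketch assuming at least two coefficients (the paper presumably uses sparse operators and learned coefficients):

```python
import torch

def chebyshev_filter(x: torch.Tensor, lap_norm: torch.Tensor,
                     coeffs: list) -> torch.Tensor:
    """Apply sum_k coeffs[k] * T_k(lap_norm) @ x, where lap_norm is the
    scaled Laplacian 2L/lambda_max - I with spectrum in [-1, 1]."""
    t_prev, t_curr = x, lap_norm @ x
    out = coeffs[0] * t_prev + coeffs[1] * t_curr
    for c in coeffs[2:]:
        t_next = 2 * (lap_norm @ t_curr) - t_prev  # Chebyshev recurrence
        out = out + c * t_next
        t_prev, t_curr = t_curr, t_next
    return out
```

The learned coefficients shape the filter's frequency response, which is what lets a single filter pass both low- and high-frequency bands of the graph signal.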

{"title":"Graph anomaly detection based on hybrid node representation learning.","authors":"Xiang Wang, Hao Dou, Dibo Dong, Zhenyu Meng","doi":"10.1016/j.neunet.2025.107169","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107169","url":null,"abstract":"<p><p>Anomaly detection on graph data has garnered significant interest from both the academia and industry. In recent years, fueled by the rapid development of Graph Neural Networks (GNNs), various GNNs-based anomaly detection methods have been proposed and achieved good results. However, GNNs-based methods assume that connected nodes have similar classes and features, leading to issues of class inconsistency and semantic inconsistency in graph anomaly detection. Existing methods have yet to adequately address these issues, thereby limiting the detection performance of the model. Therefore, an anomaly detection method that consists of one semantic fusion-based node representation module and one attention mechanism-based node representation module is proposed to resolve the aforementioned issues, respectively. The main highlights of the current study are outlined below: First, a novel framework is developed, aiming to better resolve the issues of class inconsistency and semantic inconsistency in graph anomaly detection. Second, we propose the semantic fusion-based node representation module which is based on Chebyshev polynomial graph filtering and is able to effectively capture high-frequency and low-frequency components of graph signals. Third, to overcome semantic inconsistency in graph data, we devise an attention mechanism-based node representation module which can adaptively learns importance information of graph nodes, resulting in significant improvement of the model performance. Finally, experiments are carried out on five real-world anomaly detection datasets, and the results show that the proposed method outperforms the state-of-the-art methods.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107169"},"PeriodicalIF":6.0,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143025050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Novel session-based recommendation system using capsule graph neural network.
IF 6.0 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-01-18 DOI: 10.1016/j.neunet.2025.107176
Driss El Alaoui, Jamal Riffi, Abdelouahed Sabri, Badraddine Aghoutane, Ali Yahyaouy, Hamid Tairi

Session-based recommendation systems (SBRS) are essential for enhancing the customer experience, improving sales and loyalty, and enabling product discovery in dynamic, real-world scenarios without requiring user history. Despite their importance, traditional and even current SBRS algorithms face limitations, notably the inability to capture complex item transitions within each session and the disregard for general patterns that can be derived from multiple sessions. This paper proposes a novel SBRS model, called Capsule GraphSAGE for Session-Based Recommendation (CapsGSR), that marries GraphSAGE's scalability and inductive learning capabilities with the capsule network's levels of abstraction by generating multiple embeddings for each node from different perspectives. Consequently, CapsGSR addresses challenges that may hinder optimal item representations and captures the complex nature of transitions, mitigating the loss of crucial information. Our system significantly outperforms baseline models on benchmark datasets, with improvements of 8.44% in HR@20 and 4.66% in MRR@20, indicating its effectiveness in delivering precise and relevant recommendations.
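
One way to read "multiple embeddings for each node from different perspectives" is a GraphSAGE aggregation step followed by parallel projection heads, each squashed as in capsule networks. The sketch below follows that reading with a dense adjacency matrix; the actual coupling between GraphSAGE and the capsule layers in CapsGSR is not specified in the abstract.

```python
import torch
import torch.nn as nn

class MultiViewSAGE(nn.Module):
    """Sketch: mean-aggregate neighbors (GraphSAGE style), then project the
    concatenated self/neighbor feature through several capsule-like views."""
    def __init__(self, dim_in: int, dim_out: int, num_views: int = 4):
        super().__init__()
        self.views = nn.ModuleList(
            nn.Linear(2 * dim_in, dim_out) for _ in range(num_views))

    @staticmethod
    def squash(v: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
        # Capsule squash: keep direction, bound length to (0, 1).
        norm2 = (v ** 2).sum(-1, keepdim=True)
        return (norm2 / (1 + norm2)) * v / torch.sqrt(norm2 + eps)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        neigh = adj @ x / deg                    # mean of neighbor features
        h = torch.cat([x, neigh], dim=-1)        # SAGE-style concatenation
        return torch.stack([self.squash(f(h)) for f in self.views], dim=1)
```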

{"title":"A Novel session-based recommendation system using capsule graph neural network.","authors":"Driss El Alaoui, Jamal Riffi, Abdelouahed Sabri, Badraddine Aghoutane, Ali Yahyaouy, Hamid Tairi","doi":"10.1016/j.neunet.2025.107176","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107176","url":null,"abstract":"<p><p>Session-based recommendation systems (SBRS) are essential for enhancing the customer experience, improving sales and loyalty, and providing the possibility to discover products in dynamic and real-world scenarios without needing user history. Despite their importance, traditional or even current SBRS algorithms face limitations, notably the inability to capture complex item transitions within each session and the disregard for general patterns that can be derived from multiple sessions. This paper proposes a novel SBRS model, called Capsule GraphSAGE for Session-Based Recommendation (CapsGSR), that marries GraphSAGE's scalability and inductive learning capabilities with the Capsules network's abstraction levels by generating multiple integrations for each node from different perspectives. Consequently, CapsGSR addresses challenges that may hinder the optimal item representations and captures transitions' complex nature, mitigating the loss of crucial information. Our system significantly outperforms baseline models on benchmark datasets, with improvements of 8.44% in HR@20 and 4.66% in MRR@20 , indicating its effectiveness in delivering precise and relevant recommendations.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107176"},"PeriodicalIF":6.0,"publicationDate":"2025-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143025004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DCTCNet: Sequency discrete cosine transform convolution network for visual recognition.
IF 6.0 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-01-18 DOI: 10.1016/j.neunet.2025.107143
Jiayong Bao, Jiangshe Zhang, Chunxia Zhang, Lili Bao

The discrete cosine transform (DCT) has been widely used in computer vision tasks thanks to its high compression ratio and high-quality visual presentation. However, conventional DCT is affected by the size of the transform region and suffers from blocking effects. Eliminating these blocking effects so that the transform can serve vision tasks efficiently is therefore both significant and challenging. In this paper, we introduce the All Phase Sequency DCT (APSeDCT) into convolutional networks to extract multi-frequency information from deep features. Because APSeDCT is equivalent to a convolution, we construct a corresponding convolution module, APSeDCT Convolution (APSeDCTConv), whose transferability resembles that of vanilla convolution. We then propose an augmented convolutional operator called MultiConv that incorporates APSeDCTConv. By replacing the last three bottleneck blocks of ResNet with MultiConv, our approach not only reduces computational cost and the number of parameters but also performs strongly on classification, object detection, and instance segmentation tasks. Extensive experiments show that APSeDCTConv augmentation yields consistent improvements in image classification on ImageNet across various models and scales, including ResNet, Res2Net, and ResNeXt, and achieves 0.5%-1.1% and 0.4%-0.7% AP improvements over the baseline for object detection and instance segmentation, respectively, on the COCO benchmark.
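
The DCT-convolution equivalence comes from treating each row of the orthonormal DCT-II basis as a fixed filter: transforming a length-k patch is the same as correlating it with k cosine filters, and 2D filters follow as outer products of row pairs. A sketch of the basis construction (the all-phase sequency variant, which averages the transform over all phase shifts of the window, is the paper's contribution and is not reproduced here):

```python
import math
import torch

def dct_basis(k: int) -> torch.Tensor:
    """Orthonormal k x k DCT-II basis; rows can be loaded as frozen conv
    weights so each output channel responds to one sequency band."""
    n = torch.arange(k, dtype=torch.float32)
    basis = torch.cos(math.pi * (n[None, :] + 0.5) * n[:, None] / k)
    basis[0] *= 1.0 / math.sqrt(2.0)         # DC row normalization
    return basis * math.sqrt(2.0 / k)        # basis @ basis.T is ~identity
```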

{"title":"DCTCNet: Sequency discrete cosine transform convolution network for visual recognition.","authors":"Jiayong Bao, Jiangshe Zhang, Chunxia Zhang, Lili Bao","doi":"10.1016/j.neunet.2025.107143","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107143","url":null,"abstract":"<p><p>The discrete cosine transform (DCT) has been widely used in computer vision tasks due to its ability of high compression ratio and high-quality visual presentation. However, conventional DCT is usually affected by the size of transform region and results in blocking effect. Therefore, eliminating the blocking effects to efficiently serve for vision tasks is significant and challenging. In this paper, we introduce All Phase Sequency DCT (APSeDCT) into convolutional networks to extract multi-frequency information of deep features. Due to the fact that APSeDCT can be equivalent to convolutional operation, we construct corresponding convolution module called APSeDCT Convolution (APSeDCTConv) that has great transferability similar to vanilla convolution. Then we propose an augmented convolutional operator called MultiConv with APSeDCTConv. By replacing the last three bottleneck blocks of ResNet with MultiConv, our approach not only reduces the computational costs and the number of parameters, but also exhibits great performance in classification, object detection and instance segmentation tasks. Extensive experiments show that APSeDCTConv augmentation leads to consistent performance improvements in image classification on ImageNet across various different models and scales, including ResNet, Res2Net and ResNext, and achieving 0.5%-1.1% and 0.4%-0.7% AP performance improvements for object detection and instance segmentation, respectively, on the COCO benchmark compared to the baseline.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107143"},"PeriodicalIF":6.0,"publicationDate":"2025-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143030236","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Out-of-Distribution Detection via outlier exposure in federated learning.
IF 6.0 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-01-17 DOI: 10.1016/j.neunet.2025.107141
Gu-Bon Jeong, Dong-Wan Choi

Among the various out-of-distribution (OOD) detection methods for neural networks, outlier exposure (OE) using auxiliary data has been shown to achieve practical performance. However, existing OE methods are typically assumed to run in a centralized manner and are thus not feasible in a standard federated learning (FL) setting, where each client has low computing power and cannot collect a variety of auxiliary samples. To address this issue, we propose a practical yet realistic OE scenario in FL where only the central server holds a large amount of outlier data and each client is given a relatively small amount of in-distribution (ID) data. For this scenario, we introduce an effective OE-based OOD detection method, called internal separation & backstage collaboration, which makes the best use of the many auxiliary outlier samples without sacrificing the ultimate goals of FL: privacy preservation and collaborative training performance. The most challenging part is how to achieve, in our scenario, the same effect as joint centralized training with outliers and ID samples. Our main strategy (internal separation) is to jointly train the feature vectors of an internal layer with outliers in the back layers of the global model, while ensuring privacy preservation. We also suggest a collaborative approach (backstage collaboration) in which multiple back layers are trained together to detect OOD samples. Extensive experiments demonstrate that our method achieves remarkable detection performance compared to baseline approaches in the proposed OE scenario.
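
For orientation, the classic outlier-exposure objective adds a term that pushes the model's OOD logits toward the uniform distribution; in the proposed scenario this term would be computed server-side on the outlier pool. The sketch below shows that standard loss only; the paper's internal-separation and backstage-collaboration mechanics go beyond it.

```python
import torch
import torch.nn.functional as F

def outlier_exposure_loss(logits_id: torch.Tensor, labels_id: torch.Tensor,
                          logits_ood: torch.Tensor, lam: float = 0.5):
    """Cross-entropy on ID samples plus a uniformity term on OOD samples
    (mean cross-entropy between softmax(logits_ood) and the uniform dist)."""
    ce_id = F.cross_entropy(logits_id, labels_id)
    uniformity = -F.log_softmax(logits_ood, dim=-1).mean()
    return ce_id + lam * uniformity
```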

{"title":"Out-of-Distribution Detection via outlier exposure in federated learning.","authors":"Gu-Bon Jeong, Dong-Wan Choi","doi":"10.1016/j.neunet.2025.107141","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107141","url":null,"abstract":"<p><p>Among various out-of-distribution (OOD) detection methods in neural networks, outlier exposure (OE) using auxiliary data has shown to achieve practical performance. However, existing OE methods are typically assumed to run in a centralized manner, and thus are not feasible for a standard federated learning (FL) setting where each client has low computing power and cannot collect a variety of auxiliary samples. To address this issue, we propose a practical yet realistic OE scenario in FL where only the central server has a large amount of outlier data and a relatively small amount of in-distribution (ID) data is given to each client. For this scenario, we introduce an effective OE-based OOD detection method, called internal separation & backstage collaboration, which makes the best use of many auxiliary outlier samples without sacrificing the ultimate goal of FL, that is, privacy preservation as well as collaborative training performance. The most challenging part is how to make the same effect in our scenario as in joint centralized training with outliers and ID samples. Our main strategy (internal separation) is to jointly train the feature vectors of an internal layer with outliers in the back layers of the global model, while ensuring privacy preservation. We also suggest an collaborative approach (backstage collaboration) where multiple back layers are trained together to detect OOD samples. Our extensive experiments demonstrate that our method shows remarkable detection performance, compared to baseline approaches in the proposed OE scenario.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107141"},"PeriodicalIF":6.0,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143014902","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Contrastive independent subspace analysis network for multi-view spatial information extraction.
IF 6.0 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-01-17 DOI: 10.1016/j.neunet.2024.107105
Tengyu Zhang, Deyu Zeng, Wei Liu, Zongze Wu, Chris Ding, Xiaopin Zhong

Multi-view classification integrates features from different views to optimize classification performance. Most existing works utilize semantic information to achieve view fusion but neglect the spatial information of the data itself, which enriches data representations with correlation information and has proven essential. We therefore first propose a robust independent subspace analysis network, optimized with sparse and soft orthogonality constraints, to extract the latent spatial information of multi-view data using subspace bases. Building on this, we develop a novel contrastive independent subspace analysis framework for multi-view classification that further optimizes from the spatial perspective. Specifically, contrastive subspace optimization separates the subspaces, enhancing their representational capacity, while contrastive fusion optimization builds cross-view subspace correlations and forms a non-overlapping data representation. In k-fold validation experiments, MvCISA achieved state-of-the-art accuracies of 76.95%, 98.50%, 93.33%, and 88.24% on four benchmark multi-view datasets, outperforming the second-best method by 8.57%, 0.25%, 1.66%, and 5.96% in accuracy. Visualization experiments further demonstrate the effectiveness of the subspace and feature-space optimization and indicate promising potential for other downstream tasks. Our code is available at https://github.com/raRn0y/MvCISA.
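
The "soft orthogonality" constraint on subspace bases is presumably a penalty of the familiar form ||BᵀB − I||²_F, which nudges the columns of each basis toward orthonormality without imposing a hard constraint; a minimal sketch:

```python
import torch

def soft_orthogonality(bases: torch.Tensor) -> torch.Tensor:
    """||B^T B - I||_F^2 for a basis matrix B whose columns span a subspace;
    zero exactly when the columns are orthonormal."""
    gram = bases.T @ bases
    eye = torch.eye(gram.shape[0], device=bases.device, dtype=bases.dtype)
    return ((gram - eye) ** 2).sum()
```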

{"title":"Contrastive independent subspace analysis network for multi-view spatial information extraction.","authors":"Tengyu Zhang, Deyu Zeng, Wei Liu, Zongze Wu, Chris Ding, Xiaopin Zhong","doi":"10.1016/j.neunet.2024.107105","DOIUrl":"https://doi.org/10.1016/j.neunet.2024.107105","url":null,"abstract":"<p><p>Multi-view classification integrates features from different views to optimize classification performance. Most of the existing works typically utilize semantic information to achieve view fusion but neglect the spatial information of data itself, which accommodates data representation with correlation information and is proven to be an essential aspect. Thus robust independent subspace analysis network, optimized by sparse and soft orthogonal optimization, is first proposed to extract the latent spatial information of multi-view data with subspace bases. Building on this, a novel contrastive independent subspace analysis framework for multi-view classification is developed to further optimize from spatial perspective. Specifically, contrastive subspace optimization separates the subspaces, thereby enhancing their representational capacity. Whilst contrastive fusion optimization aims at building cross-view subspace correlations and forms a non overlapping data representation. In k-fold validation experiments, MvCISA achieved state-of-the-art accuracies of 76.95%, 98.50%, 93.33% and 88.24% on four benchmark multi-view datasets, significantly outperforming the second-best method by 8.57%, 0.25%, 1.66% and 5.96% in accuracy. And visualization experiments demonstrate the effectiveness of the subspace and feature space optimization, also indicating their promising potential for other downstream tasks. Our code is available at https://github.com/raRn0y/MvCISA.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107105"},"PeriodicalIF":6.0,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143025042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
DVPT: Dynamic Visual Prompt Tuning of large pre-trained models for medical image analysis.
IF 6.0 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-01-16 DOI: 10.1016/j.neunet.2025.107168
Along He, Yanlin Wu, Zhihong Wang, Tao Li, Huazhu Fu

Pre-training and fine-tuning have become popular because the rich representations embedded in large pre-trained models can be leveraged for downstream medical tasks. However, existing methods typically either fine-tune all parameters or only task-specific layers of pre-trained models, overlooking the variability of input medical images. As a result, these approaches may lack efficiency or effectiveness. In this study, our goal is to explore parameter-efficient fine-tuning (PEFT) for medical image analysis. To address this challenge, we introduce a novel method called Dynamic Visual Prompt Tuning (DVPT), which can extract knowledge beneficial to downstream tasks from large models with only a few trainable parameters. First, the frozen features are transformed by a lightweight bottleneck layer to learn the domain-specific distribution of downstream medical tasks. Then, a few learnable visual prompts are employed as dynamic queries that cross-attend to the transformed features, acquiring sample-specific features. This DVPT module can be shared across different Transformer layers, further reducing the number of trainable parameters. We conduct extensive experiments with various pre-trained models on medical classification and segmentation tasks and find that this PEFT method not only efficiently adapts pre-trained models to the medical domain but also enhances data efficiency with limited labeled data. For example, with only 0.5% additional trainable parameters, our method not only outperforms state-of-the-art PEFT methods but also surpasses full fine-tuning by more than 2.20% in Kappa score on the medical classification task, while saving up to 60% of the labeled data and 99% of the storage cost of ViT-B/16.
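
The abstract does describe the mechanism: frozen features pass through a lightweight bottleneck, and a few learnable prompts act as queries that cross-attend to them. The sketch below follows that description; the dimensions, head count, and bottleneck ratio are placeholders.

```python
import torch
import torch.nn as nn

class DynamicPrompt(nn.Module):
    """Sketch of a DVPT-style module: learnable prompts query bottlenecked
    frozen tokens via cross-attention to produce sample-specific features."""
    def __init__(self, dim: int, num_prompts: int = 8, heads: int = 4):
        super().__init__()
        self.prompts = nn.Parameter(torch.randn(num_prompts, dim) * 0.02)
        self.bottleneck = nn.Sequential(nn.Linear(dim, dim // 4), nn.GELU(),
                                        nn.Linear(dim // 4, dim))
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, frozen_tokens: torch.Tensor) -> torch.Tensor:
        # frozen_tokens: (B, N, dim) from the frozen backbone.
        kv = self.bottleneck(frozen_tokens)           # domain adaptation
        q = self.prompts.expand(frozen_tokens.shape[0], -1, -1)
        out, _ = self.attn(q, kv, kv)                 # prompts as queries
        return out                                    # (B, num_prompts, dim)
```

Because only the prompts, the bottleneck, and the attention weights are trainable, the parameter count stays small even when the module is shared across Transformer layers.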

{"title":"DVPT: Dynamic Visual Prompt Tuning of large pre-trained models for medical image analysis.","authors":"Along He, Yanlin Wu, Zhihong Wang, Tao Li, Huazhu Fu","doi":"10.1016/j.neunet.2025.107168","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107168","url":null,"abstract":"<p><p>Pre-training and fine-tuning have become popular due to the rich representations embedded in large pre-trained models, which can be leveraged for downstream medical tasks. However, existing methods typically either fine-tune all parameters or only task-specific layers of pre-trained models, overlooking the variability in input medical images. As a result, these approaches may lack efficiency or effectiveness. In this study, our goal is to explore parameter-efficient fine-tuning (PEFT) for medical image analysis. To address this challenge, we introduce a novel method called Dynamic Visual Prompt Tuning (DVPT). It can extract knowledge beneficial to downstream tasks from large models with only a few trainable parameters. First, the frozen features are transformed by a lightweight bottleneck layer to learn the domain-specific distribution of downstream medical tasks. Then, a few learnable visual prompts are employed as dynamic queries to conduct cross-attention with the transformed features, aiming to acquire sample-specific features. This DVPT module can be shared across different Transformer layers, further reducing the number of trainable parameters. We conduct extensive experiments with various pre-trained models on medical classification and segmentation tasks. We find that this PEFT method not only efficiently adapts pre-trained models to the medical domain but also enhances data efficiency with limited labeled data. For example, with only 0.5% additional trainable parameters, our method not only outperforms state-of-the-art PEFT methods but also surpasses full fine-tuning by more than 2.20% in Kappa score on the medical classification task. It can save up to 60% of labeled data and 99% of storage cost of ViT-B/16.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107168"},"PeriodicalIF":6.0,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143014937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
PFENet: Towards precise feature extraction from sparse point cloud for 3D object detection.
IF 6.0 CAS Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2025-01-16 DOI: 10.1016/j.neunet.2025.107144
Yaochen Li, Qiao Li, Cong Gao, Shengjing Gao, Hao Wu, Rui Liu

Accurate 3D point cloud object detection is crucial for autonomous driving. The sparsity of point clouds in 3D scenes, especially for smaller targets such as pedestrians and bicycles that contain fewer points, makes detection particularly challenging. To solve this problem, we propose a single-stage voxel-based 3D object detection method, PFENet. First, we design a robust voxel feature encoding network that incorporates a stacked triple attention mechanism to enhance the extraction of key features and suppress noise. Moreover, a 3D sparse convolution layer dynamically adjusts feature processing based on the importance of output locations, improving small-object recognition. Additionally, an attentional feature fusion module in the region proposal network merges low-level spatial features with high-level semantic features and broadens the receptive field through atrous spatial pyramid pooling to capture multi-scale features. Finally, we develop multiple detection heads for more refined feature extraction and object classification, as well as more accurate bounding-box regression. Experimental results on the KITTI dataset demonstrate the effectiveness of the proposed method.
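
One plausible ingredient of an attention-based voxel feature encoder is point-wise attention inside each voxel: score every point, then pool with those weights instead of a plain max so that noisy points are down-weighted. The sketch below illustrates that pattern with hypothetical shapes and assumes each voxel holds at least one valid point; the paper's stacked triple attention is not detailed in the abstract.

```python
import torch
import torch.nn as nn

class AttentiveVFE(nn.Module):
    """Sketch: attention-weighted voxel feature encoding. points has shape
    (V, P, dim_in) with P padded points per voxel; mask marks valid points."""
    def __init__(self, dim_in: int, dim_out: int):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim_in, dim_out), nn.ReLU())
        self.score = nn.Linear(dim_out, 1)

    def forward(self, points: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        h = self.mlp(points)                                   # (V, P, dim_out)
        attn = self.score(h).squeeze(-1)                       # (V, P) scores
        attn = attn.masked_fill(~mask, float('-inf'))          # ignore padding
        w = torch.softmax(attn, dim=-1).unsqueeze(-1)          # point weights
        return (w * h).sum(dim=1)                              # (V, dim_out)
```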

{"title":"PFENet: Towards precise feature extraction from sparse point cloud for 3D object detection.","authors":"Yaochen Li, Qiao Li, Cong Gao, Shengjing Gao, Hao Wu, Rui Liu","doi":"10.1016/j.neunet.2025.107144","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.107144","url":null,"abstract":"<p><p>Accurate 3D point cloud object detection is crucially important for autonomous driving vehicles. The sparsity of point clouds in 3D scenes, especially for smaller targets like pedestrians and bicycles that contain fewer points, makes detection particularly challenging. To solve this problem, we propose a single-stage voxel-based 3D object detection method, namely PFENet. Firstly, we design a robust voxel feature encoding network that incorporates a stacked triple attention mechanism to enhance the extraction of key features and suppress noise. Moreover, a 3D sparse convolution layer dynamically adjusts feature processing based on output location importance, improving small object recognition. Additionally, the attentional feature fusion module in the region proposal network merges low-level spatial features with high-level semantic features, and broadens the receptive field through atrous spatial pyramid pooling to capture multi-scale features. Finally, we develop multiple detection heads for more refined feature extraction and object classification, as well as more accurate bounding box regression. Experimental results on the KITTI dataset demonstrate the effectiveness of the proposed method.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"185 ","pages":"107144"},"PeriodicalIF":6.0,"publicationDate":"2025-01-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143014914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0