
Neurocomputing: Latest Publications

BRAVE: A cascaded generative model with sample attention for robust few shot image classification
IF 5.5 | CAS Region 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-13 | DOI: 10.1016/j.neucom.2024.128585

Few-shot learning (FSL) confronts notable challenges due to the disparity between training and testing categories, leading to channel bias in neural networks and hindering accurate feature discernment. To address this, we introduce Biased-Reduction Attentive Network (BRAVE), an innovative model that incorporates a refined Vector Quantized Variational Autoencoder (VQ-VAE) backbone, enhanced with our Diverse Quantization (DQ) Module, for unbiased, fine-grained feature creation. Alongside, our Sample Attention (SA) Module is utilized for extracting discriminative features from these unbiased, fine-grained features. The DQ Module in BRAVE strategically integrates prior distribution regularization and stochastic masking with Gumbel sampling for balanced and diverse codebook engagement, while the SA Module leverages inter-sample dynamics for identifying critical features. This synergy effectively counters channel bias and boosts classification accuracy in FSL setups, surpassing current leading methods. Our approach represents a practical balance between preserving detailed features through the decoder and ensuring classification effectiveness, marking a significant advance in FSL. BRAVE’s implementation is accessible for community use and further exploration. Code and models available at https://github.com/ApocalypsezZ/BRAVE.
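The codebook sampling idea behind the DQ Module (Gumbel sampling for balanced, diverse codebook engagement) can be sketched in a few lines of numpy. The function name, shapes, and temperature below are illustrative assumptions, not the paper's implementation; see the linked repository for the real code.

```python
import numpy as np

def gumbel_softmax_sample(logits, tau=1.0, rng=None):
    """Relaxed one-hot sample over codebook entries: perturb the logits
    with Gumbel(0, 1) noise, then apply a temperature-scaled softmax."""
    rng = np.random.default_rng() if rng is None else rng
    gumbel = -np.log(-np.log(rng.uniform(1e-12, 1.0, size=logits.shape)))
    y = (logits + gumbel) / tau
    y = y - y.max(axis=-1, keepdims=True)  # stabilize the exponentials
    expy = np.exp(y)
    return expy / expy.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 8))         # 4 encoder vectors scored against 8 codes
soft_codes = gumbel_softmax_sample(logits, tau=0.5, rng=rng)
hard_codes = soft_codes.argmax(axis=-1)  # discrete code assignments
```

Annealing `tau` toward zero makes each row approach a one-hot pick, while the stochastic Gumbel perturbation keeps codebook usage diverse across samples.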

Citations: 0
Few-shot image classification using graph neural network with fine-grained feature descriptors
IF 5.5 | CAS Region 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-13 | DOI: 10.1016/j.neucom.2024.128448

Graph computation via Graph Neural Networks (GNNs) is emerging as a pivotal approach for addressing the challenges in image classification tasks. This paper introduces a novel strategy for image classification using minimal labeled data from the mini-ImageNet database. The primary contributions include the development of an innovative Fine-Grained Feature Descriptor (FGFD) module. Following this, the GNN is employed at a more granular level to enhance image classification efficiency. Additionally, ablation studies were conducted in conjunction with existing state-of-the-art systems for few-shot image classification. Comparative analyses were performed, and the simulation results demonstrate that the proposed method significantly improves classification accuracy over traditional few-shot image classification methods.

Citations: 0
Mask-guided BERT for few-shot text classification
IF 5.5 | CAS Region 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-13 | DOI: 10.1016/j.neucom.2024.128576

Transformer-based language models have achieved significant success in various domains. However, the data-intensive nature of the transformer architecture requires much labeled data, which is challenging in low-resource scenarios (i.e., few-shot learning (FSL)). The main challenge of FSL is the difficulty of training robust models on small amounts of samples, which frequently leads to overfitting. Here we present Mask-BERT, a simple and modular framework to help BERT-based architectures tackle FSL. The proposed approach fundamentally differs from existing FSL strategies such as prompt tuning and meta-learning. The core idea is to selectively apply masks on text inputs and filter out irrelevant information, which guides the model to focus on discriminative tokens that influence prediction results. In addition, to make the text representations from different categories more separable and the text representations from the same category more compact, we introduce a contrastive learning loss function. Experimental results on open-domain and medical-domain datasets demonstrate the effectiveness of Mask-BERT. Code and data are available at: github.com/WenxiongLiao/mask-bert
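The contrastive objective described here (same-category representations pulled together, different categories pushed apart) can be illustrated with a generic supervised contrastive loss in numpy. The embeddings and temperature are made up for the example, and this is not claimed to be Mask-BERT's exact loss function.

```python
import numpy as np

def supervised_contrastive_loss(emb, labels, tau=0.1):
    """For each anchor, positives are the other same-label samples; the
    per-positive loss is log-sum-exp over all non-self similarities minus
    the positive similarity, i.e. -log of the softmax mass on the positive."""
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # cosine space
    sim = emb @ emb.T / tau
    n = len(labels)
    not_self = ~np.eye(n, dtype=bool)
    losses = []
    for i in range(n):
        pos = (labels == labels[i]) & not_self[i]
        if not pos.any():
            continue  # anchor has no positive partner
        log_denom = np.log(np.exp(sim[i][not_self[i]]).sum())
        losses.append(np.mean(log_denom - sim[i][pos]))
    return float(np.mean(losses))

labels = np.array([0, 0, 1, 1])
tight = np.array([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [-0.9, -0.1]])  # classes separated
mixed = np.array([[1.0, 0.0], [-1.0, 0.0], [0.9, 0.1], [-0.9, -0.1]])  # classes interleaved
loss_tight = supervised_contrastive_loss(tight, labels)
loss_mixed = supervised_contrastive_loss(mixed, labels)
```

As expected, compact same-category clusters (`tight`) incur a much smaller loss than interleaved ones (`mixed`), which is exactly the separability the abstract aims for.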

Citations: 0
MFFGD: An adaptive Caputo fractional-order gradient algorithm for DNN
IF 5.5 | CAS Region 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-13 | DOI: 10.1016/j.neucom.2024.128606

As a primary optimization method for neural networks, the gradient descent algorithm has received significant attention in the recent development of deep neural networks. However, current gradient descent algorithms still suffer from drawbacks such as an excess of hyperparameters, getting stuck in local optima, and poor generalization. This paper introduces a novel Caputo fractional-order gradient descent (MFFGD) algorithm to address these limitations. It provides fractional-order gradient derivation and error analysis for the different activation functions and loss functions within the network, simplifying the computation of traditional fractional-order gradients. Additionally, by introducing a memory factor to record past gradient variations, MFFGD achieves adaptive adjustment capabilities. Comparative experiments were conducted on multiple datasets with different modalities, and the results, along with theoretical analysis, demonstrate the superiority of MFFGD over other optimizers.
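As an illustration of the general idea, the following sketch uses a common first-order truncation of the Caputo derivative in which the integer-order gradient is scaled by |w - w_prev|^(1-alpha) / Gamma(2-alpha). This shows the fractional-gradient family, not necessarily MFFGD's exact update rule or its memory factor.

```python
import math
import numpy as np

def caputo_fractional_gd(grad_fn, w0, alpha=0.9, lr=0.1, steps=100, eps=1e-8):
    """Gradient descent with a first-order Caputo truncation: the plain
    gradient is scaled by |w - w_prev|**(1 - alpha) / Gamma(2 - alpha).
    At alpha = 1 the scaling factor is 1 and plain GD is recovered."""
    w = np.asarray(w0, dtype=float)
    w_prev = w.copy()
    coef = 1.0 / math.gamma(2.0 - alpha)
    for _ in range(steps):
        frac = coef * (np.abs(w - w_prev) + eps) ** (1.0 - alpha)
        w_next = w - lr * frac * grad_fn(w)
        w_prev, w = w, w_next
    return w

# Minimize f(w) = ||w||^2, whose gradient is 2w.
w_star = caputo_fractional_gd(lambda w: 2.0 * w, [2.0, -3.0], alpha=0.9)
```

Because the scaling depends on how far the iterate just moved, the effective step size adapts over the trajectory, which is the intuition behind fractional-order variants of gradient descent.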

Citations: 0
Leveraging denoising diffusion probabilistic model to improve the multi-thickness CT segmentation
IF 5.5 | CAS Region 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-13 | DOI: 10.1016/j.neucom.2024.128573

Organs-at-risk (OARs) segmentation in computed tomography (CT) is a fundamental step in the radiotherapy workflow, and it has long been a time-consuming and labor-intensive task. Deep neural networks (DNNs) have gained significant popularity in the field of OAR segmentation tasks, achieving remarkable progress in clinical practice. Typically, OARs are distributed throughout different areas of the body and require varying thicknesses of CT scans for better diagnosis and segmentation in the clinic. Most DNN-based segmentation focuses on single-thickness CT scans, limiting its applicability to varying thicknesses due to a lack of diverse thickness-related feature learning. While pre-training with the denoising diffusion probabilistic model (DDPM) offers an effective solution for dense feature learning, current works are constrained in addressing feature diversity, as exemplified by scenarios such as multi-thickness CT. To address the above challenges, this paper introduces a novel pre-training approach called DiffMT. This approach leverages the DDPM to extract valuable features from multi-thickness CT images. By transferring the pre-trained DDPM to the downstream segmentation task for fine-tuning, the model gains proficiency in learning diverse multi-thickness CT features, leading to precise segmentation across varied thicknesses. We explore DiffMT’s feature learning capacity through experiments involving pre-trained models of varying sizes and different denoising thicknesses. Subsequently, thorough experiments comparing DDPM-based segmentation with other state-of-the-art (SOTA) CT segmentation methods, along with assessments on diverse OARs and modalities, empirically demonstrate that the proposed DiffMT method outperforms the control methods. The codes are available at https://github.com/ychengrong/DiffMT.
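The DDPM pre-training signal comes from the standard closed-form forward (noising) process, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, with the denoiser regressing eps. A toy numpy version follows; the 8x8 patch and the linear beta schedule are illustrative, not DiffMT specifics.

```python
import numpy as np

def ddpm_forward_noise(x0, t, betas, rng=None):
    """Closed-form DDPM forward process: jump straight from x0 to x_t.
    Returns the noised sample and the noise eps (the training target)."""
    rng = np.random.default_rng() if rng is None else rng
    alpha_bar = np.cumprod(1.0 - betas)[t]
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * eps
    return xt, eps

betas = np.linspace(1e-4, 0.02, 1000)                  # standard linear schedule
x0 = np.random.default_rng(0).standard_normal((8, 8))  # toy image patch
xt, eps = ddpm_forward_noise(x0, t=999, betas=betas, rng=np.random.default_rng(1))
```

At t = 999 almost all of x0 is destroyed (alpha_bar is tiny), so x_t is nearly pure noise; small t values leave most of the content intact, which is what lets a denoiser learn features at many corruption levels during pre-training.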

Citations: 0
No-reference quality evaluation of realistic hazy images via singular value decomposition
IF 5.5 | CAS Region 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-13 | DOI: 10.1016/j.neucom.2024.128574

Haze is an atmospheric image degradation that causes severe distortions to outdoor images, such as low contrast, color shift, and structure damage. Due to the unique physical characteristics of haze, the quality of hazy images is not accurately assessed by general-purpose image quality assessment (IQA) approaches. Therefore, several haze-aware IQA approaches have been proposed to provide more efficient dehazing quality evaluation. These approaches extract several haze-aware features that are either combined to form a single IQA metric or fed to a regression model that predicts the dehazing quality. However, these haze-relevant features are extracted using pixel intensity, in which luminance and structure information are inseparable, leading to less correlation between such features and the type of degradation they are supposed to represent. To address this issue, we propose a singular value decomposition (SVD) based IQA metric that can effectively separate the luminance component of an image from its structure. This separation offers the ability to accurately evaluate the degradation at two different levels, i.e., luminance and structure. The experimental results show that our proposed SVD-based dehazing quality evaluator (SDQE) outperforms the existing state-of-the-art non-reference IQA metrics in terms of accuracy and processing time.
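One common way to realize such a separation, sketched here as an assumption about the general approach rather than the paper's exact formulation: take the rank-1 component of the largest singular value as the luminance part and the residual as structure.

```python
import numpy as np

def svd_luminance_structure(block):
    """Split an image block into a rank-1 'luminance' component (the
    leading singular triple, dominated by mean intensity) and the
    remaining 'structure' component."""
    U, s, Vt = np.linalg.svd(block, full_matrices=False)
    luminance = s[0] * np.outer(U[:, 0], Vt[0, :])
    structure = block - luminance
    return luminance, structure

rng = np.random.default_rng(0)
block = rng.normal(size=(8, 8)) + 5.0  # bright patch: strong mean component
lum, struct = svd_luminance_structure(block)
```

Because the two parts sum exactly back to the original block, degradations can be measured on each part independently without losing information.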

Citations: 0
A comprehensive qualitative and quantitative survey on image dehazing based on deep neural networks
IF 5.5 | CAS Region 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-13 | DOI: 10.1016/j.neucom.2024.128582

Image dehazing has become a necessary area of research with the increasing popularity of, and demand for, computer vision systems. Image dehazing is a method to remove haze from an image to improve its visual quality. Dehazing techniques are widely employed in a variety of computer vision applications to enhance their overall performance. Many techniques have been proposed by researchers in recent years to eliminate the haze from an image. However, there is a lack of available literature that provides a summary of state-of-the-art deep learning-based image dehazing methods. In this study, we provide a detailed review of recently proposed image dehazing techniques based on deep neural networks such as CNN, GAN, RNN, RCNN, and Transformer. A concise review of significant applications of image dehazing, benchmark datasets, and various performance metrics is also presented. We compare the state-of-the-art methods quantitatively using performance evaluation metrics such as SSIM and PSNR. Finally, this study discusses the fundamental difficulties associated with image dehazing approaches that need to be further explored.
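Of the two metrics used for the quantitative comparison, PSNR is simple enough to compute directly (SSIM involves windowed local statistics and is typically taken from a library such as scikit-image). A minimal sketch:

```python
import numpy as np

def psnr(ref, dist, data_range=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    mse = np.mean((ref.astype(float) - dist.astype(float)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(data_range ** 2 / mse)

ref = np.zeros((4, 4))
dist = np.full((4, 4), 16.0)  # constant error of 16 -> MSE = 256
score = psnr(ref, dist)
```

Higher PSNR means the dehazed output is closer to the haze-free reference; the metric is reference-based, which is why no-reference approaches like the SDQE entry above are needed for real hazy scenes.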

Citations: 0
DMMGNet: A discrimination mapping and memory bank mean guidance-based network for high-performance few-shot industrial anomaly detection
IF 5.5 | CAS Region 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-09-12 | DOI: 10.1016/j.neucom.2024.128622

For deep learning-based industrial anomaly detection, it is still challenging to obtain adequate images for model training and to achieve a cold start for cross-product migration, which restricts practical application in real industrial production. Herein, an innovative few-shot anomaly detection network, DMMGNet, based on discrimination mapping and memory bank mean guidance strategies, is demonstrated; it is trained with a new two-branch data augmentation technique. By separating the features stored in the memory bank from the features used for training, the two-branch data augmentation method can significantly improve the robustness of few-shot model training and reduce the redundancy of the memory bank. In the elaborately designed discrimination mapping module, new negative samples are generated by adding dynamic Gaussian noise to normal samples along the channel dimension in feature space to solve the problem of sample imbalance. Meanwhile, the discrimination mapping module also helps to map the feature distribution of positive samples to the target domain more efficiently and reduce the deviation of the feature domain, conducive to a more precise separation of positive and negative samples. In addition, a novel mean guidance approach with an optimized loss function is developed to guide the positive sample feature mapping by specifying the local feature space center, forming a clear feature domain contour and enhancing the detection accuracy. The multiple experimental results validate that our DMMGNet outperforms the most advanced anomaly detection counterparts on image-level AUROC, showing an increase of 0.3–3 % on both the MVTec AD and MPDD benchmarks under several few-shot scenarios.
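The negative-sample step (Gaussian noise added to normal features along the channel dimension) can be sketched as follows. The (B, C, H, W) shapes and the noise scale are illustrative assumptions; the paper's module draws its noise dynamically during training.

```python
import numpy as np

def make_channel_noise_negatives(feats, sigma=0.5, rng=None):
    """Create pseudo-anomalous features from normal (B, C, H, W) feature
    maps: draw one Gaussian offset per (sample, channel) and broadcast it
    over the spatial positions of that channel."""
    rng = np.random.default_rng() if rng is None else rng
    b, c, _, _ = feats.shape
    noise = sigma * rng.standard_normal((b, c, 1, 1))
    return feats + noise

rng = np.random.default_rng(1)
feats = rng.standard_normal((2, 4, 8, 8))  # toy normal feature maps
negatives = make_channel_noise_negatives(feats, sigma=0.5, rng=rng)
```

Perturbing whole channels rather than individual pixels shifts the feature statistics the way an anomaly would, giving the detector balanced positive/negative training data without any real defect images.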

Citations: 0
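The discrimination mapping idea described in the abstract above — synthesizing negative samples by perturbing normal features with channel-wise ("dynamic") Gaussian noise to rebalance a one-class training set — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, noise-scale range, and feature shapes are assumptions of our own.

```python
import numpy as np

def make_pseudo_negatives(normal_feats: np.ndarray, sigma_lo: float = 0.1,
                          sigma_hi: float = 0.5, rng=None) -> np.ndarray:
    """Synthesize pseudo-anomalous features from normal ones by adding
    Gaussian noise whose scale is redrawn per channel on every call
    (the "dynamic" part of the noise).

    normal_feats: (N, C) array of normal feature vectors.
    Returns an (N, C) array of perturbed (negative) features.
    """
    rng = np.random.default_rng(rng)
    n, c = normal_feats.shape
    # One noise scale per channel, resampled each call -> dynamic noise.
    sigma = rng.uniform(sigma_lo, sigma_hi, size=(1, c))
    noise = rng.normal(0.0, 1.0, size=(n, c)) * sigma
    return normal_feats + noise

# Balance a one-class training set: every normal sample gets a negative twin.
normals = np.random.default_rng(0).normal(size=(8, 128)).astype(np.float32)
negatives = make_pseudo_negatives(normals, rng=0)
features = np.concatenate([normals, negatives])      # (16, 128)
labels = np.concatenate([np.zeros(8), np.ones(8)])   # 0 = normal, 1 = pseudo-anomaly
```

Because the perturbation happens in feature space rather than image space, it sidesteps the need to collect or render realistic defect images — the usual bottleneck in few-shot anomaly detection.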
Automatic delineation and prognostic assessment of head and neck tumor lesion in multi-modality positron emission tomography / computed tomography images based on deep learning: A survey
IF 5.5 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-09-10 DOI: 10.1016/j.neucom.2024.128531

Accurately segmenting and staging tumor lesions in cancer patients presents a significant challenge for radiologists, but it is essential for devising effective treatment plans, including radiation therapy, personalized medicine, and surgical options. The integration of artificial intelligence (AI), particularly deep learning (DL), has become a useful tool for radiologists, enhancing their ability to understand tumor biology and deliver personalized care to patients with head and neck (H&N) tumors. Segmenting H&N tumor lesions using Positron Emission Tomography/Computed Tomography (PET/CT) images has gained significant attention. However, the diverse shapes and sizes of tumors, along with indistinct boundaries between malignant and normal tissue, present significant challenges in effectively fusing PET and CT images. To overcome these challenges, various DL-based models have been developed for segmenting tumor lesions in PET/CT images. This article reviews multimodality (PET/CT) based H&N tumor lesion segmentation methods. We first discuss the strengths and limitations of PET/CT imaging and the importance of DL-based models in H&N tumor lesion segmentation. Second, we examine the current state-of-the-art DL models for H&N tumor segmentation, categorizing them into UNet, VNet, Vision Transformer, and miscellaneous models based on their architectures. Third, we explore the annotation and evaluation processes, addressing challenges in segmentation annotation and discussing the metrics used to assess model performance. Finally, we discuss several open challenges and suggest avenues for future research in H&N tumor lesion segmentation.

Citations: 0
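The survey above discusses the metrics used to assess segmentation models; the Dice similarity coefficient is the overlap metric most commonly reported for tumor-lesion segmentation. A minimal sketch (the function name and toy masks are illustrative, not taken from the survey):

```python
import numpy as np

def dice_coefficient(pred: np.ndarray, target: np.ndarray, eps: float = 1e-7) -> float:
    """Dice similarity coefficient between two binary masks.

    DSC = 2|P ∩ T| / (|P| + |T|); 1.0 means perfect overlap, 0.0 none.
    eps guards against division by zero when both masks are empty.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)

# Toy example: two 4x4 masks, each with 3 foreground pixels, overlapping in 2.
pred = np.zeros((4, 4), dtype=int); pred[0, :3] = 1
gt   = np.zeros((4, 4), dtype=int); gt[0, 1:4] = 1
score = dice_coefficient(pred, gt)   # 2*2 / (3+3) ≈ 0.667
```

Dice is preferred over plain pixel accuracy here because tumor voxels are a tiny fraction of a PET/CT volume, so accuracy is dominated by the background class.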
Maximum entropy intrinsic learning for spiking networks towards embodied neuromorphic vision
IF 5.5 CAS Tier 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date : 2024-09-06 DOI: 10.1016/j.neucom.2024.128535

Spiking neural networks (SNNs), as brain-inspired models, offer outstandingly low power consumption and the ability to mimic biological neuron mechanisms. Embodied vision is a promising field that needs the low-power advantage of SNNs. However, due to the limitations of existing training methods, SNNs struggle to achieve real-time operation, high generalization ability, and robustness when applied to embodied vision. In this paper, to prevent models from overfitting to noise and unknown environments in embodied neuromorphic visual intelligence, we present a new and efficient learning strategy designed to enhance the training performance of deep SNNs, called Spiking Maximum Entropy Intrinsic Learning (SMEIL). The learning algorithm essentially promotes perturbation of the underlying source distribution, which in turn enlarges the predictive uncertainty of the current model. This enhances the model's robustness and improves its ability to generalize during training. Superior performance is achieved across a variety of data sets, and different types of noise are added to the SMEIL algorithm to test its robustness. Experiments show that SMEIL consistently improves learning robustness under each noise disturbance and significantly cuts power consumption during training. Hence, it is a powerful method for advancing the direct training of deep SNNs, and it opens a new point of view for developing spike-based learning algorithms towards embodied neuromorphic intelligence.

Citations: 0
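The abstract above does not give SMEIL's exact formulation, but the general maximum-entropy idea it invokes — enlarging predictive uncertainty so the model does not overfit to noise — can be illustrated with a simple entropy-regularized loss: cross-entropy minus a bonus for high output entropy. All names and the lam value below are hypothetical, and this is not the paper's algorithm:

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def max_entropy_loss(logits, labels, lam=0.1):
    """Cross-entropy minus a predictive-entropy bonus.

    Rewarding output entropy alongside fitting the labels discourages
    overconfident predictions -- one simple way to enlarge predictive
    uncertainty in the spirit of maximum-entropy training.
    """
    probs = softmax(logits)
    n = len(labels)
    ce = -np.log(probs[np.arange(n), labels] + 1e-12).mean()
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1).mean()
    return ce - lam * entropy

logits = np.array([[4.0, 0.5, 0.2], [0.1, 3.0, 0.4]])
labels = np.array([0, 1])
loss_plain = max_entropy_loss(logits, labels, lam=0.0)   # pure cross-entropy
loss_reg   = max_entropy_loss(logits, labels, lam=0.1)   # entropy-regularized
```

Since output entropy is always non-negative, the regularized loss is never larger than the plain cross-entropy; during optimization the lam term pulls the solution away from overconfident, low-entropy predictions.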
Journal: Neurocomputing