PANet: Pluralistic Attention Network for Few-Shot Image Classification

Neural Processing Letters (IF 2.8; CAS Zone 4, Computer Science; JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) Pub Date: 2024-06-29 DOI: 10.1007/s11063-024-11638-5

Wenming Cao, Tianyuan Li, Qifan Liu, Zhiquan He


Citations: 0

Abstract

Traditional deep learning methods require a large amount of labeled data for model training, which is laborious and costly to obtain in the real world. Few-shot learning (FSL) addresses this challenge by aiming to recognize novel classes from only a small number of labeled samples. We focus on metric-based few-shot learning and improve both the feature extraction and the metric. We propose the Pluralistic Attention Network (PANet), a novel attention-oriented framework comprising a local encoded intra-attention (LEIA) module and a global encoded reciprocal attention (GERA) module. The LEIA module is designed to capture comprehensive local feature dependencies within each individual sample. The GERA module concentrates on the correlation between two samples and learns the discriminability of the representations obtained from LEIA. The two modules are complementary and ensure that feature information both within and between images is fully utilized. Furthermore, we design a dual-centralization (DC) cosine similarity to eliminate the disparity of data distribution across dimensions and to improve the accuracy of the metric between support and query samples. Our method is evaluated with extensive experiments, and the results demonstrate that, with the contribution of each component, our model achieves high performance on four widely used few-shot classification benchmarks: miniImageNet, tieredImageNet, CUB-200-2011 and CIFAR-FS.
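The abstract does not give the formula for the DC cosine similarity, so the following is only a rough sketch of how a metric-based classifier with a doubly centered cosine score could look. It assumes that "dual-centralization" means two centering steps before the cosine: subtracting the per-dimension mean over the episode, then subtracting each vector's own mean across dimensions. That reading, the use of class-mean prototypes, and the names `dc_cosine_similarity`, `class_prototypes` and `classify_queries` are all assumptions for illustration, not the authors' implementation. The LEIA and GERA modules are not modeled here; the inputs stand in for whatever embeddings they would produce.

```python
import numpy as np


def dc_cosine_similarity(query, prototypes, eps=1e-8):
    """Cosine similarity after two centering steps (assumed reading of 'DC').

    query:      (Q, D) array of query embeddings
    prototypes: (N, D) array with one prototype per class
    returns:    (Q, N) similarity matrix
    """
    # Step 1 (assumption): remove the per-dimension mean of the episode,
    # so a large offset in one dimension cannot dominate the score.
    dim_mean = np.concatenate([query, prototypes], axis=0).mean(axis=0, keepdims=True)
    q = query - dim_mean
    p = prototypes - dim_mean

    # Step 2 (assumption): remove each vector's own mean across dimensions.
    q = q - q.mean(axis=1, keepdims=True)
    p = p - p.mean(axis=1, keepdims=True)

    # Plain cosine similarity on the centered vectors.
    q = q / (np.linalg.norm(q, axis=1, keepdims=True) + eps)
    p = p / (np.linalg.norm(p, axis=1, keepdims=True) + eps)
    return q @ p.T


def class_prototypes(support, support_labels, n_way):
    """Average the support embeddings of each class (labels are 0..n_way-1)."""
    return np.stack([support[support_labels == c].mean(axis=0) for c in range(n_way)])


def classify_queries(support, support_labels, query, n_way):
    """Assign each query to the class whose prototype scores highest."""
    prototypes = class_prototypes(support, support_labels, n_way)
    return dc_cosine_similarity(query, prototypes).argmax(axis=1)
```

Under these assumptions, adding the same offset to every embedding in an episode leaves the scores unchanged, which is one way to read the abstract's goal of removing distribution disparity between dimensions.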




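Source journal

Neural Processing Letters (Engineering & Technology, Computer Science: Artificial Intelligence)

CiteScore: 4.90
Self-citation rate: 12.90%
Articles published: 392
Review time: 2.8 months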

About the journal: Neural Processing Letters is an international journal publishing research results and innovative ideas on all aspects of artificial neural networks. Coverage includes theoretical developments, biological models, new formal modes, learning, applications, software and hardware developments, and prospective research. The journal promotes fast exchange of information in the community of neural network researchers and users. The resurgence of interest in the field of artificial neural networks since the beginning of the 1980s is coupled to tremendous research activity in specialized or multidisciplinary groups. Research, however, is not possible without good communication between people and the exchange of information, especially in a field covering such different areas; fast communication is also a key aspect, and this is the reason for Neural Processing Letters