FSNet: A dual-domain network for few-shot image classification

IF 3.7 2区 工程技术 Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE Displays Pub Date : 2024-07-14 DOI:10.1016/j.displa.2024.102795
Xuewen Yan, Zhangjin Huang
{"title":"FSNet: A dual-domain network for few-shot image classification","authors":"Xuewen Yan,&nbsp;Zhangjin Huang","doi":"10.1016/j.displa.2024.102795","DOIUrl":null,"url":null,"abstract":"<div><p>Few-shot learning is a challenging task, that aims to learn and identify novel classes from a limited number of unseen labeled samples. Previous work has focused primarily on extracting features solely in the spatial domain of images. However, the compressed representation in the frequency domain which contains rich pattern information is a powerful tool in the field of signal processing. Combining the frequency and spatial domains to obtain richer information can effectively alleviate the overfitting problem. In this paper, we propose a dual-domain combined model called Frequency Space Net (FSNet), which preprocesses input images simultaneously in both the spatial and frequency domains, extracts spatial and frequency information through two feature extractors, and fuses them to a composite feature for image classification tasks. We start from a different view of frequency analysis, linking conventional average pooling to Discrete Cosine Transformation (DCT). We generalize the compression of the attention mechanism in the frequency domain. Consequently, we propose a novel Frequency Channel Spatial (FCS) attention mechanism. Extensive experiments demonstrate that frequency and spatial information are complementary in few-shot image classification, improving the performance of the model. Our method outperforms state-of-the-art approaches on miniImageNet and CUB.</p></div>","PeriodicalId":50570,"journal":{"name":"Displays","volume":"84 ","pages":"Article 102795"},"PeriodicalIF":3.7000,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Displays","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0141938224001598","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

Abstract

Few-shot learning is a challenging task, that aims to learn and identify novel classes from a limited number of unseen labeled samples. Previous work has focused primarily on extracting features solely in the spatial domain of images. However, the compressed representation in the frequency domain which contains rich pattern information is a powerful tool in the field of signal processing. Combining the frequency and spatial domains to obtain richer information can effectively alleviate the overfitting problem. In this paper, we propose a dual-domain combined model called Frequency Space Net (FSNet), which preprocesses input images simultaneously in both the spatial and frequency domains, extracts spatial and frequency information through two feature extractors, and fuses them to a composite feature for image classification tasks. We start from a different view of frequency analysis, linking conventional average pooling to Discrete Cosine Transformation (DCT). We generalize the compression of the attention mechanism in the frequency domain. Consequently, we propose a novel Frequency Channel Spatial (FCS) attention mechanism. Extensive experiments demonstrate that frequency and spatial information are complementary in few-shot image classification, improving the performance of the model. Our method outperforms state-of-the-art approaches on miniImageNet and CUB.

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
FSNet:用于少量图像分类的双域网络
少量学习是一项具有挑战性的任务,其目的是从数量有限的未见标注样本中学习和识别新的类别。以往的工作主要集中在仅提取图像空间域的特征。然而,频率域的压缩表示包含丰富的模式信息,是信号处理领域的有力工具。结合频域和空间域获取更丰富的信息可以有效缓解过拟合问题。本文提出了一种名为频率空间网(FSNet)的双域组合模型,它能同时在空间域和频率域对输入图像进行预处理,通过两个特征提取器提取空间和频率信息,并将它们融合为一个复合特征,用于图像分类任务。我们从频率分析的不同视角出发,将传统的平均集合与离散余弦变换(DCT)联系起来。我们在频域中对注意力机制的压缩进行了概括。因此,我们提出了一种新颖的频率通道空间(FCS)注意力机制。大量实验证明,频率和空间信息在少帧图像分类中是互补的,从而提高了模型的性能。我们的方法在 miniImageNet 和 CUB 上的表现优于最先进的方法。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Displays
Displays 工程技术-工程:电子与电气
CiteScore
4.60
自引率
25.60%
发文量
138
审稿时长
92 days
期刊介绍: Displays is the international journal covering the research and development of display technology, its effective presentation and perception of information, and applications and systems including display-human interface. Technical papers on practical developments in Displays technology provide an effective channel to promote greater understanding and cross-fertilization across the diverse disciplines of the Displays community. Original research papers solving ergonomics issues at the display-human interface advance effective presentation of information. Tutorial papers covering fundamentals intended for display technologies and human factor engineers new to the field will also occasionally featured.
期刊最新文献
A assessment method for ergonomic risk based on fennec fox optimization algorithm and generalized regression neural network An overview of bit-depth enhancement: Algorithm datasets and evaluation No-reference underwater image quality assessment based on Multi-Scale and mutual information analysis DHDP-SLAM: Dynamic Hierarchical Dirichlet Process based data association for semantic SLAM Fabrication and Reflow of Indium Bumps for Active-Matrix Micro-LED Display of 3175 PPI
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1