Cross-Layer Feature based Multi-Granularity Visual Classification

Junhan Chen, Dongliang Chang, Jiyang Xie, Ruoyi Du, Zhanyu Ma
Published in: 2022 IEEE International Conference on Visual Communications and Image Processing (VCIP), 2022-12-13
DOI: 10.1109/VCIP56404.2022.10008879 (https://doi.org/10.1109/VCIP56404.2022.10008879)
Citations: 1

Abstract

In contrast to traditional fine-grained visual classification, multi-granularity visual classification is no longer limited to identifying the different sub-classes belonging to the same super-class (e.g., bird species, cars, and aircraft models). Instead, it gives a sequence of labels from coarse to fine (e.g., Passeriformes → Corvidae → Fish Crow), which is more convenient in practice. The key to solving this task is how to use the relationships between the different levels of labels to learn feature representations that contain different levels of granularity. Interestingly, the feature pyramid structure naturally implies different granularities of feature representation, with the shallow layers representing coarse-grained features and the deep layers representing fine-grained features. Therefore, in this paper, we exploit this property of the feature pyramid structure to decouple features and obtain feature representations corresponding to different granularities. Specifically, we use shallow features for coarse-grained classification and deep features for fine-grained classification. In addition, to enable fine-grained features to enhance the coarse-grained classification, we propose a feature reinforcement module based on the feature pyramid structure, where deep features are first upsampled and then combined with shallow features to make decisions. Experimental results on three widely used fine-grained image classification datasets (CUB-200-2011, Stanford Cars, and FGVC-Aircraft) validate the method's effectiveness. Code is available at https://github.com/PRIS-CV/CGVC.
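
The reinforcement step described in the abstract (upsample the deep features, combine them with the shallow features, then decide) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration and is not taken from the paper or its repository: the function names, the nearest-neighbour upsampling, the element-wise addition used as the "combine" step, the equal channel counts, and the toy weights and feature values.

```python
# Hypothetical sketch of "upsample deep features, combine with shallow ones".
# Feature maps are lists of channels; each channel is a list of rows.

def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of one 2-D channel by an integer factor."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def reinforce(shallow, deep):
    """Upsample each deep channel to the shallow resolution and add them.
    Assumes both inputs have the same number of channels."""
    factor = len(shallow[0]) // len(deep[0])
    return [
        [[s + d for s, d in zip(s_row, d_row)]
         for s_row, d_row in zip(s_map, upsample_nearest(d_map, factor))]
        for s_map, d_map in zip(shallow, deep)
    ]

def classify(channels, weights):
    """Global average pooling per channel, then a linear layer and argmax."""
    pooled = [sum(v for row in c for v in row) / (len(c) * len(c[0]))
              for c in channels]
    scores = [sum(w * x for w, x in zip(w_row, pooled)) for w_row in weights]
    return scores.index(max(scores))

# Toy inputs: 2 channels, shallow maps are 4x4, deep maps are 2x2.
shallow = [[[1.0] * 4 for _ in range(4)], [[0.0] * 4 for _ in range(4)]]
deep = [[[0.0, 4.0], [0.0, 0.0]], [[3.0, 3.0], [3.0, 3.0]]]

coarse_weights = [[1.0, 0.0], [0.0, 1.0]]            # 2 coarse classes
fine_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 fine classes

fused = reinforce(shallow, deep)
coarse_label = classify(fused, coarse_weights)  # coarse head sees fused features
fine_label = classify(deep, fine_weights)       # fine head sees deep features
print(coarse_label, fine_label)
```

In this toy setup the coarse head reads the fused (shallow plus upsampled deep) features while the fine head reads the deep features directly, mirroring the division of labour the abstract describes. A real model would operate on learned convolutional feature maps and would typically need a projection to match channel counts before fusing.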