FreqCAM: Frequent Class Activation Map for Weakly Supervised Object Localization

Proceedings of the 2022 International Conference on Multimedia Retrieval. Pub Date: 2022-06-27. DOI: 10.1145/3512527.3531349

Runsheng Zhang


Citations: 2

Abstract

Class Activation Map (CAM) is a commonly used solution for weakly supervised tasks. However, most existing CAM-based methods share one crucial problem: they locate only small object parts rather than full object regions. In this paper, we find that the co-occurrence between the feature maps of different channels might provide more clues about object locations. We therefore propose a simple yet effective method, Frequent Class Activation Map (FreqCAM), which uses element-wise frequency information from the last convolutional layers as an attention filter to generate object regions. FreqCAM filters background noise and robustly obtains more accurate, fine-grained object localization information. Furthermore, because it is applied post hoc to a trained classification model, it can improve the performance of existing methods without modifying them. Experiments on the standard CUB-200-2011 dataset show that our method significantly improves localization performance over the existing state-of-the-art methods it is applied to, without any architectural changes or re-training.
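The abstract does not spell out how the element-wise frequency information is computed, so the following is only a rough sketch of the general idea and not the paper's implementation. It assumes that a location counts as "frequent" when a large enough fraction of channels respond strongly there, and it uses that fraction to mask a plain CAM. The function names and the `rel_threshold` and `min_freq` parameters are hypothetical choices made for illustration.

```python
import numpy as np


def class_activation_map(features, class_weights):
    """Plain CAM: weighted sum of the channel feature maps.

    features: array of shape (C, H, W) from the last convolutional layer.
    class_weights: array of shape (C,) for the target class.
    """
    return np.tensordot(class_weights, features, axes=1)


def frequency_map(features, rel_threshold=0.5):
    """Fraction of channels that are 'active' at each spatial location.

    Assumption (not from the paper): a channel is active at a location when
    its value exceeds rel_threshold times that channel's own maximum.
    """
    peaks = features.max(axis=(1, 2), keepdims=True)
    active = (features > rel_threshold * peaks) & (peaks > 0)
    return active.mean(axis=0)


def freq_filtered_cam(features, class_weights, rel_threshold=0.5, min_freq=0.5):
    """Keep CAM responses only where enough channels co-occur.

    Assumption (not from the paper): the attention filter is a hard mask
    obtained by thresholding the frequency map at min_freq.
    """
    cam = np.maximum(class_activation_map(features, class_weights), 0.0)
    mask = frequency_map(features, rel_threshold) >= min_freq
    return cam * mask


def bounding_box(heatmap):
    """Tightest (x_min, y_min, x_max, y_max) box around positive responses."""
    ys, xs = np.nonzero(heatmap > 0)
    if ys.size == 0:
        return None
    return (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))
```

Because the filter operates only on the feature maps and class weights of an already trained classifier, a sketch like this needs no re-training, which matches the post-hoc property claimed in the abstract. The thresholds would have to be tuned in practice.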

