Weakly Supervised Image Retrieval via Coarse-scale Feature Fusion and Multi-level Attention Blocks

Proceedings of the 2019 on International Conference on Multimedia Retrieval
Pub Date: 2019-06-05
DOI: 10.1145/3323873.3325017

Xinyao Nie, Hong Lu, Zijian Wang, Jingyuan Liu, Zehua Guo


Citations: 5

Abstract

In this paper, we propose an end-to-end Attention-Block network for image retrieval (ABIR), which greatly increases retrieval accuracy without human annotations such as bounding boxes. Specifically, our network uses coarse-scale feature fusion, which generates attentive local features by combining information from different intermediate layers. Detailed feature information is extracted by applying two attention blocks. Extensive experiments show that our method outperforms the state of the art by a significant margin on four public datasets for image retrieval tasks.
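The abstract gives no architectural details, so the following is only a rough illustration of the two ideas it names: fusing feature maps from different intermediate layers, and using attention to weight local features. It is a minimal NumPy sketch under stated assumptions, not the ABIR network itself. The function names, the nearest-neighbour upsampling, the channel concatenation, the single softmax attention step, and the random stand-in weights are all hypothetical choices made for the example.

```python
import numpy as np


def upsample_nearest(fmap: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)


def fuse_layers(fine: np.ndarray, coarse: np.ndarray) -> np.ndarray:
    """Concatenate a fine map with an upsampled coarse map along channels.

    Assumes the fine resolution is an integer multiple of the coarse one,
    with the same factor for height and width.
    """
    factor = fine.shape[1] // coarse.shape[1]
    return np.concatenate([fine, upsample_nearest(coarse, factor)], axis=0)


def attention_pool(fmap: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Softmax spatial attention followed by weighted sum pooling."""
    c, h, wd = fmap.shape
    local = fmap.reshape(c, h * wd)       # one column per spatial location
    scores = w @ local                    # one relevance score per location
    scores = scores - scores.max()        # stabilise the softmax
    attn = np.exp(scores) / np.exp(scores).sum()
    pooled = local @ attn                 # attention-weighted descriptor, shape (C,)
    return pooled / (np.linalg.norm(pooled) + 1e-12)


rng = np.random.default_rng(0)
fine = rng.standard_normal((8, 4, 4))     # hypothetical mid-level layer output
coarse = rng.standard_normal((16, 2, 2))  # hypothetical deeper, coarser layer output
fused = fuse_layers(fine, coarse)         # shape (24, 4, 4)
w = rng.standard_normal(fused.shape[0])   # stand-in for learned attention weights
descriptor = attention_pool(fused, w)     # unit-norm vector of length 24
```

Because the descriptor is L2-normalised, two images could be compared with a plain dot product, which equals their cosine similarity. In a trained network the attention weights would be learned rather than sampled at random.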

