Spatial Pyramid Attention Enhanced Visual Descriptors for Landmark Retrieval

中国图象图形学报 (JCR Q3, Computer Science) Pub Date: 2023-12-01 DOI: 10.18178/joig.11.4.359-366
Luepol Pipanmekaporn, Suwatchai Kamonsantiroj, Chiabwoot Ratanavilisagul, Sathit Prasomphan

Abstract

Landmark retrieval, which aims to find landmark images similar to a query photo in a massive image database, has received considerable attention for many years. Even so, retrieving landmarks quickly and accurately remains challenging. To address this, we present a deep learning model called the Spatial-Pyramid Attention network (SPA). SPA is an end-to-end convolutional network that incorporates a spatial-pyramid attention layer, which encodes the input image and uses the spatial pyramid structure to highlight regional features according to their relative spatial distinctiveness. An image descriptor is then generated by aggregating these regional features. In experiments on the benchmark datasets Oxford5k, Paris6k, and Landmark-100, SPA achieves a mean Average Precision (mAP) of 85.3% on Oxford5k, 89.6% on Paris6k, and 80.4% on Landmark-100, outperforming existing state-of-the-art deep image retrieval models.
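
The pipeline described in the abstract (pool regional features over a spatial pyramid, weight them by distinctiveness, aggregate them into one descriptor, then score retrieval by mAP) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not specify the attention layer, so the attention below is a hand-written stand-in that scores each region by its mean dissimilarity to the other regions, whereas the SPA layer is presumably learned end to end. The function names, the pyramid levels (1, 2, 3), the max-pooling choice, and the simplified average-precision routine (which ignores the "junk" image handling of the Oxford5k and Paris6k protocols) are all assumptions made for illustration.

```python
import numpy as np


def pyramid_regions(h, w, levels=(1, 2, 3)):
    """Yield (y0, y1, x0, x1) cells of an l x l grid for every pyramid level l."""
    for l in levels:
        ys = np.linspace(0, h, l + 1).astype(int)
        xs = np.linspace(0, w, l + 1).astype(int)
        for i in range(l):
            for j in range(l):
                yield ys[i], ys[i + 1], xs[j], xs[j + 1]


def l2_normalize(v, eps=1e-12):
    """Scale a vector to unit L2 norm (eps guards against division by zero)."""
    return v / (np.linalg.norm(v) + eps)


def spa_descriptor(fmap, levels=(1, 2, 3)):
    """Aggregate a (C, H, W) feature map into one C-dim descriptor.

    Assumes fmap holds non-negative activations from some CNN backbone and
    that H and W are at least max(levels), so that no region is empty.
    """
    _, h, w = fmap.shape
    # Max-pool every pyramid region into a unit-norm regional feature.
    regional = np.stack([
        l2_normalize(fmap[:, y0:y1, x0:x1].max(axis=(1, 2)))
        for y0, y1, x0, x1 in pyramid_regions(h, w, levels)
    ])  # shape (R, C)
    # Hypothetical attention: a region is "distinctive" when its mean cosine
    # similarity to the other regions is low.
    sim = regional @ regional.T
    r = len(regional)
    distinct = 1.0 - (sim.sum(axis=1) - np.diag(sim)) / max(r - 1, 1)
    # Softmax turns distinctiveness scores into weights that sum to 1.
    weights = np.exp(distinct - distinct.max())
    weights /= weights.sum()
    # Weighted sum of regional features, re-normalized to unit length.
    return l2_normalize((weights[:, None] * regional).sum(axis=0))


def average_precision(ranked_relevance):
    """AP for one query, given 0/1 relevance labels in ranked order."""
    rel = np.asarray(ranked_relevance, dtype=float)
    if rel.sum() == 0:
        return 0.0
    precision_at_k = np.cumsum(rel) / np.arange(1, len(rel) + 1)
    return float((precision_at_k * rel).sum() / rel.sum())


# Usage: a random array stands in for backbone activations.
rng = np.random.default_rng(0)
descriptor = spa_descriptor(rng.random((8, 7, 7)))
print(descriptor.shape)  # (8,)
```

With descriptors of this kind, database images would be ranked by their dot product with the query descriptor, and mAP is the mean of `average_precision` over all queries. The figures reported in the abstract come from the authors' trained network, not from a sketch like this one.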