Object retrieval with large vocabularies and fast spatial matching

James Philbin, Ondřej Chum, M. Isard, Josef Sivic, Andrew Zisserman
2007 IEEE Conference on Computer Vision and Pattern Recognition, published 2007-06-17
DOI: 10.1109/CVPR.2007.383172 (https://doi.org/10.1109/CVPR.2007.383172)
Citations: 3111

Abstract

In this paper, we present a large-scale object retrieval system. The user supplies a query object by selecting a region of a query image, and the system returns a ranked list of images that contain the same object, retrieved from a large corpus. We demonstrate the scalability and performance of our system on a dataset of over 1 million images crawled from the photo-sharing site Flickr [3], using Oxford landmarks as queries. Building an image-feature vocabulary is a major time and performance bottleneck, due to the size of our dataset. To address this problem, we compare different scalable methods for building a vocabulary and introduce a novel quantization method based on randomized trees, which we show outperforms the current state of the art on an extensive ground truth. Our experiments show that the quantization has a major effect on retrieval quality. To further improve query performance, we add an efficient spatial verification stage to re-rank the results returned from our bag-of-words model and show that this consistently improves search quality, though by a smaller margin when the visual vocabulary is large. We view this work as a promising step towards much larger, "web-scale" image corpora.
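
The abstract does not spell out how the bag-of-words model ranks images. As a rough sketch only, assuming tf-idf weighting over an inverted index (a common choice for bag-of-words retrieval, not confirmed by the text above), the first-stage ranking might look like the following. The functions `build_index` and `rank`, and the toy corpus of visual-word ids, are hypothetical; descriptors are assumed to be already quantized, and neither the randomized-tree quantization nor the spatial verification stage is modelled here.

```python
import math
from collections import Counter, defaultdict


def build_index(corpus):
    """Build an inverted index from {image_id: [visual-word ids]}.

    Assumes every descriptor has already been quantized to a word id.
    """
    inverted = defaultdict(dict)  # word id -> {image_id: term count}
    for image_id, words in corpus.items():
        for word, count in Counter(words).items():
            inverted[word][image_id] = count

    n_images = len(corpus)
    # Inverse document frequency: words seen in few images weigh more.
    idf = {w: math.log(n_images / len(p)) for w, p in inverted.items()}

    # L2 norm of each image's tf-idf vector, used to normalize scores.
    squared = defaultdict(float)
    for word, postings in inverted.items():
        for image_id, count in postings.items():
            squared[image_id] += (count * idf[word]) ** 2
    norms = {image_id: math.sqrt(v) for image_id, v in squared.items()}
    return inverted, idf, norms


def rank(query_words, inverted, idf, norms):
    """Return image ids sharing at least one informative word, best first."""
    scores = defaultdict(float)
    for word, q_count in Counter(query_words).items():
        weight = idf.get(word, 0.0)
        if weight == 0.0:
            continue  # unseen word, or a word present in every image
        for image_id, count in inverted[word].items():
            scores[image_id] += (q_count * weight) * (count * weight)
    # The query norm is the same for every image, so it is left out.
    return sorted(scores, key=lambda i: scores[i] / norms[i], reverse=True)


# Toy corpus: three images described by hypothetical visual-word ids.
corpus = {"a": [1, 2, 2, 3], "b": [3, 4, 5], "c": [6, 7]}
inverted, idf, norms = build_index(corpus)
result = rank([2, 3], inverted, idf, norms)
print(result)
```

In a system like the one described, a short list produced this way would then be passed to the spatial verification stage for re-ranking.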