Object retrieval with large vocabularies and fast spatial matching

2007 IEEE Conference on Computer Vision and Pattern Recognition. Pub Date: 2007-06-17. DOI: 10.1109/CVPR.2007.383172

James Philbin, Ondřej Chum, M. Isard, Josef Sivic, Andrew Zisserman


Citations: 3111

Abstract

In this paper, we present a large-scale object retrieval system. The user supplies a query object by selecting a region of a query image, and the system returns a ranked list of images that contain the same object, retrieved from a large corpus. We demonstrate the scalability and performance of our system on a dataset of over 1 million images crawled from the photo-sharing site Flickr [3], using Oxford landmarks as queries. Building an image-feature vocabulary is a major time and performance bottleneck, due to the size of our dataset. To address this problem, we compare different scalable methods for building a vocabulary and introduce a novel quantization method based on randomized trees, which we show outperforms the current state of the art on an extensive ground truth. Our experiments show that the quantization has a major effect on retrieval quality. To further improve query performance, we add an efficient spatial verification stage to re-rank the results returned from our bag-of-words model and show that this consistently improves search quality, though by a smaller margin when the visual vocabulary is large. We view this work as a promising step towards much larger, "web-scale" image corpora.
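The bag-of-words retrieval step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the brute-force nearest-word search stands in for the scalable quantizers the paper compares, the tf-idf weighting with cosine similarity is an assumed (though common) scoring choice, and the function names, the 2-D descriptors, and the three-word vocabulary are all hypothetical toy values. Spatial verification is omitted.

```python
import math
from collections import Counter


def quantize(descriptor, vocabulary):
    """Return the index of the nearest visual word (brute force, for illustration only)."""
    return min(
        range(len(vocabulary)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(descriptor, vocabulary[i])),
    )


def bag_of_words(descriptors, vocabulary):
    """Count how often each visual word occurs in one image."""
    return Counter(quantize(d, vocabulary) for d in descriptors)


def rank(query_bow, corpus_bows):
    """Rank corpus images by cosine similarity of tf-idf weighted histograms (assumed scoring)."""
    n = len(corpus_bows)
    # Document frequency: number of corpus images containing each word.
    df = Counter(w for bow in corpus_bows for w in bow)
    idf = {w: math.log(n / df[w]) for w in df}

    def weigh(bow):
        vec = {w: c * idf.get(w, 0.0) for w, c in bow.items()}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        return {w: v / norm for w, v in vec.items()} if norm else vec

    q = weigh(query_bow)
    scores = []
    for i, bow in enumerate(corpus_bows):
        d = weigh(bow)
        scores.append((sum(q[w] * d.get(w, 0.0) for w in q), i))
    # Highest score first; ties broken by image index.
    return [i for _, i in sorted(scores, key=lambda t: (-t[0], t[1]))]


# Hypothetical toy data: three visual words and three corpus images.
vocabulary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
corpus = [
    bag_of_words([(0.1, 0.2), (9.8, 0.1)], vocabulary),   # words 0 and 1
    bag_of_words([(0.2, 9.9), (0.1, 10.2)], vocabulary),  # word 2 twice
    bag_of_words([(9.9, 0.3), (0.3, 9.7)], vocabulary),   # words 1 and 2
]
query = bag_of_words([(10.1, -0.2), (0.0, 0.1)], vocabulary)  # words 1 and 0

ranking = rank(query, corpus)
print(ranking)  # [0, 2, 1]
```

In this toy setup, image 0 shares both visual words with the query and ranks first, image 2 shares one, and image 1 shares none. A real system would replace the linear scan in `quantize` with an approximate search over a much larger vocabulary and would score through an inverted index rather than looping over every image.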
