An efficient parts-based near-duplicate and sub-image retrieval system

Yan Ke, R. Sukthankar, Larry Huston
MULTIMEDIA '04, published 2004-10-10. DOI: 10.1145/1027527.1027729 (https://doi.org/10.1145/1027527.1027729)
Citations: 425

Abstract

We introduce a system for near-duplicate detection and sub-image retrieval. Such a system is useful for finding copyright violations and detecting forged images. We define near-duplicate as images altered with common transformations such as changing contrast, saturation, scaling, cropping, framing, etc. Our system builds a parts-based representation of images using distinctive local descriptors which give high quality matches even under severe transformations. To cope with the large number of features extracted from the images, we employ locality-sensitive hashing to index the local descriptors. This allows us to make approximate similarity queries that only examine a small fraction of the database. Although locality-sensitive hashing has excellent theoretical performance properties, a standard implementation would still be unacceptably slow for this application. We show that, by optimizing layout and access to the index data on disk, we can efficiently query indices containing millions of keypoints. Our system achieves near-perfect accuracy (100% precision at 99.85% recall) on the tests presented in Meng et al. [16], and consistently strong results on our own, significantly more challenging experiments. Query times are interactive even for collections of thousands of images.
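The abstract's central idea, indexing many local descriptors per image with locality-sensitive hashing and retrieving images by the descriptors they share with a query, can be illustrated with a small sketch. This is not the authors' implementation: it assumes random-hyperplane hashing as a stand-in for whichever LSH scheme the paper uses, keeps everything in memory rather than on disk, and uses synthetic 16-dimensional vectors in place of real local descriptors. The names `DescriptorIndex`, `hash_key`, `random_descriptors` and the constants `DIM`, `N_TABLES`, `N_BITS` are hypothetical.

```python
import random
from collections import Counter, defaultdict

DIM = 16       # toy descriptor length (assumption, not from the paper)
N_TABLES = 8   # number of independent hash tables (assumption)
N_BITS = 16    # bits per hash key (assumption)


def hash_key(planes, vec):
    """One bit per hyperplane: which side of the plane the vector lies on."""
    return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes)


class DescriptorIndex:
    """In-memory LSH index mapping descriptor hash keys to image ids."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # For each table, draw N_BITS random hyperplanes through the origin.
        self.planes = [[[rng.gauss(0.0, 1.0) for _ in range(DIM)]
                        for _ in range(N_BITS)]
                       for _ in range(N_TABLES)]
        self.tables = [defaultdict(list) for _ in range(N_TABLES)]

    def add(self, image_id, descriptors):
        """Insert every descriptor of an image into every hash table."""
        for vec in descriptors:
            for planes, table in zip(self.planes, self.tables):
                table[hash_key(planes, vec)].append(image_id)

    def query(self, descriptors):
        """Count, per image, how many query descriptors collide with it."""
        votes = Counter()
        for vec in descriptors:
            hits = set()
            for planes, table in zip(self.planes, self.tables):
                # Only the matching bucket is read, not the whole table.
                hits.update(table.get(hash_key(planes, vec), ()))
            votes.update(hits)
        return votes


def random_descriptors(rng, count):
    """Synthetic stand-in for descriptors extracted from one image."""
    return [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(count)]


if __name__ == "__main__":
    rng = random.Random(1)
    index = DescriptorIndex()
    images = {name: random_descriptors(rng, 30) for name in ("a", "b", "c")}
    for name, descs in images.items():
        index.add(name, descs)
    # A "sub-image" query: half of image b's descriptors, slightly perturbed.
    query = [[x + rng.gauss(0.0, 0.01) for x in d] for d in images["b"][:15]]
    print(index.query(query).most_common(3))
```

Because each query descriptor touches only one bucket per table, the work per query grows with the number of query descriptors rather than with the size of the database, which is the property the abstract relies on. A real system would also need a verification step on the candidate matches and, as the abstract stresses, a disk layout tuned for these bucket lookups; neither is shown here.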