Similarity search without tears: the OMNI-family of all-purpose access methods

R. S. Filho, A. Traina, C. Traina, C. Faloutsos
Proceedings 17th International Conference on Data Engineering, 2001-04-02
DOI: 10.1109/ICDE.2001.914877 (https://doi.org/10.1109/ICDE.2001.914877)
Citations: 154

Abstract

Designing a new access method inside a commercial DBMS is cumbersome and expensive. We propose a family of metric access methods that are fast and easy to implement on top of existing access methods, such as sequential scan, R-trees and Slim-trees. The idea is to elect a set of objects as foci, and gauge all other objects with their distances from this set. We show how to define the foci set cardinality, how to choose appropriate foci, and how to perform range and nearest-neighbor queries using them, without false dismissals. The foci increase the pruning of distance calculations during query processing. Furthermore, we index the distances from each object to the foci to reduce even the triangle-inequality comparisons. Experiments on real and synthetic datasets show that our methods match or outperform existing methods. They are up to 10 times faster, and perform up to 10 times fewer distance calculations and disk accesses. In addition, they scale up well, exhibiting sub-linear performance with growing database size.
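
The foci idea in the abstract can be illustrated with a minimal sketch of a range query over a sequential scan. This is not the authors' implementation: the function names (`omni_coordinates`, `range_query`), the Euclidean distance, and the hand-picked foci are all assumptions made for illustration. Each object stores its distances to the foci in advance. By the triangle inequality, |d(q, f) - d(o, f)| is a lower bound on d(q, o), so an object can be discarded without computing d(q, o) whenever that bound exceeds the query radius for any focus.

```python
import math
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Example metric; any function obeying the triangle inequality would do."""
    return math.dist(a, b)


def omni_coordinates(obj: Point, foci: Sequence[Point], dist: Distance) -> List[float]:
    """Distances from one object to every focus (hypothetical helper name)."""
    return [dist(obj, f) for f in foci]


def range_query(
    query: Point,
    radius: float,
    data: Sequence[Point],
    foci: Sequence[Point],
    coords: Sequence[Sequence[float]],
    dist: Distance,
) -> Tuple[List[Point], int]:
    """Return (objects within radius of query, number of distances computed).

    coords[i] must hold the precomputed focus distances of data[i].
    """
    q_coords = omni_coordinates(query, foci, dist)
    hits: List[Point] = []
    computed = 0
    for obj, obj_coords in zip(data, coords):
        # Lower bound from the triangle inequality: |d(q,f) - d(o,f)| <= d(q,o).
        # If the bound exceeds the radius for any focus, obj cannot qualify.
        if any(abs(qc - oc) > radius for qc, oc in zip(q_coords, obj_coords)):
            continue
        computed += 1
        if dist(query, obj) <= radius:
            hits.append(obj)
    return hits, computed


if __name__ == "__main__":
    # Toy dataset: a 10 x 10 grid, with two corner points chosen as foci
    # purely for illustration.
    data = [(float(x), float(y)) for x in range(10) for y in range(10)]
    foci = [(0.0, 0.0), (9.0, 9.0)]
    coords = [omni_coordinates(p, foci, euclidean) for p in data]
    hits, computed = range_query((5.0, 5.0), 1.5, data, foci, coords, euclidean)
    print(len(hits), "hits;", computed, "of", len(data), "distances computed")
```

Because the filter relies only on a lower bound of the true distance, it can discard objects but never drops a qualifying one, which is the "no false dismissals" property the abstract refers to. Indexing the stored focus distances, for example in an R-tree as the abstract suggests, would replace the linear pass over `coords`; that step is not shown here.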