Indexing algorithm based on storing additional distances in metric space for multi-vantage-point tree

Igor Akeksandrov, Vladimir Fomin
{"title":"Indexing algorithm based on storing additional distances in metric space for multi-vantage-point tree","authors":"Igor Akeksandrov, Vladimir Fomin","doi":"10.31799/1684-8853-2021-4-18-27","DOIUrl":null,"url":null,"abstract":"Introduction: The similarity search paradigm is used in various computational tasks, such as classification, data mining, pattern recognition, etc. Currently, the technology of tree-like metric access methods occupies a significant place among search algorithms. The classical problem of reducing the time of similarity search in metric space is relevant for modern systems when processing big complex data. Due to multidimensional nature of the search algorithm effectiveness problem, local research in this direction is in demand, constantly bringing useful results. Purpose: To reduce the computational complexity of tree search algorithms in problems involving metric proximity. Results: We developed a search algorithm for a multi-vantage-point tree, based on the priority node-processing queue. We mathematically formalized the problems of additional calculations and ways to solve them. To improve the performance of similarity search, we have proposed procedures for forming a priority queue of processing nodes and reducing the number of intersections of same level nodes. Structural changes in the multi-vantage-point tree and the use of minimum distances between vantage points and node subtrees provide better search efficiency. More accurate determination of the distance from the search object to the nodes and the fact that the search area intersects with a tree node allows you to reduce the amount of calculations. Practical relevance: The resulting search algorithms need less time to process information due to an insignificant increase in memory requirements. Reducing the information processing time expands the application boundaries of tree metric indexing methods in search problems involving large data sets.","PeriodicalId":36977,"journal":{"name":"Informatsionno-Upravliaiushchie Sistemy","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Informatsionno-Upravliaiushchie Sistemy","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.31799/1684-8853-2021-4-18-27","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Mathematics","Score":null,"Total":0}
引用次数: 0

Abstract

Introduction: The similarity search paradigm is used in various computational tasks, such as classification, data mining, pattern recognition, etc. Currently, the technology of tree-like metric access methods occupies a significant place among search algorithms. The classical problem of reducing the time of similarity search in metric space is relevant for modern systems when processing big complex data. Due to multidimensional nature of the search algorithm effectiveness problem, local research in this direction is in demand, constantly bringing useful results. Purpose: To reduce the computational complexity of tree search algorithms in problems involving metric proximity. Results: We developed a search algorithm for a multi-vantage-point tree, based on the priority node-processing queue. We mathematically formalized the problems of additional calculations and ways to solve them. To improve the performance of similarity search, we have proposed procedures for forming a priority queue of processing nodes and reducing the number of intersections of same level nodes. Structural changes in the multi-vantage-point tree and the use of minimum distances between vantage points and node subtrees provide better search efficiency. More accurate determination of the distance from the search object to the nodes and the fact that the search area intersects with a tree node allows you to reduce the amount of calculations. Practical relevance: The resulting search algorithms need less time to process information due to an insignificant increase in memory requirements. Reducing the information processing time expands the application boundaries of tree metric indexing methods in search problems involving large data sets.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
基于度量空间中附加距离存储的多优势点树索引算法
简介:相似度搜索范式用于各种计算任务,如分类、数据挖掘、模式识别等。目前,树状度量访问方法技术在搜索算法中占有重要地位。在度量空间中减少相似性搜索时间是现代系统处理大型复杂数据的一个经典问题。由于搜索算法有效性问题的多维性,这方面的局部研究需求很大,不断带来有益的成果。目的:降低树搜索算法在度量接近问题中的计算复杂度。结果:我们开发了一种基于优先节点处理队列的多优势点树搜索算法。我们从数学上形式化了附加计算的问题和解决它们的方法。为了提高相似性搜索的性能,我们提出了形成处理节点优先队列和减少同级节点交集数量的方法。多优势点树的结构变化和优势点与节点子树之间最小距离的使用提供了更好的搜索效率。更准确地确定从搜索对象到节点的距离,以及搜索区域与树节点相交的事实,使您可以减少计算量。实际相关性:由于内存需求的轻微增加,结果搜索算法需要更少的时间来处理信息。减少信息处理时间扩展了树度量索引方法在大数据集搜索问题中的应用范围。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
Informatsionno-Upravliaiushchie Sistemy
Informatsionno-Upravliaiushchie Sistemy Mathematics-Control and Optimization
CiteScore
1.40
自引率
0.00%
发文量
35
期刊最新文献
Modeling of bumping routes in the RSK algorithm and analysis of their approach to limit shapes Continuous control algorithms for conveyer belt routing based on multi-agent deep reinforcement learning Fully integrated optical sensor system with intensity interrogation Decoding of linear codes for single error bursts correction based on the determination of certain events Backend Bug Finder — a platform for effective compiler fuzzing
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1