Data filtering for scalable high-dimensional k-NN search on multicore systems

Xiaoxin Tang, S. Mills, D. Eyers, K. Leung, Zhiyi Huang, M. Guo
{"title":"Data filtering for scalable high-dimensional k-NN search on multicore systems","authors":"Xiaoxin Tang, S. Mills, D. Eyers, K. Leung, Zhiyi Huang, M. Guo","doi":"10.1145/2600212.2600710","DOIUrl":null,"url":null,"abstract":"K Nearest Neighbors (k-NN) search is a widely used category of algorithms with applications in domains such as computer vision and machine learning. With the rapidly increasing amount of data available, and their high dimensionality, k-NN algorithms scale poorly on multicore systems because they hit a memory wall. In this paper, we propose a novel data filtering strategy, named Subspace Clustering for Filtering (SCF), for k-NN search algorithms on multicore platforms. By excluding unlikely features in k-NN search, this strategy can reduce memory footprint as well as computation. Experimental results on four k-NN algorithms show that SCF can improve their performance on two modern multicore platforms with insignificant loss of search precision.","PeriodicalId":330072,"journal":{"name":"IEEE International Symposium on High-Performance Parallel Distributed Computing","volume":"126 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE International Symposium on High-Performance Parallel Distributed Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2600212.2600710","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 7

Abstract

K Nearest Neighbors (k-NN) search is a widely used category of algorithms with applications in domains such as computer vision and machine learning. With the rapidly increasing amount of data available, and their high dimensionality, k-NN algorithms scale poorly on multicore systems because they hit a memory wall. In this paper, we propose a novel data filtering strategy, named Subspace Clustering for Filtering (SCF), for k-NN search algorithms on multicore platforms. By excluding unlikely features in k-NN search, this strategy can reduce memory footprint as well as computation. Experimental results on four k-NN algorithms show that SCF can improve their performance on two modern multicore platforms with insignificant loss of search precision.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
多核系统上可扩展高维k-NN搜索的数据过滤
K近邻(K - nn)搜索是一种广泛使用的算法,在计算机视觉和机器学习等领域都有应用。随着可用数据量的迅速增加,以及它们的高维性,k-NN算法在多核系统上的可扩展性很差,因为它们遇到了内存瓶颈。本文针对多核平台上的k-NN搜索算法,提出了一种新的数据过滤策略,称为子空间聚类过滤(Subspace Clustering for filtering, SCF)。通过在k-NN搜索中排除不可能的特征,该策略可以减少内存占用和计算量。在四种k-NN算法上的实验结果表明,SCF算法在两种现代多核平台上的搜索精度损失不大的情况下,可以提高k-NN算法的性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Data filtering for scalable high-dimensional k-NN search on multicore systems Communication-driven scheduling for virtual clusters in cloud When paxos meets erasure code: reduce network and storage cost in state machine replication Domino: an incremental computing framework in cloud with eventual synchronization TOP-PIM: throughput-oriented programmable processing in memory
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1