Tutorial: Are You My Neighbor?: Bringing Order to Neighbor Computing Problems.

D. Anastasiu, H. Rangwala, Andrea Tagarelli
{"title":"Tutorial: Are You My Neighbor?: Bringing Order to Neighbor Computing Problems.","authors":"D. Anastasiu, H. Rangwala, Andrea Tagarelli","doi":"10.1145/3292500.3332292","DOIUrl":null,"url":null,"abstract":"Finding nearest neighbors is an important topic that has attracted much attention over the years and has applications in many fields, such as market basket analysis, plagiarism and anomaly detection, community detection, ligand-based virtual screening, etc. As data are easier and easier to collect, finding neighbors has become a potential bottleneck in analysis pipelines. Performing pairwise comparisons given the massive datasets of today is no longer feasible. The high computational complexity of the task has led researchers to develop approximate methods, which find many but not all of the nearest neighbors. Yet, for some types of data, efficient exact solutions have been found by carefully partitioning or filtering the search space in a way that avoids most unnecessary comparisons. In recent years, there have been several fundamental advances in our ability to efficiently identify appropriate neighbors, especially in non-traditional data, such as graphs or document collections. In this tutorial, we provide an in-depth overview of recent methods for finding (nearest) neighbors, focusing on the intuition behind choices made in the design of those algorithms and on the utility of the methods in real-world applications. Our tutorial aims to provide a unifying view of \"neighbor computing\" problems, spanning from numerical data to graph data, from categorical data to sequential data, and related application scenarios. For each type of data, we will review the current state-of-the-art approaches used to identify neighbors and discuss how neighbor search methods are used to solve important problems.","PeriodicalId":186134,"journal":{"name":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3292500.3332292","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Finding nearest neighbors is an important topic that has attracted much attention over the years and has applications in many fields, such as market basket analysis, plagiarism and anomaly detection, community detection, ligand-based virtual screening, etc. As data are easier and easier to collect, finding neighbors has become a potential bottleneck in analysis pipelines. Performing pairwise comparisons given the massive datasets of today is no longer feasible. The high computational complexity of the task has led researchers to develop approximate methods, which find many but not all of the nearest neighbors. Yet, for some types of data, efficient exact solutions have been found by carefully partitioning or filtering the search space in a way that avoids most unnecessary comparisons. In recent years, there have been several fundamental advances in our ability to efficiently identify appropriate neighbors, especially in non-traditional data, such as graphs or document collections. In this tutorial, we provide an in-depth overview of recent methods for finding (nearest) neighbors, focusing on the intuition behind choices made in the design of those algorithms and on the utility of the methods in real-world applications. Our tutorial aims to provide a unifying view of "neighbor computing" problems, spanning from numerical data to graph data, from categorical data to sequential data, and related application scenarios. For each type of data, we will review the current state-of-the-art approaches used to identify neighbors and discuss how neighbor search methods are used to solve important problems.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
教程:你是我的邻居吗?:为邻居计算问题带来秩序。
寻找最近邻是近年来备受关注的一个重要课题,在市场购物篮分析、抄袭与异常检测、社区检测、基于配体的虚拟筛选等诸多领域都有应用。随着数据越来越容易收集,寻找邻居已经成为分析管道中的一个潜在瓶颈。考虑到今天的海量数据集,两两比较已不再可行。这项任务的高计算复杂性促使研究人员开发出近似方法,这种方法可以找到许多但不是所有的最近邻。然而,对于某些类型的数据,通过以避免大多数不必要的比较的方式仔细划分或过滤搜索空间,已经找到了有效的精确解决方案。近年来,我们在有效地识别适当的邻居的能力方面取得了一些根本性的进步,特别是在非传统数据中,如图形或文档集合。在本教程中,我们将深入概述查找(最近)邻居的最新方法,重点关注这些算法设计中所做选择背后的直觉,以及这些方法在实际应用程序中的实用性。我们的教程旨在提供“邻居计算”问题的统一视图,从数值数据到图形数据,从分类数据到顺序数据,以及相关的应用场景。对于每种类型的数据,我们将回顾当前用于识别邻居的最先进方法,并讨论如何使用邻居搜索方法来解决重要问题。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Tackle Balancing Constraint for Incremental Semi-Supervised Support Vector Learning HATS Temporal Probabilistic Profiles for Sepsis Prediction in the ICU Large-scale User Visits Understanding and Forecasting with Deep Spatial-Temporal Tensor Factorization Framework Adaptive Influence Maximization
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1