Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs

Xiaofeng Yang, Patrick K Nicholson, Deepak Ajwani, Mirek Riedewald, Wolfgang Gatterbauer, Alessandra Sala
Journal: Proceedings of the ... International World-Wide Web Conference. International WWW Conference
Volume: 2018, pages 489-498
Publication date: 2018-04-01
Publication type: Journal Article
DOI: 10.1145/3178876.3186115 (https://doi.org/10.1145/3178876.3186115)
Citations: 11

Abstract

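Many problems in areas as diverse as recommendation systems, social network analysis, semantic search, and distributed root cause analysis can be modeled as pattern search on labeled graphs (also called "heterogeneous information networks" or HINs). Given a large graph and a query pattern with node and edge label constraints, a fundamental challenge is to find the top-k matches according to a ranking function over edge and node weights. Users, however, find it difficult to choose a value for k. We therefore propose the novel notion of an any-k ranking algorithm: for a given time budget, it returns as many of the top-ranked results as possible; given additional time, it quickly produces the next lower-ranked results as well. The algorithm can be stopped at any time, but may have to continue until all results are returned. This paper focuses on acyclic patterns over arbitrary labeled graphs. We are interested in practical algorithms that effectively exploit (1) properties of heterogeneous networks, in particular selective constraints on labels, and (2) the fact that users often explore only a fraction of the top-ranked results. Our solution, KARPET, carefully integrates aggressive pruning that leverages the acyclic nature of the query with incremental guided search. This design enables us to prove strong, non-trivial time and space guarantees, which is generally considered very hard for this type of graph search problem. Experimental studies show that KARPET achieves running times on the order of milliseconds for tree patterns on large networks with millions of nodes and edges.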
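
The any-k idea described in the abstract can be illustrated with a small sketch. The code below is not KARPET and is not taken from the paper; it is a simplified illustration that assumes the query is a simple path of node labels (the simplest acyclic pattern), that edges are directed and weighted, and that a match is ranked by the sum of its edge weights (lower is better). The names `any_k_paths` and `take_within` are hypothetical. A bottom-up pass first prunes nodes that cannot complete the pattern and records the cheapest completion from each surviving node; a best-first search guided by those exact bounds then yields matches one at a time in rank order, so the caller can stop whenever the time budget runs out.

```python
import heapq
import itertools
import time
from collections import defaultdict


def any_k_paths(labels, edges, pattern):
    """Yield (weight, path) matches of a label path pattern, lightest first.

    labels:  dict mapping node -> label
    edges:   iterable of directed, weighted edges (u, v, w)
    pattern: non-empty list of labels, one per position on the path
    Matching is by label only, so a node may appear twice in one path.
    """
    adj = defaultdict(list)
    for u, v, w in edges:
        adj[u].append((v, w))
    m = len(pattern)

    # Pruning pass: best[i][u] is the cheapest way to finish the pattern
    # when node u is matched at position i. Nodes with no entry are pruned.
    best = [dict() for _ in range(m)]
    for u, lab in labels.items():
        if lab == pattern[-1]:
            best[m - 1][u] = 0.0
    for i in range(m - 2, -1, -1):
        for u, lab in labels.items():
            if lab != pattern[i]:
                continue
            costs = [w + best[i + 1][v] for v, w in adj[u] if v in best[i + 1]]
            if costs:
                best[i][u] = min(costs)

    # Guided search: priority = weight so far + exact cheapest completion.
    # The counter breaks ties so that paths are never compared directly.
    tie = itertools.count()
    heap = [(c, next(tie), 0.0, (u,)) for u, c in best[0].items()]
    heapq.heapify(heap)
    while heap:
        _, _, g, path = heapq.heappop(heap)
        i = len(path) - 1
        if i == m - 1:
            yield g, path  # complete match, emitted in rank order
            continue
        for v, w in adj[path[-1]]:
            if v in best[i + 1]:
                bound = g + w + best[i + 1][v]
                heapq.heappush(heap, (bound, next(tie), g + w, path + (v,)))


def take_within(results, budget_s):
    """Collect results until the time budget (in seconds) is used up."""
    deadline = time.monotonic() + budget_s
    out = []
    for r in results:
        out.append(r)
        if time.monotonic() >= deadline:
            break
    return out
```

Because `any_k_paths` is a generator, no value of k has to be fixed up front: `itertools.islice(any_k_paths(...), k)` gives a classic top-k, while `take_within(any_k_paths(...), 0.05)` returns however many top-ranked matches fit into roughly 50 ms. Extending the sketch from paths to general tree patterns, and obtaining the guarantees the abstract mentions, is beyond what this example attempts.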