Information-complete and redundancy-free keyword search over large data graphs

Proceedings of the 21st ACM international conference on Information and knowledge management. Pub Date: 2012-10-29. DOI: 10.1145/2396761.2398712

Byron J. Gao, Zhumin Chen, Qi Kang

Citations: 1

Abstract

Keyword search over graphs has a wide array of applications in querying structured, semi-structured, and unstructured data. Existing models typically use minimal trees or bounded subgraphs as query answers. While such models emphasize relevancy, they suffer from incomplete information and redundancy among answers, making it difficult for users to explore query answers effectively. To overcome these drawbacks, we propose a novel cluster-based model in which query answers are relevancy-connected clusters. A cluster is a subgraph induced from a maximal set of relevancy-connected nodes. Such clusters are coherent and relevant, yet complete and redundancy-free. They can be of arbitrary shape, in contrast to the sphere-shaped bounded subgraphs of existing models. We also propose an efficient search algorithm and a corresponding graph index for large, disk-resident data graphs.
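As a rough illustration of the cluster idea, the sketch below treats a cluster as a connected component of the subgraph induced by the relevant nodes. The abstract does not define "relevancy-connected", so this reading is an assumption: a set of relevant nodes is taken as given, and two relevant nodes count as relevancy-connected when a path made up only of relevant nodes joins them. The names `relevancy_clusters` and `induced_edges` are hypothetical, and this in-memory breadth-first search stands in for, and is not, the paper's search algorithm or disk-resident index.

```python
from collections import deque


def relevancy_clusters(adj, relevant):
    """Group relevant nodes into maximal connected sets.

    adj: dict mapping each node to an iterable of neighbours (undirected).
    relevant: set of nodes assumed to be relevant to the query.
    Assumption: relevancy-connected = linked by a path of relevant nodes.
    """
    seen = set()
    clusters = []
    for start in sorted(relevant):
        if start in seen:
            continue
        seen.add(start)
        cluster = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in adj.get(node, ()):
                # Expand only through relevant nodes, so the cluster can
                # take any shape rather than a fixed-radius ball.
                if nbr in relevant and nbr not in seen:
                    seen.add(nbr)
                    cluster.add(nbr)
                    queue.append(nbr)
        clusters.append(cluster)
    return clusters


def induced_edges(adj, cluster):
    """Edges of the subgraph induced by the nodes in `cluster`."""
    return {
        tuple(sorted((u, v)))
        for u in cluster
        for v in adj.get(u, ())
        if v in cluster
    }


# Toy path graph a-b-c-d-e-f in which node "d" is not relevant.
adj = {
    "a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
    "d": ["c", "e"], "e": ["d", "f"], "f": ["e"],
}
relevant = {"a", "b", "c", "e", "f"}
for cluster in relevancy_clusters(adj, relevant):
    print(sorted(cluster), sorted(induced_edges(adj, cluster)))
```

Under this reading every relevant node lands in exactly one cluster, so no relevant node is dropped and no two answers overlap, which is one way to interpret "complete and redundancy-free".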


