Iterative Graph Feature Mining for Graph Indexing

2012 IEEE 28th International Conference on Data Engineering. Pub Date: 2012-04-01. DOI: 10.1109/ICDE.2012.11

Dayu Yuan, P. Mitra, Huiwen Yu, C. Lee Giles


Citations: 20

Abstract

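Subgraph search is a popular query scenario on graph databases. Given a query graph q, a subgraph search algorithm returns all database graphs that contain q as a subgraph. To implement subgraph search efficiently, subgraph features are mined and used to index the graph database. Many subgraph feature mining approaches have been proposed. They are all "mine-at-once" algorithms, in which the whole feature set is mined in one run before a stable graph index is built. However, when the environment changes (for example, the graph database is updated or the available memory increases), the index needs to be updated to accommodate the change. Most "mine-at-once" algorithms involve frequent subgraph or subtree mining over the whole graph database. In addition, constructing and deploying a new index involves expensive disk operations, so it is inefficient to re-mine the features and rebuild the index from scratch. We observe that, in most cases, it is sufficient to update a small part of the graph index. Here we propose an "iterative subgraph mining" algorithm that iteratively finds one feature to insert into (or remove from) the index. Because the majority of the indexing features and the index structure are unchanged, the algorithm can be invoked frequently. We define an objective function that guides the feature mining. Next, we propose a basic branch-and-bound algorithm to mine the features. Finally, we design an advanced search algorithm that quickly finds a near-optimal subgraph feature and reduces the search space. Experiments show that our feature mining algorithm is 5 times faster than the popular graph indexing algorithm gIndex, and that features mined by our iterative algorithm have a better filtering rate for the subgraph search problem.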
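The filter-then-verify idea behind a feature-based graph index, and the notion of inserting or removing a single feature without rebuilding the rest, can be illustrated with a small Python sketch. This is not the authors' algorithm: the `Graph` and `FeatureIndex` classes, the brute-force `contains` test, and the `pick_feature` cost (total candidate-set size over a sample query workload) are all hypothetical stand-ins chosen for illustration. The abstract does not state the paper's actual objective function or its branch-and-bound search, so neither is reproduced here.

```python
from itertools import permutations


class Graph:
    """Small undirected vertex-labeled graph; vertices are 0..n-1."""

    def __init__(self, labels, edges):
        self.labels = list(labels)
        self.edges = [tuple(e) for e in edges]
        self.edge_set = {frozenset(e) for e in self.edges}


def contains(g, q):
    """Brute-force test that q is a subgraph of g (only viable for tiny graphs)."""
    n, m = len(g.labels), len(q.labels)
    for mapping in permutations(range(n), m):
        labels_ok = all(q.labels[i] == g.labels[mapping[i]] for i in range(m))
        if labels_ok and all(frozenset((mapping[u], mapping[v])) in g.edge_set
                             for u, v in q.edges):
            return True
    return False


class FeatureIndex:
    """Maps each feature to the ids of the database graphs that contain it."""

    def __init__(self, database):
        self.database = database
        self.features = {}  # name -> (feature graph, set of supporting graph ids)

    def insert_feature(self, name, feature):
        # Only the new feature's posting list is computed; others are untouched.
        support = {i for i, g in enumerate(self.database) if contains(g, feature)}
        self.features[name] = (feature, support)

    def remove_feature(self, name):
        del self.features[name]

    def candidates(self, query):
        # Filter: if a feature occurs in the query, every answer must contain it.
        cand = set(range(len(self.database)))
        for feature, support in self.features.values():
            if contains(query, feature):
                cand &= support
        return cand

    def search(self, query):
        # Verify: run the expensive containment test on the survivors only.
        return sorted(i for i in self.candidates(query)
                      if contains(self.database[i], query))


def pick_feature(index, pool, workload):
    """Return the pooled feature that most reduces total candidate-set size.

    Stand-in objective for illustration; not the objective defined in the paper.
    """
    def cost():
        return sum(len(index.candidates(q)) for q in workload)

    best_name, best_cost = None, cost()
    for name, feature in pool.items():
        if name in index.features:
            continue
        index.insert_feature(name, feature)
        trial = cost()
        index.remove_feature(name)
        if trial < best_cost:
            best_name, best_cost = name, trial
    return best_name


# Toy database of labeled paths (assumed data, for illustration only).
db = [
    Graph("CCO", [(0, 1), (1, 2)]),
    Graph("CCC", [(0, 1), (1, 2)]),
    Graph("CO", [(0, 1)]),
    Graph("CCN", [(0, 1), (1, 2)]),
]
query = Graph("CCO", [(0, 1), (1, 2)])
cc, co, cn = Graph("CC", [(0, 1)]), Graph("CO", [(0, 1)]), Graph("CN", [(0, 1)])

index = FeatureIndex(db)
index.insert_feature("C-C", cc)
chosen = pick_feature(index, {"C-O": co, "C-N": cn}, [query])
index.insert_feature(chosen, {"C-O": co, "C-N": cn}[chosen])
print(chosen, index.candidates(query), index.search(query))
```

In this toy setup the index starts with one feature and gains one more, chosen greedily from a fixed pool. A real miner would enumerate candidate subgraphs rather than take a pool as input, and would use a far cheaper containment test than the permutation search used here.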


