{"title":"Making graphs compact by lossless contraction.","authors":"Wenfei Fan, Yuanhao Li, Muyang Liu, Can Lu","doi":"10.1007/s00778-022-00731-7","DOIUrl":null,"url":null,"abstract":"<p><p>This paper proposes a scheme to reduce big graphs to small graphs. It contracts obsolete parts and regular structures into supernodes. The supernodes carry a synopsis <math><msub><mi>S</mi> <mi>Q</mi></msub> </math> for each query class <math><mi>Q</mi></math> in use, to abstract key features of the contracted parts for answering queries of <math><mi>Q</mi></math> . Moreover, for various types of graphs, we identify regular structures to contract. The contraction scheme provides a compact graph representation and prioritizes up-to-date data. Better still, it is generic and lossless. We show that the same contracted graph is able to support multiple query classes at the same time, no matter whether their queries are label based or not, local or non-local. Moreover, existing algorithms for these queries can be readily adapted to compute exact answers by using the synopses when possible and decontracting the supernodes only when necessary. As a proof of concept, we show how to adapt existing algorithms for subgraph isomorphism, triangle counting, shortest distance, connected component and clique decision to contracted graphs. We also provide a bounded incremental contraction algorithm in response to updates, such that its cost is determined by the size of areas affected by the updates alone, not by the entire graphs. We experimentally verify that on average, the contraction scheme reduces graphs by 71.9% and improves the evaluation of these queries by 1.69, 1.44, 1.47, 2.24 and 1.37 times, respectively.</p>","PeriodicalId":49373,"journal":{"name":"Vldb Journal","volume":"32 1","pages":"49-73"},"PeriodicalIF":2.8000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9845199/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Vldb Journal","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00778-022-00731-7","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2022/2/19 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0
Abstract
This paper proposes a scheme to reduce big graphs to small graphs. It contracts obsolete parts and regular structures into supernodes. The supernodes carry a synopsis $S_Q$ for each query class $Q$ in use, to abstract key features of the contracted parts for answering queries of $Q$. Moreover, for various types of graphs, we identify regular structures to contract. The contraction scheme provides a compact graph representation and prioritizes up-to-date data. Better still, it is generic and lossless. We show that the same contracted graph is able to support multiple query classes at the same time, no matter whether their queries are label based or not, local or non-local. Moreover, existing algorithms for these queries can be readily adapted to compute exact answers by using the synopses when possible and decontracting the supernodes only when necessary. As a proof of concept, we show how to adapt existing algorithms for subgraph isomorphism, triangle counting, shortest distance, connected component and clique decision to contracted graphs. We also provide a bounded incremental contraction algorithm in response to updates, such that its cost is determined by the size of the areas affected by the updates alone, not by the entire graphs. We experimentally verify that on average, the contraction scheme reduces graphs by 71.9% and improves the evaluation of these queries by 1.69, 1.44, 1.47, 2.24 and 1.37 times, respectively.
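To make the idea concrete, below is a minimal sketch, in Python, of a supernode that carries a synopsis for one query class, shortest distance. It is not the authors' implementation: the names SuperNode, border and dist_synopsis, and the choice of a border-to-border distance table as the synopsis, are illustrative assumptions. The point it shows is that distance queries passing through the contracted region can be answered from the synopsis alone, and the supernode is decontracted only when a query endpoint lies inside it.

```python
# A minimal sketch (assumed names, not the paper's implementation) of a
# supernode carrying a synopsis S_Q for one query class Q, here shortest
# distance: a table of distances between the supernode's border vertices,
# precomputed once over the contracted part.
from collections import defaultdict
import heapq


class SuperNode:
    def __init__(self, inner_edges, border):
        # inner_edges: weighted edges of the contracted part, kept so the
        # supernode can be decontracted when a query really needs it.
        self.inner_edges = inner_edges
        # border: vertices of the contracted part that still have edges
        # to the rest of the graph.
        self.border = set(border)
        self.dist_synopsis = self._border_distances()

    def _border_distances(self):
        # Dijkstra from every border vertex, restricted to the contracted part.
        adj = defaultdict(list)
        for u, v, w in self.inner_edges:
            adj[u].append((v, w))
            adj[v].append((u, w))
        synopsis = {}
        for s in self.border:
            dist = {s: 0}
            pq = [(0, s)]
            while pq:
                d, u = heapq.heappop(pq)
                if d > dist.get(u, float("inf")):
                    continue
                for v, w in adj[u]:
                    if d + w < dist.get(v, float("inf")):
                        dist[v] = d + w
                        heapq.heappush(pq, (d + w, v))
            for t in self.border:
                synopsis[(s, t)] = dist.get(t, float("inf"))
        return synopsis


# Usage: the distance from border vertex "a" to border vertex "b" through
# the contracted part is read off the synopsis; the inner vertices x and y
# are never materialized unless a query endpoint falls inside the supernode.
sn = SuperNode(inner_edges=[("a", "x", 1), ("x", "y", 2), ("y", "b", 1)],
               border=["a", "b"])
print(sn.dist_synopsis[("a", "b")])  # 4
```

In this spirit, an existing shortest-distance algorithm would treat each supernode as a shortcut described by its synopsis and fall back to the stored inner edges only when necessary, which is how the paper's adapted algorithms remain exact.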
About the journal:
The journal is dedicated to the publication of scholarly contributions in areas of data management such as database system technology and information systems, including their architectures and applications. Further, the journal’s scope is restricted to areas of data management that are covered by the combined expertise of the journal’s editorial board.
Submissions with a substantial theory component are welcome, but the VLDB Journal expects such submissions also to embody a systems component.
In relation to data mining, the journal will handle submissions where systems issues play a significant role. Factors that we use to determine whether a data mining paper is within scope include:
The submission targets systems issues in relation to data mining, e.g., by covering integration with a database engine or with other data management functionality.
The submission’s contributions build on (rather than simply cite) work already published in database outlets, e.g., VLDBJ, ACM TODS, PVLDB, ACM SIGMOD, IEEE ICDE, EDBT.
The journal's editorial board has the necessary expertise on the submission's topic.
Traditional, stand-alone data mining papers that lack the above or similar characteristics are out of scope for this journal. Criteria similar to the above are applied to submissions from other areas, e.g., information retrieval and geographical information systems.