Summarizing Labeled Multi-Graphs

Machine learning and knowledge discovery in databases : European Conference, ECML PKDD ... : proceedings. ECML PKDD (Conference) Pub Date : 2022-06-15 DOI:10.48550/arXiv.2206.07674

Dimitris Berberidis, P. Liang, L. Akoglu

Volume 36 1, pages 53-68

Citations: 1

Abstract

Real-world graphs can be difficult to interpret and visualize beyond a certain size. To address this issue, graph summarization aims to simplify and shrink a graph while maintaining its high-level structure and characteristics. Most summarization methods are designed for homogeneous, undirected, simple graphs; however, many real-world graphs are ornate, with characteristics including node labels, directed edges, edge multiplicities, and self-loops. In this paper we propose LM-Gsum, a versatile yet rigorous graph summarization model that (to the best of our knowledge, for the first time) can handle graphs with all the aforementioned characteristics (and any combination thereof). Moreover, our proposed model captures basic sub-structures that are prevalent in real-world graphs, such as cliques, stars, etc. LM-Gsum compactly quantifies the information content of a complex graph using a novel encoding scheme, where it seeks to minimize the total number of bits required to encode (i) the summary graph, as well as (ii) the corrections required for reconstructing the input graph losslessly. To accelerate the summary construction, it creates super-nodes efficiently by merging nodes in groups. Experiments demonstrate that LM-Gsum facilitates the visualization of real-world complex graphs, revealing interpretable structures and high-level relationships. Furthermore, LM-Gsum achieves a better trade-off between compression rate and running time relative to existing methods (compared only in comparable settings).
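The two-part objective in the abstract (bits for the summary plus bits for the corrections) can be illustrated with a toy cost function. This is a minimal sketch and not the LM-Gsum encoding, which is not specified here. The function names `bits_for_int` and `description_length`, the integer code, and the "expected multiplicity per block" reconstruction rule are all assumptions made for illustration, and node labels are ignored. It handles directed edges, multiplicities, and self-loops by storing the graph as a `Counter` keyed by ordered node pairs.

```python
from collections import Counter
from itertools import product
from math import log2


def bits_for_int(x: int) -> float:
    """Toy cost of a non-negative integer (assumed code, not from the paper)."""
    return log2(x + 1) + 1


def description_length(edges: Counter, partition: dict) -> float:
    """Toy two-part cost of a directed multigraph under a node partition.

    edges:     Counter mapping (source, target) -> multiplicity
    partition: dict mapping node -> super-node id
    """
    groups = {}
    for node, group in partition.items():
        groups.setdefault(group, []).append(node)
    n, k = len(partition), len(groups)
    node_bits = log2(n) if n > 1 else 1.0
    group_bits = log2(k) if k > 1 else 1.0

    # (i) Summary: node-to-super-node map plus weighted super-edges.
    total = n * group_bits
    block_weight = Counter()
    for (u, v), mult in edges.items():
        block_weight[(partition[u], partition[v])] += mult
    for weight in block_weight.values():
        total += 2 * group_bits + bits_for_int(weight)

    # (ii) Corrections: every node pair whose true multiplicity differs
    # from the rounded average multiplicity of its block is listed
    # explicitly, so the input can be reconstructed losslessly.
    for a, b in product(groups, repeat=2):
        pairs = len(groups[a]) * len(groups[b])
        expected = round(block_weight[(a, b)] / pairs)
        for u, v in product(groups[a], groups[b]):
            delta = edges[(u, v)] - expected
            if delta != 0:
                # two endpoints + sign bit + magnitude
                total += 2 * node_bits + 1 + bits_for_int(abs(delta))
    return total


# Example: group A is fully connected (self-loops included) and every
# node in A sends a double edge to every node in B.
A, B = [0, 1, 2], [3, 4, 5]
edges = Counter()
for u, v in product(A, A):
    edges[(u, v)] = 1
for u, v in product(A, B):
    edges[(u, v)] = 2

grouped = {node: ("A" if node in A else "B") for node in A + B}
singletons = {node: node for node in A + B}

print("grouped   :", description_length(edges, grouped))
print("singletons:", description_length(edges, singletons))
```

In this example, merging the nodes into two super-nodes yields a shorter description than leaving every node on its own, because each block is perfectly regular and needs no corrections. Adding a single stray edge that breaks the block structure would introduce both a new super-edge and a correction entry, raising the cost.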

