Core Maintenance on Dynamic Graphs: A Distributed Approach Built on H-Index

IEEE Transactions on Big Data (IF 7.5; Tier 3, Computer Science; JCR Q1, Computer Science, Information Systems) Pub Date: 2024-01-11 DOI: 10.1109/TBDATA.2024.3352973

Qiang-Sheng Hua, Hongen Wang, Hai Jin, Xuanhua Shi

IEEE Transactions on Big Data, vol. 10, no. 5, pp. 595-608. Publication type: Journal Article. Full text: https://ieeexplore.ieee.org/document/10388383/

Citations: 0

Abstract

The core number is an essential tool for analyzing graph structure. Real-world graphs are typically large and dynamic, which calls for distributed algorithms that avoid expensive I/O operations and for maintenance algorithms that handle dynamism. Core maintenance updates the core number of each vertex when vertices or edges are inserted or deleted. Although the state-of-the-art distributed maintenance algorithm (Weng et al. 2022) can handle multiple edge insertions/deletions simultaneously, it leaves room for improvement in two respects. (I) Edges with the same core number cannot be inserted or removed in parallel, which reduces the degree of parallelism and increases the number of rounds. (II) In the implementation, only one thread is assigned to the vertices with the same core number, so the distributed computing power cannot be fully utilized. By contrast, the distributed core decomposition algorithm (Montresor et al. 2013) based on the h-index (Lü et al. 2016) can fully utilize the distributed computing power because all vertices can be processed in parallel. However, it requires all vertices to recompute their core numbers whenever the graph changes. In this article, we propose a distributed core maintenance algorithm based on the h-index that circumvents both issues of the algorithm of Weng et al. (2022). In addition, our algorithm avoids recalculating core numbers that do not change. Compared with the state-of-the-art distributed maintenance algorithm (Weng et al. 2022), the time speedup ratio is at least 100 for both insertion and deletion. Compared with the distributed core decomposition algorithm (Montresor et al. 2013), the average time speedup ratios are 2 and 8 for insertion and deletion, respectively.
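The h-index-based decomposition that the abstract builds on can be illustrated with a small sketch. It relies on the known result that repeatedly replacing each vertex's estimate with the h-index of its neighbors' estimates, starting from vertex degrees, converges to the core numbers. The code below simulates synchronous rounds on a single machine; the names `h_index` and `core_numbers`, the adjacency-dict representation, and the toy graph are assumptions made for illustration. It is not the authors' maintenance algorithm, only the recompute-everything baseline that their method is designed to avoid.

```python
def h_index(values):
    """Return the largest h such that at least h of the values are >= h."""
    h = 0
    for rank, value in enumerate(sorted(values, reverse=True), start=1):
        if value >= rank:
            h = rank
        else:
            break
    return h


def core_numbers(adj):
    """Iterate the h-index operator from vertex degrees until a fixed point.

    adj maps each vertex to the set of its neighbors (undirected graph).
    Returns the core number of every vertex and the number of rounds used.
    """
    estimate = {v: len(neighbors) for v, neighbors in adj.items()}
    rounds = 0
    while True:
        # Every vertex updates from its neighbors' previous estimates,
        # so in a distributed setting all vertices could do this in parallel.
        updated = {
            v: h_index([estimate[u] for u in adj[v]]) for v in adj
        }
        rounds += 1
        if updated == estimate:
            return estimate, rounds
        estimate = updated


# Toy graph (hypothetical): a triangle a-b-c with a pendant vertex d on c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cores, rounds_used = core_numbers(graph)
print(cores, rounds_used)
```

In this baseline, inserting or deleting an edge means calling `core_numbers` again on the whole graph, which is exactly the full recomputation the abstract says the proposed maintenance algorithm avoids for vertices whose core numbers do not change.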



Source journal: IEEE Transactions on Big Data

CiteScore: 11.80

Self-citation rate: 2.80%

Articles published: 114

About the journal: The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.

Latest articles from this journal

2024 Reviewers List*
Robust Privacy-Preserving Federated Item Ranking in Online Marketplaces: Exploiting Platform Reputation for Effective Aggregation
Guest Editorial TBD Special Issue on Graph Machine Learning for Recommender Systems
Data-Centric Graph Learning: A Survey
Reliable Data Augmented Contrastive Learning for Sequential Recommendation