Towards Faster Graph Partitioning via Pre-training and Inductive Inference
Meng Qin, Chaorui Zhang, Yu Gao, Yibin Ding, Weipeng Jiang, Weixi Zhang, Wei Han, Bo Bai
arXiv:2409.00670 (arXiv - CS - Social and Information Networks), 2024-09-01
Abstract
Graph partitioning (GP) is a classic problem that divides the node set of a graph into densely connected blocks. Motivated by the IEEE HPEC Graph Challenge and recent advances in pre-training techniques (e.g., large language models), we propose PR-GPT (Pre-trained & Refined Graph ParTitioning), which follows a novel pre-training & refinement paradigm. We first pre-train a deep graph learning (DGL) model offline on small synthetic graphs covering diverse topological properties. Thanks to the inductive inference of DGL, the pre-trained model (with frozen parameters) can be directly generalized to large graphs to derive feasible GP results. The derived partition then serves as a good initialization for an efficient GP method (e.g., InfoMap), which further refines partitioning quality. In this setting, the online generalization and refinement of PR-GPT not only benefit from the quality transferred by pre-training but also ensure high inference efficiency without re-training. By reducing the scale of the graph that the refinement method has to process, PR-GPT also has the potential to support streaming GP. Experiments on the Graph Challenge benchmark demonstrate that, compared with running a refinement method from scratch, PR-GPT achieves faster GP on large-scale graphs without significant quality degradation. We will make our code public at https://github.com/KuroginQin/PRGPT.
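
To make the paradigm concrete, below is a minimal, hypothetical Python sketch of the three stages the abstract describes: offline pre-training on small synthetic graphs, online inductive inference with frozen parameters, and refinement on a reduced-scale graph. All names (TinyGCN, pretrain, generalize, refine_with_infomap) and design choices (degree features, a differentiable modularity objective, SBM training graphs, the `infomap` package) are assumptions for illustration, not the authors' actual implementation.

```python
# Hypothetical sketch of the pre-training & refinement pipeline outlined above.
# TinyGCN, pretrain, generalize, and refine_with_infomap are illustrative names;
# the objective and features are assumptions, not the authors' design.
from collections import Counter

import networkx as nx
import numpy as np
import torch
import torch.nn as nn


def norm_adj(G):
    """Dense symmetrically-normalized adjacency with self-loops (small graphs only)."""
    A = nx.to_numpy_array(G) + np.eye(G.number_of_nodes())
    D = np.diag(A.sum(1) ** -0.5)
    return torch.tensor(D @ A @ D, dtype=torch.float32)


def degree_features(G):
    """One-dimensional node features: the node degree."""
    return torch.tensor([[float(G.degree(v))] for v in G], dtype=torch.float32)


class TinyGCN(nn.Module):
    """Two-layer GCN mapping node features to soft assignments over k blocks."""
    def __init__(self, k=8, hidden=32):
        super().__init__()
        self.w1, self.w2 = nn.Linear(1, hidden), nn.Linear(hidden, k)

    def forward(self, A_hat, x):
        h = torch.relu(A_hat @ self.w1(x))
        return torch.softmax(A_hat @ self.w2(h), dim=1)


def modularity_loss(P, G):
    """Negative (differentiable) modularity of soft partition P, used as a
    self-supervised pre-training objective (no ground-truth blocks needed)."""
    A = torch.tensor(nx.to_numpy_array(G), dtype=torch.float32)
    d = A.sum(1, keepdim=True)
    two_m = A.sum()
    B = A - (d @ d.T) / two_m
    return -torch.trace(P.T @ B @ P) / two_m


def pretrain(model, n_graphs=200, epochs=5):
    """Offline pre-training on small synthetic SBM graphs with varied topology."""
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(n_graphs):
        sizes = np.random.randint(20, 60, size=np.random.randint(2, 6)).tolist()
        p_in, p_out = np.random.uniform(0.2, 0.5), np.random.uniform(0.01, 0.05)
        probs = np.full((len(sizes), len(sizes)), p_out)
        np.fill_diagonal(probs, p_in)
        G = nx.stochastic_block_model(sizes, probs.tolist())
        A_hat, x = norm_adj(G), degree_features(G)
        for _ in range(epochs):
            loss = modularity_loss(model(A_hat, x), G)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model


@torch.no_grad()
def generalize(model, G):
    """Online inductive inference: apply the frozen model to an unseen graph
    and read off an initial block label per node (no re-training)."""
    return model(norm_adj(G), degree_features(G)).argmax(1).tolist()


def refine_with_infomap(G, labels):
    """Refinement sketch: contract predicted blocks into super-nodes so the
    refinement method only processes a much smaller graph, then map the
    resulting modules back to the original nodes. Uses the `infomap` Python
    package (an assumption); the paper's actual refinement may differ."""
    from infomap import Infomap

    idx = {v: i for i, v in enumerate(G)}
    weights = Counter()
    for u, v in G.edges():
        su, sv = labels[idx[u]], labels[idx[v]]
        if su != sv:
            weights[(min(su, sv), max(su, sv))] += 1
    if not weights:  # every node already in one block; nothing to refine
        return labels
    im = Infomap("--two-level --silent")
    for (a, b), w in weights.items():
        im.add_link(a, b, w)
    im.run()
    module_of = im.get_modules()  # super-node id -> refined module id
    return [module_of.get(lab, lab) for lab in labels]
```

A typical run of this sketch would be `model = pretrain(TinyGCN(k=16))`, then `labels = generalize(model, G_large)` and `refined = refine_with_infomap(G_large, labels)` for a large graph `G_large`. The super-node contraction mirrors the scale-reduction mechanism mentioned in the abstract: the refinement method sees at most k super-nodes instead of all original nodes, which is also what makes a streaming variant plausible.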