Training Large-Scale Graph Neural Networks via Graph Partial Pooling

IF 7.5 · CAS Tier 3 (Computer Science) · JCR Q1 (Computer Science, Information Systems) · IEEE Transactions on Big Data · Pub Date: 2024-03-20 · DOI: 10.1109/TBDATA.2024.3403380
Qi Zhang;Yanfeng Sun;Shaofan Wang;Junbin Gao;Yongli Hu;Baocai Yin
{"title":"通过图部分池化训练大规模图神经网络","authors":"Qi Zhang;Yanfeng Sun;Shaofan Wang;Junbin Gao;Yongli Hu;Baocai Yin","doi":"10.1109/TBDATA.2024.3403380","DOIUrl":null,"url":null,"abstract":"Graph Neural Networks (GNNs) are powerful tools for graph representation learning, but they face challenges when applied to large-scale graphs due to substantial computational costs and memory requirements. To address scalability limitations, various methods have been proposed, including sampling-based and decoupling-based methods. However, these methods have their limitations: sampling-based methods inevitably discard some link information during the sampling process, while decoupling-based methods require alterations to the model's structure, reducing their adaptability to various GNNs. This paper proposes a novel graph pooling method, Graph Partial Pooling (GPPool), for scaling GNNs to large-scale graphs. GPPool is a versatile and straightforward technique that enhances training efficiency while simultaneously reducing memory requirements. GPPool constructs small-scale pooled graphs by pooling partial nodes into supernodes. Each pooled graph consists of supernodes and unpooled nodes, preserving valuable local and global information. Training GNNs on these graphs reduces memory demands and enhances their performance. Additionally, this paper provides a theoretical analysis of training GNNs using GPPool-constructed graphs from a graph diffusion perspective. It shows that a GNN can be transformed from a large-scale graph into pooled graphs with minimal approximation error. A series of experiments on datasets of varying scales demonstrates the effectiveness of GPPool.","PeriodicalId":13106,"journal":{"name":"IEEE Transactions on Big Data","volume":"11 1","pages":"221-233"},"PeriodicalIF":7.5000,"publicationDate":"2024-03-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Training Large-Scale Graph Neural Networks via Graph Partial Pooling\",\"authors\":\"Qi Zhang;Yanfeng Sun;Shaofan Wang;Junbin Gao;Yongli Hu;Baocai Yin\",\"doi\":\"10.1109/TBDATA.2024.3403380\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph Neural Networks (GNNs) are powerful tools for graph representation learning, but they face challenges when applied to large-scale graphs due to substantial computational costs and memory requirements. To address scalability limitations, various methods have been proposed, including sampling-based and decoupling-based methods. However, these methods have their limitations: sampling-based methods inevitably discard some link information during the sampling process, while decoupling-based methods require alterations to the model's structure, reducing their adaptability to various GNNs. This paper proposes a novel graph pooling method, Graph Partial Pooling (GPPool), for scaling GNNs to large-scale graphs. GPPool is a versatile and straightforward technique that enhances training efficiency while simultaneously reducing memory requirements. GPPool constructs small-scale pooled graphs by pooling partial nodes into supernodes. Each pooled graph consists of supernodes and unpooled nodes, preserving valuable local and global information. Training GNNs on these graphs reduces memory demands and enhances their performance. Additionally, this paper provides a theoretical analysis of training GNNs using GPPool-constructed graphs from a graph diffusion perspective. 
It shows that a GNN can be transformed from a large-scale graph into pooled graphs with minimal approximation error. A series of experiments on datasets of varying scales demonstrates the effectiveness of GPPool.\",\"PeriodicalId\":13106,\"journal\":{\"name\":\"IEEE Transactions on Big Data\",\"volume\":\"11 1\",\"pages\":\"221-233\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-03-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Big Data\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10535224/\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Big Data","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10535224/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

Graph Neural Networks (GNNs) are powerful tools for graph representation learning, but they face challenges when applied to large-scale graphs due to substantial computational costs and memory requirements. To address scalability limitations, various methods have been proposed, including sampling-based and decoupling-based methods. However, these methods have their limitations: sampling-based methods inevitably discard some link information during the sampling process, while decoupling-based methods require alterations to the model's structure, reducing their adaptability to various GNNs. This paper proposes a novel graph pooling method, Graph Partial Pooling (GPPool), for scaling GNNs to large-scale graphs. GPPool is a versatile and straightforward technique that enhances training efficiency while simultaneously reducing memory requirements. GPPool constructs small-scale pooled graphs by pooling a subset of the nodes into supernodes. Each pooled graph consists of supernodes and unpooled nodes, preserving valuable local and global information. Training GNNs on these graphs reduces memory demands and enhances their performance. Additionally, this paper provides a theoretical analysis of training GNNs on GPPool-constructed graphs from a graph diffusion perspective, showing that GNN training can be transferred from a large-scale graph to the pooled graphs with minimal approximation error. A series of experiments on datasets of varying scales demonstrates the effectiveness of GPPool.
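To make the pooled-graph construction concrete, below is a minimal sketch of partial pooling in NumPy. It illustrates the general idea only: the abstract does not specify how GPPool selects the nodes to pool, how it groups them into supernodes, or how it aggregates their features, so the node mask, the supernode assignment, and the mean aggregation below are placeholder assumptions, and partial_pool is a hypothetical name rather than the authors' API. The sketch builds an assignment matrix S in which each unpooled node keeps its own column and each pooled node maps to a supernode column, then coarsens the graph as S^T A S.

    import numpy as np

    def partial_pool(adj, feats, pooled, assign, num_super):
        """Pool the masked nodes into supernodes; keep the rest intact.

        adj:       (n, n) dense adjacency matrix
        feats:     (n, d) node feature matrix
        pooled:    (n,) boolean mask, True for nodes to be pooled
        assign:    (n,) supernode id in [0, num_super) per pooled node
        num_super: number of supernodes in the pooled graph
        """
        n = adj.shape[0]
        kept = np.flatnonzero(~pooled)        # unpooled nodes, kept as-is
        m = len(kept) + num_super             # size of the pooled graph

        # Assignment matrix S: unpooled nodes map to their own columns,
        # pooled nodes map to the column of their supernode.
        S = np.zeros((n, m))
        S[kept, np.arange(len(kept))] = 1.0
        S[pooled, len(kept) + assign[pooled]] = 1.0

        # Coarsen the adjacency and average features per column; empty
        # supernode columns are guarded against division by zero.
        adj_p = S.T @ adj @ S
        deg = np.maximum(S.sum(axis=0), 1.0)[:, None]
        feats_p = (S.T @ feats) / deg
        return adj_p, feats_p

    # Toy usage: pool nodes 4..7 of a random 8-node graph into 2
    # supernodes, leaving nodes 0..3 intact in the 6-node pooled graph.
    rng = np.random.default_rng(0)
    A = (rng.random((8, 8)) < 0.3).astype(float)
    A = np.triu(A, 1)
    A = A + A.T                               # symmetric, no self-loops
    X = rng.normal(size=(8, 4))
    mask = np.arange(8) >= 4
    assign = np.zeros(8, dtype=int)
    assign[mask] = rng.integers(0, 2, mask.sum())
    A_p, X_p = partial_pool(A, X, mask, assign, num_super=2)
    print(A_p.shape, X_p.shape)               # (6, 6) (6, 4)

Because unpooled nodes map one-to-one to their own columns of S, their features pass through unchanged, so each pooled graph mixes intact nodes with supernodes, consistent with the abstract's description of graphs that preserve both local and global information.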
Source Journal
CiteScore: 11.80
Self-citation rate: 2.80%
Annual article count: 114
Journal Introduction: The IEEE Transactions on Big Data publishes peer-reviewed articles focusing on big data. These articles present innovative research ideas and application results across disciplines, including novel theories, algorithms, and applications. Research areas cover a wide range, such as big data analytics, visualization, curation, management, semantics, infrastructure, standards, performance analysis, intelligence extraction, scientific discovery, security, privacy, and legal issues specific to big data. The journal also prioritizes applications of big data in fields generating massive datasets.
Latest Articles in This Journal
2024 Reviewers List*
Robust Privacy-Preserving Federated Item Ranking in Online Marketplaces: Exploiting Platform Reputation for Effective Aggregation
Guest Editorial: TBD Special Issue on Graph Machine Learning for Recommender Systems
Data-Centric Graph Learning: A Survey
Reliable Data Augmented Contrastive Learning for Sequential Recommendation