Accurate and Scalable Graph Convolutional Networks for Recommendation Based on Subgraph Propagation

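IEEE Transactions on Knowledge and Data Engineering, vol. 36, no. 12, pp. 7556-7568 · IF 8.9 · Zone 2 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence) · Pub Date: 2024-10-11 · DOI: 10.1109/TKDE.2024.3467333
IEEE Xplore: https://ieeexplore.ieee.org/document/10714406/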

Xueqi Li;Guoqing Xiao;Yuedan Chen;Kenli Li;Gao Cong

Citations: 0

Accurate and Scalable Graph Convolutional Networks for Recommendation Based on Subgraph Propagation

In recommendation systems, Graph Convolutional Networks (GCNs) often suffer from significant computational and memory costs when propagating features across the entire user-item graph. While various sampling strategies have been introduced to reduce these costs, the challenge of neighbor explosion persists, primarily due to the iterative nature of neighbor aggregation. This work explores subgraph propagation for scalable recommendation by addressing two primary challenges: efficient and effective subgraph construction, and subgraph sparsity. To address these challenges, we propose a novel GCN model for recommendation based on subgraph propagation, called SubGCN. One key component of SubGCN is BiPPR, a technique that fuses source- and target-based Personalized PageRank (PPR) approximations to overcome the challenge of efficient and effective subgraph construction. Furthermore, we propose a source-target contrastive learning scheme to mitigate the impact of subgraph sparsity on SubGCN. We conduct extensive experiments on two large and two medium-sized datasets to evaluate the scalability, efficiency, and effectiveness of SubGCN. On the medium-sized datasets, compared to full-graph GCNs, SubGCN achieves competitive accuracy while using only 23.79% of the training time on Gowalla and 16.3% on Yelp2018. On the large datasets, where full-graph GCNs run out of GPU memory, SubGCN outperforms widely used sampling strategies in terms of training efficiency and recommendation accuracy.
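
To make the idea of PPR-based subgraph construction more concrete, the following is a minimal sketch of source-based approximate Personalized PageRank using the standard forward-push procedure, followed by a top-k selection that induces a subgraph around one node. It is not the authors' BiPPR: the abstract does not give BiPPR's details, and the target-based approximation and the fusion step are not shown here. The toy graph, the function names (`build_adj`, `forward_push_ppr`, `top_k_subgraph`), and the values of `alpha`, `eps`, and `k` are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque


def build_adj(edges):
    """Build an undirected adjacency list from (user, item) interaction pairs."""
    adj = defaultdict(list)
    for user, item in edges:
        adj[user].append(item)
        adj[item].append(user)
    return dict(adj)


def forward_push_ppr(adj, source, alpha=0.15, eps=1e-4):
    """Approximate PPR from `source` with forward push (illustrative only).

    Returns (estimate, residual). Pushing continues while some node u has
    residual[u] > eps * degree(u); alpha is the teleport probability.
    """
    estimate = defaultdict(float)
    residual = defaultdict(float)
    residual[source] = 1.0
    queue = deque([source])
    queued = {source}
    while queue:
        u = queue.popleft()
        queued.discard(u)
        deg = len(adj[u])
        if deg == 0 or residual[u] <= eps * deg:
            continue
        mass = residual[u]
        residual[u] = 0.0
        estimate[u] += alpha * mass           # keep an alpha fraction at u
        share = (1.0 - alpha) * mass / deg    # spread the rest to neighbors
        for v in adj[u]:
            residual[v] += share
            if v not in queued and residual[v] > eps * len(adj[v]):
                queue.append(v)
                queued.add(v)
    return dict(estimate), dict(residual)


def top_k_subgraph(adj, source, k, **ppr_kwargs):
    """Induce a subgraph on the k nodes with the largest PPR estimates."""
    estimate, _ = forward_push_ppr(adj, source, **ppr_kwargs)
    ranked = sorted(estimate, key=lambda n: (-estimate[n], str(n)))
    keep = set(ranked[:k]) | {source}
    return {u: [v for v in adj[u] if v in keep] for u in keep}


# Hypothetical toy user-item graph (not taken from the paper's datasets).
EDGES = [("u1", "i1"), ("u1", "i2"), ("u2", "i2"), ("u2", "i3"), ("u3", "i3")]
ADJ = build_adj(EDGES)
SUBGRAPH = top_k_subgraph(ADJ, "u1", k=3)
```

Propagating only over a small induced subgraph such as `SUBGRAPH`, rather than over the full user-item graph, is the general idea behind subgraph propagation. How SubGCN actually selects and uses its subgraphs is described in the full paper, not in this abstract.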

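Source journal: IEEE Transactions on Knowledge and Data Engineering (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 11.70
Self-citation rate: 3.40%
Articles published: 515
Review time: 6 months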

Journal description: The IEEE Transactions on Knowledge and Data Engineering covers knowledge and data engineering within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.