面向可扩展图学习的节点扩散

Proceedings of the ACM Web Conference 2023 Pub Date : 2023-04-30 DOI:10.1145/3543507.3583408

Keke Huang, Jing Tang, Juncheng Liu, Renchi Yang, X. Xiao

{"title":"面向可扩展图学习的节点扩散","authors":"Keke Huang, Jing Tang, Juncheng Liu, Renchi Yang, X. Xiao","doi":"10.1145/3543507.3583408","DOIUrl":null,"url":null,"abstract":"Graph Neural Networks (GNNs) have shown superior performance for semi-supervised learning of numerous web applications, such as classification on web services and pages, analysis of online social networks, and recommendation in e-commerce. The state of the art derives representations for all nodes in graphs following the same diffusion (message passing) model without discriminating their uniqueness. However, (i) labeled nodes involved in model training usually account for a small portion of graphs in the semi-supervised setting, and (ii) different nodes locate at different graph local contexts and it inevitably degrades the representation qualities if treating them undistinguishedly in diffusion. To address the above issues, we develop NDM, a universal node-wise diffusion model, to capture the unique characteristics of each node in diffusion, by which NDM is able to yield high-quality node representations. In what follows, we customize NDM for semi-supervised learning and design the NIGCN model. In particular, NIGCN advances the efficiency significantly since it (i) produces representations for labeled nodes only and (ii) adopts well-designed neighbor sampling techniques tailored for node representation generation. Extensive experimental results on various types of web datasets, including citation, social and co-purchasing graphs, not only verify the state-of-the-art effectiveness of NIGCN but also strongly support the remarkable scalability of NIGCN. In particular, NIGCN completes representation generation and training within 10 seconds on the dataset with hundreds of millions of nodes and billions of edges, up to orders of magnitude speedups over the baselines, while achieving the highest F1-scores on classification.","PeriodicalId":296351,"journal":{"name":"Proceedings of the ACM Web Conference 2023","volume":"57 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Node-wise Diffusion for Scalable Graph Learning\",\"authors\":\"Keke Huang, Jing Tang, Juncheng Liu, Renchi Yang, X. Xiao\",\"doi\":\"10.1145/3543507.3583408\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph Neural Networks (GNNs) have shown superior performance for semi-supervised learning of numerous web applications, such as classification on web services and pages, analysis of online social networks, and recommendation in e-commerce. The state of the art derives representations for all nodes in graphs following the same diffusion (message passing) model without discriminating their uniqueness. However, (i) labeled nodes involved in model training usually account for a small portion of graphs in the semi-supervised setting, and (ii) different nodes locate at different graph local contexts and it inevitably degrades the representation qualities if treating them undistinguishedly in diffusion. To address the above issues, we develop NDM, a universal node-wise diffusion model, to capture the unique characteristics of each node in diffusion, by which NDM is able to yield high-quality node representations. In what follows, we customize NDM for semi-supervised learning and design the NIGCN model. In particular, NIGCN advances the efficiency significantly since it (i) produces representations for labeled nodes only and (ii) adopts well-designed neighbor sampling techniques tailored for node representation generation. Extensive experimental results on various types of web datasets, including citation, social and co-purchasing graphs, not only verify the state-of-the-art effectiveness of NIGCN but also strongly support the remarkable scalability of NIGCN. In particular, NIGCN completes representation generation and training within 10 seconds on the dataset with hundreds of millions of nodes and billions of edges, up to orders of magnitude speedups over the baselines, while achieving the highest F1-scores on classification.\",\"PeriodicalId\":296351,\"journal\":{\"name\":\"Proceedings of the ACM Web Conference 2023\",\"volume\":\"57 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-04-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the ACM Web Conference 2023\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3543507.3583408\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ACM Web Conference 2023","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3543507.3583408","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

图神经网络(gnn)在许多web应用程序的半监督学习方面表现优异，例如web服务和页面的分类，在线社交网络的分析以及电子商务中的推荐。目前的技术状态是根据相同的扩散(消息传递)模型派生图中所有节点的表示，而不区分它们的唯一性。然而，(i)在半监督设置中，模型训练中涉及的标记节点通常只占图的一小部分，(ii)不同的节点位于不同的图局部上下文，如果在扩散中不区分它们，不可避免地会降低表示质量。为了解决上述问题，我们开发了NDM，一个通用的节点智能扩散模型，以捕获扩散中每个节点的独特特征，通过该模型，NDM能够产生高质量的节点表示。接下来，我们为半监督学习定制NDM，并设计NIGCN模型。特别是，NIGCN显著提高了效率，因为它(i)仅为标记节点生成表示，(ii)采用为节点表示生成量身定制的精心设计的邻居采样技术。在各种类型的网络数据集上的大量实验结果，包括引文、社交和共同购买图，不仅验证了NIGCN的最先进的有效性，而且有力地支持了NIGCN显著的可扩展性。特别是，NIGCN在拥有数亿个节点和数十亿条边的数据集上，在10秒内完成了表示生成和训练，速度比基线提高了几个数量级，同时在分类上获得了最高的f1分数。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Node-wise Diffusion for Scalable Graph Learning

Graph Neural Networks (GNNs) have shown superior performance for semi-supervised learning of numerous web applications, such as classification on web services and pages, analysis of online social networks, and recommendation in e-commerce. The state of the art derives representations for all nodes in graphs following the same diffusion (message passing) model without discriminating their uniqueness. However, (i) labeled nodes involved in model training usually account for a small portion of graphs in the semi-supervised setting, and (ii) different nodes locate at different graph local contexts and it inevitably degrades the representation qualities if treating them undistinguishedly in diffusion. To address the above issues, we develop NDM, a universal node-wise diffusion model, to capture the unique characteristics of each node in diffusion, by which NDM is able to yield high-quality node representations. In what follows, we customize NDM for semi-supervised learning and design the NIGCN model. In particular, NIGCN advances the efficiency significantly since it (i) produces representations for labeled nodes only and (ii) adopts well-designed neighbor sampling techniques tailored for node representation generation. Extensive experimental results on various types of web datasets, including citation, social and co-purchasing graphs, not only verify the state-of-the-art effectiveness of NIGCN but also strongly support the remarkable scalability of NIGCN. In particular, NIGCN completes representation generation and training within 10 seconds on the dataset with hundreds of millions of nodes and billions of edges, up to orders of magnitude speedups over the baselines, while achieving the highest F1-scores on classification.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助