Correlation-Aware Graph Data Augmentation with Implicit and Explicit Neighbors

ACM Transactions on Knowledge Discovery from Data | Impact factor: 4.0 | Zone 3 (Computer Science) | JCR Q1 (Computer Science, Information Systems) | Pub Date: 2024-01-25 | DOI: 10.1145/3638057

Chuan-Wei Kuo, Bo-Yu Chen, Wen-Chih Peng, Chih-Chieh Hung, Hsin-Ning Su


Citations: 0

Correlation-Aware Graph Data Augmentation with Implicit and Explicit Neighbors

In recent years, commercial demand for citation graph-based tasks such as patent analysis, social network analysis, and recommendation systems has surged. Graph Neural Networks (GNNs) are widely used for these tasks because they capture topological graph information well. However, the output of a GNN depends heavily on the composition of each node's local neighbors within the topological structure. To address this issue, we identify two types of neighbors in a citation graph: explicit neighbors based on the topological structure, and implicit neighbors based on node features. Our primary motivation is to clearly define and visualize these neighbors, emphasizing their importance in enhancing GNN performance. We propose a Correlation-aware Network (CNet) that re-organizes the citation graph and learns more informative representations by leveraging these implicit and explicit neighbors. Our approach aims to improve graph data augmentation and classification performance; most of our focus is on establishing the importance of using these neighbors, while also introducing a new graph data augmentation method. We compare CNet with state-of-the-art (SOTA) GNNs and with other graph data augmentation approaches applied to GNNs. Extensive experiments demonstrate that CNet extracts more informative representations from the citation graph and significantly outperforms the baselines. The code is publicly available on GitHub.
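The distinction between explicit and implicit neighbors can be illustrated with a small sketch. This is not the CNet implementation, which the abstract does not describe in detail. It assumes, purely for illustration, that implicit neighbors are the top-k nodes by cosine similarity of node features and that the augmented neighborhood is the union of both sets. The function names (`explicit_neighbors`, `implicit_neighbors`), the toy graph, and the choice of k are all hypothetical.

```python
import math


def explicit_neighbors(edges, num_nodes):
    """Neighbors given by topology: nodes joined by a citation link."""
    nbrs = {v: set() for v in range(num_nodes)}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)  # treat citation links as undirected (assumption)
    return nbrs


def cosine(a, b):
    """Cosine similarity of two feature vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def implicit_neighbors(features, k):
    """Neighbors given by features: top-k most similar nodes (assumed rule)."""
    n = len(features)
    nbrs = {}
    for v in range(n):
        scored = sorted(
            ((cosine(features[v], features[u]), u) for u in range(n) if u != v),
            key=lambda pair: (-pair[0], pair[1]),  # highest similarity first
        )
        nbrs[v] = {u for _, u in scored[:k]}
    return nbrs


# Toy citation graph: four papers, links 0-1 and 2-3.
edges = [(0, 1), (2, 3)]
features = [
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.9],
    [0.0, 1.0, 0.1],
]

explicit = explicit_neighbors(edges, len(features))
implicit = implicit_neighbors(features, k=1)

# Augmented neighborhood: union of topological and feature-based neighbors.
augmented = {v: explicit[v] | implicit[v] for v in explicit}
print(augmented)
```

In this toy example, papers 0 and 2 share no citation link but have nearly identical features, so each becomes the other's implicit neighbor. A GNN aggregating over the augmented neighborhood would then see information that the original topology alone does not expose.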

Source journal

ACM Transactions on Knowledge Discovery from Data (Computer Science, Information Systems; Computer Science, Software Engineering)

CiteScore: 6.70

Self-citation rate: 5.60%

Articles published: 172

Review time: 3 months

Journal description: TKDD welcomes papers on a full range of research in the knowledge discovery and analysis of diverse forms of data. Such subjects include, but are not limited to: scalable and effective algorithms for data mining and big data analysis, mining brain networks, mining data streams, mining multimedia data, mining high-dimensional data, mining text, Web, and semi-structured data, mining spatial and temporal data, data mining for community generation, social network analysis, and graph-structured data, security and privacy issues in data mining, visual, interactive and online data mining, pre-processing and post-processing for data mining, robust and scalable statistical methods, data mining languages, foundations of data mining, KDD framework and process, and novel applications and infrastructures exploiting data mining technology including massively parallel processing and cloud computing platforms. TKDD encourages papers that explore the above subjects in the context of large distributed networks of computers, parallel or multiprocessing computers, or new data devices. TKDD also encourages papers that describe emerging data mining applications that cannot be satisfied by the current data mining technology.