Pub Date: 2025-12-09 | DOI: 10.1109/TKDE.2025.3622154
Authors: Yuanyuan Yao;Yuhan Shi;Lu Chen;Ziquan Fang;Yunjun Gao;Leong Hou U;Yushuai Li;Tianyi Li
Multivariate time series (MTS) anomaly detection identifies abnormal patterns in data where each timestamp contains multiple variables. Existing MTS anomaly detection methods fall into three categories: reconstruction-based, prediction-based, and classifier-based methods. However, these methods face three key challenges: (1) unsupervised learning methods, such as reconstruction-based and prediction-based methods, rely on error thresholds, which can lead to inaccuracies; (2) semi-supervised methods mainly model normal data and often underuse anomaly labels, limiting the detection of subtle anomalies; (3) supervised learning methods, such as classifier-based approaches, often fail to capture local relationships, incur high computational costs, and are constrained by the scarcity of labeled data. To address these limitations, we propose Moon, a supervised modality conversion-based multivariate time series anomaly detection framework. Moon enhances the efficiency and accuracy of anomaly detection while providing detailed anomaly analysis reports. First, Moon introduces a novel multivariate Markov Transition Field (MV-MTF) technique to convert numeric time series data into image representations, capturing relationships across variables and timestamps. Since numeric data retains unique patterns that cannot be fully captured by image conversion alone, Moon employs a Multimodal-CNN to integrate numeric and image data through a feature fusion model with parameter sharing, enhancing training efficiency. Finally, a SHAP-based anomaly explainer identifies the key variables contributing to each anomaly, improving interpretability. Extensive experiments on six real-world MTS datasets demonstrate that Moon outperforms six state-of-the-art methods by up to 93% in efficiency, 4% in accuracy, and 10.8% in interpretation performance.
Title: Moon: A Modality Conversion-Based Efficient Multivariate Time Series Anomaly Detection
IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 1, pp. 457-474.
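As background for the MV-MTF conversion described above, here is a minimal sketch of the standard univariate Markov Transition Field that MV-MTF extends across variables. The bin count and quantile binning are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def mtf(series, n_bins=4):
    """Map a 1-D series to a Markov Transition Field image (univariate sketch)."""
    x = np.asarray(series, dtype=float)
    # 1. Quantile-bin each value into one of n_bins states.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    states = np.digitize(x, edges)            # state index per timestamp
    # 2. First-order Markov transition matrix W over the states.
    W = np.zeros((n_bins, n_bins))
    for a, b in zip(states[:-1], states[1:]):
        W[a, b] += 1
    W = W / np.maximum(W.sum(axis=1, keepdims=True), 1)  # row-normalize
    # 3. MTF image: cell (i, j) holds the transition probability between
    #    the states occupied at timestamps i and j.
    return W[np.ix_(states, states)]

img = mtf(np.sin(np.linspace(0, 6, 50)), n_bins=4)
print(img.shape)  # (50, 50)
```

The resulting image encodes temporal transition structure in a form a CNN can consume, which is the premise behind Moon's Multimodal-CNN fusion.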
Pub Date: 2025-12-08 | DOI: 10.1109/TKDE.2025.3641213
Authors: Xinyu Zhang;Zuohan Wu;Chen Jason Zhang;Libin Zheng;Peng Cheng;Jian Yin;Cyrus Shahabi
Traffic Signal Control plays a vital role in modern traffic management. However, most existing methods focus exclusively on vehicle flow, neglecting the critical role of pedestrians and leading to suboptimal performance at intersections with mixed vehicle-pedestrian traffic. Pedestrian behavior presents unique challenges due to its irregularity and flexibility, such as non-lane-based movements and uncertain crossing directions, which cannot be modeled by existing methods. To address this limitation, we propose VPLight, a comprehensive framework designed to manage both Vehicle and Pedestrian dynamics in traffic signal control. Specifically, we first design the Pedestrian Feature Extractor to capture the spatiotemporal dynamics of pedestrian movement, offering a robust representation of their irregular patterns. Subsequently, to coordinate traffic signal control at multiple intersections, we develop a novel communication approach called V-Comm to enable effective integration among intersections. Extensive experiments show that VPLight outperforms state-of-the-art baselines by significant margins (up to +44.04%). Our results demonstrate that VPLight can remarkably address the challenges of mixed vehicle-pedestrian traffic control and enhance overall traffic flow efficiency across the road network.
Title: VPLight: A Reinforcement Learning Approach for Traffic Signal Control With Pedestrian Dynamics
IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 3, pp. 2079-2093.
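To make the reinforcement-learning framing concrete, the sketch below trains a toy tabular Q-learning agent that picks one of two signal phases from a coarse queue-level state. The state space, dynamics, and reward are invented for illustration; VPLight's actual design (Pedestrian Feature Extractor states, V-Comm coordination, deep networks) is far richer:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 2           # coarse queue levels x two phases
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.5, 0.9, 0.1    # learning rate, discount, exploration

def step(s, a):
    # Hypothetical dynamics: serving the "right" queue for this state
    # yields reward 1; the next state is random.
    reward = 1.0 if a == s % 2 else 0.0
    return rng.integers(n_states), reward

s = 0
for _ in range(2000):
    # Epsilon-greedy action selection over the current Q-table.
    a = rng.integers(n_actions) if rng.random() < eps else int(np.argmax(Q[s]))
    s2, r = step(s, a)
    # Standard Q-learning temporal-difference update.
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
    s = s2
```

After training, the greedy policy argmax(Q[s]) recovers the rewarded phase in each toy state, which is the basic mechanism any RL-based signal controller builds on.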
Individual fairness (IF) in graph neural networks (GNNs), which emphasizes that similar individuals should receive similar outcomes from GNNs, has become a critical issue. Despite its importance, research in this area has left two questions largely unexplored: (1) a clear understanding of what induces individual unfairness in GNNs, and (2) a comprehensive consideration of how similar individuals are identified. To bridge these gaps, we conduct a preliminary analysis to explore the underlying causes of individual unfairness and observe correlations between IF and similarity consistency, a concept introduced to evaluate the discrepancy between identifying similar individuals based on graph structure versus node features. Inspired by our observations, we introduce two metrics to assess individual similarity from two distinct perspectives: topology fusion and feature fusion. Building upon these metrics, we propose Similarity-aware GNNs for Individual Fairness, named SaGIF. The key insight behind SaGIF is the integration of individual similarities by independently learning similarity representations, leading to an improvement of IF in GNNs. Our experiments on several real-world datasets validate the effectiveness of our proposed metrics and SaGIF. Specifically, SaGIF consistently outperforms state-of-the-art IF methods while maintaining utility performance.
Title: SaGIF: Improving Individual Fairness in Graph Neural Networks via Similarity Encoding
Authors: Yuchang Zhu;Jintang Li;Huizhe Zhang;Liang Chen;Zibin Zheng
Pub Date: 2025-12-05 | DOI: 10.1109/TKDE.2025.3640731
IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 3, pp. 1946-1957.
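One minimal reading of "similarity consistency" can be sketched as follows: compare each node's top-k most similar peers computed from graph structure (Jaccard similarity of neighborhoods) with those computed from node features (cosine similarity), and report the average top-k overlap. The choice of k, the two similarity functions, and the scoring are assumptions for illustration, not SaGIF's exact metric:

```python
import numpy as np

def topk(sim, k):
    np.fill_diagonal(sim, -np.inf)            # exclude self-similarity
    return np.argsort(-sim, axis=1)[:, :k]    # indices of k most similar

def similarity_consistency(A, X, k=2):
    # Structure view: Jaccard similarity of neighborhoods.
    inter = A @ A.T                           # common-neighbor counts
    deg = A.sum(axis=1)
    union = deg[:, None] + deg[None, :] - inter
    s_struct = inter / np.maximum(union, 1)
    # Feature view: cosine similarity of node features.
    Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-9)
    s_feat = Xn @ Xn.T
    t1, t2 = topk(s_struct.astype(float), k), topk(s_feat, k)
    overlap = [len(set(a) & set(b)) / k for a, b in zip(t1, t2)]
    return float(np.mean(overlap))            # 1.0 = views fully agree

A = np.array([[0,1,1,0],[1,0,1,0],[1,1,0,1],[0,0,1,0]], dtype=float)
X = np.array([[1,0],[1,0.1],[0.9,0.2],[0,1]])
c = similarity_consistency(A, X, k=2)
print(c)
```

A low score signals exactly the structure-versus-feature discrepancy that the paper correlates with individual unfairness.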
Pub Date: 2025-12-03 | DOI: 10.1109/TKDE.2025.3639413
Authors: Tianyue Ren;Zhibang Yang;Yan Ding;Xu Zhou;Kenli Li;Yunjun Gao;Keqin Li
Spatial crowdsourcing (SC) has recently become increasingly popular. As a critical issue in SC, task assignment currently faces challenges due to the imbalanced spatiotemporal distribution of tasks. Hence, many studies and applications focusing on cross-platform task allocation in SC have emerged. Existing work primarily focuses on maximizing the total revenue of the inner platform in cross-platform task assignment. In this work, we formulate an SC problem called Cross Dynamic Task Assignment (CDTA) to maximize the overall utility and propose improved solutions aimed at creating a win-win situation for the inner platform, task requesters, and outer workers. We first design a hybrid batch processing framework and a novel cross-platform incentive mechanism. Then, to allocate tasks to both inner and outer workers, we present a KM-based algorithm that computes the exact assignment in each batch and a highly efficient density-aware greedy algorithm. To maximize the revenue of the inner platform and outer workers simultaneously, we model the competition among outer workers as a potential game, which is shown to have at least one pure Nash equilibrium, and develop a game-theoretic method. Additionally, a simulated annealing-based improved algorithm is proposed to avoid falling into local optima. Finally, since random thresholds lead to unstable results when picking tasks that are preferentially assigned to inner workers, we devise an adaptive threshold selection algorithm based on multi-armed bandits to further improve the overall utility. Extensive experiments demonstrate the effectiveness and efficiency of our proposed algorithms on both real and synthetic datasets.
Title: Win-Win Approaches for Cross Dynamic Task Assignment in Spatial Crowdsourcing
IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 2, pp. 1395-1411.
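For intuition about the KM-based batch step, the sketch below finds the max-utility worker-task matching by brute force on a toy 3x3 utility matrix (the values are invented). A real Kuhn-Munkres implementation reaches the same optimum in O(n^3) rather than O(n!):

```python
from itertools import permutations

util = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]   # util[i][j]: utility of assigning worker i to task j

def best_assignment(util):
    n = len(util)
    best, best_perm = -1, None
    for perm in permutations(range(n)):      # worker i -> task perm[i]
        total = sum(util[i][perm[i]] for i in range(n))
        if total > best:
            best, best_perm = total, perm
    return best, best_perm

total, match = best_assignment(util)
print(total, match)  # 11 (0, 2, 1)
```

Each batch in the paper's framework solves exactly this kind of bipartite matching, with the greedy density-aware variant trading optimality for speed.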
Pub Date: 2025-12-03 | DOI: 10.1109/TKDE.2025.3639418
Authors: Shidan Ma;Yan Ding;Xu Zhou;Peng Peng;Youhuan Li;Zhibang Yang;Kenli Li
Graph pattern queries (GPQ) over RDF graphs extend basic graph patterns to support variable-length paths (VLP), thereby enabling complex knowledge retrieval and navigation. Generally, variable-length paths describe the reachability between two vertices via a given property within a specified range. With the increasing scale of RDF graphs, it is necessary to design a partitioning method that enables efficient distributed queries. Although many partitioning strategies have been proposed for large RDF graphs, most existing methods result in numerous inter-partition joins when processing GPQs, which impacts query performance. In this paper, we formulate a new partitioning problem, MaxLocJoin, which aims to minimize inter-partition joins during distributed GPQ processing. For MaxLocJoin, we propose a partitioning framework (PIP) based on property-induced subgraphs, which consist of edges with a specific set of properties. The framework first finds a locally joinable property set using a cost-driven algorithm, LJPS, where the cost depends on the sizes of weakly connected components within its property-induced subgraphs. Subsequently, the graph is partitioned according to the weakly connected components. The framework achieves two key objectives: first, it enables complete local processing of all variable-length path queries (eliminating inter-partition joins); second, it minimizes the number of inter-partition joins required for traditional graph pattern queries. Moreover, we identify two types of independently executable queries (IEQ): the locally joinable IEQ and the single-property IEQ. After that, a query decomposition algorithm is designed to transform every GPQ into one of these types for independent execution in distributed environments. In experiments, we implement two prototype systems based on Jena and Virtuoso, and evaluate them over both real and synthetic RDF graphs.
The results show that MaxLocJoin achieves performance improvements from 2.8x to 10.7x over existing methods.
Title: Property-Induced Partitioning for Graph Pattern Queries on Distributed RDF Systems
IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 2, pp. 1249-1263.
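The two building blocks of PIP, filtering a property-induced subgraph and computing its weakly connected components, can be sketched with a plain union-find. The triples and the property set below are toy examples, not from the paper:

```python
# Toy RDF triples: (subject, property, object).
triples = [("a", "knows", "b"), ("b", "knows", "c"),
           ("c", "likes", "d"), ("e", "knows", "f")]
props = {"knows"}                      # candidate property set

parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(x, y):
    parent[find(x)] = find(y)

# Property-induced subgraph: keep only edges whose property is in the
# set, then union their endpoints.
for s, p, o in triples:
    if p in props:
        union(s, o)

# Weakly connected components = groups sharing a union-find root.
components = {}
for v in parent:
    components.setdefault(find(v), []).append(v)
print(sorted(len(c) for c in components.values()))  # [2, 3]
```

Each component then becomes a partition candidate, so any path query that only uses properties from the set never crosses partitions.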
Truth discovery has emerged as an effective tool to mitigate data inconsistency in crowdsensing by prioritizing data from high-quality responders. While local differential privacy (LDP) has emerged as a crucial privacy-preserving paradigm, existing studies under LDP rarely explore a worker’s participation in specific tasks for sparse scenarios, which may also reveal sensitive information such as individual preferences and behaviors. Existing LDP mechanisms, when applied to truth discovery in sparse settings, may create undesirable dense distributions, provide insufficient privacy protection, and introduce excessive noise, compromising the efficacy of subsequent non-private truth discovery. Additionally, the interplay between noise injection and truth discovery remains insufficiently explored in the current literature. To address these issues, we propose a lOcally differentially private truth diSCovery approach for spArse cRowdsensing, namely OSCAR. The main idea is to use advanced optimization techniques to reconstruct the sparse data distribution and re-formalize truth discovery by considering the statistical characteristics of injected Laplacian noise while protecting the privacy of both the tasks being completed and the corresponding sensory data. Specifically, to address the data density concerns while alleviating noise, we design a randomized response-based Bernoulli matrix factorization method BerRR. To recover the sparse structures from densified, perturbed data, we formalize a 0-1 integer programming problem and develop a sparse recovery solving method SpaIE based on implicit enumeration. We further devise a Laplacian-sensitive truth discovery method LapCRH that leverages maximum likelihood estimation to re-formalize truth discovery by measuring differences between noisy values and truths based on the statistical characteristic of Laplacian noise. 
Our comprehensive theoretical analysis establishes OSCAR’s privacy guarantees, utility bounds, and computational complexity. Experimental results show that OSCAR surpasses state-of-the-art methods by at least 30% in accuracy.
Title: Locally Differentially Private Truth Discovery for Sparse Crowdsensing
Authors: Pengfei Zhang;Zhikun Zhang;Yang Cao;Xiang Cheng;Youwen Zhu;Zhiquan Liu;Ji Zhang
Pub Date: 2025-12-01 | DOI: 10.1109/TKDE.2025.3639070
IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 2, pp. 1189-1205.
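Two LDP primitives the approach builds on can be shown in their textbook form (the paper's BerRR and LapCRH mechanisms are more elaborate): randomized response for a worker's task-participation bit, and the Laplace mechanism for the numeric sensory value. The epsilon value and debiasing estimator below are the standard ones, used here purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)
eps = 1.0

def randomized_response(bit, eps):
    # Report the true bit with probability e^eps / (1 + e^eps).
    p = np.exp(eps) / (1 + np.exp(eps))
    return bit if rng.random() < p else 1 - bit

def rr_unbiased_mean(reports, eps):
    # Invert the perturbation: E[report] = m*(2p-1) + (1-p).
    p = np.exp(eps) / (1 + np.exp(eps))
    return (np.mean(reports) + p - 1) / (2 * p - 1)

def laplace_mechanism(value, sensitivity, eps):
    # Add Laplace noise scaled to sensitivity / epsilon.
    return value + rng.laplace(scale=sensitivity / eps)

bits = np.ones(50000, dtype=int)              # everyone participated
reports = [randomized_response(b, eps) for b in bits]
print(round(rr_unbiased_mean(reports, eps), 2))   # close to 1.0
```

The paper's Laplacian-sensitive truth discovery (LapCRH) goes a step further by folding the known noise distribution into the likelihood instead of simply debiasing aggregates.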
With the proliferation of GPS-equipped edge devices, vast amounts of trajectory data are generated and accumulated across various domains, driving numerous urban applications. However, due to the limited data acquisition capabilities of edge devices, many trajectories are recorded at low sampling rates, reducing the effectiveness of these applications. To address this issue, we aim to recover high-sample-rate trajectories from low-sample-rate ones, enhancing the usability of trajectory data. Recent approaches to trajectory recovery often assume centralized data storage, which can lead to catastrophic forgetting, where previously learned knowledge is entirely forgotten when new data arrives. This not only poses privacy risks but also degrades performance in decentralized settings where data streams into the system incrementally. To enable decentralized training and streaming trajectory recovery, we propose a Lightweight incremental framework for federated Trajectory Recovery, called LightTR+, which is based on a client-server architecture. Given the limited processing capabilities of edge devices, LightTR+ includes a lightweight local trajectory embedding module that enhances computational efficiency without compromising feature extraction capabilities. To mitigate catastrophic forgetting, we propose an intra-domain knowledge distillation module. Additionally, LightTR+ features a meta-knowledge enhanced local-global training scheme, which reduces communication costs between the server and clients, further improving efficiency. Extensive experiments offer insight into the effectiveness and efficiency of LightTR+.
Title: LightTR+: A Lightweight Incremental Framework for Federated Trajectory Recovery
Authors: Hao Miao;Ziqiao Liu;Yan Zhao;Chenxi Liu;Chenjuan Guo;Bin Yang;Kai Zheng;Huan Li;Christian S. Jensen
Pub Date: 2025-12-01 | DOI: 10.1109/TKDE.2025.3638888
IEEE Transactions on Knowledge and Data Engineering, vol. 38, no. 2, pp. 1174-1188.
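Knowledge distillation against forgetting, in its generic form, has the student match the teacher's softened predictions via a temperature-scaled KL divergence. The temperature and logits below are illustrative assumptions; LightTR+'s intra-domain distillation loss may differ in detail:

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T        # temperature-softened logits
    e = np.exp(z - z.max())                   # subtract max for stability
    return e / e.sum()

def distill_loss(teacher_logits, student_logits, T=2.0):
    p = softmax(teacher_logits, T)            # soft teacher targets
    q = softmax(student_logits, T)
    # T^2-scaled KL(p || q), the usual distillation objective.
    return float(T * T * np.sum(p * np.log(p / q)))

loss_same = distill_loss([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
loss_diff = distill_loss([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
print(loss_same, loss_diff)   # 0.0 vs a positive value
```

In an incremental setting, the teacher is a frozen snapshot of the previous model, so minimizing this loss anchors the updated student to knowledge learned from earlier data streams.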
Pub Date: 2025-12-01 | DOI: 10.1109/TKDE.2025.3639074
Authors: Xiang Wu;Rong-Hua Li;Zhaoxin Fan;Kai Chen;Yujin Gao;Hongchao Qin;Guoren Wang
Temporal interactions form the crux of numerous real-world scenarios, thus necessitating effective modeling in temporal graph representation learning. Despite extensive research within this domain, we identify a significant oversight in current methodologies: the temporal-spatial dynamics in graphs, encompassing both structural and temporal coherence, remain largely unaddressed. In an effort to bridge this research gap, we present a novel framework termed Graph Representation learning enhanced by Periodic and Community Interactions (GRPCI). GRPCI consists of two primary mechanisms devised explicitly to tackle the aforementioned challenge. Firstly, to utilize latent temporal dynamics, we propose a novel periodicity-based neighborhood aggregation mechanism that underscores neighbors engaged in a periodic interaction pattern. This mechanism seamlessly integrates the element of periodicity into the model. Secondly, to exploit structural dynamics, we design a novel contrastive-based local community representation learning mechanism. This mechanism features a heuristic dynamic contrastive pair sampling strategy aimed at enhancing the modeling of the latent distribution of local communities within the graphs. Through the incorporation of these two mechanisms, GRPCI markedly augments the performance of graph networks. Empirical evaluations, conducted via a temporal link prediction task across five real-life datasets, attest to the superior performance of GRPCI in comparison to existing state-of-the-art methodologies. The results of this study validate the efficacy of GRPCI, thereby establishing a new benchmark for future research in the field of temporal graph representation learning. Our findings underscore the importance of considering both temporal and structural consistency in temporal graph learning, and advocate for further exploration of this paradigm.
{"title":"GRPCI: Harnessing Temporal-Spatial Dynamics for Graph Representation Learning","authors":"Xiang Wu;Rong-Hua Li;Zhaoxin Fan;Kai Chen;Yujin Gao;Hongchao Qin;Guoren Wang","doi":"10.1109/TKDE.2025.3639074","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3639074","url":null,"abstract":"Temporal interactions form the crux of numerous real-world scenarios, thus necessitating effective modeling in temporal graph representation learning. Despite extensive research within this domain, we identify a significant oversight in current methodologies: the temporal-spatial dynamics in graphs, encompassing both structural and temporal coherence, remain largely unaddressed. In an effort to bridge this research gap, we present a novel framework termed Graph Representation learning enhanced by Periodic and Community Interactions (GRPCI). GRPCI consists of two primary mechanisms devised explicitly to tackle the aforementioned challenge. Firstly, to utilize latent temporal dynamics, we propose a novel periodicity-based neighborhood aggregation mechanism that underscores neighbors engaged in a periodic interaction pattern. This mechanism seamlessly integrates the element of periodicity into the model. Secondly, to exploit structural dynamics, we design a novel contrastive-based local community representation learning mechanism. This mechanism features a heuristic dynamic contrastive pair sampling strategy aimed at enhancing the modeling of the latent distribution of local communities within the graphs. Through the incorporation of these two mechanisms, GRPCI markedly augments the performance of graph networks. Empirical evaluations, conducted via a temporal link prediction task across five real-life datasets, attest to the superior performance of GRPCI in comparison to existing state-of-the-art methodologies. The results of this study validate the efficacy of GRPCI, thereby establishing a new benchmark for future research in the field of temporal graph representation learning. 
Our findings underscore the importance of considering both temporal and structural consistency in temporal graph learning, and advocate for further exploration of this paradigm.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"38 2","pages":"1144-1158"},"PeriodicalIF":10.4,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145898255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
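The periodicity-based neighborhood aggregation described above can be sketched as follows. This is a minimal illustration, not GRPCI's actual formulation: the scoring rule (coefficient of variation of inter-event gaps) and the softmax weighting are assumptions chosen for simplicity.

```python
import numpy as np

def periodicity_score(timestamps):
    """Score how regular a neighbor's interaction times are.

    Perfectly periodic interactions (equal gaps) score 1.0; irregular
    ones tend toward 0.  This rule is illustrative, not GRPCI's.
    """
    ts = np.sort(np.asarray(timestamps, dtype=float))
    if len(ts) < 3:
        return 0.0  # too few events to judge periodicity
    gaps = np.diff(ts)
    cv = gaps.std() / (gaps.mean() + 1e-9)  # coefficient of variation
    return float(1.0 / (1.0 + cv))

def aggregate(neighbor_embs, neighbor_timestamps):
    """Aggregate neighbor embeddings, up-weighting periodic neighbors."""
    scores = np.array([periodicity_score(t) for t in neighbor_timestamps])
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over scores
    return weights @ np.stack(neighbor_embs)

# Neighbor A interacts every 7 time units (periodic); B is irregular.
emb_a, emb_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
ts_a = [0, 7, 14, 21, 28]
ts_b = [0, 1, 15, 16, 40]
agg = aggregate([emb_a, emb_b], [ts_a, ts_b])
# The periodic neighbor A receives the larger aggregation weight.
```

In the toy example the regularly interacting neighbor dominates the aggregated representation, which is the qualitative effect the paper's mechanism aims for.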
Pub Date : 2025-12-01 DOI: 10.1109/TKDE.2025.3638864
Yunxiao Zhao;Zhiqiang Wang;Xingtong Yu;Xiaoli Li;Jiye Liang;Ru Li
Rationalization, a data-centric framework, aims to build self-explanatory models to explain the prediction outcome by generating a subset of human-intelligible pieces of the input data. It involves a cooperative game model where a generator generates the most human-intelligible parts of the input (i.e., rationales), followed by a predictor that makes predictions based on these generated rationales. Conventional rationalization methods typically impose constraints via regularization terms to calibrate or penalize undesired generation. However, these methods suffer from a problem called mode collapse, in which the predictor produces correct predictions yet the generator consistently outputs rationales with collapsed patterns. Moreover, existing studies are typically designed separately for specific collapsed patterns, lacking a unified consideration. In this paper, we systematically revisit cooperative rationalization from a novel game-theoretic perspective and identify the fundamental cause of this problem: the generator no longer tends to explore new strategies to uncover informative rationales, ultimately leading the system to converge to a suboptimal game equilibrium (correct predictions versus collapsed rationales). To solve this problem, we then propose a novel approach, Game-theoretic Policy Optimization oriented RATionalization (PoRat), which progressively introduces policy interventions to address the game equilibrium in the cooperative game process, thereby guiding the model toward a more optimal solution state. We theoretically analyse the cause of such a suboptimal equilibrium and prove the feasibility of the proposed method. Furthermore, we validate our method on nine widely used real-world datasets and two synthetic settings, where PoRat achieves up to 8.1% performance improvements over existing state-of-the-art methods.
{"title":"Learnable Game-Theoretic Policy Optimization for Data-Centric Self-Explanation Rationalization","authors":"Yunxiao Zhao;Zhiqiang Wang;Xingtong Yu;Xiaoli Li;Jiye Liang;Ru Li","doi":"10.1109/TKDE.2025.3638864","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3638864","url":null,"abstract":"Rationalization, a data-centric framework, aims to build self-explanatory models to explain the prediction outcome by generating a subset of human-intelligible pieces of the input data. It involves a cooperative game model where a generator generates the most human-intelligible parts of the input (i.e., rationales), followed by a predictor that makes predictions based on these generated rationales. Conventional rationalization methods typically impose constraints via regularization terms to calibrate or penalize undesired generation. However, these methods suffer from a problem called mode collapse, in which the predictor produces correct predictions yet the generator consistently outputs rationales with collapsed patterns. Moreover, existing studies are typically designed separately for specific collapsed patterns, lacking a unified consideration. In this paper, we systematically revisit cooperative rationalization from a novel game-theoretic perspective and identify the fundamental cause of this problem: the generator no longer tends to explore new strategies to uncover informative rationales, ultimately leading the system to converge to a suboptimal game equilibrium (correct predictions <italic>versus</i> collapsed rationales). To solve this problem, we then propose a novel approach, Game-theoretic <bold>P</b>olicy <bold>O</b>ptimization oriented <bold>RAT</b>ionalization (<sc>PoRat</small>), which progressively introduces policy interventions to address the game equilibrium in the cooperative game process, thereby guiding the model toward a more optimal solution state. 
We theoretically analyse the cause of such a suboptimal equilibrium and prove the feasibility of the proposed method. Furthermore, we validate our method on nine widely used real-world datasets and two synthetic settings, where <sc>PoRat</small> achieves up to 8.1% performance improvements over existing state-of-the-art methods.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"38 2","pages":"1159-1173"},"PeriodicalIF":10.4,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145898249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
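The generator/predictor cooperative game can be sketched as a select-then-predict pipeline. The hard top-k selection, token scores, and linear predictor below are illustrative assumptions standing in for the learned components, not PoRat's actual architecture.

```python
import numpy as np

def generate_rationale(token_scores, k):
    """Generator: pick the k highest-scoring tokens as the rationale.

    Hard top-k stands in for the learned binary mask a rationalization
    generator produces (illustrative assumption).
    """
    return np.argsort(token_scores)[-k:]

def predict(token_embs, rationale_idx, w):
    """Predictor: classify from the mean embedding of the selected tokens."""
    z = token_embs[rationale_idx].mean(axis=0)
    return int(z @ w > 0)

# Toy input: 6 tokens with 2-d embeddings; token 2 carries the label
# signal, all other tokens are zero vectors (kept deterministic).
token_embs = np.zeros((6, 2))
token_embs[2] = [3.0, 0.0]
scores = np.array([0.1, 0.2, 0.9, 0.3, 0.1, 0.2])  # generator's token scores
w = np.array([1.0, 0.0])                           # predictor weights

rationale = generate_rationale(scores, k=2)  # selects tokens 2 and 3
label = predict(token_embs, rationale, w)
```

In this picture, mode collapse corresponds to the scores degenerating so the same uninformative subset is always selected while the predictor still fits the labels; the policy interventions the paper proposes are aimed at pushing the generator out of that suboptimal equilibrium.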
Pub Date : 2025-11-28 DOI: 10.1109/TKDE.2025.3638465
Fan Li;Xiaoyang Wang;Dawei Cheng;Wenjie Zhang;Chen Chen;Ying Zhang;Xuemin Lin
With growing demands for data privacy and model robustness, graph unlearning (GU), which erases the influence of specific data on trained GNN models, has gained significant attention. However, existing exact unlearning methods suffer from either low efficiency or poor model performance. While more utility-preserving and efficient, current approximate methods require access to the forget set during unlearning, which makes them inapplicable in immediate deletion scenarios, thereby undermining privacy. Additionally, these approximate methods, which attempt to directly perturb model parameters, still raise significant concerns regarding unlearning power in empirical studies. To fill the gap, we propose Transferable Condensation Graph Unlearning (TCGU), a data-centric solution to graph unlearning. Specifically, we first develop a two-level alignment strategy to pre-condense the original graph into a compact yet utility-preserving dataset for subsequent unlearning tasks. Upon receiving an unlearning request, we fine-tune the pre-condensed data with a low-rank plugin, to directly align its distribution with the remaining graph, thus efficiently revoking the information of deleted data without accessing them. A novel similarity distribution matching approach and a discrimination regularizer are proposed to effectively transfer condensed data and preserve its utility in GNN training, respectively. Finally, we retrain the GNN on the transferred condensed data. Extensive experiments on 7 benchmark datasets demonstrate that TCGU can achieve superior performance in terms of model utility, unlearning efficiency, and unlearning efficacy compared to existing GU methods. To the best of our knowledge, this is the first study to explore graph unlearning with immediate data removal using a data-centric approximate method.
{"title":"TCGU: Data-Centric Graph Unlearning Based on Transferable Condensation","authors":"Fan Li;Xiaoyang Wang;Dawei Cheng;Wenjie Zhang;Chen Chen;Ying Zhang;Xuemin Lin","doi":"10.1109/TKDE.2025.3638465","DOIUrl":"https://doi.org/10.1109/TKDE.2025.3638465","url":null,"abstract":"With growing demands for data privacy and model robustness, graph unlearning (GU), which erases the influence of specific data on trained GNN models, has gained significant attention. However, existing exact unlearning methods suffer from either low efficiency or poor model performance. While more utility-preserving and efficient, current approximate methods require access to the forget set during unlearning, which makes them inapplicable in immediate deletion scenarios, thereby undermining privacy. Additionally, these approximate methods, which attempt to directly perturb model parameters, still raise significant concerns regarding unlearning power in empirical studies. To fill the gap, we propose Transferable Condensation Graph Unlearning (TCGU), a data-centric solution to graph unlearning. Specifically, we first develop a two-level alignment strategy to pre-condense the original graph into a compact yet utility-preserving dataset for subsequent unlearning tasks. Upon receiving an unlearning request, we fine-tune the pre-condensed data with a low-rank plugin, to directly align its distribution with the remaining graph, thus efficiently revoking the information of deleted data without accessing them. A novel similarity distribution matching approach and a discrimination regularizer are proposed to effectively transfer condensed data and preserve its utility in GNN training, respectively. Finally, we retrain the GNN on the transferred condensed data. Extensive experiments on 7 benchmark datasets demonstrate that TCGU can achieve superior performance in terms of model utility, unlearning efficiency, and unlearning efficacy compared to existing GU methods. 
To the best of our knowledge, this is the first study to explore graph unlearning with immediate data removal using a data-centric approximate method.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"38 2","pages":"1334-1348"},"PeriodicalIF":10.4,"publicationDate":"2025-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145898228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
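The alignment step, in which the pre-condensed data is fine-tuned to match the remaining graph's distribution, can be sketched with simple first-moment matching. This is a deliberate simplification: TCGU uses a similarity distribution matching approach and a low-rank plugin, whereas the loss, learning rate, and shapes below are assumptions for illustration.

```python
import numpy as np

def moment_matching_loss(condensed, remaining):
    """Squared distance between feature means of the two sets.

    A stand-in for TCGU's similarity distribution matching
    (illustrative simplification, not the paper's objective).
    """
    diff = condensed.mean(axis=0) - remaining.mean(axis=0)
    return float(diff @ diff)

def align(condensed, remaining, lr=0.5, steps=50):
    """Nudge condensed features toward the remaining graph's distribution.

    Gradient of the mean-matching loss w.r.t. each condensed row is
    2 * (mean(condensed) - mean(remaining)) / n_condensed.
    """
    condensed = condensed.copy()
    n = len(condensed)
    for _ in range(steps):
        grad_row = 2.0 * (condensed.mean(axis=0) - remaining.mean(axis=0)) / n
        condensed -= lr * grad_row  # same gradient broadcast to every row
    return condensed

rng = np.random.default_rng(1)
remaining = rng.normal(loc=2.0, size=(200, 4))  # features after deletion
condensed = rng.normal(loc=0.0, size=(10, 4))   # pre-condensed synthetic nodes

aligned = align(condensed, remaining)
before = moment_matching_loss(condensed, remaining)
after = moment_matching_loss(aligned, remaining)
# The loss shrinks as the condensed set re-aligns with the remaining graph,
# without ever touching the deleted data.
```

The key property mirrored here is that alignment uses only the condensed set and the remaining graph, which is what lets the paper's method revoke deleted data without accessing it.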