VEM$^2$L: an easy but effective framework for fusing text and structure knowledge on sparse knowledge graph completion

Data Mining and Knowledge Discovery (IF 4.3; CAS Tier 3, Computer Science; JCR Q2, Computer Science, Artificial Intelligence). Pub Date: 2024-02-06. DOI: 10.1007/s10618-023-01001-y

Tao He, Ming Liu, Yixin Cao, Meng Qu, Zihao Zheng, Bing Qin


Citations: 0

Abstract

The task of Knowledge Graph Completion (KGC) is to infer missing links for Knowledge Graphs (KGs) by analyzing graph structures. However, with increasing sparsity in KGs, this task becomes increasingly challenging. In this paper, we propose VEM$^2$L, a joint learning framework that incorporates structure and relevant text information to supplement insufficient features for sparse KGs. We begin by training two pre-existing KGC models: one based on structure and the other based on text. Our ultimate goal is to fuse knowledge acquired by these models. To achieve this, we divide knowledge within the models into two non-overlapping parts: expressive power and generalization ability. We then propose two different joint learning methods that co-distill these two kinds of knowledge respectively. For expressive power, we allow each model to learn from and exchange knowledge mutually on training examples. For the generalization ability, we propose a novel co-distillation strategy using the Variational EM algorithm on unobserved queries. Our proposed joint learning framework is supported by both detailed theoretical evidence and qualitative experiments, demonstrating its effectiveness.
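The mutual-learning step on training examples can be illustrated with a small sketch. The abstract does not give the loss functions, so everything below is an assumption made for illustration: each model is taken to score a list of candidate entities for one query, the scores are turned into distributions with a softmax, and each model is trained on a weighted sum of a cross-entropy term on the gold entity and a KL term toward the other model's distribution. The names `mutual_distillation_losses`, `alpha`, and `temperature` are hypothetical and do not come from the paper. The Variational EM step on unobserved queries is not covered here.

```python
import math


def softmax(scores, temperature=1.0):
    """Turn raw candidate scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(p, gold_index, eps=1e-12):
    """Negative log-likelihood of the gold candidate under distribution p."""
    return -math.log(p[gold_index] + eps)


def mutual_distillation_losses(structure_scores, text_scores, gold_index,
                               alpha=0.5, temperature=1.0):
    """Return (structure_loss, text_loss) for one training query.

    Assumed form, not taken from the paper: each model mixes a supervised
    cross-entropy term with a KL term that pulls it toward its peer.
    """
    p_struct = softmax(structure_scores, temperature)
    p_text = softmax(text_scores, temperature)

    # The structure model is the student; the text model is the teacher.
    loss_struct = ((1 - alpha) * cross_entropy(p_struct, gold_index)
                   + alpha * kl_divergence(p_text, p_struct))
    # Roles are swapped for the text model.
    loss_text = ((1 - alpha) * cross_entropy(p_text, gold_index)
                 + alpha * kl_divergence(p_struct, p_text))
    return loss_struct, loss_text


if __name__ == "__main__":
    # Three candidate tail entities; index 0 is the gold answer.
    ls, lt = mutual_distillation_losses([2.0, 0.5, -1.0], [0.1, 1.5, 0.3], 0)
    print(f"structure loss: {ls:.4f}, text loss: {lt:.4f}")
```

In a real training loop the peer distribution would normally be treated as a constant when computing each model's gradient, so that each model is updated only through its own loss.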




Source journal

Data Mining and Knowledge Discovery (Engineering and Technology; Computer Science: Artificial Intelligence)

CiteScore: 10.40

Self-citation rate: 4.20%

Review time: 10 months

Journal description: Advances in data gathering, storage, and distribution have created a need for computational tools and techniques to aid in data analysis. Data Mining and Knowledge Discovery in Databases (KDD) is a rapidly growing area of research and application that builds on techniques and theories from many fields, including statistics, databases, pattern recognition and learning, data visualization, uncertainty modelling, data warehousing and OLAP, optimization, and high performance computing.