Tao He, Ming Liu, Yixin Cao, Meng Qu, Zihao Zheng, Bing Qin
DOI: 10.1007/s10618-023-01001-y (https://doi.org/10.1007/s10618-023-01001-y)
Journal: Data Mining and Knowledge Discovery (impact factor 2.8; JCR Q2, Computer Science, Artificial Intelligence)
Published: 2024-02-06
Citations: 0
Abstract
VEM\(^2\)L: an easy but effective framework for fusing text and structure knowledge on sparse knowledge graph completion
The task of Knowledge Graph Completion (KGC) is to infer missing links for Knowledge Graphs (KGs) by analyzing graph structures. However, with increasing sparsity in KGs, this task becomes increasingly challenging. In this paper, we propose VEM\(^2\)L, a joint learning framework that incorporates structure and relevant text information to supplement insufficient features for sparse KGs. We begin by training two pre-existing KGC models: one based on structure and the other based on text. Our ultimate goal is to fuse knowledge acquired by these models. To achieve this, we divide knowledge within the models into two non-overlapping parts: expressive power and generalization ability. We then propose two different joint learning methods that co-distill these two kinds of knowledge respectively. For expressive power, we allow each model to learn from and exchange knowledge mutually on training examples. For the generalization ability, we propose a novel co-distillation strategy using the Variational EM algorithm on unobserved queries. Our proposed joint learning framework is supported by both detailed theoretical evidence and qualitative experiments, demonstrating its effectiveness.
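The abstract's first joint learning step, in which each model learns from the other on training examples, can be illustrated with a generic mutual distillation loss. The sketch below is not the authors' implementation: the function names, the mixing weight `alpha`, the `temperature`, and the use of KL divergence toward the peer's softened prediction are all assumptions made for illustration. The Variational EM step on unobserved queries is not covered.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw candidate scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the correct candidate entity."""
    return -math.log(probs[target_index])


def kl_divergence(p, q):
    """KL(p || q) for two distributions over the same candidates."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def mutual_distillation_losses(structure_logits, text_logits, target_index,
                               alpha=0.5, temperature=2.0):
    """Return (structure_loss, text_loss) for one training query.

    Each loss mixes a supervised term with a KL term that pulls the model
    toward its peer's softened prediction. In a real training loop the
    peer's distribution would be treated as a fixed target (assumption).
    """
    p_struct = softmax(structure_logits)
    p_text = softmax(text_logits)
    soft_struct = softmax(structure_logits, temperature)
    soft_text = softmax(text_logits, temperature)

    structure_loss = ((1 - alpha) * cross_entropy(p_struct, target_index)
                      + alpha * kl_divergence(soft_text, soft_struct))
    text_loss = ((1 - alpha) * cross_entropy(p_text, target_index)
                 + alpha * kl_divergence(soft_struct, soft_text))
    return structure_loss, text_loss


# Hypothetical scores over three candidate tail entities for one query.
s_loss, t_loss = mutual_distillation_losses(
    [2.0, 0.5, -1.0], [1.0, 1.5, -0.5], target_index=0)
print(s_loss, t_loss)
```

When the two models already agree, the KL terms vanish and each loss reduces to its weighted supervised term, so the exchange only has an effect where the structure-based and text-based predictions differ.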
About the journal:
Advances in data gathering, storage, and distribution have created a need for computational tools and techniques to aid in data analysis. Data Mining and Knowledge Discovery in Databases (KDD) is a rapidly growing area of research and application that builds on techniques and theories from many fields, including statistics, databases, pattern recognition and learning, data visualization, uncertainty modelling, data warehousing and OLAP, optimization, and high performance computing.