Repurposing Knowledge Graph Embeddings for Triple Representation via Weak Supervision

2022 International Conference on Intelligent Data Science Technologies and Applications (IDSTA). Pub Date: 2022-08-22. DOI: 10.1109/IDSTA55301.2022.9923036

Alexander Kalinowski, Yuan An


Citations: 1

Abstract

The majority of knowledge graph embedding techniques treat entities and predicates as separate embedding matrices, using aggregation functions to build a representation of the input triple. However, these aggregations are lossy, i.e., they do not capture the semantics of the original triples, such as information contained in the predicates. To combat these shortcomings, current methods learn triple embeddings from scratch without utilizing entity and predicate embeddings from pre-trained models. In this paper, we design a novel fine-tuning approach for learning triple embeddings by creating weak supervision signals from pre-trained knowledge graph embeddings. We develop a method for automatically sampling triples from a knowledge graph and estimating their pairwise similarities from pre-trained embedding models. These pairwise similarity scores are then fed to a Siamese-like neural architecture to fine-tune triple representations. We evaluate the proposed method on two widely studied knowledge graphs and show consistent improvement over other state-of-the-art triple embedding methods on triple classification and triple clustering tasks.
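The weak-supervision step described in the abstract (sample triples, then estimate pairwise similarities from pre-trained embeddings) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say how a triple is aggregated or which similarity function is used, so the sketch concatenates head, predicate, and tail vectors and scores pairs with cosine similarity. The embedding values, the names `ENTITY_EMB`, `PREDICATE_EMB`, `triple_vector`, and `weak_similarity_labels` are hypothetical, and the Siamese fine-tuning stage that would consume these scores is not shown.

```python
import math
import random

# Hypothetical "pre-trained" embeddings: toy values, not taken from any real model.
ENTITY_EMB = {
    "Paris": [0.9, 0.1, 0.0],
    "France": [0.8, 0.2, 0.1],
    "Berlin": [0.85, 0.15, 0.05],
    "Germany": [0.75, 0.25, 0.1],
    "Einstein": [0.1, 0.9, 0.3],
}
PREDICATE_EMB = {
    "capital_of": [0.2, 0.1, 0.9],
    "born_in": [0.1, 0.8, 0.2],
}


def triple_vector(triple):
    """Assumed aggregation: concatenate head, predicate, and tail vectors."""
    head, predicate, tail = triple
    return ENTITY_EMB[head] + PREDICATE_EMB[predicate] + ENTITY_EMB[tail]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def weak_similarity_labels(triples, num_pairs, seed=0):
    """Sample random triple pairs and attach a similarity score as a weak label."""
    rng = random.Random(seed)
    labeled_pairs = []
    for _ in range(num_pairs):
        first, second = rng.sample(triples, 2)  # two distinct triples
        score = cosine(triple_vector(first), triple_vector(second))
        labeled_pairs.append((first, second, score))
    return labeled_pairs


TRIPLES = [
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
    ("Einstein", "born_in", "Germany"),
]

if __name__ == "__main__":
    for first, second, score in weak_similarity_labels(TRIPLES, num_pairs=3):
        print(first, second, round(score, 3))
```

In a pipeline of the kind the abstract outlines, tuples like these would serve as regression targets for a Siamese-like network that encodes each triple of a pair with shared weights. The concatenation and cosine choices here are stand-ins and may differ from what the paper actually uses.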


