Towards the Web of Embeddings: Integrating multiple knowledge graph embedding spaces with FedCoder

Pub Date: 2023-01-01  DOI: 10.1016/j.websem.2022.100741

Matthias Baumgartner, Daniele Dell’Aglio, Heiko Paulheim, Abraham Bernstein

Volume 75, Article 100741. Full text: https://www.sciencedirect.com/science/article/pii/S1570826822000270

Citations: 1

Abstract

The Semantic Web is distributed yet interoperable: distributed because resources are created and published by a variety of producers, tailored to their specific needs and knowledge; interoperable because entities are linked across resources, allowing resources from different providers to be used in concert. Complementing the explicit use of Semantic Web resources, embedding methods have made them applicable to machine learning tasks. Subsequently, embedding models for numerous tasks and structures have been developed, and embedding spaces for various resources have been published. The ecosystem of embedding spaces, however, is distributed but not interoperable: entity embeddings are not readily comparable across different spaces. To parallel the Web of Data with a Web of Embeddings, we must therefore integrate the available embedding spaces into a uniform space.

Current integration approaches are limited to two spaces and presume that both were embedded with the same method; neither assumption is likely to hold in the context of a Web of Embeddings. In this paper, we present FedCoder, an approach that integrates multiple embedding spaces via a latent space. We assert that linked entities have a similar representation in the latent space, so that entities become comparable across embedding spaces. FedCoder employs an autoencoder to learn this latent space from linked as well as non-linked entities.
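To make the idea of a shared latent space concrete, the following is a minimal sketch of one plausible training objective. It is not FedCoder's actual architecture or loss, which the abstract does not specify. It assumes linear encoders and decoders per embedding space, a squared-error reconstruction term computed over all entities (linked or not), and a squared-distance alignment term over linked entity pairs. All names (`joint_loss`, `link_weight`, `kg_a`, `kg_b`) and the toy vectors are hypothetical.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[Vector]
EntityRef = Tuple[str, str]  # (space name, entity name)


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def joint_loss(
    spaces: Dict[str, Dict[str, Vector]],
    encoders: Dict[str, Matrix],
    decoders: Dict[str, Matrix],
    links: List[Tuple[EntityRef, EntityRef]],
    link_weight: float = 1.0,
) -> float:
    """Assumed objective: reconstruction error plus link alignment error."""
    latent: Dict[EntityRef, Vector] = {}
    recon = 0.0
    # Reconstruction term: every entity contributes, linked or not.
    for space, entities in spaces.items():
        for name, vec in entities.items():
            z = matvec(encoders[space], vec)
            latent[(space, name)] = z
            recon += sq_dist(matvec(decoders[space], z), vec)
    # Alignment term: linked entities should be close in the latent space.
    align = sum(sq_dist(latent[a], latent[b]) for a, b in links)
    return recon + link_weight * align


# Toy data: two spaces describing overlapping entities in different bases.
spaces = {
    "kg_a": {"Zurich": [1.0, 0.0], "Bern": [0.0, 1.0]},
    "kg_b": {"Zurich": [0.0, 2.0], "Geneva": [2.0, 0.0]},
}
links = [(("kg_a", "Zurich"), ("kg_b", "Zurich"))]
identity = [[1.0, 0.0], [0.0, 1.0]]

# Encoder/decoder pair for kg_b that swaps axes and rescales by 2.
swap_half = [[0.0, 0.5], [0.5, 0.0]]
swap_double = [[0.0, 2.0], [2.0, 0.0]]

aligned_loss = joint_loss(
    spaces,
    encoders={"kg_a": identity, "kg_b": swap_half},
    decoders={"kg_a": identity, "kg_b": swap_double},
    links=links,
)
misaligned_loss = joint_loss(
    spaces,
    encoders={"kg_a": identity, "kg_b": identity},
    decoders={"kg_a": identity, "kg_b": identity},
    links=links,
)
print(aligned_loss, misaligned_loss)  # prints: 0.0 5.0
```

In this toy setup both parameter choices reconstruct every entity perfectly, but only the first maps the linked `Zurich` entities to the same latent point. The alignment term is what separates the two, which is the sense in which linked entities make the spaces comparable. A real system would learn the encoders and decoders by gradient descent and would likely use nonlinear networks.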

Our experiments show that FedCoder substantially outperforms state-of-the-art approaches when faced with different embedding models, that it scales better than previous methods in the number of embedding spaces, and that it improves as more graphs are integrated. At the same time, it performs comparably with current approaches that assume joint learning of the embeddings and are usually limited to two sources. Our results demonstrate that FedCoder is well suited to integrating the distributed, diverse, and large ecosystem of embedding spaces into an interoperable Web of Embeddings.


