Knowledge graph embedding closed under composition
Zhuoxun Zheng, Baifan Zhou, Hui Yang, Zhipeng Tan, Zequn Sun, Chunnong Li, Arild Waaler, Evgeny Kharlamov, Ahmet Soylu
Data Mining and Knowledge Discovery, Volume 35, Issue 1. Published 2024-07-04.
DOI: 10.1007/s10618-024-01050-x
Citations: 0
Abstract
Knowledge Graph Embedding (KGE) has attracted increasing attention. Relation patterns, such as symmetry and inversion, have received considerable focus. Among them, composition patterns are particularly important, as they involve nearly all relations in KGs. However, prior KGE approaches often consider relations to be compositional only if they are well represented in the training data. Consequently, this can lead to performance degradation, especially for under-represented composition patterns. To this end, we propose HolmE, a general form of KGE whose relation embedding space is closed under composition, meaning that the composition of any two given relation embeddings remains within the embedding space. This property ensures that every relation embedding can compose with, or be composed from, other relation embeddings. It enhances HolmE's capability to model under-represented (also called long-tail) composition patterns with limited learning instances. To the best of our knowledge, our work is the first to discuss KGE with this property of being closed under composition. We provide detailed theoretical proofs and extensive experiments to demonstrate the notable advantages of HolmE in modelling composition patterns, particularly long-tail patterns. Our results also highlight HolmE's effectiveness in extrapolating to unseen relations through composition and its state-of-the-art performance on benchmark datasets.
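The abstract's central property, a relation embedding space closed under composition, can be illustrated with a minimal sketch. The code below is not the authors' HolmE implementation; it uses RotatE-style phase-vector relations (a well-known KGE family where relations act as rotations) purely to show what closure under composition means: composing any two relation embeddings yields another valid relation embedding, and applying the composed relation to an entity matches applying the two relations in sequence.

```python
import numpy as np

def compose(r1, r2):
    """Compose two phase-vector relations; the result is again a phase vector
    in [0, 2*pi), i.e. a member of the same relation embedding space."""
    return np.mod(r1 + r2, 2 * np.pi)

rng = np.random.default_rng(0)
d = 4  # embedding dimension (illustrative)
r1 = rng.uniform(0, 2 * np.pi, d)
r2 = rng.uniform(0, 2 * np.pi, d)
r12 = compose(r1, r2)

# Closure: the composed relation is itself a valid phase vector, so it can
# in turn compose with any other relation embedding.
assert np.all((0 <= r12) & (r12 < 2 * np.pi))

# Compatibility: applying r1 then r2 to an entity embedding equals applying
# their composition r12 directly (rotations in the complex plane commute
# coordinate-wise with phase addition).
e = np.exp(1j * rng.uniform(0, 2 * np.pi, d))  # unit-modulus entity embedding
applied_in_sequence = e * np.exp(1j * r1) * np.exp(1j * r2)
applied_composed = e * np.exp(1j * r12)
assert np.allclose(applied_in_sequence, applied_composed)
```

Under this view, every relation embedding participates in composition by construction, which is the intuition behind the paper's claim that closure helps model long-tail composition patterns with few training instances.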
About the journal:
Advances in data gathering, storage, and distribution have created a need for computational tools and techniques to aid in data analysis. Data Mining and Knowledge Discovery in Databases (KDD) is a rapidly growing area of research and application that builds on techniques and theories from many fields, including statistics, databases, pattern recognition and learning, data visualization, uncertainty modelling, data warehousing and OLAP, optimization, and high performance computing.