Understanding Negative Sampling in Knowledge Graph Embedding

International Journal of Artificial Intelligence & Applications, Pub Date: 2021-01-31, DOI: 10.5121/IJAIA.2021.12105

Jing Qian, Gangmin Li, Katie Atkinson, Yong Yue

Citations: 4

Abstract

Knowledge graph embedding (KGE) projects the entities and relations of a knowledge graph (KG) into a low-dimensional vector space, and the field has made steady progress in recent years. Conventional KGE methods, especially translational distance-based models, are trained by discriminating positive samples from negative ones. Because most KGs store only positive samples for space efficiency, negative sampling plays a crucial role in encoding the triples of a KG. The quality of the generated negative samples directly affects how well the learnt knowledge representation performs in downstream tasks such as recommendation, link prediction and node classification. We group current negative sampling approaches in KGE into three categories: static distribution-based, dynamic distribution-based and custom cluster-based. Based on this categorization, we discuss the most prevalent existing approaches and their characteristics. We hope this review provides some guidance for new ideas about negative sampling in KGE.
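
To make the idea of training on generated negatives concrete, the following is a minimal sketch and is not taken from the paper. It assumes uniform corruption of the head or tail entity, which is commonly treated as the simplest static distribution-based strategy, and a TransE-style L1 distance with a margin ranking loss as a stand-in for a translational distance-based model. All names (`corrupt_uniform`, `transe_distance`, `margin_loss`) and the toy entities are hypothetical.

```python
import random


def corrupt_uniform(triple, entities, known_triples, rng, max_tries=100):
    """Build a negative by replacing the head or tail with a uniformly drawn entity.

    Candidates already present in known_triples are rejected, so the result is
    never a stored positive (it may still be a true fact missing from the KG).
    """
    head, relation, tail = triple
    for _ in range(max_tries):
        candidate = rng.choice(entities)
        if rng.random() < 0.5:
            negative = (candidate, relation, tail)   # corrupt the head
        else:
            negative = (head, relation, candidate)   # corrupt the tail
        if negative not in known_triples:
            return negative
    raise RuntimeError("no negative found; the toy graph may be too dense")


def transe_distance(h, r, t):
    """Assumed TransE-style score: L1 norm of h + r - t (smaller = more plausible)."""
    return sum(abs(hi + ri - ti) for hi, ri, ti in zip(h, r, t))


def margin_loss(pos_distance, neg_distance, margin=1.0):
    """Margin ranking loss: zero once the negative is at least `margin` farther away."""
    return max(0.0, margin + pos_distance - neg_distance)


# Toy data, purely illustrative.
entities = ["Paris", "France", "Berlin", "Germany"]
positives = {
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
}
rng = random.Random(0)
negative = corrupt_uniform(("Paris", "capital_of", "France"), entities, positives, rng)
```

In this sketch the sampling distribution never changes during training, which is what makes it "static". A dynamic distribution-based sampler would instead adapt its choice of candidates as the embeddings evolve.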
