Multi-Context Attention for Entity Matching

Proceedings of The Web Conference 2020. Pub Date: 2020-04-20. DOI: 10.1145/3366423.3380017

Dongxiang Zhang, Yuyang Nie, Sai Wu, Yanyan Shen, K. Tan

Citations: 24

Abstract

Entity matching (EM) is a classic research problem that identifies data instances referring to the same real-world entity. A recent technical trend in this area is to take advantage of deep learning (DL) to automatically extract discriminative features. DeepER and DeepMatcher have emerged as two pioneering DL models for EM. However, these two state-of-the-art solutions simply incorporate vanilla RNNs and straightforward attention mechanisms. In this paper, we fully exploit the semantic context of the embedding vectors for a pair of entity text descriptions. In particular, we propose an integrated multi-context attention framework that takes into account self-attention, pair-attention and global-attention from three types of context. The idea is further extended to incorporate attribute attention in order to support structured datasets. We conduct extensive experiments with seven publicly accessible benchmark datasets. The experimental results show that our framework outperforms DeepER and DeepMatcher on all of these datasets.
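To make the distinction between self-attention and pair-attention concrete, here is a minimal sketch, not the authors' model. It assumes plain, unscaled dot-product attention with no learned parameters over toy two-dimensional token embeddings; the function names (`attend`, `self_attention`, `pair_attention`) and the toy vectors are hypothetical. Global-attention and attribute attention are not defined in the abstract, so they are left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(queries, keys):
    """For each query vector, return a softmax-weighted average of `keys`.

    Assumption: unscaled dot-product scores, keys double as values.
    """
    dim = len(keys[0])
    contexts = []
    for q in queries:
        weights = softmax([dot(q, k) for k in keys])
        contexts.append(
            [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)]
        )
    return contexts


def self_attention(seq):
    """Each token attends to the tokens of its own entity description."""
    return attend(seq, seq)


def pair_attention(seq_a, seq_b):
    """Each token of one description attends to the tokens of the other."""
    return attend(seq_a, seq_b)


# Toy embeddings for the two entity descriptions (illustrative values only).
left = [[1.0, 0.0], [0.0, 1.0]]
right = [[1.0, 0.0], [1.0, 1.0], [0.0, 2.0]]

self_ctx = self_attention(left)         # context from within `left`
pair_ctx = pair_attention(left, right)  # context for `left` drawn from `right`
```

Both calls return one context vector per token of `left`. The only difference is where the context comes from: the same description in the first case, the paired description in the second.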
