UNIMIB @ DIACR-Ita: Aligning Distributional Embeddings with a Compass for Semantic Change Detection in the Italian Language (short paper)

EVALITA Evaluation of NLP and Speech Tools for Italian - December 17th, 2020. DOI: 10.4000/BOOKS.AACCADEMIA.7688

F. Belotti, Federico Bianchi, M. Palmonari

Citations: 2

Abstract

In this paper, we present our results for DIACR-Ita, the EVALITA 2020 challenge on semantic change detection for the Italian language. Our approach measures the semantic distance between time-specific word vectors generated with Compass-aligned Distributional Embeddings (CADE). We first generate temporal embeddings with CADE, a strategy for aligning word embeddings that are specific to each time period; the quality of this alignment is the main asset of our proposal. We then measure the semantic shift of each word by combining two different semantic shift measures. Finally, we classify a word's meaning as changed or unchanged by applying a threshold to its semantic distance across time.
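
The final two steps (combining two shift measures, then thresholding) can be illustrated with a minimal sketch. The abstract does not name the two measures, so the code below assumes cosine distance and a nearest-neighbour overlap distance, combined by an unweighted mean; the function names, the toy vectors, `k`, and the threshold of 0.5 are all hypothetical. The embeddings are assumed to be already aligned across periods, since CADE training itself is not shown.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def neighbourhood_distance(word, emb_t1, emb_t2, k=2):
    """1 - Jaccard overlap of the word's k nearest neighbours in each period."""
    def neighbours(emb):
        others = [w for w in emb if w != word]
        others.sort(key=lambda w: cosine_distance(emb[word], emb[w]))
        return set(others[:k])
    n1, n2 = neighbours(emb_t1), neighbours(emb_t2)
    return 1.0 - len(n1 & n2) / len(n1 | n2)

def classify(word, emb_t1, emb_t2, threshold=0.5, k=2):
    """Return (score, changed). The unweighted mean is an assumption."""
    d_cos = cosine_distance(emb_t1[word], emb_t2[word])
    d_nbr = neighbourhood_distance(word, emb_t1, emb_t2, k)
    score = 0.5 * (d_cos + d_nbr)
    return score, score > threshold

# Toy, already-aligned embeddings for two time periods (made-up values).
emb_t1 = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}
emb_t2 = {"a": [0.2, 1.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [0.1, 0.9]}

for w in ("a", "b"):
    score, changed = classify(w, emb_t1, emb_t2)
    print(w, round(score, 3), "changed" if changed else "not changed")
```

In this toy data, only the vector for "a" moves between periods, so it is the only word whose combined score exceeds the threshold. In practice the threshold would have to be chosen on held-out data or from the score distribution.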
