Decoding Word Embeddings with Brain-Based Semantic Features

IF 3.7 | CAS Tier 2 (Computer Science) | Q2 (Computer Science, Artificial Intelligence) | Computational Linguistics | Pub Date: 2021-07-05 | DOI: 10.1162/coli_a_00412
Emmanuele Chersoni, Enrico Santus, Chu-Ren Huang, Alessandro Lenci
{"title":"Decoding Word Embeddings with Brain-Based Semantic Features","authors":"Emmanuele Chersoni, Enrico Santus, Chu-Ren Huang, Alessandro Lenci","doi":"10.1162/coli_a_00412","DOIUrl":null,"url":null,"abstract":"Word embeddings are vectorial semantic representations built with either counting or predicting techniques aimed at capturing shades of meaning from word co-occurrences. Since their introduction, these representations have been criticized for lacking interpretable dimensions. This property of word embeddings limits our understanding of the semantic features they actually encode. Moreover, it contributes to the “black box” nature of the tasks in which they are used, since the reasons for word embedding performance often remain opaque to humans. In this contribution, we explore the semantic properties encoded in word embeddings by mapping them onto interpretable vectors, consisting of explicit and neurobiologically motivated semantic features (Binder et al. 2016). Our exploration takes into account different types of embeddings, including factorized count vectors and predict models (Skip-Gram, GloVe, etc.), as well as the most recent contextualized representations (i.e., ELMo and BERT). In our analysis, we first evaluate the quality of the mapping in a retrieval task, then we shed light on the semantic features that are better encoded in each embedding type. A large number of probing tasks is finally set to assess how the original and the mapped embeddings perform in discriminating semantic categories. For each probing task, we identify the most relevant semantic features and we show that there is a correlation between the embedding performance and how they encode those features. This study sets itself as a step forward in understanding which aspects of meaning are captured by vector spaces, by proposing a new and simple method to carve human-interpretable semantic representations from distributional vectors.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"1-36"},"PeriodicalIF":3.7000,"publicationDate":"2021-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational Linguistics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/coli_a_00412","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 24

Abstract

Word embeddings are vectorial semantic representations built with either counting or predicting techniques, aimed at capturing shades of meaning from word co-occurrences. Since their introduction, these representations have been criticized for lacking interpretable dimensions. This property of word embeddings limits our understanding of the semantic features they actually encode. Moreover, it contributes to the "black box" nature of the tasks in which they are used, since the reasons for word embedding performance often remain opaque to humans. In this contribution, we explore the semantic properties encoded in word embeddings by mapping them onto interpretable vectors, consisting of explicit and neurobiologically motivated semantic features (Binder et al. 2016). Our exploration takes into account different types of embeddings, including factorized count vectors and predict models (Skip-Gram, GloVe, etc.), as well as the most recent contextualized representations (i.e., ELMo and BERT). In our analysis, we first evaluate the quality of the mapping in a retrieval task, then we shed light on the semantic features that are better encoded in each embedding type. Finally, a large number of probing tasks are set up to assess how the original and the mapped embeddings perform in discriminating semantic categories. For each probing task, we identify the most relevant semantic features, and we show that there is a correlation between the embeddings' performance and how well they encode those features. This study takes a step forward in understanding which aspects of meaning are captured by vector spaces, by proposing a new and simple method to carve human-interpretable semantic representations out of distributional vectors.
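The core of the method is a supervised map from an embedding space to the space of Binder et al.'s (2016) brain-based feature norms. Below is a minimal sketch of this setup, assuming a ridge regression mapper, synthetic data, and top-1 nearest-neighbor retrieval as the evaluation metric; the paper's actual regression model, data, and evaluation details may differ.

```python
# Minimal sketch: learn a linear map from word embeddings to
# Binder-style semantic feature vectors, then score it with a
# retrieval task on held-out words. All data here is synthetic.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 500 words, 300-d embeddings, and 65
# feature ratings per word (Binder et al. 2016 define 65 features).
n_words, emb_dim, n_features = 500, 300, 65
embeddings = rng.normal(size=(n_words, emb_dim))
binder_features = rng.uniform(0, 6, size=(n_words, n_features))

X_train, X_test, Y_train, Y_test = train_test_split(
    embeddings, binder_features, test_size=0.2, random_state=0)

# Learn the embedding-to-feature mapping on the training words.
mapper = Ridge(alpha=1.0).fit(X_train, Y_train)
Y_pred = mapper.predict(X_test)

# Retrieval evaluation: for each held-out word, rank all gold feature
# vectors by cosine similarity to the predicted vector and check
# whether the correct word comes out on top (top-1 accuracy).
def unit_rows(m):
    return m / np.linalg.norm(m, axis=1, keepdims=True)

sims = unit_rows(Y_pred) @ unit_rows(Y_test).T   # (n_test, n_test)
top1 = (sims.argmax(axis=1) == np.arange(len(Y_test))).mean()
print(f"Top-1 retrieval accuracy: {top1:.3f}")
```

With real embeddings and real feature norms, a high retrieval score would indicate that the embedding space linearly encodes the interpretable feature dimensions.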
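The probing side of the analysis can be sketched in the same spirit: train a simple classifier to discriminate a semantic category from either the original or the mapped vectors and compare accuracies. The snippet below assumes a logistic regression probe and synthetic binary labels (e.g., animate vs. inanimate); all names and data are illustrative, not the paper's actual setup.

```python
# Sketch of one probing task over original vs. mapped embeddings.
# Everything here is synthetic and illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
n_words, emb_dim, n_features = 500, 300, 65
embeddings = rng.normal(size=(n_words, emb_dim))
binder_features = rng.uniform(0, 6, size=(n_words, n_features))
labels = rng.integers(0, 2, size=n_words)   # hypothetical category labels

# Project the embeddings into the interpretable feature space.
mapped = Ridge(alpha=1.0).fit(embeddings, binder_features).predict(embeddings)

# Compare how well a linear probe separates the category in each space.
for name, X in [("original", embeddings), ("mapped", mapped)]:
    probe = LogisticRegression(max_iter=1000)
    acc = cross_val_score(probe, X, labels, cv=5).mean()
    print(f"{name} embeddings: mean 5-fold probing accuracy = {acc:.3f}")
```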
Source journal: Computational Linguistics
Category: Engineering & Technology - Computer Science: Interdisciplinary Applications
CiteScore: 15.80
Self-citation rate: 0.00%
Articles per year: 45
Review time: >12 weeks
Journal description: Computational Linguistics, the longest-running publication dedicated solely to the computational and mathematical aspects of language and the design of natural language processing systems, provides university and industry linguists, computational linguists, AI and machine learning researchers, cognitive scientists, speech specialists, and philosophers with the latest insights into the computational aspects of language research.
Latest articles in this journal:
Generation and Polynomial Parsing of Graph Languages with Non-Structural Reentrancies
Languages through the Looking Glass of BPE Compression
Capturing Fine-Grained Regional Differences in Language Use through Voting Precinct Embeddings
Machine Learning for Ancient Languages: A Survey
Statistical Methods for Annotation Analysis by Silviu Paun, Ron Artstein, and Massimo Poesio