Challenges in including extra-linguistic context in pre-trained language models

First Workshop on Insights from Negative Results in NLP. DOI: 10.18653/v1/2022.insights-1.18

Ionut-Teodor Sorodoc, Laura Aina, Gemma Boleda

Citations: 0

Abstract

To successfully account for language, computational models need to take into account both the linguistic context (the content of the utterances) and the extra-linguistic context (for instance, the participants in a dialogue). We focus on a referential task that asks models to link entity mentions in a TV show to the corresponding characters, and design an architecture that attempts to account for both kinds of context. In particular, our architecture combines a previously proposed specialized module (an “entity library”) for character representation with transfer learning from a pre-trained language model. We find that, although the model does improve linguistic contextualization, it fails to successfully integrate extra-linguistic information about the participants in the dialogue. Our work shows that it is very challenging to incorporate extra-linguistic information into pre-trained language models.


