在预训练的语言模型中包含语言外语境的挑战

First Workshop on Insights from Negative Results in NLP Pub Date : 1900-01-01 DOI:10.18653/v1/2022.insights-1.18

Ionut-Teodor Sorodoc, Laura Aina, Gemma Boleda

{"title":"在预训练的语言模型中包含语言外语境的挑战","authors":"Ionut-Teodor Sorodoc, Laura Aina, Gemma Boleda","doi":"10.18653/v1/2022.insights-1.18","DOIUrl":null,"url":null,"abstract":"To successfully account for language, computational models need to take into account both the linguistic context (the content of the utterances) and the extra-linguistic context (for instance, the participants in a dialogue). We focus on a referential task that asks models to link entity mentions in a TV show to the corresponding characters, and design an architecture that attempts to account for both kinds of context. In particular, our architecture combines a previously proposed specialized module (an “entity library”) for character representation with transfer learning from a pre-trained language model. We find that, although the model does improve linguistic contextualization, it fails to successfully integrate extra-linguistic information about the participants in the dialogue. Our work shows that it is very challenging to incorporate extra-linguistic information into pre-trained language models.","PeriodicalId":441528,"journal":{"name":"First Workshop on Insights from Negative Results in NLP","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Challenges in including extra-linguistic context in pre-trained language models\",\"authors\":\"Ionut-Teodor Sorodoc, Laura Aina, Gemma Boleda\",\"doi\":\"10.18653/v1/2022.insights-1.18\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"To successfully account for language, computational models need to take into account both the linguistic context (the content of the utterances) and the extra-linguistic context (for instance, the participants in a dialogue). We focus on a referential task that asks models to link entity mentions in a TV show to the corresponding characters, and design an architecture that attempts to account for both kinds of context. In particular, our architecture combines a previously proposed specialized module (an “entity library”) for character representation with transfer learning from a pre-trained language model. We find that, although the model does improve linguistic contextualization, it fails to successfully integrate extra-linguistic information about the participants in the dialogue. Our work shows that it is very challenging to incorporate extra-linguistic information into pre-trained language models.\",\"PeriodicalId\":441528,\"journal\":{\"name\":\"First Workshop on Insights from Negative Results in NLP\",\"volume\":\"15 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"First Workshop on Insights from Negative Results in NLP\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18653/v1/2022.insights-1.18\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"First Workshop on Insights from Negative Results in NLP","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/2022.insights-1.18","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

为了成功地解释语言，计算模型需要同时考虑语言上下文(话语的内容)和语言外上下文(例如，对话中的参与者)。我们专注于一个参考任务，该任务要求模型将电视节目中的实体提及与相应的角色联系起来，并设计一个试图解释这两种上下文的体系结构。特别是，我们的架构结合了先前提出的用于字符表示的专门模块(“实体库”)和从预训练的语言模型迁移学习。我们发现，尽管该模型确实改善了语言语境化，但它未能成功地整合对话参与者的语言外信息。我们的工作表明，将语言外的信息纳入预训练的语言模型是非常具有挑战性的。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Challenges in including extra-linguistic context in pre-trained language models

To successfully account for language, computational models need to take into account both the linguistic context (the content of the utterances) and the extra-linguistic context (for instance, the participants in a dialogue). We focus on a referential task that asks models to link entity mentions in a TV show to the corresponding characters, and design an architecture that attempts to account for both kinds of context. In particular, our architecture combines a previously proposed specialized module (an “entity library”) for character representation with transfer learning from a pre-trained language model. We find that, although the model does improve linguistic contextualization, it fails to successfully integrate extra-linguistic information about the participants in the dialogue. Our work shows that it is very challenging to incorporate extra-linguistic information into pre-trained language models.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

First Workshop on Insights from Negative Results in NLP

自引率

0.00%

发文量

期刊最新文献

What GPT Knows About Who is Who Pathologies of Pre-trained Language Models in Few-shot Fine-tuning Can Question Rewriting Help Conversational Question Answering? Extending the Scope of Out-of-Domain: Examining QA models in multiple subdomains Do Data-based Curricula Work?