Family lexicon: Using language models to encode memories of personally familiar and famous people and places in the brain.

PLoS ONE · IF 2.9 · CAS Tier 3, Multidisciplinary · JCR Q1 (MULTIDISCIPLINARY SCIENCES) · Pub Date: 2024-11-22 · eCollection Date: 2024-01-01 · DOI: 10.1371/journal.pone.0291099
Andrea Bruera, Massimo Poesio

Abstract

Knowledge about personally familiar people and places is extremely rich and varied, involving pieces of semantic information connected in unpredictable ways through past autobiographical memories. In this work, we investigate whether we can capture brain processing of personally familiar people and places using subject-specific memories, after transforming them into vectorial semantic representations using language models. First, we asked participants to provide us with the names of the closest people and places in their lives. Then we collected open-ended answers to a questionnaire, aimed at capturing various facets of declarative knowledge. We collected EEG data from the same participants while they were reading the names and subsequently mentally visualizing their referents. As a control set of stimuli, we also recorded evoked responses to a matched set of famous people and places. We then created original semantic representations for the individual entities using language models. For personally familiar entities, we used the text of the answers to the questionnaire. For famous entities, we employed their Wikipedia page, which reflects shared declarative knowledge about them. Through whole-scalp time-resolved and searchlight encoding analyses, we found that we could capture how the brain processes one's closest people and places using person-specific answers to questionnaires, as well as famous entities. Overall encoding performance was significant in a large time window (200-800 ms). Using a spatio-temporal EEG searchlight, we found that we could predict brain responses significantly better than chance earlier (200-500 ms) in bilateral temporo-parietal electrodes and later (500-700 ms) in frontal and posterior central electrodes. We also found that XLM, a contextualized (or large) language model, provided superior encoding scores when compared with a simpler static language model such as word2vec.
Overall, these results indicate that language models can capture subject-specific semantic representations as they are processed in the human brain, by exploiting small-scale distributional lexical data.
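The abstract describes a time-resolved encoding analysis: semantic vectors for each entity (in the study, derived from word2vec or XLM applied to questionnaire answers or Wikipedia text) are used to predict EEG responses at each timepoint, and prediction quality is scored against chance. The following is a minimal sketch of that general approach, not the paper's actual pipeline: it uses random stand-in embeddings and simulated EEG, and all dimensions, names, and parameters are illustrative assumptions.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# Hypothetical setup: 32 entities, 50-d semantic vectors, 64 electrodes,
# 100 time samples. In the real study the vectors would come from a
# language model applied to entity-specific text.
n_entities, n_dims, n_channels, n_times = 32, 50, 64, 100
embeddings = rng.normal(size=(n_entities, n_dims))

# Simulated EEG that partly depends on the embeddings via a random
# linear map, plus noise (purely illustrative data).
true_map = rng.normal(size=(n_dims, n_channels * n_times))
eeg = (embeddings @ true_map
       + rng.normal(scale=5.0, size=(n_entities, n_channels * n_times))
       ).reshape(n_entities, n_channels, n_times)

def encoding_score(X, Y, n_splits=4):
    """Cross-validated encoding: predict each timepoint's scalp pattern
    from semantic vectors, score by Pearson correlation per electrode,
    then average across electrodes."""
    scores = np.zeros(Y.shape[2])
    for t in range(Y.shape[2]):
        y_t = Y[:, :, t]
        preds = np.zeros_like(y_t)
        for train, test in KFold(n_splits=n_splits).split(X):
            model = Ridge(alpha=1.0).fit(X[train], y_t[train])
            preds[test] = model.predict(X[test])
        scores[t] = np.mean([np.corrcoef(preds[:, c], y_t[:, c])[0, 1]
                             for c in range(y_t.shape[1])])
    return scores

scores = encoding_score(embeddings, eeg)
print(round(float(scores.mean()), 3))
```

In practice, the per-timepoint scores would be compared against a permutation-based chance level to identify the significant windows (e.g. 200-800 ms) reported above; the searchlight variant repeats the same fit within local electrode-time neighborhoods.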

Source journal: PLoS ONE (Biology)

CiteScore: 6.20 · Self-citation rate: 5.40% · Articles per year: 14242 · Review time: 3.7 months

Journal description: PLOS ONE is an international, peer-reviewed, open-access, online publication. PLOS ONE welcomes reports on primary research from any scientific discipline. It provides:
* Open access: freely accessible online, authors retain copyright
* Fast publication times
* Peer review by expert, practicing researchers
* Post-publication tools to indicate quality and impact
* Community-based dialogue on articles
* Worldwide media coverage