On the Mathematical Relationship Between Contextual Probability and N400 Amplitude.

Open Mind (Q1, Social Sciences) · Published: 2024-06-28 · eCollection date: 2024-01-01 · DOI: 10.1162/opmi_a_00150
James A Michaelov, Benjamin K Bergen
Open Mind, vol. 8, pp. 859–897. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11285424/pdf/
Citations: 0

Abstract

Accounts of human language comprehension propose different mathematical relationships between the contextual probability of a word and how difficult it is to process, including linear, logarithmic, and super-logarithmic ones. However, the empirical evidence favoring any of these over the others is mixed, appearing to vary depending on the index of processing difficulty used and the approach taken to calculate contextual probability. To help disentangle these results, we focus on the mathematical relationship between corpus-derived contextual probability and the N400, a neural index of processing difficulty. Specifically, we use 37 contemporary transformer language models to calculate the contextual probability of stimuli from 6 experimental studies of the N400, and test whether N400 amplitude is best predicted by a linear, logarithmic, super-logarithmic, or sub-logarithmic transformation of the probabilities calculated using these language models, as well as combinations of these transformed metrics. We replicate the finding that on some datasets, a combination of linearly and logarithmically-transformed probability can predict N400 amplitude better than either metric alone. In addition, we find that overall, the best single predictor of N400 amplitude is sub-logarithmically-transformed probability, which for almost all language models and datasets explains all the variance in N400 amplitude otherwise explained by the linear and logarithmic transformations. This is a novel finding that is not predicted by any current theoretical accounts, and thus one that we argue is likely to play an important role in increasing our understanding of how the statistical regularities of language impact language comprehension.
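The four candidate linking functions the abstract names can be illustrated concretely. A minimal sketch, assuming (hypothetically) that the super- and sub-logarithmic transformations are modeled as power transforms of surprisal with exponents above and below 1; the specific exponent values here are illustrative, not taken from the paper:

```python
import math

def n400_predictors(p, k_super=2.0, k_sub=0.5):
    """Candidate predictors of N400 amplitude derived from a word's
    contextual probability p (0 < p <= 1).

    The exponents k_super and k_sub are hypothetical placeholders for
    any power > 1 (super-logarithmic) or < 1 (sub-logarithmic)."""
    surprisal = -math.log(p)  # the logarithmic transform (surprisal)
    return {
        "linear": p,                                # raw probability
        "logarithmic": surprisal,                   # -log p
        "super-logarithmic": surprisal ** k_super,  # grows faster than -log p
        "sub-logarithmic": surprisal ** k_sub,      # grows slower than -log p
    }

# Example: a word assigned contextual probability 0.01 by a language model
preds = n400_predictors(0.01)
```

Under these assumptions, a sub-logarithmic predictor compresses differences between very improbable words relative to surprisal, while a super-logarithmic one amplifies them; the paper's model comparison asks which of these shapes best tracks measured N400 amplitude.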

Source journal

Open Mind (Social Sciences – Linguistics and Language)

CiteScore: 3.20 · Self-citation rate: 0.00% · Articles per year: 15 · Review time: 53 weeks