Word Frequency and Predictability Dissociate in Naturalistic Reading.

Open Mind (JCR Q1, Social Sciences). Pub Date: 2024-03-05. eCollection Date: 2024-01-01. DOI: 10.1162/opmi_a_00119
Cory Shain
Open Mind, vol. 8, pp. 177-201. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10932590/pdf/

Abstract

Many studies of human language processing have shown that readers slow down at less frequent or less predictable words, but there is debate about whether frequency and predictability effects reflect separable cognitive phenomena: are cognitive operations that retrieve words from the mental lexicon based on sensory cues distinct from those that predict upcoming words based on context? Previous evidence for a frequency-predictability dissociation is mostly based on small samples (both for estimating predictability and frequency and for testing their effects on human behavior), artificial materials (e.g., isolated constructed sentences), and implausible modeling assumptions (discrete-time dynamics, linearity, additivity, constant variance, and invariance over time), which raises the question: do frequency and predictability dissociate in ordinary language comprehension, such as story reading? This study leverages recent progress in open data and computational modeling to address this question at scale. A large collection of naturalistic reading data (six datasets, >2.2 M datapoints) is analyzed using nonlinear continuous-time regression, and frequency and predictability are estimated using statistical language models trained on more data than is currently typical in psycholinguistics. Despite the use of naturalistic data, strong predictability estimates, and flexible regression models, results converge with earlier experimental studies in supporting dissociable and additive frequency and predictability effects.

Source journal: Open Mind (Social Sciences: Linguistics and Language)
CiteScore: 3.20
Self-citation rate: 0.00%
Articles published: 15
Review time: 53 weeks