Learning the Meanings of Function Words From Grounded Language Using a Visual Question Answering Model

Cognitive Science · Pub Date: 2024-05-14 · DOI: 10.1111/cogs.13448
Eva Portelance, Michael C. Frank, Dan Jurafsky
{"title":"Learning the Meanings of Function Words From Grounded Language Using a Visual Question Answering Model","authors":"Eva Portelance,&nbsp;Michael C. Frank,&nbsp;Dan Jurafsky","doi":"10.1111/cogs.13448","DOIUrl":null,"url":null,"abstract":"<p>Interpreting a seemingly simple function word like “or,” “behind,” or “more” can require logical, numerical, and relational reasoning. How are such words learned by children? Prior acquisition theories have often relied on positing a foundation of innate knowledge. Yet recent neural-network-based visual question answering models apparently can learn to use function words as part of answering questions about complex visual scenes. In this paper, we study what these models learn about function words, in the hope of better understanding how the meanings of these words can be learned by both models and children. We show that recurrent models trained on visually grounded language learn gradient semantics for function words requiring spatial and numerical reasoning. Furthermore, we find that these models can learn the meanings of logical connectives <i>and</i> and <i>or</i> without any prior knowledge of logical reasoning as well as early evidence that they are sensitive to alternative expressions when interpreting language. Finally, we show that word learning difficulty is dependent on the frequency of models' input. Our findings offer proof-of-concept evidence that it is possible to learn the nuanced interpretations of function words in a visually grounded context by using non-symbolic general statistical learning algorithms, without any prior knowledge of linguistic meaning.</p>","PeriodicalId":2,"journal":{"name":"ACS Applied Bio Materials","volume":null,"pages":null},"PeriodicalIF":4.6000,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1111/cogs.13448","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Bio Materials","FirstCategoryId":"102","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1111/cogs.13448","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATERIALS SCIENCE, BIOMATERIALS","Score":null,"Total":0}
Citations: 0

Abstract

Interpreting a seemingly simple function word like "or," "behind," or "more" can require logical, numerical, and relational reasoning. How are such words learned by children? Prior acquisition theories have often relied on positing a foundation of innate knowledge. Yet recent neural-network-based visual question answering models apparently can learn to use function words as part of answering questions about complex visual scenes. In this paper, we study what these models learn about function words, in the hope of better understanding how the meanings of these words can be learned by both models and children. We show that recurrent models trained on visually grounded language learn gradient semantics for function words requiring spatial and numerical reasoning. Furthermore, we find that these models can learn the meanings of the logical connectives "and" and "or" without any prior knowledge of logical reasoning, and we find early evidence that they are sensitive to alternative expressions when interpreting language. Finally, we show that word learning difficulty is dependent on the frequency of models' input. Our findings offer proof-of-concept evidence that it is possible to learn the nuanced interpretations of function words in a visually grounded context by using non-symbolic general statistical learning algorithms, without any prior knowledge of linguistic meaning.
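To make the kind of model the abstract describes concrete, here is a minimal sketch of a recurrent visual question answering model in PyTorch. It is an illustrative assumption, not the authors' actual architecture or training setup: the class name RecurrentVQA, the multiplicative fusion scheme, and all dimensions are hypothetical, and the paper's models and grounded training data may differ.

```python
# Minimal sketch of a recurrent VQA model: an LSTM encodes the question,
# the summary is fused with precomputed image features, and a linear
# classifier scores a fixed answer vocabulary. Illustrative only.
import torch
import torch.nn as nn

class RecurrentVQA(nn.Module):
    def __init__(self, vocab_size, num_answers, embed_dim=128,
                 hidden_dim=256, image_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.img_proj = nn.Linear(image_dim, hidden_dim)
        self.classifier = nn.Linear(hidden_dim, num_answers)

    def forward(self, question_ids, image_feats):
        # question_ids: (batch, seq_len) token ids,
        #   e.g. for "is the cube behind the ball?"
        # image_feats: (batch, image_dim) precomputed scene features
        emb = self.embed(question_ids)
        _, (h, _) = self.lstm(emb)   # final hidden state summarizes the question
        fused = torch.tanh(h[-1] * self.img_proj(image_feats))  # fuse text and vision
        return self.classifier(fused)  # logits over the answer vocabulary

# Usage sketch: answer distributions over systematically varied scenes are
# one way to probe gradient (non-binary) semantics, e.g. how the probability
# of "yes" shifts as one object moves "behind" another.
model = RecurrentVQA(vocab_size=1000, num_answers=30)
q = torch.randint(0, 1000, (2, 8))      # two toy questions, 8 tokens each
img = torch.randn(2, 512)               # two toy scene feature vectors
probs = model(q, img).softmax(dim=-1)   # per-scene answer distributions
```

Probing the softmax over answers across minimally different scenes, rather than just taking the argmax, is the design choice that lets one observe the gradient interpretations of spatial and numerical terms that the abstract reports.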

