Benchmarking the speed-accuracy tradeoff in object recognition by humans and neural networks.

Journal of Vision · Pub Date: 2025-01-02 · DOI: 10.1167/jov.25.1.4 · Impact Factor: 2.0 · JCR: Q2 (Ophthalmology) · CAS: Tier 4 (Psychology)
Ajay Subramanian, Sara Price, Omkar Kumbhar, Elena Sizikova, Najib J Majaj, Denis G Pelli
{"title":"Benchmarking the speed-accuracy tradeoff in object recognition by humans and neural networks.","authors":"Ajay Subramanian, Sara Price, Omkar Kumbhar, Elena Sizikova, Najib J Majaj, Denis G Pelli","doi":"10.1167/jov.25.1.4","DOIUrl":null,"url":null,"abstract":"<p><p>Active object recognition, fundamental to tasks like reading and driving, relies on the ability to make time-sensitive decisions. People exhibit a flexible tradeoff between speed and accuracy, a crucial human skill. However, current computational models struggle to incorporate time. To address this gap, we present the first dataset (with 148 observers) exploring the speed-accuracy tradeoff (SAT) in ImageNet object recognition. Participants performed a 16-way ImageNet categorization task where their responses counted only if they occurred near the time of a fixed-delay beep. Each block of trials allowed one reaction time. As expected, human accuracy increases with reaction time. We compare human performance with that of dynamic neural networks that adapt their computation to the available inference time. Time is a scarce resource for human object recognition, and finding an appropriate analog in neural networks is challenging. Networks can repeat operations by using layers, recurrent cycles, or early exits. We use the repetition count as a network's analog for time. In our analysis, the number of layers, recurrent cycles, and early exits correlates strongly with floating-point operations, making them suitable time analogs. Comparing networks and humans on SAT-fit error, category-wise correlation, and SAT-curve steepness, we find cascaded dynamic neural networks most promising in modeling human speed and accuracy. Surprisingly, convolutional recurrent networks, typically favored in human object recognition modeling, perform the worst on our benchmark.</p>","PeriodicalId":49955,"journal":{"name":"Journal of Vision","volume":"25 1","pages":"4"},"PeriodicalIF":2.0000,"publicationDate":"2025-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11706240/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Vision","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1167/jov.25.1.4","RegionNum":4,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"OPHTHALMOLOGY","Score":null,"Total":0}
引用次数: 0

Abstract

Active object recognition, fundamental to tasks like reading and driving, relies on the ability to make time-sensitive decisions. People exhibit a flexible tradeoff between speed and accuracy, a crucial human skill. However, current computational models struggle to incorporate time. To address this gap, we present the first dataset (with 148 observers) exploring the speed-accuracy tradeoff (SAT) in ImageNet object recognition. Participants performed a 16-way ImageNet categorization task where their responses counted only if they occurred near the time of a fixed-delay beep. Each block of trials allowed one reaction time. As expected, human accuracy increases with reaction time. We compare human performance with that of dynamic neural networks that adapt their computation to the available inference time. Time is a scarce resource for human object recognition, and finding an appropriate analog in neural networks is challenging. Networks can repeat operations by using layers, recurrent cycles, or early exits. We use the repetition count as a network's analog for time. In our analysis, the number of layers, recurrent cycles, and early exits correlates strongly with floating-point operations, making them suitable time analogs. Comparing networks and humans on SAT-fit error, category-wise correlation, and SAT-curve steepness, we find cascaded dynamic neural networks most promising in modeling human speed and accuracy. Surprisingly, convolutional recurrent networks, typically favored in human object recognition modeling, perform the worst on our benchmark.
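The comparison metrics above refer to curves of accuracy as a function of response time. As a concrete illustration, here is a minimal sketch of fitting an exponential-approach SAT curve and computing a fit error; the functional form, parameter names, and data values are assumptions for illustration, not the paper's exact fitting procedure.

```python
import numpy as np
from scipy.optimize import curve_fit

def sat_curve(t, floor, ceiling, t0, tau):
    """Exponential-approach SAT curve: accuracy rises from chance
    (floor) toward a ceiling once reaction time t exceeds onset t0.
    Steepness is governed by the time constant tau."""
    return floor + (ceiling - floor) * (1.0 - np.exp(-np.maximum(t - t0, 0.0) / tau))

# Hypothetical human data: mean accuracy per reaction-time condition
# (16-way task, so chance is 1/16). Values are made up for illustration.
rt = np.array([0.2, 0.4, 0.6, 0.8, 1.0, 1.5])         # seconds
acc = np.array([0.10, 0.35, 0.55, 0.68, 0.74, 0.78])  # proportion correct

params, _ = curve_fit(sat_curve, rt, acc,
                      p0=[1 / 16, 0.8, 0.15, 0.3],
                      bounds=([0, 0, 0, 1e-3], [1, 1, 1, 5]))
pred = sat_curve(rt, *params)
rmse = np.sqrt(np.mean((acc - pred) ** 2))  # one possible "SAT-fit error"
print(f"fitted ceiling={params[1]:.2f}, tau={params[3]:.2f}s, RMSE={rmse:.3f}")
```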

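On the network side, the "repetition count" time analog can be made concrete with an early-exit architecture: running more blocks costs more floating-point operations but yields better predictions. Below is a minimal PyTorch-style sketch in which the architecture, layer sizes, and the budget parameter are illustrative assumptions, not the dynamic networks benchmarked in the paper.

```python
import torch
import torch.nn as nn

class EarlyExitNet(nn.Module):
    """Toy dynamic network: a stack of blocks, each followed by its own
    classifier head. Stopping after k blocks trades accuracy for speed,
    so k plays the role of reaction time."""
    def __init__(self, n_blocks=4, dim=64, n_classes=16):
        super().__init__()
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(n_blocks)]
        )
        self.exits = nn.ModuleList(
            [nn.Linear(dim, n_classes) for _ in range(n_blocks)]
        )

    def forward(self, x, budget):
        """Run only the first `budget` blocks and classify at that exit.
        A larger budget means more computation (FLOPs), the paper's
        proxy for time."""
        for block in self.blocks[:budget]:
            x = block(x)
        return self.exits[budget - 1](x)

net = EarlyExitNet()
x = torch.randn(8, 64)  # a batch of 8 feature vectors
for budget in range(1, 5):
    logits = net(x, budget)  # one point on the network's SAT curve per budget
```

Sweeping the budget traces out a network SAT curve that can then be compared against human curves on metrics like fit error and steepness.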
Source journal: Journal of Vision (Medicine - Ophthalmology)
CiteScore: 2.90
Self-citation rate: 5.60%
Articles published: 218
Review time: 3-6 weeks
Journal description: Exploring all aspects of biological visual function, including spatial vision, perception, low vision, color vision, and more, spanning the fields of neuroscience, psychology, and psychophysics.
Latest articles in this journal:
Impaired visual perceptual accuracy in the upper visual field induces asymmetric performance in position estimation for falling and rising objects.
Anticipatory smooth pursuit eye movements scale with the probability of visual motion: The role of target speed and acceleration.
Benchmarking the speed-accuracy tradeoff in object recognition by humans and neural networks.
Effect of sign language learning on temporal resolution of visual attention.
Improving the reliability and accuracy of population receptive field measures using a logarithmically warped stimulus.