A robotics-inspired scanpath model reveals the importance of uncertainty and semantic object cues for gaze guidance in dynamic scenes.

Journal of Vision | IF 2.3 | Zone 4 (Psychology) | JCR Q2 (Ophthalmology) | Pub Date: 2025-02-03 | DOI: 10.1167/jov.25.2.6
Vito Mengers, Nicolas Roth, Oliver Brock, Klaus Obermayer, Martin Rolfs
Citations: 0

Abstract

The objects we perceive guide our eye movements when observing real-world dynamic scenes. Yet, gaze shifts and selective attention are critical for perceiving details and refining object boundaries. Object segmentation and gaze behavior are, however, typically treated as two independent processes. Here, we present a computational model that simulates these processes in an interconnected manner and allows for hypothesis-driven investigations of distinct attentional mechanisms. Drawing on an information processing pattern from robotics, we use a Bayesian filter to recursively segment the scene, which also provides an uncertainty estimate for the object boundaries that we use to guide active scene exploration. We demonstrate that this model closely resembles observers' free viewing behavior on a dataset of dynamic real-world scenes, measured by scanpath statistics, including foveation duration and saccade amplitude distributions used for parameter fitting and higher-level statistics not used for fitting. These include how object detections, inspections, and returns are balanced and a delay of returning saccades without an explicit implementation of such temporal inhibition of return. Extensive simulations and ablation studies show that uncertainty promotes balanced exploration and that semantic object cues are crucial to forming the perceptual units used in object-based attention. Moreover, we show how our model's modular design allows for extensions, such as incorporating saccadic momentum or presaccadic attention, to further align its output with human scanpaths.
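
The core mechanism named in the abstract, a Bayesian filter that recursively segments the scene and yields an uncertainty estimate used to guide exploration, can be illustrated with a minimal Python sketch. This is not the authors' model: the per-pixel discrete label belief, the uniform-diffusion prediction step, the entropy-based uncertainty measure, the greedy choice of the most uncertain pixel, and all function names are assumptions made purely for illustration.

```python
import numpy as np

def predict(belief, diffusion=0.05):
    """Prediction step (assumed): blend each pixel's belief with a uniform
    distribution to model the scene changing between frames."""
    k = belief.shape[-1]
    return (1.0 - diffusion) * belief + diffusion / k

def update(belief, likelihood):
    """Measurement step: apply Bayes' rule per pixel, then renormalize
    over the label axis."""
    posterior = belief * likelihood
    return posterior / posterior.sum(axis=-1, keepdims=True)

def entropy(belief, eps=1e-12):
    """Shannon entropy per pixel (in nats), used here as the uncertainty."""
    return -(belief * np.log(belief + eps)).sum(axis=-1)

def next_gaze_target(belief):
    """Greedy rule (assumed): look at the pixel with the highest uncertainty."""
    h = entropy(belief)
    return np.unravel_index(np.argmax(h), h.shape)

# Toy scene: 4x4 pixels, 2 candidate object labels, uniform prior belief.
belief = np.full((4, 4, 2), 0.5)

# Hypothetical measurement: informative everywhere except pixel (2, 3),
# where the evidence does not favor either label.
likelihood = np.tile(np.array([0.9, 0.1]), (4, 4, 1))
likelihood[2, 3] = [0.5, 0.5]

# One recursive filter step, then pick where to look next.
belief = update(predict(belief), likelihood)
target = next_gaze_target(belief)
```

In this toy run the belief at every pixel sharpens to (0.9, 0.1) except at pixel (2, 3), which stays at (0.5, 0.5) and is therefore selected as the next gaze target. Repeating the predict and update steps on each new frame gives the recursive behavior; a fuller model would also need saccade timing and object-level units, which this sketch leaves out.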

Source journal: Journal of Vision (Medicine: Ophthalmology)
CiteScore: 2.90
Self-citation rate: 5.60%
Articles published: 218
Review time: 3-6 weeks
Journal description: Exploring all aspects of biological visual function, including spatial vision, perception, low vision, color vision and more, spanning the fields of neuroscience, psychology and psychophysics.