Vito Mengers, Nicolas Roth, Oliver Brock, Klaus Obermayer, Martin Rolfs
arXiv - QuanBio - Neurons and Cognition, 2024-08-02. https://doi.org/arxiv-2408.01322
A Robotics-Inspired Scanpath Model Reveals the Importance of Uncertainty and Semantic Object Cues for Gaze Guidance in Dynamic Scenes
How we perceive objects around us depends on what we actively attend to, yet
our eye movements depend on the perceived objects. Still, object segmentation
and gaze behavior are typically treated as two independent processes. Drawing
on an information processing pattern from robotics, we present a mechanistic
model that simulates these processes for dynamic real-world scenes. Our
image-computable model uses the current scene segmentation for object-based
saccadic decision-making while using the foveated object to refine its scene
segmentation recursively. To model this refinement, we use a Bayesian filter,
which also provides an uncertainty estimate for the segmentation that we use to
guide active scene exploration. We demonstrate that this model closely
resembles observers' free-viewing behavior, as measured by scanpath statistics:
foveation duration and saccade amplitude distributions, which were used for
parameter fitting, and higher-level statistics that were not. The latter
include the balance of object detections, inspections, and returns, and a
delay of return saccades that emerges without an explicit implementation of
temporal inhibition of return. Extensive simulations and ablation studies show that
uncertainty promotes balanced exploration and that semantic object cues are
crucial to form the perceptual units used in object-based attention. Moreover,
we show how our model's modular design allows for extensions, such as
incorporating saccadic momentum or pre-saccadic attention, to further align its
output with human scanpaths.
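
The loop described above (recursive Bayesian refinement of the segmentation, with the resulting uncertainty guiding where to look next) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a discrete Bayes filter over object labels for a handful of image cells, Shannon entropy as the uncertainty measure, and a hypothetical rule that foveated cells yield more reliable observations than peripheral ones. All function names, the reliability values, and the target-selection rule are illustrative assumptions.

```python
import math
from typing import List, Sequence, Tuple

Belief = List[float]  # probability of each object label for one image cell


def bayes_update(prior: Sequence[float], likelihood: Sequence[float]) -> Belief:
    """Measurement update of a discrete Bayes filter: posterior is proportional to likelihood * prior."""
    unnorm = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnorm)
    if total == 0.0:  # observation carries no usable evidence; keep the prior
        return list(prior)
    return [u / total for u in unnorm]


def entropy(belief: Sequence[float]) -> float:
    """Shannon entropy (in nats), used here as the uncertainty estimate."""
    return -sum(p * math.log(p) for p in belief if p > 0.0)


def observation_likelihood(observed: int, n_labels: int, reliability: float) -> Belief:
    """Likelihood of each true label given a noisy observed label.

    Assumption: `reliability` is the probability that the observation is correct,
    and errors are spread evenly over the remaining labels.
    """
    off = (1.0 - reliability) / (n_labels - 1)
    return [reliability if k == observed else off for k in range(n_labels)]


def next_target(beliefs: List[Belief]) -> int:
    """Pick the object label whose cells are, on average, most uncertain (hypothetical rule)."""
    n_labels = len(beliefs[0])
    totals = [0.0] * n_labels
    counts = [0] * n_labels
    for b in beliefs:
        label = max(range(n_labels), key=lambda k: b[k])  # most probable label of this cell
        totals[label] += entropy(b)
        counts[label] += 1
    scores = [totals[k] / counts[k] if counts[k] else -1.0 for k in range(n_labels)]
    return max(range(n_labels), key=lambda k: scores[k])


def run_demo() -> Tuple[List[Belief], int]:
    """Four cells, two objects; cells 0 and 1 are foveated, cells 2 and 3 are peripheral."""
    n_labels = 2
    beliefs = [[0.5, 0.5] for _ in range(4)]  # uninformed prior
    observed = [0, 0, 1, 1]                   # noisy label observed in each cell
    foveated = {0, 1}
    for i, label in enumerate(observed):
        reliability = 0.9 if i in foveated else 0.6  # assumed values
        likelihood = observation_likelihood(label, n_labels, reliability)
        beliefs[i] = bayes_update(beliefs[i], likelihood)
    return beliefs, next_target(beliefs)
```

In this toy setting, the peripheral object ends up with the flatter belief and therefore the higher entropy, so `run_demo()` selects it as the next target. The same update could be repeated frame by frame to mimic recursive refinement in a dynamic scene.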