Complexity of mental geometry for 3D pose perception

IF 1.5 | Zone 4, Psychology | Q4 Neurosciences | Vision Research | Pub Date: 2024-06-08 | DOI: 10.1016/j.visres.2024.108438
Crystal Guo, Akihito Maruya, Qasim Zaidi
Article URL: https://www.sciencedirect.com/science/article/pii/S0042698924000828
Citations: 0

Abstract

Biological visual systems rely on pose estimation of 3D objects to navigate and interact with their environment, but the neural mechanisms and computations for inferring 3D poses from 2D retinal images are only partially understood, especially where stereo information is missing. We previously presented evidence that humans infer the poses of 3D objects lying centered on the ground by using the geometrical back-transform from retinal images to viewer-centered world coordinates. This model explained the almost veridical estimation of poses in real scenes and the illusory rotation of poses in obliquely viewed pictures, which includes the “pointing out of the picture” phenomenon. Here we test this model for more varied configurations and find that it needs to be augmented. Five observers estimated poses of sloped, elevated, or off-center 3D sticks in each of 16 different poses displayed on a monitor in frontal and oblique views. Pose estimates in scenes and pictures showed remarkable accuracy and agreement between observers, but with a systematic fronto-parallel bias for oblique poses similar to the ground condition. The retinal projection of the pose of an object sloped with respect to the ground depends on the slope. We show that observers’ estimates can be explained by the back-transform derived for close to the correct slope. The back-transform explanation also applies to obliquely viewed pictures and to off-center objects and elevated objects, making it more likely that observers use internalized perspective geometry to make 3D pose inferences while actively incorporating inferences about other aspects of object placement.
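
To make the idea of a back-transform concrete, here is a minimal sketch. It is not the authors' model or code. It assumes the simplest case mentioned in the abstract: a stick lying on the ground at the center of view. It further assumes weak perspective (the stick is small relative to the viewing distance), a pose angle `omega` measured from the fronto-parallel (left-right) axis, and a camera elevation `phi` above the ground plane. Under those assumptions the depth component of the stick is foreshortened by sin(phi), and the back-transform undoes that foreshortening. The function names, the 15 degree elevation, and the evenly spaced poses are hypothetical choices for illustration.

```python
import math


def project_pose(omega_deg: float, phi_deg: float) -> float:
    """Image orientation (degrees) of a ground-plane stick with pose omega,
    viewed from elevation phi. Assumes a centered object and weak perspective."""
    omega = math.radians(omega_deg)
    phi = math.radians(phi_deg)
    # Ground direction is (cos omega, sin omega); the depth axis shrinks by sin(phi).
    return math.degrees(math.atan2(math.sin(omega) * math.sin(phi), math.cos(omega)))


def back_transform(theta_deg: float, phi_deg: float) -> float:
    """Ground-plane pose (degrees) recovered from image orientation theta,
    given an assumed elevation phi in (0, 90] degrees."""
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    # Undo the foreshortening: stretch the vertical image component by 1 / sin(phi).
    return math.degrees(math.atan2(math.sin(theta) / math.sin(phi), math.cos(theta)))


if __name__ == "__main__":
    elevation = 15.0  # hypothetical camera elevation in degrees
    for i in range(16):  # 16 evenly spaced poses, an illustrative assumption
        pose = i * 22.5
        image_angle = project_pose(pose, elevation)
        recovered = back_transform(image_angle, elevation) % 360.0
        print(f"pose {pose:6.1f}  image {image_angle:8.2f}  recovered {recovered:6.1f}")
```

In this simplified setting the round trip recovers the pose only when the elevation passed to `back_transform` matches the one used for projection. Sloped, elevated, or off-center objects, which are the subject of the study, would need a more general projection than this sketch provides.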

Source journal
Vision Research (Medicine: Neuroscience)
CiteScore: 3.70
Self-citation rate: 16.70%
Articles published: 111
Review time: 66 days
About the journal: Vision Research is a journal devoted to the functional aspects of human, vertebrate and invertebrate vision and publishes experimental and observational studies, reviews, and theoretical and computational analyses. Vision Research also publishes clinical studies relevant to normal visual function and basic research relevant to visual dysfunction or its clinical investigation. The phrase "functional aspects of vision" is interpreted broadly, ranging from molecular and cellular function to perception and behavior. Detailed descriptions are encouraged but enough introductory background should be included for non-specialists. Theoretical and computational papers should give a sense of order to the facts or point to new verifiable observations. Papers dealing with questions in the history of vision science should stress the development of ideas in the field.