Preliminary Evidence for Global Properties in Human Listeners During Natural Auditory Scene Perception

Open Mind · Q1 (Social Sciences) · Pub Date: 2024-03-26 · eCollection Date: 2024-01-01 · DOI: 10.1162/opmi_a_00131
Margaret A McMullin, Rohit Kumar, Nathan C Higgins, Brian Gygi, Mounya Elhilali, Joel S Snyder
{"title":"人类听者在自然听觉场景感知过程中全局属性的初步证据","authors":"Margaret A McMullin, Rohit Kumar, Nathan C Higgins, Brian Gygi, Mounya Elhilali, Joel S Snyder","doi":"10.1162/opmi_a_00131","DOIUrl":null,"url":null,"abstract":"<p><p>Theories of auditory and visual scene analysis suggest the perception of scenes relies on the identification and segregation of objects within it, resembling a detail-oriented processing style. However, a more global process may occur while analyzing scenes, which has been evidenced in the visual domain. It is our understanding that a similar line of research has not been explored in the auditory domain; therefore, we evaluated the contributions of high-level global and low-level acoustic information to auditory scene perception. An additional aim was to increase the field's ecological validity by using and making available a new collection of high-quality auditory scenes. Participants rated scenes on 8 global properties (e.g., open vs. enclosed) and an acoustic analysis evaluated which low-level features predicted the ratings. We submitted the acoustic measures and average ratings of the global properties to separate exploratory factor analyses (EFAs). The EFA of the acoustic measures revealed a seven-factor structure explaining 57% of the variance in the data, while the EFA of the global property measures revealed a two-factor structure explaining 64% of the variance in the data. Regression analyses revealed each global property was predicted by at least one acoustic variable (R<sup>2</sup> = 0.33-0.87). These findings were extended using deep neural network models where we examined correlations between human ratings of global properties and deep embeddings of two computational models: an object-based model and a scene-based model. The results support that participants' ratings are more strongly explained by a global analysis of the scene setting, though the relationship between scene perception and auditory perception is multifaceted, with differing correlation patterns evident between the two models. Taken together, our results provide evidence for the ability to perceive auditory scenes from a global perspective. Some of the acoustic measures predicted ratings of global scene perception, suggesting representations of auditory objects may be transformed through many stages of processing in the ventral auditory stream, similar to what has been proposed in the ventral visual stream. These findings and the open availability of our scene collection will make future studies on perception, attention, and memory for natural auditory scenes possible.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"8 ","pages":"333-365"},"PeriodicalIF":0.0000,"publicationDate":"2024-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10990578/pdf/","citationCount":"0","resultStr":"{\"title\":\"Preliminary Evidence for Global Properties in Human Listeners During Natural Auditory Scene Perception.\",\"authors\":\"Margaret A McMullin, Rohit Kumar, Nathan C Higgins, Brian Gygi, Mounya Elhilali, Joel S Snyder\",\"doi\":\"10.1162/opmi_a_00131\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Theories of auditory and visual scene analysis suggest the perception of scenes relies on the identification and segregation of objects within it, resembling a detail-oriented processing style. However, a more global process may occur while analyzing scenes, which has been evidenced in the visual domain. 
It is our understanding that a similar line of research has not been explored in the auditory domain; therefore, we evaluated the contributions of high-level global and low-level acoustic information to auditory scene perception. An additional aim was to increase the field's ecological validity by using and making available a new collection of high-quality auditory scenes. Participants rated scenes on 8 global properties (e.g., open vs. enclosed) and an acoustic analysis evaluated which low-level features predicted the ratings. We submitted the acoustic measures and average ratings of the global properties to separate exploratory factor analyses (EFAs). The EFA of the acoustic measures revealed a seven-factor structure explaining 57% of the variance in the data, while the EFA of the global property measures revealed a two-factor structure explaining 64% of the variance in the data. Regression analyses revealed each global property was predicted by at least one acoustic variable (R<sup>2</sup> = 0.33-0.87). These findings were extended using deep neural network models where we examined correlations between human ratings of global properties and deep embeddings of two computational models: an object-based model and a scene-based model. The results support that participants' ratings are more strongly explained by a global analysis of the scene setting, though the relationship between scene perception and auditory perception is multifaceted, with differing correlation patterns evident between the two models. Taken together, our results provide evidence for the ability to perceive auditory scenes from a global perspective. Some of the acoustic measures predicted ratings of global scene perception, suggesting representations of auditory objects may be transformed through many stages of processing in the ventral auditory stream, similar to what has been proposed in the ventral visual stream. These findings and the open availability of our scene collection will make future studies on perception, attention, and memory for natural auditory scenes possible.</p>\",\"PeriodicalId\":32558,\"journal\":{\"name\":\"Open Mind\",\"volume\":\"8 \",\"pages\":\"333-365\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-03-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10990578/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Open Mind\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1162/opmi_a_00131\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2024/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"Q1\",\"JCRName\":\"Social Sciences\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Open Mind","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1162/opmi_a_00131","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/1/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"Social Sciences","Score":null,"Total":0}
Citations: 0

Abstract


Theories of auditory and visual scene analysis suggest the perception of scenes relies on the identification and segregation of objects within them, resembling a detail-oriented processing style. However, a more global process may occur while analyzing scenes, as has been evidenced in the visual domain. To our knowledge, a similar line of research has not been explored in the auditory domain; therefore, we evaluated the contributions of high-level global and low-level acoustic information to auditory scene perception. An additional aim was to increase the field's ecological validity by using and making available a new collection of high-quality auditory scenes. Participants rated scenes on eight global properties (e.g., open vs. enclosed) and an acoustic analysis evaluated which low-level features predicted the ratings. We submitted the acoustic measures and average ratings of the global properties to separate exploratory factor analyses (EFAs). The EFA of the acoustic measures revealed a seven-factor structure explaining 57% of the variance in the data, while the EFA of the global property measures revealed a two-factor structure explaining 64% of the variance in the data. Regression analyses revealed each global property was predicted by at least one acoustic variable (R² = 0.33-0.87). These findings were extended using deep neural network models, in which we examined correlations between human ratings of global properties and deep embeddings of two computational models: an object-based model and a scene-based model. The results suggest that participants' ratings are more strongly explained by a global analysis of the scene setting, though the relationship between scene perception and auditory perception is multifaceted, with differing correlation patterns evident between the two models. Taken together, our results provide evidence for the ability to perceive auditory scenes from a global perspective. Some of the acoustic measures predicted ratings of global scene perception, suggesting representations of auditory objects may be transformed through many stages of processing in the ventral auditory stream, similar to what has been proposed in the ventral visual stream. These findings and the open availability of our scene collection will make future studies on perception, attention, and memory for natural auditory scenes possible.
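
The analysis pipeline the abstract describes (an EFA over acoustic measures, then per-property regressions) can be sketched in a few lines. This is a minimal illustration under assumed data, not the authors' code: the array shapes, the "openness" variable, and all names here are hypothetical placeholders.

```python
# Minimal sketch of the abstract's analysis pipeline (hypothetical data,
# not the authors' code): EFA over acoustic measures, then a regression
# predicting one global-property rating from the acoustic variables.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

n_scenes, n_measures = 100, 20  # hypothetical sizes
acoustic = rng.normal(size=(n_scenes, n_measures))  # low-level acoustic measures per scene
openness = rng.normal(size=n_scenes)                # mean "open vs. enclosed" ratings per scene

# Exploratory factor analysis of the acoustic measures; the paper reports
# a seven-factor structure explaining 57% of the variance.
efa = FactorAnalysis(n_components=7, rotation="varimax")
factor_scores = efa.fit_transform(acoustic)

# Regress one global-property rating on the acoustic measures; the paper
# reports R^2 between 0.33 and 0.87 across the eight properties.
model = LinearRegression().fit(acoustic, openness)
print(f"R^2 for openness: {model.score(acoustic, openness):.2f}")
```

The deep-network comparison described afterward amounts to the same kind of analysis: correlating the human ratings, scene by scene, with embeddings drawn from an object-based and a scene-based model.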

Journal
Open Mind (Social Sciences: Linguistics and Language)
CiteScore: 3.20
Self-citation rate: 0.00%
Articles published: 15
Review time: 53 weeks
Latest articles in this journal
Approximating Human-Level 3D Visual Inferences With Deep Neural Networks.
Prosodic Cues Support Inferences About the Question's Pedagogical Intent.
The Double Standard of Ownership.
Combination and Differentiation Theories of Categorization: A Comparison Using Participants' Categorization Descriptions.
Investigating Sensitivity to Shared Information and Personal Experience in Children's Use of Majority Information.