Maddie Cusimano , Luke B. Hewitt , Josh H. McDermott
{"title":"Listening with generative models","authors":"Maddie Cusimano , Luke B. Hewitt , Josh H. McDermott","doi":"10.1016/j.cognition.2024.105874","DOIUrl":null,"url":null,"abstract":"<div><p>Perception has long been envisioned to use an internal model of the world to explain the causes of sensory signals. However, such accounts have historically not been testable, typically requiring intractable search through the space of possible explanations. Using auditory scenes as a case study, we leveraged contemporary computational tools to infer explanations of sounds in a candidate internal generative model of the auditory world (ecologically inspired audio synthesizers). Model inferences accounted for many classic illusions. Unlike traditional accounts of auditory illusions, the model is applicable to any sound, and exhibited human-like perceptual organization for real-world sound mixtures. The combination of stimulus-computability and interpretable model structure enabled ‘rich falsification’, revealing additional assumptions about sound generation needed to account for perception. The results show how generative models can account for the perception of both classic illusions and everyday sensory signals, and illustrate the opportunities and challenges involved in incorporating them into theories of perception.</p></div>","PeriodicalId":48455,"journal":{"name":"Cognition","volume":"253 ","pages":"Article 105874"},"PeriodicalIF":2.8000,"publicationDate":"2024-08-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0010027724001604/pdfft?md5=12a6854cd3586854a262c85e80572130&pid=1-s2.0-S0010027724001604-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cognition","FirstCategoryId":"102","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0010027724001604","RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
引用次数: 0
Abstract
Perception has long been envisioned to use an internal model of the world to explain the causes of sensory signals. However, such accounts have historically not been testable, typically requiring intractable search through the space of possible explanations. Using auditory scenes as a case study, we leveraged contemporary computational tools to infer explanations of sounds in a candidate internal generative model of the auditory world (ecologically inspired audio synthesizers). Model inferences accounted for many classic illusions. Unlike traditional accounts of auditory illusions, the model is applicable to any sound, and exhibited human-like perceptual organization for real-world sound mixtures. The combination of stimulus-computability and interpretable model structure enabled ‘rich falsification’, revealing additional assumptions about sound generation needed to account for perception. The results show how generative models can account for the perception of both classic illusions and everyday sensory signals, and illustrate the opportunities and challenges involved in incorporating them into theories of perception.
期刊介绍:
Cognition is an international journal that publishes theoretical and experimental papers on the study of the mind. It covers a wide variety of subjects concerning all the different aspects of cognition, ranging from biological and experimental studies to formal analysis. Contributions from the fields of psychology, neuroscience, linguistics, computer science, mathematics, ethology and philosophy are welcome in this journal provided that they have some bearing on the functioning of the mind. In addition, the journal serves as a forum for discussion of social and political aspects of cognitive science.