Preschoolers' Comprehension of Functional Metaphors
Pub Date: 2024-07-19 | eCollection Date: 2024-01-01 | DOI: 10.1162/opmi_a_00152
Rebecca Zhu, Mariel K Goddu, Lily Zihui Zhu, Alison Gopnik
Previous work suggests that preschoolers often misunderstand metaphors. However, some recent studies demonstrate that preschoolers can represent abstract relations, suggesting that the cognitive foundations of metaphor comprehension may develop earlier than previously believed. The present experiments used novel paradigms to explore whether preschoolers (N = 200; 4-5 years; 100 males, 100 females; predominantly White) can understand metaphors based on abstract, functional similarities. In Experiment 1, preschoolers and adults (N = 64; 18-41 years; 25 males, 39 females; predominantly White) rated functional metaphors (e.g., "Roofs are hats"; "Tires are shoes") as "smarter" than nonsense statements (e.g., "Boats are skirts"; "Pennies are sunglasses") in a metalinguistic judgment task (d = .42 in preschoolers; d = 3.06 in adults). In Experiment 2, preschoolers preferred functional explanations (e.g., "Both keep you dry") over perceptual explanations (e.g., "Both have pointy tops") when interpreting functional metaphors (e.g., "Roofs are hats") (d = .99). In Experiment 3, preschoolers preferred functional metaphors (e.g., "Roofs are hats") over nonsense statements (e.g., "Roofs are scissors") when prompted to select the "better" utterance (d = 1.25). Moreover, over a quarter of preschoolers in Experiment 1 and half of preschoolers in Experiment 3 explicitly articulated functional similarities when justifying their responses, and the performance of these subsets of children drove the success of the entire sample in both experiments. These findings demonstrate that preschoolers can understand metaphors based on abstract, functional similarities.
{"title":"Preschoolers' Comprehension of Functional Metaphors.","authors":"Rebecca Zhu, Mariel K Goddu, Lily Zihui Zhu, Alison Gopnik","doi":"10.1162/opmi_a_00152","DOIUrl":"10.1162/opmi_a_00152","url":null,"abstract":"<p><p>Previous work suggests that preschoolers often misunderstand metaphors. However, some recent studies demonstrate that preschoolers can represent abstract relations, suggesting that the cognitive foundations of metaphor comprehension may develop earlier than previously believed. The present experiments used novel paradigms to explore whether preschoolers (<i>N</i> = 200; 4-5 years; 100 males, 100 females; predominantly White) can understand metaphors based on abstract, functional similarities. In Experiment 1, preschoolers and adults (<i>N</i> = 64; 18-41 years; 25 males, 39 females; predominantly White) rated functional metaphors (e.g., \"Roofs are hats\"; \"Tires are shoes\") as \"smarter\" than nonsense statements (e.g., \"Boats are skirts\"; \"Pennies are sunglasses\") in a metalinguistic judgment task (<i>d</i> = .42 in preschoolers; <i>d</i> = 3.06 in adults). In Experiment 2, preschoolers preferred functional explanations (e.g., \"Both keep you dry\") over perceptual explanations (e.g., \"Both have pointy tops\") when interpreting functional metaphors (e.g., \"Roofs are hats\") (<i>d</i> = .99). In Experiment 3, preschoolers preferred functional metaphors (e.g., \"Roofs are hats\") over nonsense statements (e.g., \"Roofs are scissors\") when prompted to select the \"better\" utterance (<i>d</i> = 1.25). Moreover, over a quarter of preschoolers in Experiment 1 and half of preschoolers in Experiment 3 explicitly articulated functional similarities when justifying their responses, and the performance of these subsets of children drove the success of the entire sample in both experiments. These findings demonstrate that preschoolers can understand metaphors based on abstract, functional similarities.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"8 ","pages":"924-949"},"PeriodicalIF":0.0,"publicationDate":"2024-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11285420/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141793704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The Wise Mind Balances the Abstract and the Concrete
Pub Date: 2024-06-28 | eCollection Date: 2024-01-01 | DOI: 10.1162/opmi_a_00149
Igor Grossmann, Johanna Peetz, Anna Dorfman, Amanda Rotella, Roger Buehler
We explored how individuals' mental representations of complex and uncertain situations impact their ability to reason wisely. To this end, we introduce situated methods to capture abstract and concrete mental representations and the switching between them when reflecting on social challenges. Using these methods, we evaluated the alignment of abstractness and concreteness with four integral facets of wisdom: intellectual humility, open-mindedness, perspective-taking, and compromise-seeking. Data from North American and UK participants (N = 1,151) revealed that both abstract and concrete construals significantly contribute to wise reasoning, even when controlling for a host of relevant covariates and potential response bias. Natural language processing of unstructured texts among high (top 25%) and low (bottom 25%) wisdom participants corroborated these results: semantic networks of the high wisdom group reveal greater use of both abstract and concrete themes compared to the low wisdom group. Finally, employing a repeated strategy-choice method as an additional measure, our findings demonstrated that individuals who showed a greater balance and switching between these construal types exhibited higher wisdom. Our findings advance understanding of individual differences in mental representations and how construals shape reasoning across contexts in everyday life.
{"title":"The Wise Mind Balances the Abstract and the Concrete.","authors":"Igor Grossmann, Johanna Peetz, Anna Dorfman, Amanda Rotella, Roger Buehler","doi":"10.1162/opmi_a_00149","DOIUrl":"10.1162/opmi_a_00149","url":null,"abstract":"<p><p>We explored how individuals' mental representations of complex and uncertain situations impact their ability to reason wisely. To this end, we introduce situated methods to capture abstract and concrete mental representations and the switching between them when reflecting on social challenges. Using these methods, we evaluated the alignment of abstractness and concreteness with four integral facets of wisdom: intellectual humility, open-mindedness, perspective-taking, and compromise-seeking. Data from North American and UK participants (<i>N</i> = 1,151) revealed that both abstract and concrete construals significantly contribute to wise reasoning, even when controlling for a host of relevant covariates and potential response bias. Natural language processing of unstructured texts among high (top 25%) and low (bottom 25%) wisdom participants corroborated these results: semantic networks of the high wisdom group reveal greater use of both abstract and concrete themes compared to the low wisdom group. Finally, employing a repeated strategy-choice method as an additional measure, our findings demonstrated that individuals who showed a greater balance and switching between these construal types exhibited higher wisdom. Our findings advance understanding of individual differences in mental representations and how construals shape reasoning across contexts in everyday life.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"8 ","pages":"826-858"},"PeriodicalIF":0.0,"publicationDate":"2024-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11226238/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141555550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the Mathematical Relationship Between Contextual Probability and N400 Amplitude
Pub Date: 2024-06-28 | eCollection Date: 2024-01-01 | DOI: 10.1162/opmi_a_00150
James A Michaelov, Benjamin K Bergen
Accounts of human language comprehension propose different mathematical relationships between the contextual probability of a word and how difficult it is to process, including linear, logarithmic, and super-logarithmic ones. However, the empirical evidence favoring any of these over the others is mixed, appearing to vary depending on the index of processing difficulty used and the approach taken to calculate contextual probability. To help disentangle these results, we focus on the mathematical relationship between corpus-derived contextual probability and the N400, a neural index of processing difficulty. Specifically, we use 37 contemporary transformer language models to calculate the contextual probability of stimuli from 6 experimental studies of the N400, and test whether N400 amplitude is best predicted by a linear, logarithmic, super-logarithmic, or sub-logarithmic transformation of the probabilities calculated using these language models, as well as combinations of these transformed metrics. We replicate the finding that on some datasets, a combination of linearly and logarithmically-transformed probability can predict N400 amplitude better than either metric alone. In addition, we find that overall, the best single predictor of N400 amplitude is sub-logarithmically-transformed probability, which for almost all language models and datasets explains all the variance in N400 amplitude otherwise explained by the linear and logarithmic transformations. This is a novel finding that is not predicted by any current theoretical accounts, and thus one that we argue is likely to play an important role in increasing our understanding of how the statistical regularities of language impact language comprehension.
{"title":"On the Mathematical Relationship Between Contextual Probability and N400 Amplitude.","authors":"James A Michaelov, Benjamin K Bergen","doi":"10.1162/opmi_a_00150","DOIUrl":"10.1162/opmi_a_00150","url":null,"abstract":"<p><p>Accounts of human language comprehension propose different mathematical relationships between the contextual probability of a word and how difficult it is to process, including linear, logarithmic, and super-logarithmic ones. However, the empirical evidence favoring any of these over the others is mixed, appearing to vary depending on the index of processing difficulty used and the approach taken to calculate contextual probability. To help disentangle these results, we focus on the mathematical relationship between corpus-derived contextual probability and the N400, a neural index of processing difficulty. Specifically, we use 37 contemporary transformer language models to calculate the contextual probability of stimuli from 6 experimental studies of the N400, and test whether N400 amplitude is best predicted by a linear, logarithmic, super-logarithmic, or sub-logarithmic transformation of the probabilities calculated using these language models, as well as combinations of these transformed metrics. We replicate the finding that on some datasets, a combination of linearly and logarithmically-transformed probability can predict N400 amplitude better than either metric alone. In addition, we find that overall, the best single predictor of N400 amplitude is sub-logarithmically-transformed probability, which for almost all language models and datasets explains all the variance in N400 amplitude otherwise explained by the linear and logarithmic transformations. This is a novel finding that is not predicted by any current theoretical accounts, and thus one that we argue is likely to play an important role in increasing our understanding of how the statistical regularities of language impact language comprehension.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"8 ","pages":"859-897"},"PeriodicalIF":0.0,"publicationDate":"2024-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11285424/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141793703","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Continuity in Logical Development: Domain-General Disjunctive Inference by Toddlers
Pub Date: 2024-06-28 | eCollection Date: 2024-01-01 | DOI: 10.1162/opmi_a_00148
Nicolò Cesana-Arlotti, Justin Halberda
Children grow up surrounded by opportunities to learn (the language of their community, the movements of their body, other people's preferences and mental lives, games, social norms, etc.). Here, we find that toddlers (N = 36; age range 2.3-3.2 years) rely on a logical reasoning strategy, Disjunctive Inference (i.e., A OR B, A is ruled out, THEREFORE, B), across a variety of situations, all before they have any formal education or extensive experience with words for expressing logical meanings. In learning new words, learning new facts about a person, and finding the winner of a race, toddlers systematically consider and reject competitors before deciding who must be the winner. This suggests that toddlers may have a general-purpose logical reasoning tool that they can use in any situation.
{"title":"A Continuity in Logical Development: Domain-General Disjunctive Inference by Toddlers.","authors":"Nicolò Cesana-Arlotti, Justin Halberda","doi":"10.1162/opmi_a_00148","DOIUrl":"10.1162/opmi_a_00148","url":null,"abstract":"<p><p>Children grow up surrounded by opportunities to learn (the language of their community, the movements of their body, other people's preferences and mental lives, games, social norms, etc.). Here, we find that toddlers (N = 36; age range 2.3-3.2 years) rely on a logical reasoning strategy, Disjunctive Inference (i.e., A OR B, A is ruled out, THEREFORE, B), across a variety of situations, all before they have any formal education or extensive experience with words for expressing logical meanings. In learning new words, learning new facts about a person, and finding the winner of a race, toddlers systematically consider and reject competitors before deciding who must be the winner. This suggests that toddlers may have a general-purpose logical reasoning tool that they can use in any situation.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"8 ","pages":"809-825"},"PeriodicalIF":0.0,"publicationDate":"2024-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11226237/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141556449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Conducting Developmental Research Online vs. In-Person: A Meta-Analysis
Pub Date: 2024-06-12 | eCollection Date: 2024-01-01 | DOI: 10.1162/opmi_a_00147
Aaron Chuey, Veronica Boyce, Anjie Cao, Michael C Frank
An increasing number of psychological experiments with children are being conducted using online platforms, in part due to the COVID-19 pandemic. Individual replications have compared the findings of particular experiments online and in-person, but the general effect of data collection method on data collected from children is still unknown. Therefore, the goal of the current meta-analysis is to estimate the average difference in effect size for developmental studies conducted online compared to the same studies conducted in-person. Our pre-registered analysis includes 211 effect sizes calculated from 30 papers with 3282 children, ranging in age from four months to six years. The estimated effect size for studies conducted online was slightly smaller than for their counterparts conducted in-person, a difference of d = -.05, but this difference was not significant, 95% CI = [-.17, .07]. We examined several potential moderators of the effect of online testing, including the role of dependent measure (looking vs. verbal), online study method (moderated vs. unmoderated), and age, but none of these were significant. The literature to date thus suggests, on average, small differences in results between in-person and online experimentation.
{"title":"Conducting Developmental Research Online vs. In-Person: A Meta-Analysis.","authors":"Aaron Chuey, Veronica Boyce, Anjie Cao, Michael C Frank","doi":"10.1162/opmi_a_00147","DOIUrl":"10.1162/opmi_a_00147","url":null,"abstract":"<p><p>An increasing number of psychological experiments with children are being conducted using online platforms, in part due to the COVID-19 pandemic. Individual replications have compared the findings of particular experiments online and in-person, but the general effect of data collection method on data collected from children is still unknown. Therefore, the goal of the current meta-analysis is to estimate the average difference in effect size for developmental studies conducted online compared to the same studies conducted in-person. Our pre-registered analysis includes 211 effect sizes calculated from 30 papers with 3282 children, ranging in age from four months to six years. The estimated effect size for studies conducted online was slightly smaller than for their counterparts conducted in-person, a difference of <i>d</i> = -.05, but this difference was not significant, 95% CI = [-.17, .07]. We examined several potential moderators of the effect of online testing, including the role of dependent measure (looking vs verbal), online study method (moderated vs unmoderated), and age, but none of these were significant. The literature to date thus suggests-on average-small differences in results between in-person and online experimentation.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"8 ","pages":"795-808"},"PeriodicalIF":0.0,"publicationDate":"2024-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11219065/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141493713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Phone in a Basket Looks Like a Knife in a Cup: Role-Filler Independence in Visual Processing
Pub Date: 2024-06-12 | eCollection Date: 2024-01-01 | DOI: 10.1162/opmi_a_00146
Alon Hafri, Michael F Bonner, Barbara Landau, Chaz Firestone
When a piece of fruit is in a bowl, and the bowl is on a table, we appreciate not only the individual objects and their features, but also the relations of "containment" and "support", which abstract away from the particular objects involved. Independent representation of roles (e.g., containers vs. supporters) and "fillers" of those roles (e.g., bowls vs. cups, tables vs. chairs) is a core principle of language and higher-level reasoning. But does such role-filler independence also arise in automatic visual processing? Here, we show that it does, by exploring a surprising error that such independence can produce. In four experiments, participants saw a stream of images containing different objects arranged in force-dynamic relations (e.g., a phone contained in a basket, a marker resting on a garbage can, or a knife sitting in a cup). Participants had to respond to a single target image (e.g., a phone in a basket) within a stream of distractors presented under time constraints. Surprisingly, even though participants completed this task quickly and accurately, they false-alarmed more often to images matching the target's relational category than to those that did not, even when those images involved completely different objects. In other words, participants searching for a phone in a basket were more likely to mistakenly respond to a knife in a cup than to a marker on a garbage can. Follow-up experiments ruled out strategic responses and also controlled for various confounding image features. We suggest that visual processing represents relations abstractly, in ways that separate roles from fillers.
{"title":"A Phone in a Basket Looks Like a Knife in a Cup: Role-Filler Independence in Visual Processing.","authors":"Alon Hafri, Michael F Bonner, Barbara Landau, Chaz Firestone","doi":"10.1162/opmi_a_00146","DOIUrl":"10.1162/opmi_a_00146","url":null,"abstract":"<p><p>When a piece of fruit is in a bowl, and the bowl is on a table, we appreciate not only the individual objects and their features, but also the relations <i>containment</i> and <i>support</i>, which abstract away from the particular objects involved. Independent representation of roles (e.g., containers vs. supporters) and \"fillers\" of those roles (e.g., bowls vs. cups, tables vs. chairs) is a core principle of language and higher-level reasoning. But does such role-filler independence also arise in automatic visual processing? Here, we show that it does, by exploring a surprising error that such independence can produce. In four experiments, participants saw a stream of images containing different objects arranged in force-dynamic relations-e.g., a phone contained in a basket, a marker resting on a garbage can, or a knife sitting in a cup. Participants had to respond to a single target image (e.g., a phone in a basket) within a stream of distractors presented under time constraints. Surprisingly, even though participants completed this task quickly and accurately, they false-alarmed more often to images matching the target's relational category than to those that did not-even when those images involved completely different objects. In other words, participants searching for a phone in a basket were more likely to mistakenly respond to a knife in a cup than to a marker on a garbage can. Follow-up experiments ruled out strategic responses and also controlled for various confounding image features. We suggest that visual processing represents relations abstractly, in ways that separate roles from fillers.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"8 ","pages":"766-794"},"PeriodicalIF":0.0,"publicationDate":"2024-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11219067/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141493712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Unconscious Perception of Vernier Offsets
Pub Date: 2024-06-04 | eCollection Date: 2024-01-01 | DOI: 10.1162/opmi_a_00145
Pietro Amerio, Matthias Michel, Stephan Goerttler, Megan A K Peters, Axel Cleeremans
The comparison between conscious and unconscious perception is a cornerstone of consciousness science. However, most studies reporting above-chance discrimination of unseen stimuli do not control for criterion biases when assessing awareness. We tested whether observers can discriminate subjectively invisible offsets of Vernier stimuli when visibility is probed using a bias-free task. To reduce visibility, stimuli were either backward masked or presented for very brief durations (1-3 milliseconds) using a modern-day tachistoscope. We found some behavioral indicators of perception without awareness, yet no conclusive evidence thereof. To seek more decisive proof, we simulated a series of Bayesian observer models, including some that produce visibility judgements alongside type-1 judgements. Our data are best accounted for by observers with slightly suboptimal conscious access to sensory evidence. Overall, the stimuli and visibility manipulations employed here induced mild instances of blindsight-like behavior, making them attractive candidates for future investigation of this phenomenon.
{"title":"Unconscious Perception of Vernier Offsets.","authors":"Pietro Amerio, Matthias Michel, Stephan Goerttler, Megan A K Peters, Axel Cleeremans","doi":"10.1162/opmi_a_00145","DOIUrl":"10.1162/opmi_a_00145","url":null,"abstract":"<p><p>The comparison between conscious and unconscious perception is a cornerstone of consciousness science. However, most studies reporting above-chance discrimination of unseen stimuli do not control for criterion biases when assessing awareness. We tested whether observers can discriminate subjectively invisible offsets of Vernier stimuli when visibility is probed using a bias-free task. To reduce visibility, stimuli were either backward masked or presented for very brief durations (1-3 milliseconds) using a modern-day Tachistoscope. We found some behavioral indicators of perception without awareness, and yet, no conclusive evidence thereof. To seek more decisive proof, we simulated a series of Bayesian observer models, including some that produce visibility judgements alongside type-1 judgements. Our data are best accounted for by observers with slightly suboptimal conscious access to sensory evidence. Overall, the stimuli and visibility manipulations employed here induced mild instances of blindsight-like behavior, making them attractive candidates for future investigation of this phenomenon.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"8 ","pages":"739-765"},"PeriodicalIF":0.0,"publicationDate":"2024-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11185422/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141421204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Large Language Models and the Wisdom of Small Crowds
Pub Date: 2024-05-20 | eCollection Date: 2024-01-01 | DOI: 10.1162/opmi_a_00144
Sean Trott
Recent advances in Large Language Models (LLMs) have raised the question of replacing human subjects with LLM-generated data. While some believe that LLMs capture the "wisdom of the crowd" due to their vast training data, empirical evidence for this hypothesis remains scarce. We present a novel methodological framework to test this: the "number needed to beat" (NNB), which measures how many humans are needed for a sample's quality to rival the quality achieved by GPT-4, a state-of-the-art LLM. In a series of pre-registered experiments, we collect novel human data and demonstrate the utility of this method for four psycholinguistic datasets for English. We find that NNB > 1 for each dataset, but also that NNB varies across tasks (and in some cases is quite small, e.g., 2). We also introduce two "centaur" methods for combining LLM and human data, which outperform both stand-alone LLMs and human samples. Finally, we analyze the trade-offs in data cost and quality for each approach. While clear limitations remain, we suggest that this framework could guide decision-making about whether and how to integrate LLM-generated data into the research pipeline.
{"title":"Large Language Models and the Wisdom of Small Crowds.","authors":"Sean Trott","doi":"10.1162/opmi_a_00144","DOIUrl":"10.1162/opmi_a_00144","url":null,"abstract":"<p><p>Recent advances in Large Language Models (LLMs) have raised the question of replacing human subjects with LLM-generated data. While some believe that LLMs capture the \"wisdom of the crowd\"-due to their vast training data-empirical evidence for this hypothesis remains scarce. We present a novel methodological framework to test this: the \"number needed to beat\" (NNB), which measures how many humans are needed for a sample's quality to rival the quality achieved by GPT-4, a state-of-the-art LLM. In a series of pre-registered experiments, we collect novel human data and demonstrate the utility of this method for four psycholinguistic datasets for English. We find that NNB > 1 for each dataset, but also that NNB varies across tasks (and in some cases is quite small, e.g., 2). We also introduce two \"centaur\" methods for combining LLM and human data, which outperform both stand-alone LLMs and human samples. Finally, we analyze the trade-offs in data cost and quality for each approach. While clear limitations remain, we suggest that this framework could guide decision-making about whether and how to integrate LLM-generated data into the research pipeline.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"8 ","pages":"723-738"},"PeriodicalIF":0.0,"publicationDate":"2024-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11142632/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141201704","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
How Do You Know If You Were Mind Wandering? Dissociating Explicit Memories of Off Task Thought From Subjective Feelings of Inattention
Pub Date: 2024-05-10 | eCollection Date: 2024-01-01 | DOI: 10.1162/opmi_a_00142
Nathan K Mathews, Umer Bin Faiz, Nicholaus P Brosowsky
Mind wandering is a common experience in which your attention drifts away from the task at hand and toward task-unrelated thoughts. To measure mind wandering, we typically use experience sampling and retrospective self-reports, which require participants to make metacognitive judgments about their immediately preceding attentional states. In the current study, we aimed to better understand how people come to make such judgments by introducing a novel distinction between explicit memories of off task thought and subjective feelings of inattention. Across two preregistered experiments, we found that participants often indicated they were "off task" and yet had no memory of the content of their thoughts, though such content-free reports were less common than remembered experiences. Critically, remembered experiences of mind wandering and subjective feelings of inattention differed in their behavioral correlates. In Experiment 1, we found that only the frequency of remembered mind wandering varied with task demands. In contrast, only subjective feelings of inattention were associated with poor performance (Experiments 1 and 2) and individual differences in executive functioning (Experiment 2). These results suggest that the phenomenology of mind wandering may differ depending on how the experiences are brought about (e.g., executive functioning errors versus excess attentional resources), and provide preliminary evidence of the importance of measuring subjective feelings of inattention when assessing mind wandering.
{"title":"How Do You Know If You Were Mind Wandering? Dissociating Explicit Memories of Off Task Thought From Subjective Feelings of Inattention.","authors":"Nathan K Mathews, Umer Bin Faiz, Nicholaus P Brosowsky","doi":"10.1162/opmi_a_00142","DOIUrl":"10.1162/opmi_a_00142","url":null,"abstract":"<p><p>Mind wandering is a common experience in which your attention drifts away from the task at hand and toward task-unrelated thoughts. To measure mind wandering we typically use experience sampling and retrospective self-reports, which require participants to make metacognitive judgments about their immediately preceding attentional states. In the current study, we aimed to better understand how people come to make such judgments by introducing a novel distinction between explicit memories of off task thought and subjective feelings of inattention. Across two preregistered experiments, we found that participants often indicated they were \"off task\" and yet had no memory of the content of their thoughts-though, they were less common than remembered experiences. Critically, remembered experiences of mind wandering and subjective feelings of inattention differed in their behavioral correlates. In Experiment 1, we found that only the frequency of remembered mind wandering varied with task demands. In contrast, only subjective feelings of inattention were associated with poor performance (Experiments 1 and 2) and individual differences in executive functioning (Experiment 2). These results suggest that the phenomenology of mind wandering may differ depending on how the experiences are brought about (e.g., executive functioning errors versus excess attentional resources), and provide preliminary evidence of the importance of measuring subjective feelings of inattention when assessing mind wandering.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"8 ","pages":"666-687"},"PeriodicalIF":0.0,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11142633/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141200924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Instruction on the Scientific Method Provides (Some) Protection Against Illusions of Causality
Pub Date: 2024-05-10 | eCollection Date: 2024-01-01 | DOI: 10.1162/opmi_a_00141
Julie Y L Chow, Micah B Goldwater, Ben Colagiuri, Evan J Livesey
People tend to overestimate the efficacy of an ineffective treatment when they experience the treatment and its supposed outcome co-occurring frequently. This is referred to as the outcome density effect. Here, we attempted to improve the accuracy of participants' assessments of an ineffective treatment by instructing them about the scientific practice of comparing treatment effects against a relevant base-rate, i.e., the rate of the outcome when no treatment is delivered. The effect of these instructions was assessed both in a trial-by-trial contingency learning task, where cue administration was either decided by the participant (Experiments 1 & 2) or pre-determined by the experimenter (Experiment 3), and in a summary format where all information was presented on a single screen (Experiment 4). Overall, we found two means by which base-rate instructions influence efficacy ratings for the ineffective treatment: 1) when information was presented sequentially, the benefit of base-rate instructions on illusory belief was mediated by reduced sampling of cue-present trials, and 2) when information was presented in summary format, we found a direct effect of base-rate instruction on reducing causal illusion. Together, these findings suggest that simple instructions on the scientific method decreased participants' (over-)weighting of cue-outcome coincidences when making causal judgements, as well as their tendency to over-sample cue-present events. However, the effect of base-rate instructions on correcting illusory beliefs was incomplete, and participants still showed illusory causal judgements when the probability of the outcome occurring was high. Thus, simple textual information about assessing causal relationships is partially effective in influencing people's judgements of treatment efficacy, suggesting an important role for scientific instruction in debiasing cognitive errors.
{"title":"Instruction on the Scientific Method Provides (Some) Protection Against Illusions of Causality.","authors":"Julie Y L Chow, Micah B Goldwater, Ben Colagiuri, Evan J Livesey","doi":"10.1162/opmi_a_00141","DOIUrl":"10.1162/opmi_a_00141","url":null,"abstract":"<p><p>People tend to overestimate the efficacy of an ineffective treatment when they experience the treatment and its supposed outcome co-occurring frequently. This is referred to as the <i>outcome density</i> effect. Here, we attempted to improve the accuracy of participants' assessments of an ineffective treatment by instructing them about the scientific practice of comparing treatment effects against a relevant base-rate, i.e., when no treatment is delivered. The effect of these instructions was assessed in both a trial-by-trial contingency learning task, where cue administration was either decided by the participant (Experiments 1 & 2) or pre-determined by the experimenter (Experiment 3), as well as in summary format where all information was presented on a single screen (Experiment 4). Overall, we found two means by which base-rate instructions influence efficacy ratings for the ineffective treatment: 1) When information was presented sequentially, the benefit of base-rate instructions on illusory belief was mediated by reduced sampling of cue-present trials, and 2) When information was presented in summary format, we found a <i>direct</i> effect of base-rate instruction on reducing causal illusion. Together, these findings suggest that simple instructions on the scientific method were able to decrease participants' (over-)weighting of cue-outcome coincidences when making causal judgements, as well as decrease their tendency to over-sample cue-present events. However, the effect of base-rate instructions on correcting illusory beliefs was incomplete, and participants still showed illusory causal judgements when the probability of the outcome occurring was high. Thus, simple textual information about assessing causal relationships is partially effective in influencing people's judgements of treatment efficacy, suggesting an important role of scientific instruction in debiasing cognitive errors.</p>","PeriodicalId":32558,"journal":{"name":"Open Mind","volume":"8 ","pages":"639-665"},"PeriodicalIF":0.0,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11142631/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141200925","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}