Pub Date: 2024-10-01 | Epub Date: 2024-07-08 | DOI: 10.3758/s13428-024-02410-7
Sabina J Sloman, Daniel R Cavagnaro, Stephen B Broomell
Adaptive design optimization (ADO) is a state-of-the-art technique for experimental design (Cavagnaro et al., 2010). ADO dynamically identifies stimuli that, in expectation, yield the most information about a hypothetical construct of interest (e.g., parameters of a cognitive model). To calculate this expectation, ADO leverages the modeler's existing knowledge, specified in the form of a prior distribution. Informative priors align with the distribution of the focal construct in the participant population. This alignment is assumed by ADO's internal assessment of expected information gain. If the prior is instead misinformative, i.e., does not align with the participant population, ADO's estimates of expected information gain could be inaccurate. In many cases, the true distribution that characterizes the participant population is unknown, and experimenters rely on heuristics in their choice of prior, often without an understanding of how this choice affects ADO's behavior. Our work introduces a mathematical framework that facilitates investigation of the consequences of the choice of prior distribution on the efficiency of experiments designed using ADO. Through theoretical and empirical results, we show that, in the context of prior misinformation, measures of expected information gain are distinct from the correctness of the corresponding inference. Through a series of simulation experiments, we show that, in the case of parameter estimation, ADO nevertheless outperforms other design methods. Conversely, in the case of model selection, misinformative priors can lead inference to favor the wrong model, and rather than mitigating this pitfall, ADO exacerbates it.
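A minimal sketch of the prior-weighted expected-information-gain criterion the abstract describes, using a hypothetical binary-response model and a discrete parameter grid (the model, grid, and prior values are all illustrative, not the paper's):

```python
import math

def p_success(theta, design):
    # hypothetical binary-response model: success probability rises
    # as the design value exceeds the participant's parameter theta
    return 1.0 / (1.0 + math.exp(-(design - theta)))

def expected_info_gain(prior, thetas, design):
    # mutual information I(theta; y | design) for a binary outcome y
    p_y1 = sum(pr * p_success(t, design) for pr, t in zip(prior, thetas))
    eig = 0.0
    for y, p_y in ((1, p_y1), (0, 1.0 - p_y1)):
        for pr, t in zip(prior, thetas):
            lik = p_success(t, design) if y == 1 else 1.0 - p_success(t, design)
            post = pr * lik / p_y  # posterior p(theta | y, design)
            if post > 0:
                eig += p_y * post * math.log2(post / pr)
    return eig

thetas = [-2.0, 0.0, 2.0]           # discrete grid of candidate parameter values
designs = [-2.0, 0.0, 2.0]          # candidate stimuli
informative = [1/3, 1/3, 1/3]       # matches a uniform participant population
misinformative = [0.9, 0.05, 0.05]  # mass piled on one (possibly wrong) region

def best_design(prior):
    return max(designs, key=lambda d: expected_info_gain(prior, thetas, d))

for name, prior in (("informative", informative), ("misinformative", misinformative)):
    gains = {d: round(expected_info_gain(prior, thetas, d), 3) for d in designs}
    print(name, gains)
```

The point of the sketch: the gain ADO expects from each stimulus is an average over the prior, so a misinformative prior changes the expected-gain landscape even though the participant population is unchanged.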
Title: Knowing what to know: Implications of the choice of prior distribution on the behavior of adaptive design optimization. (Behavior Research Methods; open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362200/pdf/)
Pub Date: 2024-10-01 | Epub Date: 2024-07-16 | DOI: 10.3758/s13428-024-02463-8
Talia A Wise, Yoed N Kenett
Creative block is a familiar foe to anyone who attempts to create, and is especially related to "writer's block". While significant effort has been focused on developing methods to break such blocks, it remains an active challenge. Here, we focus on the role of semantic memory structure in driving creative block, by having people get "stuck" in a certain part of their semantic memory network. We directly examine whether we can "pull out" a participant from where they got "stuck" in their semantic memory, breaking their creative impasse. Our Associative Creativity Sparker (ACS) is a cognitive network science-based online tool that aims to spark creative ideas and break creative impasse: Once a participant runs out of ideas in a creative idea generation task, word recommendations are suggested to prime new ideas. These word recommendations are either towards or away from previous ideas, as well as close or far from the target object, based on a conceptual space extracted from the participants' responses using online text analysis. In Study 1, 121 participants used the ACS to generate creative alternative uses for five different objects and completed creativity and fluid intelligence (Gf) tasks. In Study 2, we repeated the design of Study 1, but further examined the impact of writing experience on the ACS by examining 120 novice and 120 experienced writers. Across both studies, our results indicate that the location of word recommendations affects the fluency and originality of one's ideas, and that novice and experienced writers benefit differently from these word recommendations.
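The close/far recommendation logic could work roughly like this cosine-similarity sketch over a toy "semantic" space (the words and 2-D vectors are invented for illustration; the ACS derives its conceptual space from participants' responses via online text analysis):

```python
import math

# toy 2-D "semantic" vectors, invented for illustration
vectors = {
    "brick": (1.0, 0.2), "paperweight": (0.9, 0.4), "doorstop": (0.8, 0.5),
    "sculpture": (0.2, 0.9), "instrument": (-0.3, 1.0), "boat": (-0.9, 0.3),
}

def cosine(u, v):
    # cosine similarity between two vectors
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def recommend(target, direction):
    # "close" primes concepts near the target; "far" pulls away from it
    candidates = [w for w in vectors if w != target]
    key = lambda w: cosine(vectors[w], vectors[target])
    return max(candidates, key=key) if direction == "close" else min(candidates, key=key)

print(recommend("brick", "close"), recommend("brick", "far"))
```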
Title: Sparking creativity: Encouraging creative idea generation through automatically generated word recommendations. (Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362362/pdf/)
Pub Date: 2024-10-01 | Epub Date: 2024-07-25 | DOI: 10.3758/s13428-024-02462-9
Sarah Humberg, Simon Grund, Steffen Nestler
Multilevel structural equation modeling (MSEM) is a statistical framework of major relevance for research concerned with people's intrapersonal dynamics. An application domain that is rapidly gaining relevance is the study of individual differences in the within-person association (WPA) of variables that fluctuate over time. For instance, an individual's social reactivity - their emotional response to social situations - can be represented as the association between repeated measurements of the individual's social interaction quantity and momentary well-being. MSEM allows researchers to investigate the associations between WPAs and person-level outcome variables (e.g., life satisfaction) by specifying the WPAs as random slopes in the structural equation at level 1 and using the latent representations of the slopes to predict outcomes at level 2. Here, we are concerned with the case in which a researcher is interested in nonlinear effects of WPAs on person-level outcomes - a U-shaped effect of a WPA, a moderation effect of two WPAs, or an effect of congruence between two WPAs - such that the corresponding MSEM includes latent interactions between random slopes. We evaluate the nonlinear MSEM approach for the three classes of nonlinear effects (U-shaped, moderation, congruence) and compare it with three simpler approaches: a simple two-step approach, a single-indicator approach, and a plausible values approach. We use a simulation study to compare the approaches on the accuracy of parameter estimates and inference. We derive recommendations for practice and provide code templates and an illustrative example to help researchers implement the approaches.
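The simple two-step approach mentioned above can be sketched as follows: estimate each person's WPA slope by within-person OLS, then regress the person-level outcome on the squared estimated slope to probe a U-shaped effect. All simulation settings below are illustrative (the true quadratic coefficient is set to -0.5); this is a sketch of the two-step idea, not the article's full MSEM comparison:

```python
import random
random.seed(1)

def ols_slope(xs, ys):
    # step 1: within-person OLS slope of momentary y on momentary x
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))

# simulate persons whose outcome depends on the square of their true WPA
slopes_hat, outcomes = [], []
for _ in range(200):
    true_wpa = random.gauss(0, 1)
    xs = [random.gauss(0, 1) for _ in range(20)]
    ys = [true_wpa * x + random.gauss(0, 0.5) for x in xs]
    slopes_hat.append(ols_slope(xs, ys))
    outcomes.append(-0.5 * true_wpa ** 2 + random.gauss(0, 0.3))

# step 2: person-level OLS of outcome on the squared estimated slope
s2 = [s ** 2 for s in slopes_hat]
m2, mo = sum(s2) / len(s2), sum(outcomes) / len(outcomes)
b = (sum((a - m2) * (o - mo) for a, o in zip(s2, outcomes))
     / sum((a - m2) ** 2 for a in s2))
print(round(b, 2))  # should land roughly near the true coefficient of -0.5
```

Because step 2 treats the estimated slopes as error-free, the quadratic coefficient is somewhat attenuated; that estimation error is exactly what the latent MSEM formulation is meant to handle.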
Title: Estimating nonlinear effects of random slopes: A comparison of multilevel structural equation modeling with a two-step, a single-indicator, and a plausible values approach. (Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362328/pdf/)
Pub Date: 2024-10-01 | Epub Date: 2024-08-05 | DOI: 10.3758/s13428-024-02469-2
Nicholas Kathios, Kelsie L Lopez, Laurel Joy Gabard-Durnam, Psyche Loui
Early home musical environments can significantly impact sensory, cognitive, and socioemotional development. While longitudinal studies may be resource-intensive, retrospective reports are a relatively quick and inexpensive way to examine associations between early home musical environments and adult outcomes. We present the Music@Home-Retrospective scale, derived partly from the Music@Home-Preschool scale (Politimou et al., 2018), to retrospectively assess the childhood home musical environment. In two studies (total n = 578), we conducted an exploratory factor analysis (Study 1) and confirmatory factor analysis (Study 2) on items, including many adapted from the Music@Home-Preschool scale. This revealed a 20-item solution with five subscales. Items retained for three subscales (Caregiver Beliefs, Caregiver Initiation of Singing, Child Engagement with Music) load identically to three in the Music@Home-Preschool scale. We also identified two additional dimensions of the childhood home musical environment. The Attitude Toward Childhood Home Musical Environment subscale captures participants' current adult attitudes toward their childhood home musical environment, and the Social Listening Contexts subscale indexes the degree to which participants listened to music at home with others (i.e., friends, siblings, and caregivers). Music@Home-Retrospective scores were related to adult self-reports of musicality, performance on a melodic perception task, and self-reports of well-being, demonstrating utility in measuring the early home music environment as captured through this scale. The Music@Home-Retrospective scale is freely available to enable future investigations exploring how the early home musical environment relates to adult cognition, affect, and behavior.
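Scoring a multi-subscale instrument like this reduces to averaging item responses within each subscale. A sketch with a hypothetical item-to-subscale key (the real 20-item key and subscale names are given in the article):

```python
# hypothetical item-to-subscale key, invented for illustration
subscales = {
    "caregiver_beliefs": ["q1", "q2", "q3"],
    "social_listening_contexts": ["q4", "q5"],
}
responses = {"q1": 5, "q2": 4, "q3": 4, "q4": 2, "q5": 3}  # Likert responses

def score(responses, subscales):
    # subscale score = mean of that subscale's item responses
    return {name: sum(responses[item] for item in items) / len(items)
            for name, items in subscales.items()}

print(score(responses, subscales))
```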
Title: Music@Home-Retrospective: A new measure to retrospectively assess childhood home musical environments. (Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362467/pdf/)
Pub Date: 2024-10-01 | Epub Date: 2024-03-04 | DOI: 10.3758/s13428-024-02372-w
Conor J R Smithson, Jason K Chow, Ting-Yun Chang, Isabel Gauthier
Measurement of domain-general object recognition ability (o) requires minimization of domain-specific variance. One approach is to model o as a latent variable explaining performance on a battery of tests which differ in task demands and stimuli; however, time and sample requirements may be prohibitive. Alternatively, an aggregate measure of o can be obtained by averaging z-scores across tests. Using data from Sunday et al. (2022, Journal of Experimental Psychology: General, 151, 676-694), we demonstrated that aggregate scores from just two such object recognition tests provide a good approximation (r = .79) of factor scores calculated from a model using a much larger set of tests. Some test combinations produced correlations of up to r = .87 with factor scores. We then revised these tests to reduce testing time, and developed an odd-one-out task, using a unique object category on nearly every trial, to increase task and stimuli diversity. To validate our measures, 163 participants completed the object recognition tests on two occasions, one month apart. Providing the first evidence that o is stable over time, our short aggregate o measure demonstrated good test-retest reliability (r = .77). The stability of o could not be completely accounted for by intelligence, perceptual speed, and early visual ability. Structural equation modeling suggested that our tests load significantly onto the same latent variable, and revealed that as a latent variable, o is highly stable (r = .93). Aggregation is an efficient method for estimating o, allowing investigation of individual differences in object recognition ability to be more accessible in future studies.
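The aggregate z-score approach is simple to compute: standardize each test's scores across participants, then average each participant's z-scores over tests. A sketch with made-up accuracy data for five participants on two tests:

```python
import statistics

def zscores(xs):
    # standardize scores across participants for one test
    m, s = statistics.mean(xs), statistics.stdev(xs)
    return [(x - m) / s for x in xs]

# made-up accuracies of the same five participants on two recognition tests
test_a = [0.62, 0.75, 0.58, 0.81, 0.69]
test_b = [0.55, 0.80, 0.60, 0.78, 0.66]

# aggregate o estimate: average each participant's z-scores across tests
aggregate_o = [(za + zb) / 2 for za, zb in zip(zscores(test_a), zscores(test_b))]
print([round(z, 2) for z in aggregate_o])
```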
Title: Measuring object recognition ability: Reliability, validity, and the aggregate z-score approach.
Pub Date: 2024-10-01 | Epub Date: 2024-03-20 | DOI: 10.3758/s13428-024-02393-5
Darrell A Worthy, Joanna N Lahey, Samuel L Priestley, Marco A Palma
Eye-tracking is emerging as a tool for researchers to better understand cognition and behavior. However, it is possible that experiment participants adjust their behavior when they know their eyes are being tracked. This potential change would be considered a type of Hawthorne effect, in which participants alter their behavior in response to being watched, and could potentially compromise the outcomes and conclusions of experimental studies that use eye tracking. We examined whether eye-tracking produced Hawthorne effects in six commonly used psychological scales and five behavioral tasks. The dependent measures were selected because they are widely used and cited and because they involved measures of sensitive topics, including gambling behavior, racial bias, and undesirable personality characteristics, or because they require working memory or executive attention resources, which might be affected by Hawthorne effects. The only task where Hawthorne effects manifested was the mixed gambles task, in which participants accepted or rejected gambles involving a 50/50 chance of gaining or losing different monetary amounts. Participants in the eye-tracking condition accepted fewer gambles that were low in expected value, and they also took longer to respond for these low-value gambles. These results suggest that eye-tracking is not likely to produce Hawthorne effects in most common psychology laboratory tasks, except for those involving risky decisions where the probabilities of the outcomes of each choice are known.
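The mixed gambles finding can be framed as stronger loss aversion under observation. A toy sketch (the acceptance rule, lambda values, and gamble amounts are invented for illustration, not the study's model):

```python
def accepts(gain, loss, lam):
    # illustrative loss-averse rule: accept when gain - lam * loss > 0
    return gain - lam * loss > 0

# 50/50 mixed gambles as (gain, loss) amounts; values are invented
gambles = [(10, 12), (20, 10), (8, 8), (18, 7)]
baseline = sum(accepts(g, l, 1.5) for g, l in gambles)  # typical loss aversion
watched = sum(accepts(g, l, 2.5) for g, l in gambles)   # stronger when observed
print(baseline, watched)  # the "watched" agent accepts fewer gambles
```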
Title: An examination of the effects of eye-tracking on behavior in psychology experiments.
Pub Date: 2024-10-01 | Epub Date: 2024-03-20 | DOI: 10.3758/s13428-024-02371-x
Steffen Nestler, Marie Salditt
Psychologists are increasingly interested in whether treatment effects vary in randomized controlled trials. A number of tests have been proposed in the causal inference literature to test for such heterogeneity, which differ in the sample statistic they use (either using the variance terms of the experimental and control group, their empirical distribution functions, or specific quantiles), and in whether they make distributional assumptions or are based on a Fisher randomization procedure. In this manuscript, we present the results of a simulation study in which we examine the performance of the different tests while varying the amount of treatment effect heterogeneity, the type of underlying distribution, the sample size, and whether an additional covariate is considered. Altogether, our results suggest that researchers should use a randomization test to optimally control for type 1 errors. Furthermore, all tests studied are associated with low power for small and moderate samples, even when the heterogeneity of the treatment effect is substantial. This suggests that current tests for treatment effect heterogeneity require much larger samples than those collected in current research.
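A Fisher randomization test of heterogeneity built on the groups' variance terms can be sketched as follows: reshuffle the group labels many times and count how often the reshuffled statistic is at least as extreme as the observed one. This is a generic sketch on simulated data, not the manuscript's exact simulation design:

```python
import random
import statistics

random.seed(0)

def randomization_test(treat, ctrl, n_perm=2000):
    # Fisher randomization test with the variance ratio as test statistic
    stat = lambda t, c: statistics.variance(t) / statistics.variance(c)
    observed = stat(treat, ctrl)
    pooled, n_t = treat + ctrl, len(treat)
    hits = 0
    for _ in range(n_perm):
        random.shuffle(pooled)  # relabel under the null of no heterogeneity
        if stat(pooled[:n_t], pooled[n_t:]) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # one-sided permutation p-value

# a heterogeneous treatment effect inflates variance in the treated group
ctrl = [random.gauss(0, 1) for _ in range(60)]
treat = [random.gauss(0.5, 2) for _ in range(60)]
p = randomization_test(treat, ctrl)
print(p)
```

Because the p-value is computed from the permutation distribution rather than an F reference distribution, it does not rely on normality, which is one reason randomization tests control type 1 error rates well.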
Title: Comparing type 1 and type 2 error rates of different tests for heterogeneous treatment effects. (Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362231/pdf/)
Pub Date: 2024-10-01 | Epub Date: 2024-05-09 | DOI: 10.3758/s13428-024-02409-0
Shaela T Jalava, Jeffrey D Wammes
A principal goal of attention research is to develop tasks with clear behavioral signatures of attentional fluctuations. Measures that index attentional states often fall under two broad umbrellas: decision tasks, in which participants make responses based on the changing requirements of each trial, and rhythm tasks, in which participants respond rhythmically to a uniform stimulus (e.g., a metronome tone). In the former, response speeding typically precedes errors (indicative of attention failures). In the latter, increased response variability precedes subjective reports of off-task states. We developed and validated the rhythmic visual response task (RVRT): a rhythm task incorporating trial-unique scene stimuli. The RVRT incorporates two important advances from both task categories: (1) it is free from the influence that differential decision-making has on fluctuations in attentional states, and (2) trial-unique stimuli enable later cognitive judgments to be mapped to specific moments in the task. These features allow a relatively unobtrusive measure of mind wandering that facilitates the downstream assessment of its consequences. Participants completed 900 trials of the RVRT, interrupted periodically by thought probes that assessed their attentional state. We found that both response time variance and speed predicted depth of mind wandering. Encouraged by these findings, we used the same analysis approach on archival data to demonstrate that the combination of variance and speed best predicted attentional states in several rhythm and decision task datasets. We discuss the implications of these findings and suggest future research that uses the RVRT to investigate the impact of spontaneous mind wandering on memory, decision-making, and perception.
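The pre-probe behavioral signature (response time variance and speed) can be computed over a sliding window of responses leading up to each thought probe. A sketch with hypothetical response times and an invented window size:

```python
import statistics

def pre_probe_features(rts, window=5):
    # variance and mean of the last `window` response times before a probe
    recent = rts[-window:]
    return statistics.variance(recent), statistics.mean(recent)

# hypothetical rhythmic response times (seconds): steady early, erratic late
rts = [0.52, 0.49, 0.51, 0.50, 0.48, 0.35, 0.62, 0.30, 0.70, 0.41]

var_on_task, speed_on_task = pre_probe_features(rts[:5])  # stable early run
var_off_task, speed_off_task = pre_probe_features(rts)    # erratic late run
print(round(var_on_task, 4), round(var_off_task, 4))
```

In this toy series the late window shows both higher variance and faster mean responding, the two signatures the abstract reports as jointly predictive of mind wandering.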
{"title":"Slow and steady: Validating the rhythmic visual response task as a marker for attentional states.","authors":"Shaela T Jalava, Jeffrey D Wammes","doi":"10.3758/s13428-024-02409-0","DOIUrl":"10.3758/s13428-024-02409-0","url":null,"abstract":"<p><p>A principal goal of attention research is to develop tasks with clear behavioral signatures of attentional fluctuations. Measures that index attentional states often fall under two broad umbrellas: decision tasks, in which participants make responses based on the changing requirements of each trial, and rhythm tasks, in which participants respond rhythmically to a uniform stimulus (e.g., a metronome tone). In the former, response speeding typically precedes errors (indicative of attention failures). In the latter, increased response variability precedes subjective reports of off-task states. We developed and validated the rhythmic visual response task (RVRT), a rhythm task incorporating trial-unique scene stimuli. The RVRT incorporates two important advances from both task categories: (1) it is free from the influence that differential decision-making has on fluctuations in attentional states, and (2) trial-unique stimuli enable later cognitive judgments to be mapped to specific moments in the task. These features allow a relatively unobtrusive measure of mind wandering that facilitates the downstream assessment of its consequences. Participants completed 900 trials of the RVRT, interrupted periodically by thought probes that assessed their attentional state. We found that both response time variance and speed predicted depth of mind wandering. Encouraged by these findings, we used the same analysis approach on archival data to demonstrate that the combination of variance and speed best predicted attentional states in several rhythm and decision task datasets. 
We discuss the implications of these findings and suggest future research that uses the RVRT to investigate the impact of spontaneous mind wandering on memory, decision-making, and perception.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":null,"pages":null},"PeriodicalIF":4.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140897237","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
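The variance-plus-speed predictor described in the abstract above can be illustrated as a simple windowed summary of response times preceding a thought probe. This is a minimal sketch, not the authors' analysis code; the function name, window size, and example data are assumptions for illustration only:

```python
import statistics

def attention_features(rts, window=9):
    """Summarize the response times preceding a thought probe by their
    variability and speed -- the two metrics the abstract reports as
    jointly predicting depth of mind wandering."""
    recent = rts[-window:]
    return {
        "rt_variance": statistics.variance(recent),
        "rt_mean": statistics.mean(recent),  # lower mean = faster responding
    }

# Hypothetical response times (in seconds) on the trials before a probe
rts = [0.52, 0.49, 0.61, 0.47, 0.55, 0.50, 0.44, 0.58, 0.51]
feats = attention_features(rts)
```

In practice, features like these would be computed per probe and entered jointly into a regression against probe ratings, mirroring the combined variance-and-speed model the abstract describes.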
Pub Date: 2024-10-01 | Epub Date: 2024-05-15 | DOI: 10.3758/s13428-024-02437-w
Yuen-Lai Chan, Chi-Shing Tse
Investigation of affective and semantic dimensions of words is essential for studying word processing. In this study, we expanded Tse et al.'s (Behav Res Methods 49:1503-1519, 2017; Behav Res Methods 55:4382-4402, 2023) Chinese Lexicon Project by norming five word dimensions (valence, arousal, familiarity, concreteness, and imageability) for over 25,000 two-character Chinese words presented in traditional script. Through regression models that controlled for other variables, we examined the relationships among these dimensions. We included ambiguity, quantified by the standard deviation of the ratings of a given lexical variable across different raters, as separate variables (e.g., valence ambiguity) to explore their connections with other variables. The intensity-ambiguity relationships (i.e., between normed variables and their ambiguities, like valence with valence ambiguity) were also examined. In these analyses with a large pool of words and controlling for other lexical variables, we replicated the asymmetric U-shaped valence-arousal relationship, which was moderated by valence and arousal ambiguities. We also observed a curvilinear relationship between valence and familiarity and between valence and concreteness. Replicating Brainerd et al.'s (J Exp Psychol Gen 150:1476-1499, 2021; J Mem Lang 121:104286, 2021) quadratic intensity-ambiguity relationships, we found that the ambiguity of valence, arousal, concreteness, and imageability decreases as the value of these variables is extremely low or extremely high, although this was not generalized to familiarity. While concreteness and imageability were strongly correlated, they displayed different relationships with arousal, valence, familiarity, and valence ambiguity, suggesting their distinct conceptual nature. These findings further our understanding of the affective and semantic dimensions of two-character Chinese words. The normed values of all these variables can be accessed via https://osf.io/hwkv7 .
{"title":"Decoding the essence of two-character Chinese words: Unveiling valence, arousal, concreteness, familiarity, and imageability through word norming.","authors":"Yuen-Lai Chan, Chi-Shing Tse","doi":"10.3758/s13428-024-02437-w","DOIUrl":"10.3758/s13428-024-02437-w","url":null,"abstract":"<p><p>Investigation of affective and semantic dimensions of words is essential for studying word processing. In this study, we expanded Tse et al.'s (Behav Res Methods 49:1503-1519, 2017; Behav Res Methods 55:4382-4402, 2023) Chinese Lexicon Project by norming five word dimensions (valence, arousal, familiarity, concreteness, and imageability) for over 25,000 two-character Chinese words presented in traditional script. Through regression models that controlled for other variables, we examined the relationships among these dimensions. We included ambiguity, quantified by the standard deviation of the ratings of a given lexical variable across different raters, as separate variables (e.g., valence ambiguity) to explore their connections with other variables. The intensity-ambiguity relationships (i.e., between normed variables and their ambiguities, like valence with valence ambiguity) were also examined. In these analyses with a large pool of words and controlling for other lexical variables, we replicated the asymmetric U-shaped valence-arousal relationship, which was moderated by valence and arousal ambiguities. We also observed a curvilinear relationship between valence and familiarity and between valence and concreteness. Replicating Brainerd et al.'s (J Exp Psychol Gen 150:1476-1499, 2021; J Mem Lang 121:104286, 2021) quadratic intensity-ambiguity relationships, we found that the ambiguity of valence, arousal, concreteness, and imageability decreases as the value of these variables is extremely low or extremely high, although this was not generalized to familiarity. 
While concreteness and imageability were strongly correlated, they displayed different relationships with arousal, valence, familiarity, and valence ambiguity, suggesting their distinct conceptual nature. These findings further our understanding of the affective and semantic dimensions of two-character Chinese words. The normed values of all these variables can be accessed via https://osf.io/hwkv7 .</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":null,"pages":null},"PeriodicalIF":4.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362227/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140943498","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
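The abstract above defines ambiguity as the standard deviation of a word's ratings across raters, with ambiguity lowest at extreme intensity values. A toy computation makes the definition concrete; the words, ratings, and function name here are hypothetical, not drawn from the norming data:

```python
import statistics

def norm_word(ratings):
    """Norm one word: intensity = mean rating across raters,
    ambiguity = standard deviation of those same ratings."""
    return statistics.mean(ratings), statistics.stdev(ratings)

# Hypothetical 1-9 valence ratings from five raters for three words
ratings = {
    "extreme_pos": [9, 9, 8, 9, 9],   # extreme valence -> raters agree
    "neutral":     [3, 7, 5, 2, 8],   # mid-scale valence -> raters disagree
    "extreme_neg": [1, 1, 2, 1, 1],
}
norms = {word: norm_word(r) for word, r in ratings.items()}
```

With data patterned this way, the mid-scale word shows the largest ambiguity and both extreme words the smallest, which is the quadratic intensity-ambiguity shape the abstract reports replicating.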
Pub Date: 2024-10-01 | Epub Date: 2024-05-15 | DOI: 10.3758/s13428-024-02436-x
Serena Dolfi, Alberto Testolin, Simone Cutini, Marco Zorzi
While several methods have been proposed to assess the influence of continuous visual cues in parallel numerosity estimation, the impact of temporal magnitudes on sequential numerosity judgments has been largely ignored. To overcome this issue, we extend a recently proposed framework that makes it possible to separate the contribution of numerical and non-numerical information in numerosity comparison by introducing a novel stimulus space designed for sequential tasks. Our method systematically varies the temporal magnitudes embedded into event sequences through the orthogonal manipulation of numerosity and two latent factors, which we designate as "duration" and "temporal spacing". This allows us to measure the contribution of finer-grained temporal features on numerosity judgments in several sensory modalities. We validate the proposed method on two different experiments in both visual and auditory modalities: results show that adult participants discriminated sequences primarily by relying on numerosity, with similar acuity in the visual and auditory modality. However, participants were similarly influenced by non-numerical cues, such as the total duration of the stimuli, suggesting that temporal cues can significantly bias numerical processing. Our findings highlight the need to carefully consider the continuous properties of numerical stimuli in a sequential mode of presentation as well, with particular relevance in multimodal and cross-modal investigations. We provide the complete code for creating sequential stimuli and analyzing participants' responses.
{"title":"Measuring temporal bias in sequential numerosity comparison.","authors":"Serena Dolfi, Alberto Testolin, Simone Cutini, Marco Zorzi","doi":"10.3758/s13428-024-02436-x","DOIUrl":"10.3758/s13428-024-02436-x","url":null,"abstract":"<p><p>While several methods have been proposed to assess the influence of continuous visual cues in parallel numerosity estimation, the impact of temporal magnitudes on sequential numerosity judgments has been largely ignored. To overcome this issue, we extend a recently proposed framework that makes it possible to separate the contribution of numerical and non-numerical information in numerosity comparison by introducing a novel stimulus space designed for sequential tasks. Our method systematically varies the temporal magnitudes embedded into event sequences through the orthogonal manipulation of numerosity and two latent factors, which we designate as \"duration\" and \"temporal spacing\". This allows us to measure the contribution of finer-grained temporal features on numerosity judgments in several sensory modalities. We validate the proposed method on two different experiments in both visual and auditory modalities: results show that adult participants discriminated sequences primarily by relying on numerosity, with similar acuity in the visual and auditory modality. However, participants were similarly influenced by non-numerical cues, such as the total duration of the stimuli, suggesting that temporal cues can significantly bias numerical processing. Our findings highlight the need to carefully consider the continuous properties of numerical stimuli in a sequential mode of presentation as well, with particular relevance in multimodal and cross-modal investigations. 
We provide the complete code for creating sequential stimuli and analyzing participants' responses.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":null,"pages":null},"PeriodicalIF":4.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362239/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140943590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
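The orthogonal manipulation the abstract describes, varying numerosity independently of event duration and temporal spacing, can be sketched as a sequence generator. This is a simplified stand-in for the authors' released stimulus code, with assumed function and parameter names:

```python
def make_sequence(numerosity, duration, spacing):
    """Build a sequence of (onset, offset) event times: `numerosity`
    events, each lasting `duration` seconds, separated by `spacing`
    seconds of silence/blank. Varying the three factors independently
    decouples event count from total time and other temporal cues."""
    events = []
    t = 0.0
    for _ in range(numerosity):
        events.append((t, t + duration))
        t += duration + spacing
    return events

# Example: 5 events of 0.1 s separated by 0.3 s gaps
seq = make_sequence(numerosity=5, duration=0.1, spacing=0.3)
total_time = seq[-1][1]  # offset of the final event
```

Crossing factor levels in a full design then yields sequences matched on numerosity but differing in total duration (and vice versa), which is what allows the temporal biases reported above to be measured.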