Pub Date: 2024-10-01 | Epub Date: 2024-03-20 | DOI: 10.3758/s13428-024-02371-x
Steffen Nestler, Marie Salditt
Psychologists are increasingly interested in whether treatment effects vary in randomized controlled trials. A number of tests have been proposed in the causal inference literature to test for such heterogeneity; they differ in the sample statistic they use (the variance terms of the experimental and control groups, their empirical distribution functions, or specific quantiles) and in whether they make distributional assumptions or are based on a Fisher randomization procedure. In this manuscript, we present the results of a simulation study in which we examine the performance of the different tests while varying the amount of treatment effect heterogeneity, the type of underlying distribution, the sample size, and whether an additional covariate is considered. Altogether, our results suggest that researchers should use a randomization test to optimally control type 1 errors. Furthermore, all tests studied have low power for small and moderate samples, even when the heterogeneity of the treatment effect is substantial. This suggests that current tests for treatment effect heterogeneity require much larger samples than those collected in current research.
Title: Comparing type 1 and type 2 error rates of different tests for heterogeneous treatment effects.
Journal: Behavior Research Methods, pp. 6582-6597. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362231/pdf/
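The Fisher randomization approach mentioned in the abstract can be illustrated with a short sketch that uses the variance difference between the treatment and control groups as its test statistic; the function name, permutation count, and two-sided p-value convention are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def randomization_variance_test(treat, control, n_perm=2000, seed=0):
    """Fisher randomization test for heterogeneous treatment effects.

    Under a constant (homogeneous) treatment effect, the outcome
    variances of both groups should match, so the variance difference
    serves as the test statistic. Group labels are re-randomized to
    build its null distribution.
    """
    rng = np.random.default_rng(seed)
    treat = np.asarray(treat, dtype=float)
    control = np.asarray(control, dtype=float)
    pooled = np.concatenate([treat, control])
    n_t = len(treat)
    observed = np.var(treat, ddof=1) - np.var(control, ddof=1)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        stat = np.var(perm[:n_t], ddof=1) - np.var(perm[n_t:], ddof=1)
        if abs(stat) >= abs(observed):
            count += 1
    # Add-one correction keeps the p-value strictly positive.
    return (count + 1) / (n_perm + 1)
```

Because the p-value is computed from the permutation distribution rather than a parametric reference distribution, the test needs no normality assumption, which is one reason the abstract favors randomization tests for type 1 error control.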
Pub Date: 2024-10-01 | Epub Date: 2024-06-24 | DOI: 10.3758/s13428-024-02445-w
Sarah T O'Brien, Nerisa Dozo, Jordan D X Hinton, Ella K Moeck, Rio Susanto, Glenn T Jayaputera, Richard O Sinnott, Duy Vu, Mario Alvarez-Jimenez, John Gleeson, Peter Koval
Traditionally, behavioral, social, and health science researchers have relied on global/retrospective survey methods administered cross-sectionally (i.e., on a single occasion) or longitudinally (i.e., on several occasions separated by weeks, months, or years). More recently, social and health scientists have added daily life survey methods (also known as intensive longitudinal methods or ambulatory assessment) to their toolkit. These methods (e.g., daily diaries, experience sampling, ecological momentary assessment) involve dense repeated assessments in everyday settings. To facilitate research using daily life survey methods, we present SEMA3 ( http://www.SEMA3.com ), a platform for designing and administering intensive longitudinal daily life surveys via Android and iOS smartphones. SEMA3 fills an important gap by providing researchers with a free, intuitive, and flexible platform with basic and advanced functionality. In this article, we describe SEMA3's development history and system architecture, provide an overview of how to design a study using SEMA3 and outline its key features, and discuss the platform's limitations and propose directions for future development of SEMA3.
Title: SEMA3: A free smartphone platform for daily life surveys.
Journal: Behavior Research Methods, pp. 7691-7706. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362263/pdf/
Pub Date: 2024-10-01 | Epub Date: 2024-05-15 | DOI: 10.3758/s13428-024-02437-w
Yuen-Lai Chan, Chi-Shing Tse
Investigation of affective and semantic dimensions of words is essential for studying word processing. In this study, we expanded Tse et al.'s (Behav Res Methods 49:1503-1519, 2017; Behav Res Methods 55:4382-4402, 2023) Chinese Lexicon Project by norming five word dimensions (valence, arousal, familiarity, concreteness, and imageability) for over 25,000 two-character Chinese words presented in traditional script. Through regression models that controlled for other variables, we examined the relationships among these dimensions. We included ambiguity, quantified by the standard deviation of the ratings of a given lexical variable across different raters, as separate variables (e.g., valence ambiguity) to explore their connections with other variables. The intensity-ambiguity relationships (i.e., between normed variables and their ambiguities, like valence with valence ambiguity) were also examined. In these analyses with a large pool of words and controlling for other lexical variables, we replicated the asymmetric U-shaped valence-arousal relationship, which was moderated by valence and arousal ambiguities. We also observed a curvilinear relationship between valence and familiarity and between valence and concreteness. Replicating Brainerd et al.'s (J Exp Psychol Gen 150:1476-1499, 2021; J Mem Lang 121:104286, 2021) quadratic intensity-ambiguity relationships, we found that the ambiguity of valence, arousal, concreteness, and imageability decreases as the value of these variables is extremely low or extremely high, although this was not generalized to familiarity. While concreteness and imageability were strongly correlated, they displayed different relationships with arousal, valence, familiarity, and valence ambiguity, suggesting their distinct conceptual nature. These findings further our understanding of the affective and semantic dimensions of two-character Chinese words. The normed values of all these variables can be accessed via https://osf.io/hwkv7 .
Title: Decoding the essence of two-character Chinese words: Unveiling valence, arousal, concreteness, familiarity, and imageability through word norming.
Journal: Behavior Research Methods, pp. 7574-7601. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362227/pdf/
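The ambiguity measure described in this abstract (the standard deviation of ratings across raters for each word) and the quadratic intensity-ambiguity fit can be illustrated in a few lines; the tiny ratings matrix is invented for illustration and is not from the norming data.

```python
import numpy as np

# Hypothetical ratings matrix: rows = words, columns = raters (1-9 valence scale).
ratings = np.array([
    [1, 2, 1, 2, 1],   # extreme low valence; raters agree
    [5, 2, 8, 3, 7],   # mid valence; raters disagree
    [9, 8, 9, 8, 9],   # extreme high valence; raters agree
], dtype=float)

valence = ratings.mean(axis=1)            # normed intensity per word
ambiguity = ratings.std(axis=1, ddof=1)   # across-rater SD = ambiguity

# Quadratic intensity-ambiguity fit: ambiguity ~ b0 + b1*valence + b2*valence^2.
# A negative b2 (inverted U) reproduces the pattern that ambiguity drops
# at extremely low or extremely high intensity values.
b2, b1, b0 = np.polyfit(valence, ambiguity, deg=2)
```

In the full analyses the quadratic term is of course estimated over thousands of words while controlling for other lexical variables, not over three illustrative rows.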
Pub Date: 2024-10-01 | Epub Date: 2024-05-15 | DOI: 10.3758/s13428-024-02436-x
Serena Dolfi, Alberto Testolin, Simone Cutini, Marco Zorzi
While several methods have been proposed to assess the influence of continuous visual cues in parallel numerosity estimation, the impact of temporal magnitudes on sequential numerosity judgments has been largely ignored. To overcome this issue, we extend a recently proposed framework that makes it possible to separate the contribution of numerical and non-numerical information in numerosity comparison by introducing a novel stimulus space designed for sequential tasks. Our method systematically varies the temporal magnitudes embedded into event sequences through the orthogonal manipulation of numerosity and two latent factors, which we designate as "duration" and "temporal spacing". This allows us to measure the contribution of finer-grained temporal features on numerosity judgments in several sensory modalities. We validate the proposed method in two experiments spanning the visual and auditory modalities: results show that adult participants discriminated sequences primarily by relying on numerosity, with similar acuity in the two modalities. However, participants were similarly influenced by non-numerical cues, such as the total duration of the stimuli, suggesting that temporal cues can significantly bias numerical processing. Our findings highlight the need to carefully consider the continuous properties of numerical stimuli in a sequential mode of presentation as well, with particular relevance in multimodal and cross-modal investigations.
Title: Measuring temporal bias in sequential numerosity comparison.
Journal: Behavior Research Methods, pp. 7561-7573. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362239/pdf/
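As a sketch of how numerosity and the two latent temporal factors can be manipulated orthogonally when constructing event sequences: holding numerosity fixed while varying the per-event duration and the inter-event spacing decouples number from total sequence duration. Function name, parameter names, and millisecond units are illustrative assumptions, not the authors' stimulus code.

```python
def make_sequence(numerosity, event_duration_ms, spacing_ms):
    """Onset/offset times (ms) for a sequence of discrete events.

    Each event lasts event_duration_ms and is followed by a silent/blank
    gap of spacing_ms before the next onset.
    """
    onsets = [i * (event_duration_ms + spacing_ms) for i in range(numerosity)]
    return [(t, t + event_duration_ms) for t in onsets]
```

The total sequence span is `numerosity * event_duration_ms + (numerosity - 1) * spacing_ms`, so any two of numerosity, duration, and spacing determine the third; this is what makes total duration a confound of numerosity unless the temporal factors are varied independently.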
Pub Date: 2024-10-01 | Epub Date: 2024-05-09 | DOI: 10.3758/s13428-024-02409-0
Shaela T Jalava, Jeffrey D Wammes
A principal goal of attention research is to develop tasks with clear behavioral signatures of attentional fluctuations. Measures that index attentional states often fall under two broad umbrellas: decision tasks, in which participants make responses based on the changing requirements of each trial, and rhythm tasks, in which participants respond rhythmically to a uniform stimulus (e.g., a metronome tone). In the former, response speeding typically precedes errors (indicative of attention failures). In the latter, increased response variability precedes subjective reports of off-task states. We developed and validated the rhythmic visual response task (RVRT), a rhythm task incorporating trial-unique scene stimuli. The RVRT incorporates two important advances from both task categories: (1) it is free from the influence that differential decision-making has on fluctuations in attentional states, and (2) trial-unique stimuli enable later cognitive judgments to be mapped to specific moments in the task. These features allow a relatively unobtrusive measure of mind wandering that facilitates the downstream assessment of its consequences. Participants completed 900 trials of the RVRT, interrupted periodically by thought probes that assessed their attentional state. We found that both response time variance and speed predicted depth of mind wandering. Encouraged by these findings, we used the same analysis approach on archival data to demonstrate that the combination of variance and speed best predicted attentional states in several rhythm and decision task datasets. We discuss the implications of these findings and suggest future research that uses the RVRT to investigate the impact of spontaneous mind wandering on memory, decision-making, and perception.
Title: Slow and steady: Validating the rhythmic visual response task as a marker for attentional states.
Journal: Behavior Research Methods, pp. 7079-7101.
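The two behavioral markers used in this line of work, pre-probe response speed and response time variability, can be computed with a simple windowed summary over the trials preceding each thought probe. The window size and function name below are illustrative choices, not the authors' analysis pipeline.

```python
import statistics

def preprobe_features(rts, window=9):
    """Mean RT (speed) and RT variance over the trials preceding a probe.

    rts: response times for the trials leading up to a thought probe,
    in chronological order; only the last `window` trials are summarized.
    """
    recent = rts[-window:]
    return statistics.mean(recent), statistics.variance(recent)
```

On the pattern reported above, faster mean RTs (speeding) and higher RT variance in this pre-probe window would jointly predict deeper mind wandering.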
Pub Date: 2024-10-01 | Epub Date: 2024-07-12 | DOI: 10.3758/s13428-024-02454-9
Younes Strittmatter, Markus W H Spitzer, Nadja Ging-Jehli, Sebastian Musslick
Online experiments are increasingly gaining traction in the behavioral sciences. Despite this, behavioral researchers have largely continued to use keyboards as the primary input devices for such online studies, overlooking the ubiquity of touchscreens in everyday use. This paper presents an open-source touchscreen extension for jsPsych, a JavaScript framework designed for conducting online experiments. We additionally evaluated the touchscreen extension by assessing whether typical behavioral findings from two distinct perceptual decision-making tasks - the random-dot kinematogram and the Stroop task - can similarly be observed when administered via touchscreen devices compared to keyboard devices. Our findings indicate similar performance metrics for each paradigm between the touchscreen and keyboard versions of the experiments. Specifically, we observe similar psychometric curves in the random-dot kinematogram across the touchscreen and keyboard versions. Similarly, in the Stroop task, we detect significant task, congruency, and sequential congruency effects in both experiment versions. We conclude that our open-source touchscreen extension serves as a promising tool for data collection in online behavioral experiments on forced-choice tasks.
Title: A jsPsych touchscreen extension for behavioral research on touch-enabled interfaces.
Journal: Behavior Research Methods, pp. 7814-7830. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11549123/pdf/
Pub Date: 2024-10-01 | Epub Date: 2024-07-12 | DOI: 10.3758/s13428-024-02451-y
Diana J N Armbruster-Genç, Rebecca A Rammensee, Stefanie M Jungmann, Philine Drake, Michèle Wessa, Ulrike Basten
Interpretation biases in the processing of ambiguous affective information are assumed to play an important role in the onset and maintenance of emotional disorders. Reports of low reliability for experimental measures of cognitive biases have called into question previous findings on the association of these measures with markers of mental health and demonstrated the need to systematically evaluate measurement reliability for measures of cognitive biases. We evaluated reliability and correlations with self-report measures of mental health for interpretation bias scores derived from the Ambiguous Cue Task (ACT), an experimental paradigm for the assessment of approach-avoidance behavior towards ambiguous affective stimuli. For a non-clinical sample, the measurement of an interpretation bias with the ACT showed high internal consistency (rSB = .91 - .96, N = 354) and acceptable 2-week test-retest correlations (rPearson = .61 - .65, n = 109). Correlations between the ACT interpretation bias scores and mental health-related self-report measures of personality and well-being were generally small (r ≤ |.11|) and statistically not significant when correcting for multiple comparisons. These findings suggest that in non-clinical populations, individual differences in the interpretation of ambiguous affective information as assessed with the ACT do not show a clear association with self-report markers of mental health. However, in allowing for a highly reliable measurement of interpretation bias, the ACT provides a valuable tool for studies considering potentially small effect sizes in non-clinical populations by studying bigger samples as well as for work on clinical populations, for which potentially greater effects can be expected.
Title: The Ambiguous Cue Task: Measurement reliability of an experimental paradigm for the assessment of interpretation bias and associations with mental health.
Journal: Behavior Research Methods, pp. 7774-7789. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362423/pdf/
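The internal-consistency estimate reported in this abstract, rSB, is a Spearman-Brown corrected split-half correlation. A minimal sketch follows; the odd/even item split and the participants-by-items data layout are common conventions assumed here, not necessarily the authors' exact procedure (which may, e.g., average over many random splits).

```python
import numpy as np

def split_half_reliability(item_scores):
    """Spearman-Brown corrected split-half reliability.

    item_scores: participants x items array. Items are split into odd-
    and even-numbered halves, half scores are summed per participant
    and correlated, and the half-test correlation is stepped up to
    full test length with the Spearman-Brown prophecy formula.
    """
    scores = np.asarray(item_scores, dtype=float)
    odd = scores[:, 0::2].sum(axis=1)
    even = scores[:, 1::2].sum(axis=1)
    r = np.corrcoef(odd, even)[0, 1]
    return 2 * r / (1 + r)  # Spearman-Brown step-up
```

The step-up is needed because the raw half-half correlation underestimates the reliability of the full-length measure.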
Pub Date: 2024-10-01 | Epub Date: 2023-08-25 | DOI: 10.3758/s13428-023-02213-2
Yury Shevchenko, Ulf-Dietrich Reips
This manuscript presents a novel geofencing method in behavioral research. Geofencing, built upon geolocation technology, places virtual fences around specific locations. Every time a participant crosses the virtual border around the geofenced area, an event can be triggered on a smartphone; e.g., the participant may be asked to complete a survey. The geofencing method can alleviate the problems of constant location tracking, such as recording sensitive geolocation information and battery drain. In scenarios where locations for geofencing are determined by participants (e.g., home, workplace), no location data need to be transferred to the researcher, so this method can ensure privacy and anonymity. Given the widespread use of smartphones and mobile Internet, geofencing has become a feasible tool in studying human behavior and cognition outside of the laboratory. The method can help advance theoretical and applied psychological science at a new frontier of context-aware research. At the same time, there is a lack of guidance on how and when geofencing can be applied in research. This manuscript aims to fill the gap and ease the adoption of the geofencing method. We describe the current challenges and implementations in geofencing and present three empirical studies in which we evaluated the geofencing method using the Samply application, a tool for mobile experience sampling research. The studies show that sensitivity and precision of geofencing were affected by the type of event, location radius, environment, operating system, and user behavior. Potential implications and recommendations for behavioral research are discussed.
{"title":"Geofencing in location-based behavioral research: Methodology, challenges, and implementation.","authors":"Yury Shevchenko, Ulf-Dietrich Reips","doi":"10.3758/s13428-023-02213-2","DOIUrl":"10.3758/s13428-023-02213-2","url":null,"abstract":"<p><p>This manuscript presents a novel geofencing method in behavioral research. Geofencing, built upon geolocation technology, constitutes virtual fences around specific locations. Every time a participant crosses the virtual border around the geofenced area, an event can be triggered on a smartphone, e.g., the participant may be asked to complete a survey. The geofencing method can alleviate the problems of constant location tracking, such as recording sensitive geolocation information and battery drain. In scenarios where locations for geofencing are determined by participants (e.g., home, workplace), no location data need to be transferred to the researcher, so this method can ensure privacy and anonymity. Given the widespread use of smartphones and mobile Internet, geofencing has become a feasible tool in studying human behavior and cognition outside of the laboratory. The method can help advance theoretical and applied psychological science at a new frontier of context-aware research. At the same time, there is a lack of guidance on how and when geofencing can be applied in research. This manuscript aims to fill the gap and ease the adoption of the geofencing method. We describe the current challenges and implementations in geofencing and present three empirical studies in which we evaluated the geofencing method using the Samply application, a tool for mobile experience sampling research. The studies show that sensitivity and precision of geofencing were affected by the type of event, location radius, environment, operating system, and user behavior. 
Potential implications and recommendations for behavioral research are discussed.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":" ","pages":"6411-6439"},"PeriodicalIF":4.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11362315/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10428016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
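The border-crossing logic at the heart of geofencing can be sketched in a few lines. This is a minimal illustration, not the Samply implementation; the function names, the haversine distance, and the single circular fence are all assumptions:

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points."""
    r = 6371000.0  # mean Earth radius in meters
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def crossing_event(prev_inside, lat, lon, fence_lat, fence_lon, radius_m):
    """Compare the current position against a circular geofence.

    Returns (inside, event), where event is "enter" or "exit" when the
    virtual border was just crossed, and None otherwise.
    """
    inside = haversine_m(lat, lon, fence_lat, fence_lon) <= radius_m
    if inside and not prev_inside:
        return inside, "enter"
    if not inside and prev_inside:
        return inside, "exit"
    return inside, None
```

An "enter" event would be the point at which an app schedules the survey prompt; in practice, mobile operating systems provide this check natively, which is one reason sensitivity and precision vary by operating system.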
Pub Date : 2024-10-01Epub Date: 2024-03-14DOI: 10.3758/s13428-024-02382-8
Biao Chen, Junjie Bu, Xu Jiang, Ping Wang, Yan Xie, Zhuoyun Wang, Zhen Liang, Shengzhao Zhang
Response latency is a critical parameter in studying human behavior, representing the time interval between stimulus onset and the response. However, timing differences between devices can introduce errors. Serial port synchronization signals can mitigate this, but limited information is available regarding their accuracy. Optical signals offer another option, but the difference in position between the optical signal and the visual stimulus can introduce errors, and there have been few reports on reducing them. This study investigates methods for reducing these timing errors. We used Psychtoolbox to generate visual stimuli and serial port synchronization signals and examined their accuracy. We then propose a calibration formula to minimize the error between optical signals and visual stimuli. The findings are as follows: First, the presentation of the serial port synchronization signal precedes the visual stimulation, with a smaller lead time observed at higher refresh rates. Second, the lead time increases as the stimulus position deviates rightward and downward. Under Linux and with IOPort(), serial port synchronization signals exhibited greater accuracy. Given the poor accuracy of serial port synchronization signals and the multiple factors that influence them, we recommend using optical signals for time synchronization. The results indicate that, for the darkening process, the mean time error is -0.23 to 0.08 ms. The calibration formula can help measure response latency accurately. This study provides valuable insights for optimizing experimental design and improving the accuracy of response latency measurements. Although the study involves only visual stimuli, its methods and results can still serve as a reference.
{"title":"The discrepancy in timing between synchronous signals and visual stimulation should not be underestimated.","authors":"Biao Chen, Junjie Bu, Xu Jiang, Ping Wang, Yan Xie, Zhuoyun Wang, Zhen Liang, Shengzhao Zhang","doi":"10.3758/s13428-024-02382-8","DOIUrl":"10.3758/s13428-024-02382-8","url":null,"abstract":"<p><p>Response latency is a critical parameter in studying human behavior, representing the time interval between stimulus onset and the response. However, timing differences between devices can introduce errors. Serial port synchronization signals can mitigate this, but limited information is available regarding their accuracy. Optical signals offer another option, but the difference in position between the optical signal and the visual stimulus can introduce errors, and there have been few reports on reducing them. This study investigates methods for reducing these timing errors. We used Psychtoolbox to generate visual stimuli and serial port synchronization signals and examined their accuracy. We then propose a calibration formula to minimize the error between optical signals and visual stimuli. The findings are as follows: First, the presentation of the serial port synchronization signal precedes the visual stimulation, with a smaller lead time observed at higher refresh rates. Second, the lead time increases as the stimulus position deviates rightward and downward. Under Linux and with IOPort(), serial port synchronization signals exhibited greater accuracy. Given the poor accuracy of serial port synchronization signals and the multiple factors that influence them, we recommend using optical signals for time synchronization. The results indicate that, for the darkening process, the mean time error is -0.23 to 0.08 ms. The calibration formula can help measure response latency accurately. This study provides valuable insights for optimizing experimental design and improving the accuracy of response latency measurements. 
Although the study involves only visual stimuli, its methods and results can still serve as a reference.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":" ","pages":"6673-6686"},"PeriodicalIF":4.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140130648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
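The abstract does not reproduce the calibration formula itself. As a hypothetical illustration of the kind of position-dependent correction involved, the sketch below models the reported rightward/downward increase in lead time as raster scan-out delay within a frame (pixels lower and further right on a display are drawn later in the refresh cycle). All names and the linear model are assumptions, not the authors' formula:

```python
def scanout_delay_ms(row, col, n_rows, n_cols, refresh_hz):
    """Approximate within-frame scan-out delay of a pixel, assuming a
    top-to-bottom, left-to-right raster at the given refresh rate."""
    frame_ms = 1000.0 / refresh_hz
    fraction = (row * n_cols + col) / (n_rows * n_cols)  # share of frame drawn first
    return fraction * frame_ms

def calibrate_onset(measured_ms, row, col, n_rows=1080, n_cols=1920, refresh_hz=60):
    """Correct a measured stimulus-onset time for the scan-out delay
    at the stimulus position (row, col)."""
    return measured_ms - scanout_delay_ms(row, col, n_rows, n_cols, refresh_hz)
```

Under this model, a stimulus at the bottom of a 60 Hz display appears almost a full frame (about 16.7 ms) later than one at the top, which is large relative to the sub-millisecond errors reported for the calibrated optical signal.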
Pub Date : 2024-10-01Epub Date: 2024-05-08DOI: 10.3758/s13428-024-02413-4
Jordan Revol, Ginette Lafit, Eva Ceulemans
Researchers increasingly study short-term dynamic processes that evolve within single individuals using N = 1 studies. The processes of interest are typically captured by fitting a VAR(1) model to the resulting data. A crucial question is how to perform sample-size planning, that is, how to decide on the number of measurement occasions needed. The most popular approach is to perform a power analysis, which focuses on detecting the effects of interest. We argue that performing sample-size planning based on out-of-sample predictive accuracy yields additional important information regarding potential overfitting of the model. Predictive accuracy quantifies how well the estimated VAR(1) model predicts unseen data from the same individual. We propose a new simulation-based sample-size planning method called predictive accuracy analysis (PAA), along with an associated Shiny app. The approach makes use of a novel predictive accuracy metric that accounts for the multivariate nature of the prediction problem. Using simulated data sets and real data applications, we showcase how the values of the different VAR(1) model parameters affect power-based and predictive-accuracy-based sample-size recommendations. The range of recommended sample sizes is smaller for predictive accuracy analysis than for power analysis.
{"title":"A new sample-size planning approach for person-specific VAR(1) studies: Predictive accuracy analysis.","authors":"Jordan Revol, Ginette Lafit, Eva Ceulemans","doi":"10.3758/s13428-024-02413-4","DOIUrl":"10.3758/s13428-024-02413-4","url":null,"abstract":"<p><p>Researchers increasingly study short-term dynamic processes that evolve within single individuals using N = 1 studies. The processes of interest are typically captured by fitting a VAR(1) model to the resulting data. A crucial question is how to perform sample-size planning and thus decide on the number of measurement occasions that are needed. The most popular approach is to perform a power analysis, which focuses on detecting the effects of interest. We argue that performing sample-size planning based on out-of-sample predictive accuracy yields additional important information regarding potential overfitting of the model. Predictive accuracy quantifies how well the estimated VAR(1) model will allow predicting unseen data from the same individual. We propose a new simulation-based sample-size planning method called predictive accuracy analysis (PAA), and an associated Shiny app. This approach makes use of a novel predictive accuracy metric that accounts for the multivariate nature of the prediction problem. We showcase how the values of the different VAR(1) model parameters impact power and predictive accuracy-based sample-size recommendations using simulated data sets and real data applications. 
The range of recommended sample sizes is smaller for predictive accuracy analysis than for power analysis.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":" ","pages":"7152-7167"},"PeriodicalIF":4.6,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140875721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
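To make the notion of out-of-sample predictive accuracy for a person-specific VAR(1) concrete, here is a minimal sketch: fit y_t = c + A y_{t-1} + e_t by least squares on one portion of an individual's time series, then score one-step-ahead predictions on held-out data from the same individual. This is a generic sketch using plain MSE, not the authors' PAA metric; the function names are assumptions:

```python
import numpy as np

def fit_var1(y):
    """Least-squares fit of y_t = c + A @ y_{t-1} + e_t for a T x p series.

    Returns the intercept vector c (p,) and coefficient matrix A (p, p).
    """
    X = np.hstack([np.ones((len(y) - 1, 1)), y[:-1]])  # intercept + lagged values
    B, *_ = np.linalg.lstsq(X, y[1:], rcond=None)      # (1 + p) x p coefficients
    return B[0], B[1:].T

def one_step_mse(y_train, y_test):
    """Out-of-sample one-step-ahead MSE of a VAR(1) fitted on y_train."""
    c, A = fit_var1(y_train)
    preds = y_test[:-1] @ A.T + c                       # predict each next occasion
    return float(np.mean((y_test[1:] - preds) ** 2))
```

A sample-size planning simulation in this spirit would repeat this over many simulated series of varying length T and report how quickly the held-out MSE approaches the innovation variance, i.e., how many measurement occasions are needed before the model stops overfitting.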