Pub Date: 2026-01-16 · DOI: 10.3758/s13428-025-02926-6
Thomas Schmidt, Maximilian P Wolkersdorfer, Xin Ying Lee, Omar Jubran
One of the most popular approaches to unconscious cognition is the technique of "post hoc selection": Priming effects and visibility ratings are measured in multiple tasks on the same trial, and only trials with the lowest visibility ratings are selected for analysis of (presumably unconscious) priming effects. In the past, the technique has been criticized for creating statistical artifacts and capitalizing on chance. Here, we argue that post hoc selection constitutes a sampling fallacy, confusing sensitivity and response bias and wrongly ascribing unconscious processing to stimulus conditions that may be far from indiscriminable. In response to a high-profile "best practice" paper by Stockart et al. (2025) that condones the technique, we use standard signal detection theory to show that post hoc selection only isolates trials with neutral response bias, irrespective of actual sensitivity, and thus fails to isolate trials where the critical stimulus is "unconscious". Our own data demonstrate that zero-visibility ratings are consistent with uncomfortably high levels of sensitivity. As an alternative to post hoc selection, we advocate the study of functional dissociations, where direct (D) and indirect (I) measures are conceptualized as spanning a two-dimensional D-I space wherein simple, sensitivity, and double dissociations appear as distinct curve patterns. While Stockart et al.'s recommendations cover only a single line of that space where D is close to zero, functional dissociations can utilize the entire space. This circumvents requirements like null visibility and exhaustive reliability, allows for dissociations among different measures of awareness, and supports the planful measurement of functional relationships between direct and indirect measures.
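The signal-detection point can be illustrated with a toy simulation (an editorial sketch with assumed parameters, not the authors' analysis): give the observer a clearly nonzero sensitivity (d' = 1.5) and a single rating criterion, then apply post hoc selection to the zero-visibility trials.

```python
import random

random.seed(1)
D_PRIME = 1.5    # assumed true sensitivity (hypothetical value)
CRITERION = 1.0  # evidence below this yields a "saw nothing" (zero-visibility) rating

trials = []
for _ in range(100_000):
    present = random.random() < 0.5                      # prime present on half the trials
    evidence = random.gauss(D_PRIME if present else 0.0, 1.0)
    rating_zero = evidence < CRITERION                   # lowest visibility rating
    trials.append((present, rating_zero))

# Post hoc selection: keep only the zero-visibility trials
selected = [present for present, zero in trials if zero]
p_present = sum(selected) / len(selected)
print(f"P(prime present | zero-visibility rating) = {p_present:.3f}")
```

If zero-visibility ratings implied zero sensitivity, present and absent trials would be equally likely in the selected subsample (0.5). With d' = 1.5 the proportion falls well below 0.5: the rating criterion sorts trials by response bias while substantial sensitivity remains, which is the sampling fallacy the abstract describes.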
Behavior Research Methods, 58(2), 39. Title: "Unconscious cognition without post hoc selection artifacts: From selective analysis to functional dissociations." Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12811327/pdf/
Pub Date: 2026-01-16 · DOI: 10.3758/s13428-025-02922-w
Scott Crossley, Joon Suh Choi, Kenny Tang, Laurie Cutting
This study documents and assesses the Tool for Automatic Analysis of Decoding Ambiguity (TAADA). TAADA calculates measures related to decoding, including metrics for grapheme and phoneme counts, neighborhood effects, rhymes, and conditional probabilities for sound-spelling relationships. These measures are assessed in two reading studies. The first study examined links between decoding variables and judgments of reading ease in a corpus of ~5,000 reading excerpts, finding that variables related to word frequency, phonographic neighbors, word syllable length, and the reverse prior probability for consonants explained 34% of the variance in the reading scores. The second study examined links between decoding variables and student reading miscues, finding that word frequency, phoneme counts, rhyme counts, and probability counts explained 3% of the variance in students' reading miscues.
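The conditional probabilities for sound-spelling relationships can be sketched with a minimal example (a hypothetical toy lexicon, not TAADA's actual implementation): count aligned grapheme-phoneme pairs and normalize within each grapheme.

```python
from collections import Counter, defaultdict

# Hypothetical aligned grapheme-phoneme pairs, illustrating the kind of
# sound-spelling conditional probability a decoding tool might report.
pairs = [("c", "k"), ("c", "k"), ("c", "s"), ("a", "ae"), ("a", "ei"), ("a", "ae")]

counts = defaultdict(Counter)
for grapheme, phoneme in pairs:
    counts[grapheme][phoneme] += 1

def p_phoneme_given_grapheme(g, p):
    """P(phoneme | grapheme) estimated from the pair counts."""
    total = sum(counts[g].values())
    return counts[g][p] / total if total else 0.0

print(p_phoneme_given_grapheme("c", "k"))  # -> 2/3: "c" is spelled-to-sound /k/ twice out of three
```

The "reverse prior probability" the abstract mentions would be the analogous count run in the sound-to-spelling direction, i.e., P(grapheme | phoneme).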
Behavior Research Methods, 58(2), 40. Title: "The Tool for Automatic Analysis of Decoding Ambiguity (TAADA)." Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12811272/pdf/
Pub Date: 2026-01-16 · DOI: 10.3758/s13428-025-02911-z
Wendie Yang, Shu Fai Cheung
Standardized coefficients - including factor loadings, correlations, and indirect effects - are fundamental to interpreting structural equation modeling (SEM) results in psychology. However, they often exhibit skewed sampling distributions in finite samples, which are not captured by conventional symmetric confidence intervals (CIs). Methods such as bootstrap CIs, which do not impose symmetry, are more appropriate for these coefficients. Despite its popularity, the R package lavaan (version 0.6-19 or earlier) provides limited bootstrap support for standardized coefficients: its function standardizedSolution() uses the delta method for CIs and lacks bootstrap p values. The package does offer a flexible and powerful bootstrapping function, bootstrapLavaan(), which can be used to form bootstrap CIs for standardized coefficients, but it requires a certain level of R coding skill. Moreover, no built-in functions are available to inspect bootstrap distributions, which is recommended for assessing the stability of bootstrap estimates. To address these limitations, we developed the semboottools R package, which provides a simple SEM workflow for forming bootstrap confidence intervals for unstandardized and standardized estimates of model and user-defined parameters. It allows researchers to generate percentile or bias-corrected bootstrap CIs, standard errors, and asymmetric p values, to compare the bootstrap CIs with other CI methods (e.g., the delta method), and to visualize the distributions of bootstrap estimates, all with minimal coding effort. We believe the tool can help researchers easily form bootstrap CIs, compare different CI methods to assess the need for bootstrapping, and examine the distribution of bootstrap estimates to assess their stability.
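The percentile bootstrap logic underlying such CIs can be sketched generically (a toy Python illustration with simulated data; this is not the semboottools or lavaan API): resample cases with replacement, recompute the standardized statistic each time, and read the CI off the empirical quantiles, so no symmetry is imposed.

```python
import math
import random

random.seed(0)

# Hypothetical data: a standardized coefficient (here, a correlation) in a small sample
n = 40
x = [random.gauss(0, 1) for _ in range(n)]
y = [0.6 * xi + random.gauss(0, 0.8) for xi in x]

def corr(xs, ys):
    """Pearson correlation, a standardized coefficient with a skewed finite-sample distribution."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

# Percentile bootstrap: resample cases with replacement and recompute the statistic
boot = []
for _ in range(2000):
    idx = [random.randrange(n) for _ in range(n)]
    boot.append(corr([x[i] for i in idx], [y[i] for i in idx]))
boot.sort()
lo, hi = boot[49], boot[1949]  # empirical 2.5th and 97.5th percentiles of 2000 replicates
print(f"r = {corr(x, y):.3f}, 95% percentile bootstrap CI [{lo:.3f}, {hi:.3f}]")
```

Because the interval endpoints come from the resampled distribution itself, the CI can be asymmetric around the point estimate, which is exactly the property that delta-method CIs lack for skewed coefficients.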
Behavior Research Methods, 58(2), 38. Title: "Forming bootstrap confidence intervals and examining bootstrap distributions of standardized coefficients in structural equation modelling: A simplified workflow using the R package semboottools."
Pub Date: 2026-01-16 · DOI: 10.3758/s13428-025-02927-5
Jonathan D'hondt, Barbara Briers
Understanding food preferences plays a crucial role in addressing both health concerns, such as obesity, and environmental concerns, such as climate change. Recognizing the impact of lay beliefs on food preferences is essential to addressing these challenges. One prevalent belief is the "unhealthy = tasty intuition" (UTI): the belief that taste and health in food do not go together. While self-report scales and behavioral tasks are both commonly used to measure such beliefs, they serve distinct methodological purposes: scales are better suited for assessing stable, trait-like constructs, whereas tasks capture more dynamic processes and are well suited for experimental manipulation. This paper introduces a mouse-tracking classification task that yields a process-based behavioral index of the UTI, offering a novel approach to assessing implicit beliefs about the relationship between taste and health in food. Three studies validate the task, demonstrating correlations between explicit UTI scores and task performance. Additionally, the task predicts actual food consumption and, importantly, is sensitive to contextual manipulations. Because the task can be adapted to measure other beliefs, it is a valuable tool for researchers studying individual lay beliefs and decision-making processes. To that end, a template of the task is provided to help other researchers build on this work.
Behavior Research Methods, 58(2), 37. Title: "A mouse-tracking classification task to measure the unhealthy = tasty intuition." Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12811306/pdf/
Pub Date: 2026-01-16 · DOI: 10.3758/s13428-025-02914-w
Lisa Loy, James P Trujillo, Floris Roelofsen
Gesture recognition technology is a popular area of research, offering applications in many fields, including behaviour research, human-computer interaction (HCI), medical research, and surveillance culture, among others. However, the large quantity of data needed to train a recognition algorithm is not always available, and differences between the training set and one's own research data in factors such as recording conditions and participant characteristics may hinder transferability. To address these issues, we propose training and testing recognition algorithms on virtual agents, a tool that has not yet been used for this purpose in multimodal communication research. We provide an example use case with step-by-step instructions, using motion-capture (mocap) data to animate a virtual agent and create customised lighting conditions, backgrounds, and camera angles, yielding a virtual agent-only dataset on which to train and test a gesture recognition algorithm. This approach also allows us to assess the impact of particular features, such as background and lighting. Our best-performing model achieved an accuracy of 85.9% under optimal background and lighting conditions. When background clutter and reduced lighting were introduced, accuracy dropped to 71.6%. When the virtual agent-trained model was tested on images of humans, the accuracy of target handshape classification ranged from 72% to 95%. The results suggest that training an algorithm on artificial data (1) is a resourceful, convenient, and effective way to customise algorithms, (2) potentially addresses issues of data sparsity, and (3) can be used to assess the impact of many contextual and environmental factors that would not be feasible to assess systematically using human data.
Behavior Research Methods, 58(2), 41. Title: "Virtual agents as a scalable tool for diverse, robust gesture recognition." Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12811268/pdf/
Pub Date: 2026-01-12 · DOI: 10.3758/s13428-025-02869-y
Niek Stevenson, Michelle C Donzallaz, Reilly J Innes, Birte U Forstmann, Dora Matzke, Andrew Heathcote
EMC2 is an R package that provides a comprehensive five-phase workflow for Bayesian hierarchical analysis of cognitive models of choice. In the design phase, EMC2 bridges the gap between standard regression analyses and cognitive modeling through linear-model specifications for cognitive-model parameters. In the Bayesian specification and sampling phases, the package provides flexible priors, hierarchical structures, and efficient sampling algorithms, enabling fast, user-friendly estimation of computationally intensive cognitive models. In the final two phases, EMC2 provides a suite of functions for model criticism and inference. Using two leading evidence-accumulation models for illustration, we provide a tutorial on the EMC2-based workflow that eases and guides the process of specifying, evaluating, refining, comparing, and interpreting Bayesian hierarchical cognitive models.
Behavior Research Methods, 58(1), 35. Title: "Bayesian hierarchical cognitive modeling with the EMC2 package."
Pub Date: 2026-01-12 · DOI: 10.3758/s13428-025-02823-y
Carolina Guidolin, Johannes Zauner, Steffen Lutz Hartmeyer, Manuel Spitschan
In field studies using wearable light loggers, participants often need to remove the devices, resulting in non-wear intervals of varying and unknown duration. Accurate detection of these intervals is an essential step during data pre-processing. Here, we deployed a multi-modal approach to collect non-wear time during a longitudinal light exposure collection campaign and systematically compare non-wear detection strategies. Healthy participants (n = 26; mean age 28 ± 5 years, 14F) wore a near-corneal plane light logger for 1 week and reported non-wear events in three ways: pressing an "event marker" button on the light logger, placing it in a black bag, and using an app-based Wear log. Wear log entries, checked twice daily, served as ground truth for non-wear detection, showing that non-wear time constituted 5.4 ± 3.8% (mean ± SD) of total participation time. Button presses at the start and end of non-wear intervals were identified in >85.4% of cases when considering time windows beyond 1 min for detection. To detect non-wear intervals based on black bag use and lack of motion, we employed an algorithm that detects clusters of low illuminance and clusters of low activity. Performance was higher for illuminance (F1 = 0.78) than for activity (F1 = 0.52). Light exposure metrics derived from the full dataset, a dataset filtered for non-wear based on self-reports, and a dataset filtered for non-wear using the low illuminance clusters detection algorithm showed minimal differences. Our results highlight that while non-wear detection may be less critical in high-compliance cohorts, systematically collecting and detecting non-wear intervals is feasible and important for ensuring robust data pre-processing.
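The cluster-based detection idea can be sketched as follows (a minimal illustration with hypothetical thresholds, not the authors' algorithm): scan the illuminance trace for runs of samples below a darkness threshold that last at least a minimum duration, and flag those runs as candidate non-wear intervals.

```python
def low_illuminance_clusters(lux, threshold=1.0, min_len=5):
    """Return (start, end) index pairs of runs where illuminance stays below
    `threshold` for at least `min_len` consecutive samples; `end` is exclusive.
    Threshold and minimum duration are illustrative, not the study's values."""
    clusters, start = [], None
    for i, v in enumerate(lux):
        if v < threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                clusters.append((start, i))
            start = None
    if start is not None and len(lux) - start >= min_len:  # run extends to the end
        clusters.append((start, len(lux)))
    return clusters

# 3 dark samples (too short to count), then 6 dark samples (a candidate non-wear interval):
lux = [120, 0.2, 0.1, 0.3, 90, 85, 0.0, 0.1, 0.2, 0.1, 0.0, 0.1, 200]
print(low_illuminance_clusters(lux))  # [(6, 12)]
```

The minimum-duration requirement is what separates genuine non-wear (e.g., the device in a black bag) from brief dark episodes such as a pocketed sensor or a dark room, and an analogous scan over an activity signal would give the low-activity variant the abstract compares against.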
Behavior Research Methods, 58(1), 36. Title: "Collecting, detecting, and handling non-wear intervals in longitudinal light exposure data." Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12795912/pdf/
Pub Date: 2026-01-12 · DOI: 10.3758/s13428-025-02910-0
Tuǧçe Nur Pekçetin, Gaye Aşkın, Şeyda Evsen, Tuvana Dilan Karaduman, Badel Barinal, Jana Tunç, Cengiz Acarturk, Burcu A Urgen
We present the HR-ACT (Human-Robot Action) Database, a comprehensive collection of 80 standardized videos featuring matched communicative and noncommunicative actions performed by both a humanoid robot (Pepper) and a human actor. We describe the creation of 40 action exemplars per agent, with actions executed in a similar manner, timing, and number of repetitions. The database includes detailed normative data collected from 438 participants, providing metrics on action identification, confidence ratings, communicativeness ratings, meaning clusters, and H values (an entropy-based measure reflecting response homogeneity). We provide researchers with controlled yet naturalistic stimuli in multiple formats: videos, image frames, and raw animation files (.qanim). These materials support diverse research applications in human-robot interaction, cognitive psychology, and neuroscience. The database enables systematic investigation of action perception across human and robotic agents, while the inclusion of raw animation files allows researchers using Pepper robots to implement these actions for real-time experiments. The full set of stimuli, along with comprehensive normative data and documentation, is publicly available at https://osf.io/8vsxq/ .
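The H value is described above only as an entropy-based measure of response homogeneity; a common formulation in norming studies (the name-agreement H statistic in the tradition of Snodgrass and Vanderwart) can be sketched over hypothetical response counts:

```python
import math

def h_value(counts):
    """Entropy-based H statistic over response frequencies:
    H = sum_i p_i * log2(1 / p_i), where p_i is the proportion of
    participants giving response i. H = 0 means perfect agreement;
    larger H means more heterogeneous responses."""
    total = sum(counts)
    return sum((c / total) * math.log2(total / c) for c in counts if c)

print(h_value([20]))        # 0.0: all 20 participants gave the same response
print(h_value([10, 10]))    # 1.0: two equally common responses
print(h_value([12, 6, 2]))  # intermediate agreement across three responses
```

Whether HR-ACT uses exactly this formulation is an assumption here; the sketch only shows how an entropy measure turns a distribution of action-identification responses into a single homogeneity score.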
Behavior Research Methods, 58(1), 34. Title: "HR-ACT (Human-Robot Action) Database: Communicative and noncommunicative action videos featuring a human and a humanoid robot." Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12795972/pdf/
Pub Date: 2026-01-09 · DOI: 10.3758/s13428-025-02918-6
Giorgio Piazza, Natalia Kartushina, Christoforos Souganidis, James E Flege, Clara D Martin
Psycholinguistic research has become increasingly reliant on online experimentation, making it an attractive approach for studying speech production. However, concerns remain about data quality and participant engagement in online settings. In this preregistered study, we used two tasks (picture naming and reading aloud) to test whether the lexical frequency effect (high-frequency words having shorter speech onset times than low-frequency words) could be reliably detected in the online environment (run at home), both with and without experimenter supervision. Participants completed the same two tasks at home and in the lab; half of the participants performed both tasks with supervision and the other half unsupervised. In the naming task, all conditions yielded consistent frequency effects (~27-41 ms), comparable to previous online and lab findings. In the reading-aloud task, the lexical frequency effect emerged in all conditions except the home-supervised condition, where the effect was in the expected direction but nonsignificant (~12 ms). Notably, participants were overall faster at home than in the lab (~10 ms), and unsupervised settings yielded the largest effect sizes. This suggests that experimenter presence may inadvertently dampen subtle effects, possibly due to increased self-monitoring or reduced comfort. Such findings support the reliability of online platforms for speech production research in psycholinguistics and highlight the nuanced influence of supervision on speech outcomes.
Title: "Speech onset time at home or in the lab: The role of testing environment and experimenter presence." Behavior Research Methods, 58(1), 33.
Pub Date: 2026-01-06 | DOI: 10.3758/s13428-025-02905-x
Nienke E R van Bueren, Anne H van Hoogmoed, Sanne H G van der Ven, Lisa M Jonkman
Electroencephalography (EEG) provides valuable insights into brain development, but collecting high-quality data can be tedious, limiting its usability with children. This study evaluates the feasibility and reliability of EEG data acquisition in children with a wireless consumer-grade EEG headset (EMOTIV EPOC X) by comparing it to a research-grade system (BioSemi ActiveTwo), with a focus on aperiodic brain activity. The portability of the EMOTIV headset allows for EEG data collection in ecologically valid, real-world settings such as schools, enabling novel insights into brain activity during learning. We recorded EEG from 93 children (aged 9-10 years) using the EMOTIV headset in a classroom environment, beginning with a 4-min resting-state measurement, followed by assessments of mathematical ability, visuospatial working memory, and verbal working memory. Aperiodic activity, thought to reflect fundamental aspects of neural excitability and cognitive processing, was extracted, and its reliability was compared across the two EEG systems. We further tested whether aperiodic activity recorded with EMOTIV predicts mathematical ability, replicating earlier research that used research-grade EEG equipment. Consistent with those earlier findings, lower aperiodic activity was associated with higher math performance, supporting its role as a neural marker of cognitive ability. These results demonstrate the feasibility and reliability of using a consumer-grade mobile EEG headset to investigate individual differences in cognitive development in naturalistic contexts. This work opens up new opportunities for large-scale, school-based neurocognitive assessments and paves the way for personalized educational approaches based on neural profiles.
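The abstract does not specify the fitting algorithm (tools such as specparam/FOOOF are commonly used and also model periodic peaks). As a simplified, assumed illustration of what "extracting aperiodic activity" means, the aperiodic exponent can be estimated as the negative slope of a line fit to the power spectrum in log-log space:

```python
import numpy as np

def aperiodic_exponent(freqs, psd):
    """Estimate the aperiodic (1/f) exponent of a power spectrum.

    Fits a line to log10(power) vs. log10(frequency); the negative
    slope is the aperiodic exponent. This is a simplified stand-in
    for specparam-style fitting, which additionally models
    oscillatory peaks on top of the aperiodic component.
    """
    slope, offset = np.polyfit(np.log10(freqs), np.log10(psd), 1)
    return -slope, offset

# Synthetic spectrum with a true 1/f exponent of 1.5 plus small noise
rng = np.random.default_rng(0)
freqs = np.linspace(1, 40, 200)                       # Hz
psd = freqs ** -1.5 * 10 ** (0.01 * rng.standard_normal(200))
exponent, offset = aperiodic_exponent(freqs, psd)
print(f"estimated exponent: {exponent:.2f}")  # close to 1.5
```

On real EEG, the spectrum would first be computed per channel (e.g., via Welch's method) and the fit restricted to a frequency range free of strong line noise; those choices are left out of this sketch.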
Title: "Comparing aperiodic activity in consumer-grade and research-grade EEG: Reliability and association with mathematical ability." Behavior Research Methods, 58(1), 32. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12775036/pdf/