Pub Date: 2025-10-10; DOI: 10.3758/s13428-025-02827-8
Holger Mitterer
The study of language processing requires data from a wide range of languages but also data that are free from demand characteristics and meta-linguistic strategies. While eye-tracking has been successfully used to address the latter issue, pragmatically, eye-tracking is often difficult to achieve with less well-studied languages. Therefore, the current paper presents a web-based mouse-tracking task that generates data that seem to reflect early perceptual processes similar to eye-tracking but which can be performed remotely. The task uses a set-up similar to early video games to entice participants to use language input as early as possible. The data presented here replicate an earlier eye-tracking study focusing on how reduced words are recognized. Fillers from the same study are also used, which show that the paradigm also reflects predictive semantic processing. It is concluded that the paradigm can be used to investigate lexical access, prosodic processing, and predictive semantic processing.
Title: A web-based mouse-tracking task for early perceptual language processing. Behavior Research Methods, 57(11), 308. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12513896/pdf/
Pub Date: 2025-10-10; DOI: 10.3758/s13428-025-02847-4
Shaohua Tang, Chunbo Jiang, Zheng Li
Purpose: Mind wandering (MW), a common cognitive phenomenon marked by a shift of attention away from the task at hand, poses significant challenges in online educational settings. This study aims to advance MW detection by developing a classification scheme that leverages multimodal data, including electroencephalograph (EEG) signals and facial video recorded using a commercial off-the-shelf webcam. Additionally, this study provides an in-depth analysis of feature contributions and explores the correlation between self-reported introspective confidence, mental state stability, and classification performance, offering deeper insights into MW detection.
Methods: Data were collected from 26 college students during a video-based learning task, interspersed with modified experience sampling probes. To enhance the sample size and address autocorrelation in EEG signals, a probe-based sample extraction method was applied. MW classification was performed using a random forest algorithm, with features derived from both EEG signals and facial video recordings. Model performance was evaluated using within-participant tenfold cross-validation and leave-one-participant-out (LOPO) cross-validation.
Results: The combination of EEG and video features yielded better performance (AUC = 0.68 for within-participant; AUC = 0.56 for LOPO) compared to using EEG or video alone. Individual differences significantly influenced performance, with a 10% increase in AUC observed when training data included samples from the evaluated individual in augmented LOPO cross-validation. Introspective confidence levels positively correlated with classification performance, while mental state temporal stability was associated with improved cross-participant performance. Additionally, the size of the training set positively correlated with cross-participant performance when combining EEG and video features.
Conclusion: These findings underscore the potential of multimodal approaches for MW detection and highlight the importance of individual differences and data diversity in classifier training. The study provides actionable insights into improving MW detection systems for real-world applications in educational settings.
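The leave-one-participant-out (LOPO) scheme named in the Methods can be illustrated with a minimal sketch. It only shows how samples are split so that no participant appears in both training and test data; the participant labels and sample counts are hypothetical toy values, and the classifier itself (a random forest in the study) is left out.

```python
from collections import defaultdict


def leave_one_participant_out(participant_ids):
    """Yield (held_out_id, train_indices, test_indices), one fold per participant."""
    by_participant = defaultdict(list)
    for index, pid in enumerate(participant_ids):
        by_participant[pid].append(index)
    for held_out, test_indices in by_participant.items():
        # Every sample from every other participant goes into the training set.
        train_indices = [
            index for index, pid in enumerate(participant_ids) if pid != held_out
        ]
        yield held_out, train_indices, test_indices


# Hypothetical toy data: six probe-based samples from three participants.
sample_participants = ["p1", "p1", "p2", "p2", "p3", "p3"]
folds = list(leave_one_participant_out(sample_participants))
```

Each fold trains on all other participants and tests on the held-out one, which is why LOPO performance reflects generalization to unseen individuals, whereas within-participant cross-validation does not.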
Title: Detecting mind wandering via EEG and facial video features. Behavior Research Methods, 57(11), 310. Journal Article.
Pub Date: 2025-10-10; DOI: 10.3758/s13428-025-02821-0
Eunsook Kim, Radhika Sundar, Emma Evudottir, Minjung Kim, Robert Dedrick, Junyeong Yang, Nathaniel von der Embse
Polynomial regression with response surface analysis (PRRSA) has been widely adopted in congruence research when the relation of congruence to an outcome is examined. However, PRRSA assumes that the congruence effects are homogeneous across all individuals. Polynomial regression mixture analysis (PRMix) allows for heterogeneity in the effect of congruence on an outcome across individuals and identifies latent classes of differential congruence effects. In this study, through Monte Carlo simulation, we examined bias in response surface parameters when differential congruence effects were not modeled correctly in PRRSA. We found that the size of the bias depended on the proportion of ignored classes. When evaluating PRMix and nonnormal PRMix, we found that PRMix generally performed well in detecting two latent classes of differential congruence effects when the assumption of residual normality within class was satisfied, but led to severe over-extraction when this assumption was violated. Nonnormal PRMix provided an adequate solution for the generated skew t residual distribution within class. We provide an empirical data example with annotated software syntax to demonstrate the normal and nonnormal PRMix procedures, including model specification and the construction of confidence intervals for the response surface parameters. Practical implications are discussed for applied researchers.
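For readers unfamiliar with response surface parameters, the sketch below computes the four quantities commonly reported in response surface analysis from the coefficients of a second-order polynomial regression, Z = b0 + b1*X + b2*Y + b3*X^2 + b4*X*Y + b5*Y^2. The formulas are the conventional ones from the response surface literature rather than anything stated in the abstract, and the coefficient values are hypothetical. In a mixture model, one such set would be computed per latent class.

```python
def response_surface_parameters(b1, b2, b3, b4, b5):
    """Surface parameters for Z = b0 + b1*X + b2*Y + b3*X**2 + b4*X*Y + b5*Y**2."""
    return {
        "a1": b1 + b2,       # slope along the line of congruence (X = Y)
        "a2": b3 + b4 + b5,  # curvature along the line of congruence
        "a3": b1 - b2,       # slope along the line of incongruence (X = -Y)
        "a4": b3 - b4 + b5,  # curvature along the line of incongruence
    }


# Hypothetical coefficients for one latent class (illustration only).
surface = response_surface_parameters(b1=0.5, b2=0.25, b3=-0.25, b4=0.5, b5=-0.25)
```

With these toy coefficients the surface is flat along the line of congruence (a2 = 0) and curves downward along the line of incongruence (a4 < 0), the pattern usually read as a congruence effect.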
Title: Normal and nonnormal polynomial regression mixture modeling for differential congruence effects: A simulation and tutorial. Behavior Research Methods, 57(11), 309. Journal Article.
Pub Date: 2025-10-08; DOI: 10.3758/s13428-025-02798-w
Matthew S Welhaf, Matt E Meier, Michael J Kane
Previous work has argued that the ability to sustain attention consistency can be best modeled as the individual-difference covariation in objective performance-based measures (e.g., reaction-time [RT] variability; accuracy) and self-report measures of task-unrelated thought (TUT). Latent variable studies demonstrate that a general, higher-order attention consistency factor correlates more strongly with nomological network constructs than do either of the lower-order, measurement-specific factors. The present study aimed to replicate and extend this measurement approach by building a construct-valid battery of sustained attention consistency tasks and testing associations with the conative factors of task interest and success motivation. We analyzed data from 402 subjects who completed a battery of seven attention-consistency functions and found that the hierarchical model provided an adequate fit to the data. Further, attention-consistency associations with motivation and interest, while evident with the lower-order factors, were again stronger with the general higher-order factor (and each conative factor predicted unique variance in general attention consistency in structural regression models). We also refined our task battery by removing poor-performing indicators and demonstrated similar patterns of correlations among the attention and conative factors. We suggest that studies examining attention consistency should use a combination of performance and self-report indicators to capture its individual-differences variation in the most construct-valid way. Finally, we provide recommendations on which tasks and measures might be most useful when measuring sustained attention consistency in future research.
Title: Building a construct-valid battery of performance and self-report indicators of sustained attention consistency. Behavior Research Methods, 57(11), 306. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12507979/pdf/
Pub Date: 2025-10-08; DOI: 10.3758/s13428-025-02695-2
Dirk U Wulff, Pascal J Kieslich, Felix Henninger, Jonas M B Haslbeck, Michael Schulte-Mecklenbeck
Movement tracking is a novel process-tracing method that promises unique access to the temporal dynamics of psychological processes. The method involves high-resolution tracking of a hand or handheld device (e.g., a computer mouse) while it is used to make a choice. In contrast to other process-tracing methods, which mostly focus on information acquisition, movement tracking focuses on the processes of information integration and preference formation. In this article, we present a tutorial on movement tracking of psychological processes with the mousetrap R package. We address all steps of the research process, from design to interpretation, with a particular focus on data processing and analysis and featuring both established and novel approaches. Using a representative working example, we demonstrate how the various steps of movement-tracking analysis can be implemented with mousetrap and provide thorough explanations of their theoretical background and interpretation. Finally, we present a list of recommendations to assist researchers in addressing their own research questions using movement tracking of psychological processes.
Title: Movement tracking of psychological processes: A tutorial using mousetrap. Behavior Research Methods, 57(11), 307. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12508020/pdf/
Pub Date: 2025-10-07; DOI: 10.3758/s13428-025-02844-7
Sanford R Student, Wyatt S Read
Interval scales are frequently assumed in educational and psychological research involving latent variables, but are rarely verified. This paper outlines methods for investigating the interval scale assumption when fitting the Rasch model to item response data. We study a Bayesian method for evaluating an item response dataset's adherence to the cancellation axioms of additive conjoint measurement under the Rasch model, and compare the extent to which the axiom of double cancellation holds in the data at sample sizes of 250 and 1000 with varying test lengths, difficulty spreads, and levels of adherence to the Rasch model in the data-generating process. Because the statistic produced by the procedure is not directly interpretable as an indicator of whether an interval scale can be established, we develop and evaluate procedures for bootstrapping a null distribution of violation rates against which to compare results. At a sample size of 250, the method under investigation is not well powered to detect the violations of interval scaling that we simulate, but the procedure works quite consistently at N = 1000. That is, at moderate but achievable sample sizes, empirical tests for interval scaling are indeed possible.
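As a rough illustration of what the double cancellation axiom asserts, the sketch below checks it deterministically on a small matrix of response probabilities. This is not the Bayesian procedure the paper studies: it ignores sampling error entirely, and the function names, abilities, difficulties, and the violating matrix are all hypothetical. The axiom is used in its standard form: for rows a, b, c and columns x, y, z, if (a, y) >= (b, x) and (b, z) >= (c, y), then (a, z) >= (c, x).

```python
import math
from itertools import permutations


def rasch_probability(theta, difficulty):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def double_cancellation_holds(m):
    """Check one 3x3 arrangement (rows a, b, c; columns x, y, z)."""
    antecedent = m[0][1] >= m[1][0] and m[1][2] >= m[2][1]
    return (not antecedent) or m[0][2] >= m[2][0]


def violation_rate(matrix):
    """Share of ordered 3x3 arrangements of rows and columns that violate the axiom."""
    checked = 0
    violated = 0
    for rows in permutations(range(len(matrix)), 3):
        for cols in permutations(range(len(matrix[0])), 3):
            sub = [[matrix[i][j] for j in cols] for i in rows]
            checked += 1
            if not double_cancellation_holds(sub):
                violated += 1
    return violated / checked


# Hypothetical abilities and item difficulties; Rasch probabilities are additive
# on the logit scale, so no arrangement should violate the axiom.
abilities = [-1.0, 0.1, 1.3]
difficulties = [-0.7, 0.25, 0.9]
rasch_matrix = [[rasch_probability(t, d) for d in difficulties] for t in abilities]

# Hypothetical matrix built so that the antecedents hold but the conclusion fails.
violating_matrix = [[0.5, 0.6, 0.3], [0.5, 0.5, 0.7], [0.4, 0.6, 0.5]]
```

With real data the cell probabilities are estimated with error, which is why a raw violation rate is hard to interpret on its own and why the paper compares it against a bootstrapped null distribution.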
Title: Applying Bayesian checks of cancellation axioms for interval scaling in limited samples. Behavior Research Methods, 57(11), 305. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12504388/pdf/
Pub Date: 2025-10-06; DOI: 10.3758/s13428-025-02838-5
Diego Ramos, Sebastián Moreno, Enrique Canessa, Sergio E Chaigneau
When using the Property Listing Task (PLT) to collect semantic content for a set of concepts (Concept Property Norms, CPNs), coding raw properties into standardized labels poses significant challenges. In this work, we address these challenges by enhancing the Assisted Coding for Property Listing Task (AC-PLT) framework, which facilitates the coding process. The current work conducts an ablation study to optimize AC-PLT by evaluating combinations of text cleaning, embedding models (e.g., Word2Vec, E5, LaBSE), and classification methods (e.g., kNN, SVM, XGBoost). Results show that normalization with the E5 embedding model and kNN classification achieves the highest accuracy, with top-1 test accuracies of 0.523 for CPN27 and 0.608 for CPN120 datasets, outperforming the original AC-PLT baseline. Comparisons with ChatGPT (fine-tuned and one-shot) reveal AC-PLT's superior stability and cost-effectiveness, despite ChatGPT's competitive performance in some cases. The improved AC-PLT framework offers a scalable, efficient solution to manual coding challenges, reducing variability and time constraints. Future work will explore its role as a recommender system for human coders, further enhancing its practical utility in cognitive psychology and psycholinguistics research.
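The embed-then-classify idea behind assisted coding can be sketched with a tiny k-nearest-neighbours classifier. Everything here is a toy assumption: the two-dimensional vectors stand in for real sentence embeddings (such as those an E5 model would produce), the labels are invented, and cosine similarity with majority voting is one plausible kNN configuration rather than the one the authors necessarily used.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_predict(query, examples, k=3):
    """Return the majority label among the k examples most similar to the query."""
    ranked = sorted(
        examples,
        key=lambda example: cosine_similarity(query, example[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical already-coded properties: (toy embedding, standardized label).
coded_examples = [
    ([1.0, 0.1], "has_wings"),
    ([0.9, 0.2], "has_wings"),
    ([0.8, 0.0], "has_wings"),
    ([0.1, 1.0], "is_metal"),
    ([0.2, 0.9], "is_metal"),
]

prediction_a = knn_predict([0.95, 0.15], coded_examples)
prediction_b = knn_predict([0.0, 1.0], coded_examples)
```

Top-1 accuracy, the metric quoted in the abstract, would then be the share of held-out raw properties whose predicted label matches the human-assigned code.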
Title: Towards scalable and reliable coding of semantic property norms: ChatGPT vs. an improved AC-PLT. Behavior Research Methods, 57(11), 302. Journal Article.
Pub Date: 2025-10-06; DOI: 10.3758/s13428-025-02843-8
Clarence Green, Anthony Pak-Hin Kong, Marc Brysbaert, Kathleen Keogh
This paper revisits the age-of-acquisition (AoA) norms of Kuperman et al. (2012). Three studies were conducted. Study 1 reports a crowdsourcing 'megastudy' that obtained 790,024 estimates from participants of the age at which they could first read and write 11,074 early-acquired words from Kuperman et al. (2012). The study aimed to differentiate between oral language receptive AoA and print-based AoA. The results correlate well with the original estimates, offering, as hypothesized, higher AoAs for reading/writing. These are released as supplements to the original norms. Study 2 explored the potential of large language models (LLMs), specifically GPT-4o, to replicate these crowdsourced AoA estimates. The findings indicated a strong correlation between AI-generated estimates and human judgments, showing the utility of AI in estimating AoA and developing norms for psycholinguistic and educational research in lieu of crowdsourcing. Study 3 leveraged AI to extend estimates to all well-known words in Kuperman et al. (2012) and the English Crowdsourcing Project (ECP). Study 3 also investigated a trained model fine-tuned on 2000 ratings from Kuperman et al. (2012). Fine-tuning increased alignment with human ratings, though comparisons with untrained models suggested that fine-tuning is not essential in English for obtaining useful AoA estimates. Both trained and untrained AI-generated norms correlated highly with human ratings and performed well in accounting for word processing times and accuracy in regressions. Uses and limitations of the AI estimates are discussed. All resources are made available in the Open Science Framework and can be used freely for research and education.
Title: Crowdsourced and AI-generated age-of-acquisition (AoA) norms for vocabulary in print: Extending the Kuperman et al. (2012) norms. Behavior Research Methods, 57(11), 304. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12500800/pdf/
Pub Date : 2025-10-06 DOI: 10.3758/s13428-025-02802-3
Kevin Jamey, Hugo Laflamme, Nicholas E V Foster, Simon Rigoulot, Sarah Lippé, Sonja A Kotz, Simone Dalla Bella
Neurodevelopmental disorders like ADHD can affect rhythm perception and production, impacting performance in attention and sensorimotor tasks. Improving rhythmic abilities through targeted training might compensate for deficits in these cognitive functions. We introduce a novel protocol for training rhythmic skills via a tablet-based serious game called Rhythm Workers (RW). This proof-of-concept study tested the feasibility of using RW in children with ADHD. We administered an at-home longitudinal protocol across Canada. A total of 27 children (7 to 13 years) were randomly assigned to either a finger-tapping rhythmic game (RW) or a control game with comparable auditory-motor demands but without beat synchronization (active control condition). Participants played the game for 300 min over 2 weeks. We collected data (self-reported and logged on the device) on game compliance and acceptance. Further, we measured rhythmic abilities using the Battery for the Assessment of Auditory Sensorimotor and Timing Abilities (BAASTA). The current findings show that both games were played for similar durations, were rated similarly for overall enjoyment, and relied on similar motor activity (finger taps), indicating that the two games are well matched. The children who played RW showed improved general rhythmic abilities compared to the controls; these improvements were also positively correlated with the playing duration. We also present evidence that executive functioning improved in those who played RW, but not in the controls. RW demonstrates efficacy in enhancing sensorimotor skills in children with ADHD, which may benefit their executive functioning. A future randomized controlled trial (RCT) with extended training and a larger sample size could further validate these skill transfer effects.
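The random assignment of 27 children to two arms can be illustrated with a minimal sketch. The abstract does not describe the allocation procedure or the resulting group sizes, so the function name `assign_groups`, the participant IDs, the seed, and the alternating split are all assumptions made for illustration only.

```python
import random


def assign_groups(participant_ids, seed):
    """Shuffle participants and alternate them between two arms (hypothetical scheme)."""
    rng = random.Random(seed)  # seeded so the allocation is reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    # Even positions go to one arm, odd positions to the other
    return {"RW": sorted(shuffled[0::2]), "control": sorted(shuffled[1::2])}


# 27 hypothetical participant IDs; the seed value is arbitrary
groups = assign_groups(range(1, 28), seed=2025)
```

With an odd number of participants, a scheme like this yields arms that differ in size by one; the sizes actually used in the study are not given in the abstract.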
Can you beat the music? Validation of a gamified rhythmic training in children with ADHD. Behavior Research Methods, 57(11), 303.
Pub Date : 2025-10-03 DOI: 10.3758/s13428-025-02837-6
Zhaochenze Li, Tongshu Yang, Yanli Huang, Yang Liu, Jiushu Xie
Many studies have used images of novel objects as experimental materials. Existing novel object databases do not provide diverse exemplars, yet many studies need to manipulate or examine the diversity of exemplars. To fill this gap in experimental materials, the present study introduces the Novel Prototype and Exemplar (NPE) database. This database contains 108 prototypes and 2592 exemplars that vary in viewpoint and shape. The present study conducted four experiments to standardize the database and validate nine dimensions of the novel objects in the NPE database. Experiment 1 standardized familiarity, visual complexity, naming difficulty, reality, and comfort. The results revealed that the prototypes in the NPE database performed well on these attributes. Experiment 2 used the same methodology and reported that the prototypes have high absolute novelty, high relative novelty, low name agreement, and low identity agreement. Experiment 3 used multidimensional scaling (MDS) and Monte Carlo simulations and revealed that the similarity between prototypes was low. Experiment 4 generated exemplars by varying viewpoints and shapes and used the same methodology to evaluate the similarity between prototypes and exemplars and between different exemplars. Finally, Experiment 4 used these evaluation results to select appropriate typical and atypical exemplars. In conclusion, the NPE database contains the largest number of novel 3D object images, controls for the largest number of subjective dimensions, and is the first to provide abundant exemplars. As an open-source novel object database, the NPE database will greatly contribute to research in psychology, linguistics, cognitive science, ergonomics, artificial intelligence (AI), and related fields.
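The image counts in the abstract can be cross-checked with a short calculation. The two counts come from the abstract; the per-prototype figure assumes exemplars are spread evenly across prototypes, which the abstract does not state (the even division is consistent with that assumption but does not prove it).

```python
PROTOTYPES = 108  # from the abstract
EXEMPLARS = 2592  # from the abstract

# Total number of images in the database
total_images = PROTOTYPES + EXEMPLARS

# Exemplars per prototype, assuming an even split (an assumption, not stated in the abstract)
exemplars_per_prototype, remainder = divmod(EXEMPLARS, PROTOTYPES)

print(total_images)             # 2700
print(exemplars_per_prototype)  # 24
print(remainder)                # 0
```

The total matches the 2700 images named in the article title.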
Novel Prototype and Exemplar (NPE) database: A set of 2700 novel 3D images with viewpoint and shape variations. Behavior Research Methods, 57(11), 300.