Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02801-4
Hanna Rajh-Weber, Stefan Ernest Huber, Martin Arendasy
Selecting an appropriate statistical method is a challenge frequently encountered by applied researchers, especially when the assumptions of classical, parametric approaches are violated. To provide guidance, we compared classical hypothesis tests, with their typical distributional assumptions of normality and homoskedasticity, against common and easily accessible alternative inference methods (HC3, HC4, and six bootstrap methods) in the framework of ordinary least squares (OLS) regression. The methods' performance was assessed for four different regression models with varying levels of non-normality and heteroskedasticity of errors, and for five sample sizes ranging from 25 to 500 cases. For each scenario, 10,000 samples of observations were generated. Type I error and coverage rates, power, and standard error bias were examined to assess the methods' performance. No method considered here performed satisfactorily on all accounts. Using HC3 or HC4 standard errors, or a wild bootstrap procedure with percentile confidence intervals, yielded reliable results in many, but not all, scenarios. When assumptions are violated, we suggest that researchers adopt the method that performed best in the scenario most similar to their own data situation. To aid the selection of an appropriate method, we provide tables comparing relative performances across all considered scenarios.
"A practice-oriented guide to statistical inference in linear modeling for non-normal or heteroskedastic error distributions." Behavior Research Methods, 57(12), 338. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602623/pdf/
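For readers who want to see what an HC3 correction involves, here is a minimal NumPy sketch (an illustration, not the authors' code): OLS coefficients are computed as usual, and HC3 inflates each squared residual by its leverage before forming the sandwich covariance. The function name and simulated data are our own.

```python
import numpy as np

def ols_hc3_se(X, y):
    """OLS coefficients with HC3 heteroskedasticity-consistent standard errors.

    X: (n, p) design matrix including an intercept column; y: (n,) response.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)      # leverage values h_ii
    omega = resid**2 / (1.0 - h)**2                  # HC3 residual weighting
    cov = XtX_inv @ (X.T * omega) @ X @ XtX_inv      # sandwich covariance
    return beta, np.sqrt(np.diag(cov))

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
# Heteroskedastic errors: noise scale grows with |x|
y = 1.0 + 0.5 * x + rng.normal(scale=1.0 + np.abs(x), size=n)
beta, se = ols_hc3_se(X, y)
```

In practice, the same quantities are available from statsmodels via `OLS(y, X).fit(cov_type="HC3")`, which avoids hand-rolling the sandwich estimator.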
The measurement of pupil size is a classic tool in psychophysiology, but its popularity has recently surged due to the rapid developments of the eye-tracking industry. Concurrently, several authors have outlined a wealth of strategies for tackling pupillary recordings analytically. The consensus is that the "temporal" aspect of changes in pupil size is key, and that the analytical approach should be mindful of the temporal factor. Here we take a more radical stance on the matter by suggesting that, by the time significant changes in pupil size are detected, it is already too late. We suggest that these changes are the result of distinct, core physiological processes that originate several hundred milliseconds before that moment and together shape the observed signal. These processes can be recovered indirectly by leveraging dimensionality reduction techniques. We therefore outline key concepts of temporal principal components analysis and related rotations to show that they reveal a latent, low-dimensional space that represents these processes very efficiently: a pupillary manifold. We elaborate on why assessing the pupillary manifold provides an alternative, appealing solution for data analysis. In particular, dimensionality reduction returns scores that are (1) mindful of the relevant physiology underlying the observed changes in pupil size, (2) handy and manageable for statistical modelling, and (3) devoid of several arbitrary choices. We elaborate on these points in the form of a tutorial paper for the functions provided in the accompanying R library "Pupilla."
"Dimensionality reduction techniques in pupillometry research: A primer for behavioral scientists." Behavior Research Methods, 57(12), 337. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602682/pdf/
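The temporal PCA at the heart of this approach can be sketched in a few lines of NumPy (a toy illustration, not the Pupilla implementation): trials form the rows, time samples the columns, and the SVD of the centered matrix yields per-trial scores plus the temporal profile of each component.

```python
import numpy as np

def temporal_pca(traces, n_components=2):
    """PCA over time points: rows are trials, columns are time samples.

    Returns per-trial scores, the time-course loading of each component,
    and the proportion of variance each component explains.
    """
    centered = traces - traces.mean(axis=0)
    U, s, Vt = np.linalg.svd(centered, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # one row of scores per trial
    loadings = Vt[:n_components]                      # one temporal profile per component
    explained = s[:n_components]**2 / np.sum(s**2)
    return scores, loadings, explained

# Synthetic pupil-like traces: a shared slow dilation with random amplitude, plus noise
rng = np.random.default_rng(1)
t = np.linspace(0, 2, 100)
dilation = np.exp(-(t - 1.0)**2 / 0.1)               # a dilation "component"
traces = rng.normal(size=(40, 1)) * dilation + 0.05 * rng.normal(size=(40, 100))
scores, loadings, explained = temporal_pca(traces)
```

The per-trial `scores` are the low-dimensional quantities one would then pass to a statistical model, in place of the full 100-sample traces.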
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02852-7
Cameron S Kay
Several prior studies have used advanced methodological techniques to demonstrate that there is an issue with the quality of data that can be collected on Amazon's Mechanical Turk (MTurk). The goal of the present project was to provide an accessible demonstration of this issue. We administered 27 semantic antonyms, pairs of items that assess clearly contradictory content (e.g., "I talk a lot" and "I rarely talk"), to samples drawn from Connect (N1 = 100), Prolific (N2 = 100), and MTurk (N3 = 400; N4 = 600). Despite most of these item pairs being negatively correlated on Connect and Prolific, over 96% were positively correlated on MTurk. This issue could not be remedied by screening the data using common attention check measures, or by recruiting only "high-productivity" and "high-reputation" participants. These findings provide clear evidence that data collected on MTurk simply cannot be trusted.
"Why you shouldn't trust data collected on MTurk." Behavior Research Methods, 57(12), 340.
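The semantic-antonym logic is easy to reproduce (an illustrative simulation, not the study's materials): generate a latent trait, derive one item and its reversal, and check that attentive responding produces a negative correlation between the pair.

```python
import numpy as np

def antonym_pair_correlations(responses, pairs):
    """Correlate each antonym item pair; responses is (participants, items)."""
    return [float(np.corrcoef(responses[:, i], responses[:, j])[0, 1])
            for i, j in pairs]

rng = np.random.default_rng(2)
n = 500
trait = rng.normal(size=n)                        # latent talkativeness
item_talk = trait + 0.5 * rng.normal(size=n)      # "I talk a lot"
item_quiet = -trait + 0.5 * rng.normal(size=n)    # "I rarely talk"
responses = np.column_stack([item_talk, item_quiet])
corrs = antonym_pair_correlations(responses, [(0, 1)])
```

Random or careless responding would break the link to the latent trait and push such correlations toward zero or, with acquiescent "agree with everything" patterns, into positive territory, which is the signature the study reports for MTurk.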
Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02820-1
Jesse Boot, Jill de Ron, Jonas Haslbeck, Sacha Epskamp
In psychological studies, it is common practice to select a sample based on the sum score of the modeled variables (e.g., based on symptom severity when investigating the associations between those same symptoms). However, this practice introduces bias if the sum score selection imperfectly defines the population of interest. Here, we propose a correction for this type of selection bias in the Ising model, a popular network model for binary data. Our correction applies when one wants to obtain (1) full population estimates when only the sum score subset of the data is available, and (2) improved estimates of a subpopulation, when one observes a mixture of populations that differ from each other in the sum score. In a simulation study, we verify that our correction recovers the network structure of the desired population after a sum score selection, using both node-wise regression and multivariate estimation of the Ising model. In an example, we show how our correction can be used in practice with empirical data on symptoms of major depression from the National Comorbidity Study Replication (N = 9,282). We implemented our correction in four commonly used R packages for estimating the Ising model, namely IsingFit, IsingSampler, psychonetrics, and bootnet.
"Correcting for selection bias after conditioning on a sum score in the Ising model." Behavior Research Methods, 57(12), 341. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602587/pdf/
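The bias being corrected here can be demonstrated in a few lines (a toy illustration of the problem, not the authors' correction): two statistically independent binary "symptoms" become negatively associated once the sample is selected on their sum score, a form of Berkson's paradox.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
# Two independent binary "symptoms" (no true association in the population)
a = (rng.random(n) < 0.5).astype(float)
b = (rng.random(n) < 0.5).astype(float)
full_corr = np.corrcoef(a, b)[0, 1]          # approximately zero

# Select participants with sum score >= 1, mimicking severity-based inclusion
keep = (a + b) >= 1
sel_corr = np.corrcoef(a[keep], b[keep])[0, 1]   # clearly negative
```

For p = 0.5 the selected-sample correlation is about -0.5 despite true independence; network estimates fitted to such a subsample inherit exactly this kind of distortion, which is what the proposed correction addresses.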
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02868-z
Yu Wang, Wen Qu
Natural language is a primary medium for expressing thoughts and emotions, making text analysis a vital tool in psychological research. It enables insights into personality traits, mental health, and sentiment in interpersonal communication. Traditional approaches - such as human coding, dictionary-based methods, or training models from scratch - often suffer from limitations, including inefficiency, incomplete coverage, or high data requirements. This tutorial introduces the pretrain-finetune paradigm, a transformative approach in text analysis and natural language processing (NLP) that leverages large pretrained language models. Unlike conventional methods, this paradigm enables efficient fine-tuning even with limited labeled data, making it particularly valuable for social science research where annotated samples are scarce. Our tutorial offers a comprehensive introduction to the pretrain-finetune framework, beginning with core concepts of pretraining and fine-tuning, followed by hands-on exercises with real-world applications, the introduction of finetuneR, an R package developed to make these methods accessible to R users, and concluding with a discussion of common misconceptions in existing resources and best practices. We demonstrate its effectiveness across diverse tasks, including multi-class classification and regression, showing its advantages over traditional methods, feature extraction-based approaches, and GPT-based strategies. By emphasizing its efficiency, accessibility, and superior performance, this tutorial aims to encourage broader adoption of the pretrain-finetune paradigm in psychological and behavioral research.
"A tutorial on fine-tuning pretrained language models: Applications in social and behavioral science research." Behavior Research Methods, 57(12), 336.
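The pretrain-finetune idea can be illustrated in miniature without any language model (a purely conceptual sketch, not finetuneR or a transformer): "pretrain" a logistic regression on plentiful data from a related task, then continue training from those weights on a small labeled target set, rather than starting from scratch.

```python
import numpy as np

def train_logreg(X, y, w, lr=0.1, steps=200):
    """Gradient-descent logistic regression, starting from weights w."""
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w = w - lr * X.T @ (p - y) / len(y)
    return w

rng = np.random.default_rng(4)
true_w = np.array([2.0, -1.0, 0.5])

# "Pretraining": plentiful labeled data from a related task
X_big = rng.normal(size=(5000, 3))
y_big = (X_big @ true_w + rng.normal(size=5000) > 0).astype(float)
w_pre = train_logreg(X_big, y_big, np.zeros(3))

# "Fine-tuning": only a handful of labeled examples for the target task
X_small = rng.normal(size=(20, 3))
y_small = (X_small @ true_w + rng.normal(size=20) > 0).astype(float)
w_fine = train_logreg(X_small, y_small, w_pre.copy(), steps=50)

# Evaluate on held-out target-task data
X_test = rng.normal(size=(2000, 3))
y_test = (X_test @ true_w > 0).astype(float)
acc = np.mean(((X_test @ w_fine) > 0).astype(float) == y_test)
```

The same logic scales up in the tutorial's setting: the pretrained language model supplies the warm start, and fine-tuning adapts it with far fewer labeled examples than training from scratch would require.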
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02819-8
Nir Ofir, Ayelet N Landau
Multiple systems in the brain track the passage of time and can adapt their activity to temporal requirements. While the neural implementation of timing varies widely between neural substrates and behavioral tasks, at the algorithmic level, many of these behaviors can be described using drift-diffusion models of decision-making. In this work, we develop a drift-diffusion model to fit performance in the temporal generalization task, in which participants are required to categorize an interval as being the same or different compared to a standard, or reference, duration. The model includes a drift-diffusion process which starts with interval onset, representing the internal estimate of elapsed duration, and two boundaries. If the drift-diffusion process at interval offset is between the boundaries, the interval is categorized as equal to the standard. If it is below the lower boundary or above the upper boundary, the interval is categorized as different. This model outperformed previous models in fitting the data of single participants and in parameter recovery analyses. We also used the drift-diffusion model to analyze data from two experiments, one comparing performance between vision and audition and another examining the effect of learning. We found that decision boundaries can be modified independently: While the upper boundary was higher in vision than in audition, the lower boundary decreased with learning in the task. In both experiments, timing noise was positively correlated with upper boundaries across participants, which reflects an accuracy-maximizing strategy in the task.
"A drift-diffusion model of temporal generalization outperforms existing models and captures modality differences and learning effects." Behavior Research Methods, 57(12), 334. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12589256/pdf/
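The model's decision rule can be sketched as follows (illustrative parameter values, not the authors' fitting code): a noisy accumulator runs from interval onset to offset, and the response is "same" only if its final state falls between the two boundaries.

```python
import numpy as np

def temporal_generalization_trial(duration, rng, drift=1.0, noise=0.3,
                                  low=0.8, high=1.2, dt=0.01):
    """One trial: accumulate noisy evidence for `duration` seconds; the
    interval is judged "same" as the standard if the final state of the
    accumulator lies between the lower and upper decision boundaries."""
    n_steps = int(round(duration / dt))
    increments = drift * dt + noise * np.sqrt(dt) * rng.normal(size=n_steps)
    estimate = increments.sum()          # internal estimate of elapsed duration
    return low <= estimate <= high

rng = np.random.default_rng(5)
# With drift 1.0, a 1.0-s standard lands near the middle of [0.8, 1.2],
# while a 0.5-s interval usually ends below the lower boundary.
p_same_standard = np.mean([temporal_generalization_trial(1.0, rng) for _ in range(2000)])
p_same_short = np.mean([temporal_generalization_trial(0.5, rng) for _ in range(2000)])
```

Raising `noise` flattens the generalization gradient, and widening the boundaries shifts more responses toward "same", which is how the model separates timing precision from decision criteria.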
Probabilities are typically expressed in two forms: numerical (e.g., 50%) and verbal (e.g., likely). In this regard, understanding how verbal probabilities map to their numerical equivalents is crucial for examining the probabilistic language used in various texts. This study addresses this issue by introducing the Chinese Lexicon of Verbal Probability (CLVP), comprising 343 verbal probability phrases that are each paired with corresponding numerical probabilities, membership functions, and frequency data from three corpora. We analyze the distribution of subjective values of verbal probability phrases in Chinese based on the CLVP, compare them with their English counterparts, and create a benchmark of seven high-frequency verbal probability phrases for organizational use. Overall, this study provides a valuable tool for converting verbal probabilities into numerical equivalents, contributing to cross-linguistic and cross-cultural research.
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02853-6
Xiao-Yang Sui, Jia-Wen Niu, Xiaoqian Liu, Li-Lin Rao
"Bridging numerical and verbal probabilities: Construction and application of the Chinese Lexicon of Verbal Probability." Behavior Research Methods, 57(12), 335.
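How such a lexicon is used can be sketched with hypothetical English entries and made-up values (the real CLVP pairs 343 Chinese phrases with empirically elicited probabilities and membership functions):

```python
# Hypothetical phrase-to-probability entries, in the spirit of the CLVP
lexicon = {
    "almost certain": 0.95,
    "very likely": 0.85,
    "likely": 0.70,
    "toss-up": 0.50,
    "unlikely": 0.30,
    "very unlikely": 0.15,
    "almost impossible": 0.05,
}

def to_numeric(phrase):
    """Translate a verbal probability phrase to its numerical equivalent."""
    return lexicon[phrase.lower()]

def triangular_membership(p, lower, peak, upper):
    """A simple membership function: how well probability p fits a phrase
    whose elicited range is [lower, upper], with its most typical value at peak."""
    if p <= lower or p >= upper:
        return 0.0
    if p <= peak:
        return (p - lower) / (peak - lower)
    return (upper - p) / (upper - peak)
```

A membership function captures that a phrase like "likely" covers a range of probabilities rather than a single point, which is why the CLVP stores both a best numerical equivalent and a membership curve per phrase.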
Pub Date: 2025-11-04 | DOI: 10.3758/s13428-025-02848-3
Yeh-Tai Chou, Yao-Ting Sung, Wei-Hung Yang
The use of continuous rating scales such as the visual analogue scale (VAS) in research has increased, yet they remain less popular than discrete scales like the Likert scale. This limited popularity is primarily due to the lack of validated analytical tools and user-friendly interfaces, which has in turn left researchers without sufficient theoretical and empirical grounds for confidence in continuous rating formats. This research aims to address these gaps through four studies. The first study proposed an algorithm and developed the Continuous Rating Scale Analytics (CoRSA) tool to estimate parameters for the continuous rating scale model (Müller, Psychometrika, 52, 165-181, 1987). The second study evaluated CoRSA's efficacy in analyzing continuous scores compared to pcIRT (Hohensinn, Journal of Statistical Software, 84, 1-14, 2018) and discrete scores against ConQuest (Adams et al., 2020). Results showed superior parameter recovery with CoRSA for continuous data and comparable outcomes for discrete data. The third study analyzed empirical data from career interest and work value assessments using both VAS and Likert scales with CoRSA, demonstrating good model-data fit and validating CoRSA's effectiveness in rescaling data to interval measurements. Finally, the fourth study integrated CoRSA into the VAS-RRP 2.0 platform (Sung & Wu, Behavior Research Methods, 50, 1694-1715, 2018) to enhance accessibility and usability, allowing researchers and practitioners unfamiliar with statistical procedures to easily analyze continuous data. These findings confirm CoRSA as a valid tool for analyzing both continuous and discrete data, enhancing the utility of continuous rating formats in diverse research contexts.
"Continuous Rating Scale Analytics (CoRSA): A tool for analyzing continuous and discrete data with item response theory." Behavior Research Methods, 57(12), 333. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12586417/pdf/
Pub Date: 2025-11-03 | DOI: 10.3758/s13428-025-02850-9
Tyson S Barrett, Camille J Wynn, Lotte Eijk, Katerina A Tetzloff, Stephanie A Borrie
One major difficulty in conversational research is the time required to segment and transcribe conversational recordings. While recent advances have improved automatic speech recognition technologies, current tools are generally geared toward speech that occurs in monologues rather than in conversation. Accordingly, the purpose of this project was to develop and validate an automated, user-friendly tool for transcribing conversations. This tool, called Autoscribe, converts dyadic conversational audio recordings into Praat TextGrids with time-aligned turn boundaries between speech and non-speech segments, along with transcripts of all spoken dialogue. Here we describe the development of this tool as well as its validation on two conversational corpora. Results showed that Autoscribe decreased the amount of active working time needed for TextGrid creation by over 70%. Average transcription accuracy was 92%, and average utterance boundary placement accuracy was 95%. Thus, Autoscribe affords a practical research tool that drastically reduces the time and resource intensity needed for conversational segmentation and transcription.
"Autoscribe: An automated tool for creating transcribed TextGrids from audio-recorded conversations." Behavior Research Methods, 57(12), 332. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583283/pdf/
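The TextGrid output format that Autoscribe targets is plain text; a minimal serializer for one interval tier might look like this (a sketch of the long-format layout, not Autoscribe's code — real conversational TextGrids would carry one tier per speaker):

```python
def textgrid_from_intervals(intervals, tier_name="speech"):
    """Serialize (xmin, xmax, text) tuples into a Praat long-format TextGrid
    string; empty text marks a non-speech (silence) segment."""
    xmin = intervals[0][0]
    xmax = intervals[-1][1]
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        f"xmin = {xmin}",
        f"xmax = {xmax}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier_name}"',
        f"        xmin = {xmin}",
        f"        xmax = {xmax}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (lo, hi, text) in enumerate(intervals, start=1):
        lines += [
            f"        intervals [{i}]:",
            f"            xmin = {lo}",
            f"            xmax = {hi}",
            f'            text = "{text}"',
        ]
    return "\n".join(lines)

tg = textgrid_from_intervals([(0.0, 1.2, "so how was your day"), (1.2, 1.8, "")])
```

The resulting string can be written to a `.TextGrid` file and opened directly in Praat alongside the original audio.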
Pub Date: 2025-11-03 | DOI: 10.3758/s13428-025-02871-4
Angela Sorgente, Rossella Caliciuri, Matteo Robba, Margherita Lanz, Bruno D Zumbo
"Publisher Correction: A systematic review of latent class analysis in psychology: Examining the gap between guidelines and research practice." Behavior Research Methods, 57(12), 331. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583348/pdf/