Continuous Rating Scale Analytics (CoRSA): A tool for analyzing continuous and discrete data with item response theory
Pub Date: 2025-11-04 | DOI: 10.3758/s13428-025-02848-3 | Behavior Research Methods, 57(12), 333
Yeh-Tai Chou, Yao-Ting Sung, Wei-Hung Yang
The use of continuous rating scales such as the visual analogue scale (VAS) in research has increased, yet they remain less popular than discrete scales like the Likert scale. This lower popularity is primarily due to the lack of validated analytical tools and user-friendly interfaces, which together have also limited the theoretical and empirical research needed to build confidence in continuous rating formats. This research aims to address these gaps through four studies. The first study proposed an algorithm and developed the Continuous Rating Scale Analytics (CoRSA) to estimate parameters for the continuous rating scale model (Müller, Psychometrika, 52, 165-181, 1987). The second study evaluated CoRSA's efficacy in analyzing continuous scores compared to pcIRT (Hohensinn, Journal of Statistical Software, 84, 1-14, 2018) and discrete scores against ConQuest (Adams et al., 2020). Results showed superior parameter recovery with CoRSA for continuous data and comparable outcomes for discrete data. The third study analyzed empirical data from career interest and work value assessments using both VAS and Likert scales with CoRSA, demonstrating good model-data fit and validating CoRSA's effectiveness in rescaling data to interval measurements. Finally, the fourth study integrated CoRSA into the VAS-RRP 2.0 platform (Sung & Wu, Behavior Research Methods, 50, 1694-1715, 2018) to enhance accessibility and usability, allowing researchers and practitioners unfamiliar with statistical procedures to easily analyze continuous data. These findings confirm CoRSA as a valid tool for analyzing both continuous and discrete data, enhancing the utility of continuous rating formats in diverse research contexts.
{"title":"Continuous Rating Scale Analytics (CoRSA): A tool for analyzing continuous and discrete data with item response theory.","authors":"Yeh-Tai Chou, Yao-Ting Sung, Wei-Hung Yang","doi":"10.3758/s13428-025-02848-3","DOIUrl":"10.3758/s13428-025-02848-3","url":null,"abstract":"<p><p>The use of continuous rating scales such as the visual analogue scale (VAS) in research has increased, yet they are less popular than discrete scales like the Likert scale. The non-popularity of continuous scales is primarily due to the lack of validated analytical tools and user-friendly interfaces, which have also jointly resulted in a lack of sufficient theoretical and empirical research supporting confidence in using continuous rating formats. This research aims to address these gaps through four studies. The first study proposed an algorithm and developed the Continuous Rating Scale Analytics (CoRSA) to estimate parameters for the continuous rating scale model (Müller, Psychometrika, 52, 165-181, 1987). The second study evaluated CoRSA's efficacy in analyzing continuous scores compared to pcIRT (Hohensinn, Journal of Statistical Software, 84, 1-14, 2018) and discrete scores against ConQuest (Adams et al., 2020). Results showed superior parameter recovery with CoRSA for continuous data and comparable outcomes for discrete data. The third study analyzed empirical data from career interest and work value assessments using both VAS and Likert scales with CoRSA, demonstrating good model-data fit and validating CoRSA's effectiveness in rescaling data to interval measurements. Finally, the fourth study integrated CoRSA into the VAS-RRP 2.0 platform (Sung & Wu, Behavior Research Methods, 50, 1694-1715, 2018) to enhance accessibility and usability, allowing researchers and practitioners unfamiliar with statistical procedures to easily analyze continuous data. These findings confirm CoRSA as a valid tool for analyzing both continuous and discrete data, enhancing the utility of continuous rating formats in diverse research contexts.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"333"},"PeriodicalIF":3.9,"publicationDate":"2025-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12586417/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145443831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autoscribe: An automated tool for creating transcribed TextGrids from audio-recorded conversations
Pub Date: 2025-11-03 | DOI: 10.3758/s13428-025-02850-9 | Behavior Research Methods, 57(12), 332
Tyson S Barrett, Camille J Wynn, Lotte Eijk, Katerina A Tetzloff, Stephanie A Borrie
One major difficulty in conversational research is the time required to segment and transcribe conversational recordings. While recent advances have improved automatic speech recognition technologies, one limitation of current tools is that they are generally geared toward monologue speech rather than conversation. Accordingly, the purpose of this project was to develop and validate an automated, user-friendly tool for transcribing conversations. This tool, called Autoscribe, converts dyadic conversational audio recordings into Praat TextGrids with time-aligned turn boundaries between speech and non-speech segments and transcripts of all spoken dialogue. Here we describe the development of this tool as well as its validation on two conversational corpora. Results showed that Autoscribe decreased the amount of active working time needed for TextGrid creation by over 70%. Average transcription accuracy was 92%, and average utterance boundary placement accuracy was 95%. Thus, Autoscribe offers a practical research tool that drastically reduces the time and resources needed for conversational segmentation and transcription.
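For readers unfamiliar with Praat's format, the following sketch shows the general shape of an interval-tier TextGrid of the kind Autoscribe produces; the tier name, timings, and utterance are invented for illustration, and Autoscribe's actual tier layout may differ:

File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 10
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "speaker1"
        xmin = 0
        xmax = 10
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 1.25
            text = ""
        intervals [2]:
            xmin = 1.25
            xmax = 4.8
            text = "hi how have you been"
        intervals [3]:
            xmin = 4.8
            xmax = 10
            text = ""

Intervals with empty text mark non-speech segments; filled intervals are time-aligned, transcribed stretches of speech. A dyadic recording would carry a second IntervalTier for the other speaker.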
{"title":"Autoscribe: An automated tool for creating transcribed TextGrids from audio-recorded conversations.","authors":"Tyson S Barrett, Camille J Wynn, Lotte Eijk, Katerina A Tetzloff, Stephanie A Borrie","doi":"10.3758/s13428-025-02850-9","DOIUrl":"10.3758/s13428-025-02850-9","url":null,"abstract":"<p><p>One major difficulty in conversational research is the time required to segment and transcribe conversational recordings. While recent advances have improved automatic speech recognition technologies, one limitation of current tools is that they are generally catered toward speech that occurs in monologues rather than conversation. Accordingly, the purpose of this project was to develop and validate an automated user-friendly tool for transcribing conversations. This tool, called Autoscribe, converts dyadic conversational audio recordings into Praat TextGrids with time-aligned turn boundaries between speech and non-speech segments and transcripts of all spoken dialogue output. Here we describe the development of this tool as well as its validation on two conversational corpora. Results showed that Autoscribe decreased the amount of active working time needed for TextGrid creation by over 70%. Average transcription accuracy was 92% and average utterance boundary placement of 95%. Thus, Autoscribe affords a practical research tool that drastically reduces the time and resource intensitivity needed for conversational segmentation and transcription.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"332"},"PeriodicalIF":3.9,"publicationDate":"2025-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583283/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145437003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Publisher Correction: A systematic review of latent class analysis in psychology: Examining the gap between guidelines and research practice
Pub Date: 2025-11-03 | DOI: 10.3758/s13428-025-02871-4 | Behavior Research Methods, 57(12), 331
Angela Sorgente, Rossella Caliciuri, Matteo Robba, Margherita Lanz, Bruno D Zumbo
{"title":"Publisher Correction: A systematic review of latent class analysis in psychology: Examining the gap between guidelines and research practice.","authors":"Angela Sorgente, Rossella Caliciuri, Matteo Robba, Margherita Lanz, Bruno D Zumbo","doi":"10.3758/s13428-025-02871-4","DOIUrl":"10.3758/s13428-025-02871-4","url":null,"abstract":"","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"331"},"PeriodicalIF":3.9,"publicationDate":"2025-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583348/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145436976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joint modeling with generalized item response theory model family and response time model: Enhancing model structural flexibility and data-fitting adequacy
Pub Date: 2025-11-03 | DOI: 10.3758/s13428-025-02855-4 | Behavior Research Methods, 57(12), 330
Jing Lu, Xue Wang, Jiwei Zhang
In this study, we propose a joint hierarchical model that combines a family of item response theory (IRT) models with a log-normal response time (RT) model to analyze item responses and response times. By incorporating RTs as auxiliary information, we improve the accuracy of latent trait estimation, thereby facilitating a deeper understanding of examinee performance. Additionally, we explore the use of either identical or distinct link functions across items, allowing us to optimize the IRT model for each item and improve overall model fit. We further investigate scenarios in which the joint distribution of speed and ability is nonlinear by integrating the generalized logit-linked IRT model with the log-normal random quadratic variable speed model. Compared to the traditional hierarchical model of van der Linden (Psychometrika, 72, 287-308, 2007), this integration yields more accurate estimates of ability, item difficulty, and discrimination parameters. Furthermore, Bayesian model comparison reveals that the new joint hierarchical model provides a better fit than various models combining item responses and RTs, particularly when the data are derived from a joint RT and two-parameter IRT model with both symmetric and asymmetric link functions. Finally, a comprehensive analysis of data from the computer-based Program for International Student Assessment (PISA) science examination from 2015 is conducted to illustrate the proposed methodology.
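For orientation, the baseline hierarchical framework of van der Linden (2007) that this article extends can be written compactly in standard notation; the proposed generalization replaces the fixed logistic link with item-specific, possibly asymmetric, links and allows a nonlinear relation between speed and ability. The two level-1 components are a 2PL response model and a log-normal RT model:

P(U_{ij} = 1 \mid \theta_j) = \frac{\exp\{a_i(\theta_j - b_i)\}}{1 + \exp\{a_i(\theta_j - b_i)\}}

f(t_{ij}) = \frac{\alpha_i}{t_{ij}\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left[\alpha_i\left(\ln t_{ij} - (\beta_i - \tau_j)\right)\right]^2\right)

where a_i and b_i are item discrimination and difficulty, \beta_i is the item's time intensity, \alpha_i governs RT precision, and at level 2 the person parameters (\theta_j, \tau_j) follow a bivariate normal distribution.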
Cause for concern: Omitted cross-loadings in measurement models of nonlinear structural equation models
Pub Date: 2025-10-30 | DOI: 10.3758/s13428-025-02792-2 | Behavior Research Methods, 57(12), 328
Karina Navarro, Karin Schermelleh-Engel
Cross-loadings on non-target factors in measurement models of linear structural equation models (SEM) are often observed in empirical research but frequently disregarded. Previous research on linear SEM has already shown that omitted positive cross-loadings result in overestimated covariances of the latent predictor variables and distorted linear effects. For nonlinear SEM with interaction and quadratic effects, omitting cross-loadings has not been investigated. This study examines the consequences of omitted cross-loadings in both linear and nonlinear SEM using a single empirical dataset and a small simulation study. We focus on the bias patterns that emerge when cross-loadings (which reflect the multidimensionality of items) are either positive or negative and assess how these biases vary with the level of the latent predictor covariance. The empirical analysis reveals that constraining theoretically justified cross-loadings to zero results in systematic over- and underestimation of factor loadings and structural parameters, with more pronounced effects in the nonlinear component of the model, thereby altering the functional form of the relationships between the latent variables. The simulation study further illustrates that the direction and magnitude of bias in both linear and nonlinear SEM depend jointly on the sign of the cross-loadings and the level of the latent predictor covariance. These findings underscore the critical importance of incorporating cross-loadings only when they are theoretically justified, so as to maintain an accurate representation of the functional relationships between latent constructs. Practical implications and challenges of including cross-loadings in the model are discussed.
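In generic LISREL-style notation (a sketch, not the authors' exact parameterization), the setting can be written as a nonlinear structural model with interaction and quadratic terms,

\eta = \gamma_1 \xi_1 + \gamma_2 \xi_2 + \gamma_3 \xi_1 \xi_2 + \gamma_4 \xi_1^2 + \zeta

together with a measurement model in which some indicator assigned to \xi_2 also carries a cross-loading on \xi_1,

x_k = \lambda_{k2} \xi_2 + \lambda_{k1} \xi_1 + \delta_k

Constraining \lambda_{k1} to zero when it is nonzero in the population leaves that shared variance to be absorbed by the estimated predictor covariance and the structural coefficients \gamma, which is consistent with the bias patterns the abstract reports.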
{"title":"Cause for concern: Omitted cross-loadings in measurement models of nonlinear structural equation models.","authors":"Karina Navarro, Karin Schermelleh-Engel","doi":"10.3758/s13428-025-02792-2","DOIUrl":"10.3758/s13428-025-02792-2","url":null,"abstract":"<p><p>Cross-loadings on non-target factors in measurement models of linear structural equation models (SEM) are often observed in empirical research but frequently disregarded. Previous research on linear SEM has already shown that omitted positive cross-loadings result in overestimated covariances of the latent predictor variables and distorted linear effects. For nonlinear SEM with interaction and quadratic effects, omitting cross-loadings has not been investigated. This study examines the consequences of omitted cross-loadings in both linear and nonlinear SEM using a single empirical dataset and a small simulation study. We focus on the bias patterns that emerge when cross-loadings-reflecting the multidimensionality of items-are either positive or negative and assess how these biases vary with the level of the latent predictor covariance. The empirical analysis reveals that constraining theoretically justified cross-loadings to zero results in systematic over- and underestimation of factor loadings and structural parameters, with more pronounced effects in the nonlinear component of the model, thereby altering the functional form of the relationships between the latent variables. The simulation study further illustrates that the direction and magnitude of bias in both linear and nonlinear SEM depend jointly on the sign of the cross-loadings and the level of the latent predictor covariance. These findings underscore the critical importance of incorporating cross-loadings only theory-driven to maintain an accurate representation of the functional relationships between latent constructs. Practical implications and challenges of including cross-loadings in the model are discussed.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"328"},"PeriodicalIF":3.9,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145408003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Do eye trackers estimate eyeball rotation? The relationship between tracked eye image feature and estimated saccadic waveform
Pub Date: 2025-10-30 | DOI: 10.3758/s13428-025-02862-5 | Behavior Research Methods, 57(12), 329
Marcus Nyström, Diederick C Niehorster, Roy S Hessels, Richard Andersson, Marta K Skrok, Robert Konklewski, Patrycjusz Stremplewski, Maciej Nowakowski, Jakub Lipiński, Szymon Tamborski, Anna Szkulmowska, Maciej Szkulmowski, Ignace T C Hooge
The eyeball is not rigid and deforms during saccades. As a consequence, the saccade waveform recorded by an eye tracker may depend on which structure of the eye is used to estimate eyeball rotation. Here, we systematically describe and compare signals co-recorded from the retina, the cornea (corneal reflection, CR), the pupil, and the lens (fourth Purkinje reflection, P4) during saccades. We found that several commonly used parameters for saccade characterization differ systematically across the signals. For instance, saccades in the retinal signal had earlier onsets compared to saccades in the pupil and the P4 signals. The retinal signal had the smallest saccade amplitude and reached the peak saccade velocity earlier compared to the other signals. At the end of saccades, the retinal signal came to a stop faster than the other signals. We discuss possible explanations that may account for the relationship between the retinal signal and the other signals.
{"title":"Do eye trackers estimate eyeball rotation? The relationship between tracked eye image feature and estimated saccadic waveform.","authors":"Marcus Nyström, Diederick C Niehorster, Roy S Hessels, Richard Andersson, Marta K Skrok, Robert Konklewski, Patrycjusz Stremplewski, Maciej Nowakowski, Jakub Lipiński, Szymon Tamborski, Anna Szkulmowska, Maciej Szkulmowski, Ignace T C Hooge","doi":"10.3758/s13428-025-02862-5","DOIUrl":"10.3758/s13428-025-02862-5","url":null,"abstract":"<p><p>The eyeball is not rigid and deforms during saccades. As a consequence, the saccade waveform recorded by an eye tracker may depend on which structure of the eye is used to estimate eyeball rotation. Here, we systematically describe and compare signals co-recorded from the retina, the cornea (corneal reflection, CR), the pupil, and the lens (fourth Purkinje reflection, P4) during saccades. We found that several commonly used parameters for saccade characterization differ systematically across the signals. For instance, saccades in the retinal signal had earlier onsets compared to saccades in the pupil and the P4 signals. The retinal signal had the smallest saccade amplitude and reached the peak saccade velocity earlier compared to the other signals. At the end of saccades, the retinal signal came to a stop faster than the other signals. We discuss possible explanations that may account for the relationship between the retinal signal and the other signals.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"329"},"PeriodicalIF":3.9,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12575526/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145408001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MATCH: A toolbox to assess the primary color of real-world objects and generate color-matching stimuli
Pub Date: 2025-10-29 | DOI: 10.3758/s13428-025-02856-3 | Behavior Research Methods, 57(12), 327
Jessica N Goetz, Mark B Neider
Real-world stimuli can be difficult to manipulate and control in experimental psychology studies. Color information is frequently used as a variable, and researchers often rely on subjective color labels that imprecisely describe the color information within real-world objects. Here, we describe a new toolbox called MATCH (Matching And Transforming Closely Hued objects) that can easily and objectively quantify and manipulate color information within real-world objects to generate object pairs that match in color. MATCH was designed around theoretical frameworks and conceptual understanding from visual cognition research. Additionally, MATCH provides critical information on the distribution of color and the specific color values of any stimulus set. We also present two experimental studies to validate whether MATCH produces images that are consistent with human visual perception. In the first study, we provide evidence that the stimuli generated by MATCH are perceptually closer in color to a reference object than stimuli selected through human categorization of object-color pairs. In the second study, we investigated visual search for real-world objects with distractors generated by MATCH that matched the target object's color. We found patterns of data that are consistent with current theories of human search behavior. In summary, MATCH allows researchers to carefully control the color of real-world stimuli used in their studies.
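The abstract does not name the color metric MATCH relies on; purely as background on how "closeness in color" can be made objective, a standard measure is the CIE76 color difference between two stimuli expressed in CIELAB coordinates,

\Delta E^*_{ab} = \sqrt{(L_1^* - L_2^*)^2 + (a_1^* - a_2^*)^2 + (b_1^* - b_2^*)^2}

where a difference around 2.3 is commonly cited as a just-noticeable difference. Whether MATCH uses this formula, a refinement such as CIEDE2000, or another quantification is specified in the full article rather than the abstract.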
{"title":"MATCH: A toolbox to assess the primary color of real-world objects and generate color-matching stimuli.","authors":"Jessica N Goetz, Mark B Neider","doi":"10.3758/s13428-025-02856-3","DOIUrl":"10.3758/s13428-025-02856-3","url":null,"abstract":"<p><p>Real-world stimuli can be difficult to manipulate and control in experimental psychology studies. Color information is frequently used as a variable, and researchers often rely on subjective color labels that imprecisely describe the color information within real-world objects. Here, we describe a new toolbox called MATCH (Matching And Transforming Closely Hued objects) that can easily and objectively quantify and manipulate color information within real-world objects to generate object pairs that match in color. MATCH was designed incorporating theoretical frameworks and conceptual understanding from visual cognition research. Additionally, MATCH provides critical information on the distribution of color and the specific color values of any stimulus set. We also present two experimental studies to validate whether MATCH produces images that are consistent with human visual perception. In the first study, we provide evidence that the stimuli generated by MATCH are perceptually closer in color to a reference object compared to human categorization of object-color pairs. In the second study, we investigated the search for real-world objects with distractors generated by MATCH that matched the target object's color. We found patterns of data that are consistent with current theories of human search behavior. In summary, MATCH allows researchers to carefully control the color of real-world stimuli used in their studies.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"327"},"PeriodicalIF":3.9,"publicationDate":"2025-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145399573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Errors-in-variables regression as a viable approach to mediation analysis with random error-tainted measurements: Estimation, effectiveness, and an easy-to-use implementation
Pub Date: 2025-10-29 | DOI: 10.3758/s13428-025-02783-3 | Behavior Research Methods, 57(12), 323
Andrew F Hayes, Paul D Allison, Sean M Alexander
Mediation analysis, popular in many disciplines that rely on behavioral science data analysis techniques, is often conducted using ordinary least squares (OLS) regression methods. Given that one of OLS regression's weaknesses is its susceptibility to estimation bias resulting from unaccounted-for random measurement error in variables on the right-hand sides of its equations, many published mediation analyses certainly contain some, and perhaps substantial, bias in the direct, indirect, and total effects. In this manuscript, we offer errors-in-variables (EIV) regression as an easy-to-use alternative when a researcher has reasonable estimates of the reliability of the variables in the analysis. In three real-data examples, we show that EIV regression-based mediation analysis produces estimates that are equivalent to those obtained using an alternative, more analytically complex approach that accounts for measurement error (single-indicator latent variable structural equation modeling), yet quite different from the results generated by standard OLS regression that ignores random measurement error. In a small-scale simulation, we also establish that EIV regression successfully recovers the parameters of a mediation model involving variables adulterated by random measurement error, while OLS regression generates biased estimates. To facilitate the adoption of EIV regression, we describe an implementation in the PROCESS macro for SPSS, SAS, and R that we believe eliminates nearly any excuse one can conjure for not accounting for random measurement error when conducting a mediation analysis.
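To make the bias mechanism concrete, consider the single-predictor case (the article's mediation models apply the same logic to each equation, with PROCESS handling the details). When X is measured with random error, OLS attenuates the slope by the reliability of X,

\hat{\beta}_{OLS} \xrightarrow{p} \beta \, \rho_{XX'}

and EIV regression corrects this by subtracting the implied error variance from the observed variance of the predictor,

\hat{\beta}_{EIV} = \frac{\widehat{\mathrm{Cov}}(X,Y)}{\widehat{\mathrm{Var}}(X) - (1-\rho_{XX'})\,\widehat{\mathrm{Var}}(X)} = \frac{\hat{\beta}_{OLS}}{\rho_{XX'}}

which is consistent for \beta provided the supplied reliability estimate \rho_{XX'} is accurate.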
{"title":"Errors-in-variables regression as a viable approach to mediation analysis with random error-tainted measurements: Estimation, effectiveness, and an easy-to-use implementation.","authors":"Andrew F Hayes, Paul D Allison, Sean M Alexander","doi":"10.3758/s13428-025-02783-3","DOIUrl":"10.3758/s13428-025-02783-3","url":null,"abstract":"<p><p>Mediation analysis, popular in many disciplines that rely on behavioral science data analysis techniques, is often conducted using ordinary least squares (OLS) regression analysis methods. Given that one of OLS regression's weaknesses is its susceptibility to estimation bias resulting from unaccounted-for random measurement error in variables on the right-hand sides of the equation, many published mediation analyses certainly contain some and perhaps substantial bias in the direct, indirect, and total effects. In this manuscript, we offer errors-in-variables (EIV) regression as an easy-to-use alternative when a researcher has reasonable estimates of the reliability of the variables in the analysis. In three real-data examples, we show that EIV regression-based mediation analysis produces estimates that are equivalent to those obtained using an alternative, more analytically complex approach that accounts for measurement error-single-indicator latent variable structural equation modeling-yet quite different from the results generated by standard OLS regression that ignores random measurement error. In a small-scale simulation, we also establish that EIV regression successfully recovers the parameters of a mediation model involving variables adulterated by random measurement error while OLS regression generates biased estimates. To facilitate the adoption of EIV regression, we describe an implementation in the PROCESS macro for SPSS, SAS, and R that we believe eliminates most any excuse one can conjure for not accounting for random measurement error when conducting a mediation analysis.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"323"},"PeriodicalIF":3.9,"publicationDate":"2025-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145399606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
ClozCHI: A cloze test for measuring L2 Chinese proficiency from novice to advanced levels
Pub Date: 2025-10-29 | DOI: 10.3758/s13428-025-02834-9 | Behavior Research Methods, 57(12), 325
Jianyong Cai, Yuting Han, Xin Jiang
For decades, there has been a dearth of efficient tools for researchers to measure L2 Chinese proficiency. This study introduces ClozCHI, a cloze test developed to assess L2 Chinese proficiency across a range of levels from novice to advanced. Unlike existing Chinese cloze tests, ClozCHI comprises three passages with varying levels of difficulty. Its effectiveness was assessed with 225 L2 Chinese learners who took the Hanyu Shuiping Kaoshi (HSK) at Levels 3 to 6 and completed ClozCHI within 2 weeks before or after their HSK tests. Additionally, supplementary data were collected from 97 learners below HSK Level 3 who had not taken the HSK. Psychometric analysis of ClozCHI using both classical test theory (CTT) and item response theory (IRT) revealed that the test demonstrated appropriate difficulty, good discrimination, and high reliability from novice to advanced levels. ClozCHI scores showed strong correlations with HSK levels, demonstrating criterion-related validity. Confirmatory factor analysis (CFA) further supported its unidimensional structure. ClozCHI was more effective for assessing reading than listening or writing. These findings suggest that ClozCHI is a reliable and valid instrument for assessing L2 Chinese proficiency in research settings. ClozCHI is freely available to researchers from the Open Science Framework repository: https://osf.io/5kcrq/.
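The abstract reports difficulty and discrimination parameters without naming the IRT model fitted; for dichotomously scored cloze gaps, one model consistent with those parameters is the two-parameter logistic,

P_i(\theta) = \frac{\exp\{a_i(\theta - b_i)\}}{1 + \exp\{a_i(\theta - b_i)\}}

with discrimination a_i and difficulty b_i, though the specific model the authors used is detailed in the full article.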
{"title":"ClozCHI: A cloze test for measuring L2 Chinese proficiency from novice to advanced levels.","authors":"Jianyong Cai, Yuting Han, Xin Jiang","doi":"10.3758/s13428-025-02834-9","DOIUrl":"10.3758/s13428-025-02834-9","url":null,"abstract":"<p><p>For decades, there has been a dearth of efficient tools for researchers to measure L2 Chinese proficiency. This study introduces ClozCHI, a cloze test developed to assess L2 Chinese proficiency across a range of levels from novice to advanced. Unlike existing Chinese cloze tests, ClozCHI comprises three passages with varying levels of difficulty. Its effectiveness was assessed with 225 L2 Chinese learners who participated in the Hanyu Shuiping Kaoshi (HSK) at Levels 3 to 6 and completed ClozCHI within 2 weeks before or after their HSK tests. Additionally, supplementary data were collected from 97 learners below HSK Level 3 without HSK testing. The psychometric analysis of the ClozCHI using both classical test theory (CTT) and item response theory (IRT) revealed that the test demonstrated appropriate difficulty, good discrimination, and high reliability from novice to advanced levels. ClozCHI scores showed strong correlations with HSK levels, demonstrating criterion-related validity. Confirmatory factor analysis (CFA) further supported its unidimensional structure. ClozCHI was more effective for assessing reading than listening or writing. These findings suggested that ClozCHI is a reliable and valid instrument for assessing L2 Chinese proficiency in research settings. ClozCHI is freely available for researchers from the Open Science Framework repository: https://osf.io/5kcrq/ .</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"325"},"PeriodicalIF":3.9,"publicationDate":"2025-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145399576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Contrast analysis for competing hypotheses: A tutorial using the R package cofad
Pub Date: 2025-10-29 | DOI: 10.3758/s13428-025-02833-w | Behavior Research Methods, 57(12), 326
Mirka Henninger, Simone Malejka, Johannes Titz
Researchers in psychology traditionally use analysis of variance to examine differences between multiple groups or conditions. A less well-known but valuable alternative is contrast analysis, a simple statistical method for testing directional, theoretically motivated hypotheses that are defined prior to data collection. In this article, we review the core concepts of contrast analysis for testing hypotheses in between-subjects and within-subjects designs. We also outline and demonstrate the largely unknown possibility of directly testing two competing contrasts against each other. In the tutorial part of the article, we show how such competing-contrast analyses can be conducted in the free, open-source software R using the package cofad. Because competing-contrast analysis is a straightforward, flexible, highly powered, and hypothesis-driven approach, it is a valuable tool for extending the understanding of cognitive and behavioral processes in psychological research.
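A minimal sketch of what such a contrast analysis looks like in R, assuming the calc_contrast() interface from the cofad package; the data, factor levels, and lambda weights below are invented for illustration:

# install.packages("cofad")   # one-time installation from CRAN
library(cofad)

# Hypothetical between-subjects data: recall scores in three conditions
d <- data.frame(
  recall = c(4, 5, 6, 7, 5, 6, 8, 9, 7, 9, 10, 11),
  group  = factor(rep(c("control", "imagery", "story"), each = 4))
)

# Directional hypothesis "story > imagery > control", encoded as
# contrast weights (lambdas) that sum to zero
res <- calc_contrast(
  dv             = recall,
  between        = group,
  lambda_between = c(control = -1, imagery = 0, story = 1),
  data           = d
)
res   # prints the contrast t test and an effect size

The competing-contrast comparison the article introduces pits two such lambda sets against each other; the exact syntax for that test is documented in the cofad package.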
{"title":"Contrast analysis for competing hypotheses: A tutorial using the R package cofad.","authors":"Mirka Henninger, Simone Malejka, Johannes Titz","doi":"10.3758/s13428-025-02833-w","DOIUrl":"10.3758/s13428-025-02833-w","url":null,"abstract":"<p><p>Researchers in psychology traditionally use analysis of variance to examine differences between multiple groups or conditions. A less well-known, but valuable alternative is contrast analysis - a simple statistical method for testing directional, theoretically motivated hypotheses that are defined prior to data collection. In this article, we review the core concepts of contrast analysis for testing hypotheses in between-subjects and within-subjects designs. We also outline and demonstrate the largely unknown possibility of directly testing two competing contrasts against each other. In the tutorial part of the article, we show how such competing-contrast analyses can be conducted in the free, open-source software R using the package cofad. Because competing-contrast analysis is a straightforward, flexible, highly powered, and hypothesis-driven approach, it is a valuable tool to extend the understanding of cognitive and behavioral processes in psychological research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"326"},"PeriodicalIF":3.9,"publicationDate":"2025-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12572084/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145399527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}