A tutorial on fine-tuning pretrained language models: Applications in social and behavioral science research.
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02868-z
Yu Wang, Wen Qu
Natural language is a primary medium for expressing thoughts and emotions, making text analysis a vital tool in psychological research. It enables insights into personality traits, mental health, and sentiment in interpersonal communication. Traditional approaches, such as human coding, dictionary-based methods, or training models from scratch, often suffer from limitations, including inefficiency, incomplete coverage, or high data requirements. This tutorial introduces the pretrain-finetune paradigm, a transformative approach in text analysis and natural language processing (NLP) that leverages large pretrained language models. Unlike conventional methods, this paradigm enables efficient fine-tuning even with limited labeled data, making it particularly valuable for social science research, where annotated samples are scarce. Our tutorial offers a comprehensive introduction to the pretrain-finetune framework: it begins with the core concepts of pretraining and fine-tuning, continues with hands-on exercises on real-world applications and an introduction to finetuneR, an R package developed to make these methods accessible to R users, and concludes with a discussion of common misconceptions in existing resources and of best practices. We demonstrate the paradigm's effectiveness across diverse tasks, including multi-class classification and regression, showing its advantages over traditional methods, feature-extraction-based approaches, and GPT-based strategies. By emphasizing its efficiency, accessibility, and superior performance, this tutorial aims to encourage broader adoption of the pretrain-finetune paradigm in psychological and behavioral research.
{"title":"A tutorial on fine-tuning pretrained language models: Applications in social and behavioral science research.","authors":"Yu Wang, Wen Qu","doi":"10.3758/s13428-025-02868-z","DOIUrl":"10.3758/s13428-025-02868-z","url":null,"abstract":"<p><p>Natural language is a primary medium for expressing thoughts and emotions, making text analysis a vital tool in psychological research. It enables insights into personality traits, mental health, and sentiment in interpersonal communication. Traditional approaches - such as human coding, dictionary-based methods, or training models from scratch - often suffer from limitations, including inefficiency, incomplete coverage, or high data requirements. This tutorial introduces the pretrain-finetune paradigm, a transformative approach in text analysis and natural language processing (NLP) that leverages large pretrained language models. Unlike conventional methods, this paradigm enables efficient fine-tuning even with limited labeled data, making it particularly valuable for social science research where annotated samples are scarce. Our tutorial offers a comprehensive introduction to the pretrain-finetune framework, beginning with core concepts of pretraining and fine-tuning, followed by hands-on exercises with real-world applications, the introduction of finetuneR, an R package developed to make these methods accessible to R users, and concluding with a discussion of common misconceptions in existing resources and best practices. We demonstrate its effectiveness across diverse tasks, including multi-class classification and regression, showing its advantages over traditional methods, feature extraction-based approaches, and GPT-based strategies. By emphasizing its efficiency, accessibility, and superior performance, this tutorial aims to encourage broader adoption of the pretrain-finetune paradigm in psychological and behavioral research.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"336"},"PeriodicalIF":3.9,"publicationDate":"2025-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145450662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A drift-diffusion model of temporal generalization outperforms existing models and captures modality differences and learning effects.
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02819-8
Nir Ofir, Ayelet N Landau
Multiple systems in the brain track the passage of time and can adapt their activity to temporal requirements. While the neural implementation of timing varies widely between neural substrates and behavioral tasks, at the algorithmic level many of these behaviors can be described using drift-diffusion models of decision-making. In this work, we develop a drift-diffusion model to fit performance in the temporal generalization task, in which participants are required to categorize an interval as being the same or different compared to a standard, or reference, duration. The model includes a drift-diffusion process that starts at interval onset, representing the internal estimate of elapsed duration, and two boundaries. If the drift-diffusion process at interval offset is between the boundaries, the interval is categorized as equal to the standard. If it is below the lower boundary or above the upper boundary, the interval is categorized as different. This model outperformed previous models in fitting the data of single participants and in parameter recovery analyses. We also used the drift-diffusion model to analyze data from two experiments, one comparing performance between vision and audition and another examining the effect of learning. We found that decision boundaries can be modified independently: while the upper boundary was higher in vision than in audition, the lower boundary decreased with learning in the task. In both experiments, timing noise was positively correlated with upper boundaries across participants, which reflects an accuracy-maximizing strategy in the task.
{"title":"A drift-diffusion model of temporal generalization outperforms existing models and captures modality differences and learning effects.","authors":"Nir Ofir, Ayelet N Landau","doi":"10.3758/s13428-025-02819-8","DOIUrl":"10.3758/s13428-025-02819-8","url":null,"abstract":"<p><p>Multiple systems in the brain track the passage of time and can adapt their activity to temporal requirements. While the neural implementation of timing varies widely between neural substrates and behavioral tasks, at the algorithmic level, many of these behaviors can be described using drift-diffusion models of decision-making. In this work, wedevelop a drift-diffusion model to fit performance in the temporal generalization task, in which participants are required to categorize an interval as being the same or different compared to a standard, or reference, duration. The model includes a drift-diffusion process which starts with interval onset, representing the internal estimate of elapsed duration, and two boundaries. If the drift-diffusion process at interval offset is between the boundaries, the interval is categorized as equal to the standard. If it is below the lower boundary or above the upper boundary, the interval is categorized as different. This model outperformed previous models in fitting the data of single participants and in parameter recovery analyses. We also used the drift-diffusion model to analyze data from two experiments, one comparing performance between vision and audition and another examining the effect of learning. We found that decision boundaries can be modified independently: While the upper boundary was higher in vision than in audition, the lower boundary decreased with learning in the task. In both experiments, timing noise was positively correlated with upper boundaries across participants, which reflects an accuracy-maximizing strategy in the task.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"334"},"PeriodicalIF":3.9,"publicationDate":"2025-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12589256/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145450735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bridging numerical and verbal probabilities: Construction and application of the Chinese Lexicon of Verbal Probability.
Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02853-6
Xiao-Yang Sui, Jia-Wen Niu, Xiaoqian Liu, Li-Lin Rao
Probabilities are typically expressed in two forms: numerical (e.g., 50%) and verbal (e.g., likely). Understanding how verbal probabilities map to their numerical equivalents is therefore crucial for examining the probabilistic language used in various texts. This study addresses this issue by introducing the Chinese Lexicon of Verbal Probability (CLVP), comprising 343 verbal probability phrases, each paired with corresponding numerical probabilities, membership functions, and frequency data from three corpora. We analyze the distribution of subjective values of verbal probability phrases in Chinese based on the CLVP, compare them with their English counterparts, and create a benchmark of seven high-frequency verbal probability phrases for organizational use. Overall, this study provides a valuable tool for converting verbal probabilities into numerical equivalents, contributing to cross-linguistic and cross-cultural research.
Behavior Research Methods, 57(12), 335.
Continuous Rating Scale Analytics (CoRSA): A tool for analyzing continuous and discrete data with item response theory.
Pub Date: 2025-11-04 | DOI: 10.3758/s13428-025-02848-3
Yeh-Tai Chou, Yao-Ting Sung, Wei-Hung Yang
The use of continuous rating scales such as the visual analogue scale (VAS) in research has increased, yet they remain less popular than discrete scales like the Likert scale. This limited popularity is primarily due to the lack of validated analytical tools and user-friendly interfaces, which has in turn left insufficient theoretical and empirical support for confidence in using continuous rating formats. This research aims to address these gaps through four studies. The first study proposed an algorithm and developed the Continuous Rating Scale Analytics (CoRSA) tool to estimate parameters for the continuous rating scale model (Müller, Psychometrika, 52, 165-181, 1987). The second study evaluated CoRSA's efficacy in analyzing continuous scores compared to pcIRT (Hohensinn, Journal of Statistical Software, 84, 1-14, 2018) and discrete scores against ConQuest (Adams et al., 2020). Results showed superior parameter recovery with CoRSA for continuous data and comparable outcomes for discrete data. The third study analyzed empirical data from career interest and work value assessments using both VAS and Likert scales with CoRSA, demonstrating good model-data fit and validating CoRSA's effectiveness in rescaling data to interval measurements. Finally, the fourth study integrated CoRSA into the VAS-RRP 2.0 platform (Sung & Wu, Behavior Research Methods, 50, 1694-1715, 2018) to enhance accessibility and usability, allowing researchers and practitioners unfamiliar with statistical procedures to easily analyze continuous data. These findings confirm CoRSA as a valid tool for analyzing both continuous and discrete data, enhancing the utility of continuous rating formats in diverse research contexts.
{"title":"Continuous Rating Scale Analytics (CoRSA): A tool for analyzing continuous and discrete data with item response theory.","authors":"Yeh-Tai Chou, Yao-Ting Sung, Wei-Hung Yang","doi":"10.3758/s13428-025-02848-3","DOIUrl":"10.3758/s13428-025-02848-3","url":null,"abstract":"<p><p>The use of continuous rating scales such as the visual analogue scale (VAS) in research has increased, yet they are less popular than discrete scales like the Likert scale. The non-popularity of continuous scales is primarily due to the lack of validated analytical tools and user-friendly interfaces, which have also jointly resulted in a lack of sufficient theoretical and empirical research supporting confidence in using continuous rating formats. This research aims to address these gaps through four studies. The first study proposed an algorithm and developed the Continuous Rating Scale Analytics (CoRSA) to estimate parameters for the continuous rating scale model (Müller, Psychometrika, 52, 165-181, 1987). The second study evaluated CoRSA's efficacy in analyzing continuous scores compared to pcIRT (Hohensinn, Journal of Statistical Software, 84, 1-14, 2018) and discrete scores against ConQuest (Adams et al., 2020). Results showed superior parameter recovery with CoRSA for continuous data and comparable outcomes for discrete data. The third study analyzed empirical data from career interest and work value assessments using both VAS and Likert scales with CoRSA, demonstrating good model-data fit and validating CoRSA's effectiveness in rescaling data to interval measurements. Finally, the fourth study integrated CoRSA into the VAS-RRP 2.0 platform (Sung & Wu, Behavior Research Methods, 50, 1694-1715, 2018) to enhance accessibility and usability, allowing researchers and practitioners unfamiliar with statistical procedures to easily analyze continuous data. These findings confirm CoRSA as a valid tool for analyzing both continuous and discrete data, enhancing the utility of continuous rating formats in diverse research contexts.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"333"},"PeriodicalIF":3.9,"publicationDate":"2025-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12586417/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145443831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Autoscribe: An automated tool for creating transcribed TextGrids from audio-recorded conversations.
Pub Date: 2025-11-03 | DOI: 10.3758/s13428-025-02850-9
Tyson S Barrett, Camille J Wynn, Lotte Eijk, Katerina A Tetzloff, Stephanie A Borrie
One major difficulty in conversational research is the time required to segment and transcribe conversational recordings. While recent advances have improved automatic speech recognition technologies, one limitation of current tools is that they are generally geared toward monologue speech rather than conversation. Accordingly, the purpose of this project was to develop and validate an automated, user-friendly tool for transcribing conversations. This tool, called Autoscribe, converts dyadic conversational audio recordings into Praat TextGrids with time-aligned turn boundaries between speech and non-speech segments, along with transcripts of all spoken dialogue. Here we describe the development of this tool as well as its validation on two conversational corpora. Results showed that Autoscribe decreased the amount of active working time needed for TextGrid creation by over 70%, with an average transcription accuracy of 92% and an average utterance boundary placement accuracy of 95%. Thus, Autoscribe offers a practical research tool that drastically reduces the time and resources needed for conversational segmentation and transcription.
{"title":"Autoscribe: An automated tool for creating transcribed TextGrids from audio-recorded conversations.","authors":"Tyson S Barrett, Camille J Wynn, Lotte Eijk, Katerina A Tetzloff, Stephanie A Borrie","doi":"10.3758/s13428-025-02850-9","DOIUrl":"10.3758/s13428-025-02850-9","url":null,"abstract":"<p><p>One major difficulty in conversational research is the time required to segment and transcribe conversational recordings. While recent advances have improved automatic speech recognition technologies, one limitation of current tools is that they are generally catered toward speech that occurs in monologues rather than conversation. Accordingly, the purpose of this project was to develop and validate an automated user-friendly tool for transcribing conversations. This tool, called Autoscribe, converts dyadic conversational audio recordings into Praat TextGrids with time-aligned turn boundaries between speech and non-speech segments and transcripts of all spoken dialogue output. Here we describe the development of this tool as well as its validation on two conversational corpora. Results showed that Autoscribe decreased the amount of active working time needed for TextGrid creation by over 70%. Average transcription accuracy was 92% and average utterance boundary placement of 95%. Thus, Autoscribe affords a practical research tool that drastically reduces the time and resource intensitivity needed for conversational segmentation and transcription.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"332"},"PeriodicalIF":3.9,"publicationDate":"2025-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583283/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145437003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Publisher Correction: A systematic review of latent class analysis in psychology: Examining the gap between guidelines and research practice.
Pub Date: 2025-11-03 | DOI: 10.3758/s13428-025-02871-4
Angela Sorgente, Rossella Caliciuri, Matteo Robba, Margherita Lanz, Bruno D Zumbo
{"title":"Publisher Correction: A systematic review of latent class analysis in psychology: Examining the gap between guidelines and research practice.","authors":"Angela Sorgente, Rossella Caliciuri, Matteo Robba, Margherita Lanz, Bruno D Zumbo","doi":"10.3758/s13428-025-02871-4","DOIUrl":"10.3758/s13428-025-02871-4","url":null,"abstract":"","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"331"},"PeriodicalIF":3.9,"publicationDate":"2025-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583348/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145436976","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Joint modeling with generalized item response theory model family and response time model: Enhancing model structural flexibility and data-fitting adequacy.
Pub Date: 2025-11-03 | DOI: 10.3758/s13428-025-02855-4
Jing Lu, Xue Wang, Jiwei Zhang
In this study, we propose a joint hierarchical model that combines a family of item response theory (IRT) models with a log-normal response time (RT) model to analyze item responses and response times. By incorporating RTs as auxiliary information, we improve the accuracy of latent trait estimation, thereby facilitating a deeper understanding of examinee performance. Additionally, we explore the use of either identical or distinct link functions across different items, allowing us to optimize the IRT model for each item and improve overall model fit. We further investigate scenarios in which the joint distribution of speed and ability is nonlinear by integrating the generalized logit-linked IRT model with the log-normal random quadratic variable speed model. Compared to the traditional hierarchical model of van der Linden (Psychometrika, 72, 287-308, 2007), this integration yields more accurate estimates of ability, item difficulty, and discrimination parameters. Furthermore, Bayesian model comparison reveals that the new joint hierarchical model provides a better fit than various models combining item responses and RTs, particularly when the data are derived from a joint RT and two-parameter IRT model with both symmetric and asymmetric link functions. Finally, a comprehensive analysis of data from the computer-based Program for International Student Assessment (PISA) 2015 science examination is conducted to illustrate the proposed methodology.
{"title":"Joint modeling with generalized item response theory model family and response time model: Enhancing model structural flexibility and data-fitting adequacy.","authors":"Jing Lu, Xue Wang, Jiwei Zhang","doi":"10.3758/s13428-025-02855-4","DOIUrl":"10.3758/s13428-025-02855-4","url":null,"abstract":"<p><p>In this study, we propose a joint hierarchical model that combines a family of item response theory (IRT) models with a log-normal response time (RT) model to analyze item responses and response times. By incorporating RTs as auxiliary information, we improve the accuracy of latent trait estimation, thereby facilitating a deeper understanding of examinee performance. Additionally, we explore the use of either identical or distinct link functions across different items, allowing us to optimize IRT models for each item and improve overall model fit. We further investigate scenarios in which the joint distribution of speed and ability is nonlinear by integrating the generalized logit-linked IRT model with the log-normal random quadratic variable speed model. Compared to the traditional hierarchical model by van der Linden (Psychometrika, 72, 287-308 2007), this integration yields more accurate estimates of ability, item difficulty, and discrimination parameters. Additionally, Bayesian model comparison reveals that the new joint hierarchical model provides a better fit than various models combining item responses and RTs, particularly when the data are derived from a joint RT and two-parameter IRT model with both symmetric and asymmetric link functions. Finally, a comprehensive analysis of data from the computer-based Program for International Student Assessment (PISA) science examination from 2015 is conducted to illustrate the proposed methodology.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"330"},"PeriodicalIF":3.9,"publicationDate":"2025-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145437005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cause for concern: Omitted cross-loadings in measurement models of nonlinear structural equation models.
Pub Date: 2025-10-30 | DOI: 10.3758/s13428-025-02792-2
Karina Navarro, Karin Schermelleh-Engel
Cross-loadings on non-target factors in measurement models of linear structural equation models (SEM) are often observed in empirical research but frequently disregarded. Previous research on linear SEM has already shown that omitted positive cross-loadings result in overestimated covariances of the latent predictor variables and distorted linear effects. For nonlinear SEM with interaction and quadratic effects, omitting cross-loadings has not been investigated. This study examines the consequences of omitted cross-loadings in both linear and nonlinear SEM using a single empirical dataset and a small simulation study. We focus on the bias patterns that emerge when cross-loadings (reflecting the multidimensionality of items) are either positive or negative, and assess how these biases vary with the level of the latent predictor covariance. The empirical analysis reveals that constraining theoretically justified cross-loadings to zero results in systematic over- and underestimation of factor loadings and structural parameters, with more pronounced effects in the nonlinear component of the model, thereby altering the functional form of the relationships between the latent variables. The simulation study further illustrates that the direction and magnitude of bias in both linear and nonlinear SEM depend jointly on the sign of the cross-loadings and the level of the latent predictor covariance. These findings underscore the critical importance of incorporating cross-loadings only when they are theory-driven, to maintain an accurate representation of the functional relationships between latent constructs. Practical implications and challenges of including cross-loadings in the model are discussed.
{"title":"Cause for concern: Omitted cross-loadings in measurement models of nonlinear structural equation models.","authors":"Karina Navarro, Karin Schermelleh-Engel","doi":"10.3758/s13428-025-02792-2","DOIUrl":"10.3758/s13428-025-02792-2","url":null,"abstract":"<p><p>Cross-loadings on non-target factors in measurement models of linear structural equation models (SEM) are often observed in empirical research but frequently disregarded. Previous research on linear SEM has already shown that omitted positive cross-loadings result in overestimated covariances of the latent predictor variables and distorted linear effects. For nonlinear SEM with interaction and quadratic effects, omitting cross-loadings has not been investigated. This study examines the consequences of omitted cross-loadings in both linear and nonlinear SEM using a single empirical dataset and a small simulation study. We focus on the bias patterns that emerge when cross-loadings-reflecting the multidimensionality of items-are either positive or negative and assess how these biases vary with the level of the latent predictor covariance. The empirical analysis reveals that constraining theoretically justified cross-loadings to zero results in systematic over- and underestimation of factor loadings and structural parameters, with more pronounced effects in the nonlinear component of the model, thereby altering the functional form of the relationships between the latent variables. The simulation study further illustrates that the direction and magnitude of bias in both linear and nonlinear SEM depend jointly on the sign of the cross-loadings and the level of the latent predictor covariance. These findings underscore the critical importance of incorporating cross-loadings only theory-driven to maintain an accurate representation of the functional relationships between latent constructs. Practical implications and challenges of including cross-loadings in the model are discussed.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"328"},"PeriodicalIF":3.9,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145408003","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Do eye trackers estimate eyeball rotation? The relationship between tracked eye image feature and estimated saccadic waveform.
Pub Date: 2025-10-30 | DOI: 10.3758/s13428-025-02862-5
Marcus Nyström, Diederick C Niehorster, Roy S Hessels, Richard Andersson, Marta K Skrok, Robert Konklewski, Patrycjusz Stremplewski, Maciej Nowakowski, Jakub Lipiński, Szymon Tamborski, Anna Szkulmowska, Maciej Szkulmowski, Ignace T C Hooge
The eyeball is not rigid and deforms during saccades. As a consequence, the saccade waveform recorded by an eye tracker may depend on which structure of the eye is used to estimate eyeball rotation. Here, we systematically describe and compare signals co-recorded from the retina, the cornea (corneal reflection, CR), the pupil, and the lens (fourth Purkinje reflection, P4) during saccades. We found that several commonly used parameters for saccade characterization differ systematically across the signals. For instance, saccades in the retinal signal had earlier onsets compared to saccades in the pupil and the P4 signals. The retinal signal had the smallest saccade amplitude and reached the peak saccade velocity earlier compared to the other signals. At the end of saccades, the retinal signal came to a stop faster than the other signals. We discuss possible explanations that may account for the relationship between the retinal signal and the other signals.
{"title":"Do eye trackers estimate eyeball rotation? The relationship between tracked eye image feature and estimated saccadic waveform.","authors":"Marcus Nyström, Diederick C Niehorster, Roy S Hessels, Richard Andersson, Marta K Skrok, Robert Konklewski, Patrycjusz Stremplewski, Maciej Nowakowski, Jakub Lipiński, Szymon Tamborski, Anna Szkulmowska, Maciej Szkulmowski, Ignace T C Hooge","doi":"10.3758/s13428-025-02862-5","DOIUrl":"10.3758/s13428-025-02862-5","url":null,"abstract":"<p><p>The eyeball is not rigid and deforms during saccades. As a consequence, the saccade waveform recorded by an eye tracker may depend on which structure of the eye is used to estimate eyeball rotation. Here, we systematically describe and compare signals co-recorded from the retina, the cornea (corneal reflection, CR), the pupil, and the lens (fourth Purkinje reflection, P4) during saccades. We found that several commonly used parameters for saccade characterization differ systematically across the signals. For instance, saccades in the retinal signal had earlier onsets compared to saccades in the pupil and the P4 signals. The retinal signal had the smallest saccade amplitude and reached the peak saccade velocity earlier compared to the other signals. At the end of saccades, the retinal signal came to a stop faster than the other signals. We discuss possible explanations that may account for the relationship between the retinal signal and the other signals.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"329"},"PeriodicalIF":3.9,"publicationDate":"2025-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12575526/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145408001","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MATCH: A toolbox to assess the primary color of real-world objects and generate color-matching stimuli.
Pub Date: 2025-10-29 | DOI: 10.3758/s13428-025-02856-3
Jessica N Goetz, Mark B Neider
Real-world stimuli can be difficult to manipulate and control in experimental psychology studies. Color information is frequently used as a variable, and researchers often rely on subjective color labels that imprecisely describe the color information within real-world objects. Here, we describe a new toolbox called MATCH (Matching And Transforming Closely Hued objects) that can easily and objectively quantify and manipulate color information within real-world objects to generate object pairs that match in color. MATCH was designed to incorporate theoretical frameworks and conceptual understanding from visual cognition research. Additionally, MATCH provides critical information on the distribution of color and the specific color values of any stimulus set. We also present two experimental studies validating that MATCH produces images consistent with human visual perception. In the first study, we provide evidence that stimuli generated by MATCH are perceptually closer in color to a reference object than pairs based on human categorization of object colors. In the second study, we investigated search for real-world objects among MATCH-generated distractors that matched the target object's color, and found patterns of data consistent with current theories of human search behavior. In summary, MATCH allows researchers to carefully control the color of real-world stimuli used in their studies.
{"title":"MATCH: A toolbox to assess the primary color of real-world objects and generate color-matching stimuli.","authors":"Jessica N Goetz, Mark B Neider","doi":"10.3758/s13428-025-02856-3","DOIUrl":"10.3758/s13428-025-02856-3","url":null,"abstract":"<p><p>Real-world stimuli can be difficult to manipulate and control in experimental psychology studies. Color information is frequently used as a variable, and researchers often rely on subjective color labels that imprecisely describe the color information within real-world objects. Here, we describe a new toolbox called MATCH (Matching And Transforming Closely Hued objects) that can easily and objectively quantify and manipulate color information within real-world objects to generate object pairs that match in color. MATCH was designed incorporating theoretical frameworks and conceptual understanding from visual cognition research. Additionally, MATCH provides critical information on the distribution of color and the specific color values of any stimulus set. We also present two experimental studies to validate whether MATCH produces images that are consistent with human visual perception. In the first study, we provide evidence that the stimuli generated by MATCH are perceptually closer in color to a reference object compared to human categorization of object-color pairs. In the second study, we investigated the search for real-world objects with distractors generated by MATCH that matched the target object's color. We found patterns of data that are consistent with current theories of human search behavior. In summary, MATCH allows researchers to carefully control the color of real-world stimuli used in their studies.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 12","pages":"327"},"PeriodicalIF":3.9,"publicationDate":"2025-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145399573","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}