Behavior Research Methods: Latest Articles

A tutorial on fine-tuning pretrained language models: Applications in social and behavioral science research.
IF 3.9; Zone 2 (Psychology); Q1 PSYCHOLOGY, EXPERIMENTAL; Pub Date: 2025-11-05; DOI: 10.3758/s13428-025-02868-z
Yu Wang, Wen Qu

Natural language is a primary medium for expressing thoughts and emotions, making text analysis a vital tool in psychological research. It enables insights into personality traits, mental health, and sentiment in interpersonal communication. Traditional approaches - such as human coding, dictionary-based methods, or training models from scratch - often suffer from limitations, including inefficiency, incomplete coverage, or high data requirements. This tutorial introduces the pretrain-finetune paradigm, a transformative approach in text analysis and natural language processing (NLP) that leverages large pretrained language models. Unlike conventional methods, this paradigm enables efficient fine-tuning even with limited labeled data, making it particularly valuable for social science research where annotated samples are scarce. Our tutorial offers a comprehensive introduction to the pretrain-finetune framework, beginning with core concepts of pretraining and fine-tuning, followed by hands-on exercises with real-world applications, the introduction of finetuneR, an R package developed to make these methods accessible to R users, and concluding with a discussion of common misconceptions in existing resources and best practices. We demonstrate its effectiveness across diverse tasks, including multi-class classification and regression, showing its advantages over traditional methods, feature extraction-based approaches, and GPT-based strategies. By emphasizing its efficiency, accessibility, and superior performance, this tutorial aims to encourage broader adoption of the pretrain-finetune paradigm in psychological and behavioral research.

Behavior Research Methods, 57(12), 336.
Citations: 0
A drift-diffusion model of temporal generalization outperforms existing models and captures modality differences and learning effects.
IF 3.9; Zone 2 (Psychology); Q1 PSYCHOLOGY, EXPERIMENTAL; Pub Date: 2025-11-05; DOI: 10.3758/s13428-025-02819-8
Nir Ofir, Ayelet N Landau

Multiple systems in the brain track the passage of time and can adapt their activity to temporal requirements. While the neural implementation of timing varies widely between neural substrates and behavioral tasks, at the algorithmic level, many of these behaviors can be described using drift-diffusion models of decision-making. In this work, we develop a drift-diffusion model to fit performance in the temporal generalization task, in which participants are required to categorize an interval as being the same or different compared to a standard, or reference, duration. The model includes a drift-diffusion process which starts with interval onset, representing the internal estimate of elapsed duration, and two boundaries. If the drift-diffusion process at interval offset is between the boundaries, the interval is categorized as equal to the standard. If it is below the lower boundary or above the upper boundary, the interval is categorized as different. This model outperformed previous models in fitting the data of single participants and in parameter recovery analyses. We also used the drift-diffusion model to analyze data from two experiments, one comparing performance between vision and audition and another examining the effect of learning. We found that decision boundaries can be modified independently: While the upper boundary was higher in vision than in audition, the lower boundary decreased with learning in the task. In both experiments, timing noise was positively correlated with upper boundaries across participants, which reflects an accuracy-maximizing strategy in the task.
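
The decision rule described in this abstract can be illustrated with a minimal sketch. It assumes the simplest possible reading, which the abstract itself does not confirm: a Wiener process with constant drift and Gaussian noise, so that the accumulated value at interval offset is normally distributed with mean `drift * duration` and standard deviation `noise * sqrt(duration)`. The function and parameter names (`p_same`, `drift`, `noise`, `lower`, `upper`) are hypothetical and are not taken from the authors' code.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_same(duration: float, drift: float, noise: float,
           lower: float, upper: float) -> float:
    """Probability of a 'same as standard' response for one interval.

    Assumption (not from the source): the process starts at 0 at interval
    onset and evolves as a Wiener process, so its value at offset is
    Normal(drift * duration, noise**2 * duration). The interval is judged
    'same' when that value lies between the two boundaries.
    """
    mean = drift * duration
    sd = noise * math.sqrt(duration)
    return normal_cdf((upper - mean) / sd) - normal_cdf((lower - mean) / sd)


if __name__ == "__main__":
    # Hypothetical parameter values, chosen only for illustration.
    for duration in (0.5, 0.8, 1.0, 1.2, 2.0):
        prob = p_same(duration, drift=1.0, noise=0.1, lower=0.8, upper=1.2)
        print(f"duration={duration:.1f}  P(same)={prob:.4f}")
```

Under these assumptions, raising `upper` widens the range of long intervals accepted as "same", while lowering `lower` does the same for short intervals, which is one way to picture how the two boundaries could shift independently.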

Behavior Research Methods, 57(12), 334.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12589256/pdf/
Citations: 0
Bridging numerical and verbal probabilities: Construction and application of the Chinese Lexicon of Verbal Probability.
IF 3.9; Zone 2 (Psychology); Q1 PSYCHOLOGY, EXPERIMENTAL; Pub Date: 2025-11-05; DOI: 10.3758/s13428-025-02853-6
Xiao-Yang Sui, Jia-Wen Niu, Xiaoqian Liu, Li-Lin Rao

Probabilities are typically expressed in two forms: numerical (e.g., 50%) and verbal (e.g., likely). In this regard, understanding how verbal probabilities map to their numerical equivalents is crucial for examining the probabilistic language used in various texts. This study addresses this issue by introducing the Chinese Lexicon of Verbal Probability (CLVP), comprising 343 verbal probability phrases that are each paired with corresponding numerical probabilities, membership functions, and frequency data from three corpora. We analyze the distribution of subjective values of verbal probability phrases in Chinese based on the CLVP, compare them with their English counterparts, and create a benchmark of seven high-frequency verbal probability phrases for organizational use. Overall, this study provides a valuable tool for converting verbal probabilities into numerical equivalents, contributing to cross-linguistic and cross-cultural research.

Behavior Research Methods, 57(12), 335.
Citations: 0
Continuous Rating Scale Analytics (CoRSA): A tool for analyzing continuous and discrete data with item response theory.
IF 3.9; Zone 2 (Psychology); Q1 PSYCHOLOGY, EXPERIMENTAL; Pub Date: 2025-11-04; DOI: 10.3758/s13428-025-02848-3
Yeh-Tai Chou, Yao-Ting Sung, Wei-Hung Yang

The use of continuous rating scales such as the visual analogue scale (VAS) in research has increased, yet they are less popular than discrete scales like the Likert scale. The lower popularity of continuous scales is primarily due to the lack of validated analytical tools and user-friendly interfaces, which together have also resulted in a lack of sufficient theoretical and empirical research supporting confidence in using continuous rating formats. This research aims to address these gaps through four studies. The first study proposed an algorithm and developed the Continuous Rating Scale Analytics (CoRSA) to estimate parameters for the continuous rating scale model (Müller, Psychometrika, 52, 165-181, 1987). The second study evaluated CoRSA's efficacy in analyzing continuous scores compared to pcIRT (Hohensinn, Journal of Statistical Software, 84, 1-14, 2018) and discrete scores against ConQuest (Adams et al., 2020). Results showed superior parameter recovery with CoRSA for continuous data and comparable outcomes for discrete data. The third study analyzed empirical data from career interest and work value assessments using both VAS and Likert scales with CoRSA, demonstrating good model-data fit and validating CoRSA's effectiveness in rescaling data to interval measurements. Finally, the fourth study integrated CoRSA into the VAS-RRP 2.0 platform (Sung & Wu, Behavior Research Methods, 50, 1694-1715, 2018) to enhance accessibility and usability, allowing researchers and practitioners unfamiliar with statistical procedures to easily analyze continuous data. These findings confirm CoRSA as a valid tool for analyzing both continuous and discrete data, enhancing the utility of continuous rating formats in diverse research contexts.

Behavior Research Methods, 57(12), 333.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12586417/pdf/
Citations: 0
Autoscribe: An automated tool for creating transcribed TextGrids from audio-recorded conversations.
IF 3.9; Zone 2 (Psychology); Q1 PSYCHOLOGY, EXPERIMENTAL; Pub Date: 2025-11-03; DOI: 10.3758/s13428-025-02850-9
Tyson S Barrett, Camille J Wynn, Lotte Eijk, Katerina A Tetzloff, Stephanie A Borrie

One major difficulty in conversational research is the time required to segment and transcribe conversational recordings. While recent advances have improved automatic speech recognition technologies, one limitation of current tools is that they are generally geared toward speech that occurs in monologues rather than conversation. Accordingly, the purpose of this project was to develop and validate an automated user-friendly tool for transcribing conversations. This tool, called Autoscribe, converts dyadic conversational audio recordings into Praat TextGrids with time-aligned turn boundaries between speech and non-speech segments and transcripts of all spoken dialogue output. Here we describe the development of this tool as well as its validation on two conversational corpora. Results showed that Autoscribe decreased the amount of active working time needed for TextGrid creation by over 70%. Average transcription accuracy was 92%, and average utterance boundary placement accuracy was 95%. Thus, Autoscribe affords a practical research tool that drastically reduces the time and resources needed for conversational segmentation and transcription.

Behavior Research Methods, 57(12), 332.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583283/pdf/
Citations: 0
Publisher Correction: A systematic review of latent class analysis in psychology: Examining the gap between guidelines and research practice.
IF 3.9; Zone 2 (Psychology); Q1 PSYCHOLOGY, EXPERIMENTAL; Pub Date: 2025-11-03; DOI: 10.3758/s13428-025-02871-4
Angela Sorgente, Rossella Caliciuri, Matteo Robba, Margherita Lanz, Bruno D Zumbo
Behavior Research Methods, 57(12), 331.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583348/pdf/
Citations: 0
Joint modeling with generalized item response theory model family and response time model: Enhancing model structural flexibility and data-fitting adequacy.
IF 3.9; Zone 2 (Psychology); Q1 PSYCHOLOGY, EXPERIMENTAL; Pub Date: 2025-11-03; DOI: 10.3758/s13428-025-02855-4
Jing Lu, Xue Wang, Jiwei Zhang

In this study, we propose a joint hierarchical model that combines a family of item response theory (IRT) models with a log-normal response time (RT) model to analyze item responses and response times. By incorporating RTs as auxiliary information, we improve the accuracy of latent trait estimation, thereby facilitating a deeper understanding of examinee performance. Additionally, we explore the use of either identical or distinct link functions across different items, allowing us to optimize IRT models for each item and improve overall model fit. We further investigate scenarios in which the joint distribution of speed and ability is nonlinear by integrating the generalized logit-linked IRT model with the log-normal random quadratic variable speed model. Compared to the traditional hierarchical model by van der Linden (Psychometrika, 72, 287-308, 2007), this integration yields more accurate estimates of ability, item difficulty, and discrimination parameters. Additionally, Bayesian model comparison reveals that the new joint hierarchical model provides a better fit than various models combining item responses and RTs, particularly when the data are derived from a joint RT and two-parameter IRT model with both symmetric and asymmetric link functions. Finally, a comprehensive analysis of data from the computer-based Program for International Student Assessment (PISA) science examination from 2015 is conducted to illustrate the proposed methodology.
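
For orientation, the general shape of a hierarchical response and response-time model of the kind this abstract builds on can be written out as below. This is only a sketch of the commonly used structure, with a two-parameter response model as the illustrative case. The symbols ($a_j$, $b_j$, $\alpha_j$, $\beta_j$, $\theta_i$, $\tau_i$) are notation assumed here, and the link function $F$ is left generic because the abstract does not state the authors' exact parameterization.

```latex
\begin{align}
P(Y_{ij} = 1 \mid \theta_i) &= F\bigl(a_j(\theta_i - b_j)\bigr), \\
\ln T_{ij} &= \beta_j - \tau_i + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N\bigl(0, \alpha_j^{-2}\bigr), \\
(\theta_i, \tau_i)^{\top} &\sim N_2(\boldsymbol{\mu}, \boldsymbol{\Sigma}).
\end{align}
```

Here $Y_{ij}$ and $T_{ij}$ are the response and response time of person $i$ on item $j$, $\theta_i$ is ability, and $\tau_i$ is speed. Choosing a logistic $F$ gives the familiar two-parameter logistic model, and allowing $F$ to differ across items corresponds to the "identical or distinct link functions" idea in the abstract. The nonlinear speed-ability extension mentioned there is not reproduced in this sketch.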

Behavior Research Methods, 57(12), 330.
Citations: 0
Cause for concern: Omitted cross-loadings in measurement models of nonlinear structural equation models.
IF 3.9; Zone 2 (Psychology); Q1 PSYCHOLOGY, EXPERIMENTAL; Pub Date: 2025-10-30; DOI: 10.3758/s13428-025-02792-2
Karina Navarro, Karin Schermelleh-Engel

Cross-loadings on non-target factors in measurement models of linear structural equation models (SEM) are often observed in empirical research but frequently disregarded. Previous research on linear SEM has already shown that omitted positive cross-loadings result in overestimated covariances of the latent predictor variables and distorted linear effects. For nonlinear SEM with interaction and quadratic effects, omitting cross-loadings has not been investigated. This study examines the consequences of omitted cross-loadings in both linear and nonlinear SEM using a single empirical dataset and a small simulation study. We focus on the bias patterns that emerge when cross-loadings, which reflect the multidimensionality of items, are either positive or negative and assess how these biases vary with the level of the latent predictor covariance. The empirical analysis reveals that constraining theoretically justified cross-loadings to zero results in systematic over- and underestimation of factor loadings and structural parameters, with more pronounced effects in the nonlinear component of the model, thereby altering the functional form of the relationships between the latent variables. The simulation study further illustrates that the direction and magnitude of bias in both linear and nonlinear SEM depend jointly on the sign of the cross-loadings and the level of the latent predictor covariance. These findings underscore the critical importance of incorporating cross-loadings only when they are theory-driven, so as to maintain an accurate representation of the functional relationships between latent constructs. Practical implications and challenges of including cross-loadings in the model are discussed.
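
To make the terminology concrete, a cross-loading and a nonlinear structural equation can be written in generic notation as follows. This is an illustrative sketch with two latent predictors, and the symbols are assumptions introduced here rather than the authors' own notation.

```latex
\begin{align}
x_1 &= \lambda_{11}\,\xi_1 + \lambda_{12}\,\xi_2 + \delta_1, \\
\eta &= \gamma_1 \xi_1 + \gamma_2 \xi_2
      + \omega_{12}\,\xi_1 \xi_2
      + \omega_{11}\,\xi_1^2 + \omega_{22}\,\xi_2^2 + \zeta.
\end{align}
```

In the first line, indicator $x_1$ targets factor $\xi_1$, and $\lambda_{12}$ is its cross-loading on the non-target factor $\xi_2$. Omitting the cross-loading means fixing $\lambda_{12} = 0$. The second line adds an interaction effect ($\omega_{12}$) and quadratic effects ($\omega_{11}$, $\omega_{22}$) to the linear effects, which are the nonlinear components the abstract reports as most affected.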

Behavior Research Methods, 57(12), 328. Published 2025-10-30. Journal Article.
Citations: 0
Do eye trackers estimate eyeball rotation? The relationship between tracked eye image feature and estimated saccadic waveform.
IF 3.9 Zone 2 (Psychology) Q1 PSYCHOLOGY, EXPERIMENTAL Pub Date: 2025-10-30 DOI: 10.3758/s13428-025-02862-5
Marcus Nyström, Diederick C Niehorster, Roy S Hessels, Richard Andersson, Marta K Skrok, Robert Konklewski, Patrycjusz Stremplewski, Maciej Nowakowski, Jakub Lipiński, Szymon Tamborski, Anna Szkulmowska, Maciej Szkulmowski, Ignace T C Hooge

The eyeball is not rigid and deforms during saccades. As a consequence, the saccade waveform recorded by an eye tracker may depend on which structure of the eye is used to estimate eyeball rotation. Here, we systematically describe and compare signals co-recorded from the retina, the cornea (corneal reflection, CR), the pupil, and the lens (fourth Purkinje reflection, P4) during saccades. We found that several commonly used parameters for saccade characterization differ systematically across the signals. For instance, saccades in the retinal signal had earlier onsets compared to saccades in the pupil and the P4 signals. The retinal signal had the smallest saccade amplitude and reached the peak saccade velocity earlier compared to the other signals. At the end of saccades, the retinal signal came to a stop faster than the other signals. We discuss possible explanations that may account for the relationship between the retinal signal and the other signals.
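
As a rough illustration of the saccade parameters named in the abstract (onset, amplitude, peak velocity), the Python sketch below extracts them from a one-dimensional position trace using a simple velocity threshold. It is not the authors' analysis pipeline: the function names, the 30 deg/s threshold, and the synthetic minimum-jerk traces are all assumptions made for this example.

```python
def synthetic_saccade(amplitude_deg, start, duration, n_samples):
    """Hypothetical test signal: a minimum-jerk position profile (degrees)."""
    trace = []
    for i in range(n_samples):
        # Normalised time within the movement, clamped to [0, 1].
        t = min(max((i - start) / duration, 0.0), 1.0)
        trace.append(amplitude_deg * (10 * t**3 - 15 * t**4 + 6 * t**5))
    return trace


def saccade_parameters(position_deg, sample_rate_hz, velocity_threshold=30.0):
    """Describe the first saccade in a position trace, or return None.

    The threshold (deg/s) is an assumed value, not one taken from the paper.
    """
    dt = 1.0 / sample_rate_hz
    # Central-difference velocity; the first and last samples stay at 0.
    velocity = [0.0] * len(position_deg)
    for i in range(1, len(position_deg) - 1):
        velocity[i] = (position_deg[i + 1] - position_deg[i - 1]) / (2 * dt)

    above = [i for i, v in enumerate(velocity) if abs(v) > velocity_threshold]
    if not above:
        return None

    onset = above[0]
    # The saccade ends at the last consecutive sample above threshold.
    offset = onset
    while offset + 1 < len(velocity) and abs(velocity[offset + 1]) > velocity_threshold:
        offset += 1

    peak = max(range(onset, offset + 1), key=lambda i: abs(velocity[i]))
    return {
        "onset_s": onset * dt,
        "offset_s": offset * dt,
        "amplitude_deg": abs(position_deg[offset] - position_deg[onset]),
        "peak_velocity_deg_s": abs(velocity[peak]),
        "time_to_peak_s": (peak - onset) * dt,
    }


# Two made-up signals sampled at 1000 Hz, standing in for co-recorded features.
retina_like = synthetic_saccade(9.5, start=50, duration=36, n_samples=200)
pupil_like = synthetic_saccade(10.0, start=53, duration=40, n_samples=200)
for name, trace in [("retina-like", retina_like), ("pupil-like", pupil_like)]:
    print(name, saccade_parameters(trace, sample_rate_hz=1000))
```

The two traces differ only because they were constructed to differ; the sketch shows how the same parameter definitions can yield different values for different signals, and it reproduces no result from the paper.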

Behavior Research Methods, 57(12), 329. Published 2025-10-30. Journal Article. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12575526/pdf/
Citations: 0
MATCH: A toolbox to assess the primary color of real-world objects and generate color-matching stimuli.
IF 3.9 Zone 2 (Psychology) Q1 PSYCHOLOGY, EXPERIMENTAL Pub Date: 2025-10-29 DOI: 10.3758/s13428-025-02856-3
Jessica N Goetz, Mark B Neider

Real-world stimuli can be difficult to manipulate and control in experimental psychology studies. Color information is frequently used as a variable, and researchers often rely on subjective color labels that imprecisely describe the color information within real-world objects. Here, we describe a new toolbox called MATCH (Matching And Transforming Closely Hued objects) that can easily and objectively quantify and manipulate color information within real-world objects to generate object pairs that match in color. MATCH was designed incorporating theoretical frameworks and conceptual understanding from visual cognition research. Additionally, MATCH provides critical information on the distribution of color and the specific color values of any stimulus set. We also present two experimental studies to validate whether MATCH produces images that are consistent with human visual perception. In the first study, we provide evidence that the stimuli generated by MATCH are perceptually closer in color to a reference object compared to human categorization of object-color pairs. In the second study, we investigated the search for real-world objects with distractors generated by MATCH that matched the target object's color. We found patterns of data that are consistent with current theories of human search behavior. In summary, MATCH allows researchers to carefully control the color of real-world stimuli used in their studies.
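
To make the idea of objectively quantifying an object's primary color concrete, here is a minimal Python sketch that bins pixel hues and reports the most common bin. It is not the MATCH toolbox or its algorithm: the function names, the 12-bin hue wheel, and the saturation and brightness cut-offs are hypothetical choices made for this illustration.

```python
import colorsys
from collections import Counter


def primary_hue_bin(pixels_rgb, n_bins=12, min_saturation=0.2, min_value=0.2):
    """Return the most common hue bin among colorful pixels, or None.

    pixels_rgb is an iterable of (r, g, b) tuples with values in 0..255.
    Bins are centered on hues k / n_bins, so bin 0 is centered on pure red.
    """
    counts = Counter()
    for r, g, b in pixels_rgb:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if s < min_saturation or v < min_value:
            continue  # near-gray or near-black pixels carry little hue information
        counts[round(h * n_bins) % n_bins] += 1
    if not counts:
        return None
    return counts.most_common(1)[0][0]


def hue_bin_distance(a, b, n_bins=12):
    """Circular distance between two hue bins (hue wraps around at red)."""
    d = abs(a - b) % n_bins
    return min(d, n_bins - d)


# Made-up "objects": flat lists of pixels rather than real images.
red_object = [(200, 30, 30)] * 50 + [(120, 120, 120)] * 30 + [(30, 30, 200)] * 10
orange_object = [(230, 120, 20)] * 40 + [(40, 40, 40)] * 20
blue_object = [(30, 30, 200)] * 60

reference = primary_hue_bin(red_object)
for name, pixels in [("orange", orange_object), ("blue", blue_object)]:
    candidate = primary_hue_bin(pixels)
    print(name, "hue-bin distance to reference:", hue_bin_distance(reference, candidate))
```

A numeric hue distance of this kind is one way to replace a subjective label such as "reddish" with a reproducible value; a real pipeline would also need object segmentation and a perceptually motivated color space, which this sketch leaves out.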

Behavior Research Methods, 57(12), 327. Published 2025-10-29. Journal Article.
Citations: 0