
Latest articles in Behavior Research Methods

A practice-oriented guide to statistical inference in linear modeling for non-normal or heteroskedastic error distributions.
IF 3.9 | CAS Tier 2 (Psychology) | Q1 PSYCHOLOGY, EXPERIMENTAL | Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02801-4
Hanna Rajh-Weber, Stefan Ernest Huber, Martin Arendasy

Selecting an appropriate statistical method is a challenge frequently encountered by applied researchers, especially if assumptions for classical, parametric approaches are violated. To provide some guidelines and support, we compared classical hypothesis tests with their typical distributional assumptions of normality and homoskedasticity with common and easily accessible alternative inference methods (HC3, HC4, and six bootstrap methods) in the framework of ordinary least squares (OLS) regression. The methods' performance was assessed for four different regression models with varying levels of non-normality and heteroskedasticity of errors, and for five different sample sizes ranging from 25 to 500 cases. For each scenario, 10,000 samples of observations were generated. Type I error and coverage rates, power, and standard error bias were examined to assess the methods' performance. No method considered here performed satisfactorily on all accounts. Using HC3 or HC4 standard errors, or a wild bootstrap procedure with percentile confidence intervals, could yield reliable results in many, but not all, scenarios. We suppose that, in the case of assumption violations, researchers might refer to a method that performed best in a scenario most similar to their data situation. To aid the selection of an appropriate method, we provide tables comparing relative performances in all considered scenarios.
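The HC3 estimator compared in the abstract can be sketched in a few lines: it replaces the homoskedastic OLS variance with a sandwich estimator whose weights are squared residuals inflated by leverage. A minimal illustration (the function name and interface are ours, not the authors'):

```python
import numpy as np

def hc3_standard_errors(X, y):
    """OLS coefficients with HC3 heteroskedasticity-consistent standard errors.

    Illustrative sketch of the HC3 estimator discussed in the abstract,
    not the authors' simulation code.
    """
    X = np.column_stack([np.ones(len(y)), X])     # add intercept column
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y                      # OLS estimates
    resid = y - X @ beta
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)   # leverage (hat) values
    omega = (resid / (1.0 - h)) ** 2              # HC3 weights
    cov = XtX_inv @ (X.T * omega) @ X @ XtX_inv   # sandwich covariance
    return beta, np.sqrt(np.diag(cov))
```

Unlike the classical variance formula, the weights grow for high-leverage observations, which is what makes HC3 comparatively robust at the small sample sizes the study examines.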

Behavior Research Methods, 57(12), 338. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602623/pdf/
Citations: 0
Dimensionality reduction techniques in pupillometry research: A primer for behavioral scientists.
IF 3.9 | CAS Tier 2 (Psychology) | Q1 PSYCHOLOGY, EXPERIMENTAL | Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02786-0
Serena Castellotti, Irene Petrizzo, Roberto Arrighi, Elvio Blini

The measurement of pupil size is a classic tool in psychophysiology, but its popularity has recently surged due to the rapid developments of the eye-tracking industry. Concurrently, several authors have outlined a wealth of strategies for tackling pupillary recordings analytically. The consensus is that the "temporal" aspect of changes in pupil size is key, and that the analytical approach should be mindful of the temporal factor. Here we take a more radical stance on the matter by suggesting that, by the time significant changes in pupil size are detected, it is already too late. We suggest that these changes are indeed the result of distinct, core physiological processes that originate several hundreds of milliseconds before that moment and altogether shape the observed signal. These processes can be recovered indirectly by leveraging dimensionality reduction techniques. Here we therefore outline key concepts of temporal principal components analysis and related rotations to show that they reveal a latent, low-dimensional space that represents these processes very efficiently: a pupillary manifold. We elaborate on why assessing the pupillary manifold provides an alternative, appealing analytical solution for data analysis. In particular, dimensionality reduction returns scores that are (1) mindful of the relevant physiology underlying the observed changes in pupil size, (2) extremely handy and manageable for statistical modelling, and (3) devoid of several arbitrary choices. We elaborate on these points in the form of a tutorial paper for the functions provided in the accompanying R library "Pupilla."
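Temporal PCA of the kind described can be sketched as an SVD of the trial-by-time matrix; the per-trial component scores are the low-dimensional summary the authors call a pupillary manifold. An illustrative sketch (not the Pupilla implementation, and in Python rather than R):

```python
import numpy as np

def temporal_pca(traces, n_components=3):
    """PCA over the time dimension of pupil traces (trials x time points).

    Toy illustration of temporal PCA as described in the abstract;
    rotations and preprocessing used by Pupilla are omitted.
    """
    centered = traces - traces.mean(axis=0)            # center each time point
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]                     # temporal loadings
    scores = centered @ components.T                   # per-trial scores
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return scores, components, explained
```

The scores matrix, one row per trial and one column per component, is the kind of compact, model-ready quantity the abstract recommends over point-by-point testing of the raw traces.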

Behavior Research Methods, 57(12), 337. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602682/pdf/
Citations: 0
Why you shouldn't trust data collected on MTurk.
IF 3.9 | CAS Tier 2 (Psychology) | Q1 PSYCHOLOGY, EXPERIMENTAL | Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02852-7
Cameron S Kay

Several prior studies have used advanced methodological techniques to demonstrate that there is an issue with the quality of data that can be collected on Amazon's Mechanical Turk (MTurk). The goal of the present project was to provide an accessible demonstration of this issue. We administered 27 semantic antonyms (pairs of items that assess clearly contradictory content, e.g., "I talk a lot" and "I rarely talk") to samples drawn from Connect (N1 = 100), Prolific (N2 = 100), and MTurk (N3 = 400; N4 = 600). Despite most of these item pairs being negatively correlated on Connect and Prolific, over 96% were positively correlated on MTurk. This issue could not be remedied by screening the data using common attention check measures nor by recruiting only "high-productivity" and "high-reputation" participants. These findings provide clear evidence that data collected on MTurk simply cannot be trusted.
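The paper's diagnostic rests on a simple computation: correlate responses to each antonym pair and check the sign. A hedged sketch (the interface and item indices are illustrative, not the paper's code):

```python
import numpy as np

def antonym_correlations(responses, pairs):
    """Correlation between each antonym item pair.

    responses: array of shape (participants, items); pairs: list of
    (item_a, item_b) index tuples. A positive correlation for an
    antonym pair flags inconsistent responding. Toy sketch only.
    """
    return {(a, b): float(np.corrcoef(responses[:, a], responses[:, b])[0, 1])
            for a, b in pairs}
```

Attentive responders should produce strongly negative values for true antonym pairs; the study's headline result is that MTurk samples instead produced positive correlations for over 96% of pairs.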

Behavior Research Methods, 57(12), 340.
Citations: 0
Correcting for selection bias after conditioning on a sum score in the Ising model.
IF 3.9 | CAS Tier 2 (Psychology) | Q1 PSYCHOLOGY, EXPERIMENTAL | Pub Date: 2025-11-10 | DOI: 10.3758/s13428-025-02820-1
Jesse Boot, Jill de Ron, Jonas Haslbeck, Sacha Epskamp

In psychological studies, it is common practice to select a sample based on the sum score of the modeled variables (e.g., based on symptom severity when investigating the associations between those same symptoms). However, this practice introduces bias if the sum score selection imperfectly defines the population of interest. Here, we propose a correction for this type of selection bias in the Ising model, a popular network model for binary data. Our correction applies when one wants to obtain (1) full-population estimates when only a sum-score-selected subset of the data is available, and (2) improved estimates of a subpopulation when the observed sample mixes populations that differ in their sum scores. In a simulation study, we verify that our correction recovers the network structure of the desired population after a sum score selection using both a node-wise regression and a multivariate estimation of the Ising model. In an example, we show how our correction can be used in practice using empirical data on symptoms of major depression from the National Comorbidity Study Replication (N = 9,282). We implemented our correction in four commonly used R packages for estimating the Ising model, namely IsingFit, IsingSampler, psychonetrics, and bootnet.
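For a small network, the Ising distribution can be enumerated exactly, which makes the selection problem easy to see: conditioning on a sum score renormalizes probability mass over a slice of the state space, which distorts pairwise associations. A toy sketch (not the paper's correction, and in Python rather than the R packages named):

```python
import itertools
import numpy as np

def ising_distribution(tau, omega):
    """Exact distribution of a small Ising model with {0,1} states.

    tau: thresholds (length p); omega: symmetric interaction matrix.
    Brute-force enumeration; feasible only for small p. Toy example,
    not the paper's estimator.
    """
    p = len(tau)
    states = np.array(list(itertools.product([0, 1], repeat=p)))
    # Energy: sum_i tau_i x_i + sum_{i<j} omega_ij x_i x_j
    energy = states @ tau + np.einsum("si,ij,sj->s", states, np.triu(omega, 1), states)
    probs = np.exp(energy)
    return states, probs / probs.sum()

def condition_on_sum(states, probs, total):
    """Renormalized distribution after selecting on a fixed sum score."""
    mask = states.sum(axis=1) == total
    return states[mask], probs[mask] / probs[mask].sum()
```

Comparing node associations in the full versus the conditioned distribution reproduces, in miniature, the selection bias the paper corrects for.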

Behavior Research Methods, 57(12), 341. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12602587/pdf/
Citations: 0
A tutorial on fine-tuning pretrained language models: Applications in social and behavioral science research.
IF 3.9 | CAS Tier 2 (Psychology) | Q1 PSYCHOLOGY, EXPERIMENTAL | Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02868-z
Yu Wang, Wen Qu

Natural language is a primary medium for expressing thoughts and emotions, making text analysis a vital tool in psychological research. It enables insights into personality traits, mental health, and sentiment in interpersonal communication. Traditional approaches, such as human coding, dictionary-based methods, or training models from scratch, often suffer from limitations, including inefficiency, incomplete coverage, or high data requirements. This tutorial introduces the pretrain-finetune paradigm, a transformative approach in text analysis and natural language processing (NLP) that leverages large pretrained language models. Unlike conventional methods, this paradigm enables efficient fine-tuning even with limited labeled data, making it particularly valuable for social science research where annotated samples are scarce. Our tutorial offers a comprehensive introduction to the pretrain-finetune framework, beginning with core concepts of pretraining and fine-tuning, followed by hands-on exercises with real-world applications and the introduction of finetuneR, an R package developed to make these methods accessible to R users, and concluding with a discussion of common misconceptions in existing resources and best practices. We demonstrate its effectiveness across diverse tasks, including multi-class classification and regression, showing its advantages over traditional methods, feature extraction-based approaches, and GPT-based strategies. By emphasizing its efficiency, accessibility, and superior performance, this tutorial aims to encourage broader adoption of the pretrain-finetune paradigm in psychological and behavioral research.
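The core of the pretrain-finetune idea, fitting a small task head on top of a frozen pretrained encoder, can be illustrated without any language model at all: treat precomputed features as the frozen encoder's output and train only a logistic head on them. A toy stand-in (not finetuneR, and no real pretrained model involved):

```python
import numpy as np

def finetune_head(features, labels, lr=0.1, epochs=200):
    """Train a logistic classification head on frozen 'pretrained' features.

    features: array (n, d) standing in for a frozen encoder's output;
    labels: binary array (n,). Gradient descent on cross-entropy, with
    the encoder untouched. Toy sketch of the paradigm, not a recipe.
    """
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.01, size=features.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(features @ w + b)))  # sigmoid
        grad = p - labels                              # cross-entropy gradient
        w -= lr * features.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b
```

Because only the head's few parameters are learned, a handful of labeled examples can suffice, which is the property the tutorial highlights for annotation-scarce social science data.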

Behavior Research Methods, 57(12), 336.
Citations: 0
A drift-diffusion model of temporal generalization outperforms existing models and captures modality differences and learning effects.
IF 3.9 | CAS Tier 2 (Psychology) | Q1 PSYCHOLOGY, EXPERIMENTAL | Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02819-8
Nir Ofir, Ayelet N Landau

Multiple systems in the brain track the passage of time and can adapt their activity to temporal requirements. While the neural implementation of timing varies widely between neural substrates and behavioral tasks, at the algorithmic level, many of these behaviors can be described using drift-diffusion models of decision-making. In this work, we develop a drift-diffusion model to fit performance in the temporal generalization task, in which participants are required to categorize an interval as being the same or different compared to a standard, or reference, duration. The model includes a drift-diffusion process which starts with interval onset, representing the internal estimate of elapsed duration, and two boundaries. If the drift-diffusion process at interval offset is between the boundaries, the interval is categorized as equal to the standard. If it is below the lower boundary or above the upper boundary, the interval is categorized as different. This model outperformed previous models in fitting the data of single participants and in parameter recovery analyses. We also used the drift-diffusion model to analyze data from two experiments, one comparing performance between vision and audition and another examining the effect of learning. We found that decision boundaries can be modified independently: While the upper boundary was higher in vision than in audition, the lower boundary decreased with learning in the task. In both experiments, timing noise was positively correlated with upper boundaries across participants, which reflects an accuracy-maximizing strategy in the task.
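The decision rule described, "same" if the accumulator at interval offset lies between two boundaries, can be simulated directly by sampling the accumulator's state at offset. A toy sketch with made-up parameters, not the paper's fitted model:

```python
import numpy as np

def simulate_generalization(duration, standard, lower, upper,
                            noise=0.2, n_trials=10000, seed=0):
    """Probability of a 'same as standard' response under a diffusion sketch.

    The accumulator drifts at rate 1/standard, so its mean reaches 1 at
    the standard duration; diffusion noise scales with sqrt(duration).
    A 'same' response occurs when the accumulator's value at interval
    offset lies between the two boundaries. Parameters are illustrative.
    """
    rng = np.random.default_rng(seed)
    x = duration / standard + noise * np.sqrt(duration) * rng.standard_normal(n_trials)
    return float(np.mean((x > lower) & (x < upper)))
```

Run across a range of durations, this rule yields the familiar peaked generalization gradient: "same" responses are most likely at the standard and fall off for shorter and longer intervals.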

Behavior Research Methods, 57(12), 334. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12589256/pdf/
Citations: 0
Bridging numerical and verbal probabilities: Construction and application of the Chinese Lexicon of Verbal Probability.
IF 3.9 | CAS Tier 2 (Psychology) | Q1 PSYCHOLOGY, EXPERIMENTAL | Pub Date: 2025-11-05 | DOI: 10.3758/s13428-025-02853-6
Xiao-Yang Sui, Jia-Wen Niu, Xiaoqian Liu, Li-Lin Rao

Probabilities are typically expressed in two forms: numerical (e.g., 50%) and verbal (e.g., likely). In this regard, understanding how verbal probabilities map to their numerical equivalents is crucial for examining the probabilistic language used in various texts. This study addresses this issue by introducing the Chinese Lexicon of Verbal Probability (CLVP), comprising 343 verbal probability phrases that are each paired with corresponding numerical probabilities, membership functions, and frequency data from three corpora. We analyze the distribution of subjective values of verbal probability phrases in Chinese based on the CLVP, compare them with their English counterparts, and create a benchmark of seven high-frequency verbal probability phrases for organizational use. Overall, this study provides a valuable tool for converting verbal probabilities into numerical equivalents, contributing to cross-linguistic and cross-cultural research.
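A lexicon entry of this kind pairs a phrase with a peak numerical probability and a membership function over [0, 1]. A sketch using a triangular membership function and hypothetical values (the CLVP's actual phrases, peaks, and functions come from the published lexicon):

```python
def membership(x, peak, lower, upper):
    """Triangular membership function for a verbal probability phrase.

    Returns how well numerical probability x fits the phrase: 1.0 at
    the peak, falling linearly to 0.0 at the lower and upper bounds.
    Shape and parameter values are illustrative, not from the CLVP.
    """
    if x <= lower or x >= upper:
        return 0.0
    if x <= peak:
        return (x - lower) / (peak - lower)
    return (upper - x) / (upper - peak)
```

Converting a text's verbal probabilities then amounts to looking up each phrase's peak (e.g., a hypothetical entry mapping "likely" to 0.7), with the membership function quantifying how far neighboring numerical values remain consistent with the phrase.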

Behavior Research Methods, 57(12), 335.
Citations: 0
Continuous Rating Scale Analytics (CoRSA): A tool for analyzing continuous and discrete data with item response theory.
IF 3.9 | CAS Tier 2 (Psychology) | Q1 PSYCHOLOGY, EXPERIMENTAL | Pub Date: 2025-11-04 | DOI: 10.3758/s13428-025-02848-3
Yeh-Tai Chou, Yao-Ting Sung, Wei-Hung Yang

The use of continuous rating scales such as the visual analogue scale (VAS) in research has increased, yet they remain less popular than discrete scales such as the Likert scale. This is primarily due to the lack of validated analytical tools and user-friendly interfaces, which has in turn limited the theoretical and empirical research needed to support confidence in continuous rating formats. This research aims to address these gaps through four studies. The first study proposed an algorithm and developed the Continuous Rating Scale Analytics (CoRSA) to estimate parameters for the continuous rating scale model (Müller, Psychometrika, 52, 165-181, 1987). The second study evaluated CoRSA's efficacy in analyzing continuous scores compared to pcIRT (Hohensinn, Journal of Statistical Software, 84, 1-14, 2018) and discrete scores against ConQuest (Adams et al., 2020). Results showed superior parameter recovery with CoRSA for continuous data and comparable outcomes for discrete data. The third study analyzed empirical data from career interest and work value assessments using both VAS and Likert scales with CoRSA, demonstrating good model-data fit and validating CoRSA's effectiveness in rescaling data to interval measurements. Finally, the fourth study integrated CoRSA into the VAS-RRP 2.0 platform (Sung & Wu, Behavior Research Methods, 50, 1694-1715, 2018) to enhance accessibility and usability, allowing researchers and practitioners unfamiliar with statistical procedures to easily analyze continuous data. These findings confirm CoRSA as a valid tool for analyzing both continuous and discrete data, enhancing the utility of continuous rating formats in diverse research contexts.
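Before any model fitting, VAS marks must be rescaled from raw line positions; a common first step maps them into (0, 1) and then onto an unbounded logit scale. An illustrative transform only; CoRSA fits Müller's continuous rating scale model, which this sketch does not implement:

```python
import numpy as np

def vas_to_logit(marks, line_length=100.0, eps=0.5):
    """Rescale raw VAS marks (distance along the line) to logits.

    Marks are mapped to proportions of the line, clipped away from the
    endpoints by eps (in line units) so the logit stays finite. A simple
    illustrative step toward interval-level scores, not the CoRSA/CRSM
    estimation procedure.
    """
    p = np.clip(marks / line_length, eps / line_length, 1 - eps / line_length)
    return np.log(p / (1 - p))
```

The transform is monotonic and symmetric about the midpoint of the line, stretching responses near the endpoints, where bounded raw scores compress differences.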

Behavior Research Methods, 57(12), Article 333. Published 2025-11-04. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12586417/pdf/
Citations: 0
Autoscribe: An automated tool for creating transcribed TextGrids from audio-recorded conversations.
IF 3.9 · Zone 2 (Psychology) · Q1 PSYCHOLOGY, EXPERIMENTAL · Pub Date: 2025-11-03 · DOI: 10.3758/s13428-025-02850-9
Tyson S Barrett, Camille J Wynn, Lotte Eijk, Katerina A Tetzloff, Stephanie A Borrie

One major difficulty in conversational research is the time required to segment and transcribe conversational recordings. While recent advances have improved automatic speech recognition technologies, current tools are generally geared toward monologue speech rather than conversation. Accordingly, the purpose of this project was to develop and validate an automated, user-friendly tool for transcribing conversations. This tool, called Autoscribe, converts dyadic conversational audio recordings into Praat TextGrids with time-aligned turn boundaries between speech and non-speech segments and transcripts of all spoken dialogue. Here we describe the development of this tool as well as its validation on two conversational corpora. Results showed that Autoscribe reduced the active working time needed for TextGrid creation by over 70%, with an average transcription accuracy of 92% and an average utterance boundary placement accuracy of 95%. Thus, Autoscribe offers a practical research tool that drastically reduces the time and resource intensity of conversational segmentation and transcription.
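The Praat TextGrid output format mentioned above is a plain-text file with nested interval tiers. As a hedged illustration of that target format (the helper function, tier names, timings, and utterances below are invented for demonstration and are not taken from the Autoscribe code), a minimal writer for a dyadic TextGrid might look like:

```python
def write_textgrid(path, xmax, tiers):
    """Write a Praat long-format TextGrid.

    tiers: dict mapping tier name -> list of (xmin, xmax, text) intervals
    that together cover [0, xmax]; empty text marks a non-speech segment.
    """
    lines = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {xmax}",
        "tiers? <exists>",
        f"size = {len(tiers)}",
        "item []:",
    ]
    for i, (name, intervals) in enumerate(tiers.items(), start=1):
        lines += [
            f"    item [{i}]:",
            '        class = "IntervalTier"',
            f'        name = "{name}"',
            "        xmin = 0",
            f"        xmax = {xmax}",
            f"        intervals: size = {len(intervals)}",
        ]
        for j, (lo, hi, text) in enumerate(intervals, start=1):
            lines += [
                f"        intervals [{j}]:",
                f"            xmin = {lo}",
                f"            xmax = {hi}",
                f'            text = "{text}"',
            ]
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")

# Hypothetical dyad: one IntervalTier per speaker, with time-aligned turns.
write_textgrid(
    "dyad.TextGrid",
    xmax=5.0,
    tiers={
        "speakerA": [(0.0, 2.1, "so how was your week"), (2.1, 5.0, "")],
        "speakerB": [(0.0, 2.1, ""), (2.1, 4.8, "pretty good thanks"), (4.8, 5.0, "")],
    },
)
```

Each speaker gets one interval tier whose segments tile the full recording, so turn boundaries are recoverable directly from the interval edges.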

Behavior Research Methods, 57(12), Article 332. Published 2025-11-03. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583283/pdf/
Citations: 0
Publisher Correction: A systematic review of latent class analysis in psychology: Examining the gap between guidelines and research practice.
IF 3.9 · Zone 2 (Psychology) · Q1 PSYCHOLOGY, EXPERIMENTAL · Pub Date: 2025-11-03 · DOI: 10.3758/s13428-025-02871-4
Angela Sorgente, Rossella Caliciuri, Matteo Robba, Margherita Lanz, Bruno D Zumbo
Behavior Research Methods, 57(12), Article 331. Published 2025-11-03. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12583348/pdf/
Citations: 0