Forced-choice measures are an alternative to rating-scale surveys designed to reduce response bias, particularly socially desirable responding, by requiring respondents to make rank-order comparisons among two or more statements at a time. Although forced-choice instruments have been used in psychological testing since at least the 1940s, recent methodological advances in item response theory modeling have enabled the estimation of normative scores from the raw ipsative data these assessments produce. These new scoring methods, by making full cross-person comparisons possible, have spurred renewed use of forced-choice tests. This paper chronicles the historical development of forced-choice instruments up to the pivotal introduction of item response models for scoring and uses that foundation to review contemporary methods for their construction and analysis. Our review of modern methods begins by examining approaches to constructing forced-choice blocks, including the use of mean indices, interitem agreement coefficients, and factor loadings. We then discuss the ideal-point and dominance-based item response models used to evaluate the internal structure of forced-choice assessments and compute scores, as well as methods for assessing differential item functioning. Throughout the review, we also synthesize literature on evaluating response processes, reliability, and other considerations in test construction. Finally, we discuss ongoing debates regarding the extent to which forced-choice measures effectively limit response bias, particularly when negatively keyed items are included in blocks, and conclude by outlining directions for future research. To support engagement with the historical literature, we provide an annotated bibliography spanning more than eight decades of forced-choice research. (PsycInfo Database Record (c) 2026 APA, all rights reserved).
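As a minimal, purely illustrative sketch of one block-construction idea mentioned above (matching statements on social desirability means so that no statement in a block is obviously preferable), the following Python snippet greedily pairs items from different traits with similar desirability ratings. All item names, trait labels, and desirability values are hypothetical, and the sketch ignores the agreement coefficients and factor loadings also discussed in the review.

```python
import numpy as np
import pandas as pd

# Hypothetical item pool: each item measures one trait and has a mean
# social desirability rating (names and values are illustrative only).
items = pd.DataFrame({
    "item": [f"item_{i}" for i in range(8)],
    "trait": ["A", "A", "B", "B", "C", "C", "D", "D"],
    "desirability": [4.1, 3.2, 4.0, 3.3, 4.2, 3.1, 3.9, 3.4],
})

# Greedy pairing: combine items from different traits whose desirability
# means are as close as possible.
pool = items.sort_values("desirability").reset_index(drop=True)
blocks, used = [], set()
for i, row in pool.iterrows():
    if i in used:
        continue
    candidates = pool[(~pool.index.isin(used)) & (pool.index != i)
                      & (pool["trait"] != row["trait"])]
    if candidates.empty:
        continue
    j = (candidates["desirability"] - row["desirability"]).abs().idxmin()
    blocks.append((row["item"], pool.loc[j, "item"]))
    used.update({i, j})

print(blocks)  # list of (item, item) pairs forming forced-choice blocks
```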
In psychological research, the common factor model is the most popular measurement model for scale items. However, there is increasing awareness that alternative measurement models, such as formative models, may make more theoretical sense for many kinds of psychological data. We demonstrate the nesting structure of three models specified in a structural equation modeling framework: a reflective confirmatory factor analysis (CFA), a formative Henseler-Ogasawara confirmatory composite analysis, and a formative pseudo-indicator model. Unlike the CFA, the Henseler-Ogasawara confirmatory composite analysis and the pseudo-indicator model allow for the specification of composites in the structural equation modeling framework. In this article, we establish both theoretically and empirically that these three models are nested within one another, as long as the structural part of each model is saturated. As such, the three models can be compared via a chi-square difference test and other fit indices developed for nested models. We report the results of a small simulation evaluating whether the chi-square difference test and the root-mean-square error of approximation (RMSEA) based on it (RMSEA_D) can reliably discern whether data were sampled from a CFA or a formative measurement model, varying sample size, indicator weights, and the strength of the correlation with another concept. In two empirical examples, we illustrate how tools for nested model comparison can be used to distinguish between reflective and formative measurement models. (PsycInfo Database Record (c) 2026 APA, all rights reserved).
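For readers unfamiliar with the mechanics of the comparison described above, here is a minimal sketch of a chi-square difference test and the RMSEA_D computed from it. The fit statistics, sample size, and model labels are hypothetical (not taken from the article), and the RMSEA_D formula shown is the conventional one based on the difference in chi-square values and degrees of freedom.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical fit statistics for two nested measurement models
# (the more restrictive model has the larger degrees of freedom).
chisq_restricted, df_restricted = 58.4, 24
chisq_free, df_free = 31.7, 19
n = 500  # assumed sample size

# Chi-square difference test for nested models.
d_chisq = chisq_restricted - chisq_free
d_df = df_restricted - df_free
p_value = chi2.sf(d_chisq, d_df)

# RMSEA based on the difference test (RMSEA_D), conventional formula.
rmsea_d = np.sqrt(max(d_chisq - d_df, 0) / (d_df * (n - 1)))

print(f"chi-square difference = {d_chisq:.2f}, df = {d_df}, p = {p_value:.4f}")
print(f"RMSEA_D = {rmsea_d:.3f}")
```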
The effectiveness of a just-in-time adaptive intervention relies on accurate algorithms (i.e., decision rules) that determine when and how interventions should be administered. Yet empirical investigations evaluating the performance of such decision rules remain scarce. Simulation can be a useful tool for evaluating and refining a range of decision rules prior to implementing just-in-time adaptive interventions in real-world settings. In this study, we evaluate the performance of various decision rules using both an existing data set and a simulated data set that includes measures of craving and alcohol consumption. The tested decision rules include adaptive algorithms, such as previous-day mean craving and online logistic regression, as well as fixed thresholds (e.g., a craving score larger than 1 on a 7-point Likert scale). For each decision rule, we generated confusion matrices and compared performance metrics, including accuracy, specificity, and sensitivity, as well as the number of interventions sent prior to drinking. To assess the robustness of our findings, we simulated a range of data sets with varying underlying distributions and tested decision rule performance across these conditions. In addition, we conducted a multilevel logistic regression to identify the time lag at which the association between the predictor and the outcome variable was strongest. The presented method illustrates an approach to testing and refining one's decision rules prior to launching a time-intensive, smartphone-based real-time intervention. A tutorial for conducting such simulations, along with analysis code, is provided online and in the supplementary materials. (PsycInfo Database Record (c) 2026 APA, all rights reserved).
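As a concrete illustration of the kind of evaluation described above, the sketch below scores a fixed-threshold decision rule against observed drinking with a confusion matrix and the usual performance metrics. The data, column names, and threshold are assumptions for illustration only and are not the study's data or code.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row per participant-day, with a craving rating
# (1-7 Likert scale) and whether drinking occurred later that day.
rng = np.random.default_rng(42)
data = pd.DataFrame({
    "craving": rng.integers(1, 8, size=200),
    "drank": rng.integers(0, 2, size=200),
})

# Fixed-threshold decision rule: trigger an intervention when craving > 1.
data["intervene"] = (data["craving"] > 1).astype(int)

# Confusion matrix comparing the rule's decisions with observed drinking.
tp = ((data["intervene"] == 1) & (data["drank"] == 1)).sum()
fp = ((data["intervene"] == 1) & (data["drank"] == 0)).sum()
fn = ((data["intervene"] == 0) & (data["drank"] == 1)).sum()
tn = ((data["intervene"] == 0) & (data["drank"] == 0)).sum()

accuracy = (tp + tn) / len(data)
sensitivity = tp / (tp + fn)   # share of drinking days preceded by an intervention
specificity = tn / (tn + fp)   # share of non-drinking days left alone
n_sent = int(data["intervene"].sum())

print(f"accuracy={accuracy:.2f}, sensitivity={sensitivity:.2f}, "
      f"specificity={specificity:.2f}, interventions sent={n_sent}")
```

In practice, the same scoring loop can be repeated over alternative thresholds or adaptive rules and over data sets simulated from different underlying distributions, which is the robustness check the abstract describes.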
Equivalence testing, also called negligible effect significance testing (NEST), is appropriate when a researcher would like to find evidence of a negligible association. However, because equivalence testing/NEST procedures are newer and considerably less popular than traditional difference-based null hypothesis significance testing, a gentle introduction to these methods is useful. Accordingly, this tutorial article provides an overview of NEST/equivalence testing procedures by describing the nature of the procedures, explaining when they should be used, defining what considerations should go into their application (including selecting a minimally meaningful effect size), and outlining how they may be conducted and interpreted. The article also includes examples and code in open-source software to illustrate how these procedures may be applied to real data. (PsycInfo Database Record (c) 2026 APA, all rights reserved). 
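For readers who want a concrete starting point, here is a minimal sketch of one common equivalence procedure, the two one-sided tests (TOST) approach applied to a correlation via Fisher's z transformation. The simulated data and the equivalence bounds of plus or minus .20 are assumptions for illustration, not recommendations from the tutorial.

```python
import numpy as np
from scipy import stats

# Hypothetical data: two weakly related variables.
rng = np.random.default_rng(1)
x = rng.normal(size=150)
y = 0.05 * x + rng.normal(size=150)
n = len(x)

# Smallest effect size of interest (equivalence bounds), chosen by the analyst.
lower, upper = -0.20, 0.20

r = np.corrcoef(x, y)[0, 1]
se = 1.0 / np.sqrt(n - 3)  # standard error of Fisher's z

# Two one-sided tests on the Fisher-z scale.
z_lower = (np.arctanh(r) - np.arctanh(lower)) / se   # H0: rho <= lower bound
z_upper = (np.arctanh(r) - np.arctanh(upper)) / se   # H0: rho >= upper bound
p_lower = stats.norm.sf(z_lower)
p_upper = stats.norm.cdf(z_upper)

# Equivalence is claimed only if both one-sided tests reject.
p_tost = max(p_lower, p_upper)
print(f"r = {r:.3f}, TOST p = {p_tost:.4f}")
```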

