Psychological Methods: latest articles, page 10

Detecting mediation effects with the Bayes factor: Performance evaluation and tools for sample size determination.

IF 7 | Zone 1, Psychology | Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Psychological methods

Pub Date : 2024-05-23 DOI: 10.1037/met0000670

Xiao Liu, Zhiyong Zhang, Lijuan Wang

Testing the presence of mediation effects is important in social science research. Recently, Bayesian hypothesis testing with Bayes factors (BFs) has become increasingly popular. However, the use of BFs for testing mediation effects is still under-studied, despite the growing literature on Bayesian mediation analysis. In this study, we systematically examine the performance of the BF for testing the presence versus absence of a mediation effect. Our results showed that the false and/or true positive rates of detecting mediation with the BF can be impacted by the prior specification, including the prior odds of the presence of each path (treatment-mediator path or mediator-outcome path) used in the design stage for data generation and in the analysis stage for calculating the BF of the mediation effect. Based on our examination, we developed an R function and a web application to determine sample sizes for testing mediation effects with the BF. Our study provides insights on the performance of the BF for testing mediation effects and adds to researchers' toolbox of sample size determination for mediation studies. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
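The role of the path-level prior odds can be illustrated with a minimal sketch. It assumes one common construction (not confirmed by the abstract): the two paths are treated as independent, so the probability of mediation is the product of the probabilities that each path is nonzero, and the mediation BF is the ratio of posterior to prior odds of mediation. The function names and default prior odds are hypothetical.

```python
def posterior_prob(bayes_factor: float, prior_odds: float) -> float:
    """Posterior probability that a path is nonzero, from its BF and prior odds."""
    post_odds = bayes_factor * prior_odds
    return post_odds / (1.0 + post_odds)


def mediation_bayes_factor(bf_a: float, bf_b: float,
                           prior_odds_a: float = 1.0,
                           prior_odds_b: float = 1.0) -> float:
    """BF for presence vs. absence of mediation (assumed construction).

    Assumes the treatment-mediator path (a) and the mediator-outcome
    path (b) are independent both a priori and a posteriori.
    """
    prior_a = prior_odds_a / (1.0 + prior_odds_a)
    prior_b = prior_odds_b / (1.0 + prior_odds_b)
    prior_med = prior_a * prior_b
    post_med = (posterior_prob(bf_a, prior_odds_a)
                * posterior_prob(bf_b, prior_odds_b))
    prior_odds_med = prior_med / (1.0 - prior_med)
    post_odds_med = post_med / (1.0 - post_med)
    return post_odds_med / prior_odds_med


# Path BFs of 10 each: compare two choices of path-level prior odds.
print(mediation_bayes_factor(10, 10, 1.0, 1.0))
print(mediation_bayes_factor(10, 10, 3.0, 3.0))
```

Under this construction the same pair of path BFs yields a different mediation BF when the path-level prior odds change, which is the kind of prior sensitivity the abstract describes.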

Citations: 0

Estimation of planned and unplanned missing individual scores in longitudinal designs using continuous-time state-space models.

IF 7 | Zone 1, Psychology | Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Psychological methods

Pub Date : 2024-05-16 DOI: 10.1037/met0000664

José Ángel Martínez-Huertas, Eduardo Estrada, Ricardo Olmos

Latent change score (LCS) models within a continuous-time state-space modeling framework provide a convenient statistical approach for analyzing developmental data. In this study, we evaluate the robustness of such an approach in the context of accelerated longitudinal designs (ALDs). ALDs are especially interesting because they imply a very high rate of planned data missingness. Additionally, most longitudinal studies present unexpected participant attrition leading to unplanned missing data. Therefore, in ALDs, both sources of data missingness are combined. Previous research has shown that ALDs for developmental research allow recovering the population generating process. However, it is unknown how participant attrition impacts the model estimates. We have three goals: (a) to evaluate the robustness of the group-level parameter estimates in scenarios with empirically plausible unplanned data missingness; (b) to evaluate the performance of Kalman scores (KS) imputations for individual data points that were expected but unobserved; and (c) to evaluate the performance of KS imputations for individual data points that were outside the age range observed for each case (i.e., to estimate the individual trajectories for the complete age range under study). In general, results showed a lack of bias in the simulated conditions. The variability of the estimates increased with smaller sample sizes and higher missingness severity. Similarly, we found very accurate estimates of individual scores for both planned and unplanned missing data points. These results are very important for applied practitioners in terms of forecasting and making individual-level decisions. R code is provided to facilitate its implementation by applied researchers. (PsycInfo Database Record (c) 2024 APA, all rights reserved).

Citations: 0

A deep learning method for comparing Bayesian hierarchical models.

IF 7 | Zone 1, Psychology | Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Psychological methods

Pub Date : 2024-05-06 DOI: 10.1037/met0000645

Lasse Elsemüller, Martin Schnuerch, Paul-Christian Bürkner, Stefan T Radev

Bayesian model comparison (BMC) offers a principled approach to assessing the relative merits of competing computational models and propagating uncertainty into model selection decisions. However, BMC is often intractable for the popular class of hierarchical models due to their high-dimensional nested parameter structure. To address this intractability, we propose a deep learning method for performing BMC on any set of hierarchical models which can be instantiated as probabilistic programs. Since our method enables amortized inference, it allows efficient re-estimation of posterior model probabilities and fast performance validation prior to any real-data application. In a series of extensive validation studies, we benchmark the performance of our method against the state-of-the-art bridge sampling method and demonstrate excellent amortized inference across all BMC settings. We then showcase our method by comparing four hierarchical evidence accumulation models that have previously been deemed intractable for BMC due to partly implicit likelihoods. Additionally, we demonstrate how transfer learning can be leveraged to enhance training efficiency. We provide reproducible code for all analyses and an open-source implementation of our method. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
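For orientation, the quantity that BMC targets can be written in a few lines. The sketch below is not the deep learning method from the abstract: it assumes that log marginal likelihoods are already available (for example from a bridge sampling run) and simply converts them into posterior model probabilities. The function name and the example values are hypothetical.

```python
import math
from typing import Optional, Sequence


def posterior_model_probs(log_marginal_likelihoods: Sequence[float],
                          prior_probs: Optional[Sequence[float]] = None) -> list:
    """Posterior model probabilities p(M_k | data) from log marginal likelihoods.

    Uses the log-sum-exp trick so that very negative log evidences
    do not underflow to zero.
    """
    k = len(log_marginal_likelihoods)
    if prior_probs is None:
        prior_probs = [1.0 / k] * k  # uniform prior over models (assumption)
    log_joint = [lml + math.log(p)
                 for lml, p in zip(log_marginal_likelihoods, prior_probs)]
    offset = max(log_joint)
    weights = [math.exp(v - offset) for v in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


# Two hypothetical models whose log evidences differ by 2 nats.
print(posterior_model_probs([-100.0, -102.0]))
```

An amortized approach, as described in the abstract, would train a network to output these probabilities directly from data, so that no marginal likelihood has to be computed for each new data set.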

Citations: 0

Are factor scores measurement invariant?

IF 7 | Zone 1, Psychology | Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Psychological methods

Pub Date : 2024-05-06 DOI: 10.1037/met0000658

Mark H C Lai, Winnie W-Y Tse

There has been increased interest in practical methods for integrative analysis of data from multiple studies or samples, and using factor scores to represent constructs has become a popular and practical alternative to latent variable models with all individual items. Although researchers are aware that scores representing the same construct should be on a similar metric across samples (namely, they should be measurement invariant) for integrative data analysis, the methodological literature is unclear whether factor scores would satisfy such a requirement. In this note, we show that even when researchers successfully calibrate the latent factors to the same metric across samples, factor scores, which are estimates of the latent factors but not the factors themselves, may not be measurement invariant. Specifically, we prove that factor scores computed based on the popular regression method are generally not measurement invariant. Surprisingly, such scores can be noninvariant even when the items are invariant. We also demonstrate that our conclusions generalize to similar shrinkage scores in item response models for discrete items, namely the expected a posteriori scores and the maximum a posteriori scores. Researchers should be cautious in directly using factor scores for cross-sample analyses, even when such scores are obtained from measurement models that account for noninvariance. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
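The shrinkage argument can be made concrete with a small sketch. It assumes a one-factor model with diagonal residual variances and uses the standard regression-method formula, under which the expected factor score for a person with true factor value eta is alpha + k(eta - alpha), where alpha is the group's latent mean and k is a shrinkage weight between 0 and 1. The loadings, residual variances, and group means are made-up illustration values, and the function names are hypothetical.

```python
from typing import Sequence


def shrinkage_weight(loadings: Sequence[float],
                     residual_vars: Sequence[float],
                     factor_var: float) -> float:
    """k = psi * lambda' Sigma^{-1} lambda for a one-factor model.

    With diagonal residual variances this reduces (Sherman-Morrison)
    to psi * s / (1 + psi * s), where s = sum(lambda_j^2 / theta_j).
    """
    s = sum(l * l / t for l, t in zip(loadings, residual_vars))
    return factor_var * s / (1.0 + factor_var * s)


def expected_regression_score(eta: float,
                              loadings: Sequence[float],
                              residual_vars: Sequence[float],
                              factor_mean: float,
                              factor_var: float) -> float:
    """Expected regression factor score given the true factor value eta."""
    k = shrinkage_weight(loadings, residual_vars, factor_var)
    return factor_mean + k * (eta - factor_mean)


# Identical (invariant) item parameters in both groups; only the latent mean differs.
LOADINGS = [0.8, 0.7, 0.6]
RESIDUAL_VARS = [0.36, 0.51, 0.64]

score_a = expected_regression_score(1.0, LOADINGS, RESIDUAL_VARS, 0.0, 1.0)
score_b = expected_regression_score(1.0, LOADINGS, RESIDUAL_VARS, 0.5, 1.0)
print(score_a, score_b)
```

Two people with the same true factor value receive different expected scores because each score is pulled toward its own group's mean, even though every item parameter is identical across groups.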

Citations: 0

Reconsideration of the type I error rate for psychological science in the era of replication.

IF 7 | Zone 1, Psychology | Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Psychological methods

Pub Date : 2024-04-01 Epub Date: 2022-04-11 DOI: 10.1037/met0000490

Michael T Carlin, Mack S Costello, Madisyn A Flansburg, Alyssa Darden

Careful consideration of the tradeoff between Type I and Type II error rates when designing experiments is critical for maximizing statistical decision accuracy. Typically, Type I error rates (e.g., .05) are significantly lower than Type II error rates (e.g., .20 for .80 power) in psychological science. Further, positive findings (true effects and Type I errors) are more likely to be the focus of replication. This conventional approach leads to very high rates of Type II error. Analyses show that increasing the Type I error rate to .10, thereby increasing power and decreasing the Type II error rate for each test, leads to higher overall rates of correct statistical decisions. This increase of Type I error rate is consistent with, and most beneficial in the context of, the replication and "New Statistics" movements in psychology. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
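The tradeoff can be sketched numerically for a single test. This simplified illustration ignores replication, which the abstract treats as central, and assumes a two-sided z test, an effect sized to give .80 power at alpha = .05, and a user-chosen base rate of true effects. None of these settings are taken from the article, and the function names are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def two_sided_power(alpha: float, noncentrality: float) -> float:
    """Power of a two-sided z test for a given noncentrality parameter."""
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    upper = 1.0 - _STD_NORMAL.cdf(z_crit - noncentrality)
    lower = _STD_NORMAL.cdf(-z_crit - noncentrality)
    return upper + lower


def correct_decision_rate(alpha: float, noncentrality: float,
                          base_rate: float) -> float:
    """Expected share of correct decisions.

    base_rate is the assumed proportion of tested effects that are real.
    True effects are decided correctly with probability `power`,
    null effects with probability 1 - alpha.
    """
    power = two_sided_power(alpha, noncentrality)
    return base_rate * power + (1.0 - base_rate) * (1.0 - alpha)


# Noncentrality that gives .80 power at alpha = .05 (assumption for illustration).
DELTA = _STD_NORMAL.inv_cdf(0.975) + _STD_NORMAL.inv_cdf(0.80)

for alpha in (0.05, 0.10):
    print(alpha,
          round(two_sided_power(alpha, DELTA), 3),
          round(correct_decision_rate(alpha, DELTA, base_rate=0.5), 3))
```

With half of the tested effects being real, the looser threshold gives the higher correct-decision rate in this sketch. With a low base rate such as .2 the ordering reverses, so the conclusion depends on the assumed base rate.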

Citations: 0

Efficient alternatives for Bayesian hypothesis tests in psychology.

IF 7 | Zone 1, Psychology | Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Psychological methods

Pub Date : 2024-04-01 Epub Date: 2022-04-14 DOI: 10.1037/met0000482

Sandipan Pramanik, Valen E Johnson

Bayesian hypothesis testing procedures have gained increased acceptance in recent years. A key advantage that Bayesian tests have over classical testing procedures is their potential to quantify information in support of true null hypotheses. Ironically, default implementations of Bayesian tests prevent the accumulation of strong evidence in favor of true null hypotheses because associated default alternative hypotheses assign a high probability to data that are most consistent with a null effect. We propose the use of "nonlocal" alternative hypotheses to resolve this paradox. The resulting class of Bayesian hypothesis tests permits more rapid accumulation of evidence in favor of both true null hypotheses and alternative hypotheses that are compatible with standardized effect sizes of most interest in psychology. (PsycInfo Database Record (c) 2024 APA, all rights reserved).

Citations: 0

Assessing measurement invariance with moderated nonlinear factor analysis using the R package OpenMx.

IF 7 | Zone 1, Psychology | Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Psychological methods

Pub Date : 2024-04-01 Epub Date: 2022-07-04 DOI: 10.1037/met0000501

Laura Kolbe, Dylan Molenaar, Suzanne Jak, Terrence D Jorgensen

Assessing measurement invariance is an important step in establishing a meaningful comparison of measurements of a latent construct across individuals or groups. Most recently, moderated nonlinear factor analysis (MNLFA) has been proposed as a method to assess measurement invariance. In MNLFA models, measurement invariance is examined in a single-group confirmatory factor analysis model by means of parameter moderation. The advantages of MNLFA over other methods are that it (a) accommodates the assessment of measurement invariance across multiple continuous and categorical background variables and (b) accounts for heteroskedasticity by allowing the factor and residual variances to differ as a function of the background variables. In this article, we aim to make MNLFA more accessible to researchers without access to commercial structural equation modeling software by demonstrating how this method can be applied with the open-source R package OpenMx. (PsycInfo Database Record (c) 2024 APA, all rights reserved).

Citations: 0

Meta-analysis of correlation coefficients: A cautionary tale on treating measurement error.

IF 7 | Zone 1, Psychology | Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Psychological methods

Pub Date : 2024-04-01 Epub Date: 2022-05-23 DOI: 10.1037/met0000498

Qian Zhang

A scale to measure a psychological construct is subject to measurement error. When meta-analyzing correlations obtained from scale scores, many researchers recommend correcting for measurement error. I considered three caveats when correcting for measurement error in meta-analysis of correlations: (a) the distribution of true scores can be non-normal, resulting in violation of the normality assumption for raw correlations and Fisher's z transformed correlations; (b) coefficient alpha is often used as the reliability, but correlations corrected for measurement error using alpha can be inaccurate when some assumptions of alpha (e.g., tau-equivalence) are violated; and (c) item scores are often ordinal, making the disattenuation formula potentially problematic. Via three simulation studies, I examined the performance of two meta-analysis approaches (with raw correlations and with z scores). In terms of estimation accuracy and coverage probability of the mean correlation, results showed that (a) considering the true-score distribution alone, estimation of the mean correlation was slightly worse when true scores of the constructs were skewed rather than normal; (b) when the tau-equivalence assumption was violated and coefficient alpha was used for correcting measurement error, the mean correlation estimates can be biased and coverage probabilities can be low; and (c) discretization of continuous items can result in biased estimates and undercoverage of the mean correlations even when tau-equivalence was satisfied. With more categories and/or items on a scale, results can improve whether tau-equivalence was met or not. Based on these findings, I gave recommendations for conducting meta-analyses of correlations. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
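The two ingredients named in the abstract, the disattenuation formula and Fisher's z transformation, are short enough to sketch. The code below uses the classical correction r_xy / sqrt(r_xx * r_yy) and the usual large-sample standard error 1 / sqrt(n - 3) for z. The function names and input values are hypothetical, and the sketch makes no attempt to address the caveats the abstract raises.

```python
import math


def disattenuate(r_xy: float, rel_x: float, rel_y: float) -> float:
    """Correct an observed correlation for unreliability in both measures."""
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


def fisher_z(r: float) -> float:
    """Fisher's z transformation of a correlation (requires |r| < 1)."""
    return math.atanh(r)


def fisher_z_se(n: int) -> float:
    """Large-sample standard error of Fisher's z for sample size n > 3."""
    return 1.0 / math.sqrt(n - 3)


# Observed r = .30 with reliabilities of .80 for both scales (made-up values).
corrected = disattenuate(0.30, 0.80, 0.80)
print(corrected, fisher_z(corrected), fisher_z_se(103))
```

If coefficient alpha understates the reliability, as can happen when tau-equivalence is violated, the denominator is too small and the corrected correlation is too large, which is one way to read caveat (b).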

Pages: 308-330

Citations: 0

Harnessing the power of excess statistical significance: Weighted and iterative least squares.

IF 7 | Zone 1, Psychology | Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Psychological methods

Pub Date : 2024-04-01 Epub Date: 2022-05-12 DOI: 10.1037/met0000502

T D Stanley, Hristos Doucouliagos

We introduce a new meta-analysis estimator, the weighted and iterated least squares (WILS), that greatly reduces publication selection bias (PSB) when selective reporting for statistical significance (SSS) is present. WILS is a simple weighted average that has smaller bias and lower rates of false positives than conventional meta-analysis estimators, the unrestricted weighted least squares (UWLS), and the weighted average of the adequately powered (WAAP) when there is SSS. As a simple weighted average, it is not vulnerable to the violations of assumptions that publication bias correction models too often show in application. WILS is based on the novel idea of allowing excess statistical significance (ESS), which is a necessary condition of SSS, to identify when and how to reduce PSB. We show in comparisons with large-scale preregistered replications and in evidence-based simulations that the remaining bias is small. The routine application of WILS in the place of random effects would do much to reduce conventional meta-analysis's notable biases and high rates of false positives. (PsycInfo Database Record (c) 2024 APA, all rights reserved).

Citations: 0

Beta-binomial meta-analysis of individual differences based on sample means and standard deviations: Studying reliability of sum scores of binary items.

IF 7 | Zone 1, Psychology | Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Psychological methods

Pub Date : 2024-03-14 DOI: 10.1037/met0000649

Philipp Doebler, Susanne Frick, Anna Doebler

Individual differences are studied with a multitude of test instruments. Meta-analysis of tests is useful to understand whether individual differences in certain populations can be detected with the help of a class of tests. A method for the quantitative meta-analytical evaluation of test instruments with dichotomous items is introduced. The method assumes beta-binomially distributed test scores, an assumption that has been demonstrated to be plausible in many settings. With this assumption, the method only requires sample means and standard deviations of sum scores (or equivalently means and standard deviations of percent-correct scores), in contrast to methods that use estimates of reliability for a similar purpose. Two parameters are estimated for each sample: mean difficulty and an overdispersion parameter which can be interpreted as the test's ability to detect individual differences. The proposed bivariate meta-analytical approach (random or fixed effects) pools the two parameters simultaneously and allows meta-regression to be performed. The bivariate pooling yields a between-sample correlation of mean difficulty parameters and overdispersion parameters. As a side product, reliability estimates are obtained which can be employed to disattenuate correlation coefficients for insufficient reliability when no other estimates are available. A worked example illustrates the method, and R code is provided. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
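To show why a mean and a standard deviation suffice under a beta-binomial assumption, here is a method-of-moments sketch. It is not the authors' estimator, and the parameterization is an assumption: `pi` is the mean proportion correct and `rho` the intraclass-type overdispersion 1 / (a + b + 1). Under the beta-binomial model the sum-score reliability then equals KR-21, which the second function computes directly. All names and example numbers are hypothetical.

```python
from typing import Tuple


def beta_binomial_moments(n_items: int, mean_sum: float,
                          sd_sum: float) -> Tuple[float, float]:
    """Method-of-moments estimates (pi, rho) from sum-score mean and SD.

    Uses Var(X) = n * pi * (1 - pi) * (1 + (n - 1) * rho).
    """
    if n_items < 2:
        raise ValueError("need at least two items")
    pi = mean_sum / n_items
    if not 0.0 < pi < 1.0:
        raise ValueError("mean must lie strictly between 0 and n_items")
    binomial_var = n_items * pi * (1.0 - pi)
    rho = (sd_sum ** 2 / binomial_var - 1.0) / (n_items - 1)
    return pi, rho


def kr21(n_items: int, mean_sum: float, sd_sum: float) -> float:
    """Kuder-Richardson formula 21 reliability of the sum score."""
    return (n_items / (n_items - 1)
            * (1.0 - mean_sum * (n_items - mean_sum) / (n_items * sd_sum ** 2)))


# A hypothetical 20-item test with sum-score mean 12 and SD 4.
pi_hat, rho_hat = beta_binomial_moments(20, 12.0, 4.0)
reliability = 20 * rho_hat / (1.0 + 19 * rho_hat)  # Spearman-Brown form in rho
print(pi_hat, rho_hat, reliability, kr21(20, 12.0, 4.0))
```

The last two printed values agree, which illustrates how a reliability estimate falls out of the overdispersion parameter as a side product.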

Citations: 0