Random item slope regression: An alternative measurement model that accounts for both similarities and differences in association with individual items.
Ed Donnellan, Satoshi Usami, Kou Murayama. Psychological Methods. doi:10.1037/met0000587. Published 2023-07-27.

In psychology, researchers often predict a dependent variable (DV) consisting of multiple measurements (e.g., scale items measuring a concept). To analyze the data, researchers typically aggregate (sum/average) scores across items and use this as the DV. Alternatively, they may define the DV as a common factor using structural equation modeling. However, both approaches neglect the possibility that an independent variable (IV) may relate differently to individual items. This variance in individual item slopes arises because items are randomly sampled from an infinite pool of items reflecting the construct that the scale purports to measure. Here, we offer a mixed-effects model called random item slope regression, which accounts for both similarities and differences in individual item associations. Critically, we argue that random item slope regression poses an alternative measurement model to the common factor models prevalent in psychology. Unlike those models, the proposed model supposes no latent constructs and instead assumes that individual items have direct causal relationships with the IV. Such an operationalization is especially useful when researchers want to assess a broad construct with heterogeneous items. Using mathematical proof and simulation, we demonstrate that unmodeled random item slopes inflate Type I error, particularly when the sample size (number of participants) is large. In real-world data (n = 564 participants) using commonly used surveys and two reaction time tasks, we demonstrate that random item slopes are present at problematic levels. We further demonstrate that common statistical indices are not sufficient to diagnose the presence of random item slopes.
Correspondence measures for assessing replication success.
Peter M Steiner, Patrick Sheehan, Vivian C Wong. Psychological Methods. doi:10.1037/met0000597. Published 2023-07-27.

Given recent evidence challenging the replicability of results in the social and behavioral sciences, critical questions have been raised about appropriate measures for determining replication success in comparing effect estimates across studies. At issue is the fact that conclusions about replication success often depend on the measure used for evaluating correspondence in results. Despite the importance of choosing an appropriate measure, there is still no widespread agreement about which measures should be used. This article addresses these questions by describing formally the most commonly used measures for assessing replication success, and by comparing their performance in different contexts according to their replication probabilities, that is, the probability of obtaining replication success given study-specific settings. The measures may be characterized broadly as conclusion-based approaches, which assess the congruence of two independent studies' conclusions about the presence of an effect, and distance-based approaches, which test for a significant difference or equivalence of two effect estimates. We also introduce a new measure for assessing replication success called the correspondence test, which combines a difference and equivalence test in the same framework. To help researchers plan prospective replication efforts, we provide closed formulas for power calculations that can be used to determine the minimum detectable effect size (and thus, sample sizes) for each study so that a predetermined minimum replication probability can be achieved. Finally, we use a replication data set from the Open Science Collaboration (2015) to demonstrate the extent to which conclusions about replication success depend on the correspondence measure selected.
On estimating the frequency of a target behavior from time-constrained yes/no survey questions: A parametric approach based on the Poisson process.
Benedikt Iberl, Rolf Ulrich. Psychological Methods. doi:10.1037/met0000588. Published 2023-07-20.

We propose a novel method to analyze time-constrained yes/no questions about a target behavior (e.g., "Did you take sleeping pills during the last 12 months?"). A drawback of these questions is that the relative frequency of answering these questions with "yes" does not allow one to draw definite conclusions about the frequency of the target behavior (i.e., how often sleeping pills were taken) nor about the prevalence of trait carriers (i.e., percentage of people that take sleeping pills). Here we show how this information can be extracted from the results of such questions employing a prevalence curve and a Poisson model. The applicability of the method was evaluated with a survey on everyday behavior, which revealed plausible results and reasonable model fit.
Enhancing predictive power by unamalgamating multi-item scales.
David Trafimow, Michael R Hyman, Alena Kostyk. Psychological Methods. doi:10.1037/met0000599. Published 2023-07-20.

Correlation coefficients in the social sciences are generally small, yet touted as "statistically significant," which jeopardizes theory testing and prediction. To investigate the underlying causes of these small coefficients, traditional equations are considered: Spearman's (1904) classic attenuation formula, Cronbach's (1951) alpha, and Guilford and Fruchter's (1973) equation for the effect of additional items on a scale's predictive power. These equations' implications differ as to whether large interitem correlations enhance or diminish predictive power. Contrary to conventional practice, such correlations decrease predictive power when items are treated as components of a multi-item scale but can increase predictive power when items are treated separately. The implications are wide-ranging.
Demystifying omega squared: Practical guidance for effect size in common analysis of variance designs.
Antoinette D A Kroes, Jason R Finley. Psychological Methods. doi:10.1037/met0000581. Published 2023-07-20.

Omega squared (ω²) is a measure of effect size for analysis of variance (ANOVA) designs. It is less biased than eta squared, but reported less often. This is in part due to lack of clear guidance on how to calculate it. In this paper, we discuss the logic behind effect size measures, the problem with eta squared, the history of omega squared, and why it has been underused. We then provide a user-friendly guide to omega squared and partial omega squared for ANOVA designs with fixed factors, including one-way, two-way, and three-way designs, using within-subjects factors and/or between-subjects factors. We show how to calculate omega squared using output from SPSS. We provide information on the calculation of confidence intervals. We examine the problems of nonadditivity, and intrinsic versus extrinsic factors. We argue that statistical package developers could play an important role in making the calculation of omega squared easier. Finally, we recommend that researchers report the formulas used in calculating effect sizes, include confidence intervals if possible, and include ANOVA tables in the online supplemental materials of their work.
The receiver operating characteristic area under the curve (or mean ridit) as an effect size.
Michael Smithson. Psychological Methods. doi:10.1037/met0000601. Published 2023-07-13.

Several authors have recommended adopting the receiver operating characteristic (ROC) area under the curve (AUC), or mean ridit, as an effect size, arguing that it measures an important and interpretable type of effect that conventional effect-size measures do not. It is base-rate insensitive, robust to outliers, and invariant under order-preserving transformations. However, applications have been limited to group comparisons, and usually just two groups, in line with the popular interpretation of the AUC as measuring the probability that a randomly chosen case from one group will score higher on the dependent variable than a randomly chosen case from another group. This tutorial article shows that the AUC can be used as an effect size for both categorical and continuous predictors in a wide variety of general linear models, whose dependent variables may be ordinal, interval, or ratio level. Thus, the AUC is a general effect-size measure. Demonstrations in this article include linear regression, ordinal logistic regression, gamma regression, and beta regression. The online supplemental materials to this tutorial provide a survey of currently available software resources in R for the AUC and ridits, along with the code and access to the data used in the examples.
{"title":"Supplemental Material for Correspondence Measures for Assessing Replication Success","authors":"","doi":"10.1037/met0000597.supp","DOIUrl":"https://doi.org/10.1037/met0000597.supp","url":null,"abstract":"","PeriodicalId":20782,"journal":{"name":"Psychological methods","volume":" ","pages":""},"PeriodicalIF":7.0,"publicationDate":"2023-07-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45980999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Bayesian regularization in multiple-indicators multiple-causes models.
Lijin Zhang, Xinya Liang. Psychological Methods. doi:10.1037/met0000594.supp. Published 2023-07-10.

Integrating regularization methods into structural equation modeling is gaining increasing popularity. The purpose of regularization is to improve variable selection, model estimation, and prediction accuracy. In this study, we aim to (a) compare Bayesian regularization methods for exploring covariate effects in multiple-indicators multiple-causes models, (b) examine the sensitivity of results to hyperparameter settings of penalty priors, and (c) investigate prediction accuracy through cross-validation. The Bayesian regularization methods examined included ridge, lasso, adaptive lasso, the spike-and-slab prior (SSP) and its variants, and the horseshoe and its variants. Sparse solutions were developed for the structural coefficient matrix, which contained only a small portion of nonzero path coefficients characterizing the effects of selected covariates on the latent variable. Results from the simulation study showed that, compared to diffuse priors, penalty priors were advantageous in handling small sample sizes and collinearity among covariates. Priors with only the global penalty (ridge and lasso) yielded higher model convergence rates and power, whereas priors with both global and local penalties (horseshoe and SSP) provided more accurate parameter estimates for medium and large covariate effects. The horseshoe and SSP improved accuracy in predicting factor scores, while achieving more parsimonious models.