Unique Variable Analysis: A Network Psychometrics Method to Detect Local Dependence
Pub Date: 2023-11-01 | Epub Date: 2023-05-04 | DOI: 10.1080/00273171.2023.2194606
Alexander P Christensen, Luis Eduardo Garrido, Hudson Golino
The local independence assumption states that variables are unrelated after conditioning on a latent variable. Common problems that arise from violations of this assumption include model misspecification, biased model parameters, and inaccurate estimates of internal structure. These problems are not limited to latent variable models but also apply to network psychometrics. This paper proposes a novel network psychometric approach to detect locally dependent pairs of variables using network modeling and a graph theory measure called weighted topological overlap (wTO). Using simulation, this approach is compared to contemporary local dependence detection methods such as exploratory structural equation modeling with standardized expected parameter change and a recently developed approach using partial correlations and a resampling procedure. Different approaches to determining local dependence using statistical significance and cutoff values are also compared. Continuous, polytomous (5-point Likert scale), and dichotomous (binary) data were generated with skew across a variety of conditions. Our results indicate that cutoff values work better than significance approaches. Overall, the network psychometric approaches pairing wTO with the graphical least absolute shrinkage and selection operator with extended Bayesian information criterion and wTO with the Bayesian Gaussian graphical model were the best-performing local dependence detection methods.
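For readers unfamiliar with wTO, the sketch below computes it in base R from a sample partial correlation matrix; the simulated items, the dense (unregularized) network, and the .25 flagging threshold are illustrative assumptions, not the GLASSO/BGGM estimation or the cutoffs evaluated in the paper.

```r
# Weighted topological overlap (wTO) from a partial correlation network,
# base R only. The dense sample network and the .25 flagging threshold are
# illustrative assumptions (the paper uses GLASSO/BGGM-estimated networks).
wto <- function(A) {
  A <- abs(A)
  diag(A) <- 0
  k   <- rowSums(A)                       # node strength
  num <- A %*% A + A                      # shared neighborhood + direct link
  den <- outer(k, k, pmin) + 1 - A
  W <- num / den
  diag(W) <- 0
  W
}

set.seed(1)
X <- matrix(rnorm(500 * 6), 500, 6)
X[, 6] <- X[, 5] + rnorm(500, sd = 0.3)   # items 5 and 6 are redundant
P <- -cov2cor(solve(cov(X)))              # sample partial correlations
W <- wto(P)
which(W > 0.25 & upper.tri(W), arr.ind = TRUE)  # flag candidate redundant pairs
```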
{"title":"Unique Variable Analysis: A Network Psychometrics Method to Detect Local Dependence.","authors":"Alexander P Christensen, Luis Eduardo Garrido, Hudson Golino","doi":"10.1080/00273171.2023.2194606","DOIUrl":"10.1080/00273171.2023.2194606","url":null,"abstract":"<p><p>The local independence assumption states that variables are unrelated after conditioning on a latent variable. Common problems that arise from violations of this assumption include model misspecification, biased model parameters, and inaccurate estimates of internal structure. These problems are not limited to latent variable models but also apply to network psychometrics. This paper proposes a novel network psychometric approach to detect locally dependent pairs of variables using network modeling and a graph theory measure called weighted topological overlap (wTO). Using simulation, this approach is compared to contemporary local dependence detection methods such as exploratory structural equation modeling with standardized expected parameter change and a recently developed approach using partial correlations and a resampling procedure. Different approaches to determine local dependence using statistical significance and cutoff values are also compared. Continuous, polytomous (5-point Likert scale), and dichotomous (binary) data were generated with skew across a variety of conditions. Our results indicate that cutoff values work better than significance approaches. Overall, the network psychometrics approaches using wTO with graphical least absolute shrinkage and selector operator with extended Bayesian information criterion and wTO with Bayesian Gaussian graphical model were the best performing local dependence detection methods overall.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9406819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Reorienting Latent Variable Modeling for Supervised Learning
Pub Date: 2023-11-01 | Epub Date: 2023-05-25 | DOI: 10.1080/00273171.2023.2182753
Booil Jo, Trevor J Hastie, Zetan Li, Eric A Youngstrom, Robert L Findling, Sarah McCue Horwitz
Despite its potential benefits, using prediction targets generated from latent variable (LV) modeling is not common practice in supervised learning, the dominant framework for developing prediction models. In supervised learning, it is typically assumed that the outcome to be predicted is clear and readily available, so validating outcomes before predicting them is a foreign concept and an unnecessary step. The usual goal of LV modeling is inference, and therefore using it in supervised learning and in the prediction context requires a major conceptual shift. This study lays out the methodological adjustments and conceptual shifts necessary for integrating LV modeling into supervised learning. It is shown that such integration is possible by combining the traditions of LV modeling, psychometrics, and supervised learning. In this interdisciplinary learning framework, generating practical outcomes using LV modeling and systematically validating them against clinical validators are the two main strategies. In an example using data from the Longitudinal Assessment of Manic Symptoms (LAMS) Study, a large pool of candidate outcomes is generated by flexible LV modeling. It is demonstrated that this exploratory situation can be used as an opportunity to tailor desirable prediction targets by taking advantage of contemporary science and clinical insights.
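As a concrete (and deliberately simplified) illustration of using an LV-derived quantity as a prediction target, the base-R sketch below fits a one-factor model to simulated items, extracts factor scores, and treats them as the outcome of a supervised prediction model; the data, predictors, and model are assumptions for illustration and have nothing to do with the LAMS analysis.

```r
# A toy pipeline: latent-variable-based prediction target, then supervised
# learning. Simulated data; not the LAMS analysis.
set.seed(1)
n   <- 400
eta <- rnorm(n)                                        # latent outcome construct
items <- sapply(1:6, function(j) 0.7 * eta + rnorm(n, sd = 0.5))
x1 <- 0.5 * eta + rnorm(n)                             # baseline predictors
x2 <- rnorm(n)

fa  <- factanal(items, factors = 1, scores = "regression")
dat <- data.frame(y = as.numeric(fa$scores), x1, x2)   # LV score as target

train <- sample(n, 300)
fit   <- lm(y ~ x1 + x2, data = dat[train, ])
pred  <- predict(fit, newdata = dat[-train, ])
cor(pred, dat$y[-train])^2                             # out-of-sample R-squared
```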
{"title":"Reorienting Latent Variable Modeling for Supervised Learning.","authors":"Booil Jo, Trevor J Hastie, Zetan Li, Eric A Youngstrom, Robert L Findling, Sarah McCue Horwitz","doi":"10.1080/00273171.2023.2182753","DOIUrl":"10.1080/00273171.2023.2182753","url":null,"abstract":"<p><p>Despite its potentials benefits, using prediction targets generated based on latent variable (LV) modeling is not a common practice in supervised learning, a dominating framework for developing prediction models. In supervised learning, it is typically assumed that the outcome to be predicted is clear and readily available, and therefore validating outcomes before predicting them is a foreign concept and an unnecessary step. The usual goal of LV modeling is inference, and therefore using it in supervised learning and in the prediction context requires a major conceptual shift. This study lays out methodological adjustments and conceptual shifts necessary for integrating LV modeling into supervised learning. It is shown that such integration is possible by combining the traditions of LV modeling, psychometrics, and supervised learning. In this interdisciplinary learning framework, generating practical outcomes using LV modeling and systematically validating them based on clinical validators are the two main strategies. In the example using the data from the Longitudinal Assessment of Manic Symptoms (LAMS) Study, a large pool of candidate outcomes is generated by flexible LV modeling. It is demonstrated that this exploratory situation can be used as an opportunity to tailor desirable prediction targets taking advantage of contemporary science and clinical insights.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":null,"pages":null},"PeriodicalIF":5.3,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10674034/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9524122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
betaDelta and betaSandwich: Confidence Intervals for Standardized Regression Coefficients in R
Pub Date: 2023-11-01 | Epub Date: 2023-04-25 | DOI: 10.1080/00273171.2023.2201277
Ivan Jacob Agaloos Pesigan, Rong Wei Sun, Shu Fai Cheung
Yuan and Chan used the multivariate delta method to estimate standard errors and confidence intervals for standardized regression coefficients. Jones and Waller extended this work to nonnormal data by utilizing Browne's asymptotic distribution-free (ADF) theory. Dudgeon further developed standard errors and confidence intervals based on heteroskedasticity-consistent (HC) estimators, which are robust to nonnormality and perform better in smaller samples than Jones and Waller's ADF technique. Despite these advancements, empirical research has been slow to adopt these methods, partly because of the dearth of user-friendly software implementing them. In this manuscript, we present the betaDelta and betaSandwich packages for the R statistical software environment. The betaDelta package implements both the normal-theory approach of Yuan and Chan and the ADF approach of Jones and Waller, and the betaSandwich package implements the HC approach proposed by Dudgeon. The use of the packages is demonstrated with an empirical example. We believe the packages will enable applied researchers to accurately assess the sampling variability of standardized regression coefficients.
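The packages implement delta-method, ADF, and HC interval estimators; the base-R sketch below is not that implementation, only a naive baseline that refits the model on standardized variables and bootstraps the standardized slopes, which readers can compare against the packages' output.

```r
# Naive baseline, not the packages' delta-method/HC machinery: standardized
# slopes from a refit on scaled variables plus a percentile bootstrap CI.
set.seed(1)
n  <- 200
x1 <- rnorm(n); x2 <- 0.4 * x1 + rnorm(n)
y  <- 0.3 * x1 + 0.2 * x2 + rnorm(n)
dat <- data.frame(y, x1, x2)

std_beta <- function(d) coef(lm(scale(y) ~ scale(x1) + scale(x2), data = d))[-1]
est  <- std_beta(dat)
boot <- t(replicate(2000, std_beta(dat[sample(n, replace = TRUE), ])))
cbind(estimate = est,
      lower = apply(boot, 2, quantile, 0.025),
      upper = apply(boot, 2, quantile, 0.975))
```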
{"title":"betaDelta and betaSandwich: Confidence Intervals for Standardized Regression Coefficients in R.","authors":"Ivan Jacob Agaloos Pesigan, Rong Wei Sun, Shu Fai Cheung","doi":"10.1080/00273171.2023.2201277","DOIUrl":"10.1080/00273171.2023.2201277","url":null,"abstract":"<p><p>The multivariate delta method was used by Yuan and Chan to estimate standard errors and confidence intervals for standardized regression coefficients. Jones and Waller extended the earlier work to situations where data are nonnormal by utilizing Browne's asymptotic distribution-free (ADF) theory. Furthermore, Dudgeon developed standard errors and confidence intervals, employing heteroskedasticity-consistent (HC) estimators, that are robust to nonnormality with better performance in smaller sample sizes compared to Jones and Waller's ADF technique. Despite these advancements, empirical research has been slow to adopt these methodologies. This can be a result of the dearth of user-friendly software programs to put these techniques to use. We present the betaDelta and the betaSandwich packages in the R statistical software environment in this manuscript. Both the normal-theory approach and the ADF approach put forth by Yuan and Chan and Jones and Waller are implemented by the betaDelta package. The HC approach proposed by Dudgeon is implemented by the betaSandwich package. The use of the packages is demonstrated with an empirical example. We think the packages will enable applied researchers to accurately assess the sampling variability of standardized regression coefficients.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9986115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the Common but Problematic Specification of Conflated Random Slopes in Multilevel Models
Pub Date: 2023-11-01 | Epub Date: 2023-04-10 | DOI: 10.1080/00273171.2023.2174490
Jason D Rights, Sonya K Sterba
For multilevel models (MLMs) with fixed slopes, it has been widely recognized that a level-1 variable can have distinct between-cluster and within-cluster fixed effects, and that failing to disaggregate these effects yields a conflated, uninterpretable fixed effect. For MLMs with random slopes, however, we clarify that two different types of slope conflation can occur: that of the fixed component (termed fixed conflation) and that of the random component (termed random conflation). The latter is rarely recognized and not well understood. Here we explain that a model commonly used to disaggregate the fixed component, the contextual effect model with random slopes, troublingly still yields a conflated random component. Negative consequences of such random conflation have not previously been demonstrated. Here we show that they include erroneous interpretation and inferences about the substantively important extent of between-cluster differences in slopes, including either underestimating or overestimating such slope heterogeneity. Furthermore, we show that this random conflation can yield inappropriate standard errors for fixed effects. To aid researchers in practice, we delineate which types of random slope specifications yield an unconflated random component. We demonstrate the advantages of these unconflated models in terms of estimating and testing random slope variance (i.e., improved power, Type I error, and bias) and in terms of standard error estimation for fixed effects (i.e., more accurate standard errors), and make recommendations for which specifications to use for particular research purposes.
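A minimal lme4 sketch of the two specifications at issue is given below, using simulated two-level data; the variable names, effect sizes, and the use of lmer are illustrative assumptions rather than the paper's analysis.

```r
# Conflated vs. unconflated random-slope specifications in lme4 on simulated
# two-level data. Variable names and effect sizes are illustrative.
library(lme4)
set.seed(1)
J <- 60; nj <- 25
cluster <- rep(1:J, each = nj)
x_cm <- rnorm(J)[cluster]                         # cluster means of x
x_cw <- rnorm(J * nj)                             # within-cluster part of x
x    <- x_cm + x_cw
b1j  <- (0.3 + rnorm(J, sd = 0.2))[cluster]       # random within-cluster slopes
y    <- 0.5 * x_cm + b1j * x_cw + rnorm(J * nj)
dat  <- data.frame(y, x, x_cw, x_cm, cluster)

# Contextual effect model with a random slope on raw x: random part conflated
m_conflated   <- lmer(y ~ x + x_cm + (1 + x | cluster), data = dat)
# Random slope on the within-cluster-centered predictor: unconflated
m_unconflated <- lmer(y ~ x_cw + x_cm + (1 + x_cw | cluster), data = dat)
VarCorr(m_conflated)
VarCorr(m_unconflated)
```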
{"title":"On the Common but Problematic Specification of Conflated Random Slopes in Multilevel Models.","authors":"Jason D Rights, Sonya K Sterba","doi":"10.1080/00273171.2023.2174490","DOIUrl":"10.1080/00273171.2023.2174490","url":null,"abstract":"<p><p>For multilevel models (MLMs) with fixed slopes, it has been widely recognized that a level-1 variable can have distinct between-cluster and within-cluster fixed effects, and that failing to disaggregate these effects yields a conflated, uninterpretable fixed effect. For MLMs with random slopes, however, we clarify that two different types of slope conflation can occur: that of the fixed component (termed fixed conflation) and that of the random component (termed random conflation). The latter is rarely recognized and not well understood. Here we explain that a model commonly used to disaggregate the fixed component-the contextual effect model with random slopes-troublingly still yields a conflated random component. Negative consequences of such random conflation have not been demonstrated. Here we show that they include erroneous interpretation and inferences about the substantively important extent of between-cluster differences in slopes, including either underestimating or overestimating such slope heterogeneity. Furthermore, we show that this random conflation can yield inappropriate standard errors for fixed effects. To aid researchers in practice, we delineate which types of random slope specifications yield an unconflated random component. We demonstrate the advantages of these unconflated models in terms of estimating and testing random slope variance (i.e., improved power, Type I error, and bias) and in terms of standard error estimation for fixed effects (i.e., more accurate standard errors), and make recommendations for which specifications to use for particular research purposes.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9642680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pay Attention to the Ignorable Missing Data Mechanisms! An Exploration of Their Impact on the Efficiency of Regression Coefficients
Pub Date: 2023-11-01 | Epub Date: 2023-04-11 | DOI: 10.1080/00273171.2023.2193600
Lihan Chen, Victoria Savalei, Mijke Rhemtulla
The use of modern missing data techniques has become more prevalent with their increasing accessibility in statistical software. These techniques focus on handling data that are missing at random (MAR). Although all MAR mechanisms are routinely treated as the same, they are not equal. The impact of missing data on the efficiency of parameter estimates can differ across MAR variations, even when the amount of missing data is held constant; yet, in current practice, only the rate of missing data is reported. The impact of MAR on the loss of efficiency can instead be measured more directly by the fraction of missing information (FMI). In this article, we explore this impact using FMIs in regression models with one and two predictors. With the help of a Shiny application, we demonstrate that efficiency loss due to missing data can be highly complex and is not always intuitive. We recommend that substantive researchers who work with missing data report estimates of FMIs in addition to the rate of missingness. We also encourage methodologists to examine FMIs when designing simulation studies with missing data, and to explore the behavior of efficiency loss under MAR using FMIs in more complex models.
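The sketch below illustrates one way FMI estimates can be obtained alongside regression coefficients via multiple imputation; the data-generating model and MAR mechanism are illustrative assumptions, and it presumes a recent version of the mice package in which pool() reports an fmi column.

```r
# FMI estimates for regression coefficients via multiple imputation, assuming
# a recent version of mice in which pool() reports an `fmi` column.
library(mice)
set.seed(1)
n  <- 500
x1 <- rnorm(n); x2 <- 0.5 * x1 + rnorm(n)
y  <- 0.4 * x1 + 0.3 * x2 + rnorm(n)
dat <- data.frame(y, x1, x2)
dat$x2[runif(n) < plogis(-2 + 1.5 * x1)] <- NA    # x2 is MAR given x1

imp  <- mice(dat, m = 20, printFlag = FALSE)
fits <- with(imp, lm(y ~ x1 + x2))
pool(fits)$pooled[, c("term", "estimate", "fmi")]
```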
{"title":"Pay Attention to the Ignorable Missing Data Mechanisms! An Exploration of Their Impact on the Efficiency of Regression Coefficients.","authors":"Lihan Chen, Victoria Savalei, Mijke Rhemtulla","doi":"10.1080/00273171.2023.2193600","DOIUrl":"10.1080/00273171.2023.2193600","url":null,"abstract":"<p><p>The use of modern missing data techniques has become more prevalent with their increasing accessibility in statistical software. These techniques focus on handling data that are <i>missing at random</i> (MAR). Although all MAR mechanisms are routinely treated as the same, they are not equal. The impact of missing data on the efficiency of parameter estimates can differ for different MAR variations, even when the amount of missing data is held constant; yet, in current practice, only the rate of missing data is reported. The impact of MAR on the loss of efficiency can instead be more directly measured by the <i>fraction of missing information</i> (FMI). In this article, we explore this impact using FMIs in regression models with one and two predictors. With the help of a <i>Shiny</i> application, we demonstrate that efficiency loss due to missing data can be highly complex and is not always intuitive. We recommend substantive researchers who work with missing data report estimates of FMIs in addition to the rate of missingness. We also encourage methodologists to examine FMIs when designing simulation studies with missing data, and to explore the behavior of efficiency loss under MAR using FMIs in more complex models.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9273348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Exploratory Bi-factor Analysis with Multiple General Factors
Pub Date: 2023-11-01 | Epub Date: 2023-04-10 | DOI: 10.1080/00273171.2023.2189571
Marcos Jiménez, Francisco J Abad, Eduardo Garcia-Garzon, Luis Eduardo Garrido
Exploratory bi-factor analysis (EBFA) is a popular approach for estimating models in which specific factors are concomitant to a single, general dimension. However, the models typically encountered in fields like personality, intelligence, and psychopathology involve more than one general factor. To address this circumstance, we developed an algorithm (GSLiD) based on partially specified targets to perform exploratory bi-factor analysis with multiple general factors (EBFA-MGF). In EBFA-MGF, researchers no longer need to conduct independent bi-factor analyses because several bi-factor models are estimated simultaneously in an exploratory manner, guarding against biased estimates and model misspecification errors due to unexpected cross-loadings and factor correlations. The results from an exhaustive Monte Carlo simulation manipulating nine variables of interest suggest that GSLiD outperforms the Schmid-Leiman approximation and is robust to challenging conditions involving cross-loadings and pure items of the general factors. Accordingly, we supply an R package (bifactor) to make EBFA-MGF readily available for substantive research. Finally, we use GSLiD to assess the hierarchical structure of a reduced version of the Personality Inventory for DSM-5 Short Form (PID-5-SF).
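Since the bifactor package's interface is not described here, the sketch below instead shows the Schmid-Leiman baseline the paper compares against, via psych::omega on simulated data with a single general factor; the loadings, sample size, and arguments are illustrative assumptions (check ?omega for the exact interface in your psych version).

```r
# Schmid-Leiman baseline (the comparison method, not GSLiD) via psych::omega
# on simulated data with one general and three specific factors.
library(psych)
set.seed(1)
n  <- 500
g  <- rnorm(n)
sp <- replicate(3, rnorm(n))                      # three specific factors
items <- do.call(cbind, lapply(1:3, function(f)
  sapply(1:3, function(i) 0.6 * g + 0.4 * sp[, f] + rnorm(n, sd = 0.6))))
colnames(items) <- paste0("v", 1:9)

sl <- omega(items, nfactors = 3, plot = FALSE)    # Schmid-Leiman solution
sl$omega_h                                        # omega hierarchical (general factor)
```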
{"title":"Exploratory Bi-factor Analysis with Multiple General Factors.","authors":"Marcos Jiménez, Francisco J Abad, Eduardo Garcia-Garzon, Luis Eduardo Garrido","doi":"10.1080/00273171.2023.2189571","DOIUrl":"10.1080/00273171.2023.2189571","url":null,"abstract":"<p><p>Exploratory bi-factor analysis (EBFA) is a very popular approach to estimate models where specific factors are concomitant to a single, general dimension. However, the models typically encountered in fields like personality, intelligence, and psychopathology involve more than one general factor. To address this circumstance, we developed an algorithm (GSLiD) based on partially specified targets to perform exploratory bi-factor analysis with multiple general factors (EBFA-MGF). In EBFA-MGF, researchers do not need to conduct independent bi-factor analyses anymore because several bi-factor models are estimated simultaneously in an exploratory manner, guarding against biased estimates and model misspecification errors due to unexpected cross-loadings and factor correlations. The results from an exhaustive Monte Carlo simulation manipulating nine variables of interest suggested that GSLiD outperforms the Schmid-Leiman approximation and is robust to challenging conditions involving cross-loadings and pure items of the general factors. Thereby, we supply an R package (bifactor) to make EBFA-MGF readily available for substantive research. Finally, we use GSLiD to assess the hierarchical structure of a reduced version of the Personality Inventory for DSM-5 Short Form (PID-5-SF).</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9642682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Which is Better for Individual Participant Data Meta-Analysis of Zero-Inflated Count Outcomes, One-Step or Two-Step Analysis? A Simulation Study
Pub Date: 2023-11-01 | Epub Date: 2023-03-23 | DOI: 10.1080/00273171.2023.2173135
David Huh, Scott A Baldwin, Zhengyang Zhou, Joonsuk Park, Eun-Young Mun
Meta-analysis using individual participant data (IPD) is an important methodology in intervention research because it (a) increases accuracy and precision of estimates, (b) allows researchers to investigate mediators and moderators of treatment effects, and (c) makes use of extant data. IPD meta-analysis can be conducted either via a one-step approach that uses data from all studies simultaneously, or a two-step approach, which aggregates data for each study and then combines them in a traditional meta-analysis model. Unfortunately, there are no evidence-based guidelines for how best to approach IPD meta-analysis for count outcomes with many zeroes, such as alcohol use. We used simulation to compare the performance of four hurdle models (three one-step and one two-step) for zero-inflated count IPD under realistic data conditions. Overall, all models yielded adequate coverage and bias for the treatment effect in the count portion of the model, across all data conditions. However, in the zero portion, the treatment effect was underestimated in most models and data conditions, especially when there were fewer studies. The performance of both one- and two-step approaches depended on the formulation of the treatment effects, suggesting a need to carefully consider model assumptions and specifications when using IPD.
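The sketch below illustrates only the two-step route on simulated IPD: per-study hurdle fits via pscl::hurdle followed by inverse-variance pooling of the count-part treatment effect; the data-generating model and pooling choices are assumptions and do not reproduce the one-step mixed-effects formulations compared in the paper.

```r
# Two-step IPD meta-analysis sketch: per-study hurdle fits (pscl::hurdle),
# then fixed-effect inverse-variance pooling of the count-part treatment
# effect. Simulated data; the one-step formulations are not shown.
library(pscl)
set.seed(1)
K <- 8; n <- 150
sim_study <- function(k) {
  tx <- rbinom(n, 1, 0.5)
  p_zero <- plogis(-0.2 - 0.3 * tx)               # probability of a zero
  mu     <- exp(1.0 - 0.4 * tx)                   # mean of the positive part
  y <- ifelse(runif(n) < p_zero, 0, rpois(n, mu) + 1)
  data.frame(study = k, tx = tx, y = y)
}
studies <- lapply(1:K, sim_study)

# Step 1: study-specific treatment effects (count part) and standard errors
step1 <- t(sapply(studies, function(d) {
  fit <- hurdle(y ~ tx, data = d, dist = "poisson")
  c(coef(fit)["count_tx"], sqrt(diag(vcov(fit)))["count_tx"])
}))
colnames(step1) <- c("b", "se")

# Step 2: fixed-effect inverse-variance pooling
w <- 1 / step1[, "se"]^2
c(pooled_b = sum(w * step1[, "b"]) / sum(w), pooled_se = sqrt(1 / sum(w)))
```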
{"title":"Which is Better for Individual Participant Data Meta-Analysis of Zero-Inflated Count Outcomes, One-Step or Two-Step Analysis? A Simulation Study.","authors":"David Huh, Scott A Baldwin, Zhengyang Zhou, Joonsuk Park, Eun-Young Mun","doi":"10.1080/00273171.2023.2173135","DOIUrl":"10.1080/00273171.2023.2173135","url":null,"abstract":"<p><p>Meta-analysis using individual participant data (IPD) is an important methodology in intervention research because it (a) increases accuracy and precision of estimates, (b) allows researchers to investigate mediators and moderators of treatment effects, and (c) makes use of extant data. IPD meta-analysis can be conducted either via a one-step approach that uses data from all studies simultaneously, or a two-step approach, which aggregates data for each study and then combines them in a traditional meta-analysis model. Unfortunately, there are no evidence-based guidelines for how best to approach IPD meta-analysis for count outcomes with many zeroes, such as alcohol use. We used simulation to compare the performance of four hurdle models (3 one-step and 1 two-step models) for zero-inflated count IPD, under realistic data conditions. Overall, all models yielded adequate coverage and bias for the treatment effect in the count portion of the model, across all data conditions. However, in the zero portion, the treatment effect was underestimated in most models and data conditions, especially when there were fewer studies. The performance of both one- and two-step approaches depended on the formulation of the treatment effects, suggesting a need to carefully consider model assumptions and specifications when using IPD.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":null,"pages":null},"PeriodicalIF":5.3,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10517064/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9535145","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CLC Estimator: A Tool for Latent Construct Estimation via Congeneric Approaches in Survey Research
Pub Date: 2023-11-01 | Epub Date: 2023-04-10 | DOI: 10.1080/00273171.2023.2193718
Giacomo Marzi, Marco Balzano, Leonardo Egidi, Alessandro Magrini
This article proposes the Shiny app 'CLC Estimator' (Congeneric Latent Construct Estimator) to address the problem of estimating latent unidimensional constructs via congeneric approaches. Although congeneric approaches provide more rigorous results than suboptimal parallel-based scoring methods, most statistical packages do not make them easily accessible. The CLC Estimator allows social scientists to estimate latent unidimensional constructs with congeneric approaches smoothly, offering a novel solution to the limited availability of congeneric estimation methods in survey research.
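The app itself is point-and-click; as a rough command-line analogue, the lavaan sketch below contrasts a congeneric one-factor score with a unit-weighted (parallel-style) sum score on simulated items, with loadings and sample size as illustrative assumptions.

```r
# Congeneric (one-factor CFA) score vs. unit-weighted sum score via lavaan,
# on simulated items with unequal loadings. Not the CLC Estimator app itself.
library(lavaan)
set.seed(1)
n     <- 400
eta   <- rnorm(n)
loads <- c(0.9, 0.7, 0.5, 0.4)                    # unequal loadings: congeneric
items <- sapply(loads, function(l) l * eta + rnorm(n, sd = sqrt(1 - l^2)))
colnames(items) <- paste0("x", 1:4)
dat <- as.data.frame(items)

fit <- cfa("f =~ x1 + x2 + x3 + x4", data = dat)
congeneric_score <- as.numeric(lavPredict(fit))   # model-based factor scores
sum_score        <- rowSums(dat)                  # parallel-style scoring
c(congeneric = cor(congeneric_score, eta), sum_score = cor(sum_score, eta))
```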
{"title":"CLC Estimator: A Tool for Latent Construct Estimation via Congeneric Approaches in Survey Research.","authors":"Giacomo Marzi, Marco Balzano, Leonardo Egidi, Alessandro Magrini","doi":"10.1080/00273171.2023.2193718","DOIUrl":"10.1080/00273171.2023.2193718","url":null,"abstract":"<p><p>This article proposes the Shiny app 'CLC Estimator' -Congeneric Latent Construct Estimator- to address the problem of estimating latent unidimensional constructs via congeneric approaches. While congeneric approaches provide more rigorous results than suboptimal parallel-based scoring methods, most statistical packages do not provide easy access to congeneric approaches. To address this issue, the CLC Estimator allows social scientists to use congeneric approaches to estimate latent unidimensional constructs smoothly. The present app provides a novel solution to the challenge of limited access to congeneric estimation methods in survey research.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9279519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
State Space Mixture Modeling: Finding People with Similar Patterns of Change
Pub Date: 2023-10-10 | DOI: 10.1080/00273171.2023.2261224
Michael D Hunter
Increasingly, behavioral scientists encounter data where several individuals were measured on multiple variables over numerous occasions. Many current methods combine these data, assuming all individuals are randomly equivalent. An extreme alternative assumes no one is randomly equivalent. We propose state space mixture modeling as one possible compromise. State space mixture modeling assumes that unknown groups of people exist who share the same parameters of a state space model, and simultaneously estimates both the state space parameters and group membership. The goal is to find people who are undergoing similar change processes over time. The present work demonstrates state space mixture modeling on a simulated data set, and summarizes the results from a large simulation study. The illustration shows how the analysis is conducted, whereas the simulation provides evidence of its general validity and applicability. In the simulation study, sample size had the greatest influence on parameter estimation and the dimension of the change process had the greatest impact on correctly grouping people together, likely due to the distinctiveness of their patterns of change. State space mixture modeling offers one of the best-performing methods for simultaneously drawing conclusions about individual change processes while also analyzing multiple people.
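The proposed method estimates state space parameters and group membership simultaneously; the base-R sketch below is only a crude two-stage stand-in (person-specific AR(1) fits followed by k-means on the estimates), meant solely to convey the idea of grouping people by their change processes on simulated data.

```r
# Crude two-stage stand-in (not the simultaneous estimator): person-specific
# AR(1) fits, then k-means clustering of the estimated coefficients.
set.seed(1)
n_per_group <- 30; T_len <- 100
sim_person <- function(phi) as.numeric(arima.sim(list(ar = phi), n = T_len))
people <- c(replicate(n_per_group, sim_person(0.2), simplify = FALSE),
            replicate(n_per_group, sim_person(0.7), simplify = FALSE))

phi_hat <- sapply(people, function(y) ar(y, order.max = 1, aic = FALSE)$ar)
groups  <- kmeans(phi_hat, centers = 2)$cluster
table(truth = rep(1:2, each = n_per_group), recovered = groups)
```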
{"title":"State Space Mixture Modeling: Finding People with Similar Patterns of Change.","authors":"Michael D Hunter","doi":"10.1080/00273171.2023.2261224","DOIUrl":"https://doi.org/10.1080/00273171.2023.2261224","url":null,"abstract":"<p><p>Increasingly, behavioral scientists encounter data where several individuals were measured on multiple variables over numerous occasions. Many current methods combine these data, assuming all individuals are randomly equivalent. An extreme alternative assumes no one is randomly equivalent. We propose state space mixture modeling as one possible compromise. State space mixture modeling assumes that unknown groups of people exist who share the same parameters of a state space model, and simultaneously estimates both the state space parameters and group membership. The goal is to find people that are undergoing similar change processes over time. The present work demonstrates state space mixture modeling on a simulated data set, and summarizes the results from a large simulation study. The illustration shows how the analysis is conducted, whereas the simulation provides evidence of its general validity and applicability. In the simulation study, sample size had the greatest influence on parameter estimation and the dimension of the change process had the greatest impact on correctly grouping people together, likely due to the distinctiveness of their patterns of change. State space mixture modeling offers one of the best-performing methods for simultaneously drawing conclusions about individual change processes while also analyzing multiple people.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2023-10-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41184181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gradient Tree Boosting for Hierarchical Data
Pub Date: 2023-09-01 | Epub Date: 2023-01-05 | DOI: 10.1080/00273171.2022.2146638
Marie Salditt, Sarah Humberg, Steffen Nestler
Gradient tree boosting is a powerful machine learning technique that has shown good performance in predicting a variety of outcomes. However, when applied to hierarchical (e.g., longitudinal or clustered) data, the predictive performance of gradient tree boosting may be harmed by ignoring the hierarchical structure, and may be improved by accounting for it. Tree-based methods such as regression trees and random forests have already been extended to hierarchical data settings by combining them with the linear mixed effects model (MEM). In the present article, we add to this literature by proposing two algorithms to estimate a combination of the MEM and gradient tree boosting. We report on two simulation studies that (i) investigate the predictive performance of the two MEM boosting algorithms and (ii) compare them to standard gradient tree boosting, standard random forest, and other existing methods for hierarchical data (MEM, MEM random forests, model-based boosting, Bayesian additive regression trees [BART]). We found substantial improvements in the predictive performance of our MEM boosting algorithms over standard boosting when the random effects were non-negligible. MEM boosting as well as BART showed a predictive performance similar to the correctly specified MEM (i.e., the benchmark model), and overall outperformed the model-based boosting and random forest approaches.
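The sketch below is a heavily simplified alternating scheme, not the paper's two algorithms: gradient tree boosting (via gbm) for the fixed part and a random-intercept model (via lme4) for the cluster part, iterated a few times on simulated data; all names and settings are illustrative assumptions.

```r
# Highly simplified alternating scheme (not the paper's algorithms): boosted
# trees for the fixed part, random intercepts for the clusters, iterated.
library(gbm)
library(lme4)
set.seed(1)
J <- 40; nj <- 30
cluster <- factor(rep(1:J, each = nj))
x1 <- rnorm(J * nj); x2 <- rnorm(J * nj)
u  <- rnorm(J, sd = 1)[as.integer(cluster)]       # true random intercepts
y  <- sin(x1) + 0.5 * x2^2 + u + rnorm(J * nj, sd = 0.5)
dat <- data.frame(y, x1, x2, cluster)

b_hat <- rep(0, nrow(dat))                        # current cluster-level part
for (it in 1:5) {
  dat$y_fixed <- dat$y - b_hat                    # remove cluster-level part
  fit_gbm <- gbm(y_fixed ~ x1 + x2, data = dat, distribution = "gaussian",
                 n.trees = 300, interaction.depth = 3, shrinkage = 0.05)
  f_hat <- predict(fit_gbm, dat, n.trees = 300)
  dat$y_resid <- dat$y - f_hat                    # remove boosted fixed part
  fit_mem <- lmer(y_resid ~ 1 + (1 | cluster), data = dat)
  b_hat <- predict(fit_mem, newdata = dat)        # intercept + random effects
}
sqrt(mean((dat$y - (f_hat + b_hat))^2))           # in-sample RMSE of combined fit
```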
{"title":"Gradient Tree Boosting for Hierarchical Data.","authors":"Marie Salditt, Sarah Humberg, Steffen Nestler","doi":"10.1080/00273171.2022.2146638","DOIUrl":"10.1080/00273171.2022.2146638","url":null,"abstract":"<p><p>Gradient tree boosting is a powerful machine learning technique that has shown good performance in predicting a variety of outcomes. However, when applied to hierarchical (e.g., longitudinal or clustered) data, the predictive performance of gradient tree boosting may be harmed by ignoring the hierarchical structure, and may be improved by accounting for it. Tree-based methods such as regression trees and random forests have already been extended to hierarchical data settings by combining them with the linear mixed effects model (MEM). In the present article, we add to this literature by proposing two algorithms to estimate a combination of the MEM and gradient tree boosting. We report on two simulation studies that (i) investigate the predictive performance of the two MEM boosting algorithms and (ii) compare them to standard gradient tree boosting, standard random forest, and other existing methods for hierarchical data (MEM, MEM random forests, model-based boosting, Bayesian additive regression trees [BART]). We found substantial improvements in the predictive performance of our MEM boosting algorithms over standard boosting when the random effects were non-negligible. MEM boosting as well as BART showed a predictive performance similar to the correctly specified MEM (i.e., the benchmark model), and overall outperformed the model-based boosting and random forest approaches.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":null,"pages":null},"PeriodicalIF":3.8,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10480691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}