Pub Date: 2024-01-01 | Epub Date: 2023-05-23 | DOI: 10.1080/00273171.2023.2205393
Adéla Hladká, Patrícia Martinková, David Magis
Many differential item functioning (DIF) detection methods rely on the principle of testing items one at a time while treating the remaining items, or at least a subset of them, as DIF-free. Computational algorithms for these DIF detection methods involve selecting DIF-free items in an iterative procedure called item purification. A separate concern is the need to correct for multiple comparisons, which can be addressed with a number of existing multiple comparison adjustment methods. In this article, we demonstrate that implementing these two controlling procedures together may affect which items are detected as DIF items. We propose an iterative algorithm combining item purification with adjustment for multiple comparisons. Desirable properties of the newly proposed algorithm are demonstrated in a simulation study, and the method is illustrated on a real data example.
{"title":"Combining Item Purification and Multiple Comparison Adjustment Methods in Detection of Differential Item Functioning.","authors":"Adéla Hladká, Patrícia Martinková, David Magis","doi":"10.1080/00273171.2023.2205393","DOIUrl":"10.1080/00273171.2023.2205393","url":null,"abstract":"<p><p>Many of the differential item functioning (DIF) detection methods rely on a principle of testing for DIF item by item, while considering the rest of the items or at least some of them being DIF-free. Computational algorithms of these DIF detection methods involve the selection of DIF-free items in an iterative procedure called <i>item purification</i>. Another aspect is the need to correct for multiple comparisons, which can be done with a number of existing <i>multiple comparison adjustment</i> methods. In this article, we demonstrate that implementation of these two controlling procedures together may have an impact on which items are detected as DIF items. We propose an iterative algorithm combining item purification and adjustment for multiple comparisons. Pleasant properties of the newly proposed algorithm are shown with a simulation study. The method is demonstrated on a real data example.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":" ","pages":"46-61"},"PeriodicalIF":3.8,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9581128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-01-01 | Epub Date: 2023-06-01 | DOI: 10.1080/00273171.2023.2211564
Xuelan Qiu, Sheng-Yun Huang, Wen-Chung Wang, You-Gan Wang
Many person-fit statistics have been proposed to detect aberrant response behaviors (e.g., cheating, guessing). Among them, lz is one of the most widely used indices. The computation of lz assumes the item and person parameters are known; in reality, they usually must be estimated from data, and the better the estimation, the better lz performs. When aberrant behaviors occur, the person and item parameter estimates are inaccurate, which in turn degrades the performance of lz. In this study, an iterative procedure was developed to obtain more accurate person parameter estimates and thereby improve the performance of lz. A series of simulations evaluated the iterative procedure under two item parameter conditions (known and unknown) and three aberrant response styles: difficulty-sharing cheating, random-sharing cheating, and random guessing. The results demonstrated the superiority of the iterative procedure over the non-iterative one in maintaining control of Type I error rates and improving the power to detect aberrant responses. The proposed procedure was applied to a high-stakes intelligence test.
{"title":"An Iterative Scale Purification Procedure on <i>l</i><sub>z</sub> for the Detection of Aberrant Responses.","authors":"Xuelan Qiu, Sheng-Yun Huang, Wen-Chung Wang, You-Gan Wang","doi":"10.1080/00273171.2023.2211564","DOIUrl":"10.1080/00273171.2023.2211564","url":null,"abstract":"<p><p>Many person-fit statistics have been proposed to detect aberrant response behaviors (e.g., cheating, guessing). Among them, <i>l</i><sub>z</sub> is one of the most widely used indices. The computation of <i>l</i><sub>z</sub> assumes the item and person parameters are known. In reality, they often have to be estimated from data. The better the estimation, the better <i>l</i><sub>z</sub> will perform. When aberrant behaviors occur, the person and item parameter estimations are inaccurate, which in turn degrade the performance of <i>l</i><sub>z</sub>. In this study, an iterative procedure was developed to attain more accurate person parameter estimates for improved performance of <i>l</i><sub>z</sub>. A series of simulations were conducted to evaluate the iterative procedure under two conditions of item parameters, known and unknown, and three aberrant response styles of difficulty-sharing cheating, random-sharing cheating, and random guessing. The results demonstrated the superiority of the iterative procedure over the non-iterative one in maintaining control of Type-I error rates and improving the power of detecting aberrant responses. The proposed procedure was applied to a high-stake intelligence test.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":" ","pages":"62-77"},"PeriodicalIF":3.8,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9552354","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2024-01-01 | Epub Date: 2023-07-17 | DOI: 10.1080/00273171.2023.2229079
John J Dziak, Daniel Almirall, Walter Dempsey, Catherine Stanger, Inbal Nahum-Shani
Sequential Multiple-Assignment Randomized Trials (SMARTs) play an increasingly important role in psychological and behavioral health research. This experimental approach enables researchers to answer scientific questions about how to sequence and match interventions to the unique, changing needs of individuals. A variety of sample size planning resources for SMART studies have been developed, enabling researchers to plan SMARTs for addressing different types of scientific questions. However, relatively limited attention has been given to planning SMARTs with binary (dichotomous) outcomes, which often require larger sample sizes than continuous outcomes. Existing resources for estimating sample size requirements for SMARTs with binary outcomes do not consider the potential to improve power by including a baseline measurement and/or multiple repeated outcome measurements. The current paper addresses this issue by providing sample size planning simulation procedures and approximate formulas for two-wave repeated-measures binary outcomes (i.e., the outcome variable is measured twice, before and after intervention delivery). The simulation results agree well with the formulas. We also discuss how to use simulations to calculate power for studies with more than two outcome measurement occasions. Results show that having at least one repeated measurement of the outcome can substantially improve power under certain conditions.
{"title":"SMART Binary: New Sample Size Planning Resources for SMART Studies with Binary Outcome Measurements.","authors":"John J Dziak, Daniel Almirall, Walter Dempsey, Catherine Stanger, Inbal Nahum-Shani","doi":"10.1080/00273171.2023.2229079","DOIUrl":"10.1080/00273171.2023.2229079","url":null,"abstract":"<p><p>Sequential Multiple-Assignment Randomized Trials (SMARTs) play an increasingly important role in psychological and behavioral health research. This experimental approach enables researchers to answer scientific questions about how to sequence and match interventions to the unique, changing needs of individuals. A variety of sample size planning resources for SMART studies have been developed, enabling researchers to plan SMARTs for addressing different types of scientific questions. However, relatively limited attention has been given to planning SMARTs with binary (dichotomous) outcomes, which often require higher sample sizes relative to continuous outcomes. Existing resources for estimating sample size requirements for SMARTs with binary outcomes do not consider the potential to improve power by including a baseline measurement and/or multiple repeated outcome measurements. The current paper addresses this issue by providing sample size planning simulation procedures and approximate formulas for two-wave repeated measures binary outcomes (i.e., two measurement times for the outcome variable, before and after intervention delivery). The simulation results agree well with the formulas. We also discuss how to use simulations to calculate power for studies with more than two outcome measurement occasions. Results show that having at least one repeated measurement of the outcome can substantially improve power under certain conditions.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":" ","pages":"1-16"},"PeriodicalIF":3.8,"publicationDate":"2024-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10792389/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10053172","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-12-31 | DOI: 10.1080/00273171.2023.2254769
M. M. Haqiqatkhah, O. Ryan, E. L. Hamaker
Multilevel autoregressive models are popular choices for the analysis of intensive longitudinal data in psychology. Empirical studies have found a positive correlation between autoregressive parame...
{"title":"Skewness and Staging: Does the Floor Effect Induce Bias in Multilevel AR(1) Models?","authors":"M. M. Haqiqatkhah, O. Ryan, E. L. Hamaker","doi":"10.1080/00273171.2023.2254769","DOIUrl":"https://doi.org/10.1080/00273171.2023.2254769","url":null,"abstract":"Multilevel autoregressive models are popular choices for the analysis of intensive longitudinal data in psychology. Empirical studies have found a positive correlation between autoregressive parame...","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":"77 1","pages":""},"PeriodicalIF":3.8,"publicationDate":"2023-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139070537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-11-01 | Epub Date: 2023-05-04 | DOI: 10.1080/00273171.2023.2194606
Alexander P Christensen, Luis Eduardo Garrido, Hudson Golino
The local independence assumption states that variables are unrelated after conditioning on a latent variable. Common problems that arise from violations of this assumption include model misspecification, biased model parameters, and inaccurate estimates of internal structure. These problems are not limited to latent variable models but also apply to network psychometrics. This paper proposes a novel network psychometric approach to detecting locally dependent pairs of variables using network modeling and a graph theory measure called weighted topological overlap (wTO). Using simulation, this approach is compared with contemporary local dependence detection methods such as exploratory structural equation modeling with standardized expected parameter change and a recently developed approach using partial correlations and a resampling procedure. Different criteria for flagging local dependence, based on statistical significance or cutoff values, are also compared. Continuous, polytomous (5-point Likert scale), and dichotomous (binary) data were generated with skew across a variety of conditions. Our results indicate that cutoff values work better than significance-based approaches. Overall, the best-performing methods were the network psychometric approaches using wTO with the graphical least absolute shrinkage and selection operator (with the extended Bayesian information criterion) and wTO with the Bayesian Gaussian graphical model.
{"title":"Unique Variable Analysis: A Network Psychometrics Method to Detect Local Dependence.","authors":"Alexander P Christensen, Luis Eduardo Garrido, Hudson Golino","doi":"10.1080/00273171.2023.2194606","DOIUrl":"10.1080/00273171.2023.2194606","url":null,"abstract":"<p><p>The local independence assumption states that variables are unrelated after conditioning on a latent variable. Common problems that arise from violations of this assumption include model misspecification, biased model parameters, and inaccurate estimates of internal structure. These problems are not limited to latent variable models but also apply to network psychometrics. This paper proposes a novel network psychometric approach to detect locally dependent pairs of variables using network modeling and a graph theory measure called weighted topological overlap (wTO). Using simulation, this approach is compared to contemporary local dependence detection methods such as exploratory structural equation modeling with standardized expected parameter change and a recently developed approach using partial correlations and a resampling procedure. Different approaches to determine local dependence using statistical significance and cutoff values are also compared. Continuous, polytomous (5-point Likert scale), and dichotomous (binary) data were generated with skew across a variety of conditions. Our results indicate that cutoff values work better than significance approaches. Overall, the network psychometrics approaches using wTO with graphical least absolute shrinkage and selector operator with extended Bayesian information criterion and wTO with Bayesian Gaussian graphical model were the best performing local dependence detection methods overall.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":" ","pages":"1165-1182"},"PeriodicalIF":3.8,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9406819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-11-01 | Epub Date: 2023-05-25 | DOI: 10.1080/00273171.2023.2182753
Booil Jo, Trevor J Hastie, Zetan Li, Eric A Youngstrom, Robert L Findling, Sarah McCue Horwitz
Despite its potential benefits, using prediction targets generated by latent variable (LV) modeling is not common practice in supervised learning, the dominant framework for developing prediction models. In supervised learning, it is typically assumed that the outcome to be predicted is clear and readily available, so validating outcomes before predicting them is a foreign concept and a seemingly unnecessary step. The usual goal of LV modeling is inference, and therefore using it in supervised learning and in the prediction context requires a major conceptual shift. This study lays out the methodological adjustments and conceptual shifts necessary for integrating LV modeling into supervised learning. We show that such integration is possible by combining the traditions of LV modeling, psychometrics, and supervised learning. In this interdisciplinary learning framework, the two main strategies are generating practical outcomes using LV modeling and systematically validating them against clinical validators. In an example using data from the Longitudinal Assessment of Manic Symptoms (LAMS) Study, a large pool of candidate outcomes is generated by flexible LV modeling. We demonstrate that this exploratory situation can be used as an opportunity to tailor desirable prediction targets, taking advantage of contemporary science and clinical insights.
{"title":"Reorienting Latent Variable Modeling for Supervised Learning.","authors":"Booil Jo, Trevor J Hastie, Zetan Li, Eric A Youngstrom, Robert L Findling, Sarah McCue Horwitz","doi":"10.1080/00273171.2023.2182753","DOIUrl":"10.1080/00273171.2023.2182753","url":null,"abstract":"<p><p>Despite its potentials benefits, using prediction targets generated based on latent variable (LV) modeling is not a common practice in supervised learning, a dominating framework for developing prediction models. In supervised learning, it is typically assumed that the outcome to be predicted is clear and readily available, and therefore validating outcomes before predicting them is a foreign concept and an unnecessary step. The usual goal of LV modeling is inference, and therefore using it in supervised learning and in the prediction context requires a major conceptual shift. This study lays out methodological adjustments and conceptual shifts necessary for integrating LV modeling into supervised learning. It is shown that such integration is possible by combining the traditions of LV modeling, psychometrics, and supervised learning. In this interdisciplinary learning framework, generating practical outcomes using LV modeling and systematically validating them based on clinical validators are the two main strategies. In the example using the data from the Longitudinal Assessment of Manic Symptoms (LAMS) Study, a large pool of candidate outcomes is generated by flexible LV modeling. It is demonstrated that this exploratory situation can be used as an opportunity to tailor desirable prediction targets taking advantage of contemporary science and clinical insights.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":" ","pages":"1057-1071"},"PeriodicalIF":5.3,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10674034/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9524122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-11-01 | Epub Date: 2023-04-25 | DOI: 10.1080/00273171.2023.2201277
Ivan Jacob Agaloos Pesigan, Rong Wei Sun, Shu Fai Cheung
The multivariate delta method was used by Yuan and Chan to estimate standard errors and confidence intervals for standardized regression coefficients. Jones and Waller extended this work to situations where data are nonnormal by utilizing Browne's asymptotic distribution-free (ADF) theory. Furthermore, Dudgeon developed standard errors and confidence intervals employing heteroskedasticity-consistent (HC) estimators, which are robust to nonnormality and perform better in smaller samples than Jones and Waller's ADF technique. Despite these advancements, empirical research has been slow to adopt these methodologies, possibly because of the dearth of user-friendly software implementing them. In this manuscript we present the betaDelta and betaSandwich packages for the R statistical software environment. The betaDelta package implements both the normal-theory approach of Yuan and Chan and the ADF approach of Jones and Waller, while the betaSandwich package implements the HC approach proposed by Dudgeon. The use of the packages is demonstrated with an empirical example. We believe the packages will enable applied researchers to accurately assess the sampling variability of standardized regression coefficients.
{"title":"betaDelta and betaSandwich: Confidence Intervals for Standardized Regression Coefficients in R.","authors":"Ivan Jacob Agaloos Pesigan, Rong Wei Sun, Shu Fai Cheung","doi":"10.1080/00273171.2023.2201277","DOIUrl":"10.1080/00273171.2023.2201277","url":null,"abstract":"<p><p>The multivariate delta method was used by Yuan and Chan to estimate standard errors and confidence intervals for standardized regression coefficients. Jones and Waller extended the earlier work to situations where data are nonnormal by utilizing Browne's asymptotic distribution-free (ADF) theory. Furthermore, Dudgeon developed standard errors and confidence intervals, employing heteroskedasticity-consistent (HC) estimators, that are robust to nonnormality with better performance in smaller sample sizes compared to Jones and Waller's ADF technique. Despite these advancements, empirical research has been slow to adopt these methodologies. This can be a result of the dearth of user-friendly software programs to put these techniques to use. We present the betaDelta and the betaSandwich packages in the R statistical software environment in this manuscript. Both the normal-theory approach and the ADF approach put forth by Yuan and Chan and Jones and Waller are implemented by the betaDelta package. The HC approach proposed by Dudgeon is implemented by the betaSandwich package. The use of the packages is demonstrated with an empirical example. We think the packages will enable applied researchers to accurately assess the sampling variability of standardized regression coefficients.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":" ","pages":"1183-1186"},"PeriodicalIF":3.8,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9986115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-11-01 | Epub Date: 2023-04-10 | DOI: 10.1080/00273171.2023.2174490
Jason D Rights, Sonya K Sterba
For multilevel models (MLMs) with fixed slopes, it has been widely recognized that a level-1 variable can have distinct between-cluster and within-cluster fixed effects, and that failing to disaggregate these effects yields a conflated, uninterpretable fixed effect. For MLMs with random slopes, however, we clarify that two different types of slope conflation can occur: that of the fixed component (termed fixed conflation) and that of the random component (termed random conflation). The latter is rarely recognized and not well understood. Here we explain that a model commonly used to disaggregate the fixed component (the contextual effect model with random slopes) troublingly still yields a conflated random component. The negative consequences of such random conflation had not previously been demonstrated. Here we show that they include erroneous interpretations and inferences about the substantively important extent of between-cluster differences in slopes, including either underestimating or overestimating such slope heterogeneity. Furthermore, we show that this random conflation can yield inappropriate standard errors for fixed effects. To aid researchers in practice, we delineate which types of random slope specifications yield an unconflated random component. We demonstrate the advantages of these unconflated models in terms of estimating and testing random slope variance (i.e., improved power, Type I error, and bias) and in terms of standard error estimation for fixed effects (i.e., more accurate standard errors), and we make recommendations for which specifications to use for particular research purposes.
{"title":"On the Common but Problematic Specification of Conflated Random Slopes in Multilevel Models.","authors":"Jason D Rights, Sonya K Sterba","doi":"10.1080/00273171.2023.2174490","DOIUrl":"10.1080/00273171.2023.2174490","url":null,"abstract":"<p><p>For multilevel models (MLMs) with fixed slopes, it has been widely recognized that a level-1 variable can have distinct between-cluster and within-cluster fixed effects, and that failing to disaggregate these effects yields a conflated, uninterpretable fixed effect. For MLMs with random slopes, however, we clarify that two different types of slope conflation can occur: that of the fixed component (termed fixed conflation) and that of the random component (termed random conflation). The latter is rarely recognized and not well understood. Here we explain that a model commonly used to disaggregate the fixed component-the contextual effect model with random slopes-troublingly still yields a conflated random component. Negative consequences of such random conflation have not been demonstrated. Here we show that they include erroneous interpretation and inferences about the substantively important extent of between-cluster differences in slopes, including either underestimating or overestimating such slope heterogeneity. Furthermore, we show that this random conflation can yield inappropriate standard errors for fixed effects. To aid researchers in practice, we delineate which types of random slope specifications yield an unconflated random component. We demonstrate the advantages of these unconflated models in terms of estimating and testing random slope variance (i.e., improved power, Type I error, and bias) and in terms of standard error estimation for fixed effects (i.e., more accurate standard errors), and make recommendations for which specifications to use for particular research purposes.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":" ","pages":"1106-1133"},"PeriodicalIF":3.8,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9642680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-11-01 | Epub Date: 2023-04-11 | DOI: 10.1080/00273171.2023.2193600
Lihan Chen, Victoria Savalei, Mijke Rhemtulla
The use of modern missing data techniques has become more prevalent with their increasing accessibility in statistical software. These techniques focus on handling data that are missing at random (MAR). Although all MAR mechanisms are routinely treated as the same, they are not equal. The impact of missing data on the efficiency of parameter estimates can differ across MAR variations, even when the amount of missing data is held constant; yet, in current practice, only the rate of missing data is reported. The impact of MAR on the loss of efficiency can instead be measured more directly by the fraction of missing information (FMI). In this article, we explore this impact using FMIs in regression models with one and two predictors. With the help of a Shiny application, we demonstrate that efficiency loss due to missing data can be highly complex and is not always intuitive. We recommend that substantive researchers who work with missing data report estimates of FMIs in addition to the rate of missingness. We also encourage methodologists to examine FMIs when designing simulation studies with missing data, and to explore the behavior of efficiency loss under MAR using FMIs in more complex models.
{"title":"Pay Attention to the Ignorable Missing Data Mechanisms! An Exploration of Their Impact on the Efficiency of Regression Coefficients.","authors":"Lihan Chen, Victoria Savalei, Mijke Rhemtulla","doi":"10.1080/00273171.2023.2193600","DOIUrl":"10.1080/00273171.2023.2193600","url":null,"abstract":"<p><p>The use of modern missing data techniques has become more prevalent with their increasing accessibility in statistical software. These techniques focus on handling data that are <i>missing at random</i> (MAR). Although all MAR mechanisms are routinely treated as the same, they are not equal. The impact of missing data on the efficiency of parameter estimates can differ for different MAR variations, even when the amount of missing data is held constant; yet, in current practice, only the rate of missing data is reported. The impact of MAR on the loss of efficiency can instead be more directly measured by the <i>fraction of missing information</i> (FMI). In this article, we explore this impact using FMIs in regression models with one and two predictors. With the help of a <i>Shiny</i> application, we demonstrate that efficiency loss due to missing data can be highly complex and is not always intuitive. We recommend substantive researchers who work with missing data report estimates of FMIs in addition to the rate of missingness. We also encourage methodologists to examine FMIs when designing simulation studies with missing data, and to explore the behavior of efficiency loss under MAR using FMIs in more complex models.</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":" ","pages":"1134-1159"},"PeriodicalIF":3.8,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9273348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-11-01 | Epub Date: 2023-04-10 | DOI: 10.1080/00273171.2023.2189571
Marcos Jiménez, Francisco J Abad, Eduardo Garcia-Garzon, Luis Eduardo Garrido
Exploratory bi-factor analysis (EBFA) is a very popular approach for estimating models in which specific factors are concomitant with a single general dimension. However, the models typically encountered in fields like personality, intelligence, and psychopathology involve more than one general factor. To address this circumstance, we developed an algorithm (GSLiD) based on partially specified targets to perform exploratory bi-factor analysis with multiple general factors (EBFA-MGF). With EBFA-MGF, researchers no longer need to conduct independent bi-factor analyses, because several bi-factor models are estimated simultaneously in an exploratory manner, guarding against biased estimates and model misspecification errors due to unexpected cross-loadings and factor correlations. The results from an exhaustive Monte Carlo simulation manipulating nine variables of interest suggest that GSLiD outperforms the Schmid-Leiman approximation and is robust to challenging conditions involving cross-loadings and pure items of the general factors. We therefore supply an R package (bifactor) to make EBFA-MGF readily available for substantive research. Finally, we use GSLiD to assess the hierarchical structure of a reduced version of the Personality Inventory for DSM-5 Short Form (PID-5-SF).
{"title":"Exploratory Bi-factor Analysis with Multiple General Factors.","authors":"Marcos Jiménez, Francisco J Abad, Eduardo Garcia-Garzon, Luis Eduardo Garrido","doi":"10.1080/00273171.2023.2189571","DOIUrl":"10.1080/00273171.2023.2189571","url":null,"abstract":"<p><p>Exploratory bi-factor analysis (EBFA) is a very popular approach to estimate models where specific factors are concomitant to a single, general dimension. However, the models typically encountered in fields like personality, intelligence, and psychopathology involve more than one general factor. To address this circumstance, we developed an algorithm (GSLiD) based on partially specified targets to perform exploratory bi-factor analysis with multiple general factors (EBFA-MGF). In EBFA-MGF, researchers do not need to conduct independent bi-factor analyses anymore because several bi-factor models are estimated simultaneously in an exploratory manner, guarding against biased estimates and model misspecification errors due to unexpected cross-loadings and factor correlations. The results from an exhaustive Monte Carlo simulation manipulating nine variables of interest suggested that GSLiD outperforms the Schmid-Leiman approximation and is robust to challenging conditions involving cross-loadings and pure items of the general factors. Thereby, we supply an R package (bifactor) to make EBFA-MGF readily available for substantive research. Finally, we use GSLiD to assess the hierarchical structure of a reduced version of the Personality Inventory for DSM-5 Short Form (PID-5-SF).</p>","PeriodicalId":53155,"journal":{"name":"Multivariate Behavioral Research","volume":" ","pages":"1072-1089"},"PeriodicalIF":3.8,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9642682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}