Pub Date: 2023-04-01 | Epub Date: 2022-04-29 | DOI: 10.1177/00131644221093619
Yoona Jang, Sehee Hong
The purpose of this study was to evaluate the degree of classification quality in the basic latent class model when covariates are either included in or excluded from the model. To accomplish this task, Monte Carlo simulations were conducted in which the results of models with and without a covariate were compared. Based on these simulations, it was determined that models without a covariate better predicted the number of classes. These findings generally supported the use of the popular three-step approach, with its quality of classification exceeding 70% under various conditions of covariate effect, sample size, and indicator quality. In light of these findings, the practical utility of evaluating classification quality is discussed relative to issues that applied researchers need to consider carefully when applying latent class models.
{"title":"Evaluating the Quality of Classification in Mixture Model Simulations.","authors":"Yoona Jang, Sehee Hong","doi":"10.1177/00131644221093619","DOIUrl":"10.1177/00131644221093619","url":null,"abstract":"<p><p>The purpose of this study was to evaluate the degree of classification quality in the basic latent class model when covariates are either included or are not included in the model. To accomplish this task, Monte Carlo simulations were conducted in which the results of models with and without a covariate were compared. Based on these simulations, it was determined that models without a covariate better predicted the number of classes. These findings in general supported the use of the popular three-step approach; with its quality of classification determined to be more than 70% under various conditions of covariate effect, sample size, and quality of indicators. In light of these findings, the practical utility of evaluating classification quality is discussed relative to issues that applied researchers need to carefully consider when applying latent class models.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"83 2","pages":"351-374"},"PeriodicalIF":2.7,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972124/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10833189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-04-01 | Epub Date: 2022-07-30 | DOI: 10.1177/00131644221104220
Michael John Ilagan, Carl F Falk
Administering Likert-type questionnaires to online samples risks contamination of the data by malicious computer-generated random responses, also known as bots. Although nonresponsivity indices (NRIs) such as person-total correlations or Mahalanobis distance have shown great promise for detecting bots, universal cutoff values are elusive. An initial calibration sample constructed via stratified sampling of bots and humans (real or simulated under a measurement model) has been used to choose cutoffs empirically with a high nominal specificity. However, a high-specificity cutoff is less accurate when the target sample has a high contamination rate. In the present article, we propose the supervised classes, unsupervised mixing proportions (SCUMP) algorithm, which chooses a cutoff to maximize accuracy. SCUMP uses a Gaussian mixture model to estimate, unsupervised, the contamination rate in the sample of interest. A simulation study found that, in the absence of model misspecification on the bots, our cutoffs maintained accuracy across varying contamination rates.
{"title":"Supervised Classes, Unsupervised Mixing Proportions: Detection of Bots in a Likert-Type Questionnaire.","authors":"Michael John Ilagan, Carl F Falk","doi":"10.1177/00131644221104220","DOIUrl":"10.1177/00131644221104220","url":null,"abstract":"<p><p>Administering Likert-type questionnaires to online samples risks contamination of the data by malicious computer-generated random responses, also known as bots. Although nonresponsivity indices (NRIs) such as person-total correlations or Mahalanobis distance have shown great promise to detect bots, universal cutoff values are elusive. An initial calibration sample constructed via stratified sampling of bots and humans-real or simulated under a measurement model-has been used to empirically choose cutoffs with a high nominal specificity. However, a high-specificity cutoff is less accurate when the target sample has a high contamination rate. In the present article, we propose the supervised classes, unsupervised mixing proportions (SCUMP) algorithm that chooses a cutoff to maximize accuracy. SCUMP uses a Gaussian mixture model to estimate, unsupervised, the contamination rate in the sample of interest. A simulation study found that, in the absence of model misspecification on the bots, our cutoffs maintained accuracy across varying contamination rates.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"83 2","pages":"217-239"},"PeriodicalIF":2.7,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972131/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10823907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-04-01 | Epub Date: 2022-02-28 | DOI: 10.1177/00131644221081011
James D Weese, Ronna C Turner, Xinya Liang, Allison Ames, Brandon Crawford
A study was conducted to implement a standardized effect size and corresponding classification guidelines for polytomous data with the POLYSIBTEST procedure and to compare those guidelines with prior recommendations. Two simulation studies were included. The first identifies new unstandardized test heuristics for classifying moderate and large differential item functioning (DIF) in polytomous response data with three to seven response options. These are provided for researchers studying polytomous data with previously published POLYSIBTEST software. The second simulation study provides one pair of standardized effect size heuristics that can be employed with items having any number of response options and compares true-positive and false-positive rates for the standardized effect size proposed by Weese with one proposed by Zwick et al. and two unstandardized classification procedures (Gierl; Golia). All four procedures kept false-positive rates generally below the level of significance at both moderate and large DIF levels. However, Weese's standardized effect size was not affected by sample size and provided slightly higher true-positive rates than the Zwick et al. and Golia recommendations, while flagging substantially fewer items that might be characterized as having negligible DIF than Gierl's suggested criterion. The proposed effect size allows for easier use and interpretation by practitioners because it can be applied to items with any number of response options and is interpreted as a difference in standard deviation units.
{"title":"Implementing a Standardized Effect Size in the POLYSIBTEST Procedure.","authors":"James D Weese, Ronna C Turner, Xinya Liang, Allison Ames, Brandon Crawford","doi":"10.1177/00131644221081011","DOIUrl":"10.1177/00131644221081011","url":null,"abstract":"<p><p>A study was conducted to implement the use of a standardized effect size and corresponding classification guidelines for polytomous data with the POLYSIBTEST procedure and compare those guidelines with prior recommendations. Two simulation studies were included. The first identifies new unstandardized test heuristics for classifying moderate and large differential item functioning (DIF) for polytomous response data with three to seven response options. These are provided for researchers studying polytomous data using POLYSIBTEST software that has been published previously. The second simulation study provides one pair of standardized effect size heuristics that can be employed with items having any number of response options and compares true-positive and false-positive rates for the standardized effect size proposed by Weese with one proposed by Zwick et al. and two unstandardized classification procedures (Gierl; Golia). All four procedures retained false-positive rates generally below the level of significance at both moderate and large DIF levels. However, Weese's standardized effect size was not affected by sample size and provided slightly higher true-positive rates than the Zwick et al. and Golia's recommendations, while flagging substantially fewer items that might be characterized as having negligible DIF when compared with Gierl's suggested criterion. The proposed effect size allows for easier use and interpretation by practitioners as it can be applied to items with any number of response options and is interpreted as a difference in standard deviation units.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"83 2","pages":"401-427"},"PeriodicalIF":2.7,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972129/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10823908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-04-01 | DOI: 10.1177/00131644221092347
Oscar Gonzalez
When scores are used to make decisions about respondents, it is of interest to estimate classification accuracy (CA), the probability of making a correct decision, and classification consistency (CC), the probability of making the same decision across two parallel administrations of the measure. Model-based estimates of CA and CC computed from the linear factor model have recently been proposed, but parameter uncertainty of the CA and CC indices has not been investigated. This article demonstrates how to estimate percentile bootstrap confidence intervals and Bayesian credible intervals for CA and CC indices, which have the added benefit of incorporating the sampling variability of the linear factor model's parameters into the summary intervals. Results from a small simulation study suggest that percentile bootstrap confidence intervals have appropriate coverage, albeit with a small negative bias. However, Bayesian credible intervals with diffuse priors have poor coverage, which improves once empirical, weakly informative priors are used. The procedures are illustrated by estimating CA and CC indices from a measure used to identify individuals low on mindfulness for a hypothetical intervention, and R code is provided to facilitate implementation of the procedures.
{"title":"Summary Intervals for Model-Based Classification Accuracy and Consistency Indices.","authors":"Oscar Gonzalez","doi":"10.1177/00131644221092347","DOIUrl":"https://doi.org/10.1177/00131644221092347","url":null,"abstract":"<p><p>When scores are used to make decisions about respondents, it is of interest to estimate classification accuracy (CA), the probability of making a correct decision, and classification consistency (CC), the probability of making the same decision across two parallel administrations of the measure. Model-based estimates of CA and CC computed from the linear factor model have been recently proposed, but parameter uncertainty of the CA and CC indices has not been investigated. This article demonstrates how to estimate percentile bootstrap confidence intervals and Bayesian credible intervals for CA and CC indices, which have the added benefit of incorporating the sampling variability of the parameters of the linear factor model to summary intervals. Results from a small simulation study suggest that percentile bootstrap confidence intervals have appropriate confidence interval coverage, although displaying a small negative bias. However, Bayesian credible intervals with diffused priors have poor interval coverage, but their coverage improves once empirical, weakly informative priors are used. The procedures are illustrated by estimating CA and CC indices from a measure used to identify individuals low on mindfulness for a hypothetical intervention, and R code is provided to facilitate the implementation of the procedures.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"83 2","pages":"240-261"},"PeriodicalIF":2.7,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9972125/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10823910","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-02-01 | Epub Date: 2022-02-28 | DOI: 10.1177/00131644221077135
Mirka Henninger, Rudolf Debelak, Carolin Strobl
To detect differential item functioning (DIF), Rasch trees search for optimal splitpoints in covariates and identify subgroups of respondents in a data-driven way. To determine whether and in which covariate a split should be performed, Rasch trees use statistical significance tests. Consequently, Rasch trees are more likely to label small DIF effects as significant in larger samples. This leads to larger trees, which split the sample into more subgroups. A more desirable approach would be driven by effect size rather than sample size. To achieve this, we suggest implementing an additional stopping criterion: the popular Educational Testing Service (ETS) classification scheme based on the Mantel-Haenszel odds ratio. This criterion helps us evaluate whether a split in a Rasch tree is based on a substantial or an ignorable difference in item parameters, and it allows the Rasch tree to stop growing when DIF between the identified subgroups is small. Furthermore, it supports identifying DIF items and quantifying DIF effect sizes in each split. Based on simulation results, we conclude that the Mantel-Haenszel effect size further reduces unnecessary splits in Rasch trees under the null hypothesis, or when the sample size is large but DIF effects are negligible. To make the stopping criterion easy to use for applied researchers, we have implemented the procedure in the statistical software R. Finally, we discuss how DIF effects between different nodes in a Rasch tree can be interpreted and emphasize the importance of purification strategies for the Mantel-Haenszel procedure in tree stopping and DIF item classification.
{"title":"A New Stopping Criterion for Rasch Trees Based on the Mantel-Haenszel Effect Size Measure for Differential Item Functioning.","authors":"Mirka Henninger, Rudolf Debelak, Carolin Strobl","doi":"10.1177/00131644221077135","DOIUrl":"10.1177/00131644221077135","url":null,"abstract":"<p><p>To detect differential item functioning (DIF), Rasch trees search for optimal splitpoints in covariates and identify subgroups of respondents in a data-driven way. To determine whether and in which covariate a split should be performed, Rasch trees use statistical significance tests. Consequently, Rasch trees are more likely to label small DIF effects as significant in larger samples. This leads to larger trees, which split the sample into more subgroups. What would be more desirable is an approach that is driven more by effect size rather than sample size. In order to achieve this, we suggest to implement an additional stopping criterion: the popular Educational Testing Service (ETS) classification scheme based on the Mantel-Haenszel odds ratio. This criterion helps us to evaluate whether a split in a Rasch tree is based on a substantial or an ignorable difference in item parameters, and it allows the Rasch tree to stop growing when DIF between the identified subgroups is small. Furthermore, it supports identifying DIF items and quantifying DIF effect sizes in each split. Based on simulation results, we conclude that the Mantel-Haenszel effect size further reduces unnecessary splits in Rasch trees under the null hypothesis, or when the sample size is large but DIF effects are negligible. To make the stopping criterion easy-to-use for applied researchers, we have implemented the procedure in the statistical software R. Finally, we discuss how DIF effects between different nodes in a Rasch tree can be interpreted and emphasize the importance of purification strategies for the Mantel-Haenszel procedure on tree stopping and DIF item classification.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"83 1","pages":"181-212"},"PeriodicalIF":2.1,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9806517/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10489716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-02-01 | Epub Date: 2022-02-08 | DOI: 10.1177/00131644221075580
Xiaoling Liu, Pei Cao, Xinzhen Lai, Jianbing Wen, Yanyun Yang
Percentage of uncontaminated correlations (PUC), explained common variance (ECV), and omega hierarchical (ωH) have been used to assess the degree to which a scale is essentially unidimensional and to predict structural coefficient bias when a unidimensional measurement model is fit to multidimensional data. The usefulness of these indices has been investigated in the context of bifactor models with balanced structures. This study extends the examination by focusing on bifactor models with unbalanced structures. The maximum and minimum PUC values given the total number of items and factors were derived. The usefulness of PUC, ECV, and ωH in predicting structural coefficient bias was examined under a variety of structural regression models with bifactor measurement components. Results indicated that the performance of these indices in predicting structural coefficient bias depended on whether the bifactor measurement model had a balanced or unbalanced structure. PUC failed to predict structural coefficient bias when the bifactor model had an unbalanced structure. ECV performed reasonably well, but worse than ωH.
{"title":"Assessing Essential Unidimensionality of Scales and Structural Coefficient Bias.","authors":"Xiaoling Liu, Pei Cao, Xinzhen Lai, Jianbing Wen, Yanyun Yang","doi":"10.1177/00131644221075580","DOIUrl":"10.1177/00131644221075580","url":null,"abstract":"<p><p>Percentage of uncontaminated correlations (PUC), explained common variance (ECV), and omega hierarchical (ω<sub>H</sub>) have been used to assess the degree to which a scale is essentially unidimensional and to predict structural coefficient bias when a unidimensional measurement model is fit to multidimensional data. The usefulness of these indices has been investigated in the context of bifactor models with balanced structures. This study extends the examination by focusing on bifactor models with unbalanced structures. The maximum and minimum PUC values given the total number of items and factors were derived. The usefulness of PUC, ECV, and ω<sub>H</sub> in predicting structural coefficient bias was examined under a variety of structural regression models with bifactor measurement components. Results indicated that the performance of these indices in predicting structural coefficient bias depended on whether the bifactor measurement model had a balanced or unbalanced structure. PUC failed to predict structural coefficient bias when the bifactor model had an unbalanced structure. ECV performed reasonably well, but worse than ω<sub>H</sub>.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"83 1","pages":"28-47"},"PeriodicalIF":2.7,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9806515/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10489717","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-02-01 | DOI: 10.1177/00131644211069906
Hung-Yu Huang
Forced-choice (FC) item formats used for noncognitive tests typically present a set of response options that measure different traits and instruct respondents to make preference judgments among these options, thereby controlling the response biases commonly observed in normative tests. Diagnostic classification models (DCMs) can provide information regarding the mastery status of test takers on latent discrete variables and are more commonly used for cognitive tests employed in educational settings than for noncognitive tests. The purpose of this study is to develop a new class of DCM for FC items under the higher-order DCM framework to meet the practical demands of simultaneously controlling for response biases and providing diagnostic classification information. By conducting a series of simulations and calibrating the model parameters with Bayesian estimation, the study shows that, in general, the model parameters can be recovered satisfactorily with the use of long tests and large samples. More attributes improve the precision of the second-order latent trait estimation in a long test, but decrease the classification accuracy and the estimation quality of the structural parameters. When statements are allowed to load on two distinct attributes in paired-comparison items, the specific-attribute condition produces better parameter estimation than the overlap-attribute condition. Finally, an empirical analysis related to work-motivation measures is presented to demonstrate the applications and implications of the new model.
{"title":"Diagnostic Classification Model for Forced-Choice Items and Noncognitive Tests.","authors":"Hung-Yu Huang","doi":"10.1177/00131644211069906","DOIUrl":"https://doi.org/10.1177/00131644211069906","url":null,"abstract":"<p><p>The forced-choice (FC) item formats used for noncognitive tests typically develop a set of response options that measure different traits and instruct respondents to make judgments among these options in terms of their preference to control the response biases that are commonly observed in normative tests. Diagnostic classification models (DCMs) can provide information regarding the mastery status of test takers on latent discrete variables and are more commonly used for cognitive tests employed in educational settings than for noncognitive tests. The purpose of this study is to develop a new class of DCM for FC items under the higher-order DCM framework to meet the practical demands of simultaneously controlling for response biases and providing diagnostic classification information. By conducting a series of simulations and calibrating the model parameters with a Bayesian estimation, the study shows that, in general, the model parameters can be recovered satisfactorily with the use of long tests and large samples. More attributes improve the precision of the second-order latent trait estimation in a long test, but decrease the classification accuracy and the estimation quality of the structural parameters. When statements are allowed to load on two distinct attributes in paired comparison items, the specific-attribute condition produces better a parameter estimation than the overlap-attribute condition. Finally, an empirical analysis related to work-motivation measures is presented to demonstrate the applications and implications of the new model.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"83 1","pages":"146-180"},"PeriodicalIF":2.7,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/5c/8c/10.1177_00131644211069906.PMC9806518.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10489721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-02-01 | Epub Date: 2022-01-31 | DOI: 10.1177/00131644211073121
Charles L Fisk, Jeffrey R Harring, Zuchao Shen, Walter Leite, King Yiu Suen, Katerina M Marcoulides
Sensitivity analyses encompass a broad set of post-analytic techniques that measure the potential impact of any factor that affects a model's output variables. This research focuses on the utility of the simulated annealing algorithm for automatically identifying path configurations and parameter values of omitted confounders in structural equation modeling (SEM). An empirical example based on a past published study is used to illustrate how strongly related an omitted variable must be to model variables for the conclusions of an analysis to change. The algorithm is outlined in detail, and the results stemming from the sensitivity analysis are discussed.
{"title":"Using Simulated Annealing to Investigate Sensitivity of SEM to External Model Misspecification.","authors":"Charles L Fisk, Jeffrey R Harring, Zuchao Shen, Walter Leite, King Yiu Suen, Katerina M Marcoulides","doi":"10.1177/00131644211073121","DOIUrl":"10.1177/00131644211073121","url":null,"abstract":"<p><p>Sensitivity analyses encompass a broad set of post-analytic techniques that are characterized as measuring the potential impact of any factor that has an effect on some output variables of a model. This research focuses on the utility of the simulated annealing algorithm to automatically identify path configurations and parameter values of omitted confounders in structural equation modeling (SEM). An empirical example based on a past published study is used to illustrate how strongly related an omitted variable must be to model variables for the conclusions of an analysis to change. The algorithm is outlined in detail and the results stemming from the sensitivity analysis are discussed.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"83 1","pages":"73-92"},"PeriodicalIF":2.7,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9806519/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10494315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-02-01 | Epub Date: 2022-03-11 | DOI: 10.1177/00131644221080451
Kyle Cox, Benjamin Kelcey
Multilevel structural equation models (MSEMs) are well suited for educational research because they accommodate complex systems involving latent variables in multilevel settings. Estimation using Croon's bias-corrected factor score (BCFS) path estimation has recently been extended to MSEMs and has demonstrated promise with limited sample sizes. This makes it well suited for planned educational research, which often involves sample sizes constrained by logistical and financial factors. However, the performance of BCFS estimation with MSEMs has yet to be thoroughly explored under common but difficult conditions, including non-normal indicators and model misspecifications. We conducted two simulation studies to evaluate the accuracy and efficiency of the estimator under these conditions. Results suggest that BCFS estimation of MSEMs is often more dependable, more efficient, and less biased than other estimation approaches when sample sizes are limited or model misspecifications are present, but is more susceptible to indicator non-normality. These results support, supplement, and elucidate previous literature describing the effective performance of BCFS estimation, encouraging its use as an alternative or supplemental estimator for MSEMs.
{"title":"Croon's Bias-Corrected Estimation for Multilevel Structural Equation Models with Non-Normal Indicators and Model Misspecifications.","authors":"Kyle Cox, Benjamin Kelcey","doi":"10.1177/00131644221080451","DOIUrl":"10.1177/00131644221080451","url":null,"abstract":"<p><p>Multilevel structural equation models (MSEMs) are well suited for educational research because they accommodate complex systems involving latent variables in multilevel settings. Estimation using Croon's bias-corrected factor score (BCFS) path estimation has recently been extended to MSEMs and demonstrated promise with limited sample sizes. This makes it well suited for planned educational research which often involves sample sizes constrained by logistical and financial factors. However, the performance of BCFS estimation with MSEMs has yet to be thoroughly explored under common but difficult conditions including in the presence of non-normal indicators and model misspecifications. We conducted two simulation studies to evaluate the accuracy and efficiency of the estimator under these conditions. Results suggest that BCFS estimation of MSEMs is often more dependable, more efficient, and less biased than other estimation approaches when sample sizes are limited or model misspecifications are present but is more susceptible to indicator non-normality. These results support, supplement, and elucidate previous literature describing the effective performance of BCFS estimation encouraging its utilization as an alternative or supplemental estimator for MSEMs.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"83 1","pages":"48-72"},"PeriodicalIF":2.7,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9806522/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10489718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2023-02-01 | Epub Date: 2022-03-07 | DOI: 10.1177/00131644221082688
Hope O Akaeze, Frank R Lawrence, Jamie Heng-Chieh Wu
Multidimensionality and hierarchical data structure are common in assessment data. These design features, if not accounted for, can threaten the validity of the results and inferences generated from factor analysis, a method frequently employed to assess test dimensionality. In this article, we describe and demonstrate the application of the multilevel bifactor model to address these features in examining test dimensionality. The tool for this exposition is the Child Observation Record Advantage 1.5 (COR-Adv1.5), a child assessment instrument widely used in Head Start programs. Previous studies on this assessment tool reported highly correlated factors and did not account for the nesting of children in classrooms. Results from this study show how the flexibility of the multilevel bifactor model, together with useful model-based statistics, can be harnessed to judge the dimensionality of a test instrument and inform the interpretability of the associated factor scores.
{"title":"Resolving Dimensionality in a Child Assessment Tool: An Application of the Multilevel Bifactor Model.","authors":"Hope O Akaeze, Frank R Lawrence, Jamie Heng-Chieh Wu","doi":"10.1177/00131644221082688","DOIUrl":"10.1177/00131644221082688","url":null,"abstract":"<p><p>Multidimensionality and hierarchical data structure are common in assessment data. These design features, if not accounted for, can threaten the validity of the results and inferences generated from factor analysis, a method frequently employed to assess test dimensionality. In this article, we describe and demonstrate the application of the multilevel bifactor model to address these features in examining test dimensionality. The tool for this exposition is the Child Observation Record Advantage 1.5 (COR-Adv1.5), a child assessment instrument widely used in Head Start programs. Previous studies on this assessment tool reported highly correlated factors and did not account for the nesting of children in classrooms. Results from this study show how the flexibility of the multilevel bifactor model, together with useful model-based statistics, can be harnessed to judge the dimensionality of a test instrument and inform the interpretability of the associated factor scores.</p>","PeriodicalId":11502,"journal":{"name":"Educational and Psychological Measurement","volume":"83 1","pages":"93-115"},"PeriodicalIF":2.7,"publicationDate":"2023-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9806520/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10494318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}