Pub Date: 2023-02-08 | DOI: 10.3102/10769986221149140
D. Y. Lee, Jeffrey R. Harring
A Monte Carlo simulation was performed to compare methods for handling missing data in growth mixture models. The methods considered in the current study were (a) a fully Bayesian approach using a Gibbs sampler, (b) full information maximum likelihood using the expectation–maximization algorithm, (c) multiple imputation, (d) a two-stage multiple imputation method, and (e) listwise deletion. Of the five methods, it was found that the Bayesian approach and two-stage multiple imputation methods generally produce less biased parameter estimates compared to maximum likelihood or single imputation methods, although key differences were observed. Similarities and disparities among methods are highlighted and general recommendations articulated.
"Handling Missing Data in Growth Mixture Models." Journal of Educational and Behavioral Statistics, 48(1), 320–348.
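A minimal sketch of why listwise deletion is biased under a missing-at-random (MAR) mechanism while imputation that conditions on observed data is not. This toy example (two waves of an outcome, a logistic missingness model on the first wave, and stochastic regression imputation) is an assumption chosen for illustration and is not the growth-mixture simulation design used in the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Two "waves" of an outcome: the second wave depends on the first.
y1 = rng.normal(0.0, 1.0, n)
y2 = 0.5 * y1 + rng.normal(0.0, 1.0, n)           # true mean of y2 is 0

# MAR dropout: wave 2 is more likely to be missing when wave 1 is high.
p_miss = 1.0 / (1.0 + np.exp(-(y1 - 0.5)))        # logistic missingness model (assumed)
observed = rng.uniform(size=n) > p_miss

# (a) Listwise deletion: average only observed wave-2 scores -- biased under MAR,
#     because retained cases have systematically lower y1 (and hence lower y2).
mean_listwise = y2[observed].mean()

# (b) Stochastic regression imputation conditioning on the fully observed wave 1 --
#     approximately unbiased here, because missingness depends only on y1.
slope, intercept = np.polyfit(y1[observed], y2[observed], deg=1)
y2_filled = y2.copy()
n_miss = int((~observed).sum())
y2_filled[~observed] = intercept + slope * y1[~observed] + rng.normal(0.0, 1.0, n_miss)

print(f"true mean of y2:         {y2.mean(): .3f}")
print(f"listwise-deletion mean:  {mean_listwise: .3f}")
print(f"imputation-based mean:   {y2_filled.mean(): .3f}")
```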
Pub Date: 2023-01-11 | DOI: 10.3102/10769986221144727
Ehsan Bokhari
The prediction of dangerous and/or violent behavior is particularly important to the conduct of the U.S. criminal justice system when it makes decisions about restrictions of personal freedom, such as preventive detention, forensic commitment, and parole, and, in some states such as Texas, when deciding whether to permit the execution of an individual found guilty of a capital crime to proceed. This article discusses the prediction of dangerous behavior through both clinical judgment and actuarial assessment. The general conclusion drawn is that for both clinical and actuarial prediction of dangerous behavior, we are far from a level of accuracy that could justify routine use. To support this latter negative assessment, two topic areas are emphasized: (1) the MacArthur Study of Mental Disorder and Violence, including the actuarial instrument developed as part of this project (the Classification of Violence Risk), along with all the data collected that helped develop the instrument; and (2) the U.S. Supreme Court case of Barefoot v. Estelle (1983) and the American Psychiatric Association “friend of the court” brief on the (in)accuracy of clinical prediction for the commission of future violence. Although now four decades old, Barefoot v. Estelle is still the controlling Supreme Court opinion regarding the prediction of future dangerous behavior and the imposition of the death penalty in states such as Texas; for example, see Coble v. Texas (2011) and the Supreme Court denial of certiorari in that case.
"Clinical (In)Efficiency in the Prediction of Dangerous Behavior." Journal of Educational and Behavioral Statistics, 48(1), 661–682.
Pub Date: 2023-01-09 | DOI: 10.3102/10769986221143515
J. Lang
This article is concerned with the statistical detection of copying on multiple-choice exams. As an alternative to existing permutation- and model-based copy-detection approaches, a simple randomization p-value (RP) test is proposed. The RP test, which is based on an intuitive match-score statistic, makes no assumptions about the distribution of examinees’ answer vectors and hence is broadly applicable. Especially important in this copy-detection setting, the RP test is shown to be exact in that its size is guaranteed to be no larger than a nominal α value. Additionally, simulation results suggest that the RP test is typically more powerful for copy detection than the existing approximate tests. The development of the RP test is based on the idea that the copy-detection problem can be recast as a causal inference and missing data problem. In particular, the observed data are viewed as a subset of a larger collection of potential values, or counterfactuals, and the null hypothesis of “no copying” is viewed as a “no causal effect” hypothesis and formally expressed in terms of constraints on potential variables.
"A Randomization P-Value Test for Detecting Copying on Multiple-Choice Exams." Journal of Educational and Behavioral Statistics, 48(1), 296–319.
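A minimal sketch of the Monte Carlo machinery behind an exact randomization p-value for a match-score statistic: with B answer vectors simulated under a "no copying" null model, the p-value (count + 1)/(B + 1) has size no larger than the nominal level. The match-score statistic (number of identical answers) and the option-proportion null model below are illustrative assumptions, not the potential-outcomes construction developed in the article.

```python
import numpy as np

def match_score(a, b):
    """Number of items on which two answer vectors agree (a simple match-score statistic)."""
    return int(np.sum(np.asarray(a) == np.asarray(b)))

def mc_randomization_pvalue(copier, source, simulate_null, n_sim=9999, rng=None):
    """Monte Carlo p-value: P(simulated match score >= observed), with the standard
    (count + 1) / (n_sim + 1) correction that keeps the test valid."""
    rng = np.random.default_rng(rng)
    t_obs = match_score(copier, source)
    exceed = sum(match_score(simulate_null(rng), source) >= t_obs for _ in range(n_sim))
    return (exceed + 1) / (n_sim + 1)

# Toy null model (hypothetical): under "no copying", the examinee picks each option
# independently with class-level option proportions.
option_probs = np.array([   # rows = items, columns = options A-D
    [0.6, 0.2, 0.1, 0.1],
    [0.3, 0.4, 0.2, 0.1],
    [0.5, 0.3, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.7, 0.1, 0.1, 0.1],
])

def simulate_null(rng):
    return np.array([rng.choice(4, p=p) for p in option_probs])

source = np.array([0, 1, 0, 1, 0])
copier = np.array([0, 1, 0, 1, 0])   # identical answers on all five items
print(mc_randomization_pvalue(copier, source, simulate_null, n_sim=9999, rng=1))
```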
Pub Date: 2022-11-27 | DOI: 10.3102/10769986221133088
Yu Wang, Chia-Yi Chiu, Hans-Friedrich Köhn
The multiple-choice (MC) item format has been widely used in educational assessments across diverse content domains. MC items purportedly allow for collecting richer diagnostic information. The effectiveness and economy of administering MC items may have further contributed to their popularity, not just in educational assessment. The MC item format has also been adapted to the cognitive diagnosis (CD) framework. Early approaches simply dichotomized the responses and analyzed them with a CD model for binary responses; obviously, this strategy cannot exploit the additional diagnostic information provided by MC items. De la Torre’s MC Deterministic Inputs, Noisy “And” Gate (MC-DINA) model was the first to explicitly model items in the MC response format. As a drawback, however, the attribute vectors of the distractors are restricted to be nested within the key and within each other. The method presented in this article for the CD of DINA items in the MC response format does not require such constraints. Another contribution of the proposed method is its implementation using a nonparametric classification algorithm, which makes it especially suitable for small-sample settings such as classrooms, where CD is most needed for monitoring instruction and student learning. In contrast, default parametric CD estimation routines that rely on EM- or MCMC-based algorithms, despite their effectiveness and efficiency with large samples, cannot guarantee stable and reliable estimates when sample sizes are insufficient. Results of simulation studies and a real-world application are also reported.
"Nonparametric Classification Method for Multiple-Choice Items in Cognitive Diagnosis." Journal of Educational and Behavioral Statistics, 48(1), 189–219.
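A minimal sketch of nonparametric classification for the binary-response case, assuming a DINA-type ideal-response rule: each examinee is assigned the attribute profile whose ideal response pattern is closest in Hamming distance to the observed responses. The Q-matrix and responses are hypothetical; the article extends this idea to exploit distractor information in MC items.

```python
import numpy as np
from itertools import product

def ideal_response(q_matrix, alpha):
    """DINA-type ideal response: an item is answered correctly only if the examinee
    masters every attribute the item requires."""
    return np.all(alpha >= q_matrix, axis=1).astype(int)

def npc_classify(responses, q_matrix):
    """Nonparametric classification: pick the attribute profile whose ideal response
    pattern is closest (Hamming distance) to the observed dichotomous responses."""
    k = q_matrix.shape[1]
    profiles = np.array(list(product([0, 1], repeat=k)))
    distances = [np.sum(np.abs(responses - ideal_response(q_matrix, a))) for a in profiles]
    return profiles[int(np.argmin(distances))]

# Hypothetical 4-item, 2-attribute Q-matrix and one examinee's scored responses.
Q = np.array([[1, 0],
              [0, 1],
              [1, 1],
              [1, 0]])
x = np.array([1, 0, 0, 1])            # correct only on items requiring attribute 1 alone
print(npc_classify(x, Q))             # -> [1 0]
```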
Pub Date: 2022-11-07 | DOI: 10.3102/10769986221128810
N. Waller
Although many textbooks on multivariate statistics discuss the common factor analysis model, few of these books mention the problem of factor score indeterminacy (FSI). Thus, many students and contemporary researchers are unaware of an important fact: for any common factor model with known (or estimated) model parameters, infinitely many sets of factor scores can be constructed to fit the model. Because all sets are mathematically exchangeable, factor scores are indeterminate. Our professional silence on this topic is difficult to explain given that FSI was first noted almost 100 years ago by E. B. Wilson, the 24th president (1929) of the American Statistical Association. To help disseminate Wilson’s insights, we demonstrate the underlying mathematics of FSI using the language of finite-dimensional vector spaces and well-known ideas of regression theory. We then illustrate the numerical implications of FSI by describing new and easily implemented methods for transforming factor scores into alternative sets of factor scores. An online supplement (and the fungible R library) includes R functions for illustrating FSI.
"Breaking Our Silence on Factor Score Indeterminacy." Journal of Educational and Behavioral Statistics, 48(1), 244–261.
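A minimal sketch of the numerical side of FSI for a single-factor model with standardized indicators: Guttman's determinacy measure rho^2 = lambda' Sigma^{-1} lambda and the associated minimum correlation 2*rho^2 - 1 between two equally valid sets of factor scores. The loadings are hypothetical, and this is a standard result rather than the article's specific transformation method.

```python
import numpy as np

# Hypothetical one-factor model: 6 standardized indicators with these loadings.
loadings = np.array([0.8, 0.7, 0.6, 0.6, 0.5, 0.4])
uniquenesses = 1.0 - loadings**2                                # standardized indicators
sigma = np.outer(loadings, loadings) + np.diag(uniquenesses)    # model-implied covariance

# Squared multiple correlation of the factor on the observed variables
# (the determinacy measure for a single factor with unit variance).
rho2 = float(loadings @ np.linalg.solve(sigma, loadings))

# Minimum possible correlation between two sets of factor scores that both
# satisfy the model: 2 * rho^2 - 1.
min_corr = 2.0 * rho2 - 1.0

print(f"rho^2 (determinacy)           = {rho2:.3f}")
print(f"minimum correlation of scores = {min_corr:.3f}")
```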
Pub Date: 2022-10-17 | DOI: 10.3102/10769986221127379
M. H. Vembye, J. Pustejovsky, T. Pigott
Meta-analytic models for dependent effect sizes have grown increasingly sophisticated over the last few decades, which has created challenges for a priori power calculations. We introduce power approximations for tests of average effect sizes based upon several common approaches for handling dependent effect sizes. In a Monte Carlo simulation, we show that the new power formulas can accurately approximate the true power of meta-analytic models for dependent effect sizes. Lastly, we investigate the Type I error rate and power for several common models, finding that tests using robust variance estimation provide better Type I error calibration than tests with model-based variance estimation. We consider implications for practice with respect to selecting a working model and an inferential approach.
"Power Approximations for Overall Average Effects in Meta-Analysis With Dependent Effect Sizes." Journal of Educational and Behavioral Statistics, 48(1), 70–102.
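A minimal sketch of an a priori power approximation for a two-sided test of the overall average effect, treating the estimator as t-distributed with a user-supplied standard error and degrees of freedom. The equal-weight random-effects standard error and df = m - 1 are simplifying assumptions for illustration, not the working models or small-sample corrections evaluated in the article.

```python
import numpy as np
from scipy import stats

def approx_power(mu, se, df, alpha=0.05):
    """Approximate power of a two-sided t-test of H0: average effect = 0,
    treating the estimator as t-distributed with the given df and
    noncentrality mu / se."""
    ncp = mu / se
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    return stats.nct.sf(t_crit, df, ncp) + stats.nct.cdf(-t_crit, df, ncp)

# Hypothetical setting: m studies, each contributing an (aggregated) effect estimate
# with sampling variance v, plus between-study variance tau2; simple equal-weight
# random-effects standard error for the overall average.
m, v, tau2, mu = 20, 0.04, 0.05, 0.2
se = np.sqrt((v + tau2) / m)
print(f"approximate power: {approx_power(mu, se, df=m - 1):.3f}")
```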
Pub Date: 2022-10-06 | DOI: 10.3102/10769986221126747
Ziwei Zhang, Corissa T. Rohloff, N. Kohli
To model growth over time, statistical techniques are available in both structural equation modeling (SEM) and random effects modeling frameworks. Liu et al. proposed a transformation and an inverse transformation for the linear–linear piecewise growth model with an unknown random knot, an intrinsically nonlinear function, in the SEM framework. This method allowed for the incorporation of time-invariant covariates. While the proposed method made novel contributions in this area of research, the use of transformations introduces some challenges to model estimation and dissemination. This commentary aims to illustrate the significant contributions of the authors’ proposed method in the SEM framework, along with presenting the challenges involved in implementing this method and opportunities available in an alternative framework.
"Commentary on “Obtaining Interpretable Parameters From Reparameterized Longitudinal Models: Transformation Matrices Between Growth Factors in Two Parameter Spaces”." Journal of Educational and Behavioral Statistics, 48(1), 262–268.
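For readers unfamiliar with the model under discussion, a minimal sketch of the linear-linear piecewise growth function with a knot: its max(0, t - knot) term is what makes the model intrinsically nonlinear in the knot parameter and motivates the reparameterization the commentary addresses. Parameter values are hypothetical.

```python
import numpy as np

def linear_linear(t, beta0, beta1, beta2, knot):
    """Linear-linear piecewise growth curve: intercept beta0, pre-knot slope beta1,
    and a slope change of beta2 at the (possibly person-specific) knot."""
    t = np.asarray(t, dtype=float)
    return beta0 + beta1 * t + beta2 * np.maximum(0.0, t - knot)

# Hypothetical trajectory: growth slows from slope 2.0 to 0.5 after time 3.
times = np.arange(0, 7)
print(linear_linear(times, beta0=10.0, beta1=2.0, beta2=-1.5, knot=3.0))
```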
Pub Date: 2022-10-03 | DOI: 10.3102/10769986221126741
Qingrong Tan, Yan Cai, Fen Luo, Dongbo Tu
To improve the calibration accuracy and efficiency of cognitive diagnostic computerized adaptive testing (CD-CAT) for new items and, ultimately, contribute to the widespread application of CD-CAT in practice, the current article proposes a Gini-based online calibration method that can simultaneously calibrate the Q-matrix and item parameters of new items. Three simulation studies with simulated and real item banks were conducted to investigate the performance of the proposed method and compare it with the joint estimation algorithm (JEA) and the single-item estimation (SIE) method. The results indicated that the proposed Gini-based online calibration method yielded higher calibration efficiency than the SIE method and outperformed the JEA method on item calibration tasks in terms of both accuracy and efficiency under most experimental conditions.
"Development of a High-Accuracy and Effective Online Calibration Method in CD-CAT Based on Gini Index." Journal of Educational and Behavioral Statistics, 48(1), 103–141.
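The article's Gini-based criterion is not reproduced here; the sketch below only illustrates the general idea of scoring candidate q-vectors for a new item by the weighted Gini impurity of the correct/incorrect split between examinees whose provisional attribute profiles satisfy the q-vector and those whose profiles do not. All data and the splitting rule are illustrative assumptions.

```python
import numpy as np
from itertools import product

def gini_impurity(y):
    """Gini impurity of a binary response vector: 1 - sum_k p_k^2."""
    if len(y) == 0:
        return 0.0
    p = np.mean(y)
    return 1.0 - (p**2 + (1.0 - p)**2)

def score_q_vector(q, responses, attribute_profiles):
    """Weighted Gini impurity after splitting examinees by whether their
    (provisionally estimated) attribute profile satisfies the candidate q-vector."""
    satisfies = np.all(attribute_profiles >= q, axis=1)
    n = len(responses)
    return (satisfies.sum() / n) * gini_impurity(responses[satisfies]) + \
           ((~satisfies).sum() / n) * gini_impurity(responses[~satisfies])

# Hypothetical: pick the candidate q-vector (over 2 attributes, excluding the zero
# vector) that best separates correct from incorrect responses to a new item.
rng = np.random.default_rng(0)
profiles = rng.integers(0, 2, size=(200, 2))              # provisional attribute profiles
true_q = np.array([1, 0])
responses = (np.all(profiles >= true_q, axis=1) & (rng.uniform(size=200) < 0.9)).astype(int)

candidates = [np.array(q) for q in product([0, 1], repeat=2) if any(q)]
best = min(candidates, key=lambda q: score_q_vector(q, responses, profiles))
print(best)       # expected to recover [1 0] in this toy setup
```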
Pub Date: 2022-08-17 | DOI: 10.3102/10769986221116905
Harold C. Doran
This article is concerned with a subset of numerically stable and scalable algorithms useful for supporting computationally complex psychometric models in the era of machine learning and massive data. The subset selected here is a core set of numerical methods that should be familiar to computational psychometricians and covers whitening transforms for dealing with correlated data, computational concepts for linear models, multivariable integration, and optimization techniques.
"A Collection of Numerical Recipes Useful for Building Scalable Psychometric Applications." Journal of Educational and Behavioral Statistics, 48(1), 37–69.
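A minimal sketch of one of the recipes mentioned in the abstract, a whitening transform for correlated data, using the Cholesky factor of the sample covariance; eigen- or ZCA-based whitening differs only by a rotation. The bivariate example data are an assumption for illustration.

```python
import numpy as np

def cholesky_whiten(x):
    """Whitening transform: decorrelate the columns of x so the transformed data have
    (approximately) identity covariance. Uses the Cholesky factor of the sample covariance."""
    xc = x - x.mean(axis=0)
    cov = np.cov(xc, rowvar=False)
    L = np.linalg.cholesky(cov)                  # cov = L @ L.T
    return np.linalg.solve(L, xc.T).T            # equivalent to xc @ inv(L).T

rng = np.random.default_rng(0)
corr = np.array([[1.0, 0.8], [0.8, 1.0]])
x = rng.multivariate_normal([0.0, 0.0], corr, size=5000)
z = cholesky_whiten(x)
print(np.round(np.cov(z, rowvar=False), 2))      # ~ identity matrix
```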
Pub Date: 2022-08-17 | DOI: 10.3102/10769986221115446
Weicong Lyu, Jee-Seon Kim, Youmi Suk
This article presents a latent class model for multilevel data to identify latent subgroups and estimate heterogeneous treatment effects. Unlike sequential approaches that partition data first and then estimate average treatment effects (ATEs) within classes, we employ a Bayesian procedure to jointly estimate the mixing probability, selection, and outcome models so that misclassification does not obstruct estimation of treatment effects. Simulation demonstrates that the proposed method finds the correct number of latent classes, estimates class-specific treatment effects well, and provides proper posterior standard deviations and credible intervals of ATEs. We apply this method to Trends in International Mathematics and Science Study data to investigate the effects of private science lessons on achievement scores and find two latent classes, one with a zero ATE and the other with a positive ATE.
"Estimating Heterogeneous Treatment Effects Within Latent Class Multilevel Models: A Bayesian Approach." Journal of Educational and Behavioral Statistics, 48(1), 3–36.
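A minimal sketch of class-specific ATE computation given posterior class-membership probabilities, using probability-weighted differences in means. This is a simplified stand-in for the article's joint Bayesian estimation: the data-generating values and the 0.9/0.1 membership probabilities are assumptions, and the attenuated estimates illustrate why misclassification matters when classes and effects are estimated separately.

```python
import numpy as np

def class_specific_ates(y, treat, post_probs):
    """Class-specific average treatment effects computed as weighted differences in
    means, using posterior class-membership probabilities as weights."""
    ates = []
    for c in range(post_probs.shape[1]):
        w = post_probs[:, c]
        treated_mean = np.average(y[treat == 1], weights=w[treat == 1])
        control_mean = np.average(y[treat == 0], weights=w[treat == 0])
        ates.append(treated_mean - control_mean)
    return np.array(ates)

# Hypothetical data: class 0 has zero treatment effect, class 1 has effect +2.
rng = np.random.default_rng(0)
n = 4000
cls = rng.integers(0, 2, n)
treat = rng.integers(0, 2, n)
y = rng.normal(0.0, 1.0, n) + treat * np.where(cls == 1, 2.0, 0.0)

# Pretend a sampler recovered membership with some uncertainty (0.9 / 0.1 probabilities).
post = np.column_stack([np.where(cls == 0, 0.9, 0.1), np.where(cls == 0, 0.1, 0.9)])
print(np.round(class_specific_ates(y, treat, post), 2))
# Roughly [0.2, 1.8] rather than the true [0, 2]: membership uncertainty attenuates
# the contrast, which is the motivation for estimating classes and effects jointly.
```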