Nonparametric Classification Method for Multiple-Choice Items in Cognitive Diagnosis
Pub Date: 2022-11-27 | DOI: 10.3102/10769986221133088
Yu Wang, Chia-Yi Chiu, Hans-Friedrich Köhn
The multiple-choice (MC) item format has been widely used in educational assessments across diverse content domains. MC items purportedly allow richer diagnostic information to be collected, and the effectiveness and economy of administering them may have contributed further to their popularity, not only in educational assessment. The MC item format has also been adapted to the cognitive diagnosis (CD) framework. Early approaches simply dichotomized the responses and analyzed them with a CD model for binary responses, a strategy that clearly cannot exploit the additional diagnostic information MC items provide. De la Torre’s MC Deterministic Inputs, Noisy “And” Gate (MC-DINA) model was the first developed explicitly for the analysis of items with an MC response format. Its drawback, however, is that the attribute vectors of the distractors are restricted to be nested within the key and within each other. The method presented in this article for the CD of DINA items with an MC response format requires no such constraints. A further contribution of the proposed method is its implementation as a nonparametric classification algorithm, which makes it particularly suitable for small-sample settings such as classrooms, where CD is most needed for monitoring instruction and student learning. By contrast, the default parametric CD estimation routines relying on EM- or MCMC-based algorithms, although effective and efficient when samples are large, cannot guarantee stable and reliable estimates in small samples, where computational feasibility becomes an issue. Results of simulation studies and a real-world application are also reported.
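The abstract does not detail the classification rule, but the general nonparametric classification (NPC) idea it builds on can be sketched compactly: enumerate candidate attribute profiles, derive each profile's ideal responses under the DINA rule, and assign the examinee to the profile minimizing the Hamming distance to the observed responses. The sketch below shows this for binary DINA responses; the article's contribution is the extension to MC options with unconstrained distractor attribute vectors. All names are illustrative.

```python
import itertools
import numpy as np

def ideal_response(alpha, Q):
    """DINA ideal responses: 1 iff the examinee masters every attribute
    the item requires (row of the Q-matrix)."""
    return np.all(alpha >= Q, axis=1).astype(int)

def npc_classify(y, Q):
    """Assign the attribute profile whose ideal-response vector is closest
    (in Hamming distance) to the observed response vector y."""
    K = Q.shape[1]
    best, best_d = None, np.inf
    for alpha in itertools.product([0, 1], repeat=K):
        alpha = np.asarray(alpha)
        d = int(np.sum(y != ideal_response(alpha, Q)))
        if d < best_d:
            best, best_d = alpha, d
    return best, best_d

# Toy example: 4 items, 2 attributes
Q = np.array([[1, 0], [0, 1], [1, 1], [1, 0]])
y = np.array([1, 0, 0, 1])      # pattern consistent with mastering attribute 1 only
print(npc_classify(y, Q))       # -> (array([1, 0]), 0)
```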
{"title":"Nonparametric Classification Method for Multiple-Choice Items in Cognitive Diagnosis","authors":"Yu Wang, Chia-Yi Chiu, Hans-Friedrich Köhn","doi":"10.3102/10769986221133088","DOIUrl":"https://doi.org/10.3102/10769986221133088","url":null,"abstract":"The multiple-choice (MC) item format has been widely used in educational assessments across diverse content domains. MC items purportedly allow for collecting richer diagnostic information. The effectiveness and economy of administering MC items may have further contributed to their popularity not just in educational assessment. The MC item format has also been adapted to the cognitive diagnosis (CD) framework. Early approaches simply dichotomized the responses and analyzed them with a CD model for binary responses. Obviously, this strategy cannot exploit the additional diagnostic information provided by MC items. De la Torre’s MC Deterministic Inputs, Noisy “And” Gate (MC-DINA) model was the first for the explicit analysis of items having MC response format. However, as a drawback, the attribute vectors of the distractors are restricted to be nested within the key and each other. The method presented in this article for the CD of DINA items having MC response format does not require such constraints. Another contribution of the proposed method concerns its implementation using a nonparametric classification algorithm, which predestines it for use especially in small-sample settings like classrooms, where CD is most needed for monitoring instruction and student learning. In contrast, default parametric CD estimation routines that rely on EM- or MCMC-based algorithms cannot guarantee stable and reliable estimates—despite their effectiveness and efficiency when samples are large—due to computational feasibility issues caused by insufficient sample sizes. Results of simulation studies and a real-world application are also reported.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"48 1","pages":"189 - 219"},"PeriodicalIF":2.4,"publicationDate":"2022-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"69397792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Breaking Our Silence on Factor Score Indeterminacy
Pub Date: 2022-11-07 | DOI: 10.3102/10769986221128810
N. Waller
Although many textbooks on multivariate statistics discuss the common factor analysis model, few of these books mention the problem of factor score indeterminacy (FSI). Thus, many students and contemporary researchers are unaware of an important fact: for any common factor model with known (or estimated) model parameters, infinitely many sets of factor scores can be constructed to fit the model. Because all of these sets are mathematically exchangeable, factor scores are indeterminate. Our professional silence on this topic is difficult to explain, given that FSI was first noted almost 100 years ago by E. B. Wilson, the 24th president (1929) of the American Statistical Association. To help disseminate Wilson’s insights, we demonstrate the underlying mathematics of FSI using the language of finite-dimensional vector spaces and well-known ideas of regression theory. We then illustrate the numerical implications of FSI by describing new and easily implemented methods for transforming factor scores into alternative sets of factor scores. An online supplement (and the fungible R library) includes R functions for illustrating FSI.
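One standard way to exhibit the indeterminacy is Guttman's construction, stated here for a single-factor model with standardized variables (the article's transformation methods go beyond this). With $\Sigma = \lambda\lambda' + \Psi$, every score variable of the form

$$ f = \lambda'\Sigma^{-1}\mathbf{x} + \sqrt{1 - \lambda'\Sigma^{-1}\lambda}\; s, $$

where $s$ has zero mean, unit variance, and zero correlation with $\mathbf{x}$, satisfies $\mathrm{Cov}(\mathbf{x}, f) = \lambda$ and $\mathrm{Var}(f) = 1$. Because $s$ is otherwise arbitrary, infinitely many mathematically exchangeable sets of factor scores reproduce the model.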
{"title":"Breaking Our Silence on Factor Score Indeterminacy","authors":"N. Waller","doi":"10.3102/10769986221128810","DOIUrl":"https://doi.org/10.3102/10769986221128810","url":null,"abstract":"Although many textbooks on multivariate statistics discuss the common factor analysis model, few of these books mention the problem of factor score indeterminacy (FSI). Thus, many students and contemporary researchers are unaware of an important fact. Namely, for any common factor model with known (or estimated) model parameters, infinite sets of factor scores can be constructed to fit the model. Because all sets are mathematically exchangeable, factor scores are indeterminate. Our professional silence on this topic is difficult to explain given that FSI was first noted almost 100 years ago by E. B. Wilson, the 24th president (1929) of the American Statistical Association. To help disseminate Wilson’s insights, we demonstrate the underlying mathematics of FSI using the language of finite-dimensional vector spaces and well-known ideas of regression theory. We then illustrate the numerical implications of FSI by describing new and easily implemented methods for transforming factor scores into alternative sets of factor scores. An online supplement (and the fungible R library) includes R functions for illustrating FSI.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"48 1","pages":"244 - 261"},"PeriodicalIF":2.4,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44687727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Power Approximations for Overall Average Effects in Meta-Analysis With Dependent Effect Sizes
Pub Date: 2022-10-17 | DOI: 10.3102/10769986221127379
M. H. Vembye, J. Pustejovsky, T. Pigott
Meta-analytic models for dependent effect sizes have grown increasingly sophisticated over the last few decades, which has created challenges for a priori power calculations. We introduce power approximations for tests of average effect sizes based upon several common approaches for handling dependent effect sizes. In a Monte Carlo simulation, we show that the new power formulas can accurately approximate the true power of meta-analytic models for dependent effect sizes. Lastly, we investigate the Type I error rate and power for several common models, finding that tests using robust variance estimation provide better Type I error calibration than tests with model-based variance estimation. We consider implications for practice with respect to selecting a working model and an inferential approach.
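For intuition, here is a minimal normal-approximation sketch of power for a two-sided test of the overall average effect under a plain random-effects working model; the article's formulas additionally account for dependence among effect sizes and for the chosen variance estimator, and the function and inputs below are illustrative.

```python
import numpy as np
from scipy.stats import norm

def approx_power(mu, tau2, v_bar, n_studies, alpha=0.05):
    """Normal-approximation power for a two-sided z-test of H0: mu = 0.
    mu: true average effect; tau2: between-study variance;
    v_bar: average sampling variance of the effect estimates."""
    se = np.sqrt((tau2 + v_bar) / n_studies)   # SE of the pooled average effect
    lam = mu / se                              # noncentrality of the z-test
    z = norm.ppf(1 - alpha / 2)
    return norm.cdf(lam - z) + norm.cdf(-lam - z)

print(round(approx_power(mu=0.2, tau2=0.05, v_bar=0.04, n_studies=40), 3))
```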
{"title":"Power Approximations for Overall Average Effects in Meta-Analysis With Dependent Effect Sizes","authors":"M. H. Vembye, J. Pustejovsky, T. Pigott","doi":"10.3102/10769986221127379","DOIUrl":"https://doi.org/10.3102/10769986221127379","url":null,"abstract":"Meta-analytic models for dependent effect sizes have grown increasingly sophisticated over the last few decades, which has created challenges for a priori power calculations. We introduce power approximations for tests of average effect sizes based upon several common approaches for handling dependent effect sizes. In a Monte Carlo simulation, we show that the new power formulas can accurately approximate the true power of meta-analytic models for dependent effect sizes. Lastly, we investigate the Type I error rate and power for several common models, finding that tests using robust variance estimation provide better Type I error calibration than tests with model-based variance estimation. We consider implications for practice with respect to selecting a working model and an inferential approach.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"48 1","pages":"70 - 102"},"PeriodicalIF":2.4,"publicationDate":"2022-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47190874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Commentary on “Obtaining Interpretable Parameters From Reparameterized Longitudinal Models: Transformation Matrices Between Growth Factors in Two Parameter Spaces”
Pub Date: 2022-10-06 | DOI: 10.3102/10769986221126747
Ziwei Zhang, Corissa T. Rohloff, N. Kohli
To model growth over time, statistical techniques are available in both structural equation modeling (SEM) and random effects modeling frameworks. Liu et al. proposed a transformation and an inverse transformation for the linear–linear piecewise growth model with an unknown random knot, an intrinsically nonlinear function, in the SEM framework. This method allowed for the incorporation of time-invariant covariates. While the proposed method made novel contributions in this area of research, the use of transformations introduces some challenges to model estimation and dissemination. This commentary aims to illustrate the significant contributions of the authors’ proposed method in the SEM framework, along with presenting the challenges involved in implementing this method and opportunities available in an alternative framework.
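For reference, the linear–linear piecewise growth model with a random knot discussed in the commentary is commonly written as

$$ y_{ti} = \beta_{0i} + \beta_{1i}\, t + \beta_{2i}\, \max(0,\, t - \gamma_i) + \varepsilon_{ti}, $$

so the slope is $\beta_{1i}$ before the person-specific knot $\gamma_i$ and $\beta_{1i} + \beta_{2i}$ after it. Because $\gamma_i$ enters the function nonlinearly, the model is intrinsically nonlinear, which is what motivates the reparameterization and the transformation matrices under discussion.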
{"title":"Commentary on “Obtaining Interpretable Parameters From Reparameterized Longitudinal Models: Transformation Matrices Between Growth Factors in Two Parameter Spaces”","authors":"Ziwei Zhang, Corissa T. Rohloff, N. Kohli","doi":"10.3102/10769986221126747","DOIUrl":"https://doi.org/10.3102/10769986221126747","url":null,"abstract":"To model growth over time, statistical techniques are available in both structural equation modeling (SEM) and random effects modeling frameworks. Liu et al. proposed a transformation and an inverse transformation for the linear–linear piecewise growth model with an unknown random knot, an intrinsically nonlinear function, in the SEM framework. This method allowed for the incorporation of time-invariant covariates. While the proposed method made novel contributions in this area of research, the use of transformations introduces some challenges to model estimation and dissemination. This commentary aims to illustrate the significant contributions of the authors’ proposed method in the SEM framework, along with presenting the challenges involved in implementing this method and opportunities available in an alternative framework.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"48 1","pages":"262 - 268"},"PeriodicalIF":2.4,"publicationDate":"2022-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46380599","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Development of a High-Accuracy and Effective Online Calibration Method in CD-CAT Based on Gini Index
Pub Date: 2022-10-03 | DOI: 10.3102/10769986221126741
Qingrong Tan, Yan Cai, Fen Luo, Dongbo Tu
To improve the calibration accuracy and calibration efficiency of cognitive diagnostic computerized adaptive testing (CD-CAT) for new items and, ultimately, contribute to the widespread application of CD-CAT in practice, the current article proposed a Gini-based online calibration method that can simultaneously calibrate the Q-matrix and item parameters of new items. Three simulation studies with simulated and real item banks were conducted to investigate the performance of the proposed method and compare it with the joint estimation algorithm (JEA) and the single-item estimation (SIE) methods. The results indicated that the proposed Gini-based online calibration method yielded higher calibration efficiency than those of the SIE method and outperformed the JEA method on item calibration tasks in terms of both accuracy and efficiency under most experimental conditions.
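The abstract does not spell out the selection criterion, but the Gini index itself is a generic impurity measure; below is a minimal sketch of it. Its use for scoring how decisively a candidate Q-vector resolves examinees' posterior profile distributions is a plausible reading on our part, not the article's exact algorithm.

```python
import numpy as np

def gini(p):
    """Gini index of a discrete distribution p, e.g., an examinee's posterior
    over attribute profiles: 0 for a point mass (no uncertainty), approaching
    1 - 1/K for a uniform distribution over K profiles."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    return 1.0 - float(np.sum(p ** 2))

print(gini([1, 0, 0, 0]))    # 0.0   -> profile fully determined
print(gini([0.25] * 4))      # 0.75  -> maximal uncertainty over 4 profiles
```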
{"title":"Development of a High-Accuracy and Effective Online Calibration Method in CD-CAT Based on Gini Index","authors":"Qingrong Tan, Yan Cai, Fen Luo, Dongbo Tu","doi":"10.3102/10769986221126741","DOIUrl":"https://doi.org/10.3102/10769986221126741","url":null,"abstract":"To improve the calibration accuracy and calibration efficiency of cognitive diagnostic computerized adaptive testing (CD-CAT) for new items and, ultimately, contribute to the widespread application of CD-CAT in practice, the current article proposed a Gini-based online calibration method that can simultaneously calibrate the Q-matrix and item parameters of new items. Three simulation studies with simulated and real item banks were conducted to investigate the performance of the proposed method and compare it with the joint estimation algorithm (JEA) and the single-item estimation (SIE) methods. The results indicated that the proposed Gini-based online calibration method yielded higher calibration efficiency than those of the SIE method and outperformed the JEA method on item calibration tasks in terms of both accuracy and efficiency under most experimental conditions.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"48 1","pages":"103 - 141"},"PeriodicalIF":2.4,"publicationDate":"2022-10-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44152501","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Collection of Numerical Recipes Useful for Building Scalable Psychometric Applications
Pub Date: 2022-08-17 | DOI: 10.3102/10769986221116905
Harold C. Doran
This article is concerned with a subset of numerically stable and scalable algorithms useful to support computationally complex psychometric models in the era of machine learning and massive data. The subset selected here is a core set of numerical methods that should be familiar to computational psychometricians and considers whitening transforms for dealing with correlated data, computational concepts for linear models, multivariable integration, and optimization techniques.
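As a taste of the first topic, here is a minimal Cholesky-based whitening transform, one of several constructions for decorrelating data (the function name is illustrative): transform X so that its sample covariance becomes the identity.

```python
import numpy as np

def whiten(X):
    """Cholesky whitening: map rows of X (observations) to decorrelated,
    unit-variance coordinates, so cov(whiten(X)) is the identity."""
    Xc = X - X.mean(axis=0)                    # center the data
    S = np.cov(Xc, rowvar=False)               # sample covariance
    L = np.linalg.cholesky(S)                  # S = L L'
    return Xc @ np.linalg.inv(L).T             # cov = L^{-1} S L^{-T} = I

rng = np.random.default_rng(1)
X = rng.multivariate_normal([0, 0], [[2.0, 1.2], [1.2, 1.0]], size=500)
print(np.round(np.cov(whiten(X), rowvar=False), 6))   # ~ identity matrix
```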
{"title":"A Collection of Numerical Recipes Useful for Building Scalable Psychometric Applications","authors":"Harold C. Doran","doi":"10.3102/10769986221116905","DOIUrl":"https://doi.org/10.3102/10769986221116905","url":null,"abstract":"This article is concerned with a subset of numerically stable and scalable algorithms useful to support computationally complex psychometric models in the era of machine learning and massive data. The subset selected here is a core set of numerical methods that should be familiar to computational psychometricians and considers whitening transforms for dealing with correlated data, computational concepts for linear models, multivariable integration, and optimization techniques.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"48 1","pages":"37 - 69"},"PeriodicalIF":2.4,"publicationDate":"2022-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48606504","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Estimating Heterogeneous Treatment Effects Within Latent Class Multilevel Models: A Bayesian Approach
Pub Date: 2022-08-17 | DOI: 10.3102/10769986221115446
Weicong Lyu, Jee-Seon Kim, Youmi Suk
This article presents a latent class model for multilevel data to identify latent subgroups and estimate heterogeneous treatment effects. Unlike sequential approaches that partition data first and then estimate average treatment effects (ATEs) within classes, we employ a Bayesian procedure to jointly estimate mixing probability, selection, and outcome models so that misclassification does not obstruct estimation of treatment effects. Simulation demonstrates that the proposed method finds the correct number of latent classes, estimates class-specific treatment effects well, and provides proper posterior standard deviations and credible intervals of ATEs. We apply this method to Trends in International Mathematics and Science Study data to investigate the effects of private science lessons on achievement scores and then find two latent classes, one with zero ATE and the other with positive ATE.
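Schematically, in a single-level version for brevity and with illustrative notation (the article's model adds multilevel structure), the joint specification combines three components:

$$ c_i \sim \mathrm{Categorical}(\pi), \qquad z_i \mid c_i \sim \mathrm{Bernoulli}\big(\mathrm{logit}^{-1}(x_i'\gamma_{c_i})\big), \qquad y_i \mid z_i, c_i \sim N\big(x_i'\beta_{c_i} + \delta_{c_i} z_i,\ \sigma_{c_i}^2\big), $$

where $\delta_c$ is the class-specific treatment effect. Estimating the mixing probabilities $\pi$, the selection model, and the outcome model jointly lets classification uncertainty propagate into the posterior for $\delta_c$, rather than being frozen by an initial hard partition of the data.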
{"title":"Estimating Heterogeneous Treatment Effects Within Latent Class Multilevel Models: A Bayesian Approach","authors":"Weicong Lyu, Jee-Seon Kim, Youmi Suk","doi":"10.3102/10769986221115446","DOIUrl":"https://doi.org/10.3102/10769986221115446","url":null,"abstract":"This article presents a latent class model for multilevel data to identify latent subgroups and estimate heterogeneous treatment effects. Unlike sequential approaches that partition data first and then estimate average treatment effects (ATEs) within classes, we employ a Bayesian procedure to jointly estimate mixing probability, selection, and outcome models so that misclassification does not obstruct estimation of treatment effects. Simulation demonstrates that the proposed method finds the correct number of latent classes, estimates class-specific treatment effects well, and provides proper posterior standard deviations and credible intervals of ATEs. We apply this method to Trends in International Mathematics and Science Study data to investigate the effects of private science lessons on achievement scores and then find two latent classes, one with zero ATE and the other with positive ATE.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"48 1","pages":"3 - 36"},"PeriodicalIF":2.4,"publicationDate":"2022-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46234214","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cognitive Diagnosis Modeling Incorporating Response Times and Fixation Counts: Providing Comprehensive Feedback and Accurate Diagnosis
Pub Date: 2022-07-28 | DOI: 10.3102/10769986221111085
P. Zhan, K. Man, Stefanie A. Wind, Jonathan Malone
Respondents’ problem-solving behaviors reflect complicated cognitive processes that are frequently systematically tied to one another. Biometric data, such as visual fixation counts (FCs), an important eye-tracking indicator, can be combined with other variables reflecting different aspects of problem solving to quantify variability in problem-solving behavior. To provide comprehensive feedback and accurate diagnosis when using such multimodal data, the present study proposes a multimodal joint cognitive diagnosis model that accounts for latent attributes, latent ability, processing speed, and visual engagement by simultaneously modeling response accuracy (RA), response times, and FCs. We used two simulation studies to test the feasibility of the proposed model. Findings mainly suggest that the parameters of the proposed model can be well recovered and that modeling FCs, in addition to RA and response times, could increase the comprehensiveness of feedback on problem-solving-related cognitive characteristics as well as the accuracy of knowledge structure diagnosis. An empirical example is used to demonstrate the applicability and benefits of the proposed model. We discuss the implications of our findings as they relate to research and practice.
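A schematic of how the three modalities can be modeled jointly; the specific link functions below are common choices and assumptions on our part, not necessarily the article's exact specification. Response accuracy follows a DINA-type model on latent attributes $\boldsymbol{\alpha}_i$, log response times a lognormal model with latent speed $\tau_i$, and fixation counts a Poisson model with latent visual engagement $\zeta_i$:

$$ P(Y_{ij}=1 \mid \boldsymbol{\alpha}_i) = (1-s_j)^{\eta_{ij}} g_j^{\,1-\eta_{ij}}, \qquad \log T_{ij} \sim N(\beta_j - \tau_i,\ \sigma_j^2), \qquad F_{ij} \sim \mathrm{Poisson}\big(\exp(\nu_j + \zeta_i)\big), $$

with $\eta_{ij} = \prod_k \alpha_{ik}^{q_{jk}}$ and the person parameters $(\boldsymbol{\alpha}_i, \tau_i, \zeta_i)$ tied together through a joint (e.g., higher-order or correlated) distribution.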
{"title":"Cognitive Diagnosis Modeling Incorporating Response Times and Fixation Counts: Providing Comprehensive Feedback and Accurate Diagnosis","authors":"P. Zhan, K. Man, Stefanie A. Wind, Jonathan Malone","doi":"10.3102/10769986221111085","DOIUrl":"https://doi.org/10.3102/10769986221111085","url":null,"abstract":"Respondents’ problem-solving behaviors comprise behaviors that represent complicated cognitive processes that are frequently systematically tied to one another. Biometric data, such as visual fixation counts (FCs), which are an important eye-tracking indicator, can be combined with other types of variables that reflect different aspects of problem-solving behavior to quantify variability in problem-solving behavior. To provide comprehensive feedback and accurate diagnosis when using such multimodal data, the present study proposes a multimodal joint cognitive diagnosis model that accounts for latent attributes, latent ability, processing speed, and visual engagement by simultaneously modeling response accuracy (RA), response times, and FCs. We used two simulation studies to test the feasibility of the proposed model. Findings mainly suggest that the parameters of the proposed model can be well recovered and that modeling FCs, in addition to RA and response times, could increase the comprehensiveness of feedback on problem-solving-related cognitive characteristics as well as the accuracy of knowledge structure diagnosis. An empirical example is used to demonstrate the applicability and benefits of the proposed model. We discuss the implications of our findings as they relate to research and practice.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"47 1","pages":"736 - 776"},"PeriodicalIF":2.4,"publicationDate":"2022-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47269107","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Testing Differential Item Functioning Without Predefined Anchor Items Using Robust Regression
Pub Date: 2022-07-18 | DOI: 10.3102/10769986221109208
Weimeng Wang, Yang Liu, Hongyun Liu
Differential item functioning (DIF) occurs when the probability of endorsing an item differs across groups for individuals with the same latent trait level. The presence of DIF items may jeopardize the validity of an instrument; therefore, it is crucial to identify DIF items in routine operations of educational assessment. While DIF detection procedures based on item response theory (IRT) have been widely used, the majority of IRT-based DIF tests assume predefined anchor (i.e., DIF-free) items. Not only is this assumption strong, but violations of it may also lead to erroneous inferences, for example, an inflated Type I error rate. We propose a general framework for defining the effect sizes of DIF without a priori knowledge of anchor items. In particular, we quantify DIF by item-specific residuals from a regression model fitted to the true item parameters in the respective groups. Moreover, the null distribution of the proposed test statistic based on a robust estimator can be derived analytically or approximated numerically even when there is a mix of DIF and non-DIF items, which yields asymptotically justified statistical inference. The Type I error rate and power of the proposed procedure are evaluated and compared with those of conventional likelihood-ratio DIF tests in a Monte Carlo experiment. The simulation results are promising with respect to both Type I error control and power to detect DIF items: even with a mix of DIF and non-DIF items, the true and false alarm rates are well controlled when a robust regression estimator is used.
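A minimal sketch of the core idea using a Huber M-estimator (the data and numbers are made up, and the article develops the effect-size definitions and asymptotic inference well beyond this): regress the focal group's item parameter estimates on the reference group's. DIF-free items define the linking line, DIF items surface as large residuals, and the robust loss keeps those residuals from distorting the fit, so no anchor set is needed.

```python
import numpy as np
import statsmodels.api as sm

# Hypothetical item difficulties estimated separately in each group
b_ref = np.array([-1.2, -0.5, 0.0, 0.4, 0.9, 1.5, -0.8, 0.2])
b_foc = np.array([-1.1, -0.4, 0.1, 1.3, 1.0, 1.6, -0.7, 0.3])  # item 4 drifts

# Robust (Huber) regression of focal on reference parameters;
# residuals serve as per-item DIF effect sizes.
fit = sm.RLM(b_foc, sm.add_constant(b_ref), M=sm.robust.norms.HuberT()).fit()
print(fit.params)                # linking intercept and slope
print(np.round(fit.resid, 2))    # large residual flags the drifting item
```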
{"title":"Testing Differential Item Functioning Without Predefined Anchor Items Using Robust Regression","authors":"Weimeng Wang, Yang Liu, Hongyun Liu","doi":"10.3102/10769986221109208","DOIUrl":"https://doi.org/10.3102/10769986221109208","url":null,"abstract":"Differential item functioning (DIF) occurs when the probability of endorsing an item differs across groups for individuals with the same latent trait level. The presence of DIF items may jeopardize the validity of an instrument; therefore, it is crucial to identify DIF items in routine operations of educational assessment. While DIF detection procedures based on item response theory (IRT) have been widely used, a majority of IRT-based DIF tests assume predefined anchor (i.e., DIF-free) items. Not only is this assumption strong, but violations to it may also lead to erroneous inferences, for example, an inflated Type I error rate. We propose a general framework to define the effect sizes of DIF without a priori knowledge of anchor items. In particular, we quantify DIF by item-specific residuals from a regression model fitted to the true item parameters in respective groups. Moreover, the null distribution of the proposed test statistic using robust estimator can be derived analytically or approximated numerically even when there is a mix of DIF and non-DIF items, which yields asymptotically justified statistical inference. The Type I error rate and the power performance of the proposed procedure are evaluated and compared with the conventional likelihood-ratio DIF tests in a Monte Carlo experiment. Our simulation study has shown promising results in controlling Type I error rate and power of detecting DIF items. Even when there is a mix of DIF and non-DIF items, the true and false alarm rate can be well controlled when a robust regression estimator is used.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"47 1","pages":"666 - 692"},"PeriodicalIF":2.4,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42754815","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zero and One Inflated Item Response Theory Models for Bounded Continuous Data
Pub Date: 2022-07-15 | DOI: 10.3102/10769986221108455
D. Molenaar, M. Curi, Jorge L. Bazán
Bounded continuous data are encountered in many applications of item response theory, including the measurement of mood, personality, and response times, and in analyses of summed item scores. Although different item response theory models exist to analyze such bounded continuous data, most models assume the data lie in an open interval and cannot accommodate data in a closed interval. As a result, ad hoc transformations are needed to prevent scores on the bounds of the observed variables. To motivate the present study, we demonstrate in real and simulated data that this practice of fitting open-interval models to closed-interval data can substantially affect parameter estimates, even in cases with only 5% of the responses on one of the bounds of the observed variables. To address this problem, we propose a zero and one inflated item response theory modeling framework for bounded continuous responses in the closed interval. We illustrate how four existing models for bounded responses from the literature can be accommodated in the framework. The resulting zero and one inflated item response theory models are studied in a simulation study and a real data application to investigate parameter recovery, model fit, and the consequences of fitting the incorrect distribution to the data. We find that neglecting the bounded nature of the data biases parameters and that misspecification of the exact distribution may affect the results, depending on the data-generating model.
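The generic zero-and-one inflated form that such a framework builds on places point masses at the bounds and a continuous density on the interior:

$$ f(y) = \begin{cases} \pi_0, & y = 0, \\ (1 - \pi_0 - \pi_1)\, g(y), & 0 < y < 1, \\ \pi_1, & y = 1, \end{cases} $$

where $g$ is a density on the open interval $(0,1)$, for example a beta density whose parameters depend on person and item, and the bound probabilities $\pi_0$ and $\pi_1$ (which may likewise depend on the latent trait) absorb the responses at 0 and 1 that an open-interval model cannot generate.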
{"title":"Zero and One Inflated Item Response Theory Models for Bounded Continuous Data","authors":"D. Molenaar, M. Curi, Jorge L. Bazán","doi":"10.3102/10769986221108455","DOIUrl":"https://doi.org/10.3102/10769986221108455","url":null,"abstract":"Bounded continuous data are encountered in many applications of item response theory, including the measurement of mood, personality, and response times and in the analyses of summed item scores. Although different item response theory models exist to analyze such bounded continuous data, most models assume the data to be in an open interval and cannot accommodate data in a closed interval. As a result, ad hoc transformations are needed to prevent scores on the bounds of the observed variables. To motivate the present study, we demonstrate in real and simulated data that this practice of fitting open interval models to closed interval data can majorly affect parameter estimates even in cases with only 5% of the responses on one of the bounds of the observed variables. To address this problem, we propose a zero and one inflated item response theory modeling framework for bounded continuous responses in the closed interval. We illustrate how four existing models for bounded responses from the literature can be accommodated in the framework. The resulting zero and one inflated item response theory models are studied in a simulation study and a real data application to investigate parameter recovery, model fit, and the consequences of fitting the incorrect distribution to the data. We find that neglecting the bounded nature of the data biases parameters and that misspecification of the exact distribution may affect the results depending on the data generating model.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"47 1","pages":"693 - 735"},"PeriodicalIF":2.4,"publicationDate":"2022-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45894141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}