
Journal of Educational and Behavioral Statistics: Latest Articles

Nonparametric Classification Method for Multiple-Choice Items in Cognitive Diagnosis
IF 2.4 | Zone 3, Psychology | Q2 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2022-11-27 | DOI: 10.3102/10769986221133088
Yu Wang, Chia-Yi Chiu, Hans-Friedrich Köhn
The multiple-choice (MC) item format has been widely used in educational assessments across diverse content domains. MC items purportedly allow for collecting richer diagnostic information. The effectiveness and economy of administering MC items may have further contributed to their popularity not just in educational assessment. The MC item format has also been adapted to the cognitive diagnosis (CD) framework. Early approaches simply dichotomized the responses and analyzed them with a CD model for binary responses. Obviously, this strategy cannot exploit the additional diagnostic information provided by MC items. De la Torre’s MC Deterministic Inputs, Noisy “And” Gate (MC-DINA) model was the first for the explicit analysis of items having MC response format. However, as a drawback, the attribute vectors of the distractors are restricted to be nested within the key and each other. The method presented in this article for the CD of DINA items having MC response format does not require such constraints. Another contribution of the proposed method concerns its implementation using a nonparametric classification algorithm, which predestines it for use especially in small-sample settings like classrooms, where CD is most needed for monitoring instruction and student learning. In contrast, default parametric CD estimation routines that rely on EM- or MCMC-based algorithms cannot guarantee stable and reliable estimates—despite their effectiveness and efficiency when samples are large—due to computational feasibility issues caused by insufficient sample sizes. Results of simulation studies and a real-world application are also reported.
Vol. 48, pp. 189-219
Citations: 0
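The abstract above does not spell out the classification algorithm. As a rough illustration of what a nonparametric classifier can look like in cognitive diagnosis, here is a minimal sketch for binary DINA items only (not the multiple-choice method of the article): each examinee is assigned the attribute profile whose ideal response pattern is closest to the observed responses in Hamming distance. The Q-matrix, the response vector, and the function names are hypothetical.

```python
import itertools
import numpy as np

def ideal_responses(Q, profiles):
    """DINA ideal response: 1 only if a profile holds every attribute an item requires."""
    return (profiles @ Q.T == Q.sum(axis=1)).astype(int)

def classify(y, Q):
    """Return the attribute profile whose ideal pattern has minimal Hamming distance to y."""
    K = Q.shape[1]
    profiles = np.array(list(itertools.product([0, 1], repeat=K)))
    eta = ideal_responses(Q, profiles)      # shape (2^K, J)
    dist = np.abs(eta - y).sum(axis=1)      # Hamming distance per profile
    return profiles[np.argmin(dist)]

# Hypothetical Q-matrix: 5 items, 2 attributes
Q = np.array([[1, 0], [0, 1], [1, 1], [1, 0], [0, 1]])
# Observed responses: consistent with profile (1, 0) except a lucky guess on item 2
y = np.array([1, 1, 0, 1, 0])
estimate = classify(y, Q)
```

No item parameters are estimated here, which is why this kind of rule remains usable with classroom-sized samples.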
Breaking Our Silence on Factor Score Indeterminacy
IF 2.4 | Zone 3, Psychology | Q2 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2022-11-07 | DOI: 10.3102/10769986221128810
N. Waller
Although many textbooks on multivariate statistics discuss the common factor analysis model, few of these books mention the problem of factor score indeterminacy (FSI). Thus, many students and contemporary researchers are unaware of an important fact. Namely, for any common factor model with known (or estimated) model parameters, infinite sets of factor scores can be constructed to fit the model. Because all sets are mathematically exchangeable, factor scores are indeterminate. Our professional silence on this topic is difficult to explain given that FSI was first noted almost 100 years ago by E. B. Wilson, the 24th president (1929) of the American Statistical Association. To help disseminate Wilson’s insights, we demonstrate the underlying mathematics of FSI using the language of finite-dimensional vector spaces and well-known ideas of regression theory. We then illustrate the numerical implications of FSI by describing new and easily implemented methods for transforming factor scores into alternative sets of factor scores. An online supplement (and the fungible R library) includes R functions for illustrating FSI.
Vol. 48, pp. 244-261
Citations: 2
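The indeterminacy described in the abstract above can be checked numerically. The following is a minimal sketch, assuming a one-factor model with made-up loadings (`lam`) and a standard textbook construction (regression part plus an arbitrary orthogonal component). It does not use the article's supplement or the fungible R library, and every name in it is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, p = 200, 4
lam = np.array([0.8, 0.7, 0.6, 0.5])     # hypothetical factor loadings
psi = 1.0 - lam**2                        # unique variances
Sigma = np.outer(lam, lam) + np.diag(psi)

# Columns that are exactly mean-zero, unit-variance, and mutually uncorrelated
G = rng.standard_normal((N, p + 2))
G -= G.mean(axis=0)
Qm, _ = np.linalg.qr(G)
Z = Qm * np.sqrt(N - 1)
f0, E, s = Z[:, 0], Z[:, 1:p + 1], Z[:, p + 1]

# Data generated by the factor model; its sample covariance equals Sigma
X = np.outer(f0, lam) + E * np.sqrt(psi)

w = np.linalg.solve(Sigma, lam)           # regression weights for the factor
f_hat = X @ w                             # determinate part of the factor scores
rho2 = lam @ w                            # squared multiple correlation of f on X

# Two alternative score sets that fit the same model as f0
f1 = f_hat + np.sqrt(1 - rho2) * s        # swap in an unrelated orthogonal component
f2 = f_hat - (f0 - f_hat)                 # reflect the indeterminate component

def cov_with_X(f):
    """Sample covariance between each observed variable and a score vector."""
    return X.T @ f / (N - 1)
```

All three vectors `f0`, `f1`, and `f2` have unit variance and covariance `lam` with the observed variables, so the data cannot distinguish among them. In this construction the correlation between `f0` and `f2` equals `2 * rho2 - 1`.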
Power Approximations for Overall Average Effects in Meta-Analysis With Dependent Effect Sizes
IF 2.4 | Zone 3, Psychology | Q2 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2022-10-17 | DOI: 10.3102/10769986221127379
M. H. Vembye, J. Pustejovsky, T. Pigott
Meta-analytic models for dependent effect sizes have grown increasingly sophisticated over the last few decades, which has created challenges for a priori power calculations. We introduce power approximations for tests of average effect sizes based upon several common approaches for handling dependent effect sizes. In a Monte Carlo simulation, we show that the new power formulas can accurately approximate the true power of meta-analytic models for dependent effect sizes. Lastly, we investigate the Type I error rate and power for several common models, finding that tests using robust variance estimation provide better Type I error calibration than tests with model-based variance estimation. We consider implications for practice with respect to selecting a working model and an inferential approach.
Vol. 48, pp. 70-102
Citations: 3
Commentary on “Obtaining Interpretable Parameters From Reparameterized Longitudinal Models: Transformation Matrices Between Growth Factors in Two Parameter Spaces”
IF 2.4 | Zone 3, Psychology | Q2 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2022-10-06 | DOI: 10.3102/10769986221126747
Ziwei Zhang, Corissa T. Rohloff, N. Kohli
To model growth over time, statistical techniques are available in both structural equation modeling (SEM) and random effects modeling frameworks. Liu et al. proposed a transformation and an inverse transformation for the linear–linear piecewise growth model with an unknown random knot, an intrinsically nonlinear function, in the SEM framework. This method allowed for the incorporation of time-invariant covariates. While the proposed method made novel contributions in this area of research, the use of transformations introduces some challenges to model estimation and dissemination. This commentary aims to illustrate the significant contributions of the authors’ proposed method in the SEM framework, along with presenting the challenges involved in implementing this method and opportunities available in an alternative framework.
Vol. 48, pp. 262-268
Citations: 0
Development of a High-Accuracy and Effective Online Calibration Method in CD-CAT Based on Gini Index
IF 2.4 | Zone 3, Psychology | Q2 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2022-10-03 | DOI: 10.3102/10769986221126741
Qingrong Tan, Yan Cai, Fen Luo, Dongbo Tu
To improve the calibration accuracy and calibration efficiency of cognitive diagnostic computerized adaptive testing (CD-CAT) for new items and, ultimately, contribute to the widespread application of CD-CAT in practice, the current article proposed a Gini-based online calibration method that can simultaneously calibrate the Q-matrix and item parameters of new items. Three simulation studies with simulated and real item banks were conducted to investigate the performance of the proposed method and compare it with the joint estimation algorithm (JEA) and the single-item estimation (SIE) methods. The results indicated that the proposed Gini-based online calibration method yielded higher calibration efficiency than those of the SIE method and outperformed the JEA method on item calibration tasks in terms of both accuracy and efficiency under most experimental conditions.
Vol. 48, pp. 103-141
Citations: 0
A Collection of Numerical Recipes Useful for Building Scalable Psychometric Applications
IF 2.4 | Zone 3, Psychology | Q2 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2022-08-17 | DOI: 10.3102/10769986221116905
Harold C. Doran
This article is concerned with a subset of numerically stable and scalable algorithms useful to support computationally complex psychometric models in the era of machine learning and massive data. The subset selected here is a core set of numerical methods that should be familiar to computational psychometricians and considers whitening transforms for dealing with correlated data, computational concepts for linear models, multivariable integration, and optimization techniques.
Vol. 48, pp. 37-69
Citations: 1
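One of the topics named in the abstract above, whitening transforms for correlated data, can be illustrated briefly. This is a minimal sketch assuming Cholesky-based whitening, which is only one of several possible whitening choices and is not necessarily the variant the article presents. The mixing matrix `A` and the function name are hypothetical.

```python
import numpy as np

def whiten(X):
    """Cholesky whitening: return data whose sample covariance is the identity."""
    Xc = X - X.mean(axis=0)
    S = np.cov(Xc, rowvar=False)
    L = np.linalg.cholesky(S)             # S = L @ L.T, L lower triangular
    # Solve L @ Z.T = Xc.T rather than forming an explicit inverse of L
    return np.linalg.solve(L, Xc.T).T

rng = np.random.default_rng(0)
A = np.array([[1.0, 0.0, 0.0],
              [0.8, 0.6, 0.0],
              [0.3, 0.5, 0.7]])           # hypothetical mixing matrix
X = rng.standard_normal((500, 3)) @ A.T   # correlated columns
Z = whiten(X)
```

Solving the triangular system instead of inverting the covariance matrix is the usual choice for numerical stability.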
Estimating Heterogeneous Treatment Effects Within Latent Class Multilevel Models: A Bayesian Approach
IF 2.4 | Zone 3, Psychology | Q2 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2022-08-17 | DOI: 10.3102/10769986221115446
Weicong Lyu, Jee-Seon Kim, Youmi Suk
This article presents a latent class model for multilevel data to identify latent subgroups and estimate heterogeneous treatment effects. Unlike sequential approaches that partition data first and then estimate average treatment effects (ATEs) within classes, we employ a Bayesian procedure to jointly estimate mixing probability, selection, and outcome models so that misclassification does not obstruct estimation of treatment effects. Simulation demonstrates that the proposed method finds the correct number of latent classes, estimates class-specific treatment effects well, and provides proper posterior standard deviations and credible intervals of ATEs. We apply this method to Trends in International Mathematics and Science Study data to investigate the effects of private science lessons on achievement scores and then find two latent classes, one with zero ATE and the other with positive ATE.
Vol. 48, pp. 3-36
Citations: 0
Cognitive Diagnosis Modeling Incorporating Response Times and Fixation Counts: Providing Comprehensive Feedback and Accurate Diagnosis
IF 2.4 | Zone 3, Psychology | Q2 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2022-07-28 | DOI: 10.3102/10769986221111085
P. Zhan, K. Man, Stefanie A. Wind, Jonathan Malone
Respondents’ problem-solving behaviors comprise behaviors that represent complicated cognitive processes that are frequently systematically tied to one another. Biometric data, such as visual fixation counts (FCs), which are an important eye-tracking indicator, can be combined with other types of variables that reflect different aspects of problem-solving behavior to quantify variability in problem-solving behavior. To provide comprehensive feedback and accurate diagnosis when using such multimodal data, the present study proposes a multimodal joint cognitive diagnosis model that accounts for latent attributes, latent ability, processing speed, and visual engagement by simultaneously modeling response accuracy (RA), response times, and FCs. We used two simulation studies to test the feasibility of the proposed model. Findings mainly suggest that the parameters of the proposed model can be well recovered and that modeling FCs, in addition to RA and response times, could increase the comprehensiveness of feedback on problem-solving-related cognitive characteristics as well as the accuracy of knowledge structure diagnosis. An empirical example is used to demonstrate the applicability and benefits of the proposed model. We discuss the implications of our findings as they relate to research and practice.
Vol. 47, pp. 736-776
Citations: 8
Testing Differential Item Functioning Without Predefined Anchor Items Using Robust Regression
IF 2.4 | Zone 3, Psychology | Q2 EDUCATION & EDUCATIONAL RESEARCH | Pub Date: 2022-07-18 | DOI: 10.3102/10769986221109208
Weimeng Wang, Yang Liu, Hongyun Liu
Differential item functioning (DIF) occurs when the probability of endorsing an item differs across groups for individuals with the same latent trait level. The presence of DIF items may jeopardize the validity of an instrument; therefore, it is crucial to identify DIF items in routine operations of educational assessment. While DIF detection procedures based on item response theory (IRT) have been widely used, a majority of IRT-based DIF tests assume predefined anchor (i.e., DIF-free) items. Not only is this assumption strong, but violations to it may also lead to erroneous inferences, for example, an inflated Type I error rate. We propose a general framework to define the effect sizes of DIF without a priori knowledge of anchor items. In particular, we quantify DIF by item-specific residuals from a regression model fitted to the true item parameters in respective groups. Moreover, the null distribution of the proposed test statistic using robust estimator can be derived analytically or approximated numerically even when there is a mix of DIF and non-DIF items, which yields asymptotically justified statistical inference. The Type I error rate and the power performance of the proposed procedure are evaluated and compared with the conventional likelihood-ratio DIF tests in a Monte Carlo experiment. Our simulation study has shown promising results in controlling Type I error rate and power of detecting DIF items. Even when there is a mix of DIF and non-DIF items, the true and false alarm rate can be well controlled when a robust regression estimator is used.
Vol. 47, pp. 666-692
Citations: 8
Zero and One Inflated Item Response Theory Models for Bounded Continuous Data
IF 2.4 Zone 3 (Psychology) Q2 EDUCATION & EDUCATIONAL RESEARCH Pub Date: 2022-07-15 DOI: 10.3102/10769986221108455
D. Molenaar, M. Curi, Jorge L. Bazán
Bounded continuous data are encountered in many applications of item response theory, including the measurement of mood, personality, and response times and in the analyses of summed item scores. Although different item response theory models exist to analyze such bounded continuous data, most models assume the data to be in an open interval and cannot accommodate data in a closed interval. As a result, ad hoc transformations are needed to prevent scores on the bounds of the observed variables. To motivate the present study, we demonstrate in real and simulated data that this practice of fitting open interval models to closed interval data can majorly affect parameter estimates even in cases with only 5% of the responses on one of the bounds of the observed variables. To address this problem, we propose a zero and one inflated item response theory modeling framework for bounded continuous responses in the closed interval. We illustrate how four existing models for bounded responses from the literature can be accommodated in the framework. The resulting zero and one inflated item response theory models are studied in a simulation study and a real data application to investigate parameter recovery, model fit, and the consequences of fitting the incorrect distribution to the data. We find that neglecting the bounded nature of the data biases parameters and that misspecification of the exact distribution may affect the results depending on the data generating model.
Citations: 3