Accounting for item calibration error in computerized adaptive testing.

Behavior Research Methods · Impact Factor 3.9 · JCR Q1 (Psychology, Experimental) · CAS Tier 2 (Psychology) · Pub Date: 2025-03-26 · DOI: 10.3758/s13428-025-02649-8
Aron Fink, Christoph König, Andreas Frey
{"title":"Accounting for item calibration error in computerized adaptive testing.","authors":"Aron Fink, Christoph König, Andreas Frey","doi":"10.3758/s13428-025-02649-8","DOIUrl":null,"url":null,"abstract":"<p><p>In computerized adaptive testing (CAT), item parameter estimates derived from calibration studies are considered to be known and are used as fixed values for adaptive item selection and ability estimation. This is not completely accurate because these item parameter estimates contain a certain degree of error. If this error is random, the typical CAT procedure leads to standard errors of the final ability estimates that are too small. If the calibration error is large, it has been shown that the accuracy of the ability estimates is negatively affected due to the capitalization on chance problem, especially for extreme ability levels. In order to find a solution for this fundamental problem of CAT, we conducted a Monte Carlo simulation study to examine three approaches that can be used to consider the uncertainty of item parameter estimates in CAT. The first two approaches used a measurement error modeling approach in which item parameters were treated as covariates that contained errors. The third approach was fully Bayesian. Each of the approaches was compared with regard to the quality of the resulting ability estimates. The results indicate that each of the three approaches is capable of reducing bias and the mean squared error (MSE) of the ability estimates, especially for high item calibration errors. The Bayesian approach clearly outperformed the other approaches. We recommend the Bayesian approach, especially for application areas in which the recruitment of a large calibration sample is infeasible.</p>","PeriodicalId":8717,"journal":{"name":"Behavior Research Methods","volume":"57 5","pages":"126"},"PeriodicalIF":3.9000,"publicationDate":"2025-03-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11947018/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Behavior Research Methods","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.3758/s13428-025-02649-8","RegionNum":2,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
Citations: 0

Abstract

In computerized adaptive testing (CAT), item parameter estimates derived from calibration studies are treated as known and used as fixed values for adaptive item selection and ability estimation. This assumption is not completely accurate, because these item parameter estimates contain a certain degree of error. If this error is random, the typical CAT procedure yields standard errors of the final ability estimates that are too small. If the calibration error is large, the accuracy of the ability estimates has been shown to suffer from the capitalization-on-chance problem, especially at extreme ability levels. To address this fundamental problem of CAT, we conducted a Monte Carlo simulation study examining three approaches for incorporating the uncertainty of item parameter estimates into CAT. The first two approaches used measurement error modeling, in which item parameters were treated as error-laden covariates. The third approach was fully Bayesian. The approaches were compared with regard to the quality of the resulting ability estimates. The results indicate that each of the three approaches reduces the bias and the mean squared error (MSE) of the ability estimates, especially when item calibration error is high. The Bayesian approach clearly outperformed the others. We therefore recommend the Bayesian approach, especially for application areas in which recruiting a large calibration sample is infeasible.
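To make the contrast concrete, here is a minimal Python sketch of the difference between standard CAT scoring, which plugs in calibrated item parameters as fixed values, and a Bayesian-style scoring that averages the likelihood over draws of the item parameters, so that calibration error propagates into the ability estimate and its standard error. The 2PL model, the normal approximation to calibration error, and all function names and values below are illustrative assumptions, not details of the paper's actual implementation or study design.

```python
import numpy as np

def p_2pl(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def _eap_from_likelihood(like, grid):
    # Combine the likelihood with a standard-normal prior on theta,
    # then summarize the posterior by its mean (EAP) and SD (SE).
    post = np.exp(-0.5 * grid**2) * like
    post /= post.sum()
    theta = (grid * post).sum()
    se = np.sqrt(((grid - theta)**2 * post).sum())
    return theta, se

def eap_fixed(u, a_hat, b_hat, grid):
    """Standard CAT scoring: calibrated item parameters treated as known."""
    like = np.ones_like(grid)
    for ui, a, b in zip(u, a_hat, b_hat):
        p = p_2pl(grid, a, b)
        like *= p**ui * (1.0 - p)**(1 - ui)
    return _eap_from_likelihood(like, grid)

def eap_marginal(u, a_draws, b_draws, grid):
    """Bayesian-style scoring: average the likelihood over draws of the
    item parameters, so calibration error widens the reported SE."""
    like = np.zeros_like(grid)
    for a_r, b_r in zip(a_draws, b_draws):
        lr = np.ones_like(grid)
        for ui, a, b in zip(u, a_r, b_r):
            p = p_2pl(grid, a, b)
            lr *= p**ui * (1.0 - p)**(1 - ui)
        like += lr / len(a_draws)
    return _eap_from_likelihood(like, grid)

# Toy demonstration: 10 items, with calibration error simulated as normal
# noise around the "estimated" parameters (a hypothetical setup).
rng = np.random.default_rng(1)
grid = np.linspace(-4, 4, 161)
a_hat = rng.uniform(0.8, 2.0, 10)
b_hat = rng.normal(0.0, 1.0, 10)
u = rng.integers(0, 2, 10)                      # arbitrary response pattern
se_cal = 0.3                                    # a "large" calibration error
a_draws = np.abs(a_hat + rng.normal(0, se_cal, (500, 10)))
b_draws = b_hat + rng.normal(0, se_cal, (500, 10))

print(eap_fixed(u, a_hat, b_hat, grid))         # overconfident SE
print(eap_marginal(u, a_draws, b_draws, grid))  # SE reflects calibration error
```

Under this setup, the two versions typically return similar point estimates, but the marginalized version reports a larger standard error, matching the abstract's point that the typical fixed-parameter procedure understates the uncertainty of the final ability estimates.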

Source journal: Behavior Research Methods · CiteScore: 10.30 · Self-citation rate: 9.30% · Articles published: 266
About the journal: Behavior Research Methods publishes articles concerned with the methods, techniques, and instrumentation of research in experimental psychology. The journal focuses particularly on the use of computer technology in psychological research. An annual special issue is devoted to this field.
Latest articles in this journal:
Virtual agents as a scalable tool for diverse, robust gesture recognition.
A mouse-tracking classification task to measure the unhealthy = tasty intuition.
Forming bootstrap confidence intervals and examining bootstrap distributions of standardized coefficients in structural equation modelling: A simplified workflow using the R package semboottools.
Unconscious cognition without post hoc selection artifacts: From selective analysis to functional dissociations.
The Tool for Automatic Analysis of Decoding Ambiguity (TAADA).