The Impact of Item Model Parameter Variations on Person Parameter Estimation in Computerized Adaptive Testing With Automatically Generated Items.

IF 1.2 · Zone 4, Psychology · JCR Q4, PSYCHOLOGY, MATHEMATICAL · Applied Psychological Measurement · Pub Date: 2023-06-01 · Epub Date: 2023-03-17 · DOI: 10.1177/01466216231165313

Chen Tian, Jaehwa Choi

Applied Psychological Measurement, vol. 47, no. 4, pp. 275-290. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10240571/pdf/


Abstract

Sibling items developed through automatic item generation share similar but not identical psychometric properties. However, accounting for sibling item variations may bring substantial computational difficulties and little improvement in scoring. Scoring under the assumption that siblings have identical characteristics, this study explores the impact of item model parameter variations (i.e., within-family variation between siblings) on person parameter estimation in linear tests and Computerized Adaptive Testing (CAT). Specifically, we explore (1) what happens when small, medium, or large within-family variance is ignored, (2) whether the effect of larger within-model variance can be compensated for by greater test length, (3) whether the item model pool properties affect the impact of within-family variance on scoring, and (4) whether the issues in (1) and (2) differ between linear and adaptive testing. A related sibling model is used for data generation, and an identical sibling model is assumed for scoring. Manipulated factors include test length, the size of within-model variation, and item model pool characteristics. Results show that as within-family variance increases, the standard error of scores remains at similar levels. For RMSE and for the correlation between true and estimated scores, the effect of larger within-model variance was compensated for by test length. Scores are biased towards the center, and this bias was not compensated for by test length. Although the within-family variation is random in the current simulations, to yield less biased ability estimates the item model pool should provide balanced opportunities such that "fake-easy" and "fake-difficult" item instances cancel out each other's effects. The results for CAT are similar to those for linear tests, except for higher efficiency.
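The design summarized in the abstract (generate responses from sibling-specific parameters, then score as if all siblings in a family were identical) can be illustrated with a small linear-test sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: a Rasch model, normally distributed within-family difficulty offsets, EAP scoring on a fixed grid, and the names `p_correct`, `eap_score`, and `simulate`. The authors' actual item response model, estimator, and simulation conditions may differ.

```python
import math
import random


def p_correct(theta, b):
    """Rasch probability of a correct response (assumed model)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def eap_score(responses, difficulties, grid):
    """EAP ability estimate with a standard normal prior on a fixed grid."""
    num = 0.0
    den = 0.0
    for t in grid:
        log_w = -0.5 * t * t  # log N(0, 1) density up to a constant
        for u, b in zip(responses, difficulties):
            p = p_correct(t, b)
            log_w += math.log(p if u else 1.0 - p)
        w = math.exp(log_w)
        num += t * w
        den += w
    return num / den


def simulate(within_sd, test_length, n_persons=300, seed=0):
    """Generate with sibling-level difficulties, score with family-level ones."""
    rng = random.Random(seed)
    grid = [-4.0 + 0.2 * k for k in range(41)]
    # One item family per test position; family difficulty ~ N(0, 1).
    family_b = [rng.gauss(0.0, 1.0) for _ in range(test_length)]
    errors = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)
        # Each examinee sees a fresh sibling drawn around the family difficulty.
        sibling_b = [b + rng.gauss(0.0, within_sd) for b in family_b]
        responses = [rng.random() < p_correct(theta, b) for b in sibling_b]
        # Scoring ignores the within-family variation (identical-sibling assumption).
        errors.append(eap_score(responses, family_b, grid) - theta)
    bias = sum(errors) / n_persons
    rmse = math.sqrt(sum(e * e for e in errors) / n_persons)
    return {"bias": bias, "rmse": rmse}


if __name__ == "__main__":
    for within_sd in (0.0, 0.5, 1.0):
        for test_length in (10, 40):
            result = simulate(within_sd, test_length)
            print(within_sd, test_length, result)
```

This sketch covers only a fixed-form linear test; an adaptive version would additionally select each next family based on the provisional ability estimate. It also reports only overall bias, so examining the bias toward the center described above would require summarizing errors conditional on true ability.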


Source journal

Applied Psychological Measurement Multiple-

CiteScore

2.30

Self-citation rate

8.30%

Articles published

Journal description: Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.