Sequential Bayesian Ability Estimation Applied to Mixed-Format Item Tests.

Applied Psychological Measurement (IF 1.0; Zone 4, Psychology; JCR Q4, PSYCHOLOGY, MATHEMATICAL). Pub Date: 2023-09-01. Epub Date: 2023-09-08. DOI: 10.1177/01466216231201986
Jiawei Xiong, Allan S Cohen, Xinhui Maggie Xiong
Volume 47, issue 5-6, pages 402-419. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10552734/pdf/
Citations: 0

Abstract

Large-scale tests often contain mixed-format items, such as when multiple-choice (MC) items and constructed-response (CR) items are both contained in the same test. Although previous research has analyzed both types of items simultaneously, this may not always provide the best estimate of ability. In this paper, a two-step sequential Bayesian (SB) analytic method under the concept of empirical Bayes is explored for mixed item response models. This method integrates ability estimates from different item formats. Unlike the empirical Bayes method, the SB method estimates examinees' posterior ability parameters with individual-level sample-dependent prior distributions estimated from the MC items. Simulations were used to evaluate the accuracy of recovery of ability and item parameters over four factors: the type of the ability distribution, sample size, test length (number of items for each item type), and person/item parameter estimation method. The SB method was compared with a traditional concurrent Bayesian (CB) calibration method, EAPsum, that uses scaled scores for summed scores to estimate parameters from the MC and CR items simultaneously in one estimation step. From the simulation results, the SB method showed more accurate and reliable ability estimation than the CB method, especially when the sample size was small (150 and 500). Both methods presented similar recovery results for MC item parameters, but the CB method yielded a bit better recovery of the CR item parameters. The empirical example suggested that posterior ability estimated by the proposed SB method had higher reliability than the CB method.
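
The two-step idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: item parameters are treated as known, MC items follow a 2PL model, CR items follow a graded response model, the step-1 posterior is summarized as a normal distribution, and integration is done on a fixed grid. The function names (`mc_likelihood`, `cr_likelihood`, `posterior_moments`, `sequential_estimate`) are hypothetical.

```python
import numpy as np

# Assumption: a fixed quadrature grid over the ability scale theta.
GRID = np.linspace(-4.0, 4.0, 161)


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) evaluated at x."""
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))


def mc_likelihood(responses, a, b):
    """2PL likelihood of 0/1 MC responses at every grid point (assumed model)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    r = np.asarray(responses)[:, None]
    p = 1.0 / (1.0 + np.exp(-a[:, None] * (GRID[None, :] - b[:, None])))
    return np.prod(np.where(r == 1, p, 1.0 - p), axis=0)


def cr_likelihood(scores, a, thresholds):
    """Graded response likelihood of ordinal CR scores (assumed model).

    thresholds[j] must be in ascending order; scores run from 0 to
    len(thresholds[j]).
    """
    like = np.ones_like(GRID)
    for x, a_j, b_j in zip(scores, a, thresholds):
        b_j = np.asarray(b_j, dtype=float)
        # Rows are P(X >= k) for k = 1..K, padded with 1 (k = 0) and 0 (k = K + 1).
        cum = 1.0 / (1.0 + np.exp(-a_j * (GRID[None, :] - b_j[:, None])))
        cum = np.vstack([np.ones_like(GRID), cum, np.zeros_like(GRID)])
        like = like * (cum[x] - cum[x + 1])
    return like


def posterior_moments(prior, likelihood):
    """Posterior mean and SD of theta on the grid (an EAP estimate and its spread)."""
    w = prior * likelihood
    w = w / w.sum()
    mean = float(np.sum(w * GRID))
    sd = float(np.sqrt(np.sum(w * (GRID - mean) ** 2)))
    return mean, sd


def sequential_estimate(mc_resp, mc_a, mc_b, cr_scores, cr_a, cr_thresholds):
    """Two-step update for one examinee: MC items first, then CR items."""
    # Step 1: standard normal prior updated with the MC responses.
    m1, s1 = posterior_moments(normal_pdf(GRID, 0.0, 1.0),
                               mc_likelihood(mc_resp, mc_a, mc_b))
    # Step 2: the step-1 result becomes this examinee's own prior,
    # which is then updated with the CR scores.
    m2, s2 = posterior_moments(normal_pdf(GRID, m1, s1),
                               cr_likelihood(cr_scores, cr_a, cr_thresholds))
    return (m1, s1), (m2, s2)
```

Calling `sequential_estimate` with one examinee's MC responses and CR scores returns the step-1 and step-2 posterior mean and standard deviation. A concurrent approach would instead multiply both likelihoods against a single common prior in one step. The sketch covers ability estimation only, not the item calibration that the study also evaluates.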

Source journal: Applied Psychological Measurement
CiteScore: 2.30
Self-citation rate: 8.30%
Articles published: 50
About the journal: Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.