Victor M. G. Jatobá, Jorge S. Farias, Valdinei Freire, André S. Ruela, Karina V. Delgado
{"title":"在计算机化的适应性测试中,一种定制的方法来选择项目","authors":"Victor M. G. Jatobá, Jorge S. Farias, Valdinei Freire, André S. Ruela, Karina V. Delgado","doi":"10.1186/s13173-020-00098-z","DOIUrl":null,"url":null,"abstract":"Computerized adaptive testing (CAT) based on item response theory allows more accurate assessments with fewer questions than the classic paper and pencil (P&P) test. Nonetheless, the CAT construction involves some key questions that, when done properly, can further improve the accuracy and efficiency in estimating the examinees’ abilities. One of the main questions is in regard to choosing the item selection rule (ISR). The classic CAT makes exclusive use of one ISR. However, these rules have differences depending on the examinees’ ability level and on the CAT stage. Thus, the objective of this work is to reduce the dichotomous test size which is inserted in a classic CAT with no significant loss of accuracy in the estimation of the examinee’s ability level. For this purpose, we analyze the ISR performance and then build a personalized item selection process in CAT considering the use of more than one rule. The case study in Mathematics and its Technologies test of the ENEM 2012 shows that the Kullback-Leibler information with a posterior distribution ( KLP ) has better performance in the examinees’ ability estimation when compared with Fisher information ( F ), Kullback-Leibler information ( KL ), maximum likelihood weighted information ( MLWI ), and maximum posterior weighted information ( MPWI ) rules. Previous results in the literature show that CAT using KLP was able to reduce this test size by 46.6 % from the full size of 45 items with no significant loss of accuracy in estimating the examinees’ ability level. In this work, we observe that the F and the MLWI rules performed better on early CAT stages to estimate examinees’ proficiency level with extreme negative and positive values, respectively. 
With this information, we were able to reduce the same test by 53.3 % using the personalized item selection process, called ALICAT, which includes the best rules working together.","PeriodicalId":39760,"journal":{"name":"Journal of the Brazilian Computer Society","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"ALICAT: a customized approach to item selection process in computerized adaptive testing\",\"authors\":\"Victor M. G. Jatobá, Jorge S. Farias, Valdinei Freire, André S. Ruela, Karina V. Delgado\",\"doi\":\"10.1186/s13173-020-00098-z\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Computerized adaptive testing (CAT) based on item response theory allows more accurate assessments with fewer questions than the classic paper and pencil (P&P) test. Nonetheless, the CAT construction involves some key questions that, when done properly, can further improve the accuracy and efficiency in estimating the examinees’ abilities. One of the main questions is in regard to choosing the item selection rule (ISR). The classic CAT makes exclusive use of one ISR. However, these rules have differences depending on the examinees’ ability level and on the CAT stage. Thus, the objective of this work is to reduce the dichotomous test size which is inserted in a classic CAT with no significant loss of accuracy in the estimation of the examinee’s ability level. For this purpose, we analyze the ISR performance and then build a personalized item selection process in CAT considering the use of more than one rule. 
The case study in Mathematics and its Technologies test of the ENEM 2012 shows that the Kullback-Leibler information with a posterior distribution ( KLP ) has better performance in the examinees’ ability estimation when compared with Fisher information ( F ), Kullback-Leibler information ( KL ), maximum likelihood weighted information ( MLWI ), and maximum posterior weighted information ( MPWI ) rules. Previous results in the literature show that CAT using KLP was able to reduce this test size by 46.6 % from the full size of 45 items with no significant loss of accuracy in estimating the examinees’ ability level. In this work, we observe that the F and the MLWI rules performed better on early CAT stages to estimate examinees’ proficiency level with extreme negative and positive values, respectively. With this information, we were able to reduce the same test by 53.3 % using the personalized item selection process, called ALICAT, which includes the best rules working together.\",\"PeriodicalId\":39760,\"journal\":{\"name\":\"Journal of the Brazilian Computer Society\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-05-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of the Brazilian Computer Society\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1186/s13173-020-00098-z\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the Brazilian Computer 
Society","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1186/s13173-020-00098-z","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
ALICAT: a customized approach to item selection process in computerized adaptive testing
Abstract

Computerized adaptive testing (CAT) based on item response theory allows more accurate assessments with fewer questions than the classic paper-and-pencil (P&P) test. Nonetheless, constructing a CAT involves some key decisions that, when made properly, can further improve the accuracy and efficiency of estimating examinees' abilities. One of the main decisions concerns the choice of the item selection rule (ISR). The classic CAT uses a single ISR exclusively. However, these rules perform differently depending on the examinee's ability level and on the CAT stage. Thus, the objective of this work is to reduce the length of the dichotomous test administered by a classic CAT with no significant loss of accuracy in the estimation of the examinee's ability level. To this end, we analyze the performance of several ISRs and then build a personalized item selection process for CAT that combines more than one rule. A case study on the Mathematics and its Technologies test of ENEM 2012 shows that Kullback-Leibler information with a posterior distribution (KLP) estimates examinees' abilities better than the Fisher information (F), Kullback-Leibler information (KL), maximum likelihood weighted information (MLWI), and maximum posterior weighted information (MPWI) rules. Previous results in the literature show that a CAT using KLP was able to reduce the test length by 46.6% from the full size of 45 items with no significant loss of accuracy in estimating examinees' ability levels. In this work, we observe that the F and MLWI rules performed better in early CAT stages at estimating examinees with extreme negative and extreme positive proficiency levels, respectively. With this information, we were able to reduce the same test by 53.3% using the personalized item selection process, called ALICAT, which combines the best-performing rules.
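The selection step the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 2PL IRT model and a toy item bank, and the stage/ability thresholds in `choose_rule` are hypothetical placeholders for the paper's observation that F and MLWI help early on for extreme negative and positive ability estimates, with KLP otherwise.

```python
import math

def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL IRT model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_info(theta, a, b):
    """Fisher information of a 2PL item at ability theta: I = a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def select_item(theta, items, administered):
    """Fisher-information rule (F): pick the unused item most informative at theta."""
    candidates = [i for i in range(len(items)) if i not in administered]
    return max(candidates, key=lambda i: fisher_info(theta, *items[i]))

def choose_rule(stage, theta_est):
    """ALICAT-style switch with HYPOTHETICAL thresholds: favor F / MLWI early
    for extreme negative / positive estimates, KLP otherwise."""
    if stage < 5:
        if theta_est < -2.0:
            return "F"
        if theta_est > 2.0:
            return "MLWI"
    return "KLP"

# Toy item bank: (discrimination a, difficulty b) pairs, invented for illustration.
bank = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.5), (1.0, 2.0)]
best = select_item(0.4, bank, administered={2})  # item 2 already administered
```

A full CAT loop would interleave `choose_rule`, the chosen rule's selection step, and an ability re-estimation (e.g., maximum likelihood or EAP) after each response; the KLP, MLWI, and MPWI criteria integrate information over a distribution of theta rather than evaluating it at a point estimate.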
About the journal:
JBCS is a formal quarterly publication of the Brazilian Computer Society. It is a peer-reviewed international journal that serves as a forum for disseminating innovative research in all fields of computer science and related subjects. Theoretical, practical, and experimental papers reporting original research contributions are welcome, as are high-quality survey papers. The journal is open to contributions on any computer science topic, whether in computer systems development or in the formal and theoretical aspects of computing. Contributions are considered for publication in JBCS only if they have not been published previously and are not under consideration elsewhere.