教育测试中的题库质量控制:变化点模型、复合风险和顺序检测

IF 1.9 3区心理学 Q2 EDUCATION & EDUCATIONAL RESEARCH Journal of Educational and Behavioral Statistics Pub Date : 2021-12-13 DOI:10.3102/10769986211059085

Yunxiao Chen, Yi-Hsuan Lee, Xiaoou Li

{"title":"教育测试中的题库质量控制:变化点模型、复合风险和顺序检测","authors":"Yunxiao Chen, Yi-Hsuan Lee, Xiaoou Li","doi":"10.3102/10769986211059085","DOIUrl":null,"url":null,"abstract":"In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric properties, where a change can be caused by, for example, leakage of the item or change of the corresponding curriculum. We propose a statistical framework for the detection of abrupt changes in individual items. This framework consists of (1) a multistream Bayesian change point model describing sequential changes in items, (2) a compound risk function quantifying the risk in sequential decisions, and (3) sequential decision rules that control the compound risk. Throughout the sequential decision process, the proposed decision rule balances the trade-off between two sources of errors, the false detection of prechange items, and the nondetection of postchange items. An item-specific monitoring statistic is proposed based on an item response theory model that eliminates the confounding from the examinee population which changes over time. Sequential decision rules and their theoretical properties are developed under two settings: the oracle setting where the Bayesian change point model is completely known and a more realistic setting where some parameters of the model are unknown. Simulation studies are conducted under settings that mimic real operational tests.","PeriodicalId":48001,"journal":{"name":"Journal of Educational and Behavioral Statistics","volume":"47 1","pages":"322 - 352"},"PeriodicalIF":1.9000,"publicationDate":"2021-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Item Pool Quality Control in Educational Testing: Change Point Model, Compound Risk, and Sequential Detection\",\"authors\":\"Yunxiao Chen, Yi-Hsuan Lee, Xiaoou Li\",\"doi\":\"10.3102/10769986211059085\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric properties, where a change can be caused by, for example, leakage of the item or change of the corresponding curriculum. We propose a statistical framework for the detection of abrupt changes in individual items. This framework consists of (1) a multistream Bayesian change point model describing sequential changes in items, (2) a compound risk function quantifying the risk in sequential decisions, and (3) sequential decision rules that control the compound risk. Throughout the sequential decision process, the proposed decision rule balances the trade-off between two sources of errors, the false detection of prechange items, and the nondetection of postchange items. An item-specific monitoring statistic is proposed based on an item response theory model that eliminates the confounding from the examinee population which changes over time. Sequential decision rules and their theoretical properties are developed under two settings: the oracle setting where the Bayesian change point model is completely known and a more realistic setting where some parameters of the model are unknown. Simulation studies are conducted under settings that mimic real operational tests.\",\"PeriodicalId\":48001,\"journal\":{\"name\":\"Journal of Educational and Behavioral Statistics\",\"volume\":\"47 1\",\"pages\":\"322 - 352\"},\"PeriodicalIF\":1.9000,\"publicationDate\":\"2021-12-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Educational and Behavioral Statistics\",\"FirstCategoryId\":\"102\",\"ListUrlMain\":\"https://doi.org/10.3102/10769986211059085\",\"RegionNum\":3,\"RegionCategory\":\"心理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"EDUCATION & EDUCATIONAL RESEARCH\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Educational and Behavioral Statistics","FirstCategoryId":"102","ListUrlMain":"https://doi.org/10.3102/10769986211059085","RegionNum":3,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"EDUCATION & EDUCATIONAL RESEARCH","Score":null,"Total":0}

引用次数: 3

摘要

在标准化的教育测试中，测试项目在多个测试管理中重复使用。为了确保考试成绩的有效性，项目的心理测量特性应随时间保持不变。在这篇文章中，我们考虑了对测试项目的顺序监测，特别是检测其心理测量特性的突然变化，例如，项目的泄露或相应课程的变化可能会导致变化。我们提出了一个统计框架来检测单个项目的突变。该框架由（1）描述项目顺序变化的多流贝叶斯变点模型，（2）量化顺序决策中风险的复合风险函数，以及（3）控制复合风险的顺序决策规则组成。在整个顺序决策过程中，所提出的决策规则平衡了两个错误来源之间的权衡，即更改前项目的错误检测和更改后项目的未检测。基于项目反应理论模型，提出了一种针对项目的监测统计数据，该模型消除了受试者群体中随时间变化的混杂因素。序列决策规则及其理论性质是在两种设置下发展起来的：在预言机设置下，贝叶斯变点模型是完全已知的，在更现实的设置下，模型的一些参数是未知的。模拟研究是在模拟实际操作测试的环境下进行的。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Item Pool Quality Control in Educational Testing: Change Point Model, Compound Risk, and Sequential Detection

In standardized educational testing, test items are reused in multiple test administrations. To ensure the validity of test scores, the psychometric properties of items should remain unchanged over time. In this article, we consider the sequential monitoring of test items, in particular, the detection of abrupt changes to their psychometric properties, where a change can be caused by, for example, leakage of the item or change of the corresponding curriculum. We propose a statistical framework for the detection of abrupt changes in individual items. This framework consists of (1) a multistream Bayesian change point model describing sequential changes in items, (2) a compound risk function quantifying the risk in sequential decisions, and (3) sequential decision rules that control the compound risk. Throughout the sequential decision process, the proposed decision rule balances the trade-off between two sources of errors, the false detection of prechange items, and the nondetection of postchange items. An item-specific monitoring statistic is proposed based on an item response theory model that eliminates the confounding from the examinee population which changes over time. Sequential decision rules and their theoretical properties are developed under two settings: the oracle setting where the Bayesian change point model is completely known and a more realistic setting where some parameters of the model are unknown. Simulation studies are conducted under settings that mimic real operational tests.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Journal of Educational and Behavioral Statistics Multiple-

CiteScore

4.40

自引率

4.20%

发文量

期刊介绍： Journal of Educational and Behavioral Statistics, sponsored jointly by the American Educational Research Association and the American Statistical Association, publishes articles that are original and provide methods that are useful to those studying problems and issues in educational or behavioral research. Typical papers introduce new methods of analysis. Critical reviews of current practice, tutorial presentations of less well known methods, and novel applications of already-known methods are also of interest. Papers discussing statistical techniques without specific educational or behavioral interest or focusing on substantive results without developing new statistical methods or models or making novel use of existing methods have lower priority. Simulation studies, either to demonstrate properties of an existing method or to compare several existing methods (without providing a new method), also have low priority. The Journal of Educational and Behavioral Statistics provides an outlet for papers that are original and provide methods that are useful to those studying problems and issues in educational or behavioral research. Typical papers introduce new methods of analysis, provide properties of these methods, and an example of use in education or behavioral research. Critical reviews of current practice, tutorial presentations of less well known methods, and novel applications of already-known methods are also sometimes accepted. Papers discussing statistical techniques without specific educational or behavioral interest or focusing on substantive results without developing new statistical methods or models or making novel use of existing methods have lower priority. Simulation studies, either to demonstrate properties of an existing method or to compare several existing methods (without providing a new method), also have low priority.