Evaluating Equating Transformations in IRT Observed-Score and Kernel Equating Methods.

IF 1 | Zone 4, Psychology | Q4 PSYCHOLOGY, MATHEMATICAL | Applied Psychological Measurement | Pub Date: 2023-03-01 | Epub Date: 2022-10-04 | DOI: 10.1177/01466216221124087

Waldir Leôncio, Marie Wiberg, Michela Battauz


Citations: 0

Abstract


Evaluating Equating Transformations in IRT Observed-Score and Kernel Equating Methods.

Test equating is a statistical procedure to ensure that scores from different test forms can be used interchangeably. There are several methodologies available to perform equating, some of which are based on the Classical Test Theory (CTT) framework and others on the Item Response Theory (IRT) framework. This article compares equating transformations originating from three different frameworks, namely IRT Observed-Score Equating (IRTOSE), Kernel Equating (KE), and IRT Kernel Equating (IRTKE). The comparisons were made under different data-generating scenarios, which include the development of a novel data-generation procedure that allows the simulation of test data without relying on IRT parameters while still providing control over some test score properties such as distribution skewness and item difficulty. Our results suggest that IRT methods tend to provide better results than KE even when the data are not generated from IRT processes. KE might be able to provide satisfactory results if a proper pre-smoothing solution can be found, while also being much faster than IRT methods. For daily applications, we recommend observing the sensitivity of the results to the equating method, minding the importance of good model fit and meeting the assumptions of the framework.
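To make the idea of a kernel equating transformation concrete, the following is a minimal sketch and not the authors' implementation. It assumes an equivalent-groups design, takes the score probabilities as given (the pre-smoothing step mentioned in the abstract is skipped), and uses the standard Gaussian-kernel continuization with equating function e_Y(x) = G^{-1}(F(x)). The function names, the binomial toy data, and the bandwidth value 0.6 are all illustrative assumptions.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def continuized_cdf(x, scores, probs, h):
    """Gaussian-kernel continuization of a discrete score distribution.

    scores: possible raw scores; probs: their probabilities (summing to 1);
    h: bandwidth (hypothetical value chosen by the caller, not optimized here).
    """
    mu = sum(s * p for s, p in zip(scores, probs))
    var = sum(p * (s - mu) ** 2 for s, p in zip(scores, probs))
    # Shrinkage factor so the continuized variable keeps the discrete mean and variance.
    a = math.sqrt(var / (var + h * h))
    return sum(
        p * normal_cdf((x - a * s - (1.0 - a) * mu) / (a * h))
        for s, p in zip(scores, probs)
    )


def kernel_equate(x, scores_x, probs_x, scores_y, probs_y, h_x=0.6, h_y=0.6):
    """Map a form-X score to the form-Y scale by inverting G at F(x) with bisection."""
    target = continuized_cdf(x, scores_x, probs_x, h_x)
    lo, hi = min(scores_y) - 10.0, max(scores_y) + 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if continuized_cdf(mid, scores_y, probs_y, h_y) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def binomial_pmf(n, p):
    """Toy score distribution: number-correct score on n equally hard items."""
    return [math.comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(n + 1)]


# Illustrative data only: two 20-item forms, with form Y assumed easier than form X.
scores = list(range(21))
probs_x = binomial_pmf(20, 0.5)
probs_y = binomial_pmf(20, 0.6)

for x in (5, 10, 15):
    print(x, round(kernel_equate(x, scores, probs_x, scores, probs_y), 3))
```

Because form Y is easier in this toy setup, a given form-X score maps to a higher form-Y score. In practice the bandwidths are usually chosen by a data-driven criterion and the score probabilities come from a fitted pre-smoothing model or, for IRTKE, from an IRT model; neither step is shown here.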


Source journal

Applied Psychological Measurement Multiple-

CiteScore

2.30

Self-citation rate

8.30%

Articles published

Journal description: Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.