Evaluating Equating Transformations in IRT Observed-Score and Kernel Equating Methods.

Applied Psychological Measurement · Impact Factor 1.0 · JCR Q4 (Psychology, Mathematical)
Pub Date: 2023-03-01 · Epub Date: 2022-10-04 · DOI: 10.1177/01466216221124087
Waldir Leôncio, Marie Wiberg, Michela Battauz
Applied Psychological Measurement, Vol. 47, No. 2, pp. 123-140.
Open-access PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/74/30/10.1177_01466216221124087.PMC9979196.pdf
Citations: 0

Abstract

Test equating is a statistical procedure used to ensure that scores from different test forms can be used interchangeably. Several methodologies are available to perform equating, some based on the Classical Test Theory (CTT) framework and others on the Item Response Theory (IRT) framework. This article compares equating transformations originating from three different frameworks, namely IRT Observed-Score Equating (IRTOSE), Kernel Equating (KE), and IRT Kernel Equating (IRTKE). The comparisons were made under different data-generating scenarios, including a novel data-generation procedure that simulates test data without relying on IRT parameters while still providing control over test score properties such as distribution skewness and item difficulty. Our results suggest that IRT methods tend to provide better results than KE even when the data are not generated by IRT processes. KE may still provide satisfactory results if a proper pre-smoothing solution can be found, while also being much faster than the IRT methods. For practical applications, we recommend examining the sensitivity of the results to the choice of equating method, while minding the importance of good model fit and of meeting the assumptions of the chosen framework.
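The kernel equating transformation compared in the abstract can be illustrated with a minimal numerical sketch. This simplified version continuizes each form's discrete score distribution with a Gaussian kernel and then maps a form-X score to the form-Y scale via the equipercentile rule e_Y(x) = F_Y^{-1}(F_X(x)); it deliberately omits pre-smoothing and the mean/variance-preserving continuization used in full KE, and all function names here are illustrative, not from the article.

```python
import math


def gaussian_kernel_cdf(scores, probs, h, x):
    """Continuized score CDF via Gaussian kernel smoothing:
    F(x) ~= sum_j p_j * Phi((x - s_j) / h), where Phi is the
    standard normal CDF and h is the bandwidth."""
    return sum(
        p * 0.5 * (1.0 + math.erf((x - s) / (h * math.sqrt(2.0))))
        for s, p in zip(scores, probs)
    )


def kernel_equate(x, scores_x, probs_x, scores_y, probs_y, h=0.6):
    """Map a score x on form X to the form-Y scale:
    e_Y(x) = F_Y^{-1}(F_X(x)), with the inverse of the smooth,
    strictly increasing CDF found by bisection."""
    target = gaussian_kernel_cdf(scores_x, probs_x, h, x)
    lo, hi = min(scores_y) - 10.0, max(scores_y) + 10.0
    for _ in range(80):  # 80 halvings give far more than double precision
        mid = 0.5 * (lo + hi)
        if gaussian_kernel_cdf(scores_y, probs_y, h, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

When both forms have identical score distributions the transformation reduces to the identity, which provides a quick sanity check; in practice the bandwidth h would be chosen by a penalty criterion rather than fixed.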


Source journal: Applied Psychological Measurement
CiteScore: 2.30 · Self-citation rate: 8.30% · Articles per year: 50
Journal description: Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.
Latest articles in this journal:
- Effect of Differential Item Functioning on Computer Adaptive Testing Under Different Conditions.
- Evaluating the Construct Validity of Instructional Manipulation Checks as Measures of Careless Responding to Surveys.
- A Mark-Recapture Approach to Estimating Item Pool Compromise.
- Estimating Test-Retest Reliability in the Presence of Self-Selection Bias and Learning/Practice Effects.
- The Improved EMS Algorithm for Latent Variable Selection in M3PL Model.