Modeling Rating Order Effects Under Item Response Theory Models for Rater-Mediated Assessments.

Applied Psychological Measurement (IF 1; Zone 4, Psychology; JCR Q4, Psychology, Mathematical). Pub Date: 2023-06-01. DOI: 10.1177/01466216231174566
Hung-Yu Huang
Volume 47, Issue 4, pp. 312-327. PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/7c/68/10.1177_01466216231174566.PMC10240569.pdf
Citations: 1

Abstract

Rater effects are commonly observed in rater-mediated assessments. By using item response theory (IRT) modeling, raters can be treated as independent factors that function as instruments for measuring ratees. Most rater effects are static and can be addressed appropriately within an IRT framework, and a few models have been developed for dynamic rater effects. Operational rating projects often require human raters to continuously and repeatedly score ratees over a certain period, imposing a burden on the cognitive processing abilities and attention spans of raters that stems from judgment fatigue and thus affects the rating quality observed during the rating period. As a result, ratees' scores may be influenced by the order in which they are graded by raters in a rating sequence, and the rating order effect should be considered in new IRT models. In this study, two types of many-faceted (MF)-IRT models are developed to account for such dynamic rater effects, which assume that rater severity can drift systematically or stochastically. The results obtained from two simulation studies indicate that the parameters of the newly developed models can be estimated satisfactorily using Bayesian estimation and that disregarding the rating order effect produces biased model structure and ratee proficiency parameter estimations. A creativity assessment is outlined to demonstrate the application of the new models and to investigate the consequences of failing to detect the possible rating order effect in a real rater-mediated evaluation.
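
The abstract describes rater severity that drifts across the rating sequence either systematically or stochastically, but it does not give the model equations. The following is a minimal simulation sketch of that idea only, under assumptions that are not taken from the paper: a rating-scale many-facet formulation in which each adjacent-category logit is proficiency minus item difficulty minus rater severity minus a threshold, a linear trend for the systematic case, and a Gaussian random walk for the stochastic case. All function and parameter names (`category_probs`, `systematic_severity`, `stochastic_severity_path`, `simulate_ratings`, `slope`, `sd`) are hypothetical.

```python
import math
import random


def category_probs(theta, item_difficulty, severity, thresholds):
    """Category probabilities for one rating under an assumed rating-scale
    many-facet model: the logit of category k versus k-1 is
    theta - item_difficulty - severity - thresholds[k-1]."""
    logits = [0.0]
    running = 0.0
    for tau in thresholds:
        running += theta - item_difficulty - severity - tau
        logits.append(running)
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]


def systematic_severity(base, slope, position):
    """Assumed systematic drift: severity changes linearly with the
    position of the ratee in the rating sequence."""
    return base + slope * position


def stochastic_severity_path(base, sd, n_positions, rng):
    """Assumed stochastic drift: severity follows a Gaussian random walk
    over positions in the rating sequence."""
    path = [base]
    for _ in range(n_positions - 1):
        path.append(path[-1] + rng.gauss(0.0, sd))
    return path


def simulate_ratings(thetas, item_difficulty, severities, thresholds, rng):
    """Draw one rating per ratee, using the severity in force at that
    ratee's position in the sequence."""
    ratings = []
    for theta, severity in zip(thetas, severities):
        probs = category_probs(theta, item_difficulty, severity, thresholds)
        ratings.append(rng.choices(range(len(probs)), weights=probs)[0])
    return ratings


if __name__ == "__main__":
    rng = random.Random(2023)
    n_ratees = 200
    thresholds = [-1.0, 0.0, 1.0]  # four score categories, 0 to 3
    thetas = [rng.gauss(0.0, 1.0) for _ in range(n_ratees)]

    # Rater becomes steadily more severe over the sequence.
    linear = [systematic_severity(0.0, 0.005, t) for t in range(n_ratees)]
    # Rater severity wanders without a fixed direction.
    walk = stochastic_severity_path(0.0, 0.05, n_ratees, rng)

    print(simulate_ratings(thetas, 0.0, linear, thresholds, rng)[:10])
    print(simulate_ratings(thetas, 0.0, walk, thresholds, rng)[:10])
```

Under these assumptions, two ratees of equal proficiency can receive different expected scores purely because of where they fall in the sequence, which is the rating order effect the abstract refers to. Fitting such a model, for example with the Bayesian estimation the abstract mentions, is outside the scope of this sketch.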

Source journal: Applied Psychological Measurement
CiteScore: 2.30
Self-citation rate: 8.30%
Articles published: 50
About the journal: Applied Psychological Measurement publishes empirical research on the application of techniques of psychological measurement to substantive problems in all areas of psychology and related disciplines.
Latest articles in this journal:
Effect of Differential Item Functioning on Computer Adaptive Testing Under Different Conditions.
Evaluating the Construct Validity of Instructional Manipulation Checks as Measures of Careless Responding to Surveys.
A Mark-Recapture Approach to Estimating Item Pool Compromise.
Estimating Test-Retest Reliability in the Presence of Self-Selection Bias and Learning/Practice Effects.
The Improved EMS Algorithm for Latent Variable Selection in M3PL Model.