Detecting Rater Centrality Effects in Performance Assessments: A Model-Based Comparison of Centrality Indices

Measurement-Interdisciplinary Research and Perspectives (IF 0.6, Q3, Social Sciences, Interdisciplinary) · Pub Date: 2022-10-02 · DOI: 10.1080/15366367.2021.1972654
K. Jin, T. Eckes
Citations: 6

Abstract

Recent research on rater effects in performance assessments has increasingly focused on rater centrality, the tendency to assign scores clustering around the rating scale’s middle categories. In the present paper, we adopted Jin and Wang’s (2018) extended facets modeling approach and constructed a centrality continuum, ranging from raters exhibiting strong central tendencies to raters exhibiting strong tendencies in the opposite direction (extremity). In two simulation studies, we examined three model-based centrality detection indices (rater infit statistics, residual–expected correlations, and rater threshold SDs) as well as the raw-score SD in terms of their efficiency of reconstructing the true rater centrality rank ordering. Findings confirmed the superiority of the residual–expected correlation, rater threshold SD, and raw-score SD statistics, particularly when the examinee sample size was large and the number of scoring criteria was high. By contrast, the infit statistic results were much less consistent and, under conditions of large differences between criterion difficulties, suggested erroneous conclusions about raters’ central tendencies. Analyzing real rating data from a large-scale speaking performance assessment confirmed that infit statistics are unsuitable for identifying raters’ central tendencies. The discussion focuses on detecting centrality effects under different facets models and the indices’ implications for rater monitoring and fair performance assessment.
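To make two of the indices named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' simulation design: it computes the raw-score SD and the residual–expected correlation for toy raters on a simulated 0–5 scale. The shrinkage model, scale, and all parameter values are invented for illustration; the infit statistic and rater threshold SDs require a fitted facets model and are not sketched here.

```python
# Illustrative sketch only (not the authors' simulation design): the raw-score
# SD and the residual-expected correlation, two of the centrality indices
# compared in the paper, computed on toy rating data. The shrinkage model and
# all parameter values below are invented for illustration.
import numpy as np

rng = np.random.default_rng(7)
n_examinees = 200
ability = rng.normal(0.0, 1.0, n_examinees)   # latent examinee proficiency

def simulate_ratings(ability, centrality):
    """Simulate one rater; centrality > 0 shrinks scores toward the midpoint."""
    true_score = 2.5 + 1.2 * ability          # generating model on a 0-5 scale
    observed = (1 - centrality) * true_score + centrality * 2.5
    observed += rng.normal(0.0, 0.4, ability.size)
    return np.clip(np.rint(observed), 0, 5)

results = {}
for label, c in [("extremity", -0.3), ("neutral", 0.0), ("central", 0.6)]:
    scores = simulate_ratings(ability, c)
    # Expected scores come from the common (group-level) model, not a per-rater
    # fit; a central rater's residuals then track the expected scores negatively.
    expected = 2.5 + 1.2 * ability
    residual = scores - expected
    results[label] = {
        "raw_sd": scores.std(ddof=1),
        "res_exp_corr": np.corrcoef(residual, expected)[0, 1],
    }

for label, stats in results.items():
    print(f"{label:>9}: raw SD = {stats['raw_sd']:.2f}, "
          f"residual-expected r = {stats['res_exp_corr']:+.2f}")
```

With these toy settings, the central rater shows the smallest raw-score SD and a clearly negative residual–expected correlation, while the extremity rater shows the opposite pattern; this is the rank ordering along the centrality continuum that the indices compared in the paper are meant to recover.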
Journal
Measurement-Interdisciplinary Research and Perspectives (Social Sciences, Interdisciplinary)
CiteScore: 1.80
Self-citation rate: 0.00%
Articles published: 23