Rater Connections and the Detection of Bias in Performance Assessment
Stefanie A. Wind
Measurement: Interdisciplinary Research and Perspectives, 26(1), 91–106
Published: 2022-04-03 (Journal Article) · JCR Q3, Social Sciences, Interdisciplinary · Impact Factor 0.6
DOI: 10.1080/15366367.2021.1942672 (https://doi.org/10.1080/15366367.2021.1942672)
Citations: 1
Abstract
In many performance assessments, one or two raters from the complete rater pool score each performance, resulting in a sparse rating design in which there are relatively few observations of each rater compared with the complete sample of students. Although sparse rating designs can be constructed to facilitate estimation of student achievement, the relatively limited observations of each rater can pose challenges for identifying raters who exhibit scoring idiosyncrasies specific to individual examinees or subgroups of examinees, such as differential rater functioning (DRF; i.e., rater bias). In particular, when raters who exhibit DRF are directly connected only to other raters who exhibit the same type of DRF, there is limited information with which to detect the effect. On the other hand, when raters who exhibit DRF are connected to raters who do not exhibit DRF, the effect may be detected more readily. In this study, a simulation is used to systematically examine the degree to which the nature of the connections among raters who exhibit common DRF patterns in sparse rating designs affects the sensitivity of DRF indices. Additional "monitoring ratings" and variable assignment of raters to student performances are considered as strategies for improving DRF detection in sparse designs. The results indicate that the nature of the connections among DRF raters has a substantial impact on the sensitivity of DRF indices, and that monitoring ratings and variable rater assignment to student performances can improve DRF detection.
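The rater-connectivity idea at the center of the abstract can be illustrated with a minimal sketch (hypothetical data and function names, not taken from the article): two raters are linked when they score a common examinee, directly or through intermediaries, and a sparse design can leave the rater pool split into disconnected subsets. A single "monitoring rating" that crosses blocks links the design.

```python
from collections import defaultdict

def rater_components(ratings):
    """Group raters into connected components: two raters belong to the
    same component when a chain of shared examinees links them."""
    by_examinee = defaultdict(set)
    for rater, examinee in ratings:
        by_examinee[examinee].add(rater)

    # Union-find over raters (path halving).
    parent = {}
    def find(r):
        parent.setdefault(r, r)
        while parent[r] != r:
            parent[r] = parent[parent[r]]
            r = parent[r]
        return r
    def union(a, b):
        parent[find(a)] = find(b)

    for rater, _ in ratings:          # register every rater
        find(rater)
    for raters in by_examinee.values():
        first, *rest = sorted(raters)
        for other in rest:            # link all raters of one examinee
            union(first, other)

    components = defaultdict(set)
    for r in parent:
        components[find(r)].add(r)
    return sorted(sorted(c) for c in components.values())

# Hypothetical sparse design: raters A, B score examinees 1-2;
# raters C, D score examinees 3-4 -- two disconnected subsets.
sparse = [("A", 1), ("B", 1), ("A", 2), ("B", 2),
          ("C", 3), ("D", 3), ("C", 4), ("D", 4)]
print(rater_components(sparse))            # [['A', 'B'], ['C', 'D']]

# One monitoring rating (rater A also scores examinee 3) links the design.
linked = sparse + [("A", 3)]
print(rater_components(linked))            # [['A', 'B', 'C', 'D']]
```

If raters A and B shared the same DRF pattern, the first design offers no unbiased comparison within their component; the monitoring rating in the second design connects them to raters C and D, giving the DRF indices something to detect the bias against.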