Using Repeated Ratings to Improve Measurement Precision in Incomplete Rating Designs.

Journal of Applied Measurement. Publication date: 2018-01-01

Eli Jones, Stefanie A Wind

Volume 19, issue 2, pages 148-161. Publication type: Journal Article.

Citations: 0

Abstract

When selecting a design for rater-mediated assessments, one important consideration is the number of raters who rate each examinee. In balancing costs and rater-coverage, rating designs are often implemented wherein only a portion of the examinees are rated by each judge, resulting in large amounts of missing data. One drawback to these sparse rating designs is the reduced precision of examinee ability estimates they provide. When increasing the number of raters per examinee is not feasible, another option may be to increase the number of ratings provided by each rater per examinee. This study applies a Rasch model to explore the effect of increasing the number of rating occasions used by raters to judge examinee proficiency. We used a simulation study to approximate a sparse but connected rater network with a sequentially increasing number of repeated ratings per examinee. The generated data were used to explore the influence of repeated ratings on the precision of rater, examinee, and task parameter estimates as measured by parameter standard errors, the correlation of sparse parameter estimates to true estimates, and the root mean square error of parameter estimates. Results suggest that increasing the number of rating occasions significantly improves the precision of examinee and rater parameter estimates. Results also suggest that parameter recovery levels of rater and task estimates are quite robust to reductions in the number of repeated ratings, although examinee parameter estimates are more sensitive to them. Implications for research and practice in the context of rater-mediated assessment designs are discussed.
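The precision criteria named in the abstract (parameter standard errors, correlation with true values, and root mean square error) can be illustrated with a small sketch. It is not the authors' simulation code. It assumes a dichotomous many-facet Rasch model with logit = examinee ability - rater severity - task difficulty, treats rater and task parameters as known, and treats repeated ratings by the same rater as conditionally independent. All function names and the numeric settings are hypothetical choices for illustration.

```python
import numpy as np

def rasch_prob(theta, rater_severity, task_difficulty):
    """P(X = 1) under an assumed dichotomous many-facet Rasch model."""
    logit = theta - rater_severity - task_difficulty
    return 1.0 / (1.0 + np.exp(-logit))

def examinee_se(theta, rater_severities, task_difficulties, n_occasions):
    """Asymptotic SE of an examinee's ability estimate.

    Assumes rater and task parameters are known and that each of the
    n_occasions repeated ratings contributes independent information.
    """
    # Success probability for every rater-by-task combination.
    p = rasch_prob(theta,
                   rater_severities[:, None],
                   task_difficulties[None, :])
    # Fisher information for theta is the sum of p * (1 - p) over ratings.
    information = n_occasions * np.sum(p * (1.0 - p))
    return 1.0 / np.sqrt(information)

def rmse(estimates, true_values):
    """Root mean square error between estimated and generating values."""
    estimates = np.asarray(estimates, dtype=float)
    true_values = np.asarray(true_values, dtype=float)
    return float(np.sqrt(np.mean((estimates - true_values) ** 2)))

def recovery_correlation(estimates, true_values):
    """Pearson correlation between estimated and generating values."""
    return float(np.corrcoef(estimates, true_values)[0, 1])

# Hypothetical sparse design: each examinee is seen by only two raters.
rng = np.random.default_rng(0)
raters = rng.normal(0.0, 0.5, size=2)  # assumed rater severities
tasks = rng.normal(0.0, 1.0, size=4)   # assumed task difficulties

for k in (1, 2, 4):
    print(k, examinee_se(0.0, raters, tasks, k))
```

Under these assumptions the standard error shrinks in proportion to one over the square root of the number of rating occasions, so quadrupling the occasions halves it. Real repeated ratings from the same rater are likely to be correlated, which would make the gain smaller than this sketch implies.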

