Examining Rater Performance on the CELBAN Speaking: A Many-Facets Rasch Measurement Analysis

Canadian Journal of Applied Linguistics (Language & Linguistics; IF 0.5). Pub Date: 2020-10-16. DOI: 10.37213/cjal.2020.30436
Peiyu Wang, Karen L. Coetzee, Andy Strachan, S. Monteiro, Liying Cheng
Citations: 2

Abstract

Internationally educated nurses’ (IENs) English language proficiency is critical to professional licensure, as communication is a key competency for safe practice. The Canadian English Language Benchmark Assessment for Nurses (CELBAN) is Canada’s only Canadian Language Benchmarks (CLB)-referenced examination used in the context of healthcare regulation. This high-stakes assessment is intended to provide proof of proficiency for IENs seeking licensure in Canada and a measure of public safety for nursing regulators. Because rating speaking performance involves judgement, understanding the quality of rater performance is crucial to maintaining speaking test quality when examination results are used for high-stakes decisions, and such use requires strong reliability evidence (Koizumi et al., 2017). This study examined rater performance on the CELBAN Speaking component using Many-Facets Rasch Measurement (MFRM). Specifically, it examined CELBAN rater reliability in terms of consistency and severity, rating bias, and use of the rating scale. The study was based on a sample of 115 raters across eight test sites in Canada and results from 2,698 examinations across four parallel versions. Findings demonstrated relatively high inter-rater and intra-rater reliability, and showed that the CLB-based speaking descriptors (CLB 6-9) provided sufficient information for raters to discriminate among examinees’ levels of oral proficiency. There was no influence of test site or test version, offering validity evidence to support test use for high-stakes purposes. Grammar, among the eight speaking criteria, was identified as the most difficult criterion on the scale and the one showing the most rater bias. This study highlights the value of MFRM analysis in rater performance research, with implications for rater training. It is one of the first studies to use MFRM with a CLB-referenced high-stakes assessment within the Canadian context.
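The abstract does not state the exact model specification used, so the following is only a minimal sketch of how a many-facet Rasch model relates examinee ability, criterion difficulty, and rater severity to a rating. It assumes the common rating-scale form, in which the log-odds of receiving category k rather than k-1 equal ability minus criterion difficulty minus rater severity minus the step threshold for k. The function names, the four-category scale (loosely mirroring CLB 6-9), and all numeric values are hypothetical illustrations, not figures from the study.

```python
import math


def mfrm_category_probs(ability, criterion_difficulty, rater_severity, thresholds):
    """Category probabilities under a rating-scale many-facet Rasch model.

    Assumed form: log(P_k / P_{k-1}) = ability - criterion_difficulty
                                       - rater_severity - thresholds[k-1]
    All parameters are in logits. Returns one probability per category,
    from the lowest category (index 0) to the highest.
    """
    # Log-numerator of the lowest category is fixed at 0; each higher
    # category adds one more adjacent-category logit.
    log_numerators = [0.0]
    for step in thresholds:
        log_numerators.append(
            log_numerators[-1]
            + ability - criterion_difficulty - rater_severity - step
        )
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(value - peak) for value in log_numerators]
    total = sum(weights)
    return [weight / total for weight in weights]


def expected_score(probs):
    """Expected category index given the category probabilities."""
    return sum(index * p for index, p in enumerate(probs))


# Hypothetical values: one examinee, one criterion, two raters who differ
# only in severity, and three step thresholds (four rating categories).
steps = [-1.0, 0.0, 1.0]
lenient = mfrm_category_probs(0.5, 0.0, -0.5, steps)
severe = mfrm_category_probs(0.5, 0.0, 0.5, steps)

print("lenient rater expected category:", round(expected_score(lenient), 3))
print("severe rater expected category:", round(expected_score(severe), 3))
```

In this sketch a more severe rater lowers the expected rating for the same examinee and criterion, which is the kind of severity difference an MFRM analysis is meant to separate from examinee ability. A criterion such as grammar being "most difficult" would correspond to a larger criterion difficulty value in the same equation.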