Communal factors in rater severity and consistency over time in high-stakes oral assessment

Language Testing (IF 2.2; Zone 1, Literature; JCR category: LANGUAGE & LINGUISTICS). Pub Date: 2024-04-10. DOI: 10.1177/02655322241239363
Reeta Neittaanmäki, Iasonas Lamprianou
Citations: 0

Abstract

This article focuses on rater severity and consistency and their relation to major changes in the rating system in a high-stakes testing context. The study is based on longitudinal data collected from 2009 to 2019 from the second language (L2) Finnish speaking subtest in the National Certificates of Language Proficiency in Finland. We investigated whether rater severity and consistency changed over that period and whether the changes could be explained by major changes in the rating system, such as the change of lead examiner, the modus of rating and training (on-site or remote), and the composition of the rater group. The data consisted of 45 rating sessions with 104 raters and 59,899 examinees and were analysed using the Many-Facets Rasch model and generalized linear mixed models. The analyses indicated that raters as a group became somewhat more lenient over time. In addition, the results showed that the rater community and its practices, the lead examiners, and the modus of rating and training can influence the rating behaviour. Finally, we elaborate on implications for both research and practice.
Source journal: Language Testing
CiteScore: 6.70
Self-citation rate: 9.80%
Articles published: 35
About the journal: Language Testing is a fully peer-reviewed international journal that publishes original research and review articles on language testing and assessment. It provides a forum for the exchange of ideas and information between people working in the fields of first and second language testing and assessment. This includes researchers and practitioners in EFL and ESL testing, and assessment in child language acquisition and language pathology. In addition, special attention is focused on issues of testing theory, experimental investigations, and the following up of practical implications.