Comparing two formats of data-driven rating scales for classroom assessment of pragmatic performance with roleplays

IF 2.2 | Zone 1, Literature | LANGUAGE & LINGUISTICS | Language Testing | Pub Date: 2023-11-29 | DOI: 10.1177/02655322231210217
Yunwen Su, Sun-Young Shin

Abstract

Rating scales that language testers design should be tailored to the specific test purpose and score use as well as reflect the target construct. Researchers have long argued for the value of data-driven scales for classroom performance assessment, because they are specific to pedagogical tasks and objectives, have rich descriptors to offer useful diagnostic information, and exhibit robust content representativeness and stable measurement properties. This sequential mixed methods study compares two data-driven rating scales with multiple criteria that use different formats for pragmatic performance. They were developed using roleplays performed by 43 second-language learners of Mandarin—the hierarchical-binary (HB) scale, developed through close analysis of performance data, and the multi-trait (MT) scale derived from the HB, which has the same criteria but takes the format of an analytic scale. Results revealed the influence of format, albeit to a limited extent: MT showed a marginal advantage over HB in terms of overall reliability, practicality, and discriminatory power, though measurement properties of the two scales were largely comparable. All raters were positive about the pedagogical value of both scales. This study reveals that rater perceptions of the ease of use and effectiveness of both scales provide further insights into scale functioning.
Source journal: Language Testing
CiteScore: 6.70
Self-citation rate: 9.80%
Articles published: 35
Journal description: Language Testing is a fully peer-reviewed international journal that publishes original research and review articles on language testing and assessment. It provides a forum for the exchange of ideas and information between people working in the fields of first and second language testing and assessment. This includes researchers and practitioners in EFL and ESL testing, and assessment in child language acquisition and language pathology. In addition, special attention is given to issues of testing theory, experimental investigations, and the follow-up of practical implications.