DIF investigations across groups of gender and academic background in a large-scale high-stakes language test

Studies in Language Assessment · IF 0.1 · Q4 (Linguistics) · Pub Date: 2015-01-01 · DOI: 10.58379/rshg8366
Xia-li Song, Liying Cheng, D. Klinger
Citations: 12

Abstract

High-stakes pre-entry language testing is the predominant tool used to measure test takers’ proficiency for admission to higher education in China. Given the important role of these tests, there is heated discussion about how to ensure test fairness for different groups of test takers. This study examined the fairness of the Graduate School Entrance English Examination (GSEEE), which is used to decide whether more than one million test takers can enter master’s programs in China. Using SIBTEST and content analysis, the study investigated differential item functioning (DIF) and the presence of potential bias on the GSEEE with respect to gender and academic background. Results showed that a large percentage of the GSEEE items did not provide reliable results for distinguishing good from poor performers. A number of items and item bundles exhibited DIF and differential bundle functioning (DBF), and three test reviewers identified a range of factors, such as motivation and learning styles, that potentially contributed to group performance differences. However, no consistent evidence was found that these flagged items/texts exhibited bias. While systematic bias may not have been detected, the results revealed poor test reliability, and the study highlighted an urgent need to improve test quality and to clarify the purpose of the test. DIF issues may be revisited once test quality has been improved.
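As a rough illustration of the DIF methodology the abstract refers to, the sketch below implements a simplified SIBTEST-style statistic: examinees are matched on their total score over the remaining items, and the stratum-weighted difference in proportion-correct on the studied item between the reference and focal groups (e.g., male/female, or different academic backgrounds) estimates a DIF effect size. This is a minimal sketch on made-up data, not the authors' actual procedure; SIBTEST proper also applies a regression correction to the matching scores, which is omitted here.

```python
from collections import defaultdict

def sibtest_beta(responses, groups, studied_item):
    """Simplified SIBTEST-style DIF effect estimate.

    responses: list of 0/1 item-score lists, one per examinee.
    groups: parallel list of 'ref' (reference) or 'foc' (focal) labels.
    studied_item: index of the item being checked for DIF.
    """
    # Matching score = total score on all items except the studied one.
    strata = defaultdict(lambda: {"ref": [], "foc": []})
    for resp, g in zip(responses, groups):
        match = sum(resp) - resp[studied_item]
        strata[match][g].append(resp[studied_item])

    beta, n_total = 0.0, 0
    for cell in strata.values():
        ref, foc = cell["ref"], cell["foc"]
        if ref and foc:  # use only strata containing both groups
            n_k = len(ref) + len(foc)
            # Weighted gap in proportion-correct at this matching score.
            beta += n_k * (sum(ref) / len(ref) - sum(foc) / len(foc))
            n_total += n_k
    return beta / n_total if n_total else 0.0
```

With equally able examinees, a value of beta near zero suggests no DIF on the studied item; a large positive (or negative) value suggests the item favours the reference (or focal) group after matching on ability.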