Detecting ChatGPT-generated essays in a large-scale writing assessment: Is there a bias against non-native English speakers?

Computers & Education, Vol. 217, Article 105070. Published: 2024-05-06. DOI: 10.1016/j.compedu.2024.105070. Impact Factor 8.9; JCR Q1, Computer Science, Interdisciplinary Applications; CAS Tier 1 (Education). Full text: https://www.sciencedirect.com/science/article/pii/S0360131524000848
Yang Jiang, Jiangang Hao, Michael Fauss, Chen Li
Citations: 0

Abstract

With the prevalence of generative AI tools like ChatGPT, automated detectors of AI-generated text have been increasingly used in education to detect the misuse of these tools (e.g., cheating in assessments). Recently, the responsible use of these detectors has attracted considerable attention. Research has shown that publicly available detectors are more likely to misclassify essays written by non-native English speakers as AI-generated than those written by native English speakers. In this study, we address these concerns by leveraging carefully sampled large-scale data from the Graduate Record Examinations (GRE) writing assessment. We developed multiple detectors of ChatGPT-generated essays based on linguistic features from the ETS e-rater engine and text perplexity features, and investigated their performance and potential bias. Results showed that our carefully constructed detectors not only achieved near-perfect detection accuracy but also showed no evidence of bias disadvantaging non-native English speakers. The findings of this study contribute to the ongoing debates surrounding the formulation of policies for utilizing AI-generated content detectors in education.
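One of the feature families the abstract mentions is text perplexity. As a minimal illustration (not the authors' implementation, and using made-up per-token log-probabilities rather than a real language model), perplexity is the exponentiated average negative log-probability a language model assigns to the tokens of a text; more predictable text tends to score lower:

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp(-mean per-token natural-log probability)."""
    n = len(token_log_probs)
    return math.exp(-sum(token_log_probs) / n)

# Hypothetical per-token log-probabilities from some language model.
# Highly predictable prose receives higher token probabilities
# (log-probs closer to 0), hence lower perplexity.
sample_a = [-4.2, -3.1, -5.0, -2.8, -4.6]  # less predictable text
sample_b = [-1.1, -0.9, -1.4, -0.7, -1.2]  # more predictable text

print(perplexity(sample_a))  # higher perplexity
print(perplexity(sample_b))  # lower perplexity
```

A detector such as those described in the abstract would use scores like these (alongside linguistic features) as inputs to a classifier, rather than thresholding raw perplexity directly.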

Source journal: Computers & Education (Engineering/Technology — Computer Science, Interdisciplinary Applications)
CiteScore: 27.10
Self-citation rate: 5.80%
Articles per year: 204
Review time: 42 days
Journal description: Computers & Education seeks to advance understanding of how digital technology can improve education by publishing high-quality research that expands both theory and practice. The journal welcomes research papers exploring the pedagogical applications of digital technology, with a focus broad enough to appeal to the wider education community.