Can ChatGPT Pass High School Exams on English Language Comprehension?

International Journal of Artificial Intelligence in Education | IF 4.7 | JCR Q1 (Computer Science, Interdisciplinary Applications) | Pub Date: 2023-09-13 | DOI: 10.1007/s40593-023-00372-z
Joost C. F. de Winter
Citations: 22

Abstract

Launched in late November 2022, ChatGPT, a large language model chatbot, has garnered considerable attention. However, questions remain about its capabilities. In this study, ChatGPT was used to complete national high school exams in the Netherlands on the topic of English reading comprehension. In late December 2022, we submitted the exam questions through the ChatGPT web interface (GPT-3.5). According to official norms, ChatGPT achieved a mean grade of 7.3 on the Dutch scale of 1 to 10, comparable to the mean grade of 6.99 for all students who took the exam in the Netherlands. However, ChatGPT occasionally required re-prompting to arrive at an explicit answer; without these nudges, the overall grade was 6.5. In March 2023, API access was made available, and a new version of ChatGPT, GPT-4, was released. We submitted the same exams to the API, and GPT-4 achieved a score of 8.3 without a need for re-prompting. Additionally, a bootstrapping method that incorporated randomness through ChatGPT’s ‘temperature’ parameter proved effective in self-identifying potentially incorrect answers. Finally, a re-assessment conducted with the GPT-4 model updated as of June 2023 showed no substantial change in the overall score. The present findings highlight significant opportunities but also raise concerns about the impact of ChatGPT and similar large language models on educational assessment.
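The abstract does not spell out how the temperature-based self-check was implemented. As a rough illustration of the general idea only, the sketch below asks the same question several times at a nonzero temperature, takes a majority vote, and flags questions where the sampled answers disagree. This is plain repeated sampling with a majority vote, not necessarily the paper's procedure. The names `ask`, `stub_model`, `flag_uncertain`, the sample count `n`, and the agreement `threshold` are all hypothetical, and `stub_model` stands in for a real model call so that the example runs offline.

```python
import random
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical signature: ask(question, temperature) -> answer label such as "A".
AskFn = Callable[[str, float], str]


def sample_answers(ask: AskFn, question: str, n: int, temperature: float) -> List[str]:
    """Query the model n times for the same question."""
    return [ask(question, temperature) for _ in range(n)]


def consensus(answers: List[str]) -> Tuple[str, float]:
    """Return the most frequent answer and the fraction of samples that agree with it."""
    answer, votes = Counter(answers).most_common(1)[0]
    return answer, votes / len(answers)


def flag_uncertain(
    ask: AskFn,
    questions: List[str],
    n: int = 10,
    temperature: float = 1.0,
    threshold: float = 0.8,
) -> List[Tuple[str, str, float]]:
    """List (question, majority answer, agreement) for questions below the threshold."""
    flagged = []
    for question in questions:
        answer, agreement = consensus(sample_answers(ask, question, n, temperature))
        if agreement < threshold:
            flagged.append((question, answer, agreement))
    return flagged


def stub_model(question: str, temperature: float) -> str:
    """Offline stand-in for a model call (assumption, not a real API)."""
    if temperature == 0 or question.startswith("easy"):
        return "A"  # stable answer
    return random.choice("ABCD")  # unstable answer under sampling noise


if __name__ == "__main__":
    for item in flag_uncertain(stub_model, ["easy: question 1", "hard: question 2"]):
        print(item)
```

Questions flagged this way would be candidates for human review; the threshold trades off how many correct answers are flagged against how many incorrect ones slip through.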
Source journal
International Journal of Artificial Intelligence in Education (Computer Science, Interdisciplinary Applications)
CiteScore: 11.10
Self-citation rate: 6.10%
Articles published: 32
Journal description: IJAIED publishes papers concerned with the application of AI to education. It aims to help the development of principles for the design of computer-based learning systems. Its premise is that such principles involve the modelling and representation of relevant aspects of knowledge, before implementation or during execution, and hence require the application of AI techniques and concepts. IJAIED has a very broad notion of the scope of AI and of a "computer-based learning system", as indicated by the following list of topics considered to be within the scope of IJAIED: adaptive and intelligent multimedia and hypermedia systems; agent-based learning environments; AIED and teacher education; architectures for AIED systems; assessment and testing of learning outcomes; authoring systems and shells for AIED systems; Bayesian and statistical methods; case-based systems; cognitive development; cognitive models of problem-solving; cognitive tools for learning; computer-assisted language learning; computer-supported collaborative learning; dialogue (argumentation, explanation, negotiation, etc.); discovery environments and microworlds; distributed learning environments; educational robotics; embedded training systems; empirical studies to inform the design of learning environments; environments to support the learning of programming; evaluation of AIED systems; formal models of components of AIED systems; help and advice systems; human factors and interface design; instructional design principles; instructional planning; intelligent agents on the internet; intelligent courseware for computer-based training; intelligent tutoring systems; knowledge and skill acquisition; knowledge representation for instruction; modelling metacognitive skills; modelling pedagogical interactions; motivation; natural language interfaces for instructional systems; networked learning and teaching systems; neural models applied to AIED systems; performance support systems; practical, real-world applications of AIED systems; qualitative reasoning in simulations; situated learning and cognitive apprenticeship; social and cultural aspects of learning; student modelling and cognitive diagnosis; support for knowledge building communities; support for networked communication; theories of learning and conceptual change; tools for administration and curriculum integration; tools for the guided exploration of information resources.