Author: Joost C. F. de Winter. Journal: International Journal of Artificial Intelligence in Education. Published: 2023-09-13. DOI: https://doi.org/10.1007/s40593-023-00372-z
Can ChatGPT Pass High School Exams on English Language Comprehension?
Abstract: Launched in late November 2022, ChatGPT, a large language model chatbot, has garnered considerable attention. However, questions remain regarding its capabilities. In this study, ChatGPT was used to complete national high school exams in the Netherlands on the topic of English reading comprehension. In late December 2022, we submitted the exam questions through the ChatGPT web interface (GPT-3.5). According to official norms, ChatGPT achieved a mean grade of 7.3 on the Dutch scale of 1 to 10, comparable to the mean grade of 6.99 for all students who took the exam in the Netherlands. However, ChatGPT occasionally required re-prompting to arrive at an explicit answer; without these nudges, the overall grade was 6.5. In March 2023, API access was made available, and a new version of ChatGPT, GPT-4, was released. We submitted the same exams to the API, and GPT-4 achieved a score of 8.3 without a need for re-prompting. Additionally, employing a bootstrapping method that incorporated randomness through ChatGPT’s ‘temperature’ parameter proved effective in self-identifying potentially incorrect answers. Finally, a re-assessment conducted with the GPT-4 model updated as of June 2023 showed no substantial change in the overall score. The present findings highlight significant opportunities but also raise concerns about the impact of ChatGPT and similar large language models on educational assessment.
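The abstract does not spell out how the temperature-based bootstrapping works, so the following is only a minimal sketch of one plausible reading: ask the same question several times at a non-zero temperature, take the most frequent answer, and flag questions where the samples disagree. The callable `ask_model`, the sample count, and the 0.8 agreement threshold are all hypothetical choices made for illustration; none of them comes from the paper or from any real API.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical interface: ask_model(question, temperature) -> answer string.
# In practice this would wrap a call to a language model; it is an assumption here.
AskFn = Callable[[str, float], str]


def consensus_answer(
    ask_model: AskFn,
    question: str,
    n_samples: int = 10,
    temperature: float = 1.0,
) -> Tuple[str, float]:
    """Sample one question repeatedly; return (majority answer, agreement rate)."""
    # Normalise answers so that "b " and "B" count as the same choice.
    answers = [
        ask_model(question, temperature).strip().upper()
        for _ in range(n_samples)
    ]
    top_answer, top_count = Counter(answers).most_common(1)[0]
    return top_answer, top_count / n_samples


def flag_uncertain(
    ask_model: AskFn,
    questions: List[str],
    threshold: float = 0.8,  # illustrative cut-off, not taken from the paper
    n_samples: int = 10,
    temperature: float = 1.0,
) -> List[Tuple[str, str, float]]:
    """Return (question, majority answer, agreement) for low-agreement questions."""
    flagged = []
    for question in questions:
        answer, agreement = consensus_answer(
            ask_model, question, n_samples, temperature
        )
        if agreement < threshold:
            flagged.append((question, answer, agreement))
    return flagged
```

Questions returned by `flag_uncertain` are those where the sampled answers scatter across options, which is the kind of signal that could be used to mark an answer as potentially incorrect without consulting the answer key.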
About the journal:
IJAIED publishes papers concerned with the application of AI to education. It aims to help the development of principles for the design of computer-based learning systems. Its premise is that such principles involve the modelling and representation of relevant aspects of knowledge, before implementation or during execution, and hence require the application of AI techniques and concepts. IJAIED has a very broad notion of the scope of AI and of a "computer-based learning system", as indicated by the following list of topics considered to be within the scope of IJAIED: adaptive and intelligent multimedia and hypermedia systems; agent-based learning environments; AIED and teacher education; architectures for AIED systems; assessment and testing of learning outcomes; authoring systems and shells for AIED systems; Bayesian and statistical methods; case-based systems; cognitive development; cognitive models of problem-solving; cognitive tools for learning; computer-assisted language learning; computer-supported collaborative learning; dialogue (argumentation, explanation, negotiation, etc.); discovery environments and microworlds; distributed learning environments; educational robotics; embedded training systems; empirical studies to inform the design of learning environments; environments to support the learning of programming; evaluation of AIED systems; formal models of components of AIED systems; help and advice systems; human factors and interface design; instructional design principles; instructional planning; intelligent agents on the internet; intelligent courseware for computer-based training; intelligent tutoring systems; knowledge and skill acquisition; knowledge representation for instruction; modelling metacognitive skills; modelling pedagogical interactions; motivation; natural language interfaces for instructional systems; networked learning and teaching systems; neural models applied to AIED systems; performance support systems; practical, real-world applications of AIED systems; qualitative reasoning in simulations; situated learning and cognitive apprenticeship; social and cultural aspects of learning; student modelling and cognitive diagnosis; support for knowledge building communities; support for networked communication; theories of learning and conceptual change; tools for administration and curriculum integration; tools for the guided exploration of information resources.