Title: Detecting AI-generated essays: the ChatGPT challenge
Author: Ilker Cingillioglu
Journal: International Journal of Information and Learning Technology
DOI: https://doi.org/10.1108/ijilt-03-2023-0043
Publication date: 2023-05-01
Publication type: Journal Article
Impact factor: 2.4 (JCR Q3, Computer Science, Interdisciplinary Applications)
Open access: No
Citations: 3
Abstract
Purpose: With the advent of ChatGPT, a sophisticated generative artificial intelligence (AI) tool, maintaining academic integrity in all educational settings has become a challenge for educators. This paper discusses a method and the strategies needed to confront this challenge.
Design/methodology/approach: In this study, a language model was defined to achieve high accuracy in distinguishing ChatGPT-generated essays from human-written essays, with a particular focus on not falsely classifying genuinely human-written essays as AI-generated (Negative).
Findings: Using a support vector machine (SVM) algorithm, 100% accuracy was recorded for identifying human-generated essays. The author discusses the key role of Recall and the F2 score in measuring classification performance, and the importance of eliminating False Negatives so that no actual human-generated essay is incorrectly classified as AI-generated. The results of the proposed model's classification algorithms were compared with those of AI-generated text detection software developed by OpenAI, GPTZero and Copyleaks.
Practical implications: AI-generated essays submitted by students can be detected with high accuracy by teachers and educational designers using the proposed language model and machine learning (ML) classifier. Human (student)-generated essays can and must be correctly identified with 100% accuracy, even if overall classification accuracy is slightly reduced.
Originality/value: This is the first and only study to use an n-gram bag-of-words (BOW) discrepancy language model as input to a classifier for this prediction task and to compare its classification results empirically with those of other AI-generated text detection software.
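The emphasis on Recall and the F2 score can be illustrated with a small worked example. The sketch below is not the author's code. It assumes, following the abstract's convention, that human-written essays are the positive class and AI-generated essays the negative class, so a False Negative is a human essay flagged as AI-generated. The function name, the label strings "human" and "ai", and the toy data are hypothetical.

```python
from typing import Sequence, Tuple


def recall_and_fbeta(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "human",  # assumed label for the positive class
    beta: float = 2.0,        # beta = 2 weights recall more heavily than precision
) -> Tuple[float, float]:
    """Return (recall, F-beta) for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    b2 = beta ** 2
    denom = b2 * precision + recall
    fbeta = (1 + b2) * precision * recall / denom if denom else 0.0
    return recall, fbeta


# Toy data: every human essay is recognised (no False Negatives),
# while one AI-generated essay slips through as "human" (one False Positive).
y_true = ["human"] * 4 + ["ai"] * 4
y_pred = ["human"] * 4 + ["human", "ai", "ai", "ai"]

recall, f2 = recall_and_fbeta(y_true, y_pred)
print(f"recall={recall:.3f}, F2={f2:.3f}")
```

In this toy case recall for human essays is 1.0 while overall accuracy is 7/8, which mirrors the trade-off described in the abstract: no human essay is misclassified, at the cost of a slightly lower overall accuracy.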
About the journal:
International Journal of Information and Learning Technology (IJILT) provides a forum for sharing the latest theories, applications, and services related to planning, developing, managing, using, and evaluating information technologies in administrative, academic, and library computing, as well as other educational technologies. Submissions can include research:
- Illustrating and critiquing educational technologies
- New uses of technology in education
- Issue- or results-focused case studies detailing examples of technology applications in higher education
- In-depth analyses of the latest theories, applications and services in the field
The journal provides wide-ranging and independent coverage of the management, use and integration of information resources and learning technologies.