Detecting AI-generated essays: the ChatGPT challenge

International Journal of Information and Learning Technology (IF 2.6, JCR Q3, Computer Science, Interdisciplinary Applications). Pub Date: 2023-05-01. DOI: 10.1108/ijilt-03-2023-0043

Ilker Cingillioglu


Citations: 3

Abstract

Purpose: With the advent of ChatGPT, a sophisticated generative artificial intelligence (AI) tool, maintaining academic integrity in all educational settings has recently become a challenge for educators. This paper discusses a method and necessary strategies to confront this challenge.

Design/methodology/approach: In this study, a language model was defined to achieve high accuracy in distinguishing ChatGPT-generated essays from human-written essays, with a particular focus on "not falsely" classifying genuinely human-written essays as AI-generated (Negative).

Findings: Using a support vector machine (SVM) algorithm, 100% accuracy was recorded for identifying human-generated essays. The author discussed the key use of Recall and F2 score for measuring classification performance and the importance of eliminating False Negatives, that is, making sure that no actual human-generated essays are incorrectly classified as AI-generated. The results of the proposed model's classification algorithms were compared to those of AI-generated text detection software developed by OpenAI, GPTZero and Copyleaks.

Practical implications: AI-generated essays submitted by students can be detected by teachers and educational designers with high accuracy using the proposed language model and machine learning (ML) classifier. Human (student)-generated essays can and must be correctly identified with 100% accuracy, even if the overall classification accuracy is slightly reduced.

Originality/value: This is the first and only study that used an n-gram bag-of-words (BOWs) discrepancy language model as input for a classifier to make such a prediction and that empirically compared the results with those of other AI-generated text detection software.
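The emphasis on Recall and the F2 score can be illustrated with a short sketch. It assumes, following the abstract's wording, that "human-written" is the positive class, so a False Negative is a human essay flagged as AI-generated. The function names and the confusion-matrix counts below are hypothetical and are not taken from the paper; they only show why F2, which weights recall more heavily than precision, may prefer a detector that never misclassifies a human essay.

```python
def recall(tp: int, fn: int) -> float:
    """Recall = TP / (TP + FN); returns 0.0 when there are no positive cases."""
    total = tp + fn
    return tp / total if total else 0.0


def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """F-beta from confusion counts.

    F_beta = (1 + beta^2) * TP / ((1 + beta^2) * TP + beta^2 * FN + FP)
    beta = 2 gives the F2 score, which penalises False Negatives
    more heavily than False Positives.
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts (positive class = human-written essay).
# Detector A: no human essay is flagged as AI, but 5 AI essays pass as human.
a = {"tp": 50, "fp": 5, "fn": 0}
# Detector B: no AI essay passes as human, but 2 human essays are flagged as AI.
b = {"tp": 48, "fp": 0, "fn": 2}

for name, c in (("A", a), ("B", b)):
    print(
        name,
        "recall=%.3f" % recall(c["tp"], c["fn"]),
        "F1=%.3f" % f_beta(c["tp"], c["fp"], c["fn"], beta=1.0),
        "F2=%.3f" % f_beta(c["tp"], c["fp"], c["fn"], beta=2.0),
    )
```

With these illustrative counts, F1 ranks detector B higher while F2 ranks detector A higher, which matches the stated priority of never labelling a genuine student essay as AI-generated even at some cost to overall accuracy.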


Source journal

International Journal of Information and Learning Technology (Computer Science, Interdisciplinary Applications)

CiteScore

6.10

Self-citation rate

3.30%


Journal description: International Journal of Information and Learning Technology (IJILT) provides a forum for the sharing of the latest theories, applications, and services related to planning, developing, managing, using, and evaluating information technologies in administrative, academic, and library computing, as well as other educational technologies. Submissions can include research:
- Illustrating and critiquing educational technologies
- New uses of technology in education
- Issue- or results-focused case studies detailing examples of technology applications in higher education
- In-depth analyses of the latest theories, applications and services in the field

The journal provides wide-ranging and independent coverage of the management, use and integration of information resources and learning technologies.