Identifying Generative Artificial Intelligence Chatbot Use on Multiple-Choice, General Chemistry Exams Using Rasch Analysis

IF 2.5 | Zone 3 (Education) | Q2 (Chemistry, Multidisciplinary) | Journal of Chemical Education | Pub Date: 2024-07-09 | DOI: 10.1021/acs.jchemed.4c00165
Benjamin Sorenson, Kenneth Hanson
Citations: 0

Abstract

Generative artificial intelligence (AI) technology is expected to have a profound impact on chemical education. While there are certainly positive uses, some of which are already being implemented, there is reasonable concern about its use in cheating. Efforts are underway to detect generative AI usage on open-ended questions, lab reports, and essays, but its detection on multiple-choice exams is largely unexplored. Here we propose the use of Rasch analysis to identify the unique behavioral pattern of ChatGPT on General Chemistry II multiple-choice exams. While raw statistics (e.g., average, ability, outfit) were insufficient to readily identify ChatGPT instances, a strategy of fixing the ability scale on high-success questions and then refitting the outcomes dramatically enhanced its outlier behavior in terms of the Z-standardized outfit statistic and ability displacement. With the detection threshold set to give a true positive rate (TPR) of 1.0, a false positive rate (FPR) of <0.1 was obtained across a majority of the 20 exams investigated here. Furthermore, the receiver operating characteristic curve (i.e., FPR vs TPR) exhibited outstanding areas under the curve of >0.9 for nearly all exams. While limitations of this method are described and the analysis is by no means exhaustive, these outcomes suggest that the unique behavior patterns of generative AI chatbots can be identified using Rasch modeling and fit statistics.
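
To make the quantities named in the abstract concrete, the following is a minimal Python sketch, not the authors' pipeline. It assumes person abilities and item difficulties (in logits) are already known rather than estimated, it computes the plain mean-square outfit rather than the Z-standardized form used in the paper, and it omits the anchoring-and-refitting step entirely. All function names and the example data are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct answer under the dichotomous Rasch model.

    Both arguments are in logits; they are assumed to be given here,
    whereas a real analysis would estimate them from the response data.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def outfit_mean_square(responses, ability, difficulties):
    """Unweighted mean of squared standardized residuals for one examinee.

    Values near 1 indicate responses consistent with the model; large
    values indicate unexpected answers (e.g., missing easy items while
    answering hard ones correctly).
    """
    total = 0.0
    for x, b in zip(responses, difficulties):
        p = rasch_probability(ability, b)
        total += (x - p) ** 2 / (p * (1.0 - p))
    return total / len(responses)


def fpr_at_full_tpr(scores, is_ai):
    """False positive rate when every known AI instance is flagged.

    The threshold is set at the lowest score among the known AI
    instances, so TPR = 1.0 by construction; the FPR is the fraction
    of the remaining examinees whose score also reaches that threshold.
    """
    threshold = min(s for s, flag in zip(scores, is_ai) if flag)
    negatives = [s for s, flag in zip(scores, is_ai) if not flag]
    false_positives = sum(1 for s in negatives if s >= threshold)
    return false_positives / len(negatives)


# Hypothetical five-item exam, ordered from easiest to hardest.
difficulties = [-2.0, -1.0, 0.0, 1.0, 2.0]
expected_pattern = [1, 1, 1, 0, 0]   # misses only the hard items
inverted_pattern = [0, 0, 1, 1, 1]   # misses the easy items instead

print(outfit_mean_square(expected_pattern, 0.0, difficulties))
print(outfit_mean_square(inverted_pattern, 0.0, difficulties))
```

In this toy case the inverted pattern yields a much larger outfit value than the expected pattern, which is the kind of outlier signal the abstract describes; whether that signal separates ChatGPT from students on real exams depends on the refitting strategy reported in the paper.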

Source journal
Journal of Chemical Education (Chemistry: Multidisciplinary Chemistry)
CiteScore: 5.60
Self-citation rate: 50.00%
Articles published: 465
Review time: 6.5 months
Journal description: The Journal of Chemical Education is the official journal of the Division of Chemical Education of the American Chemical Society, co-published with the American Chemical Society Publications Division. Launched in 1924, the Journal of Chemical Education is the world’s premier chemical education journal. The Journal publishes peer-reviewed articles and related information as a resource to those in the field of chemical education and to those institutions that serve them. JCE typically addresses chemical content, activities, laboratory experiments, instructional methods, and pedagogies. The Journal serves as a means of communication among people across the world who are interested in the teaching and learning of chemistry. This includes instructors of chemistry from middle school through graduate school, professional staff who support these teaching activities, as well as some scientists in commerce, industry, and government.