{"title":"Using Artificial Intelligence (AI) As An External Examiner","authors":"Tayyaba Azhar, Kinza Aslam, Zakia Saleem, Ahsan Sethi, Tahseen Fatima","doi":"10.51273/esc23.251319323","DOIUrl":null,"url":null,"abstract":"Objective: To access the validity of ChatGPT on AI assisted tool for evaluating essay questions. Material and Methods: This was a cross-sectional quantitative study conducted at University College of Medicine and Dentistry from June till August 2023. Eighteen questions were selected from fifteen exit tests of Certificate in HPE course. Each of the answers were independently graded by two assessors with doctorate in HPE. The same answers were then reevaluated using ChatGPT. The inter-rater reliability was determined using Kappa test. Results: The agreement between ChatGPT and examiner scores varied on various items. Weak agreement was observed for questions 8 and 9, moderate agreement for questions 2, 3, and 5, and strong kappa agreement for questions 1, 4, 6, and 7. Conclusion: Artificial intelligence assisted tools such as ChatGPT is a reality but its use in assessing essay questions would require massive training data from expert assessors. Once appropriately trained, it may replicate assessment decisions across the full range of subject.","PeriodicalId":11923,"journal":{"name":"Esculapio","volume":"15 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Esculapio","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.51273/esc23.251319323","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Objective: To assess the validity of ChatGPT as an AI-assisted tool for evaluating essay questions. Material and Methods: This was a cross-sectional quantitative study conducted at University College of Medicine and Dentistry from June to August 2023. Eighteen questions were selected from fifteen exit tests of the Certificate in HPE course. Each answer was independently graded by two assessors holding doctorates in HPE. The same answers were then re-evaluated using ChatGPT. Inter-rater reliability was determined using the kappa test. Results: The agreement between ChatGPT and examiner scores varied across items. Weak agreement was observed for questions 8 and 9, moderate agreement for questions 2, 3, and 5, and strong kappa agreement for questions 1, 4, 6, and 7. Conclusion: Artificial intelligence-assisted tools such as ChatGPT are a reality, but their use in assessing essay questions would require massive training data from expert assessors. Once appropriately trained, such tools may replicate assessment decisions across the full range of subjects.
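The abstract reports inter-rater agreement via the kappa statistic but does not specify the variant or scoring scale used. The snippet below is a minimal illustrative sketch, assuming Cohen's kappa computed on ordinal score bands; the score values and band range are hypothetical placeholders, not data from the study.

```python
# Illustrative sketch (not from the paper): Cohen's kappa between a human
# examiner's scores and ChatGPT-assigned scores for one essay question.
from sklearn.metrics import cohen_kappa_score

# Hypothetical grades (e.g., bands 0-4) assigned to the same set of answers
examiner_scores = [3, 4, 2, 4, 3, 1, 4, 2, 3, 4]
chatgpt_scores  = [3, 4, 2, 3, 3, 1, 4, 2, 2, 4]

# Unweighted kappa; a weighted variant (weights="linear" or "quadratic")
# is often preferred when the grades are ordinal.
kappa = cohen_kappa_score(examiner_scores, chatgpt_scores)
print(f"Cohen's kappa: {kappa:.2f}")
```

In such an analysis, the resulting kappa for each question would then be interpreted against conventional agreement thresholds (e.g., weak, moderate, strong), which is how the per-question results above are reported.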