Tianqi Tang , Jingrong Sha , Yanan Zhao , Saidi Wang , Zibin Wang , Sha Shen
Unveiling the efficacy of ChatGPT in evaluating critical thinking skills through peer feedback analysis: Leveraging existing classification criteria
Journal: Thinking Skills and Creativity, Volume 53, Article 101607 (JCR Q1, Social Sciences; impact factor 3.7)
DOI: 10.1016/j.tsc.2024.101607
Publication date: 2024-08-06
URL: https://www.sciencedirect.com/science/article/pii/S1871187124001457
Citations: 0
Abstract
This study investigates the potential of using ChatGPT, a large language model, to assess students' critical thinking in online peer feedback. With the rapid development of technology, large language models such as ChatGPT have made significant progress in natural language processing in recent years and show strong potential for application in teaching evaluation and feedback. However, can generative AI help educational practitioners in teaching and learning? Accurately assessing students' critical thinking with generative AI remains a challenging task. This study investigates whether ChatGPT can effectively evaluate critical thinking using established coding systems. By comparing the consistency and accuracy of manual coding with ChatGPT coding on online peer feedback texts, it clarifies how ChatGPT processes peer feedback data and conducts assessments. Through a comprehensive analysis employing metrics including precision, recall, F1 score, and a confusion matrix, we assess ChatGPT's performance. Additionally, we group students and analyze how ChatGPT's assessments relate to their critical thinking levels. Our findings suggest that ChatGPT demonstrated some ability to assess the higher-level dimensions of critical thinking, but showed limitations in assessing the more granular secondary dimensions beneath them. However, this kind of granular assessment would more accurately capture learners' critical thinking levels. Surprisingly, ChatGPT's evaluations are not influenced by students' critical thinking levels. This study underscores ChatGPT's potential in automating critical thinking assessment at scale, alleviating the burden on educators and enhancing understanding of critical thinking in peer feedback.
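The abstract compares ChatGPT's coding of feedback texts against manual coding using precision, recall, and an F1 score. As a minimal illustrative sketch (not the authors' actual analysis pipeline), the following Python function shows how these agreement metrics are computed for one coding category, treating the manual codes as ground truth; the function name, the example labels, and the `positive` parameter are hypothetical:

```python
def agreement_metrics(manual, model, positive):
    """Precision, recall, and F1 for one coding category,
    with manual codes taken as the ground truth."""
    # True positives: both coders assigned the category.
    tp = sum(1 for g, p in zip(manual, model) if g == positive and p == positive)
    # False positives: the model assigned it, the human did not.
    fp = sum(1 for g, p in zip(manual, model) if g != positive and p == positive)
    # False negatives: the human assigned it, the model did not.
    fn = sum(1 for g, p in zip(manual, model) if g == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical example: four feedback segments coded by a human and a model.
human = ["claim", "claim", "other", "claim"]
ai    = ["claim", "other", "other", "claim"]
p, r, f = agreement_metrics(human, ai, positive="claim")
```

In this toy example the model never assigns "claim" incorrectly (precision 1.0) but misses one of the three human-coded claims (recall 2/3), giving an F1 of 0.8; a confusion matrix generalizes this tally across all categories at once.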
About the journal:
Thinking Skills and Creativity is a journal providing a peer-reviewed forum for communication and debate among researchers interested in teaching for thinking and creativity. Papers may represent a variety of theoretical perspectives and methodological approaches and may relate to any age level in a diversity of settings: formal and informal, educational and work-based.