ChatGPT versus a customized AI chatbot (Anatbuddy) for anatomy education: A comparative pilot study

IF 5.2 | Zone 2 (Education) | JCR Q1: Education, Scientific Disciplines | Anatomical Sciences Education | Pub Date: 2024-08-21 | DOI: 10.1002/ase.2502

Gautham Arun, Vivek Perumal, Francis Paul John Bato Urias, Yan En Ler, Bryan Wen Tao Tan, Ranganath Vallabhajosyula, Emmanuel Tan, Olivia Ng, Kian Bee Ng, Sreenivasulu Reddy Mogali

Anatomical Sciences Education, vol. 17, no. 7, pp. 1396-1405. Publication type: Journal Article. Source: https://onlinelibrary.wiley.com/doi/10.1002/ase.2502

Citations: 0



ChatGPT versus a customized AI chatbot (Anatbuddy) for anatomy education: A comparative pilot study

Large Language Models (LLMs) have the potential to improve education by personalizing learning. However, ChatGPT-generated content has been criticized for sometimes producing false, biased, and/or hallucinatory information. To evaluate AI's ability to return clear and accurate anatomy information, this study generated a custom interactive and intelligent chatbot (Anatbuddy) through an OpenAI Application Programming Interface (API) that enables seamless AI-driven interactions within a secured cloud infrastructure. Anatbuddy was programmed through a Retrieval-Augmented Generation (RAG) method to provide context-aware responses to user queries based on a predetermined knowledge base. To compare their outputs, various queries (i.e., prompts) on thoracic anatomy (n = 18) were fed into Anatbuddy and ChatGPT 3.5. A panel comprising three experienced anatomists evaluated both tools' responses for factual accuracy, relevance, completeness, coherence, and fluency on a 5-point Likert scale. These ratings were reviewed by a third party blinded to the study, who revised and finalized scores as needed. Anatbuddy's factual accuracy (mean ± SD = 4.78/5.00 ± 0.43; median = 5.00) was rated significantly higher (U = 84, p = 0.01) than ChatGPT's accuracy (4.11 ± 0.83; median = 4.00). No statistically significant differences were detected between the chatbots for the other variables. Given ChatGPT's current content knowledge limitations, we strongly recommend the anatomy profession develop a custom AI chatbot for anatomy education utilizing a carefully curated knowledge base to ensure accuracy. Further research is needed to determine students' acceptance of custom chatbots for anatomy education and their influence on learning experiences and outcomes.
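The abstract describes Anatbuddy as using Retrieval Augmented Generation to ground its answers in a predetermined knowledge base. The sketch below illustrates only the general idea, not the authors' implementation: it ranks passages by bag-of-words cosine similarity (a simple stand-in for the embedding search a production system would more likely use) and prepends the best match to the question. The knowledge-base sentences, function names, and prompt wording are all hypothetical, and the call to a language model is left out.

```python
import math
import re
from collections import Counter

# Hypothetical three-entry knowledge base; a real one would be curated by anatomists.
KNOWLEDGE_BASE = [
    "The right lung has three lobes and the left lung has two lobes.",
    "The heart lies in the middle mediastinum within the pericardium.",
    "The phrenic nerve arises from C3 to C5 and supplies the diaphragm.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    norm = norm_a * norm_b
    return dot / norm if norm else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query, passages, k=1):
    """Prepend retrieved context so the model answers from the knowledge base."""
    context = "\n".join(retrieve(query, passages, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Assumed example query; the study's 18 thoracic anatomy prompts are not listed in the abstract.
prompt = build_prompt("How many lobes does the right lung have?", KNOWLEDGE_BASE)
print(prompt)
```

The resulting prompt would then be sent to a language model. Restricting the model to retrieved context is what lets a curated knowledge base constrain the factual content of the answer.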


Source journal

Anatomical Sciences Education (Anatomy/education)

CiteScore: 10.30

Self-citation rate: 39.70%

Journal description: Anatomical Sciences Education, affiliated with the American Association for Anatomy, serves as an international platform for sharing ideas, innovations, and research related to education in anatomical sciences. Covering gross anatomy, embryology, histology, and neurosciences, the journal addresses education at various levels, including undergraduate, graduate, post-graduate, allied health, medical (both allopathic and osteopathic), and dental. It fosters collaboration and discussion in the field of anatomical sciences education.