ChatGPT efficacy for answering musculoskeletal anatomy questions: a study evaluating quality and consistency between raters and timepoints

IF 1.2 | Zone 4 (Medicine) | Q3 | ANATOMY & MORPHOLOGY | Surgical and Radiologic Anatomy | Pub Date: 2024-09-12 | DOI: 10.1007/s00276-024-03477-9
Nikolaos Mantzou, Vasileios Ediaroglou, Elena Drakonaki, Spyros A. Syggelos, Filippos F. Karageorgos, Trifon Totlis
Citations: 0

Abstract

Purpose

There is increasing interest in the use of digital platforms such as ChatGPT for anatomy education. This study aims to evaluate the efficacy of ChatGPT in providing accurate and consistent responses to questions focusing on musculoskeletal anatomy across various time points (hours and days).

Methods

Six anatomy-related questions were posed to ChatGPT 3.5 at 4 different timepoints. All answers were blindly rated for quality by 3 expert raters on a 5-point Likert scale. A difference of 0 or 1 points in Likert scores between raters was considered agreement, and a difference of 0 or 1 points between timepoints was considered consistent, indicating good reproducibility.
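
The 0-or-1-point criterion can be illustrated with a minimal Python sketch. The function name, the pairwise interpretation (every pair of scores must differ by at most 1 point), and the example scores are assumptions made for illustration; the abstract does not specify how scores were compared across more than two raters or timepoints, and the numbers below are not data from the study.

```python
from itertools import combinations


def within_one_point(scores, tolerance=1):
    """Return True if every pair of Likert scores differs by at most `tolerance`.

    Assumption: the criterion is applied pairwise across all scores in the group.
    """
    return all(abs(a - b) <= tolerance for a, b in combinations(scores, 2))


# Hypothetical scores, for illustration only (not taken from the study).
rater_scores = [4, 5, 4]          # three raters scoring one answer
timepoint_scores = [5, 3, 4, 4]   # one question scored at four timepoints

print(within_one_point(rater_scores))      # raters differ by at most 1 point
print(within_one_point(timepoint_scores))  # scores 5 and 3 differ by 2 points
```

Under this reading, the first group would count as agreement between raters and the second as inconsistent across timepoints.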

Results

The quality of the answers varied significantly, ranging from extremely good to very poor. Consistency between timepoints also varied. Answers were rated as good quality (≥ 3 on the Likert scale) in 50% of cases (3/6) and as consistent in 66.6% (4/6) of cases. The low-quality answers contained significant mistakes, conflicting data, or missing information.

Conclusion

As of the time of this article, the quality and consistency of ChatGPT v3.5 answers are variable, which limits its utility as an independent and reliable resource for learning musculoskeletal anatomy. Validating information by reviewing the anatomical literature is highly recommended.

Source journal
Surgical and Radiologic Anatomy
Categories: ANATOMY & MORPHOLOGY; RADIOLOGY, NUCLEAR MEDICINE & MEDICAL IMAGING
CiteScore: 2.70
Self-citation rate: 14.30%
Articles published: 183
Review time: 4-8 weeks
About the journal: Anatomy is a morphological science which cannot fail to interest the clinician. The practical application of anatomical research to clinical problems necessitates special adaptation and selectivity in choosing from numerous international works. Although there is a tendency to believe that meaningful advances in anatomy are unlikely, constant revision is necessary. Surgical and Radiologic Anatomy, the first international journal of clinical anatomy, was created in this spirit. Its goal is to serve clinicians, regardless of speciality (physicians, surgeons, radiologists or other specialists), as an indispensable aid with which they can improve their knowledge of anatomy. Each issue includes original papers, review articles, articles on the anatomical bases of medical, surgical and radiological techniques, articles on normal radiologic anatomy, and brief reviews of anatomical publications of clinical interest. Particular attention is given to high-quality illustrations, which are indispensable for a better understanding of anatomical problems. Surgical and Radiologic Anatomy is a journal written by anatomists for clinicians with a special interest in anatomy.
Latest articles from this journal
V3 segment of the right vertebral artery taking an anomalous posterosuperior course and penetrating occipital bone (wall of the jugular foramen) diagnosed by magnetic resonance angiography
Anatomical investigation of the morphometry of the cerebral arteries using digital subtraction angiography in the Thai population
ChatGPT efficacy for answering musculoskeletal anatomy questions: a study evaluating quality and consistency between raters and timepoints
Morphology and arterial supply of the pyramidalis muscle in an Australian female population using computed tomography angiography
Regional variations and sex-related differences of stiffness in human tracheal ligaments