Can ChatGPT-4 Diagnose and Treat Like an Orthopaedic Surgeon? Testing Clinical Decision Making and Diagnostic Ability in Soft-Tissue Pathologies of the Foot and Ankle.

IF 2.6; Zone 2 (Medicine); JCR Q1 (Orthopedics); Journal of the American Academy of Orthopaedic Surgeons; Pub Date: 2024-10-15; DOI: 10.5435/JAAOS-D-24-00595
Hayden Hartman, Maritza Diane Essis, Wei Shao Tung, Irvin Oh, Sean Peden, Arianna L Gianakos
Citation count: 0

Abstract

Introduction: ChatGPT-4, a chatbot able to carry on human-like conversation, has attracted attention after demonstrating the aptitude to pass professional licensure examinations. The purpose of this study was to explore the diagnostic and decision-making capacities of ChatGPT-4 in clinical management, specifically assessing its accuracy in identifying and treating soft-tissue foot and ankle pathologies.

Methods: Eight soft-tissue-related foot and ankle cases were presented to ChatGPT-4, and each response was assessed by three fellowship-trained foot and ankle orthopaedic surgeons. The evaluation system comprised five criteria rated on a Likert scale, giving a sum score from 5 (lowest) to 25 (highest possible).
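The scoring scheme described in the Methods can be sketched in a few lines. This is an illustration only, not the authors' analysis code: it assumes each criterion is rated as an integer from 1 to 5 (which would produce the stated 5 to 25 range) and that a case score is the mean of the three surgeons' sum scores. The function names and the ratings in `example_case` are hypothetical and are not study data.

```python
from statistics import mean

# The five criteria named in the abstract.
CRITERIA = (
    "diagnose",
    "treat",
    "provide alternative treatments",
    "provide comprehensive information",
    "provide accurate information",
)


def sum_score(ratings):
    """Sum one surgeon's five ratings (assumed 1-5 each) into a 5-25 score."""
    if len(ratings) != len(CRITERIA):
        raise ValueError("expected one rating per criterion")
    if any(r not in range(1, 6) for r in ratings):
        raise ValueError("each rating must be an integer from 1 to 5")
    return sum(ratings)


def case_score(ratings_by_surgeon):
    """Average the sum scores from all surgeons who graded one case (assumed)."""
    return mean(sum_score(r) for r in ratings_by_surgeon)


# Hypothetical ratings for one case from three graders (NOT study data).
example_case = [
    [5, 5, 4, 4, 5],
    [5, 4, 4, 4, 5],
    [5, 5, 3, 4, 5],
]
print(round(case_score(example_case), 1))
```

Under these assumptions, a case graded 5 on every criterion by every surgeon would score 25, and a case graded 1 throughout would score 5.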

Results: The average sum score across all cases was 22.0. The Morton neuroma case received the highest score (24.7), and the peroneal tendon tear case received the lowest score (16.3). Subgroup analyses of each of the five criteria showed no notable differences in surgeon grading. Criteria 3 (provide alternative treatments) and 4 (provide comprehensive information) were graded markedly lower than criteria 1 (diagnose), 2 (treat), and 5 (provide accurate information) (for both criteria 3 and 4: P = 0.007; P = 0.032; P < 0.0001). Criterion 5 was graded markedly higher than criteria 2, 3, and 4 (P = 0.02; P < 0.0001; P < 0.0001).
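As a rough arithmetic check, the reported per-case scores are consistent with a mean of three integer sum scores. The sketch below assumes (the abstract does not state this) that each case score is the average of three surgeons' 5 to 25 totals; `candidate_totals` is a hypothetical helper that lists which combined totals would round to a reported value.

```python
def candidate_totals(reported_mean, raters=3, low=5, high=25, tol=0.05):
    """List combined integer totals whose mean over `raters` rounds to reported_mean."""
    return [
        total
        for total in range(raters * low, raters * high + 1)
        if abs(total / raters - reported_mean) < tol
    ]


# Values reported in the abstract; the averaging method is an assumption.
for label, reported in [("Morton neuroma", 24.7), ("peroneal tendon tear", 16.3)]:
    print(label, reported, candidate_totals(reported))
```

Under that assumption, 24.7 corresponds to a combined total of 74 out of a possible 75, and 16.3 to a combined total of 49.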

Conclusion: This study demonstrates that ChatGPT-4 effectively diagnosed most of the soft-tissue foot and ankle cases presented and provided reliable treatment options for them, with consistent grading among surgeon evaluators. Individual criterion assessment revealed that ChatGPT-4 was most effective in diagnosing and suggesting appropriate treatment, but limitations were seen in its ability to provide comprehensive information and alternative treatment options. In addition, the chatbot did not suggest fabricated treatment options, a common concern in prior literature. This resource could be useful for clinicians seeking reliable patient education materials without the fear of inconsistencies, although comprehensive information beyond treatment may be limited.

Source journal
CiteScore: 6.10
Self-citation rate: 6.20%
Articles published: 529
Review time: 4-8 weeks
About the journal: The Journal of the American Academy of Orthopaedic Surgeons was established in the fall of 1993 by the Academy in response to its membership’s demand for a clinical review journal. Two issues were published the first year, followed by six issues yearly from 1994 through 2004. In September 2005, JAAOS began publishing monthly issues. Each issue includes richly illustrated peer-reviewed articles focused on clinical diagnosis and management. Special features in each issue provide commentary on developments in pharmacotherapeutics, materials and techniques, and computer applications.