Comparative diagnostic accuracy of ChatGPT-4 and machine learning in differentiating spinal tuberculosis and spinal tumors.

IF 4.9 · Zone 1 (Medicine) · JCR Q1 (Clinical Neurology) · Spine Journal · Pub Date: 2025-01-11 · DOI: 10.1016/j.spinee.2024.12.035
Xiaojiang Hu, Dongcheng Xu, Hongqi Zhang, Mingxing Tang, Qile Gao
Citations: 0

Abstract

Background: In clinical practice, distinguishing between spinal tuberculosis (STB) and spinal tumors (ST) poses a significant diagnostic challenge. The application of AI-driven large language models (LLMs) shows great potential for improving the accuracy of this differential diagnosis.

Purpose: To evaluate the performance of various machine learning models and ChatGPT-4 in distinguishing between STB and ST.

Study design: A retrospective cohort study.

Patient sample: A total of 143 STB cases and 153 ST cases admitted to Xiangya Hospital, Central South University, from January 2016 to June 2023 were collected.

Outcome measures: This study incorporates basic patient information, standard laboratory results, serum tumor markers, and comprehensive imaging records, including Magnetic Resonance Imaging (MRI) and Computed Tomography (CT), for individuals diagnosed with STB and ST. Machine learning techniques and ChatGPT-4 were each applied separately to distinguish between STB and ST.

Method: Six distinct machine learning models, along with ChatGPT-4, were employed to evaluate their differential diagnostic effectiveness.

Result: Among the 6 machine learning models, the Gradient Boosting Machine (GBM) model demonstrated the highest differential diagnostic efficiency. In the training cohort, the GBM model achieved a sensitivity of 98.84% and a specificity of 100.00% in distinguishing STB from ST. In the testing cohort, its sensitivity was 98.25% and its specificity was 91.80%. ChatGPT-4 exhibited a sensitivity of 70.37% and a specificity of 90.65% for differential diagnosis. In single-question cases, ChatGPT-4's sensitivity and specificity were 71.67% and 92.55%, respectively, while in re-questioning cases, they were 44.44% and 76.92%.
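
For readers less familiar with the two metrics reported above, the following is a minimal sketch of how sensitivity and specificity are computed from binary predictions. It is not the authors' code. The function name, the toy labels, and the choice of STB as the positive class are all assumptions made for illustration, and the numbers are not data from the study.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels where 1 is the positive class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")

    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were detected
    return sensitivity, specificity


# Hypothetical toy labels, NOT data from the study.
# Assumption: 1 = STB (treated as the positive class), 0 = ST.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2%}, specificity={spec:.2%}")
```

With these toy labels, three of the four positive cases and four of the five negative cases are classified correctly. The same calculation applies to any of the classifiers compared here once their predictions are reduced to binary labels.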

Conclusion: The GBM model demonstrates significant value in the differential diagnosis of STB and ST, whereas the diagnostic performance of ChatGPT-4 remains suboptimal.

Source journal: Spine Journal (Medicine: Clinical Neurology)
CiteScore: 8.20
Self-citation rate: 6.70%
Articles published: 680
Review time: 13.1 weeks
Journal description: The Spine Journal, the official journal of the North American Spine Society, is an international and multidisciplinary journal that publishes original, peer-reviewed articles on research and treatment related to the spine and spine care, including basic science and clinical investigations. It is a condition of publication that manuscripts submitted to The Spine Journal have not been published, and will not be simultaneously submitted or published elsewhere. The Spine Journal also publishes major reviews of specific topics by acknowledged authorities, technical notes, teaching editorials, and other special features. Letters to the Editor-in-Chief are encouraged.