Generative pre-trained transformer (GPT)-4 support for differential diagnosis in neuroradiology.

IF 2.9 · Zone 2 (Medicine) · Q2 Radiology, Nuclear Medicine & Medical Imaging · Quantitative Imaging in Medicine and Surgery · Pub Date: 2024-10-01 · Epub Date: 2024-09-23 · DOI: 10.21037/qims-24-200
Vera Sorin, Eyal Klang, Tamer Sobeh, Eli Konen, Shai Shrot, Adva Livne, Yulian Weissbuch, Chen Hoffmann, Yiftach Barash
Quantitative Imaging in Medicine and Surgery, vol. 14, no. 10, pp. 7551-7560. Journal Article. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11485343/pdf/
Citations: 0

Abstract

Generative pre-trained transformer (GPT)-4 support for differential diagnosis in neuroradiology.

Background: Differential diagnosis in radiology relies on the accurate identification of imaging patterns. The use of large language models (LLMs) in radiology holds promise, with many potential applications that may enhance the efficiency of radiologists' workflow. The study aimed to evaluate the efficacy of generative pre-trained transformer (GPT)-4, an LLM, in providing differential diagnoses in neuroradiology, comparing its performance with that of board-certified neuroradiologists.

Methods: Sixty neuroradiology reports with varied diagnoses were entered into GPT-4, which was tasked with generating a top-3 differential diagnosis for each case. The results were compared to the true diagnoses and to the differential diagnoses provided by three blinded neuroradiologists. Diagnostic accuracy and agreement between readers were assessed.

Results: For the 60 patients (mean age 47.8 years, 65% female), GPT-4 included the correct diagnosis in its differential in 61.7% (37/60) of cases, while the neuroradiologists' accuracy ranged from 63.3% (38/60) to 73.3% (44/60). Agreement between GPT-4 and the neuroradiologists, and among the neuroradiologists, was fair to moderate [Cohen's kappa (kw) 0.34-0.44 and 0.39-0.54, respectively].

Conclusions: GPT-4 shows potential as a support tool for differential diagnosis in neuroradiology, though it was outperformed by human experts. Radiologists should remain mindful of the limitations of LLMs, while harnessing their potential to enhance educational and clinical work.
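
The two metrics reported in the Results can be illustrated with a short sketch. It reproduces the accuracy percentages from the counts given in the abstract and shows how Cohen's kappa is computed for two readers. The function names and the per-case labels below are hypothetical illustrations, not study data, and the plain unweighted form of kappa is used here as an assumption, since the abstract does not describe how the reported kw values were calculated.

```python
from collections import Counter


def top3_accuracy(hits: int, total: int) -> float:
    """Percentage of cases whose true diagnosis appears in the top-3 differential."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * hits / total, 1)


def cohen_kappa(rater_a, rater_b) -> float:
    """Unweighted Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases where both raters give the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(count_a) | set(count_b)
    expected = sum(count_a[label] * count_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters always use the same single label
    return (observed - expected) / (1 - expected)


# Counts taken from the abstract: GPT-4 and the lowest/highest neuroradiologist.
print(top3_accuracy(37, 60))  # 61.7
print(top3_accuracy(38, 60))  # 63.3
print(top3_accuracy(44, 60))  # 73.3

# Hypothetical per-case outcomes (1 = correct diagnosis in differential, 0 = not).
# These ten labels are made up for illustration only.
reader_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
reader_2 = [1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
print(round(cohen_kappa(reader_1, reader_2), 2))  # 0.4
```

On the toy labels, the readers agree on 7 of 10 cases while chance agreement is 0.5, giving a kappa of about 0.4, which falls in the range conventionally described as fair to moderate.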

Source journal
Quantitative Imaging in Medicine and Surgery (Medicine: Radiology, Nuclear Medicine and Imaging)
CiteScore: 4.20
Self-citation rate: 17.90%
Articles published: 252
Latest articles from this journal
Metastatic bone lesion type in gastric cancer patients: imaging findings of case reports.
Minimally invasive interventional guided imaging therapies of musculoskeletal tumors.
Myosteatosis: diagnostic significance and assessment by imaging approaches.
Narrative review of chest wall ultrasound: a practical approach.
Natural language processing-based analysis of the level of adoption by expert radiologists of the ASSR, ASNR and NASS version 2.0 of lumbar disc nomenclature: an eight-year survey.