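Faster and better than a physician? Assessing the diagnostic proficiency of ChatGPT in misdiagnosed individuals with neuromyelitis optica spectrum disorder.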

IF 3.6 | Zone 3, Medicine | JCR Q1, Clinical Neurology | Journal of the Neurological Sciences | Pub Date: 2025-01-15 | Epub Date: 2024-12-19 | DOI: 10.1016/j.jns.2024.123360
Kevin Shan, Mahi A Patel, Morgan McCreary, Tom G Punnen, Francisco Villalobos, Lauren M Tardo, Lindsay A Horton, Peter V Sguigna, Kyle M Blackburn, Shanan B Munoz, Katy W Burgess, Tatum M Moog, Alexander D Smith, Darin T Okuda
Journal of the Neurological Sciences, volume 468, article 123360. Publication type: Journal Article.
Citations: 0

Faster and better than a physician?: Assessing diagnostic proficiency of ChatGPT in misdiagnosed individuals with neuromyelitis optica spectrum disorder.

Background: Neuromyelitis optica spectrum disorder (NMOSD) is a commonly misdiagnosed condition. Driven by cost-consciousness and technological fluency, distinct generations may gravitate towards healthcare alternatives, including artificial intelligence (AI) models, such as ChatGPT (Generative Pre-trained Transformer). Our objective was to evaluate the speed and accuracy of ChatGPT-3.5 (GPT-3.5) in the diagnosis of people with NMOSD (PwNMOSD) initially misdiagnosed.

Methods: Misdiagnosed PwNMOSD were retrospectively identified, and their clinical symptoms and timelines of medically related events were processed through GPT-3.5. For each subject, seven digital derivatives representing different races, ethnicities, and sexes were created and processed identically to evaluate the impact of these variables on accuracy. Scoresheets were used to track diagnostic success and time to diagnosis. Diagnostic speed of GPT-3.5 was evaluated against physicians using a Cox proportional hazards model, clustered by subject. Logistic regression was used to estimate the diagnostic accuracy of GPT-3.5 compared with the estimated accuracy of physicians.

Results: Clinical timelines for 68 individuals (59 female; 42 Black/African American, 13 White, 11 Hispanic, 2 Asian; mean age at first symptoms 34.4 years, standard deviation 15.5 years) were analyzed and 476 digital simulations were created, yielding 544 conversations for analysis. The instantaneous probability of correct diagnosis was 70.65% less for physicians relative to GPT-3.5 within 240 days of symptom onset (p < 0.0001). The estimated probability of correct diagnosis for GPT-3.5 was 80.88% [95% CI = (76.35%, 99.81%)].

Conclusion: GPT-3.5 may be of value in recognizing NMOSD. However, the manner in which medical information is conveyed, combined with the potential for inaccuracies, may result in unnecessary psychological stress.
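
The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not the authors' analysis code. It only recomputes the conversation counts and shows one common reading of the reported effect sizes. The function names are hypothetical, and the step that treats "70.65% less" as a hazard ratio of 1 - 0.7065 is an assumption about how the Cox model result was summarized.

```python
import math

# Figures quoted in the abstract.
SUBJECTS = 68
DERIVATIVES_PER_SUBJECT = 7
HAZARD_REDUCTION_PHYSICIANS = 0.7065  # "70.65% less" for physicians vs GPT-3.5
GPT_ACCURACY = 0.8088                 # estimated probability of correct diagnosis


def conversation_counts(subjects: int, derivatives: int) -> tuple[int, int]:
    """Return (digital simulations, total conversations).

    Each subject contributes one original timeline plus `derivatives` variants.
    """
    simulations = subjects * derivatives
    return simulations, subjects + simulations


def hazard_ratio_from_reduction(reduction: float) -> float:
    """Assumed reading: 'X less' instantaneous probability means HR = 1 - X."""
    return 1.0 - reduction


def logit(p: float) -> float:
    """Log-odds scale on which a logistic regression is fitted."""
    return math.log(p / (1.0 - p))


def inv_logit(eta: float) -> float:
    """Map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


simulations, conversations = conversation_counts(SUBJECTS, DERIVATIVES_PER_SUBJECT)
hr_physician_vs_gpt = hazard_ratio_from_reduction(HAZARD_REDUCTION_PHYSICIANS)
hr_gpt_vs_physician = 1.0 / hr_physician_vs_gpt  # same effect, reference group swapped
log_odds_gpt = logit(GPT_ACCURACY)

print(simulations, conversations)
print(round(hr_physician_vs_gpt, 4), round(hr_gpt_vs_physician, 2))
print(round(log_odds_gpt, 3))
```

Under that assumed reading, the hazard of a correct diagnosis for GPT-3.5 is roughly 3.4 times that of physicians within the 240-day window. The 68 original timelines plus 7 derivatives each reproduce the 476 simulations and 544 conversations reported.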

Source journal
Journal of the Neurological Sciences (Medicine: Clinical Neurology)
CiteScore: 7.60
Self-citation rate: 2.30%
Articles published: 313
Review time: 22 days
About the journal: The Journal of the Neurological Sciences provides a medium for the prompt publication of original articles in neurology and neuroscience from around the world. JNS places special emphasis on articles that: 1) provide guidance to clinicians around the world (Best Practices, Global Neurology); 2) report cutting-edge science related to neurology (Basic and Translational Sciences); 3) educate readers about relevant and practical clinical outcomes in neurology (Outcomes Research); and 4) summarize or editorialize the current state of the literature (Reviews, Commentaries, and Editorials). JNS accepts most types of manuscripts for consideration, including original research papers, short communications, reviews, book reviews, letters to the Editor, opinions, and editorials. Topics considered will be from neurology-related fields that are of interest to practicing physicians around the world. Examples include neuromuscular diseases, demyelination, atrophies, dementia, neoplasms, infections, epilepsies, disturbances of consciousness, stroke and cerebral circulation, growth and development, plasticity, and intermediary metabolism.