Accuracy of Commercial Large Language Model (ChatGPT) to Predict the Diagnosis for Prehospital Patients Suitable for Ambulance Transport Decisions: Diagnostic Accuracy Study.

Prehospital Emergency Care · IF 2.1 · CAS Tier 3 (Medicine) · JCR Q2 (Emergency Medicine) · Pub Date: 2025-02-12 · DOI: 10.1080/10903127.2025.2460775
Eric D Miller, Jeffrey Michael Franc, Attila J Hertelendy, Fadi Issa, Alexander Hart, Christina A Woodward, Bradford Newbury, Kiera Newbury, Dana Mathew, Kimberly Whitten-Chung, Eric Bauer, Amalia Voskanyan, Gregory R Ciottone
Citations: 0

Abstract

Objectives: While ambulance transport decisions guided by artificial intelligence (AI) could be useful, little is known about the accuracy of AI in making patient diagnoses based on the prehospital patient care report (PCR). The primary objective of this study was to assess the accuracy of ChatGPT (OpenAI, Inc., San Francisco, CA, USA) in predicting a patient's diagnosis from the PCR, by comparison with a reference standard assigned by experienced paramedics. The secondary objective was to classify cases where the AI diagnosis did not agree with the reference standard as paramedic correct, ChatGPT correct, or equally correct.

Methods: This diagnostic accuracy study used a zero-shot learning model and greedy decoding. A convenience sample of PCRs written by paramedic students was analyzed by an untrained ChatGPT-4 model to determine the single most likely diagnosis. A reference standard was provided by an experienced paramedic, who reviewed each PCR and gave a three-item differential diagnosis. A trained prehospital professional assessed the ChatGPT diagnosis as concordant or non-concordant with one of the three paramedic diagnoses. If non-concordant, two board-certified emergency physicians independently decided whether the ChatGPT or the paramedic diagnosis was more likely to be correct.
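The setup described above (zero-shot prompting, greedy decoding) can be sketched as follows. This is an illustrative assumption, not the authors' published protocol: the prompt wording, helper names, and example PCR text are invented for illustration, and the abstract does not state the exact model identifier or prompt used. Greedy decoding corresponds to a sampling temperature of 0.

```python
def build_messages(pcr_text: str) -> list[dict]:
    """Zero-shot prompt: the task and the PCR only, with no worked examples.
    Prompt wording is hypothetical; the study's actual prompt is not published
    in the abstract."""
    return [
        {"role": "system",
         "content": "You assist with prehospital emergency medicine research."},
        {"role": "user",
         "content": ("Read the following prehospital patient care report and "
                     "state the single most likely diagnosis.\n\n" + pcr_text)},
    ]

def diagnose(pcr_text: str) -> str:
    """Query an off-the-shelf ('untrained') GPT-4 model with greedy decoding."""
    from openai import OpenAI  # requires the openai package and OPENAI_API_KEY
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4",                     # off-the-shelf model, no fine-tuning
        messages=build_messages(pcr_text),
        temperature=0,                     # greedy decoding: highest-probability token each step
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    # Hypothetical example PCR text, for illustration only.
    print(diagnose("58 y/o male, sudden crushing chest pain radiating to the left arm."))
```

Setting `temperature=0` makes the output deterministic for a given prompt, which is the usual way to approximate greedy decoding through the chat API.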

Results: ChatGPT-4 diagnosed 78/104 (75.0%) of PCRs correctly (95% confidence interval: 65.3-82.7%). Among the 26 cases of disagreement, the emergency physicians judged that in 6/26 (23.1%) the paramedic diagnosis was more likely to be correct. There was only one case out of 104 (0.96%) in which a transport decision based on the AI-guided diagnosis would have been potentially dangerous to the patient (under-triage).
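The reported proportion and interval can be sanity-checked with a standard binomial confidence interval. The abstract does not state which method the authors used; a Wilson score interval for 78/104, sketched below, gives roughly 65.9-82.3%, close to the reported 65.3-82.7% (an exact Clopper-Pearson interval would be slightly wider).

```python
from math import sqrt

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score confidence interval for a binomial proportion
    (z = 1.96 for a 95% interval)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return centre - half, centre + half

# 78 of 104 PCRs diagnosed correctly
lo, hi = wilson_interval(78, 104)
print(f"point estimate: {78/104:.1%}, 95% CI: {lo:.1%} to {hi:.1%}")
# point estimate: 75.0%, 95% CI: 65.9% to 82.3%
```

The Wilson interval is preferred over the simpler Wald interval for proportions near 0 or 1 and for modest sample sizes such as n = 104.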

Conclusions: In this study, the overall accuracy of ChatGPT in diagnosing patients based on their emergency medical services PCR was 75.0%. In cases where the ChatGPT diagnosis was considered less likely than the paramedic diagnosis, the AI diagnosis was most commonly more critical than the paramedic diagnosis, potentially leading to over-triage. The under-triage rate was <1%.

Source journal: Prehospital Emergency Care (Medicine: Public, Environmental & Occupational Health)
CiteScore: 4.30
Self-citation rate: 12.50%
Annual publications: 137
Review time: 1 month
Journal description: Prehospital Emergency Care publishes peer-reviewed information relevant to the practice, educational advancement, and investigation of prehospital emergency care, including the following types of articles: Special Contributions, Original Articles, Education and Practice, Preliminary Reports, Case Conferences, Position Papers, Collective Reviews, Editorials, Letters to the Editor, and Media Reviews.
Latest articles in this journal:
- Social Determinants of Health and Emergency Medical Services: A Scoping Review.
- Chest Compressions Synchronized to Native Cardiac Contractions are More Effective than Unsynchronized Compressions for Improving Coronary Perfusion Pressure in a Novel Pseudo-PEA Swine Model.
- Paramedic-administered fibrinolysis in older patients with prehospital ST-segment elevation myocardial infarction.
- Prehospital Whole Blood Administration Not Associated with Increased Transfusion Reactions: The Experience of a Metropolitan EMS Agency.
- Non-Invasive Ventilation as a Pre-Oxygenation Strategy During In-Flight Rapid Sequence Intubation: A Case Report.