Integrating a host transcriptomic biomarker with a large language model for diagnosis of lower respiratory tract infection

Hoang Van Phan, Natasha Spottiswoode, Emily C. Lydon, Victoria T. Chu, Adolfo Cuesta, Alexander D. Kazberouk, Natalie L. Richmond, Carolyn S. Calfee, Charles R. Langelier
medRxiv - Infectious Diseases. Published 2024-08-29. DOI: 10.1101/2024.08.28.24312732 (https://doi.org/10.1101/2024.08.28.24312732)

Abstract

Lower respiratory tract infections (LRTIs) are a leading cause of mortality worldwide. Despite this, diagnosing LRTI remains challenging, particularly in the intensive care unit, where non-infectious respiratory conditions can present with similar features. Here, we tested a new method for LRTI diagnosis that combines the transcriptomic biomarker FABP4 with assessment of text from the electronic medical record (EMR) using the large language model Generative Pre-trained Transformer 4 (GPT-4). We evaluated this methodology in a prospective cohort of critically ill adults with acute respiratory failure, in which we measured pulmonary FABP4 expression and identified patients with LRTI or non-infectious conditions using retrospective adjudication. A diagnostic classifier combining FABP4 and GPT-4 achieved an area under the receiver operating characteristic curve (AUC) of 0.92 ± 0.06 by five-fold cross-validation (CV), outperforming classifiers based on FABP4 expression alone (AUC 0.83) or GPT-4 alone (AUC 0.84). At the Youden's index threshold within each CV fold, the combined classifier achieved a mean sensitivity of 92% ± 7%, specificity of 90% ± 17%, and accuracy of 91% ± 8%. Taken together, our findings suggest that combining a host transcriptional biomarker with interpretation of EMR data using artificial intelligence is a promising new approach to infectious disease diagnosis.
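
The two evaluation metrics named in the abstract, AUC and the operating point at Youden's index (J = sensitivity + specificity - 1), can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so the following is an assumption-laden illustration only: the function names `roc_auc` and `youden_threshold` are hypothetical, and the labels and scores are invented toy values, not study data.

```python
from typing import List, Tuple


def roc_auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def youden_threshold(labels: List[int],
                     scores: List[float]) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) at the first threshold
    that maximizes J = sensitivity + specificity - 1.
    A case is called positive when its score is >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    best_j, best = -1.0, (0.0, 0.0, 0.0)
    for thr in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= thr)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < thr)
        sens, spec = tp / n_pos, tn / n_neg
        j = sens + spec - 1.0
        if j > best_j:  # strict comparison keeps the lowest tied threshold
            best_j, best = j, (thr, sens, spec)
    return best


# Hypothetical toy data: 1 = LRTI, 0 = non-infectious (NOT from the study).
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]

print(roc_auc(toy_labels, toy_scores))           # 0.9375
print(youden_threshold(toy_labels, toy_scores))  # (0.4, 1.0, 0.75)
```

In a cross-validated setting such as the one the abstract reports, these quantities would be computed per fold and then averaged; this sketch covers a single set of scores only.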