Integrating a host transcriptomic biomarker with a large language model for diagnosis of lower respiratory tract infection

medRxiv - Infectious Diseases | Pub Date: 2024-08-29 | DOI: 10.1101/2024.08.28.24312732

Hoang Van Phan, Natasha Spottiswoode, Emily C. Lydon, Victoria T. Chu, Adolfo Cuesta, Alexander D. Kazberouk, Natalie L. Richmond, Carolyn S. Calfee, Charles R. Langelier


Citations: 0

Abstract



Lower respiratory tract infections (LRTIs) are a leading cause of mortality worldwide. Despite this, diagnosing LRTI remains challenging, particularly in the intensive care unit, where non-infectious respiratory conditions can present with similar features. Here, we tested a new method for LRTI diagnosis that combines the transcriptomic biomarker FABP4 with assessment of text from the electronic medical record (EMR) using the large language model Generative Pre-trained Transformer 4 (GPT-4). We evaluated this methodology in a prospective cohort of critically ill adults with acute respiratory failure, in which we measured pulmonary FABP4 expression and identified patients with LRTI or non-infectious conditions using retrospective adjudication. A diagnostic classifier combining FABP4 and GPT-4 achieved an area under the receiver operator curve (AUC) of 0.92 ± 0.06 by five-fold cross validation (CV), outperforming classifiers based on FABP4 expression alone (AUC 0.83) or GPT-4 alone (AUC 0.84). At the Youden’s index within each CV fold, the combined classifier achieved a mean sensitivity of 92% ± 7%, specificity of 90% ± 17% and accuracy of 91% ± 8%. Taken together, our findings suggest that combining a host transcriptional biomarker with interpretation of EMR data using artificial intelligence is a promising new approach to infectious disease diagnosis.
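The abstract reports sensitivity and specificity at the Youden's index within each CV fold. As a minimal sketch of that one step, the Python below picks the score threshold that maximizes J = sensitivity + specificity - 1. This is not the authors' code: the function name `youden_threshold` and the toy scores and labels are hypothetical, and the abstract does not state how the FABP4 and GPT-4 outputs were combined into a single score.

```python
from typing import List, Tuple


def youden_threshold(scores: List[float], labels: List[int]) -> Tuple[float, float, float]:
    """Return (threshold, sensitivity, specificity) maximizing Youden's J.

    A sample is called positive when its score is >= threshold.
    Ties in J are resolved in favor of the lowest threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    best_j = float("-inf")
    best = (0.0, 0.0, 0.0)
    # Every distinct score is a candidate cut-off.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        sens = tp / positives
        spec = tn / negatives
        j = sens + spec - 1.0
        if j > best_j:
            best_j = j
            best = (t, sens, spec)
    return best


# Made-up classifier scores for illustration only (not data from the study).
scores = [0.1, 0.2, 0.35, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]  # 1 = LRTI, 0 = non-infectious
threshold, sensitivity, specificity = youden_threshold(scores, labels)
print(threshold, sensitivity, specificity)
```

In a cross-validated setting such as the one described, this selection would be repeated inside each fold and the resulting sensitivities and specificities averaged across folds.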

