Crowd-sourced machine learning prediction of long COVID using data from the National COVID Cohort Collaborative.

EBioMedicine; IF 9.7; Zone 1 (Medicine); JCR Q1 (Medicine, Research & Experimental); Pub Date: 2024-10-01; Epub Date: 2024-09-24; DOI: 10.1016/j.ebiom.2024.105333
Timothy Bergquist, Johanna Loomba, Emily Pfaff, Fangfang Xia, Zixuan Zhao, Yitan Zhu, Elliot Mitchell, Biplab Bhattacharya, Gaurav Shetty, Tamanna Munia, Grant Delong, Adbul Tariq, Zachary Butzin-Dozier, Yunwen Ji, Haodong Li, Jeremy Coyle, Seraphina Shi, Rachael V Philips, Andrew Mertens, Romain Pirracchio, Mark van der Laan, John M Colford, Alan Hubbard, Jifan Gao, Guanhua Chen, Neelay Velingker, Ziyang Li, Yinjun Wu, Adam Stein, Jiani Huang, Zongyu Dai, Qi Long, Mayur Naik, John Holmes, Danielle Mowery, Eric Wong, Ravi Parekh, Emily Getzen, Jake Hightower, Jennifer Blase
EBioMedicine, volume 108, article 105333. Publication type: Journal Article. PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11462169/pdf/
Citations: 0

Abstract

Background: While many patients seem to recover from SARS-CoV-2 infections, others report experiencing symptoms for weeks or months after their acute COVID-19 ends, and some develop new symptoms weeks after infection. These long-term effects are called post-acute sequelae of SARS-CoV-2 (PASC) or, more commonly, Long COVID. The overall prevalence of Long COVID is currently unknown, and tools are needed to help identify patients at risk of developing Long COVID.

Methods: A working group of the Rapid Acceleration of Diagnostics-radical (RADx-rad) program, composed of individuals from various NIH institutes and centers, in collaboration with REsearching COVID to Enhance Recovery (RECOVER), developed and organized the Long COVID Computational Challenge (L3C), a community challenge aimed at incentivizing the broader scientific community to develop interpretable and accurate methods for identifying patients at risk of developing Long COVID. From August 2022 to December 2022, participants developed Long COVID risk prediction algorithms using the National COVID Cohort Collaborative (N3C) data enclave, a harmonized data repository drawing on more than 75 healthcare institutions across the United States (U.S.).

Findings: Over the course of the challenge, 74 teams designed and built 35 Long COVID prediction models using the N3C data enclave. The top 10 teams all achieved an area under the receiver operating characteristic curve (AUROC) above 0.80, and the highest-scoring model achieved a mean AUROC of 0.895. The top submission included a visualization dashboard that built a timeline for each patient and updated the patient's risk of developing Long COVID in response to clinical events.
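
For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen patient who developed Long COVID receives a higher risk score than a randomly chosen patient who did not. The sketch below computes it by direct pairwise comparison. It is an illustration only, not the challenge's scoring code: the function name `auroc` and the toy labels and scores are hypothetical and do not come from N3C data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUROC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (1 = developed Long COVID, 0 = did not).
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"AUROC on toy data: {toy_auroc:.3f}")
```

The pairwise form is quadratic in the number of patients, so it suits small examples; on large cohorts a rank-based or library implementation would be the practical choice.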

Interpretation: As a result of L3C, federal reviewers identified multiple machine learning models that can be used to identify patients at risk of developing Long COVID. Many teams used approaches in their submissions that can be applied to future clinical prediction questions.

Funding: Research reported in this RADx® Rad publication was supported by the National Institutes of Health. Timothy Bergquist, Johanna Loomba, and Emily Pfaff were supported by Axle Subcontract: NCATS-STSS-P00438.

Source journal: EBioMedicine
Subject area: Biochemistry, Genetics and Molecular Biology (General Biochemistry, Genetics and Molecular Biology)
CiteScore: 17.70
Self-citation rate: 0.90%
Articles published: 579
Review time: 5 weeks
About the journal: eBioMedicine is a comprehensive biomedical research journal that covers a wide range of studies that are relevant to human health. Our focus is on original research that explores the fundamental factors influencing human health and disease, including the discovery of new therapeutic targets and treatments, the identification of biomarkers and diagnostic tools, and the investigation and modification of disease pathways and mechanisms. We welcome studies from any biomedical discipline that contribute to our understanding of disease and aim to improve human health.