Machine learning models predict long COVID outcomes based on baseline clinical and immunologic factors.
Naresh Doni Jayavelu, Hady Samaha, Sonia Tandon Wimalasena, Annmarie Hoch, Jeremy P Gygi, Gisela Gabernet, Al Ozonoff, Shanshan Liu, Carly E Milliren, Ofer Levy, Lindsey R Baden, Esther Melamed, Lauren I R Ehrlich, Grace A McComsey, Rafick P Sekaly, Charles B Cairns, Elias K Haddad, Joanna Schaenman, Albert C Shaw, David A Hafler, Ruth R Montgomery, David B Corry, Farrah Kheradmand, Mark A Atkinson, Scott C Brakenridge, Nelson I Agudelo Higuita, Jordan P Metcalf, Catherine L Hough, William B Messer, Bali Pulendran, Kari C Nadeau, Mark M Davis, Linda N Geng, Ana Fernandez Sesma, Viviana Simon, Florian Krammer, Monica Kraft, Chris Bime, Carolyn S Calfee, David J Erle, Charles R Langelier, Leying Guan, Holden T Maecker, Bjoern Peters, Steven H Kleinstein, Elaine F Reed, Joann Diray-Arce, Nadine Rouphael, Matthew C Altman
medRxiv : the preprint server for health sciences. Published 2025-02-13. DOI: https://doi.org/10.1101/2025.02.12.25322164. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11844586/pdf/
Citation count: 0
Abstract
The post-acute sequelae of SARS-CoV-2 (PASC), also known as long COVID, remain a significant health issue that is incompletely understood. Predicting which acutely infected individuals will go on to develop long COVID is challenging due to the lack of established biomarkers, clear disease mechanisms, or well-defined sub-phenotypes. Machine learning (ML) models offer the potential to address this by leveraging clinical data to enhance diagnostic precision. We utilized clinical data, including antibody titers and viral load measurements collected at the time of hospital admission, to predict the likelihood of acute COVID-19 progressing to long COVID. Our machine learning models achieved median AUROC values ranging from 0.64 to 0.66 and AUPRC values between 0.51 and 0.54, demonstrating their predictive capabilities. Feature importance analysis revealed that low antibody titers and high viral loads at hospital admission were the strongest predictors of long COVID outcomes. Comorbidities, including chronic respiratory, cardiac, and neurologic diseases, as well as female sex, were also identified as significant risk factors for long COVID. Our findings suggest that ML models have the potential to identify patients at risk for developing long COVID based on baseline clinical characteristics. These models can help guide early interventions, improving patient outcomes and mitigating the long-term public health impacts of SARS-CoV-2.
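For readers less familiar with the two metrics reported in the abstract, the following is a minimal sketch of how AUROC and AUPRC can be computed from predicted risk scores. It is not the authors' pipeline: the function names `auroc` and `average_precision` and the eight-patient toy data are hypothetical, and AUPRC is approximated here by average precision, one common estimator. A useful reference point when reading such numbers is that a random classifier scores about 0.5 on AUROC, while its expected AUPRC equals the fraction of positive cases in the data.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision: mean of the precision values at each true positive.

    Simplification: assumes all scores are distinct (no tie handling).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank patients from highest to lowest predicted risk.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: 1 = developed long COVID, 0 = did not.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
# Hypothetical model risk scores for the same eight patients.
scores = [0.90, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]

prevalence = sum(labels) / len(labels)  # chance-level AUPRC for this toy set
print("AUROC:", auroc(labels, scores))
print("AUPRC (average precision):", average_precision(labels, scores))
print("Chance-level AUPRC (prevalence):", prevalence)
```

The pairwise AUROC loop is quadratic in the number of patients, which is fine for illustration; a rank-based formulation or an established library routine would be the usual choice on real cohorts.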