Naresh Doni Jayavelu, Hady Samaha, Sonia Tandon Wimalasena, Annmarie Hoch, Jeremy P Gygi, Gisela Gabernet, Al Ozonoff, Shanshan Liu, Carly E Milliren, Ofer Levy, Lindsey R Baden, Esther Melamed, Lauren I R Ehrlich, Grace A McComsey, Rafick P Sekaly, Charles B Cairns, Elias K Haddad, Joanna Schaenman, Albert C Shaw, David A Hafler, Ruth R Montgomery, David B Corry, Farrah Kheradmand, Mark A Atkinson, Scott C Brakenridge, Nelson I Agudelo Higuita, Jordan P Metcalf, Catherine L Hough, William B Messer, Bali Pulendran, Kari C Nadeau, Mark M Davis, Linda N Geng, Ana Fernandez Sesma, Viviana Simon, Florian Krammer, Monica Kraft, Chris Bime, Carolyn S Calfee, David J Erle, Charles R Langelier, Leying Guan, Holden T Maecker, Bjoern Peters, Steven H Kleinstein, Elaine F Reed, Joann Diray-Arce, Nadine Rouphael, Matthew C Altman
medRxiv : the preprint server for health sciences. Published 2025-02-13. doi: 10.1101/2025.02.12.25322164 (https://doi.org/10.1101/2025.02.12.25322164). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11844586/pdf/
Machine learning models predict long COVID outcomes based on baseline clinical and immunologic factors.
The post-acute sequelae of SARS-CoV-2 (PASC), also known as long COVID, remain a significant and incompletely understood health issue. Predicting which acutely infected individuals will go on to develop long COVID is challenging because there are no established biomarkers, clear disease mechanisms, or well-defined sub-phenotypes. Machine learning (ML) models offer the potential to address this by leveraging clinical data to enhance diagnostic precision. We used clinical data, including antibody titers and viral load measurements collected at the time of hospital admission, to predict the likelihood of acute COVID-19 progressing to long COVID. Our ML models achieved median AUROC values ranging from 0.64 to 0.66 and AUPRC values between 0.51 and 0.54, indicating modest predictive ability. Feature importance analysis revealed that low antibody titers and high viral loads at hospital admission were the strongest predictors of long COVID outcomes. Comorbidities, including chronic respiratory, cardiac, and neurologic diseases, as well as female sex, were also identified as significant risk factors for long COVID. Our findings suggest that ML models have the potential to identify patients at risk for developing long COVID based on baseline clinical characteristics. These models can help guide early interventions, improving patient outcomes and mitigating the long-term public health impacts of SARS-CoV-2.
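To make the two reported metrics concrete, the following is a minimal sketch of how AUROC and AUPRC can be computed from predicted risk scores. It is an illustration only: the function names, the toy labels, and the toy scores are invented here and are not from the study, and the abstract does not say which AUPRC estimator the authors used (average precision is assumed below). AUROC is read against a chance level of 0.5, whereas a random classifier's AUPRC is roughly the prevalence of the positive class, so the two numbers are not on the same scale.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive case scores higher than a
    random negative case; tied scores count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Average precision, one common estimator of AUPRC: the mean of the
    precision values at the rank of each positive case, scanning from the
    highest score down. Assumes no tied scores (ties would make the
    ranking order-dependent in this simple version)."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical toy data (NOT study data): 1 = developed long COVID, 0 = did not.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

toy_auroc = auroc(toy_labels, toy_scores)
toy_auprc = average_precision(toy_labels, toy_scores)
toy_prevalence = sum(toy_labels) / len(toy_labels)  # chance-level reference for AUPRC

print(f"AUROC={toy_auroc:.3f}  AUPRC={toy_auprc:.3f}  prevalence={toy_prevalence:.3f}")
```

The pairwise AUROC loop is quadratic in the number of cases, which is fine for a small illustration; a rank-based formulation or an established library routine would be the usual choice for a real cohort.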