Precision phenotyping for curating research cohorts of patients with unexplained post-acute sequelae of COVID-19
Alaleh Azhir, Jonas Hügel, Jiazi Tian, Jingya Cheng, Ingrid V Bassett, Douglas S Bell, Elmer V Bernstam, Maha R Farhat, Darren W Henderson, Emily S Lau, Michele Morris, Yevgeniy R Semenov, Virginia A Triant, Shyam Visweswaran, Zachary H Strasser, Jeffrey G Klann, Shawn N Murphy, Hossein Estiri
Med. Published 2024-11-02. DOI: 10.1016/j.medj.2024.10.009
Abstract
Background: Scalable identification of patients with post-acute sequelae of COVID-19 (PASC) is challenging due to a lack of reproducible precision phenotyping algorithms, which has led to suboptimal accuracy, demographic biases, and underestimation of PASC.
Methods: In a retrospective case-control study, we developed a precision phenotyping algorithm for identifying cohorts of patients with PASC. We used longitudinal electronic health records data from over 295,000 patients from 14 hospitals and 20 community health centers in Massachusetts. The algorithm employs an attention mechanism to simultaneously exclude sequelae that prior conditions can explain and include infection-associated chronic conditions. We performed independent chart reviews to tune and validate the algorithm.
Findings: The PASC phenotyping algorithm improves precision and prevalence estimation and reduces bias in identifying PASC cohorts compared to the ICD-10-CM code U09.9. The algorithm identified a cohort of over 24,000 patients with 79.9% precision. Our estimated prevalence of PASC was 22.8%, which is close to the national estimates for the region. We also provide in-depth analyses, encompassing identified lingering effects by organ, comorbidity profiles, and temporal differences in the risk of PASC.
Conclusions: PASC precision phenotyping offers better precision and prevalence estimation, with less bias in identifying patients with PASC, than the ICD-10-CM code U09.9. The cohort derived from this algorithm will serve as a starting point for studying the genetic, metabolomic, and clinical features of PASC, addressing the constraints of prior PASC cohort studies.
Funding: This research was funded by the US National Institute of Allergy and Infectious Diseases (NIAID).
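The precision and prevalence figures reported in the Findings can be read as simple ratios. The following is a minimal sketch of how such ratios are computed from chart-review labels, not the authors' implementation: the `ChartReview` type, the function names, and all counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChartReview:
    """One reviewed patient (hypothetical structure, not from the paper)."""
    flagged_by_algorithm: bool   # the phenotyping algorithm labeled the patient as PASC
    confirmed_by_reviewer: bool  # an independent chart reviewer agreed


def precision(reviews):
    """Share of algorithm-flagged patients confirmed by review: TP / (TP + FP)."""
    flagged = [r for r in reviews if r.flagged_by_algorithm]
    if not flagged:
        raise ValueError("no flagged patients to evaluate")
    true_positives = sum(r.confirmed_by_reviewer for r in flagged)
    return true_positives / len(flagged)


def prevalence(n_cases, n_at_risk):
    """Share of the at-risk population identified as cases."""
    if n_at_risk <= 0:
        raise ValueError("n_at_risk must be positive")
    return n_cases / n_at_risk


# Toy data, invented for illustration: 5 flagged patients, 4 of them confirmed.
reviews = (
    [ChartReview(True, True)] * 4
    + [ChartReview(True, False)]
    + [ChartReview(False, False)] * 5
)

print(precision(reviews))   # 4 confirmed out of 5 flagged
print(prevalence(2, 10))    # 2 cases among 10 at-risk patients
```

Precision here depends only on the flagged patients, which is why chart review of a flagged sample is enough to estimate it; prevalence additionally requires a defined at-risk denominator.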
About the journal:
Med is a flagship medical journal published monthly by Cell Press, the global publisher of trusted and authoritative science journals including Cell, Cancer Cell, and Cell Reports Medicine. Our mission is to advance clinical research and practice by providing a communication forum for the publication of clinical trial results, innovative observations from longitudinal cohorts, and pioneering discoveries about disease mechanisms. The journal also encourages thought-leadership discussions among biomedical researchers, physicians, and other health scientists and stakeholders. Our goal is to improve health worldwide sustainably and ethically.
Med publishes rigorously vetted original research and cutting-edge review and perspective articles on critical health issues globally and regionally. Our research section covers clinical case reports, first-in-human studies, large-scale clinical trials, population-based studies, as well as translational research work with the potential to change the course of medical research and improve clinical practice.