Identification of Severe Acute Exacerbations of Chronic Obstructive Pulmonary Disease Subgroups by Machine Learning Implementation in Electronic Health Records.

IF 2.3 · Region 4 (Medicine) · JCR Q2 (Respiratory System) · Chronic Obstructive Pulmonary Diseases-Journal of the Copd Foundation · Pub Date: 2024-11-22 · DOI: 10.15326/jcopdf.2024.0556

Huan Li, John Huston, Jana Zielonka, Shannon Kay, Maor Sauler, Jose Gomez

Pages: 611-623. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11703024/pdf/


Abstract

Rationale: Acute exacerbations of chronic obstructive pulmonary disease (AECOPDs) are heterogeneous. Machine learning (ML) has previously been used to dissect some of the heterogeneity in COPD. The widespread adoption of electronic health records (EHRs) has led to the rapid accumulation of large amounts of patient data as part of routine clinical care. However, it is unclear whether the implementation of ML in EHR-derived data has the potential to identify subgroups of AECOPD.

Objectives: To determine whether ML implementation using EHR data from severe AECOPDs requiring hospitalization identifies relevant subgroups.

Methods: This study used 2 retrospective cohorts of patients with AECOPDs (non-COVID-19 and COVID-19) treated at Yale-New Haven Hospital. K-means clustering was used to identify patient subgroups.
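As an illustration of the clustering step named in the Methods, the following is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The abstract does not state which EHR variables, preprocessing, or software the authors used, so the two feature columns (eosinophil count, creatinine), the toy values, and the helper names `standardize`, `nearest`, and `kmeans` are all assumptions made for this example. A real analysis would more likely rely on an established library implementation.

```python
import random
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column so that no single variable dominates the distance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    scaled = [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]
    return scaled, mus, sds


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[j] = [mean(c) for c in zip(*members)]
    return labels, centroids


# Synthetic toy cohort (NOT study data): [eosinophil count, creatinine].
toy_cohort = [
    [0.1, 0.9], [0.2, 1.0], [0.1, 1.1],
    [0.8, 0.9], [0.9, 1.0], [0.7, 1.0],
    [0.1, 2.5], [0.2, 2.8], [0.15, 3.0],
]
scaled, mus, sds = standardize(toy_cohort)
labels, centroids = kmeans(scaled, k=3)

# A patient from a second cohort is assigned to the nearest existing centroid,
# after applying the same scaling that was learned on the first cohort.
new_patient = [(v - m) / s for v, m, s in zip([0.85, 1.0], mus, sds)]
new_label = nearest(new_patient, centroids)
```

K-means with random initialization can settle in a local optimum, so in practice the algorithm is usually restarted several times and the solution with the lowest within-cluster sum of squares is kept.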

Measurements and main results: We identified 3 subgroups in the non-COVID cohort (n=1736), each with distinct clinical characteristics. The reference subgroup was the largest (n=904), followed by the cardio-renal (n=548) and eosinophilic (n=284) subgroups. The eosinophilic subgroup had milder AECOPDs, including a shorter hospital stay (p<0.01). The cardio-renal subgroup had the highest mortality both during hospitalization (5%) and in the year after hospitalization (30%). Validation of the severe AECOPD classifier in the COVID-19 cohort recapitulated the characteristics seen in the non-COVID cohort. AECOPD subgroups in the COVID-19 cohort had different interleukin (IL)-1 beta, IL-2R, and IL-8 levels (false discovery rate ≤ 0.05). These specific leukocyte and cytokine profiles resulted in inflammatory differences between the AECOPD subgroups based on C-reactive protein levels.
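The false discovery rate threshold reported for the cytokine comparisons can be illustrated with a short sketch. The abstract does not name the correction procedure, so the Benjamini-Hochberg method is assumed here, and the p-values are invented for illustration only; `benjamini_hochberg` is a hypothetical helper, not the authors' code.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented p-values for four hypothetical cytokine comparisons (NOT study data).
raw_p = [0.01, 0.04, 0.03, 0.5]
q_values = benjamini_hochberg(raw_p)
significant = [q <= 0.05 for q in q_values]
```

Each adjusted value is compared against the chosen threshold (0.05 in the abstract), so a comparison is reported only if its adjusted value does not exceed that threshold.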

Conclusions: Incorporating ML with EHR data allows the identification of specific clinical and biological subgroups for severe AECOPD.

