Identification of heart failure subtypes using transformer-based deep learning modelling: a population-based study of 379,108 individuals
Zhengxian Fan, Mohammad Mamouei, Yikuan Li, Shishir Rao, Kazem Rahimi
EBioMedicine, volume 114, 105657. Journal article, published 2025-03-19.
DOI: 10.1016/j.ebiom.2025.105657 (https://doi.org/10.1016/j.ebiom.2025.105657)
Citations: 0
Abstract
Background: Heart failure (HF) is a complex syndrome with varied presentations and progression patterns. Traditional classification systems based on left ventricular ejection fraction (LVEF) have limitations in capturing the heterogeneity of HF. We aimed to explore the application of deep learning, specifically a Transformer-based approach, to analyse electronic health records (EHR) for a refined subtyping of patients with HF.
Methods: We utilised linked EHR from primary and secondary care, sourced from the Clinical Practice Research Datalink (CPRD) Aurum, which encompassed health data of over 30 million individuals in the UK. Individuals aged 35 and above with incident reports of HF between January 1, 2005, and January 1, 2018, were included. We proposed a Transformer-based approach to cluster patients based on all clinical diagnoses, procedures, and medication records in EHR. Statistical machine learning (ML) methods were used for comparative benchmarking. The models were trained on a derivation cohort and assessed for their ability to delineate distinct clusters and prognostic value by comparing one-year all-cause mortality and HF hospitalisation rates among the identified subgroups in a separate validation cohort. Association analyses were conducted to elucidate the clinical characteristics of the derived clusters.
Findings: A total of 379,108 patients were included in the HF subtyping analysis. The Transformer-based approach outperformed alternative methods, delineating more distinct and prognostically valuable clusters. This approach identified seven unique HF patient clusters characterised by differing patterns of mortality, hospitalisation, and comorbidities. These clusters were labelled based on the dominant clinical features present at the initial diagnosis of HF: early-onset, hypertension, ischaemic heart disease, metabolic problems, chronic obstructive pulmonary disease (COPD), thyroid dysfunction, and late-onset clusters. The Transformer-based subtyping approach successfully captured the multifaceted nature of HF.
Interpretation: This study identified seven distinct subtypes, including COPD-related and thyroid dysfunction-related subgroups, which are two high-risk subgroups not recognised in previous subtyping analyses. These insights lay the groundwork for further investigations into tailored and effective management strategies for HF.
Funding: British Heart Foundation, European Union - Horizon Europe, and Novo Nordisk Research Centre Oxford.
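The validation step described under Methods (assigning patients in a separate cohort to clusters learned on the derivation cohort, then comparing one-year outcome rates across clusters) can be illustrated with a minimal sketch. This is not the authors' pipeline: the nearest-centroid assignment, the two-dimensional embeddings, the function names, and all numbers are hypothetical stand-ins chosen only to show the shape of the comparison.

```python
from collections import defaultdict
from math import dist


def assign_to_nearest_centroid(embedding, centroids):
    """Return the index of the centroid closest to a patient embedding."""
    return min(range(len(centroids)), key=lambda k: dist(embedding, centroids[k]))


def one_year_event_rates(cluster_ids, events):
    """Return the fraction of patients with the event (coded 1) in each cluster."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for cluster, event in zip(cluster_ids, events):
        totals[cluster] += 1
        counts[cluster] += event
    return {c: counts[c] / totals[c] for c in sorted(totals)}


# Made-up data: the centroids stand in for clusters learned on a derivation cohort.
centroids = [(0.0, 0.0), (5.0, 5.0)]
# Hypothetical patient embeddings and one-year mortality flags for a validation cohort.
validation_embeddings = [(0.1, -0.2), (4.8, 5.1), (0.3, 0.4), (5.2, 4.9)]
died_within_one_year = [0, 1, 0, 0]

validation_clusters = [
    assign_to_nearest_centroid(e, centroids) for e in validation_embeddings
]
rates = one_year_event_rates(validation_clusters, died_within_one_year)
print(validation_clusters)
print(rates)
```

In a real analysis the embeddings would come from the trained model, and differences in rates between clusters would be tested statistically rather than read off directly.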
Journal: EBioMedicine
Subject area: Biochemistry, Genetics and Molecular Biology (General Biochemistry, Genetics and Molecular Biology)
CiteScore: 17.70
Self-citation rate: 0.90%
Articles published: 579
Review time: 5 weeks
About the journal:
eBioMedicine is a comprehensive biomedical research journal that covers a wide range of studies that are relevant to human health. Our focus is on original research that explores the fundamental factors influencing human health and disease, including the discovery of new therapeutic targets and treatments, the identification of biomarkers and diagnostic tools, and the investigation and modification of disease pathways and mechanisms. We welcome studies from any biomedical discipline that contribute to our understanding of disease and aim to improve human health.