Random Survival Forest Machine Learning for the Prediction of Cardiovascular Events Among Patients With a Measured Lipoprotein(a) Level: A Model Development Study.
Jay B Lusk, Emily C O'Brien, Bradley G Hammill, Fan Li, Brian Mac Grory, Manesh R Patel, Neha J Pagidipati, Nishant P Shah
Circulation: Genomic and Precision Medicine, e004629. Journal Article. Published 2025-02-01 (Epub 2025-01-23). DOI: 10.1161/CIRCGEN.124.004629 (https://doi.org/10.1161/CIRCGEN.124.004629)
Abstract
Background: Established risk models may not be applicable to patients at higher cardiovascular risk with a measured Lp(a) (lipoprotein[a]) level, a causal risk factor for atherosclerotic cardiovascular disease.
Methods: This was a model development study. The data source was the Nashville Biosciences Lp(a) data set, which includes clinical data from the Vanderbilt University Health System. We included patients who had an Lp(a) level measured between 1989 and 2022 and at least 1 year of electronic health record data before that measurement. The end point of interest was time to first myocardial infarction, stroke/TIA (transient ischemic attack), or coronary revascularization. A random survival forest model was derived and compared with a Cox proportional hazards model derived from traditional cardiovascular risk factors (ie, the variables used to estimate the Pooled Cohort Equations for the primary prevention population and the variables used to estimate the Second Manifestations of Arterial Disease and Thrombolysis in Myocardial Infarction Risk Score for Secondary Prevention scores for the secondary prevention population). Model discrimination was evaluated using Harrell's C-index.
Results: A total of 4369 patients were included in the study (49.5% female; mean age, 51 [SD 18] years; mean Lp(a) level, 33.6 [38.6] mg/dL; 23.7% with a prior cardiovascular event). The random survival forest model outperformed the traditional risk factor models in the test set (C-index, 0.82 [random forest model] versus 0.69 [primary prevention model] versus 0.80 [secondary prevention model]). These results were similar when restricted to a primary prevention population and under various strategies to handle competing risks. A Cox proportional hazards model based on the top 25 variables from the random forest model had a C-index of 0.80.
Conclusions: A random survival forest model outperformed a model using traditional risk factors for predicting cardiovascular events in patients with a measured Lp(a) level.
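The abstract reports discrimination as Harrell's C-index but does not describe how it was computed or which software was used. As a rough illustration only, the sketch below shows one common pairwise formulation for right-censored data. The function name `harrell_c_index`, the decision to skip pairs with tied follow-up times, and the toy data are all assumptions made for this example; none of them come from the study.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Illustrative Harrell's C-index for right-censored data.

    times       -- observed follow-up time for each patient
    events      -- 1 if the event was observed, 0 if the patient was censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            # Assumption for this sketch: pairs with tied times are skipped.
            continue
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if not events[a]:
            # The shorter time is censored, so the pair cannot be ordered.
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not from the study): four patients, one censored.
toy_times = [2, 4, 6, 8]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.5, 0.1]
print(harrell_c_index(toy_times, toy_events, toy_risk))  # prints 0.8
```

In the toy data, five pairs are comparable and four of them are ranked correctly, giving 4/5 = 0.8. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported values of 0.82, 0.69, and 0.80 should be read.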
About the journal:
Circulation: Genomic and Precision Medicine is a distinguished journal dedicated to advancing the frontiers of cardiovascular genomics and precision medicine. It publishes a diverse array of original research articles that delve into the genetic and molecular underpinnings of cardiovascular diseases. The journal's scope is broad, encompassing studies from human subjects to laboratory models, and from in vitro experiments to computational simulations.
Circulation: Genomic and Precision Medicine is committed to publishing studies that have direct relevance to human cardiovascular biology and disease, with the ultimate goal of improving patient care and outcomes. The journal serves as a platform for researchers to share their groundbreaking work, fostering collaboration and innovation in the field of cardiovascular genomics and precision medicine.