Tailoring Risk Prediction Models to Local Populations.

JAMA Cardiology | Impact factor: 14.1 | Tier 1 (Medicine) | JCR Q1 (Cardiac & Cardiovascular Systems) | Publication date: 2024-09-18 | DOI: 10.1001/jamacardio.2024.2912

Aniket N Zinzuwadia, Olga Mineeva, Chunying Li, Zareen Farukhi, Franco Giulianini, Brian Cade, Lin Chen, Elizabeth Karlson, Nina Paynter, Samia Mora, Olga Demler

Citations: 0

Abstract

Importance
Risk estimation is an integral part of cardiovascular care. Local recalibration of guideline-recommended models could address the limitations of existing tools.

Objective
To provide a machine learning (ML) approach to augment the performance of the American Heart Association's Predicting Risk of Cardiovascular Disease Events (AHA-PREVENT) equations when applied to a local population while preserving clinical interpretability.

Design, Setting, and Participants
This cohort study used a New England-based electronic health record cohort of patients without prior atherosclerotic cardiovascular disease (ASCVD) who had the data necessary to calculate the AHA-PREVENT 10-year risk of developing ASCVD in the event period (2007-2016). Patients with prior ASCVD events, death prior to 2007, or age 79 years or older in 2007 were excluded. The final study population of 95 326 patients was split into 3 nonoverlapping subsets for training, testing, and validation. The AHA-PREVENT model was adapted to this local population using the open-source ML model (MLM) Extreme Gradient Boosting (XGBoost) with minimal predictor variables, including age, sex, and AHA-PREVENT. The MLM was monotonically constrained to preserve known associations between risk factors and ASCVD risk. Along with sex, race and ethnicity data from the electronic health record were collected to validate the performance of ASCVD risk prediction in subgroups. Data were analyzed from August 2021 to February 2024.

Main Outcomes and Measures
Consistent with the AHA-PREVENT model, ASCVD events were defined as the first occurrence of nonfatal myocardial infarction, coronary artery disease, ischemic stroke, or cardiovascular death. Cardiovascular death was coded via government registries. Discrimination was assessed using the Harrell C index, calibration using a modified Hosmer-Lemeshow goodness-of-fit test and calibration curves, and risk reclassification using reclassification tables.

Results
In the test set of 38 137 patients (mean [SD] age, 64.8 [6.9] years; 22 708 [59.5%] women and 15 429 [40.5%] men; 935 [2.5%] Asian, 2153 [5.6%] Black, 1414 [3.7%] Hispanic, 31 400 [82.3%] White, and 2235 [5.9%] other, including American Indian, multiple races, unspecified, and unrecorded, consolidated owing to small numbers), MLM-PREVENT had improved calibration (modified Hosmer-Lemeshow P > .05) compared with the AHA-PREVENT model across risk categories in the overall cohort ($\chi^2_3 = 2.2$; P = .53 vs $\chi^2_3 > 16.3$; P < .001) and in sex subgroups (men: $\chi^2_3 = 2.1$; P = .55 vs $\chi^2_3 > 16.3$; P < .001; women: $\chi^2_3 = 6.5$; P = .09 vs $\chi^2_3 > 16.3$; P < .001), while also surpassing a traditional recalibration approach. MLM-PREVENT maintained or improved AHA-PREVENT's calibration in Asian, Black, and White individuals. MLM-PREVENT and AHA-PREVENT performed equally well in discriminating risk (approximate ΔC index, ±0.01). Using a clinically significant 7.5% risk threshold, MLM-PREVENT reclassified a total of 11.5% of patients. We visualize the recalibration through MLM-PREVENT ASCVD risk charts that highlight preserved risk associations of the original AHA-PREVENT model.

Conclusions and Relevance
The interpretable ML approach presented in this article enhanced the accuracy of the AHA-PREVENT model when applied to a local population while still preserving the risk associations found by the original model. This method has the potential to recalibrate other established risk tools and is implementable in electronic health record systems for improved cardiovascular risk assessment.
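The abstract reports discrimination with the Harrell C index and reclassification at a 7.5% risk threshold. The following is a minimal sketch of how those two quantities can be computed, not the authors' code. The function names `harrell_c` and `reclassified_fraction` and all data values are hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplification. The toy recalibrated risks are chosen to keep the same rank order as the original risks, to illustrate how recalibration can move patients across a threshold while leaving the C index unchanged.

```python
from itertools import combinations


def harrell_c(times, events, risks):
    """Harrell C index: share of comparable pairs in which the patient
    with the earlier observed event also has the higher predicted risk."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times; the earlier time must be an event.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def reclassified_fraction(old_risks, new_risks, threshold=0.075):
    """Fraction of patients placed on different sides of the risk
    threshold by the two models."""
    moved = sum(
        (old >= threshold) != (new >= threshold)
        for old, new in zip(old_risks, new_risks)
    )
    return moved / len(old_risks)


# Hypothetical toy data (not from the study): follow-up years, event flags,
# and 10-year risk estimates from an original and a recalibrated model.
times = [2, 5, 7, 10, 10]
events = [1, 1, 0, 0, 0]
old_risks = [0.20, 0.05, 0.10, 0.04, 0.06]
new_risks = [0.30, 0.08, 0.15, 0.06, 0.10]  # same rank order as old_risks

print("C index, original model:", harrell_c(times, events, old_risks))
print("C index, recalibrated model:", harrell_c(times, events, new_risks))
print("Fraction reclassified at 7.5%:", reclassified_fraction(old_risks, new_risks))
```

Because the C index depends only on the ordering of predicted risks, any rank-preserving rescaling leaves it unchanged, whereas threshold-based reclassification depends on the absolute risk values.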

Source journal

JAMA Cardiology (Medicine: Cardiology and Cardiovascular Medicine)

CiteScore: 45.80
Self-citation rate: 1.70%
Articles published: 264

Journal description: JAMA Cardiology, an international peer-reviewed journal, serves as the premier publication for clinical investigators, clinicians, and trainees in cardiovascular medicine worldwide. As a member of the JAMA Network, it aligns with a consortium of peer-reviewed general medical and specialty publications. Published online weekly, every Wednesday, and in 12 print/online issues annually, JAMA Cardiology attracts over 4.3 million annual article views and downloads. Research articles become freely accessible online 12 months post-publication without any author fees. Moreover, the online version is readily accessible to institutions in developing countries through the World Health Organization's HINARI program. Positioned at the intersection of clinical investigation, actionable clinical science, and clinical practice, JAMA Cardiology prioritizes traditional and evolving cardiovascular medicine, alongside evidence-based health policy. It places particular emphasis on health equity, especially when grounded in original science, as a top editorial priority.