Unlocking the link: predicting cardiovascular disease risk with a focus on airflow obstruction using machine learning.

BMC Medical Informatics and Decision Making (IF 3.8; Zone 3, Medicine; JCR Q2, Medical Informatics) Pub Date: 2025-02-03 DOI: 10.1186/s12911-025-02885-0
Xiyu Cao, Jianli Ma, Xiaoyi He, Yufei Liu, Yang Yang, Yaqi Wang, Chuantao Zhang
BMC Medical Informatics and Decision Making 25(1): 50. Journal Article. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11792416/pdf/

Abstract

Background: Respiratory diseases and cardiovascular diseases (CVD) often coexist, and the severity of airflow obstruction (AO) is closely linked to CVD incidence and mortality. As the burden of both conditions rises, early identification of and intervention in at-risk populations are crucial. However, current CVD risk models do not adequately consider AO as an independent risk factor. Developing an accurate risk prediction model could therefore support early identification and intervention.

Methods: This study used data from the National Health and Nutrition Examination Survey (NHANES) III (1988-1994) and NHANES 2007-2012. Participants aged over 40 with complete AO and CVD data were included; those with missing key data were excluded. The analysis covered 12 variables: age, gender, race, poverty income ratio (PIR), education, smoking, alcohol use, body mass index (BMI), hyperlipidemia, hypertension, diabetes, and AO. Logistic regression was used to analyze the association between AO and CVD, supplemented by sensitivity and subgroup analyses. Six machine learning (ML) models were trained to predict CVD risk in the general population, with AO as a predictor. Hyperparameters were optimized using RandomizedSearchCV with 5-fold cross-validation. Models were evaluated by AUC, accuracy, precision, recall, F1 score, and Brier score, and SHapley Additive exPlanations (SHAP) were used to improve explainability. A separate ML model was built for the subpopulation with AO and evaluated in the same way.
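The evaluation metrics named in the Methods can be illustrated with a minimal, dependency-free sketch. The labels and predicted probabilities below are made-up toy values, not NHANES data. The function names (`roc_auc`, `brier_score`, `threshold_metrics`) and the 0.5 decision threshold are assumptions chosen for illustration, and the study itself presumably relied on library implementations rather than code like this.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Ties count as half a win (Mann-Whitney U formulation).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and outcome."""
    return sum((s - y) ** 2 for y, s in zip(y_true, y_score)) / len(y_true)


def threshold_metrics(
    y_true: Sequence[int], y_score: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 after binarizing at `threshold`."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy data: 1 = CVD, 0 = no CVD; scores are predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7, 0.6, 0.9]

auc = roc_auc(y_true, y_score)
brier = brier_score(y_true, y_score)
metrics = threshold_metrics(y_true, y_score)
print(auc, brier, metrics)
```

AUC and the Brier score are computed from the predicted probabilities directly, whereas accuracy, precision, recall, and F1 depend on the chosen threshold, which is one reason a study would report both kinds of metric.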

Results: The cross-sectional analysis showed a significant positive association between AO and CVD prevalence (all P < 0.05), indicating that AO is an important risk factor for CVD. For the general population, XGBoost was selected as the optimal model for predicting CVD risk (AUC = 0.7508, average precision [AP] = 0.3186); the three most important features were age, hypertension, and PIR. For the subpopulation with AO, XGBoost was again selected as the optimal model (AUC = 0.6645, AP = 0.3545). In this subpopulation, SHAP showed that education level had the greatest impact on predicted CVD risk, followed by gender and race.
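The idea behind a SHAP-based feature ranking can be sketched with exact Shapley values on a tiny model. Everything below is hypothetical: `toy_risk` is an invented scoring function with arbitrary coefficients, not the study's XGBoost model, the three binary features and the all-zero baseline are illustrative assumptions, and the resulting ranking carries no empirical meaning. The brute-force enumeration only scales to a handful of features, which is why practical analyses use the SHAP library's optimized explainers.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence


def shapley_values(
    predict: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: Sequence[float],
) -> List[float]:
    """Exact Shapley values for one row; absent features take baseline values."""
    n = len(x)

    def value(coalition: frozenset) -> float:
        row = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(row)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(row: Sequence[float]) -> float:
    """Hypothetical risk score with one interaction term (arbitrary coefficients)."""
    older_age, hypertension, low_pir = row
    return 0.2 * older_age + 0.1 * hypertension + 0.05 * older_age * hypertension + 0.02 * low_pir


feature_names = ["older_age", "hypertension", "low_pir"]  # hypothetical binary features
baseline = [0, 0, 0]
rows = [[1, 1, 1], [1, 0, 0], [0, 1, 1]]  # made-up individuals

per_row = [shapley_values(toy_risk, row, baseline) for row in rows]

# Global importance: mean absolute Shapley value per feature across rows.
importance = [
    sum(abs(phi[i]) for phi in per_row) / len(per_row) for i in range(len(feature_names))
]
ranking = [name for _, name in sorted(zip(importance, feature_names), reverse=True)]
print(per_row[0], ranking)
```

For each row, the Shapley values sum to the difference between the model's prediction for that row and its prediction for the baseline, so the attribution is a complete decomposition of the predicted risk rather than a separate importance score.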

Conclusion: AO correlates positively with CVD. In the general population, age, hypertension, and PIR affect CVD risk the most. For patients with AO, education, gender, and race are the key CVD risk factors.
