Optimized machine learning framework for cardiovascular disease diagnosis: a novel ethical perspective
Authors: Ghadah Alwakid, Farman Ul Haq, Noshina Tariq, Mamoona Humayun, Momina Shaheen, Marwa Alsadun
Journal: BMC Cardiovascular Disorders, 25(1), 123
Publication date: 2025-02-20
Publication type: Journal Article
DOI: https://doi.org/10.1186/s12872-025-04550-w
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11844188/pdf/
Impact factor: 2.0
JCR: Q3 (Cardiac & Cardiovascular Systems)
Citation count: 0
Abstract
Artificial Intelligence (AI) has emerged as a significant driving force for more precise and timely identification of cardiovascular diseases (CVDs). However, high accuracy and reliability in CVD diagnostics are difficult to achieve because clinical data are complex and useful features are hard to select and model. This paper therefore studies AI-based feature selection techniques and the application of AI to CVD classification. It uses Chi-square, Information Gain, Forward Selection, and Backward Elimination to distill cardiovascular health indicators into a refined eight-feature subset. The study emphasizes ethical considerations, including transparency, interpretability, and bias mitigation, and addresses them by employing unbiased datasets, fair feature selection techniques, and rigorous validation metrics to ensure fairness and trustworthiness in the AI-based diagnostic process. Several Machine Learning (ML) models, namely Random Forest (RF), XGBoost, Decision Trees (DT), and Logistic Regression (LR), are compared for predictive performance. Among them, XGBoost performs best, achieving 99% accuracy, 100% recall, 99% F1-measure, and 99% precision. Principal Component Analysis (PCA) is then applied to the eight-feature subset for dimensionality reduction, yielding a compact six-attribute feature subset. XGBoost is again the best-performing model, achieving accuracy, recall, F1-measure, and precision of 98%, 100%, 98%, and 97%, respectively, when applied to the feature subset derived from the combination of Chi-square and Forward Selection methods.
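The four reported scores are related through the standard confusion-matrix definitions, with F1 = 2PR / (P + R) for precision P and recall R. The following is a minimal Python sketch of those definitions, not the authors' code: the function names and the toy labels are hypothetical and chosen only for illustration. It also checks that the reported F1-measures are consistent with the reported precision and recall.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
    }


def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Hypothetical toy labels (1 = CVD present); not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 1, 1, 0, 0, 0, 1]
toy = classification_metrics(y_true, y_pred)
print(toy)

# Consistency check on the scores reported in the abstract.
f1_eight_features = f1_from(0.99, 1.00)  # 99% precision, 100% recall
f1_after_pca = f1_from(0.97, 1.00)       # 97% precision, 100% recall
print(round(f1_eight_features, 2), round(f1_after_pca, 2))
```

With 100% recall there are no false negatives, so any gap between accuracy and 100% comes entirely from false positives. This is why precision falls below recall in both reported configurations.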
About the journal:
BMC Cardiovascular Disorders is an open access, peer-reviewed journal that considers articles on all aspects of the prevention, diagnosis and management of disorders of the heart and circulatory system, as well as related molecular and cell biology, genetics, pathophysiology, epidemiology, and controlled trials.