Transfer learning-based hybrid VGG16-machine learning approach for heart disease detection with explainable artificial intelligence.

Frontiers in Artificial Intelligence (IF 4.7; JCR Q2, Computer Science, Artificial Intelligence). Pub Date: 2025-02-25. eCollection Date: 2025-01-01. DOI: 10.3389/frai.2025.1504281

Eshetie Gizachew Addisu, Tahayu Gizachew Yirga, Hailu Gizachew Yirga, Alemu Demeke Yehuala

Frontiers in Artificial Intelligence, volume 8, article 1504281. Publication type: Journal Article. PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11893864/pdf/

Citations: 0

Abstract

Heart disease is a leading cause of mortality worldwide, making accurate early detection essential for effective treatment and management. This study introduces a novel hybrid machine-learning approach that combines transfer learning using the VGG16 convolutional neural network (CNN) with various machine-learning classifiers for heart disease detection. A conditional tabular generative adversarial network (CTGAN) was employed to generate synthetic data samples from actual datasets; these were evaluated using statistical metrics, correlation analysis, and domain expert assessments to ensure the quality of the synthetic datasets. The dataset comprises tabular data with 13 features, which are reshaped into an image-like format and resized to 224x224x3 to meet the input requirements of the VGG16 model. Feature extraction is performed using VGG16, and the extracted features are then fused with the original tabular data. This combined feature set is then used to train various machine learning models, including Support Vector Machines (SVM), Gradient Boosting, Random Forest, Logistic Regression, K-nearest neighbors (KNN), and Decision Trees. Among these models, the VGG16-Random Forest hybrid achieved notable results across all evaluation metrics, including 92% accuracy, 91.3% precision, 92.2% recall, 91.82% specificity, 92.2% sensitivity, and 91.75% F1-score. The hybrid models were also evaluated using unseen datasets to assess the generalizability of the proposed approaches, with the VGG16-Random Forest combination showing relatively promising results. Additionally, explainability is integrated into the model using SHAP values, providing insights into the contribution of each feature to the model's predictions. This hybrid VGG16-ML approach demonstrates the potential for highly accurate and interpretable heart disease detection, offering valuable support in clinical decision-making processes.
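The data path described in the abstract (13 tabular features turned into a 224x224x3 image-like array, deep features extracted, then fused with the original features) can be illustrated with a minimal NumPy sketch. The abstract does not say how the reshaping is done, so the tiling scheme in `tabular_to_image` is an assumption, and `placeholder_extractor` is a hypothetical stand-in for VGG16, not the authors' model. In a real setup one might instead call a pretrained network such as `tensorflow.keras.applications.VGG16(weights="imagenet", include_top=False, pooling="avg")`.

```python
import numpy as np


def tabular_to_image(row, size=224):
    """Map one feature vector to a (size, size, 3) array.

    ASSUMPTION: the vector is tiled to fill a size x size plane and the plane
    is copied into three channels. The paper's actual scheme may differ.
    """
    row = np.asarray(row, dtype=np.float32)
    n_pixels = size * size
    reps = -(-n_pixels // row.size)  # ceiling division
    plane = np.tile(row, reps)[:n_pixels].reshape(size, size)
    return np.stack([plane, plane, plane], axis=-1)


def placeholder_extractor(images):
    """Hypothetical stand-in for VGG16: per-channel global average pooling.

    Input shape (n, H, W, 3), output shape (n, 3). A real VGG16 with
    include_top=False and average pooling would return 512 values per sample.
    """
    return images.mean(axis=(1, 2))


def fuse(deep_features, tabular):
    """Concatenate extracted features with the original tabular features."""
    return np.concatenate([deep_features, tabular], axis=1)


# Toy data: 5 samples with 13 features each, assumed already scaled to [0, 1].
rng = np.random.default_rng(0)
X = rng.random((5, 13)).astype(np.float32)

images = np.stack([tabular_to_image(r) for r in X])  # (5, 224, 224, 3)
deep = placeholder_extractor(images)                 # (5, 3)
fused = fuse(deep, X)                                # (5, 16)
```

The fused matrix is what a downstream classifier (for example a Random Forest) would be trained on. With a real VGG16 extractor the fused width would be 512 + 13 rather than 3 + 13.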


