Transfer learning-based hybrid VGG16-machine learning approach for heart disease detection with explainable artificial intelligence.

Frontiers in Artificial Intelligence (IF 4.7; JCR Q2, Computer Science, Artificial Intelligence). Pub Date: 2025-02-25. eCollection Date: 2025-01-01. DOI: 10.3389/frai.2025.1504281
Eshetie Gizachew Addisu, Tahayu Gizachew Yirga, Hailu Gizachew Yirga, Alemu Demeke Yehuala
Volume 8, article 1504281. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11893864/pdf/

Abstract

Heart disease is a leading cause of mortality worldwide, making accurate early detection essential for effective treatment and management. This study introduces a novel hybrid machine-learning approach that combines transfer learning using the VGG16 convolutional neural network (CNN) with various machine-learning classifiers for heart disease detection. A conditional tabular generative adversarial network (CTGAN) was employed to generate synthetic data samples from actual datasets; these were evaluated using statistical metrics, correlation analysis, and domain expert assessments to ensure the quality of the synthetic datasets. The dataset comprises tabular data with 13 features, which are reshaped into an image-like format and resized to 224x224x3 to meet the input requirements of the VGG16 model. Feature extraction is performed using VGG16, and the extracted features are then fused with the original tabular data. This combined feature set is then used to train various machine learning models, including Support Vector Machines (SVM), Gradient Boosting, Random Forest, Logistic Regression, K-nearest neighbors (KNN), and Decision Trees. Among these models, the VGG16-Random Forest hybrid achieved notable results across all evaluation metrics, including 92% accuracy, 91.3% precision, 92.2% recall, 91.82% specificity, 92.2% sensitivity, and 91.75% F1-score. The hybrid models were also evaluated using unseen datasets to assess the generalizability of the proposed approaches, with the VGG16-Random Forest combination showing relatively promising results. Additionally, explainability is integrated into the model using SHAP values, providing insights into the contribution of each feature to the model's predictions. This hybrid VGG16-ML approach demonstrates the potential for highly accurate and interpretable heart disease detection, offering valuable support in clinical decision-making processes.
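
The abstract does not spell out how the 13 tabular features are laid out as an image or how the fusion is done, so the following is only a minimal sketch under stated assumptions: each row is zero-padded onto a 4x4 grid, upsampled by block repetition to 224x224, and copied into three channels; fusion is taken to mean column-wise concatenation. The names `tabular_to_image`, `fuse_features`, and `placeholder_extractor` are hypothetical, and the placeholder stands in for a pretrained VGG16 so that the example runs with NumPy alone.

```python
import numpy as np


def tabular_to_image(row, side=4, size=224):
    """Map one tabular row to a (size, size, 3) array.

    Assumed layout (not specified in the abstract): zero-pad the row to
    side*side values, place them on a side x side grid, upsample by block
    repetition, and copy the result into three channels.
    """
    row = np.asarray(row, dtype=np.float32)
    if row.size > side * side:
        raise ValueError("row does not fit on the grid")
    if size % side != 0:
        raise ValueError("size must be a multiple of side")
    grid = np.zeros(side * side, dtype=np.float32)
    grid[: row.size] = row
    grid = grid.reshape(side, side)
    scale = size // side
    # Nearest-neighbour upsampling: each cell becomes a scale x scale block.
    upsampled = np.kron(grid, np.ones((scale, scale), dtype=np.float32))
    return np.repeat(upsampled[:, :, None], 3, axis=2)


def placeholder_extractor(images):
    """Hypothetical stand-in for a pretrained CNN: per-channel image mean."""
    return images.mean(axis=(1, 2))


def fuse_features(tabular, deep_features):
    """Concatenate deep features with the original tabular columns."""
    tabular = np.asarray(tabular, dtype=np.float32)
    deep_features = np.asarray(deep_features, dtype=np.float32)
    return np.concatenate([deep_features, tabular], axis=1)


# Synthetic stand-in data: 5 records with 13 features each.
rng = np.random.default_rng(0)
X = rng.random((5, 13)).astype(np.float32)

images = np.stack([tabular_to_image(r) for r in X])
deep = placeholder_extractor(images)
fused = fuse_features(X, deep)
print(images.shape, deep.shape, fused.shape)
```

In an actual pipeline the placeholder would be swapped for a pretrained VGG16 feature extractor, and the fused matrix would be passed to a classifier such as Random Forest. Neither step is shown here.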

Source journal metrics (Frontiers in Artificial Intelligence): CiteScore 6.10; self-citation rate 2.50%; articles published 272; review time 13 weeks.