Machine learning models using symptoms and clinical variables to predict coronary artery disease on coronary angiography
Yangjie Yu, Weikai Li, Jiajia Wu, Xuyun Hua, Bo Jin, Haiming Shi, Qiying Chen, Junjie Pan
Postępy w Kardiologii Interwencyjnej (Advances in Interventional Cardiology), vol. 20, no. 1, pp. 30-36. Published 2024-03-01 (Epub 2024-03-15).
DOI: https://doi.org/10.5114/aic.2024.136416
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11008511/pdf/
Abstract
Introduction: Coronary angiography (CAG) is invasive and expensive, yet many patients with suspected coronary artery disease (CAD) who undergo CAG turn out to have no coronary lesions.
Aim: To develop machine learning algorithms using symptoms and clinical variables to predict CAD.
Material and methods: This was a cross-sectional study of patients undergoing CAG. Of 2602 patients with suspected CAD, 2082 were randomly assigned to the training set and the remaining 520 to the test set. LASSO regression was used for feature selection. We report the area under the receiver operating characteristic curve (AUC), confusion matrices at different thresholds, positive predictive value (PPV), and negative predictive value (NPV). Support vector machine performance for detecting severe CAD was evaluated with 10-fold cross-validation in the training set, while XGBoost performance for detecting severe CAD was evaluated in the test set.
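The partitioning and the AUC metric described in the methods can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the abstract does not say how the random split or the folds were generated, so the seed, the helper names `split_and_fold` and `auc`, and the interleaved fold assignment are all hypothetical. Feature selection (LASSO) and model fitting are deliberately left out.

```python
import random


def split_and_fold(n_patients=2602, n_train=2082, n_folds=10, seed=0):
    """Randomly split patient indices into train/test, then cut train into folds.

    The sizes come from the abstract; the seed and the fold layout are assumptions.
    """
    rng = random.Random(seed)
    indices = list(range(n_patients))
    rng.shuffle(indices)
    train, test = indices[:n_train], indices[n_train:]
    # Fold i takes every n_folds-th training index, so fold sizes differ by at most 1.
    folds = [train[i::n_folds] for i in range(n_folds)]
    return train, test, folds


def auc(labels, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


train, test, folds = split_and_fold()
print(len(train), len(test), [len(f) for f in folds])

# Toy labels and scores, not study data.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

In each cross-validation round one fold would serve as the validation set and the other nine as the fitting set; averaging the ten validation AUCs gives the kind of "average AUC" figure reported for the training set.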
Results: Logistic regression achieved an average AUC of 0.77 in the training set during 10-fold validation and an AUC of 0.75 in the test set. When the probability predicted by the model was less than 0.1, 11 of the 520 patients in the test set were screened out, and NPV reached 90.9%. When the predicted probability was less than 0.2, 110 patients in the test set were screened out, and NPV reached 83.6%. Meanwhile, when the threshold was set to 0.9, PPV reached 97.4%. When the threshold was set to 0.8, PPV reached 91.5%.
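The threshold-based NPV and PPV figures can be illustrated with a short Python sketch. The function names `npv_below` and `ppv_at_or_above` are hypothetical, and whether the study treated a probability exactly equal to the threshold as positive is an assumption here. The true-negative counts of 10 and 92 are not stated in the abstract; they are inferred as the only whole numbers consistent with the reported percentages for 11 and 110 screened-out patients.

```python
def npv_below(labels, probs, threshold):
    """NPV among patients whose predicted probability is below the threshold.

    Returns (npv, number_screened_out); npv is None if nobody is screened out.
    """
    screened = [y for y, p in zip(labels, probs) if p < threshold]
    if not screened:
        return None, 0
    true_negatives = sum(1 for y in screened if y == 0)
    return true_negatives / len(screened), len(screened)


def ppv_at_or_above(labels, probs, threshold):
    """PPV among patients whose predicted probability is at or above the threshold.

    Treating equality as positive is an assumption, not stated in the abstract.
    """
    flagged = [y for y, p in zip(labels, probs) if p >= threshold]
    if not flagged:
        return None, 0
    true_positives = sum(1 for y in flagged if y == 1)
    return true_positives / len(flagged), len(flagged)


# Arithmetic check of the reported NPVs (counts 10 and 92 are inferred).
npv_01 = round(100 * 10 / 11, 1)    # 11 patients below 0.1
npv_02 = round(100 * 92 / 110, 1)   # 110 patients below 0.2
print(npv_01, npv_02)

# Toy data, not study data: 1 = severe CAD, 0 = no severe CAD.
labels = [0, 0, 1, 0, 1, 1]
probs = [0.05, 0.15, 0.18, 0.50, 0.85, 0.95]
print(npv_below(labels, probs, 0.2))
print(ppv_at_or_above(labels, probs, 0.8))
```

Lowering the exclusion threshold screens out fewer patients but with a higher NPV, which matches the trade-off between the 0.1 and 0.2 cut-offs in the results.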
Conclusions: A machine learning algorithm using data from hospital information systems could assist in excluding and confirming severe CAD, and thus help patients avoid unnecessary CAG.
About the journal:
Postępy w Kardiologii Interwencyjnej/Advances in Interventional Cardiology is indexed in:
Index Copernicus, Ministry of Science and Higher Education Index (MNiSW).
Advances in Interventional Cardiology is a quarterly journal aimed at specialists, mainly cardiologists and cardiac surgeons.
Official journal of the Association on Cardiovascular Interventions of the Polish Cardiac Society.