Gang Luo, Zhixin Li, Zhixian Ji, Sibao Wang, Silin Pan
BMC Pediatrics. Published 2024-11-08. DOI: 10.1186/s12887-024-05193-0 (https://doi.org/10.1186/s12887-024-05193-0)
An explainable deep learning model to predict partial anomalous pulmonary venous connection for patients with atrial septal defect.
Background: Patients with partial anomalous pulmonary venous connection (PAPVC) are usually asymptomatic and present with intricate anatomical types, so PAPVC is often missed in patients diagnosed with atrial septal defect (ASD). The present study aimed to explore predictive variables of PAPVC in patients with ASD and to construct an explainable prediction model based on deep learning.
Methods: This retrospective study included 834 inpatients with ASD at Women and Children's Hospital, Qingdao University, from January 2018 to January 2023. Patients were separated into two groups based on the presence of PAPVC. Propensity score matching and the synthetic minority over-sampling technique (SMOTE) were used to balance the baseline data between groups. The differential variables between the two groups were determined by univariate logistic regression. The patients were randomly divided into a training set and a validation set in a ratio of 8:2. Support vector machine (SVM), random forest, decision tree, XGBoost, and LightGBM models were built from the differential variables, and their classification performance was compared. Split, gain, and SHAP were used to measure the importance of the differential variables and improve the interpretability of the model. Moreover, a portion of the patients was included in the validation set to test the performance of the selected models.
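The SMOTE step named in the Methods can be illustrated with a minimal sketch. This is not the authors' implementation, which the abstract does not describe; it shows only the core idea of creating synthetic minority-class samples by interpolating between a sample and one of its nearest minority-class neighbours. The function name `smote`, its parameters, and the toy two-feature values are assumptions made for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by SMOTE-style interpolation.

    `minority` is a list of equal-length numeric tuples (minority class only).
    Hypothetical helper for illustration, not the study's code.
    """
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    if k < 1:
        raise ValueError("need at least two minority samples")
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (excluding itself)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Toy, made-up feature vectors (not data from the study).
toy_minority = [(1.0, 10.0), (2.0, 12.0), (3.0, 11.0), (1.5, 13.0)]
toy_synthetic = smote(toy_minority, n_new=6, k=2, seed=1)
```

Each synthetic point lies on a line segment between two real minority samples, so it stays inside the range the minority class already covers. In practice a maintained library implementation would normally be preferred over a hand-written one.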
Results: Three hundred twenty-eight patients with ASD and 82 patients with PAPVC were included in the training set and the validation set, respectively. Ten differential variables were selected by univariate logistic regression: right atrial diameter (longitudinal axis and transverse axis), right ventricular diameter, left atrial diameter, left ventricular end-diastolic diameter, left ventricular end-systolic diameter, P-wave voltage, P-wave interval, PR interval, and QRS-wave voltage. Among the classification models built on the differential variables, the LightGBM model achieved the highest performance on the validation set (AUC = 0.93). Based on variable importance analysis, the LightGBM-Clinic model was retrained on P-wave voltage, P-wave interval, PR interval, QRS wave interval, and right ventricular diameter, and performed excellently (AUC = 0.90). The AUC of the LightGBM-Clinic model was 0.87 in the test set.
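For readers unfamiliar with the AUC values reported above, the following minimal sketch shows one standard way to compute the area under the ROC curve: as the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions for illustration and are unrelated to the study data.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive-negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Under this reading, an AUC of 0.93 means that in about 93% of such pairs the model scores the PAPVC case higher than the non-PAPVC case.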
Conclusion: In this study, the LightGBM model performed excellently in determining whether ASD is accompanied by PAPVC. ECG parameters such as P-wave voltage contributed substantially to the predictive value and enhanced the explainability of the model.
About the journal:
BMC Pediatrics is an open access journal publishing peer-reviewed research articles in all aspects of health care in neonates, children and adolescents, as well as related molecular genetics, pathophysiology, and epidemiology.