Construction and SHAP interpretability analysis of a risk prediction model for feeding intolerance in preterm newborns based on machine learning
Authors: Hui Xu, Xingwang Peng, Ziyu Peng, Rui Wang, Rui Zhou, Lianguo Fu
Journal: BMC Medical Informatics and Decision Making, 24(1), 342. Published 2024-11-18.
DOI: https://doi.org/10.1186/s12911-024-02751-5
Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11572196/pdf/
Abstract
Objective: To construct a highly accurate and interpretable feeding intolerance (FI) risk prediction model for preterm newborns based on machine learning (ML) to assist medical staff in clinical diagnosis.
Methods: In this study, a sample of 350 hospitalized preterm newborns was retrospectively analysed. First, dual feature selection was conducted to identify important feature variables for model construction. Second, ML models were constructed based on the logistic regression (LR), decision tree (DT), support vector machine (SVM) and eXtreme Gradient Boosting (XGBoost) algorithms, after which random sampling and tenfold cross-validation were separately used to evaluate and compare these models and identify the optimal model. Finally, the SHapley Additive exPlanation (SHAP) interpretable framework was applied to analyse the decision-making principles of the optimal model and to identify the important factors affecting FI in preterm newborns and their modes of action.
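The tenfold cross-validation step described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the nearest-centroid scorer `fit_centroid` is a hypothetical stand-in for LR, DT, SVM or XGBoost, and the helper names (`auc`, `stratified_folds`, `cross_validate`) are invented for this example. It uses only the Python standard library.

```python
import random
from statistics import mean

def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    pos = [s for t, s in zip(labels, scores) if t == 1]
    neg = [s for t, s in zip(labels, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with roughly equal class balance."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, t in enumerate(labels) if t == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds

def fit_centroid(X, y):
    """Toy stand-in model: score by relative distance to the class centroids."""
    def centroid(cls):
        rows = [x for x, t in zip(X, y) if t == cls]
        return [mean(col) for col in zip(*rows)]
    c0, c1 = centroid(0), centroid(1)
    def predict(x):
        d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
        d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
        return d0 / (d0 + d1) if d0 + d1 else 0.5  # near 1 when close to class 1
    return predict

def cross_validate(X, y, fit, k=10):
    """Return mean accuracy and mean AUC over k held-out folds."""
    accs, aucs = [], []
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        truth = [y[i] for i in test_idx]
        scores = [predict(X[i]) for i in test_idx]
        accs.append(mean(int((s >= 0.5) == bool(t)) for s, t in zip(scores, truth)))
        aucs.append(auc(truth, scores))
    return mean(accs), mean(aucs)

# Synthetic data only: 350 samples, 5 features, class 1 shifted by one SD.
rng = random.Random(42)
y = [int(rng.random() < 0.4) for _ in range(350)]
X = [[rng.gauss(1.0 * t, 1.0) for _ in range(5)] for t in y]

acc, auc_score = cross_validate(X, y, fit_centroid, k=10)
print(f"accuracy={acc:.3f}  AUC={auc_score:.3f}")
```

Each model under comparison would be passed as `fit` in turn, so that all candidates are scored on the same folds.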
Results: The accuracy of XGBoost was 87.62%, and the area under the curve (AUC) was 92.2%. Under tenfold cross-validation, the accuracy was 83.43% and the AUC was 89.45%, both significantly better than those of the other models. Analysis of the XGBoost model with the SHAP interpretable framework showed that a history of resuscitation, use of probiotics, milk opening time, interval between two stools and gestational age were the main factors affecting the occurrence of FI in preterm newborns, with importance scores of 0.632, 0.407, 0.313, 0.313, and 0.258, respectively. A history of resuscitation, first milk opening time ≥ 24 h and interval between stools ≥ 3 days were risk factors for FI, while the use of probiotics and gestational age ≥ 34 weeks were protective factors against FI in preterm newborns.
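To illustrate how per-feature importance scores of this kind can arise, the sketch below computes exact Shapley values for a toy model and averages their absolute values over all samples. The abstract does not state how its importance scores were computed; mean absolute SHAP value is assumed here because it is the common convention for global SHAP importance. The model `risk_score`, its weights, and the four data rows are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, background):
    """Exact Shapley values for one sample x.

    Features outside a coalition are replaced by the values in each
    background row, and the model output is averaged over those rows.
    """
    n = len(x)

    def value(coalition):
        total = 0.0
        for row in background:
            mixed = [x[i] if i in coalition else row[i] for i in range(n)]
            total += model(mixed)
        return total / len(background)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

def risk_score(v):
    """Hypothetical linear risk score with made-up weights.

    v = [resuscitation history, probiotics used, first feed >= 24 h]
    """
    return 0.3 * v[0] - 0.2 * v[1] + 0.1 * v[2]

# Hypothetical binary feature rows, also used as the background set.
rows = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]

all_phi = [shapley_values(risk_score, row, rows) for row in rows]
importance = [sum(abs(p[i]) for p in all_phi) / len(all_phi) for i in range(3)]
print([round(v, 3) for v in importance])
```

The sign of an individual Shapley value indicates direction (positive pushes the prediction towards FI, negative away from it), which is how risk factors and protective factors are distinguished, while the mean absolute value ranks features by overall influence.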
Conclusions: In practice, perinatal and obstetric care should be improved to reduce the occurrence of hypoxia and preterm delivery. During feeding, early milk opening, the use of probiotics, the stimulation of defecation and other measures should be implemented to reduce the occurrence of FI.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.