Construction and SHAP interpretability analysis of a risk prediction model for feeding intolerance in preterm newborns based on machine learning
Authors: Hui Xu, Xingwang Peng, Ziyu Peng, Rui Wang, Rui Zhou, Lianguo Fu
Journal: BMC Medical Informatics and Decision Making, 24(1), 342 (impact factor 3.3; JCR Q2, Medical Informatics)
DOI: 10.1186/s12911-024-02751-5 (https://doi.org/10.1186/s12911-024-02751-5)
Publication date: 2024-11-18
Publication type: Journal Article
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11572196/pdf/
Citation count: 0
Abstract
Objective: To construct a highly accurate and interpretable feeding intolerance (FI) risk prediction model for preterm newborns based on machine learning (ML) to assist medical staff in clinical diagnosis.
Methods: A sample of 350 hospitalized preterm newborns was retrospectively analysed. First, dual feature selection was conducted to identify the important feature variables for model construction. Second, ML models were constructed using the logistic regression (LR), decision tree (DT), support vector machine (SVM) and eXtreme Gradient Boosting (XGBoost) algorithms; random sampling and tenfold cross-validation were then used separately to evaluate and compare these models and to identify the optimal one. Finally, the SHapley Additive exPlanations (SHAP) framework was applied to analyse how the optimal model reaches its decisions and to describe the important factors affecting FI in preterm newborns and their modes of action.
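The tenfold cross-validation step described in the Methods can be illustrated with a small, dependency-free sketch. This is not the authors' code: the data are synthetic, the five binary feature columns and their coefficients are assumptions chosen only for illustration, and a hand-written logistic regression stands in for the LR, DT, SVM and XGBoost models the study compared. It shows how accuracy and AUC are averaged over ten stratified folds.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def make_synthetic(n=350, seed=42):
    """Synthetic stand-in data: five binary features, assumed coefficients."""
    rng = random.Random(seed)
    coef = [1.5, -1.0, 0.8, 0.8, -0.6]  # hypothetical, not taken from the study
    X, y = [], []
    for _ in range(n):
        x = [rng.randint(0, 1) for _ in coef]
        p = sigmoid(-0.5 + sum(c * v for c, v in zip(coef, x)))
        X.append(x)
        y.append(1 if rng.random() < p else 0)
    return X, y

def fit_logistic(X, y, lr=0.5, epochs=100):
    """Plain batch gradient descent on the logistic loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            gb += err
            for j in range(d):
                gw[j] += err * xi[j]
        b -= lr * gb / n
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
    return w, b

def auc_score(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5.
    Requires at least one positive and one negative label."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def stratified_kfold(y, k, seed=0):
    """Yield (train_idx, test_idx), dealing each class round-robin into k folds."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for pos_in_class, i in enumerate(idx):
            folds[pos_in_class % k].append(i)
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, folds[f]

def cross_validate(X, y, k=10):
    """Return mean accuracy and mean AUC over k held-out folds."""
    accs, aucs = [], []
    for train, test in stratified_kfold(y, k):
        w, b = fit_logistic([X[i] for i in train], [y[i] for i in train])
        scores = [sigmoid(b + sum(wj * xj for wj, xj in zip(w, X[i]))) for i in test]
        y_test = [y[i] for i in test]
        accs.append(sum((s >= 0.5) == (t == 1) for s, t in zip(scores, y_test)) / len(test))
        aucs.append(auc_score(y_test, scores))
    return sum(accs) / k, sum(aucs) / k

X, y = make_synthetic()
mean_acc, mean_auc = cross_validate(X, y)
print(f"tenfold CV accuracy = {mean_acc:.3f}, AUC = {mean_auc:.3f}")
```

Because every sample is held out exactly once, the cross-validated figures are usually a more conservative estimate than a single random split, which is consistent with the lower tenfold values reported in the Results.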
Results: The accuracy of XGBoost was 87.62%, and the area under the curve (AUC) was 92.2%. Under tenfold cross-validation, the accuracy was 83.43% and the AUC was 89.45%, values that were significantly better than those of the other models. Analysis of the XGBoost model with the SHAP framework showed that a history of resuscitation, use of probiotics, milk opening time (the time of the first milk feed), the interval between two stools and gestational age were the main factors affecting the occurrence of FI in preterm newborns, with importance scores of 0.632, 0.407, 0.313, 0.313 and 0.258, respectively. A history of resuscitation, a first milk opening time ≥ 24 h and an interval between stools ≥ 3 days were risk factors for FI, whereas the use of probiotics and a gestational age ≥ 34 weeks were protective factors against FI in preterm newborns.
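To make the idea of a SHAP importance score concrete, the sketch below computes exact Shapley values by enumerating every feature coalition and then averages their absolute values per feature, which is one common way to summarise global importance. Everything here is an assumption for illustration: the `risk` function is a hypothetical logistic scoring rule rather than the study's XGBoost model, the coefficients are invented (only their signs follow the risk and protective directions stated above), the background set is simply all 32 binary profiles, and the `shap` library itself is not used.

```python
from itertools import combinations, product
from math import exp, factorial

FEATURES = ["resuscitation", "probiotics", "late_first_feed",
            "stool_interval_ge_3d", "ga_ge_34w"]
COEF = [1.5, -1.0, 0.8, 0.8, -0.6]  # hypothetical coefficients, not from the study
INTERCEPT = -0.5                    # hypothetical

def risk(x):
    """Toy stand-in model: predicted probability of FI for a binary profile x."""
    z = INTERCEPT + sum(c * v for c, v in zip(COEF, x))
    return 1.0 / (1.0 + exp(-z))

def coalition_value(model, x, subset, background):
    """Mean model output when features in `subset` are fixed to x's values
    and all other features are taken from each background row in turn."""
    total = 0.0
    for b in background:
        mixed = [x[j] if j in subset else b[j] for j in range(len(x))]
        total += model(mixed)
    return total / len(background)

def shapley_values(model, x, background):
    """Exact Shapley values: weighted marginal contribution of each feature
    over all coalitions of the remaining features."""
    d = len(x)
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for S in combinations(others, size):
                with_i = coalition_value(model, x, set(S) | {i}, background)
                without_i = coalition_value(model, x, set(S), background)
                phi[i] += weight * (with_i - without_i)
    return phi

# Background data: every possible binary profile (an assumption for the demo).
background = [list(p) for p in product([0, 1], repeat=len(FEATURES))]

# Global importance: mean absolute Shapley value of each feature.
all_phi = [shapley_values(risk, x, background) for x in background]
importance = [sum(abs(phi[i]) for phi in all_phi) / len(all_phi)
              for i in range(len(FEATURES))]

for name, score in sorted(zip(FEATURES, importance), key=lambda t: -t[1]):
    print(f"{name:22s} {score:.3f}")
```

The sign of an individual Shapley value gives the direction of effect for that newborn (positive raises the predicted risk, negative lowers it), while the mean absolute value ranks features overall. Exact enumeration is only practical for a handful of features; tree-specific algorithms are normally used for models such as XGBoost.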
Conclusions: In practice, perinatal and obstetric care should be improved with the aim of reducing the occurrence of hypoxia and preterm delivery. During feeding, measures such as early milk opening, the use of probiotics and the stimulation of defecation should be implemented to reduce the occurrence of FI.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.