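Construction and SHAP interpretability analysis of a risk prediction model for feeding intolerance in preterm newborns based on machine learning.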

IF 3.3 | Zone 3 (Medicine) | JCR Q2, Medical Informatics | BMC Medical Informatics and Decision Making | Pub Date: 2024-11-18 | DOI: 10.1186/s12911-024-02751-5
Hui Xu, Xingwang Peng, Ziyu Peng, Rui Wang, Rui Zhou, Lianguo Fu
BMC Medical Informatics and Decision Making, vol. 24(1), 342. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11572196/pdf/

Abstract

Construction and SHAP interpretability analysis of a risk prediction model for feeding intolerance in preterm newborns based on machine learning.

Objective: To construct a highly accurate and interpretable feeding intolerance (FI) risk prediction model for preterm newborns based on machine learning (ML) to assist medical staff in clinical diagnosis.

Methods: In this study, a sample of 350 hospitalized preterm newborns was retrospectively analysed. First, dual feature selection was conducted to identify important feature variables for model construction. Second, ML models were constructed based on the logistic regression (LR), decision tree (DT), support vector machine (SVM) and eXtreme Gradient Boosting (XGBoost) algorithms, after which random sampling and tenfold cross-validation were separately used to evaluate and compare these models and identify the optimal model. Finally, we applied the SHapley Additive exPlanations (SHAP) interpretable framework to analyse the decision-making principles of the optimal model and expound upon the important factors affecting FI in preterm newborns and their modes of action.
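The tenfold cross-validation step described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' pipeline: the XGBoost model is replaced by a hypothetical one-feature rate table, the 350 rows are synthetic, and every function name (`kfold_indices`, `cross_validate`, `fit_rate_table`, `score_rate_table`) is an assumption made for the example. In practice, libraries such as scikit-learn and XGBoost would normally handle the splitting and fitting.

```python
import random


def kfold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def auc(scores, labels):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(X, y, fit, score, k=10, seed=0):
    """Fit on k-1 folds, score the held-out fold, then pool the out-of-fold scores."""
    out_of_fold = [None] * len(X)
    for held_out in kfold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            out_of_fold[i] = score(model, X[i])
    predictions = [int(s >= 0.5) for s in out_of_fold]
    accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
    return accuracy, auc(out_of_fold, y)


# Hypothetical stand-in model: the observed FI rate for each value of one binary feature.
def fit_rate_table(X_train, y_train):
    table = {}
    for value in (0, 1):
        group = [label for x, label in zip(X_train, y_train) if x == value]
        table[value] = sum(group) / len(group)
    return table


def score_rate_table(model, x):
    return model[x]


# Synthetic data (not the study's): 350 rows, label equals the feature except on every fifth row.
X = [i % 2 for i in range(350)]
y = [x if i % 5 else 1 - x for i, x in enumerate(X)]

fold_sizes = [len(fold) for fold in kfold_indices(len(X))]
accuracy, auc_value = cross_validate(X, y, fit_rate_table, score_rate_table)
print(fold_sizes, accuracy, round(auc_value, 3))
```

With 350 samples, each of the ten folds holds 35 newborns, so every model is trained on 315 rows and scored on rows it has never seen. The synthetic labels disagree with the feature on 70 of 350 rows, so the pooled accuracy here is 0.8 by construction; the abstract does not state whether the study pooled predictions or averaged per-fold metrics, so the pooling above is only one reasonable choice.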

Results: The accuracy of XGBoost was 87.62%, and the area under the curve (AUC) was 92.2%. After the application of tenfold cross-validation, the accuracy was 83.43% and the AUC was 89.45%, both significantly better than those of the other models. Analysis of the XGBoost model with the SHAP interpretable framework showed that a history of resuscitation, use of probiotics, milk opening time, interval between two stools and gestational age were the main factors affecting the occurrence of FI in preterm newborns, with importance scores of 0.632, 0.407, 0.313, 0.313, and 0.258, respectively. A history of resuscitation, first milk opening time ≥ 24 h and interval between stools ≥ 3 days were risk factors for FI, while the use of probiotics and gestational age ≥ 34 weeks were protective factors against FI in preterm newborns.
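To make the idea of a SHAP importance score concrete, the sketch below computes exact Shapley values for a toy risk score and then averages their absolute values per feature, which is the usual way a global SHAP ranking is formed. It rests on several assumptions: the abstract does not say how its importance scores were aggregated, the `risk_score` function and its coefficients are invented for illustration (they are not the fitted XGBoost model), and absent features are filled from a single all-zero baseline rather than from a background dataset as the shap library typically does.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x; features outside a coalition take baseline values."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                without_i = [x[j] if j in present else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def mean_abs_importance(predict, rows, baseline):
    """Global importance: mean absolute Shapley value of each feature over all rows."""
    totals = [0.0] * len(baseline)
    for row in rows:
        for i, value in enumerate(shapley_values(predict, row, baseline)):
            totals[i] += abs(value)
    return [total / len(rows) for total in totals]


# Hypothetical score on an arbitrary scale. Feature order (all binary, all assumed):
# [history of resuscitation, probiotics used, first milk opening time >= 24 h].
def risk_score(features):
    resuscitation, probiotics, late_feeding = features
    return 0.1 + 0.4 * resuscitation - 0.2 * probiotics + 0.2 * resuscitation * late_feeding


baseline = [0, 0, 0]
phi = shapley_values(risk_score, [1, 1, 1], baseline)

rows = [[1, 1, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]]
importance = mean_abs_importance(risk_score, rows, baseline)
print([round(v, 3) for v in phi], [round(v, 3) for v in importance])
```

For the newborn `[1, 1, 1]` the contributions sum to the gap between that newborn's score and the baseline score, and the sign of each value shows the direction of the effect: positive for the risk-raising features, negative for the protective one. The interaction term is split equally between the two features involved, which is how Shapley values attribute joint effects.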

Conclusions: In practice, we should improve perinatal and obstetric care with the aim of reducing the occurrence of hypoxia and preterm delivery. During feeding, early milk opening, the use of probiotics, stimulation of defecation and other measures should be implemented to reduce the occurrence of FI.

Source journal: BMC Medical Informatics and Decision Making
CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 297
Review time: 1 month
About the journal: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.