Alessandro Simonini, Jeevitha Murugan, Alessandro Vittori, Roberta Pallotto, Elena Giovanna Bignami, Maria Grazia Calevo, Ornella Piazza, Marco Cascella
Annali italiani di chirurgia, vol. 95, no. 5, pp. 944-955, 2024. DOI: 10.62713/aic.3485 (https://doi.org/10.62713/aic.3485)
Citations: 0
Abstract
Data-driven Machine Learning Models for Risk Stratification and Prediction of Emergence Delirium in Pediatric Patients Underwent Tonsillectomy/Adenotonsillectomy.
Aim: In the pediatric surgical population, Emergence Delirium (ED) poses a significant challenge. This study aims to develop and validate machine learning (ML) models to identify key features associated with ED and predict its occurrence in children undergoing tonsillectomy or adenotonsillectomy.
Methods: The analysis comprised data cleaning, exploratory data analysis (EDA), supervised predictive modeling, and unsupervised learning on a medical dataset (n = 423). After preliminary data cleaning, EDA used histograms, boxplots, pairplots, and correlation heatmaps to examine variable distributions and relationships. Four predictive models were trained: logistic regression (LR), random forest (RF), support vector machine (SVM), and gradient boosting (XGBoost). The models were evaluated and compared using the area under the receiver operating characteristic curve (AUC-ROC), precision, recall, and feature importance. The RF model performed best and was used for testing (AUC-ROC 0.96, precision 1.00, and recall 0.92 on the validation set). K-means clustering was applied to find groups within the data, with the elbow method and silhouette scores used to determine the optimal number of clusters. The resulting clusters were characterized by aggregating the features within each cluster.
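The evaluation metrics named in the Methods can be illustrated with a minimal sketch. This is not the authors' code: the labels and predicted probabilities below are invented toy values, and the function names `roc_auc` and `precision_recall` are hypothetical helpers written for this illustration. AUC is computed as the probability that a randomly chosen ED case receives a higher score than a randomly chosen non-case.

```python
from typing import Sequence, Tuple


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def precision_recall(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Precision and recall after thresholding the predicted scores."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (assumption): 1 = ED observed, 0 = no ED; scores are model outputs.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.45, 0.7]

auc = roc_auc(y_true, scores)
precision, recall = precision_recall(y_true, scores)
print(f"AUC={auc:.4f} precision={precision:.2f} recall={recall:.2f}")
```

On this toy data one ED case falls below the 0.5 threshold, so recall drops below 1 while precision stays at 1, the same qualitative pattern as a model with perfect precision and imperfect recall.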
Results: EDA revealed significant positive correlations of age, weight, American Society of Anesthesiologists (ASA) health score, and surgery duration with the risk of developing ED. Among the ML models, RF achieved the highest performance. Key predictive variables, based on the model's feature importance, included delirium screening scales, extubation time, and time to regain consciousness. Unsupervised K-means clustering identified 2-3 optimal clusters, which represented distinct patient subgroups: younger, healthier, low-risk individuals (cluster 0), and older patients with increasing chronic disease burden, higher delirium screening scores, and consequently higher post-operative delirium risk (clusters 1 and 2).
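The clustering workflow (K-means, choosing the number of clusters by silhouette score, then aggregating features per cluster) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: the two features (age in years, a screening score) and all data points are invented, the farthest-point initialisation is a simplification chosen to keep the example deterministic, and `kmeans` and `silhouette` are hypothetical helpers.

```python
import math


def kmeans(points, k, iters=50):
    """Plain Lloyd's algorithm with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters contribute 0."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != l)
        total += (b - a) / max(a, b)
    return total / len(points)


# Toy patients (assumption): (age in years, screening score), three loose groups.
points = [(2, 1), (3, 1), (2, 2), (3, 2),
          (8, 5), (9, 5), (8, 6), (9, 6),
          (14, 10), (15, 10), (14, 11), (15, 11)]

sil = {k: silhouette(points, kmeans(points, k)) for k in range(2, 5)}
best_k = max(sil, key=sil.get)

# Characterise each cluster by its mean feature vector.
labels = kmeans(points, best_k)
profiles = {}
for j in sorted(set(labels)):
    members = [p for p, l in zip(points, labels) if l == j]
    profiles[j] = tuple(sum(col) / len(members) for col in zip(*members))

print(best_k, profiles)
```

In practice the features would be standardised before clustering so that no single variable dominates the distance; the toy values here are already on comparable scales.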
Conclusions: ML techniques are valuable tools for extracting insights and making accurate predictions from healthcare data. High-performing algorithm-based models can be integrated into clinical decision support systems, facilitating early identification of and intervention for ED in pediatric patients. Examining a range of perioperative variables makes it possible to assess risk and implement preventive measures effectively. Furthermore, unsupervised clustering reveals distinct patient subgroups, enabling personalized perioperative management strategies and enhancing overall patient care.
About the journal:
Annali Italiani di Chirurgia is a bimonthly journal covering all aspects of surgery: elective, emergency, and experimental surgery, as well as problems involving technology, teaching, organization, and forensic medicine. Articles are published in Italian or English, though English is preferred because it facilitates the international diffusion of the journal (see Guidelines for Authors and Norme per gli Autori). Published articles are divided into three main sections: editorials, original articles, and case reports and innovations.