Performance of machine learning models in predicting difficult laryngoscopy in the emergency department: a single-centre retrospective study comparing with conventional regression method.
Authors: Winchana Srivilaithon, Pichamon Thanasarnpaiboon
Journal: BMC Emergency Medicine 25(1):28, published 2025-02-21
DOI: https://doi.org/10.1186/s12873-025-01185-0
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11846364/pdf/
Citation count: 0
Abstract
Background: Emergency endotracheal intubation is a critical skill for managing airway emergencies in the emergency department (ED). Accurate prediction of difficult laryngoscopy is essential for improving first-attempt success, minimizing complications, optimizing resource utilization, and enhancing patient outcomes. Traditional methods, such as the LEMON criteria, have limited predictive accuracy. Machine learning (ML) offers advanced predictive capabilities by analyzing large datasets and identifying complex variable interactions. This study aimed to develop and validate ML models for predicting difficult laryngoscopy in the ED and to compare their performance with that of a conventional regression model.
Methods: A retrospective cohort study was conducted on 4,370 adult patients who underwent intubation in the ED at Thammasat University Hospital. Difficult laryngoscopy was defined as a Cormack-Lehane grade III or IV. Patients were divided into development (training, 70%) and validation (testing, 30%) cohorts. Predictors of difficult laryngoscopy were identified using multivariable logistic regression with stepwise backward elimination and were then used to develop the ML models: Logistic Regression, Decision Tree, Random Forest, and XGBoost. Model performance was evaluated using the area under the receiver operating characteristic curve (AuROC), accuracy, precision, recall, and F1-score. The models were then evaluated on the validation cohort to confirm their performance.
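The evaluation metrics named in the Methods can be illustrated with a minimal sketch in plain Python. This is not the authors' code: the function names, the random seed, the 0.5 decision threshold used in the usage note, and the toy labels and scores are all assumptions made for illustration. AuROC is computed here through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half).

```python
import random


def split_indices(n, train_frac=0.7, seed=0):
    """Shuffle indices 0..n-1 and split them into development and validation sets.

    The 70/30 ratio follows the abstract; the seed is an arbitrary assumption.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = int(round(n * train_frac))
    return idx[:cut], idx[cut:]


def auroc(y_true, scores):
    """AuROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AuROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from binary labels and binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy data (hypothetical): 1 = difficult laryngoscopy (Cormack-Lehane III or IV).
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

dev_idx, val_idx = split_indices(4370)  # cohort size taken from the abstract
metrics = classification_metrics(y_true, y_pred)
auc = auroc(y_true, scores)
```

With a cohort of 4,370 patients, a 70/30 split of this kind yields 3,059 development and 1,311 validation cases. The abstract does not state whether the split was random or stratified, so the simple shuffle above is only one plausible reading.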
Results: Nine significant predictors were identified: male sex, trauma, absence of neuromuscular blocking agents, large incisors, large tongue, limited mouth opening, short thyrohyoid distance, obstructed airway, and poor neck mobility. The Random Forest model demonstrated the highest predictive performance, with an AuROC of 0.82 (95% CI: 0.78-0.85), accuracy of 0.89, recall of 0.89, and F1-score of 0.87, outperforming conventional regression (AuROC 0.76, 95% CI: 0.73-0.78) and the other ML models. DeLong's test confirmed that the difference in AuROC between the Random Forest model and conventional regression was statistically significant (p = 0.002). The Decision Tree showed limited performance due to overfitting, while XGBoost demonstrated strong precision; neither model differed significantly from conventional regression (p = 0.498 and 0.496, respectively).
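The study compared AuROCs with DeLong's test. As a rough illustration of what comparing two correlated AuROCs on the same validation cohort involves, the sketch below uses a paired percentile bootstrap instead, which is a different technique and is named here as a substitution. The function names, the number of resamples, the seed, and the toy data are all assumptions; nothing here reproduces the paper's figures.

```python
import random


def auroc(y_true, scores):
    """AuROC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AuROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_diff(y_true, scores_a, scores_b, n_boot=1000, alpha=0.05, seed=0):
    """Paired bootstrap for AuROC(model A) - AuROC(model B) on the same patients.

    Returns (observed difference, lower bound, upper bound) of a percentile
    interval. Not DeLong's test: patients are resampled with replacement and
    both models are scored on each resample.
    """
    rng = random.Random(seed)
    n = len(y_true)
    observed = auroc(y_true, scores_a) - auroc(y_true, scores_b)
    diffs = []
    while len(diffs) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]
        y = [y_true[i] for i in sample]
        if not 0 < sum(y) < n:
            continue  # skip resamples that contain only one class
        a = [scores_a[i] for i in sample]
        b = [scores_b[i] for i in sample]
        diffs.append(auroc(y, a) - auroc(y, b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_boot)]
    upper = diffs[int((1 - alpha / 2) * n_boot) - 1]
    return observed, lower, upper


# Toy data (hypothetical): model A separates the classes perfectly,
# model B assigns every patient the same score and so carries no information.
y_true = [0, 1] * 10
scores_a = [float(v) for v in y_true]
scores_b = [0.5] * len(y_true)

observed, lower, upper = bootstrap_auroc_diff(y_true, scores_a, scores_b, n_boot=200)
```

If the resulting interval excludes zero, the two models' AuROCs differ at roughly the chosen alpha level. DeLong's test reaches a comparable conclusion analytically and is usually faster on cohorts of several thousand patients.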
Conclusion: The Random Forest model provides the most robust prediction of difficult laryngoscopy, outperforming both conventional regression and the other ML models. While ML models improve predictive accuracy, logistic regression remains a practical option in resource-limited settings. Integrating ML into clinical workflows could enhance decision-making, resource allocation, and patient safety in emergency airway management. Future research should prioritize external validation and real-world implementation.
About the journal:
BMC Emergency Medicine is an open access, peer-reviewed journal that considers articles on all urgent and emergency aspects of medicine, in both practice and basic research. In addition, the journal covers aspects of disaster medicine and medicine in special locations, such as conflict areas and military medicine, together with articles concerning healthcare services in the emergency departments.