Introduction: Accurate estimation of surgical case duration is essential for operating room (OR) efficiency. We aimed to evaluate how well machine learning (ML) models predict surgical duration compared with conventional estimation, and to explore the factors affecting ML performance and practical implementation.
Methods: Following PRISMA guidelines, we searched MEDLINE, Embase, and CINAHL for articles published between January 2019 and October 2024. Studies were eligible if they evaluated an ML-based model, reported performance data, and compared the model with traditional estimation methods. Risk of bias was assessed using the Prediction model Risk Of Bias Assessment Tool (PROBAST).
Results: Eleven studies met the inclusion criteria. Models trained on specific surgical populations generally outperformed broader models. Several studies had methodological issues, such as incomplete handling of missing data and limited validation. ML models typically improved accuracy over traditional estimates. The average improvement was 25.7 %, with the best models reducing error rates by 51 %. We found no correlation (r = −0.01) between the number of predictor variables and the percentage improvement in prediction accuracy.
Discussion: ML-based surgical duration prediction shows promise for improving OR scheduling efficiency. However, challenges remain, including the need for standardized reporting, robust external validation, and practical integration into existing workflows. The risk of bias and inconsistent reporting of validation methods reduce confidence in the generalizability of ML performance. Heterogeneity in study and model designs complicates direct comparisons. Adopting standardized protocols for developing and testing ML models for surgical duration prediction would better demonstrate the benefits of these models.
