A growing number of artificial intelligence (AI) and machine learning (ML) models are being developed to predict radiation-induced toxicities (RITs) in patients with head and neck cancer (HNC), but their performance and reliability remain uncertain. This systematic review and meta-analysis evaluated the predictive accuracy and methodological quality of these models. We comprehensively searched PubMed, EMBASE, Web of Science, and the Cochrane Library to identify studies reporting ML/AI models for predicting RITs in HNC patients. Eligible studies were assessed for risk of bias using the PROBAST tool, and key performance metrics, including the area under the receiver operating characteristic curve (AUROC), were extracted. A hierarchical multilevel meta-analysis was performed to estimate pooled AUROC values, and subgroup analyses explored the influence of study characteristics on model performance. A total of 67 studies comprising 568 models were included. The ML/AI models showed moderate discriminatory power, with a pooled AUROC of 0.76 (95% CI: 0.73–0.78), although substantial heterogeneity was observed across studies. Incorporating imaging biomarkers significantly improved model performance. Prospective and internal validation showed comparable performance; external validation remains necessary to establish true generalisability. The predominance of retrospective designs and variability in predictor selection may have introduced bias, affecting model reliability and generalisability. ML/AI models hold promise for predicting RITs in HNC patients, but methodological constraints limit their applicability. Standardised and transparent reporting of model development and validation processes is vital for improving comparability among studies. Future research should explore hybrid modelling methods and the integration of clinical, dosimetric, radiomic, and genomic data to improve predictive accuracy.
