This study develops and evaluates machine learning (ML) models for accurate, scalable early detection of cervical cancer, addressing critical limitations in current diagnostic practice. Using exploratory data analysis (EDA), rigorous data preprocessing, and multiple ML techniques (Random Forest, artificial neural network (ANN), support vector machine (SVM), XGBoost, and ensemble models), we systematically analyzed the tabular cervical cancer dataset from the UCI repository, which comprises demographic, clinical, and behavioral features. All models were developed and evaluated solely on this dataset. The Random Forest model achieved the highest performance, with an accuracy of 98.4 %, a sensitivity of 99.3 %, and a specificity of 97.6 %, substantially surpassing the other evaluated models. A previously published imaging-based ResNet-50 model (AUC = 0.97) is referenced for contextual comparison only and was not part of our experimental work. The study is limited by dataset homogeneity and by potential biases introduced by synthetic oversampling. Even so, the findings offer an interpretable and robust diagnostic tool that can improve cervical cancer detection, particularly in low-resource clinical environments where effective, scalable screening methods are urgently needed. Deployment in such resource-constrained environments will require further optimization and cost-efficiency analyses to confirm feasibility.
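For readers less familiar with the three reported metrics, the following minimal Python sketch shows how accuracy, sensitivity, and specificity are conventionally derived from binary predictions. It is an illustration only: the function name `screening_metrics` and the toy labels are hypothetical, and nothing here reproduces the study's data, code, or results.

```python
def screening_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, and specificity for binary labels.

    Labels are assumed to be 1 (positive case) or 0 (negative case).
    This is an illustrative helper, not the study's implementation.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
        # Share of true positive cases that the model detects (recall).
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of true negative cases that the model correctly clears.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy labels for demonstration only (hypothetical, not study data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

In a screening setting, sensitivity is usually the metric watched most closely, since a missed positive case (a false negative) tends to be costlier than a false alarm.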
