Background: Colorectal cancer (CRC) is one of the most prevalent malignancies worldwide and is highly heterogeneous. This study investigated whether machine learning algorithms that incorporate preoperative laboratory tests can predict the 5-year recurrence risk after surgery in patients with stage III colon cancer (CC).
Methods: This study included two patient cohorts: the Zhejiang Cancer Hospital CC cohort (ZCC set, n = 290), which served as the training cohort, and the Dongyang CC cohort (DYC set, n = 125), which served as an external testing cohort. Univariate analysis was first performed on 48 preoperative laboratory tests and 15 clinical and pathological features in the training cohort to identify potential predictors. Features with a p value less than 0.05 were retained, and six machine learning models (logistic regression, random forest, XGBoost, support vector machine (SVM), back propagation neural network (BP NET), and K-nearest neighbour (KNN)) were used to develop a model for predicting the 5-year recurrence risk in patients with stage III colon cancer. Predictive performance was assessed by the area under the receiver operating characteristic curve (AUC) of each model on the external test dataset, and AUCs were compared using the DeLong test. Finally, the Shapley additive explanations (SHAP) algorithm was applied to compute SHAP values for each feature and rank feature importance, and the results were visualized.
Results: Univariate analysis identified 10 laboratory tests and 6 clinical and pathological features, which were incorporated into the six machine learning models. The random forest model showed the highest predictive performance in the test cohort, with an AUC of 0.845, followed closely by logistic regression with an AUC of 0.823. The DeLong test indicated that the predictive performance of the random forest model was comparable to that of logistic regression and better than that of the other models. SHAP analysis indicated that the most important feature for predicting the 5-year recurrence risk of stage III colon cancer was perineural invasion, followed by fibrinogen (FIB) and then prothrombin time (PT).
Conclusions: A machine learning model built from preoperative laboratory tests and clinical and pathological features can assist in predicting the 5-year recurrence risk of patients with stage III colon cancer. Such a model may serve as a reference for developing individualized treatment strategies in clinical practice.
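Illustrative note: the AUC used in the Methods to assess predictive performance can be read as the probability that a randomly chosen patient with recurrence receives a higher predicted risk than a randomly chosen patient without recurrence. The following is a minimal sketch of that calculation, not the study's code; the function name `roc_auc` and all toy labels and scores are hypothetical and are not data from either cohort.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = recurrence within 5 years, 0 = no recurrence.
toy_labels = [1, 0, 1, 0, 0, 1]
# Hypothetical predicted risks from two models for the same six patients.
toy_model_a = [0.9, 0.2, 0.7, 0.4, 0.1, 0.3]
toy_model_b = [0.6, 0.5, 0.4, 0.7, 0.2, 0.3]

auc_a = roc_auc(toy_labels, toy_model_a)
auc_b = roc_auc(toy_labels, toy_model_b)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

In this toy example, model A ranks 8 of the 9 positive-negative pairs correctly and model B ranks 4 of 9. Whether a difference between two AUCs measured on the same patients is statistically significant is a separate question, which the study addresses with the DeLong test; that test is not implemented in this sketch.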
