Objective
Breast cancer prognosis depends on early detection. We developed and externally validated a model using routine, readily available clinical and laboratory variables to discriminate malignant from benign breast lesions, aiming to reduce unnecessary biopsies and support early decision-making.
Methods
This retrospective two-center study included a development cohort (cohort 1) from Jiujiang First People’s Hospital (N = 745; malignant 573, benign 172) and an external cohort (cohort 2) from the First Affiliated Hospital of Nanchang University (N = 221; malignant 161, benign 60). Cohort 1 was randomly split 70:30 into training and test sets. Five-fold cross-validation was used to compare multiple algorithms and lock the model and hyperparameters; the locked model was then evaluated on the fixed test set and the external cohort. The primary metric was AUC; secondary assessments were sensitivity, specificity, F1, Brier score, calibration curves, and decision curve analysis (DCA), with SHAP used for model explanation.
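For readers less familiar with the evaluation metrics named above, the following is a minimal sketch of how AUC, sensitivity, specificity, PPV, F1, and the Brier score can be computed from predicted probabilities. It is not the study's code: the function names, the 0.5 cut-off, and the eight toy observations are illustrative assumptions, not study data.

```python
def auc(y_true, y_prob):
    """AUC as the chance that a random positive outranks a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Confusion-matrix metrics at one cut-off, plus the threshold-free Brier score."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted_malignant = p >= threshold
        if predicted_malignant and y == 1:
            tp += 1
        elif predicted_malignant and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    # Brier score: mean squared difference between predicted probability and outcome.
    brier = sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "brier": brier,
    }


# Hypothetical toy data (1 = malignant, 0 = benign); NOT taken from either cohort.
toy_labels = [1, 1, 1, 0, 1, 0, 1, 0]
toy_probs = [0.9, 0.8, 0.7, 0.6, 0.55, 0.3, 0.2, 0.1]

toy_auc = auc(toy_labels, toy_probs)
toy_metrics = threshold_metrics(toy_labels, toy_probs, threshold=0.5)
print(toy_auc, toy_metrics)
```

AUC and the Brier score do not depend on a cut-off, whereas sensitivity, specificity, PPV, and F1 change with the chosen threshold.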
Results
Logistic regression was selected, using Age, TT, APTT, CEA, and Ca. Cross-validated AUCs were 0.910 (training) and 0.905 (internal validation). The fixed test set yielded AUC 0.865 (sensitivity 0.802; specificity 0.712; F1 0.849; Brier 0.112). External validation achieved AUC 0.861, specificity 0.883, and PPV 0.934. DCA showed net benefit over the “treat-all” and “treat-none” strategies across threshold probabilities of 20%–95%. SHAP identified Age, TT, CEA, APTT, and Ca as the dominant contributors.
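The net benefit compared in DCA follows the standard definition, NB = TP/N − (FP/N) × pt/(1 − pt), where pt is the threshold probability. The sketch below illustrates that formula only; the function names and the toy data are hypothetical and do not reproduce the study's analysis or results.

```python
def net_benefit_model(y_true, y_prob, threshold):
    """Net benefit of acting on every case whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of treating (e.g. biopsying) everyone regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# The "treat-none" strategy has a net benefit of 0 at every threshold.

# Hypothetical toy data (1 = malignant, 0 = benign); NOT taken from either cohort.
dca_labels = [1, 1, 1, 0, 1, 0, 1, 0]
dca_probs = [0.9, 0.8, 0.7, 0.6, 0.55, 0.3, 0.2, 0.1]

nb_model = net_benefit_model(dca_labels, dca_probs, threshold=0.5)
nb_all = net_benefit_treat_all(dca_labels, threshold=0.5)
print(nb_model, nb_all)
```

A decision curve repeats this calculation over a grid of thresholds; the model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero.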
Conclusions
A logistic model based on routine laboratory variables effectively distinguishes malignant from benign breast lesions, with robust external performance and clear clinical net benefit, enabling early risk stratification and fewer unnecessary biopsies. This study proposes a tool that quantifies breast tumor malignancy risk using only objective indicators, without subjective factors. Online tool: prediction-for-bc.shinyapps.io/dynnomapp/.
