Background: Radiomics has shown promise in the diagnosis and prognosis of lung cancer. Here, we investigated the performance of computed tomography-based radiomic features, extracted from the gross tumor volume (GTV), the peritumoral volume (PTV), and the combined GTV and PTV (GPTV), for predicting the pathological invasiveness of lung adenocarcinoma presenting as pure ground-glass nodules.
Methods: This was a retrospective, cross-sectional, bicentric study with data collected from January 1, 2018, to June 1, 2022. We divided the dataset into a training cohort (n = 88) from one center and an external validation cohort (n = 59) from the other center. Radiomic signatures (rad-scores) were obtained after features were selected through correlation analysis and least absolute shrinkage and selection operator (LASSO) regression. Three machine learning models (a support vector machine, a random forest, and a generalized linear model) were then applied to build radiomic models.
Results: Invasive adenocarcinomas had higher rad-scores (P < 0.001) in the GTV and GPTV. The areas under the curve (AUCs) of the GTV, PTV, and GPTV models were 0.839, 0.809, and 0.855 in the training cohort and 0.755, 0.777, and 0.801 in the external validation cohort, respectively. The GPTV model thus had the highest AUC for predicting pathological invasiveness in both cohorts. Among the three machine learning models, the random forest model had the best validity and fit, suggesting that it may be the most appropriate model.
Conclusions: GPTV had the highest diagnostic efficiency for predicting pathological invasiveness in patients with pure ground-glass nodules, and the random forest model outperformed the other predictive models.
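The rad-score and AUC referred to in the Methods and Results can be illustrated with a minimal sketch. It assumes the common convention that a rad-score is a linear combination of the selected features weighted by their LASSO coefficients, and it computes the AUC as the probability that an invasive case scores higher than a non-invasive one. The coefficients, intercept, feature values, and labels below are hypothetical toy numbers, not data or parameters from this study, and the function names are illustrative only.

```python
from typing import Sequence


def rad_score(features: Sequence[float], coefs: Sequence[float],
              intercept: float = 0.0) -> float:
    """Linear radiomic signature: intercept + sum(coef_i * feature_i)."""
    if len(features) != len(coefs):
        raise ValueError("features and coefs must have the same length")
    return intercept + sum(c * x for c, x in zip(coefs, features))


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical coefficients and intercept (assumed, not from the study).
HYPOTHETICAL_COEFS = [0.8, -0.5, 1.2]
HYPOTHETICAL_INTERCEPT = -0.1

# Toy feature vectors for four nodules; label 1 = invasive, 0 = non-invasive.
toy_features = [
    [0.2, 1.0, 0.1],
    [0.5, 0.4, 0.3],
    [1.1, 0.2, 0.9],
    [0.3, 0.9, 0.6],
]
toy_labels = [0, 0, 1, 1]

toy_scores = [rad_score(f, HYPOTHETICAL_COEFS, HYPOTHETICAL_INTERCEPT)
              for f in toy_features]
toy_auc = auc(toy_scores, toy_labels)
print(f"toy AUC = {toy_auc:.3f}")
```

On this toy data, three of the four invasive/non-invasive pairs are ranked correctly, so the sketch reports an AUC of 0.750. In practice the coefficients would come from a fitted LASSO model and the AUC would be computed separately for the training and external validation cohorts.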