Enhanced lung cancer subtype classification using attention-integrated DeepCNN and radiomic features from CT images: a focus on feature reproducibility.
Muna Alsallal, Hanan Hassan Ahmed, Radhwan Abdul Kareem, Anupam Yadav, Subbulakshmi Ganesan, Aman Shankhyan, Sofia Gupta, Kamal Kant Joshi, Hayder Naji Sameer, Ahmed Yaseen, Zainab H Athab, Mohaned Adil, Bagher Farhood
{"title":"Enhanced lung cancer subtype classification using attention-integrated DeepCNN and radiomic features from CT images: a focus on feature reproducibility.","authors":"Muna Alsallal, Hanan Hassan Ahmed, Radhwan Abdul Kareem, Anupam Yadav, Subbulakshmi Ganesan, Aman Shankhyan, Sofia Gupta, Kamal Kant Joshi, Hayder Naji Sameer, Ahmed Yaseen, Zainab H Athab, Mohaned Adil, Bagher Farhood","doi":"10.1007/s12672-025-02115-z","DOIUrl":null,"url":null,"abstract":"<p><strong>Objective: </strong>This study aims to assess a hybrid framework that combines radiomic features with deep learning and attention mechanisms to improve the accuracy of classifying lung cancer subtypes using CT images.</p><p><strong>Materials and methods: </strong>A dataset of 2725 lung cancer images was used, covering various subtypes: adenocarcinoma (552 images), SCC (380 images), small cell lung cancer (SCLC) (307 images), large cell carcinoma (215 images), and pulmonary carcinoid tumors (180 images). The images were extracted as 2D slices from 3D CT scans, with tumor-containing slices selected from scans obtained across five healthcare centers. The number of slices per patient varied between 7 and 30, depending on tumor visibility. CT images were preprocessed using standardization, cropping, and Gaussian smoothing to ensure consistency across scans from different imaging instruments used at the centers. Radiomic features, including first-order statistics (FOS), shape-based, and texture-based features, were extracted using the PyRadiomics library. A DeepCNN architecture, integrated with attention mechanisms in the second convolutional block, was used for deep feature extraction, focusing on diagnostically important regions. The dataset was split into training (60%), validation (20%), and testing (20%) sets. 
Various feature selection techniques, such as Non-negative Matrix Factorization (NMF) and Recursive Feature Elimination (RFE), were used, and multiple machines learning models, including XGBoost and Stacking, were evaluated using accuracy, sensitivity, and AUC metrics. The model's reproducibility was validated using ICC analysis across different imaging conditions.</p><p><strong>Results: </strong>The hybrid model, which integrates DeepCNN with attention mechanisms, outperformed traditional methods. It achieved a testing accuracy of 92.47%, an AUC of 93.99%, and a sensitivity of 92.11%. XGBoost with NMF showed the best performance across all models, and the combination of radiomic and deep features improved classification further. Attention mechanisms played a key role in enhancing model performance by focusing on relevant tumor areas, reducing misclassification from irrelevant features. This also improved the performance of the 3D Autoencoder, boosting the AUC to 93.89% and accuracy to 93.24%.</p><p><strong>Conclusions: </strong>This study shows that combining radiomic features with deep learning-especially when enhanced by attention mechanisms-creates a powerful and accurate framework for classifying lung cancer subtypes. Clinical trial number Not applicable.</p>","PeriodicalId":11148,"journal":{"name":"Discover. Oncology","volume":"16 1","pages":"336"},"PeriodicalIF":2.8000,"publicationDate":"2025-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Discover. Oncology","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1007/s12672-025-02115-z","RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"ENDOCRINOLOGY & METABOLISM","Score":null,"Total":0}
Abstract
Objective: This study aims to assess a hybrid framework that combines radiomic features with deep learning and attention mechanisms to improve the accuracy of classifying lung cancer subtypes using CT images.
Materials and methods: A dataset of 2725 lung cancer images was used, covering various subtypes: adenocarcinoma (552 images), squamous cell carcinoma (SCC) (380 images), small cell lung cancer (SCLC) (307 images), large cell carcinoma (215 images), and pulmonary carcinoid tumors (180 images). The images were extracted as 2D slices from 3D CT scans, with tumor-containing slices selected from scans obtained across five healthcare centers. The number of slices per patient varied between 7 and 30, depending on tumor visibility. CT images were preprocessed using standardization, cropping, and Gaussian smoothing to ensure consistency across scans from the different imaging instruments used at the centers. Radiomic features, including first-order statistics (FOS), shape-based, and texture-based features, were extracted using the PyRadiomics library. A DeepCNN architecture, integrated with attention mechanisms in the second convolutional block, was used for deep feature extraction, focusing on diagnostically important regions. The dataset was split into training (60%), validation (20%), and testing (20%) sets. Various feature selection techniques, such as Non-negative Matrix Factorization (NMF) and Recursive Feature Elimination (RFE), were used, and multiple machine learning models, including XGBoost and Stacking, were evaluated using accuracy, sensitivity, and AUC metrics. The model's reproducibility was validated using intraclass correlation coefficient (ICC) analysis across different imaging conditions.
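The abstract does not specify which attention variant is used in the second convolutional block. As an illustration only, the following NumPy sketch shows squeeze-and-excitation-style channel attention, one common way to make a CNN reweight feature maps toward diagnostically relevant channels; the function name, the bottleneck size, and the random weights `w1`/`w2` are all hypothetical, not taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feature_maps, w1, w2):
    """Squeeze-and-excitation style channel attention over (C, H, W) maps.

    Global-average-pool each channel into a vector, pass it through a
    two-layer bottleneck, and rescale each channel by the resulting
    weight in (0, 1).
    """
    squeezed = feature_maps.mean(axis=(1, 2))             # (C,) pooled descriptor
    excited = sigmoid(w2 @ np.maximum(w1 @ squeezed, 0))  # (C,) channel weights
    return feature_maps * excited[:, None, None]

rng = np.random.default_rng(0)
fmaps = rng.standard_normal((8, 16, 16))  # 8 channels, 16x16 spatial grid
w1 = rng.standard_normal((4, 8)) * 0.1    # bottleneck: 8 -> 4 (illustrative size)
w2 = rng.standard_normal((8, 4)) * 0.1    # expand: 4 -> 8
out = channel_attention(fmaps, w1, w2)
print(out.shape)
```

Because the attention weights lie in (0, 1), each channel is attenuated rather than amplified; in a trained network these weights would come from learned parameters rather than random initialization.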
Results: The hybrid model, which integrates DeepCNN with attention mechanisms, outperformed traditional methods. It achieved a testing accuracy of 92.47%, an AUC of 93.99%, and a sensitivity of 92.11%. XGBoost with NMF showed the best performance across all models, and the combination of radiomic and deep features improved classification further. Attention mechanisms played a key role in enhancing model performance by focusing on relevant tumor areas, reducing misclassification from irrelevant features. This also improved the performance of the 3D Autoencoder, boosting the AUC to 93.89% and accuracy to 93.24%.
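The best-performing combination reported above, NMF-based feature selection feeding a boosted-tree classifier, can be sketched with the 60/20/20 split described in the methods. This is a minimal stand-in, not the authors' code: it uses synthetic features in place of real radiomic data and scikit-learn's GradientBoostingClassifier in place of XGBoost, and all sizes are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import NMF
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for a radiomic feature matrix (samples x features).
X, y = make_classification(n_samples=300, n_features=50, n_informative=10,
                           n_classes=3, n_clusters_per_class=1, random_state=0)

# 60/20/20 split: carve out 20% for testing, then 25% of the remaining
# 80% for validation (0.25 * 0.8 = 0.2).
X_tmp, X_test, y_tmp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)
X_train, X_val, y_train, y_val = train_test_split(
    X_tmp, y_tmp, test_size=0.25, random_state=0, stratify=y_tmp)

# NMF requires non-negative input, so rescale to [0, 1] first; clip=True
# keeps out-of-range validation/test values from going negative.
pipe = make_pipeline(
    MinMaxScaler(clip=True),
    NMF(n_components=10, max_iter=500, random_state=0),
    GradientBoostingClassifier(random_state=0),
)
pipe.fit(X_train, y_train)
val_acc = pipe.score(X_val, y_val)
test_acc = pipe.score(X_test, y_test)
print(f"validation accuracy: {val_acc:.3f}, test accuracy: {test_acc:.3f}")
```

Note the scaling step before NMF: radiomic features such as texture statistics can be negative, so a non-negativity transform is needed before NMF can factor the matrix.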
Conclusions: This study shows that combining radiomic features with deep learning, especially when enhanced by attention mechanisms, creates a powerful and accurate framework for classifying lung cancer subtypes. Clinical trial number: Not applicable.
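The ICC reproducibility analysis mentioned in the methods can be illustrated with ICC(2,1) (two-way random effects, absolute agreement, single measurement); the abstract does not state which ICC form the authors used, so that choice is an assumption, and the data below are synthetic.

```python
import numpy as np

def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is an (n_subjects, k_conditions) matrix, e.g. the same
    radiomic feature measured per subject under k imaging conditions.
    """
    n, k = ratings.shape
    grand = ratings.mean()
    row_means = ratings.mean(axis=1)
    col_means = ratings.mean(axis=0)
    # Mean squares from a two-way ANOVA without replication.
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)   # subjects
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)   # conditions
    sse = np.sum((ratings - row_means[:, None]
                  - col_means[None, :] + grand) ** 2)
    mse = sse / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

rng = np.random.default_rng(1)
true_vals = rng.normal(size=30)  # per-subject "true" feature value
# Same feature re-measured under 3 imaging conditions with small noise:
noisy = true_vals[:, None] + 0.1 * rng.normal(size=(30, 3))
icc = icc2_1(noisy)
print(round(icc, 3))  # near 1.0: the feature is highly reproducible
```

Features whose ICC falls below a chosen threshold (commonly around 0.75 to 0.9 in the radiomics literature) would typically be excluded as non-reproducible before model training.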