Early detection of thyroid malignancies is crucial, yet traditional diagnostic methods are often costly and carry inherent risks. Thermography offers a non-invasive alternative, but existing studies frequently lack a comprehensive methodological framework that would support broader application. In machine learning classification, feature selection is pivotal: by reducing dimensionality and retaining only the most relevant features, it limits overfitting, shortens training time, and improves interpretability.
This study aims to identify the most informative features and to evaluate how well various feature selection techniques, both unsupervised and supervised (filter, wrapper, and embedded), improve the classification accuracy of thyroid nodules from thermography images. Multiple machine learning classifiers, including Support Vector Machines, Random Forest, Decision Tree, AdaBoost, and XGBoost, were assessed using group k-fold cross-validation.
Among the feature selection methods, LASSO (a supervised embedded method) performed best, achieving 86% accuracy with an AUC of 0.91 for the Random Forest model and 86% accuracy with an AUC of 0.92 for the XGBoost model. This research underscores the critical role of feature selection in the classification of thyroid nodules using thermography, providing valuable insights for advancing non-invasive diagnostic methodologies in thyroid assessment.
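As an illustration of the kind of pipeline described above, the following minimal sketch combines LASSO-based embedded feature selection with a Random Forest classifier under group k-fold cross-validation. It is not the study's implementation: the data are synthetic, the use of scikit-learn is an assumption, and the feature counts, `alpha` value, number of folds, and the choice to group images by a hypothetical patient identifier are all illustrative. Fitting the selector inside the pipeline means features are chosen on the training folds only, and grouping keeps all images of one patient in the same fold.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import GroupKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for thermography features (assumption, not study data):
# 40 hypothetical patients, 5 images each, 30 features per image.
rng = np.random.default_rng(0)
n_patients, images_per_patient, n_features = 40, 5, 30
groups = np.repeat(np.arange(n_patients), images_per_patient)
y = np.repeat(np.arange(n_patients) % 2, images_per_patient)  # one label per patient
X = rng.normal(size=(n_patients * images_per_patient, n_features))
X[:, :3] += 1.5 * y[:, None]  # only the first three features carry signal

# Embedded selection: keep features whose LASSO coefficient is non-zero,
# then classify on the retained features. alpha=0.05 is an arbitrary choice.
pipeline = make_pipeline(
    StandardScaler(),
    SelectFromModel(Lasso(alpha=0.05)),
    RandomForestClassifier(n_estimators=200, random_state=0),
)

# Group k-fold: every image of a given patient lands in the same fold,
# so no patient contributes to both training and test data.
cv = GroupKFold(n_splits=5)
proba = cross_val_predict(
    pipeline, X, y, groups=groups, cv=cv, method="predict_proba"
)[:, 1]

accuracy = accuracy_score(y, proba >= 0.5)
auc = roc_auc_score(y, proba)
print(f"accuracy={accuracy:.2f}  AUC={auc:.2f}")
```

Swapping the final step for another classifier (for example a gradient-boosted model) leaves the selection and validation logic unchanged, which is what makes this arrangement convenient for comparing several classifiers on the same selected features.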