Purpose: Alzheimer's disease (AD) is a neurodegenerative disorder that impairs cognition, memory, and behavior. Mild cognitive impairment (MCI) is a transitional stage that can precede AD, and models that predict conversion from MCI to AD are urgently needed.
Method: This study used machine learning methods to predict whether MCI subjects would develop AD. The models drew on both biomarkers (biological indicators from neuroimaging, such as MRI and PET scans, and molecular assays from cerebrospinal fluid or blood) and non-biomarker features, both of which matter in AD research and clinical practice because they aid early diagnosis, disease monitoring, and the development of potential treatments for MCI subjects. Using baseline data, which include measurements of different biomarkers, we predicted each patient's disease progression status at their last visit. The SHapley Additive exPlanations (SHAP) technique was used to identify the key features for predicting patient progression.
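To illustrate the idea behind SHAP-style feature attribution, the following is a minimal sketch of exact Shapley values computed for a single sample. It is not the study's code: the `toy_risk` function, the three feature names, and all numeric values are hypothetical, and features outside a coalition are assumed to be replaced by a fixed baseline value. SHAP implementations approximate or specialize this computation for real models, since the exact version scales exponentially with the number of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x relative to a baseline input.

    Features outside a coalition take their baseline value (an assumption
    of this sketch). Cost grows as 2^n, so this suits only a few features.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                # Marginal contribution of feature i to this coalition
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_risk(features):
    """Hypothetical risk score over three made-up features."""
    memory_score, hippocampal_volume, age = features
    return 2.0 * memory_score + 1.0 * hippocampal_volume + 0.5 * memory_score * age


patient = [1.0, 2.0, 4.0]      # hypothetical feature values
reference = [0.0, 0.0, 0.0]    # hypothetical baseline
phi = shapley_values(toy_risk, patient, reference)
```

The attributions sum to the difference between the prediction for `patient` and the prediction for `reference`, and the interaction term between the first and third features is split evenly between them. Ranking features by the magnitude of such attributions is one way a SHAP-based analysis can surface the most influential features.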
Results: Using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, we evaluated eight classification methods for predicting progression from MCI to AD. Four fundamental data sampling approaches were compared for balancing the dataset and reducing overfitting. The SHAP technique improved the identification of informative biomarker and non-biomarker features, which in turn enhanced the prediction of disease progression. NearMiss was the most advantageous sampling method, and XGBoost was the best-performing classification method, offering enhanced accuracy and predictive power.
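As a rough illustration of how NearMiss-style undersampling balances classes, the sketch below implements the NearMiss-1 variant in plain Python: it keeps the majority-class samples whose mean distance to their k nearest minority-class samples is smallest. This is a sketch under stated assumptions, not the study's pipeline. The abstract does not say which NearMiss variant or which value of k was used, nor which class was the majority; the two-dimensional points and the `stable_mci` and `converters` names are hypothetical.

```python
from math import dist


def near_miss_1(majority, minority, k=3):
    """Undersample the majority class down to the minority class size.

    Keeps the majority samples with the smallest mean Euclidean distance
    to their k nearest minority samples (NearMiss-1 selection rule).
    """
    k = min(k, len(minority))

    def mean_nearest(sample):
        nearest = sorted(dist(sample, m) for m in minority)[:k]
        return sum(nearest) / k

    ranked = sorted(majority, key=mean_nearest)
    return ranked[:len(minority)]


# Hypothetical 2-D feature vectors; assumes stable MCI is the majority class
stable_mci = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9), (3.0, 3.0), (0.9, 1.2)]
converters = [(1.0, 1.1), (1.2, 1.0)]

kept = near_miss_1(stable_mci, converters, k=2)
balanced = kept + converters  # equal numbers of each class
```

Because the retained majority samples lie close to the minority class, the balanced set concentrates on the region where the two classes are hardest to separate. In practice a maintained implementation from an established library would normally be preferred over hand-written code like this.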
Conclusion: The proposed combination of SHAP-based feature selection and XGBoost may improve predictive accuracy in diagnosing Alzheimer's patients.