Early Alzheimer’s disease diagnosis using an XG-Boost model applied to MRI images
Khoi Nguyen, My Nguyen, Khiet Dang, Bao Pham, Vy Huynh, Toi Vo, Lua Ngo, Huong Ha
Biomedical Research and Therapy (JCR Q4, Medicine, Research & Experimental), published 2023-09-30
DOI: https://doi.org/10.15419/bmrat.v10i9.832
Citations: 0
Abstract
Introduction: Early diagnosis of Alzheimer's disease (AD), especially at the early mild cognitive impairment (EMCI) stage, is critical to improving the success of new treatments in clinical trials. This study aimed to develop an accurate classification model for detecting AD at the EMCI stage from magnetic resonance imaging (MRI).

Methods: The proposed classification model was built through a machine-learning pipeline with three main steps. First, features were extracted from MRI images using FreeSurfer. Second, the extracted features were filtered using principal component analysis (PCA), backward elimination (BE), and extreme gradient (XG)-Boost importance (XGBI), and the efficiency of each method was evaluated. Finally, the selected features were combined with cognitive scores (Mini Mental State Examination [MMSE] and Clinical Dementia Rating [CDR]) to create an XG-Boost three-class classifier: AD vs. EMCI vs. cognitively normal (CN).

Results: The MMSE and CDR had the highest importance weights, followed by the thickness of the left superior temporal sulcus and banks of the superior temporal lobe. Without feature selection, the model had the lowest accuracy (69.0%). After feature selection and the addition of cognitive scores, the accuracy of the PCA, BE, and XGBI approaches improved to 74.0%, 90.9%, and 91.5%, respectively. The BE model with tuned parameters was chosen as the final model because it had the highest accuracy (92.0%). The areas under the receiver operating characteristic curve for the CN, AD, and EMCI classes were 0.98, 0.94, and 0.88, respectively.

Conclusion: The proposed model shows promise for early AD diagnosis and can be refined in the future by testing on multiple datasets.
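
The backward elimination (BE) step named in the Methods can be illustrated with a minimal sketch. The abstract does not state which elimination criterion the authors used, so this version assumes a score-driven wrapper: starting from all features, it repeatedly drops the one feature whose removal most improves a validation score and stops when no removal helps. The function `backward_elimination`, the feature labels, and `toy_score` are all hypothetical and chosen only for illustration; in practice the score would come from something like the cross-validated accuracy of the classifier.

```python
from typing import Callable, List, Sequence


def backward_elimination(
    features: Sequence[str],
    score: Callable[[List[str]], float],
    min_features: int = 1,
) -> List[str]:
    """Greedy backward elimination driven by a caller-supplied score.

    Assumption: a higher score is better (e.g. validation accuracy).
    """
    selected = list(features)
    best = score(selected)
    while len(selected) > min_features:
        # Build every subset obtained by dropping exactly one feature.
        candidates = [[f for f in selected if f != drop] for drop in selected]
        scores = [score(c) for c in candidates]
        top = max(range(len(candidates)), key=scores.__getitem__)
        # Stop as soon as no single removal improves on the current score.
        if scores[top] <= best:
            break
        selected, best = candidates[top], scores[top]
    return selected


# Hypothetical feature labels; the noise_* entries stand in for
# uninformative MRI-derived measurements.
FEATURES = ["MMSE", "CDR", "left_bankssts_thickness", "noise_a", "noise_b"]


def toy_score(subset: List[str]) -> float:
    """Made-up stand-in for validation accuracy, NOT data from the study.

    Informative features add to the score; every feature costs a small
    penalty, so uninformative ones lower the total.
    """
    gain = {"MMSE": 0.4, "CDR": 0.3, "left_bankssts_thickness": 0.2}
    return sum(gain.get(f, 0.0) for f in subset) - 0.01 * len(subset)


if __name__ == "__main__":
    print(backward_elimination(FEATURES, toy_score))
```

With the toy score, the two uninformative features are removed and the three informative ones are kept. Passing the score as a callable keeps the selection loop independent of the model, so the same loop could wrap an XG-Boost classifier or any other estimator.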