Machine learning to detect Alzheimer's disease with data on drugs and diagnoses.
Johanna Wallensten, Caroline Wachtler, Nenad Bogdanovic, Anna Olofsson, Miia Kivipelto, Linus Jönsson, Predrag Petrovic, Axel C Carlsson
The Journal of Prevention of Alzheimer's Disease, pages 100115. Journal Article. Published 2025-03-07. DOI: 10.1016/j.tjpad.2025.100115 (https://doi.org/10.1016/j.tjpad.2025.100115)
Citations: 0
Abstract
Background: Integrating machine learning with medical records offers potential for early detection of Alzheimer's disease (AD), enabling timely interventions.
Objectives: This study aimed to evaluate the effectiveness of machine learning in constructing a predictive model for AD, designed to predict AD with data up to three years before diagnosis. Using clinical data, including prior diagnoses and medical treatments, we sought to enhance sensitivity and specificity in diagnostic procedures. A second aim was to identify the most important factors in the machine learning models, as these may be important predictors of AD.
Design: The study employed Stochastic Gradient Boosting, a machine learning method, to identify diagnoses predictive of AD using primary healthcare data. The analyses were stratified by sex and age groups.
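As a rough illustration of the method named in the Design, stochastic gradient boosting fits each new weak learner to the gradient of the loss on a random subsample of the training rows. The sketch below is a toy, standard-library-only version that uses one-feature decision stumps and log-loss. The binary features, labels, and hyperparameters are invented for illustration and are not the study's data, predictors, or settings; the study also fitted models per sex and age stratum, whereas this sketch fits a single model.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_stump(X, residuals, rows):
    """Least-squares stump on one binary feature, fitted on the sampled rows only."""
    best = None
    for j in range(len(X[0])):
        left = [residuals[i] for i in rows if X[i][j] == 0]
        right = [residuals[i] for i in rows if X[i][j] == 1]
        if not left or not right:
            continue  # feature is constant within this subsample
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, j, ml, mr)
    return best  # (error, feature index, value if 0, value if 1) or None

def fit_sgb(X, y, n_rounds=50, learning_rate=0.3, subsample=0.5, seed=0):
    rng = random.Random(seed)
    n = len(X)
    p = sum(y) / n
    base = math.log(p / (1 - p))  # initial log-odds; assumes both classes are present
    scores = [base] * n
    stumps = []
    for _ in range(n_rounds):
        # The "stochastic" part: each round sees a random subset of rows.
        rows = rng.sample(range(n), max(2, int(subsample * n)))
        # Negative gradient of log-loss with respect to the current score.
        residuals = [y[i] - sigmoid(scores[i]) for i in range(n)]
        stump = fit_stump(X, residuals, rows)
        if stump is None:
            continue
        _, j, v0, v1 = stump
        stumps.append((j, v0, v1))
        for i in range(n):
            scores[i] += learning_rate * (v1 if X[i][j] else v0)
    return base, stumps, learning_rate

def predict_proba(model, x):
    base, stumps, lr = model
    return sigmoid(base + lr * sum(v1 if x[j] else v0 for j, v0, v1 in stumps))

# Hypothetical binary features: [cognitive symptom code, antidepressant, B12 treatment]
X = [[1, 1, 0], [1, 0, 1], [1, 1, 1], [0, 1, 0],
     [0, 0, 0], [0, 0, 1], [0, 0, 0], [1, 0, 0]]
y = [1, 1, 1, 0, 0, 0, 0, 1]  # invented labels

model = fit_sgb(X, y)
print(predict_proba(model, [1, 0, 0]), predict_proba(model, [0, 0, 0]))
```

In practice a maintained library implementation would replace this hand-rolled loop; the point here is only the combination of gradient fitting and row subsampling.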
Setting: The study included individuals within Region Stockholm, Sweden, using medical records from 2010 to 2022.
Participants: The study analyzed clinical data for individuals over the age of 40. Patients with an AD diagnosis (ICD-10-SE codes F00 or G30) during 2010-2012 were excluded to ensure prospective modeling. In total, AD was identified in 3,407 patients aged 41-69 years and 25,796 patients aged over 69.
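The exclusion rule described for the participants can be pictured as a filter over diagnosis records. The following sketch assumes a hypothetical record layout (`patient_id`, `icd10`, `year`), invented rows, and prefix matching on the two ICD-10-SE codes; the abstract does not describe the study's actual data model or matching rules.

```python
AD_PREFIXES = ("F00", "G30")        # ICD-10-SE codes named in the abstract
WASHOUT_YEARS = range(2010, 2013)   # 2010 through 2012 inclusive

def is_ad_code(icd10):
    # Prefix matching is an assumption; it treats "G30.1" and "G301" alike.
    return icd10.upper().startswith(AD_PREFIXES)

def excluded_patients(records):
    """Patients with any AD code during the washout years."""
    return {r["patient_id"] for r in records
            if is_ad_code(r["icd10"]) and r["year"] in WASHOUT_YEARS}

def incident_ad_patients(records):
    """Patients whose AD codes all fall after the washout years."""
    excluded = excluded_patients(records)
    return {r["patient_id"] for r in records
            if is_ad_code(r["icd10"]) and r["year"] > 2012
            and r["patient_id"] not in excluded}

# Invented example rows
records = [
    {"patient_id": 1, "icd10": "G30.1", "year": 2011},
    {"patient_id": 1, "icd10": "G30.1", "year": 2015},
    {"patient_id": 2, "icd10": "F00.1", "year": 2016},
    {"patient_id": 3, "icd10": "F32.9", "year": 2011},
    {"patient_id": 3, "icd10": "G30.9", "year": 2019},
]

print(excluded_patients(records))     # patient 1 only
print(incident_ad_patients(records))  # patients 2 and 3
```

Patient 3 stays in the cohort because the 2011 record is a depression code, not an AD code.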
Measurements: The machine learning model ranked predictive diagnoses, with performance assessed by the area under the receiver operating characteristic curve (AUC). Known and novel predictors were evaluated for their contribution to AD risk.
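For readers less familiar with the metric, the AUC can be read as the probability that a randomly chosen case receives a higher model score than a randomly chosen non-case, with ties counted as half. A minimal sketch of that definition, using invented scores and labels rather than study data:

```python
def auc(scores, labels):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Invented model scores and true labels (1 = AD, 0 = no AD)
scores = [0.9, 0.8, 0.4, 0.35, 0.3, 0.1]
labels = [1, 0, 1, 1, 0, 0]

print(auc(scores, labels))  # 7 of 9 case/non-case pairs are ranked correctly
```

The pairwise form is quadratic in the number of patients, so it suits illustration rather than cohorts of this study's size.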
Results: AUC values ranged from 0.748 (women aged 41-69) to 0.816 (women over 69), with men across age groups falling within this range. Sensitivity and specificity ranged from 0.73 to 0.79 and 0.66 to 0.79, respectively, across age and gender groups. Negative predictive values were consistently high (≥0.954), while positive predictive values were lower (0.199-0.351). Additionally, we confirmed known risk factors as predictors and identified novel predictors that warrant further investigation. Key predictors included medical observations, cognitive symptoms, antidepressant treatment, visit frequency, and vitamin B12/folic acid treatment.
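The combination of high negative and low positive predictive values is what Bayes' rule predicts when a condition is uncommon in the tested population. The sketch below computes both values from sensitivity, specificity, and prevalence. The sensitivity and specificity are taken from the upper end of the reported ranges, while the 10% prevalence is an assumption chosen purely for illustration and is not a figure from the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from test characteristics via Bayes' rule."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

# 0.79 / 0.79 are the upper bounds reported above; prevalence is assumed.
ppv, npv = predictive_values(sensitivity=0.79, specificity=0.79, prevalence=0.10)
print(round(ppv, 3), round(npv, 3))
```

Under these assumed inputs the PPV comes out near 0.29 and the NPV near 0.97, so most positive flags would still be false alarms even though a negative result is highly reassuring.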
Conclusions: Machine learning applied to clinical data shows promise in predicting AD, with robust model performance across age and sex groups. The findings confirmed known risk factors, such as depression and vitamin B12 deficiency, while also identifying novel predictors that may guide future research. Clinically, this approach could enhance early detection and risk stratification, facilitating timely interventions and improving patient outcomes.
About the journal:
The Journal of Prevention of Alzheimer's Disease (JPAD) will publish reviews, original research articles, and short reports to improve our knowledge in the field of Alzheimer's prevention, including neurosciences, biomarkers, imaging, epidemiology, public health, physical and cognitive exercise, nutrition, risk and protective factors, drug development, trial design, and health economic outcomes. JPAD will also publish the meeting abstracts from Clinical Trials on Alzheimer's Disease (CTAD) and will be distributed worldwide in both paper and online versions. We hope that JPAD, with your contribution, will play a role in the development of Alzheimer's prevention.