Interpretable machine learning approaches for children's ADHD detection using clinical assessment data: an online web application deployment
Han Qin, Lili Zhang, Jianhong Wang, Weiheng Yan, Xi Wang, Xia Qu, Nan Peng, Lin Wang
BMC Psychiatry 25(1):139, published 2025-02-17. DOI: 10.1186/s12888-025-06573-1 (https://doi.org/10.1186/s12888-025-06573-1)
Abstract
Background: Attention-deficit/hyperactivity disorder (ADHD) is a prevalent mental disorder characterized by hyperactivity, impulsivity, and inattention. This study aims to develop a verifiable and interpretable machine learning model that identifies ADHD and its subtypes in children from clinical assessment scale data.
Methods: This study used the ADHD-200 dataset, including demographic data, Behavioral Rating Scale, and Wechsler Intelligence Scale assessments, to train and validate the models. Seven machine learning models were evaluated. Model performance was assessed using 10-fold cross-validation within the internal dataset, and the best model was then used for external validation. The SHapley Additive exPlanations (SHAP) method was employed for model interpretation. Finally, the prediction model was deployed as a web application that provides ADHD probabilities based on user input.
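The evaluation protocol described in the Methods (10-fold cross-validation on an internal set, followed by a single check on an external set) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, uses synthetic data from `make_classification` in place of ADHD-200, and the split sizes, hyperparameters, and the helper name `cross_validated_auc` are all hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in for the clinical features (assumption: NOT the ADHD-200 data).
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=6, random_state=0
)

# The first 300 rows play the role of the internal dataset,
# the remaining 100 rows play the role of the external validation set.
X_int, y_int = X[:300], y[:300]
X_ext, y_ext = X[300:], y[300:]


def cross_validated_auc(model, X, y, n_splits=10, seed=0):
    """AUC computed on pooled out-of-fold probabilities from stratified k-fold CV."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    # Each sample is scored by a model that never saw it during training.
    proba = cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
    return roc_auc_score(y, proba)


rf = RandomForestClassifier(n_estimators=200, random_state=0)

# Internal validation: 10-fold cross-validation.
internal_auc = cross_validated_auc(rf, X_int, y_int)

# External validation: refit on all internal data, score the held-out set once.
rf.fit(X_int, y_int)
external_auc = roc_auc_score(y_ext, rf.predict_proba(X_ext)[:, 1])

print(f"internal 10-fold AUC: {internal_auc:.3f}")
print(f"external AUC:         {external_auc:.3f}")
```

Pooling the out-of-fold predictions into one AUC is one common convention; averaging the ten per-fold AUCs is another, and the abstract does not say which was used.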
Results: The Random Forest (RF) model performed best in identifying ADHD, and the Support Vector Machine (SVM) model performed best in distinguishing ADHD subtypes. The RF model achieved an AUC of 0.99 in both 10-fold cross-validation and external validation. The SVM model achieved a micro-average AUC of 0.96 and an accuracy of 0.83 in internal validation, and a micro-average AUC of 0.96 and an accuracy of 0.85 in external validation. SHAP-based interpretation showed that a higher ADHD Index pushed the model towards an ADHD classification. Additionally, lower IQ scores were associated with a higher likelihood of ADHD, consistent with previous studies. The dependency analysis found that the model can identify different behavioral scales. We deployed the final model online as a web application that shows users how the model reaches its decisions.
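The subtype results are reported as a micro-average AUC alongside accuracy. As a hedged illustration of what that metric measures, the sketch below one-hot encodes the true labels, flattens all (sample, class) pairs into one binary problem, and computes a single AUC over them. The three class labels, the toy probabilities, and the helper name `micro_average_auc` are assumptions made for the example, not values from the study.

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.preprocessing import label_binarize


def micro_average_auc(y_true, proba, classes):
    """Micro-average AUC: treat every (sample, class) pair as one binary decision."""
    # One-hot encode the labels; shape (n_samples, n_classes) for 3+ classes.
    y_onehot = label_binarize(y_true, classes=classes)
    # Flatten labels and scores together so all classes share one ROC curve.
    return roc_auc_score(y_onehot.ravel(), proba.ravel())


# Toy three-class example (hypothetical labels 0, 1, 2 standing in for subtypes).
classes = [0, 1, 2]
y_true = np.array([0, 1, 2, 1, 0, 2])
proba = np.array([
    [0.7, 0.2, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],  # true class 1, but class 0 gets the highest score
    [0.6, 0.3, 0.1],
    [0.2, 0.2, 0.6],
])

micro_auc = micro_average_auc(y_true, proba, classes)
accuracy = accuracy_score(y_true, proba.argmax(axis=1))

print(f"micro-average AUC: {micro_auc:.3f}")
print(f"accuracy:          {accuracy:.3f}")
```

Because the micro-average ranks scores across all classes at once, it can stay high even when accuracy drops, which is consistent with the gap between the reported AUC (0.96) and accuracy (0.83 to 0.85).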
Conclusions: Our findings highlight the potential of using machine learning and clinical assessment scales to support the diagnosis and subtype identification of ADHD in children, offering a practical solution for improving diagnostic accuracy and efficiency in clinical settings.
About the journal:
BMC Psychiatry is an open access, peer-reviewed journal that considers articles on all aspects of the prevention, diagnosis and management of psychiatric disorders, as well as related molecular genetics, pathophysiology, and epidemiology.