Identifying Adolescent Depression and Anxiety Through Real-World Data and Social Determinants of Health: Machine Learning Model Development and Validation.
Mamoun T Mardini, Georges E Khalil, Chen Bai, Aparna Menon DivaKaran, Jessica M Ray
JMIR Mental Health, volume 12, e66665. Journal Article. Published 2025-02-12. DOI: 10.2196/66665 (https://doi.org/10.2196/66665)
Abstract
Background: The prevalence of adolescent mental health conditions such as depression and anxiety has significantly increased. Despite the potential of machine learning (ML), there is a shortage of models that use real-world data (RWD) to enhance early detection and intervention for these conditions.
Objective: This study aimed to identify depression and anxiety in adolescents using ML techniques on RWD and social determinants of health (SDoH).
Methods: We analyzed RWD of adolescents aged 10-17 years, considering factors such as demographics, prior diagnoses, prescribed medications, medical procedures, and laboratory measurements recorded before the onset of anxiety or depression. Clinical data were linked with SDoH at the block level. Three separate models were developed to predict anxiety, depression, and both conditions. We used Extreme Gradient Boosting (XGBoost) as the ML model and evaluated its performance with nested cross-validation. To interpret the model predictions, we used the Shapley additive explanations (SHAP) method.
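To illustrate the evaluation scheme named in the Methods, the following is a minimal sketch of nested cross-validation in plain Python. It is not the authors' pipeline: a one-feature nearest-neighbour scorer (`knn_score`) stands in for XGBoost, the data are synthetic, and the fold counts and hyperparameter grid are assumptions chosen for illustration, since the abstract does not report them. The inner loop picks a hyperparameter using only the outer training data, and the outer loop scores the refitted model on a held-out fold with a rank-based area under the curve.

```python
import random

def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_folds(indices, k):
    """Split a list of indices into k roughly equal folds."""
    return [indices[i::k] for i in range(k)]

def knn_score(train, x, k):
    """Stand-in model (assumption): share of positives among the k nearest points."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def evaluate(train, test, k):
    """Score every test point with the stand-in model and return the AUC."""
    scores = [knn_score(train, x, k) for x, _ in test]
    return auc([y for _, y in test], scores)

def rows(data, idx):
    return [data[j] for j in idx]

def nested_cv(data, grid, outer_k=5, inner_k=3, seed=0):
    idx = list(range(len(data)))
    random.Random(seed).shuffle(idx)
    outer = k_folds(idx, outer_k)
    outer_scores = []
    for i, test_idx in enumerate(outer):
        train_idx = [j for f, fold in enumerate(outer) if f != i for j in fold]
        inner = k_folds(train_idx, inner_k)

        def inner_mean(k):
            # Inner loop: validate on folds drawn from the outer training data only.
            vals = []
            for m, val_idx in enumerate(inner):
                fit_idx = [j for f, fold in enumerate(inner) if f != m for j in fold]
                vals.append(evaluate(rows(data, fit_idx), rows(data, val_idx), k))
            return sum(vals) / len(vals)

        best_k = max(grid, key=inner_mean)
        # Outer loop: refit on all outer training data, score the held-out fold.
        outer_scores.append(evaluate(rows(data, train_idx), rows(data, test_idx), best_k))
    return outer_scores

# Synthetic cohort (assumption): one feature, about 30% positive labels.
rng = random.Random(42)
data = []
for _ in range(200):
    y = int(rng.random() < 0.3)
    data.append((rng.gauss(1.0 if y else 0.0, 1.0), y))

scores = nested_cv(data, grid=[1, 5, 15])
print("outer-fold AUCs:", [round(s, 3) for s in scores])
print("mean AUC:", round(sum(scores) / len(scores), 3))
```

Because hyperparameters are never tuned on the fold used for scoring, the mean of the outer-fold values is a less optimistic performance estimate than a single cross-validation loop that both tunes and evaluates.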
Results: Our cohort included 52,054 adolescents, of whom 12,572 were identified as having anxiety, 7812 as having depression, and 14,019 as having either condition. The models achieved area under the curve values of 0.80 for anxiety, 0.81 for depression, and 0.78 for both combined. Excluding SDoH data had a minimal impact on model performance. Shapley additive explanation analysis identified gender, race, educational attainment, and various medical factors as key predictors of anxiety and depression.
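As a rough illustration of what a Shapley-based attribution measures, the sketch below computes exact Shapley values for a single instance by enumerating every feature coalition. It is a toy under stated assumptions, not the study's analysis: `toy_model`, its coefficients, the instance `x`, and the all-zero `baseline` are hypothetical, and absent features are simply replaced by their baseline value. This brute-force enumeration is exponential in the number of features, so it is practical only for tiny examples; SHAP libraries use much faster algorithms for tree models.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi

def toy_model(z):
    """Hypothetical risk score with one interaction term (not from the study)."""
    return 0.2 * z[0] + 0.1 * z[1] + 0.3 * z[0] * z[2]

x = [1.0, 2.0, 1.0]         # hypothetical instance
baseline = [0.0, 0.0, 0.0]  # hypothetical reference point

phi = shapley_values(toy_model, x, baseline)
print("attributions:", [round(p, 3) for p in phi])
print("sum of attributions:", round(sum(phi), 3))
print("prediction minus baseline:", round(toy_model(x) - toy_model(baseline), 3))
```

The attributions sum to the difference between the prediction and the baseline prediction, and the interaction term is split evenly between the two features involved. Ranking features by the magnitude of such attributions across many instances is one way a list of key predictors can be produced.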
Conclusions: This study highlights the potential of ML in early identification of depression and anxiety in adolescents using RWD. By leveraging RWD, health care providers may more precisely identify at-risk adolescents and intervene earlier, potentially leading to improved mental health outcomes.
About the journal:
JMIR Mental Health (JMH, ISSN 2368-7959) is a PubMed-indexed, peer-reviewed sister journal of JMIR, the leading eHealth journal (Impact Factor 2016: 5.175).
JMIR Mental Health focusses on digital health and Internet interventions, technologies and electronic innovations (software and hardware) for mental health, addictions, online counselling and behaviour change. This includes formative evaluation and system descriptions, theoretical papers, review papers, viewpoint/vision papers, and rigorous evaluations.