Prediction of early‐onset bipolar using electronic health records
Bo Wang, Yi‐Han Sheu, Hyunjoon Lee, Robert G. Mealer, Victor M. Castro, Jordan W. Smoller
Journal of Child Psychology and Psychiatry, published 2025-02-19
DOI: https://doi.org/10.1111/jcpp.14131
Abstract
Background: Early identification of bipolar disorder (BD) provides an important opportunity for timely intervention. In this study, we aimed to develop machine learning models using large‐scale electronic health record (EHR) data including clinical notes for predicting early‐onset BD.

Methods: Structured and unstructured data were extracted from the longitudinal EHR of the Mass General Brigham health system. We defined three cohorts aged 10–25 years: (1) the full youth cohort (N = 300,398); (2) a subcohort defined by having a mental health visit (N = 105,461); and (3) a subcohort defined by having a diagnosis of mood disorder or ADHD (N = 35,213). By adopting a prospective landmark modeling approach that aligns with clinical practice, we developed and validated a range of machine learning models, across different cohorts and prediction windows.

Results: We found the two tree‐based models, random forests (RF) and light gradient‐boosting machine (LGBM), achieving good discriminative performance across different clinical settings (area under the receiver operating characteristic curve 0.76–0.88 for RF and 0.74–0.89 for LGBM). In addition, we showed comparable performance can be achieved with a greatly reduced set of features, demonstrating computational efficiency can be attained without significant compromise of model accuracy.

Conclusions: Good discriminative performance for models predicting early‐onset BD can be achieved utilizing large‐scale EHR data. Our study offers a scalable and accurate method for identifying youth at risk for BD that could help inform clinical decision‐making and facilitate early intervention. Future work includes evaluating the portability of our approach to other healthcare systems and exploring considerations regarding possible implementation.
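The landmark setup and the AUROC metric named in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' pipeline: the function names `landmark_example` and `auroc`, the `(date, code)` event tuples, and the code strings are all hypothetical, and the abstract does not say how features or labels were actually constructed.

```python
from datetime import date, timedelta


def landmark_example(events, landmark, window_days, outcome_code="BD"):
    """Build one (features, label) pair for a patient at a landmark date.

    events: list of (date, code) tuples (a hypothetical record layout).
    Returns None if the outcome was already recorded on or before the
    landmark, since that patient is no longer at risk of a first diagnosis.
    """
    horizon = landmark + timedelta(days=window_days)
    if any(code == outcome_code and d <= landmark for d, code in events):
        return None
    # Features use only information available on or before the landmark.
    features = {}
    for d, code in events:
        if d <= landmark:
            features[code] = features.get(code, 0) + 1
    # Label: the outcome is recorded inside the window (landmark, horizon].
    label = int(any(code == outcome_code and landmark < d <= horizon
                    for d, code in events))
    return features, label


def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy patient: two earlier diagnoses, then the outcome seven months later.
patient = [
    (date(2015, 3, 1), "ADHD"),
    (date(2016, 1, 10), "MDD"),
    (date(2016, 9, 5), "BD"),
]
features, label = landmark_example(patient, date(2016, 2, 1), window_days=365)
print(features, label)  # {'ADHD': 1, 'MDD': 1} 1

# Toy scores: three of the four positive/negative pairs are ordered correctly.
print(auroc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6]))  # 0.75
```

Restricting features to records dated on or before the landmark is what keeps the prediction prospective: the model sees only what a clinician would have at that visit. Varying `window_days` corresponds to the different prediction windows the abstract mentions.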
About the journal:
The Journal of Child Psychology and Psychiatry (JCPP) is a highly regarded international publication that focuses on the fields of child and adolescent psychology and psychiatry. It is recognized for publishing top-tier, clinically relevant research across various disciplines related to these areas. JCPP has a broad global readership and covers a diverse range of topics, including:
Epidemiology: Studies on the prevalence and distribution of mental health issues in children and adolescents.
Diagnosis: Research on the identification and classification of childhood disorders.
Treatments: Psychotherapeutic and psychopharmacological interventions for child and adolescent mental health.
Behavior and Cognition: Studies on the behavioral and cognitive aspects of childhood disorders.
Neuroscience and Neurobiology: Research on the neural and biological underpinnings of child mental health.
Genetics: Genetic factors contributing to the development of childhood disorders.
JCPP serves as a platform for integrating empirical research, clinical studies, and high-quality reviews from diverse perspectives, theoretical viewpoints, and disciplines. This interdisciplinary approach is a key feature of the journal, as it fosters a comprehensive understanding of child and adolescent mental health.
The Journal of Child Psychology and Psychiatry is published 12 times a year and is affiliated with the Association for Child and Adolescent Mental Health (ACAMH), which supports the journal's mission to advance knowledge and practice in the field of child and adolescent mental health.