An interpretable predictive deep learning platform for pediatric metabolic diseases
Hamed Javidi, Arshiya Mariam, Lina Alkhaled, Kevin M Pantalone, Daniel M Rotroff
Journal of the American Medical Informatics Association, pp. 1227-1238, published 2024-05-20
DOI: 10.1093/jamia/ocae049 (https://doi.org/10.1093/jamia/ocae049)
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11105121/pdf/
Citations: 0
Abstract
Objectives: Metabolic disease in children is increasing worldwide and predisposes to a wide array of chronic comorbid conditions with severe impacts on quality of life. Tools for early detection are needed so that clinicians can intervene promptly to prevent or slow the development of these long-term complications.
Materials and methods: No clinically available tools that can predict the onset of metabolic diseases in pediatric patients are currently in widespread use. Here, we use interpretable deep learning, leveraging longitudinal clinical measurements, demographic data, and diagnosis codes drawn from the electronic health records of a large integrated health system, to predict the onset of prediabetes, type 2 diabetes (T2D), and metabolic syndrome in pediatric cohorts.
Results: The cohort included 49 517 children with overweight or obesity aged 2-18 years (54.9% male, 73% Caucasian), with a median follow-up time of 7.5 years and a mean body mass index (BMI) percentile of 88.6%. Our model achieved areas under the receiver operating characteristic curve (AUCs) of up to 0.87, 0.79, and 0.79 for predicting T2D, metabolic syndrome, and prediabetes, respectively. Whereas most risk calculators use only the most recently available data, incorporating longitudinal data improved AUCs by 13.04%, 11.48%, and 11.67% for T2D, metabolic syndrome, and prediabetes, respectively, versus models using the most recent BMI (P < 2.2 × 10^-16).
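For readers less familiar with the metric, the sketch below shows one standard way to compute an AUC: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. It is a minimal illustration, not the authors' evaluation code. The labels and scores are hypothetical toy values, and `relative_gain_percent` assumes the reported percentage improvements are relative gains over the baseline AUC, which the abstract does not state explicitly.

```python
def auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties = 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def relative_gain_percent(baseline_auc, new_auc):
    """Relative improvement of new_auc over baseline_auc, in percent (assumed definition)."""
    return 100.0 * (new_auc - baseline_auc) / baseline_auc


# Hypothetical toy data: 1 = developed the condition, 0 = did not.
labels = [1, 1, 1, 0, 0, 0]
baseline_scores = [0.9, 0.4, 0.3, 0.7, 0.5, 0.2]      # e.g. a model using one recent value
longitudinal_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]  # e.g. a model using the full history

baseline = auc(labels, baseline_scores)
improved = auc(labels, longitudinal_scores)
print(f"baseline AUC = {baseline:.3f}, longitudinal AUC = {improved:.3f}")
print(f"relative gain = {relative_gain_percent(baseline, improved):.1f}%")
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration; on a cohort of this size a rank-based implementation from an established library would be the practical choice.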
Discussion: Although most risk calculators use only the most recent data, incorporating longitudinal data improved model accuracy, because trajectories provide a more comprehensive characterization of the patient's health history. Our interpretable model consistently identified BMI trajectories as among the most influential features for prediction, highlighting the advantages of incorporating longitudinal data when available.
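To make the contrast between a single recent measurement and a trajectory concrete, the sketch below summarizes a series of visits with a least-squares slope alongside the most recent value. This is only an illustration of why a trajectory can carry extra information. It is not the deep learning model described in the abstract, and the function name, feature names, and patient values are all hypothetical.

```python
def trajectory_features(ages, bmi_percentiles):
    """Summarize visits (assumed sorted by age) as most-recent value, mean, and slope."""
    if len(ages) != len(bmi_percentiles) or len(ages) < 2:
        raise ValueError("need at least two visits with matching lengths")
    n = len(ages)
    mean_age = sum(ages) / n
    mean_bmi = sum(bmi_percentiles) / n
    sxx = sum((a - mean_age) ** 2 for a in ages)
    if sxx == 0:
        raise ValueError("visits must span more than one age")
    sxy = sum((a - mean_age) * (b - mean_bmi) for a, b in zip(ages, bmi_percentiles))
    return {
        "most_recent": bmi_percentiles[-1],
        "mean": mean_bmi,
        "slope_per_year": sxy / sxx,  # ordinary least-squares slope
    }


# Two hypothetical children with the same most recent BMI percentile.
stable = trajectory_features([6, 7, 8], [90, 90, 90])
rising = trajectory_features([6, 7, 8], [80, 85, 90])
print(stable)
print(rising)
```

Both hypothetical children look identical to a calculator that reads only the latest value, while the slope separates a stable history from a rising one.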
About the journal:
JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives, and reviews also help readers stay connected with the most important informatics developments in implementation, policy, and education.