Title: Dimensional Measures of Psychopathology in Children and Adolescents Using Large Language Models
Authors: Thomas H McCoy, Roy H Perlis
Journal: Biological Psychiatry (Journal Article)
Published: 2024-12-15 (Epub 2024-06-10)
DOI: 10.1016/j.biopsych.2024.05.008 (https://doi.org/10.1016/j.biopsych.2024.05.008)
Citation count: 0
Abstract
Background: To enable greater use of National Institute of Mental Health Research Domain Criteria (RDoC) in real-world settings, we applied large language models (LLMs) to estimate dimensional psychopathology from narrative clinical notes.
Methods: We conducted a cohort study using health records from individuals age ≤18 years evaluated in the psychiatric emergency department of a large academic medical center between November 2008 and March 2015. Outcomes were hospital admission and length of emergency department stay. RDoC domains were estimated using a Health Insurance Portability and Accountability Act-compliant LLM (gpt-4-1106-preview) and compared with a previously validated token-based approach.
Results: The cohort included 3059 individuals (median age 16 years [interquartile range, 13-18]; 1580 [52%] female, 1479 [48%] male; 105 [3.4%] identified as Asian, 329 [11%] as Black, 288 [9.4%] as Hispanic, 474 [15%] as other race, and 1863 [61%] as White), of whom 1695 (55%) were admitted. Correlation between LLM-extracted RDoC scores and the token-based scores ranged from small to medium as assessed by Kendall's tau (0.14-0.22). In logistic regression models adjusting for sociodemographic and clinical features, admission likelihood was associated with greater scores on all domains, with the exception of the sensorimotor domain, which was inversely associated (p < .001 for all adjusted associations). Tests for bias suggested modest but statistically significant differences in positive valence scores by race (p < .05 for Asian, Black, and Hispanic individuals).
Conclusions: An LLM extracted estimates of 6 RDoC domains in an explainable manner, which were associated with clinical outcomes. This approach can contribute to a new generation of prediction models or biological investigations based on dimensional psychopathology.
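As an illustration of the agreement statistic reported in the Results, the following is a minimal sketch of Kendall's tau computed between two sets of scores. The abstract does not say which variant of tau was used; this sketch assumes tau-b, which adjusts for ties. The function name `kendall_tau_b` and the example scores are hypothetical and are not study data.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    # Compare every unordered pair of observations.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from all counts
        elif dx == 0:
            ties_x += 1  # tied on x only
        elif dy == 0:
            ties_y += 1  # tied on y only
        elif dx * dy > 0:
            concordant += 1  # both variables order the pair the same way
        else:
            discordant += 1  # the variables order the pair oppositely
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    if denom == 0:
        raise ValueError("tau is undefined when either input is constant")
    return (concordant - discordant) / denom


# Made-up domain scores for six notes (illustrative only, not study data).
llm_scores = [2, 5, 3, 8, 7, 1]
token_scores = [1, 4, 6, 5, 9, 2]
tau = kendall_tau_b(llm_scores, token_scores)
```

Tau ranges from -1 (the two methods rank the notes in exactly opposite order) to 1 (identical ranking), so the reported range of 0.14 to 0.22 indicates that the two methods ordered patients in only loosely similar ways.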
About the journal:
Biological Psychiatry is an official journal of the Society of Biological Psychiatry and was established in 1969. It is the first journal in the Biological Psychiatry family, which also includes Biological Psychiatry: Cognitive Neuroscience and Neuroimaging and Biological Psychiatry: Global Open Science. The Society's main goal is to promote excellence in scientific research and education in the fields related to the nature, causes, mechanisms, and treatments of disorders pertaining to thought, emotion, and behavior. To fulfill this mission, Biological Psychiatry publishes peer-reviewed, rapid-publication articles that present new findings from original basic, translational, and clinical mechanistic research, ultimately advancing our understanding of psychiatric disorders and their treatment. The journal also encourages the submission of reviews and commentaries on current research and topics of interest.