Linguistic changes in spontaneous speech for detecting Parkinson's disease using large language models
Jonathan L Crawford
PLOS Digital Health, vol. 4, no. 2, e0000757 (eCollection). Published 2025-02-10.
DOI: https://doi.org/10.1371/journal.pdig.0000757
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11809853/pdf/
Citation count: 0
Abstract
Parkinson's disease is the second most prevalent neurodegenerative disorder, with over ten million active cases worldwide and one million new diagnoses per year. Detecting and subsequently diagnosing the disease is challenging because symptoms are heterogeneous in complexity and in the type and timing of phenotypic manifestations. Language impairment can present in the prodromal phase and precede motor symptoms, suggesting that a linguistics-based approach could serve as a diagnostic method for incipient Parkinson's disease. Additionally, improved linguistic models may enhance other approaches through fusion techniques. The field of large language models is advancing rapidly, presenting the opportunity to explore the use of these new models for detecting Parkinson's disease and to improve on current linguistic approaches with high-dimensional representations of language. We evaluate the application of state-of-the-art large language models to detect Parkinson's disease automatically from spontaneous speech, achieving up to 78% accuracy. We also demonstrate that large language models can be used to predict the severity of Parkinson's disease in a regression task. We further demonstrate that the better performance of large language models is due to their ability to extract more relevant linguistic features and not to the increased dimensionality of the feature space.