Daphne Ter Huurne, Nina Possemis, Leonie Banning, Angélique Gruters, Alexandra König, Nicklas Linz, Johannes Tröger, Kai Langel, Frans Verhey, Marjolein de Vugt, Inez Ramakers
Digital Biomarkers 7(1): 115-123. Published 2023-08-31. DOI: https://doi.org/10.1159/000533188. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10601928/pdf/
Validation of an Automated Speech Analysis of Cognitive Tasks within a Semiautomated Phone Assessment.
Introduction: We studied the accuracy of automatic speech recognition (ASR) software by comparing ASR scores with manual scores from a verbal learning test (VLT) and a semantic verbal fluency (SVF) task in a semiautomated phone assessment in a memory clinic population. Furthermore, we examined how well these tests differentiate between participants with subjective cognitive decline (SCD) and mild cognitive impairment (MCI). We also investigated whether the automatically calculated speech and linguistic features added value beyond the commonly used total scores in a semiautomated phone assessment.
Methods: We included 94 participants from the memory clinic of the Maastricht University Medical Center+ (SCD N = 56 and MCI N = 38). The test leader guided the participant through a semiautomated phone assessment. The VLT and SVF were audio recorded and processed via a mobile application. The recall count and the speech and linguistic features were automatically extracted. Machine learning classifiers were trained to differentiate SCD from MCI participants.
Results: The intraclass correlation for inter-rater reliability between the manual and the ASR total word count was 0.89 (95% CI 0.09-0.97) for the VLT immediate recall, 0.94 (95% CI 0.68-0.98) for the VLT delayed recall, and 0.93 (95% CI 0.56-0.97) for the SVF. The full model including the total word count and speech and linguistic features had an area under the curve of 0.81 and 0.77 for the VLT immediate and delayed recall, respectively, and 0.61 for the SVF.
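As an illustration of the agreement statistic reported above, the following is a minimal sketch of an intraclass correlation between manual and ASR word counts. The abstract does not state which ICC form was used; this sketch assumes the two-way random-effects, absolute-agreement, single-rater form, ICC(2,1). The function name `icc2_1` and the example counts are hypothetical and are not study data.

```python
def icc2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    ratings: list of rows, one row per participant, one column per rater
    (here assumed to be [manual_count, asr_count]).
    """
    n = len(ratings)        # number of participants
    k = len(ratings[0])     # number of raters (2: manual and ASR)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares from a two-way ANOVA without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-participant mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical word counts: [manual, ASR] per participant.
identical = [[1, 1], [2, 2], [3, 3]]
offset_by_one = [[1, 2], [2, 3], [3, 4]]
print(icc2_1(identical))      # perfect agreement
print(icc2_1(offset_by_one))  # a systematic offset lowers absolute agreement
```

The absolute-agreement form is chosen here because a systematic ASR under- or overcount should lower the coefficient, whereas a consistency-type ICC would ignore it.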
Conclusion: Agreement between the ASR and manual scores was high, although the broad confidence intervals warrant caution. The phone-based VLT differentiated between SCD and MCI and may offer opportunities for clinical trial screening.
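The area under the curve reported in the Results can be read as the probability that a randomly chosen MCI participant receives a higher classifier score than a randomly chosen SCD participant. The sketch below computes it by pairwise comparison. It assumes MCI is treated as the positive class and that higher scores indicate MCI; the function name `auc_pairwise` and the scores are hypothetical, and the abstract does not specify the classifiers or the evaluation procedure.

```python
def auc_pairwise(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    scores_pos: classifier scores for the positive class (assumed: MCI).
    scores_neg: classifier scores for the negative class (assumed: SCD).
    Ties count as half a win, matching the Mann-Whitney U formulation.
    """
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores, not study data.
mci_scores = [0.8, 0.4]
scd_scores = [0.6, 0.2]
print(auc_pairwise(mci_scores, scd_scores))  # 3 of 4 pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.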