Automatic Diagnosis and Prediction of Cognitive Decline Associated with Alzheimer’s Dementia through Spontaneous Speech

2021 IEEE International Conference on Signal and Image Processing Applications (ICSIPA). Pub Date: 2021-09-13. DOI: 10.1109/ICSIPA52582.2021.9576784

Ziming Liu, Lauren Proctor, Parker N. Collier, Xiaopeng Zhao


Citations: 3

Abstract

As Alzheimer’s disease (AD) becomes more prevalent, it is important to develop detectable biomarkers that reliably identify AD at an early stage. Language deficit is one of the common signs that appear in the early stage of mild AD, so the use of natural language processing and related machine learning algorithms to diagnose AD from patients’ speech recordings has drawn increasing attention in recent years. This study proposes three approaches for extracting features from speech recordings: (1) fine-tuning a pre-trained encoder model (BERT) on transcripts produced by automatic transcription, (2) hand-crafted linguistic features computed from automatically generated transcripts, and (3) selected acoustic features computed from denoised speech recordings. The three approaches are applied to three tasks: AD diagnosis, MMSE score prediction, and cognitive decline inference. Based on cross-validation on the training dataset, the BERT approach performs best in all three challenge tasks. Specifically, in the AD diagnosis task, 5-fold cross-validation using encoded features based on transcripts generated by Deep Speech yields an average classification accuracy of 97.18%. In the MMSE score prediction task, 5-fold cross-validation using BERT-encoded features based on Deep Speech transcripts yields an average Root Mean Squared Error (RMSE) of 3.76. In the cognitive decline inference task, leave-one-out cross-validation using BERT-encoded features based on transcripts generated by Sphinx or Deep Speech yields an average classification accuracy of 100%. These analyses suggest that combining automatic transcription with BERT may achieve strong performance in AD-related detection and prediction problems.
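The evaluation protocol named in the abstract (5-fold and leave-one-out cross-validation, with RMSE averaged over folds for MMSE prediction) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' pipeline: the scores are made up, the function names are hypothetical, and a mean-score baseline stands in for the BERT-based regressor, which the abstract does not specify in detail.

```python
import math
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equal-length sequences."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def cross_validated_rmse(scores, k):
    """Average per-fold RMSE of a mean-score baseline (hypothetical stand-in model)."""
    fold_errors = []
    for held_out in k_fold_indices(len(scores), k):
        held_out_set = set(held_out)
        train = [s for i, s in enumerate(scores) if i not in held_out_set]
        prediction = mean(train)  # baseline: predict the training-fold mean
        truth = [scores[i] for i in held_out]
        fold_errors.append(rmse(truth, [prediction] * len(truth)))
    return mean(fold_errors)


# Made-up MMSE-like scores (0-30 scale), for illustration only.
toy_scores = [29, 27, 30, 18, 22, 25, 12, 28, 20, 16]
five_fold = cross_validated_rmse(toy_scores, k=5)
# Leave-one-out is the special case where k equals the number of samples.
leave_one_out = cross_validated_rmse(toy_scores, k=len(toy_scores))
print(f"5-fold RMSE: {five_fold:.2f}, leave-one-out RMSE: {leave_one_out:.2f}")
```

Replacing the baseline with a real model only changes the line that computes `prediction`; the fold splitting and the averaging of per-fold errors stay the same. Classification accuracy for the diagnosis and decline tasks would be averaged over folds in the same way.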

