Detecting Alzheimer's Disease from Speech Using Neural Networks with Bottleneck Features and Data Augmentation

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) Pub Date: 2021-06-06 DOI: 10.1109/ICASSP39728.2021.9413566

Zhaoci Liu, Zhiqiang Guo, Zhenhua Ling, Yunxia Li


Citations: 11

Abstract


Detecting Alzheimer’s Disease from Speech Using Neural Networks with Bottleneck Features and Data Augmentation

This paper presents a method of detecting Alzheimer's disease (AD) from the spontaneous speech of subjects in a picture description task using neural networks. The method does not rely on manual transcriptions or annotations of a subject's speech; instead, it uses bottleneck features extracted from the audio with an ASR model. The neural network contains convolutional neural network (CNN) layers for local context modeling, bidirectional long short-term memory (BiLSTM) layers for global context modeling, and an attention pooling layer for classification. Furthermore, a masking-based data augmentation method is designed to deal with the data scarcity problem. Experiments on the DementiaBank dataset show that the detection accuracy of the proposed method is 82.59%, which is better than the baseline method based on manually designed acoustic features and support vector machines (SVM), and achieves state-of-the-art performance for detecting AD using only audio data on this dataset.
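The abstract names two components that are easy to illustrate in isolation: masking-based augmentation of a feature sequence and attention pooling over frames. The sketch below is not the authors' implementation. It assumes SpecAugment-style masking (zeroing one run of frames and one band of feature dimensions) and a dot-product attention score against a fixed `query` vector, which would be a learned parameter in a real model. The function names, mask widths, and the zero fill value are all hypothetical.

```python
import math
import random


def mask_augment(frames, max_time_width=3, max_feat_width=2, rng=None):
    """Return a masked copy of `frames` (a list of equal-length float lists).

    Assumption: one random run of consecutive frames and one random band of
    feature dimensions are set to 0.0. The paper's exact scheme may differ.
    """
    rng = rng or random.Random()
    n_frames, n_dims = len(frames), len(frames[0])
    out = [list(f) for f in frames]  # copy so the input is left untouched

    # Time mask: zero a run of consecutive frames.
    t_width = rng.randint(0, min(max_time_width, n_frames))
    t_start = rng.randint(0, n_frames - t_width)
    for t in range(t_start, t_start + t_width):
        out[t] = [0.0] * n_dims

    # Feature mask: zero a band of dimensions in every frame.
    f_width = rng.randint(0, min(max_feat_width, n_dims))
    f_start = rng.randint(0, n_dims - f_width)
    for row in out:
        for d in range(f_start, f_start + f_width):
            row[d] = 0.0
    return out


def attention_pool(frames, query):
    """Collapse a frame sequence into one vector with softmax attention.

    Each frame is scored by its dot product with `query` (hypothetical stand-in
    for a learned parameter); the scores are normalized with a softmax and used
    as weights for a weighted average of the frames.
    """
    scores = [sum(x * q for x, q in zip(f, query)) for f in frames]
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    n_dims = len(frames[0])
    pooled = [sum(w * f[d] for w, f in zip(weights, frames)) for d in range(n_dims)]
    return pooled, weights
```

In a pipeline of the kind the abstract describes, augmentation would be applied to the bottleneck features of each training utterance, and pooling would sit after the BiLSTM layers so that a variable-length recording maps to a single fixed-size vector for the classifier.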


Source journal

ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Self-citation rate

0.00%

Number of publications