Automated Dysarthria Severity Classification Using Deep Learning Frameworks

2020 28th European Signal Processing Conference (EUSIPCO). Pub Date: 2021-01-24. DOI: 10.23919/Eusipco47968.2020.9287741

Amlu Anna Joshy, R. Rajan

Citations: 15

Abstract

Dysarthria is a neuro-motor speech disorder that renders speech unintelligible in proportion to its severity. Assessing the severity level of dysarthria serves as a diagnostic step for evaluating a patient's improvement, and it can also aid automatic dysarthric speech recognition systems. This paper presents a detailed study of dysarthria severity classification using several deep learning architectures, namely a deep neural network (DNN), a convolutional neural network (CNN), and a long short-term memory network (LSTM). Mel frequency cepstral coefficients (MFCCs) and their derivatives are used as features. The performance of these models is compared with that of a baseline support vector machine (SVM) classifier on the UA-Speech corpus and the TORGO database. The highest classification accuracies of 96.18% and 93.24% are reported for TORGO and UA-Speech, respectively. Detailed analysis of the models' performance shows that a proper choice of deep learning architecture can ensure better performance than the conventionally used SVM classifier.
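The abstract names MFCCs and their derivatives as the input features but gives no extraction settings. As an illustration only, the sketch below shows one common way to append first- and second-order derivatives (deltas) to an MFCC matrix using the standard regression formula. The window width of 2, the 13-coefficient frame size, the edge padding, and the helper names `delta` and `stack_with_deltas` are assumptions made for this example, not details taken from the paper, and the random matrix merely stands in for real MFCCs.

```python
import numpy as np


def delta(features: np.ndarray, width: int = 2) -> np.ndarray:
    """Regression-based delta along the time axis (axis 0).

    d_t = sum_{n=1..width} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..width} n^2)
    Frames beyond the edges are filled by repeating the first/last frame.
    """
    n_frames = features.shape[0]
    # Pad only the time axis so every frame has `width` neighbours on each side.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = np.zeros(features.shape, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[width + n : width + n + n_frames]
        behind = padded[width - n : width - n + n_frames]
        out += n * (ahead - behind)
    return out / denom


def stack_with_deltas(mfcc: np.ndarray) -> np.ndarray:
    """Concatenate static, delta, and delta-delta coefficients per frame."""
    d1 = delta(mfcc)
    d2 = delta(d1)
    return np.hstack([mfcc, d1, d2])


# Placeholder input: 100 frames x 13 coefficients of random numbers,
# standing in for MFCCs computed from an utterance (assumed sizes).
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((100, 13))
feats = stack_with_deltas(mfcc)
```

Under these assumptions each frame becomes a 39-dimensional vector, which could then be passed to whichever classifier is being compared (SVM, DNN, CNN, or LSTM).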

