{"title":"使用深度学习框架的构音障碍严重程度自动分类","authors":"Amlu Anna Joshy, R. Rajan","doi":"10.23919/Eusipco47968.2020.9287741","DOIUrl":null,"url":null,"abstract":"Dysarthria is a neuro-motor speech disorder that renders speech unintelligible, in proportional to its severity. Assessing the severity level of dysarthria, apart from being a diagnostic step to evaluate the patient's improvement, is also capable of aiding automatic dysarthric speech recognition systems. In this paper, a detailed study on dysarthia severity classification using various deep learning architectural choices, namely deep neural network (DNN), convolutional neural network (CNN) and long short-term memory network (LSTM) is carried out. Mel frequency cepstral coefficients (MFCCs) and its derivatives are used as features. Performance of these models are compared with a baseline support vector machine (SVM) classifier using the UA-Speech corpus and the TORGO database. The highest classification accuracy of 96.18% and 93.24% are reported for TORGO and UA-Speech respectively. Detailed analysis on performance of these models shows that a proper choice of a deep learning architecture can ensure better performance than the conventionally used SVM classifier.","PeriodicalId":6705,"journal":{"name":"2020 28th European Signal Processing Conference (EUSIPCO)","volume":"1 1","pages":"116-120"},"PeriodicalIF":0.0000,"publicationDate":"2021-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"Automated Dysarthria Severity Classification Using Deep Learning Frameworks\",\"authors\":\"Amlu Anna Joshy, R. Rajan\",\"doi\":\"10.23919/Eusipco47968.2020.9287741\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Dysarthria is a neuro-motor speech disorder that renders speech unintelligible, in proportional to its severity. Assessing the severity level of dysarthria, apart from being a diagnostic step to evaluate the patient's improvement, is also capable of aiding automatic dysarthric speech recognition systems. In this paper, a detailed study on dysarthia severity classification using various deep learning architectural choices, namely deep neural network (DNN), convolutional neural network (CNN) and long short-term memory network (LSTM) is carried out. Mel frequency cepstral coefficients (MFCCs) and its derivatives are used as features. Performance of these models are compared with a baseline support vector machine (SVM) classifier using the UA-Speech corpus and the TORGO database. The highest classification accuracy of 96.18% and 93.24% are reported for TORGO and UA-Speech respectively. Detailed analysis on performance of these models shows that a proper choice of a deep learning architecture can ensure better performance than the conventionally used SVM classifier.\",\"PeriodicalId\":6705,\"journal\":{\"name\":\"2020 28th European Signal Processing Conference (EUSIPCO)\",\"volume\":\"1 1\",\"pages\":\"116-120\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-01-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 28th European Signal Processing Conference (EUSIPCO)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.23919/Eusipco47968.2020.9287741\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 28th European Signal Processing Conference (EUSIPCO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/Eusipco47968.2020.9287741","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Automated Dysarthria Severity Classification Using Deep Learning Frameworks
Dysarthria is a neuro-motor speech disorder that renders speech unintelligible, in proportional to its severity. Assessing the severity level of dysarthria, apart from being a diagnostic step to evaluate the patient's improvement, is also capable of aiding automatic dysarthric speech recognition systems. In this paper, a detailed study on dysarthia severity classification using various deep learning architectural choices, namely deep neural network (DNN), convolutional neural network (CNN) and long short-term memory network (LSTM) is carried out. Mel frequency cepstral coefficients (MFCCs) and its derivatives are used as features. Performance of these models are compared with a baseline support vector machine (SVM) classifier using the UA-Speech corpus and the TORGO database. The highest classification accuracy of 96.18% and 93.24% are reported for TORGO and UA-Speech respectively. Detailed analysis on performance of these models shows that a proper choice of a deep learning architecture can ensure better performance than the conventionally used SVM classifier.