{"title":"Automatic Speaker Verification System for Dysarthria Patients","authors":"Shinimol Salim, S. Shahnawazuddin, Waquar Ahmad","doi":"10.21437/interspeech.2022-375","DOIUrl":null,"url":null,"abstract":"Dysarthria is one of the most common speech communication disorder associate with a neurological damage that weakens the muscles necessary for speech. In this paper, we present our efforts towards developing an automatic speaker verification (ASV) system based on x -vectors for dysarthric speakers with varying speech intelligibility (low, medium and high). For that purpose, a baseline ASV system was trained on speech data from healthy speakers since there is severe scarcity of data from dysarthric speakers. To improve the performance with respect to dysarthric speakers, data augmentation based on duration modification is proposed in this study. Duration modification with several scaling factors was applied to healthy training speech. An ASV system was then trained on healthy speech augmented with its duration modified versions. It compen-sates for the substantial disparities in phone duration between normal and dysarthric speakers of varying speech intelligibilty. Experiment evaluations presented in this study show that proposed duration-modification-based data augmentation resulted in a relative improvement of 22% over the baseline. Further to that, a relative improvement of 26% was obtained in the case of speakers with high severity level of dysarthria.","PeriodicalId":73500,"journal":{"name":"Interspeech","volume":"1 1","pages":"5070-5074"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Interspeech","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21437/interspeech.2022-375","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Dysarthria is one of the most common speech communication disorder associate with a neurological damage that weakens the muscles necessary for speech. In this paper, we present our efforts towards developing an automatic speaker verification (ASV) system based on x -vectors for dysarthric speakers with varying speech intelligibility (low, medium and high). For that purpose, a baseline ASV system was trained on speech data from healthy speakers since there is severe scarcity of data from dysarthric speakers. To improve the performance with respect to dysarthric speakers, data augmentation based on duration modification is proposed in this study. Duration modification with several scaling factors was applied to healthy training speech. An ASV system was then trained on healthy speech augmented with its duration modified versions. It compen-sates for the substantial disparities in phone duration between normal and dysarthric speakers of varying speech intelligibilty. Experiment evaluations presented in this study show that proposed duration-modification-based data augmentation resulted in a relative improvement of 22% over the baseline. Further to that, a relative improvement of 26% was obtained in the case of speakers with high severity level of dysarthria.