Speaker identification using dynamic time warping with stress compensation technique
I. Shahin, N. Botros
Proceedings IEEE Southeastcon '98 'Engineering for a New Era', 1998-04-24
DOI: 10.1109/SECON.1998.673293 (https://doi.org/10.1109/SECON.1998.673293)
Citation count: 22
Abstract
We present an algorithm for isolated-word, text-dependent speaker identification under a normal speaking style and four stressful styles: shout, slow, loud, and soft. These styles are designed to simulate speech produced under real stressful conditions. The algorithm is based on dynamic time warping (DTW) with a cepstral stress compensation technique. The cepstral coefficients and transitional coefficients are combined to form the observation vector for DTW. Compared with DTW without cepstral stress compensation, DTW combined with cepstral stress compensation improves the recognition rate at the cost of a small increase in computation: from 33% to 67% in the shout style, from 51% to 84% in the slow style, from 40% to 80% in the loud style, and from 52% to 70% in the soft style. The algorithm was tested on a limited number of speakers because of our limited database.
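As a rough illustration of the DTW matching step described in the abstract, the following minimal Python sketch builds observation vectors by appending simple first-order differences to each cepstral frame, then assigns a test utterance to the speaker whose template has the smallest DTW distance. It is a sketch under stated assumptions, not the authors' implementation: the function names (`add_deltas`, `dtw_distance`, `identify`), the Euclidean frame distance, the symmetric step pattern, and the central-difference form of the transitional coefficients are all hypothetical choices. The cepstral stress compensation step is not specified in the abstract and is deliberately left out.

```python
import math


def add_deltas(frames):
    """Append central-difference deltas to each frame.

    Assumption: this is a simple stand-in for the "transitional
    coefficients" mentioned in the abstract; the paper may compute
    them differently.
    """
    n = len(frames)
    observations = []
    for t in range(n):
        prev_frame = frames[max(t - 1, 0)]      # clamp at the start
        next_frame = frames[min(t + 1, n - 1)]  # clamp at the end
        delta = [(b - a) / 2.0 for a, b in zip(prev_frame, next_frame)]
        observations.append(list(frames[t]) + delta)
    return observations


def frame_distance(x, y):
    """Euclidean distance between two observation vectors (an assumption)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dtw_distance(seq_a, seq_b):
    """Accumulated DTW cost with the basic symmetric step pattern."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Allowed predecessors: insertion, deletion, diagonal match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def identify(test_frames, templates):
    """Return the speaker label whose template is closest under DTW.

    `templates` maps a (hypothetical) speaker label to a list of
    cepstral frames for the same word as the test utterance.
    """
    test_obs = add_deltas(test_frames)
    return min(
        templates,
        key=lambda speaker: dtw_distance(test_obs, add_deltas(templates[speaker])),
    )


# Toy one-dimensional "cepstral" frames, invented purely for illustration.
templates = {
    "speaker_a": [[0.0], [1.0], [2.0]],
    "speaker_b": [[5.0], [5.0], [5.0]],
}
slow_test = [[0.0], [0.0], [1.0], [2.0]]  # a time-stretched version of speaker_a
best_match = identify(slow_test, templates)
```

In this toy case the test utterance is a slower rendition of the first template, so the warping absorbs the extra frame and the first speaker is selected. Any stress compensation would be applied to the cepstral frames before this matching step.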