Authors: H. Seddik, A. Rahmouni, M. Sayadi
Venue: First International Symposium on Control, Communications and Signal Processing, 2004
Published: 2004-09-27
DOI: https://doi.org/10.1109/ISCCSP.2004.1296479
Text independent speaker recognition using the Mel frequency cepstral coefficients and a neural network classifier
Modern speaker recognition applications require high accuracy with low complexity and simple computation. In this paper, we propose a new method for text-independent speaker recognition that uses the mean of the Mel frequency cepstral coefficients (MFCC) as the speaker model. The MFCC are extracted from the speaker's phonemes in pre-segmented speech sentences. A multi-layer neural network trained with the back-propagation algorithm is proposed to classify these discriminative models. A study is carried out to assess the efficiency of these models. Several experiments show that the proposed method achieves a high speaker recognition rate. Furthermore, through these experiments, a technique is proposed to improve this recognition rate by appropriately selecting the phoneme database.
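The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: average the per-frame coefficient vectors of a segment into one model vector, then classify those vectors with a small multi-layer network trained by back-propagation. Everything specific here is an assumption made for illustration, not the authors' setup: the synthetic Gaussian frames that stand in for real MFCC frames, the 12-dimensional vectors, the single hidden layer of 8 sigmoid units, the softmax output, the learning rate, and the names `mean_vector`, `TinyMLP`, `make_segment`, and `run_demo`.

```python
import math
import random


def mean_vector(frames):
    """Average per-frame coefficient vectors into one fixed-length model vector."""
    n = len(frames)
    return [sum(frame[d] for frame in frames) / n for d in range(len(frames[0]))]


class TinyMLP:
    """One sigmoid hidden layer and a softmax output, trained by back-propagation."""

    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_out)]
        self.b2 = [0.0] * n_out

    def forward(self, x):
        # Hidden activations, then class probabilities (numerically stable softmax).
        h = [1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
             for row, b in zip(self.w1, self.b1)]
        z = [sum(w * hj for w, hj in zip(row, h)) + b for row, b in zip(self.w2, self.b2)]
        top = max(z)
        e = [math.exp(v - top) for v in z]
        total = sum(e)
        return h, [v / total for v in e]

    def train_step(self, x, label, lr):
        h, p = self.forward(x)
        # Output error for softmax with cross-entropy loss: p - one_hot(label).
        d_out = [pk - (1.0 if k == label else 0.0) for k, pk in enumerate(p)]
        # Propagate the error back through w2 and the sigmoid derivative
        # (computed before w2 is updated).
        d_hid = [hj * (1.0 - hj) * sum(d_out[k] * self.w2[k][j] for k in range(len(d_out)))
                 for j, hj in enumerate(h)]
        for k, dk in enumerate(d_out):
            for j, hj in enumerate(h):
                self.w2[k][j] -= lr * dk * hj
            self.b2[k] -= lr * dk
        for j, dj in enumerate(d_hid):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj

    def predict(self, x):
        _, p = self.forward(x)
        return p.index(max(p))


def make_segment(rng, centre, n_frames=20, noise=1.0):
    """Synthetic stand-in for the MFCC frames of one phoneme segment (assumption)."""
    return [[c + rng.gauss(0.0, noise) for c in centre] for _ in range(n_frames)]


def run_demo(seed=0, n_speakers=3, dim=12, epochs=40):
    """Train on mean vectors of synthetic segments; return held-out accuracy."""
    rng = random.Random(seed)
    # Hypothetical, well-separated "speaker" centres; real MFCC data would differ.
    centres = [[1.0 if d % n_speakers == s else -1.0 for d in range(dim)]
               for s in range(n_speakers)]
    train = [(mean_vector(make_segment(rng, centres[s])), s)
             for s in range(n_speakers) for _ in range(30)]
    test = [(mean_vector(make_segment(rng, centres[s])), s)
            for s in range(n_speakers) for _ in range(10)]
    net = TinyMLP(dim, 8, n_speakers, seed=seed)
    for _ in range(epochs):
        rng.shuffle(train)
        for x, y in train:
            net.train_step(x, y, lr=0.1)
    return sum(net.predict(x) == y for x, y in test) / len(test)


if __name__ == "__main__":
    print("held-out accuracy on synthetic data:", run_demo())
```

With real speech, the synthetic frames would be replaced by MFCC frames computed from each pre-segmented phoneme. The accuracy printed here reflects only the artificial data and says nothing about the recognition rates reported in the paper.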