B. Kusumoputro, A. Triyanto, M. I. Fanany, W. Jatmiko
Proceedings Fourth International Conference on Computational Intelligence and Multimedia Applications. ICCIMA 2001. Published 2001-10-30. DOI: 10.1109/ICCIMA.2001.970480 (https://doi.org/10.1109/ICCIMA.2001.970480)
Speaker identification in noisy environment using bispectrum analysis and probabilistic neural network
This paper describes the application of neural processing to extract bispectrum features from speech data, and the use of a probabilistic neural network (PNN) as the classifier in an automatic speech recognition system. Power spectrum analysis was the usual feature extraction paradigm in the early development of speech recognition systems; however, its recognition rate is not high enough, especially when Gaussian noise is added to the utterance speech data. In this paper, we develop a speaker identification system using bispectrum feature analysis. To analyse the distribution of the bispectrum data over its two-dimensional representation, we developed an adaptive feature extraction mechanism for bispectrum speech data based on a cascade neural network. A cascade configuration of SOFM (Self-Organizing Feature Map) and LVQ (Learning Vector Quantization) is used as an adaptive codebook generation algorithm to determine the feature distribution of the bispectrum speech data. The K-L transformation (K-LT) is then applied as a preprocessing step before the neural classifier. The K-LT is shown to be an effective procedure for orthogonalization and dimensionality reduction of the codebook vectors generated from the bispectrum data. Experimental results show that our system achieves a high recognition rate on undirected utterance speech, especially when a larger number of codebook vectors is used. They also show that the PNN increases the recognition rate significantly, even for speech data with added Gaussian noise.
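The classification stage named in the abstract can be illustrated with a minimal sketch of a probabilistic neural network in its textbook Parzen-window form. This is not the authors' implementation: the function name `pnn_classify`, the kernel width `sigma`, and the toy two-dimensional vectors (standing in for K-LT-reduced codebook vectors) are all assumptions made for illustration.

```python
import math


def pnn_classify(train, x, sigma=0.5):
    """Classify x with a textbook PNN (Gaussian Parzen-window classifier).

    train: dict mapping a class label to a list of feature vectors.
    sigma: Gaussian kernel width (hypothetical value, chosen for the toy data).
    Returns (best_label, scores), where scores maps each label to the
    average kernel response of that class.
    """
    scores = {}
    for label, patterns in train.items():
        total = 0.0
        for p in patterns:
            # Pattern layer: Gaussian kernel centred on each stored vector.
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
            total += math.exp(-sq_dist / (2.0 * sigma ** 2))
        # Summation layer: average the kernel responses of one class.
        # The Gaussian normalising constant is omitted because it is the
        # same for every class when sigma and the dimension are shared.
        scores[label] = total / len(patterns)
    # Decision layer: choose the class with the largest estimated density.
    best = max(scores, key=scores.get)
    return best, scores


# Toy data standing in for reduced codebook vectors of two speakers.
train = {
    "speaker_a": [[0.0, 0.1], [0.2, -0.1], [-0.1, 0.0]],
    "speaker_b": [[2.0, 2.1], [1.9, 1.8], [2.2, 2.0]],
}
label, scores = pnn_classify(train, [0.1, 0.0])
print(label, scores)
```

In this form the network needs no iterative training: every training vector becomes a pattern unit, and `sigma` is the only parameter to tune. A smaller `sigma` makes the decision depend more on the nearest stored vectors, while a larger one smooths the class density estimates.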