Authors: M. Paulraj, S. Yaacob, A. Nazri, Sathees Kumar
Venue: 2009 5th International Colloquium on Signal Processing & Its Applications
Published: 2009-03-06
DOI: 10.1109/CSPA.2009.5069189
Classification of vowel sounds using MFCC and feed forward Neural Network
English as spoken by Malaysians varies from place to place and differs between ethnic communities and their sub-groups. A dedicated speech-to-text translation system is therefore needed to handle Malaysian English pronunciation. Speech translation combines speech recognition with the translation of the recognized phonemes into words, and speech recognition is the process of identifying phonemes in a speech segment. This paper proposes an initial step toward speech recognition: identifying phoneme features. Mel-frequency cepstral coefficients (MFCCs) are computed to characterize the phoneme features, and a simple feed-forward neural network (FFNN) trained by backpropagation is proposed to identify them. The extracted MFCCs are used as input to the neural network classifier, which assigns each input to one of 11 classes.
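The pipeline described in the abstract (MFCC extraction followed by a backpropagation-trained feed-forward classifier with 11 outputs) can be illustrated with a minimal sketch. It is not the authors' implementation. Only the 11 output classes come from the abstract; the 8 kHz sampling rate, 256-sample frame, 20 mel filters, 12 coefficients, 8 tanh hidden units, softmax output with cross-entropy loss, feature normalization, and the synthetic two-tone frame are all assumptions chosen for illustration. The code uses only the Python standard library so that it runs anywhere; a practical system would more likely rely on NumPy or a speech-processing library.

```python
import cmath
import math
import random

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Apply a Hamming window, then return |DFT|^2 / N for bins 0..N/2."""
    n = len(frame)
    w = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
         for i, x in enumerate(frame)]
    return [abs(sum(w[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2 / n
            for k in range(n // 2 + 1)]

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising slope
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling slope
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    """MFCCs of one frame: log mel filterbank energies followed by a DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_e = [math.log(max(sum(h * p for h, p in zip(row, spec)), 1e-10))
             for row in bank]
    # Keep coefficients 1..n_coeffs; dropping c0 is an assumption of this sketch.
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_e))
            for c in range(1, n_coeffs + 1)]

def normalize(v):
    """Scale a feature vector to zero mean and unit variance (assumed step)."""
    mean = sum(v) / len(v)
    std = math.sqrt(sum((a - mean) ** 2 for a in v) / len(v)) or 1.0
    return [(a - mean) / std for a in v]

class FFNN:
    """One tanh hidden layer and a softmax output, trained by backpropagation."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = random.Random(seed)
        # The last weight in each row is the bias.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                   for _ in range(n_out)]

    def forward(self, x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x + [1.0])))
             for row in self.w1]
        z = [sum(w * v for w, v in zip(row, h + [1.0])) for row in self.w2]
        top = max(z)                      # subtract max for numerical stability
        e = [math.exp(v - top) for v in z]
        total = sum(e)
        return h, [v / total for v in e]

    def train_step(self, x, label, lr=0.1):
        """One gradient step on cross-entropy loss; returns the loss before it."""
        h, p = self.forward(x)
        d_out = [p[k] - (1.0 if k == label else 0.0) for k in range(len(p))]
        d_hid = [(1.0 - h[j] ** 2) *
                 sum(d_out[k] * self.w2[k][j] for k in range(len(p)))
                 for j in range(len(h))]
        for k, row in enumerate(self.w2):
            for j, v in enumerate(h + [1.0]):
                row[j] -= lr * d_out[k] * v
        for j, row in enumerate(self.w1):
            for i, v in enumerate(x + [1.0]):
                row[i] -= lr * d_hid[j] * v
        return -math.log(p[label])

def synthetic_frame(f1, f2, sample_rate=8000, n=256):
    """Hypothetical stand-in for a voiced frame: two formant-like tones."""
    return [math.sin(2 * math.pi * f1 * t / sample_rate)
            + 0.5 * math.sin(2 * math.pi * f2 * t / sample_rate)
            for t in range(n)]
```

As a usage example, `normalize(mfcc(synthetic_frame(700.0, 1200.0), 8000))` yields a 12-element feature vector. Passing it to `FFNN(12, 8, 11).forward(...)` returns the hidden activations and 11 class probabilities, and repeated calls to `train_step` with a class index fit the network to that example. Real recordings would be split into overlapping frames, with one feature vector per frame or per utterance.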