Using neural networks and LPCC to improve speech recognition
M. Zbancioc, M. Costin
Signals, Circuits and Systems, 2003. SCS 2003. International Symposium on. Published 2003-07-10. DOI: 10.1109/SCS.2003.1227085
Linear Predictive Coding (LPC), a powerful speech analysis technique, is very useful for encoding speech at a low bit rate and provides extremely accurate estimates of speech parameters. It is based on the assumption that the speech signal is produced by a buzzer at the end of a tube: the glottis produces the buzz, characterized by its intensity and frequency, and the vocal tract forms the tube, characterized by its resonance frequencies (formants), according to Calliope (1989). The model is very efficient for vocalic regions. It is less efficient for transient, unvoiced, or non-stationary regions, according to Rabiner and Juang (1993). A Radial Basis Function network is able to recognize, with a satisfactory recognition rate, a set of phonemes pronounced by different speakers, using LPC coefficient sets as input.
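
The abstract does not give the analysis settings, so the following is only a minimal sketch of the kind of pipeline it describes. It assumes the standard autocorrelation method with the Levinson-Durbin recursion for the LPC step, the usual LPC-to-cepstrum recursion for the LPCC named in the title, and a Gaussian RBF layer whose output weights are fitted by least squares. The function names (`lpc`, `lpc_to_cepstrum`, `rbf_features`, `train_rbf`, `predict_rbf`), the sign convention A(z) = 1 + sum_k a[k] z^-k, and the use of fixed, pre-chosen centers are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def lpc(frame, order):
    """LPC by the autocorrelation method (Levinson-Durbin recursion).

    Returns (a, err): a[0] == 1 and A(z) = 1 + sum_k a[k] z^-k is the
    prediction-error filter; err is the residual energy.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    # Autocorrelation for lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:  # silent or degenerate frame: stop early
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a, err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model 1/A(z)."""
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = 0.0
        for k in range(max(1, n - p), n):
            acc += k * c[k] * a[n - k]
        c[n] = -(a[n] if n <= p else 0.0) - acc / n
    return c[1:]

def rbf_features(X, centers, width):
    """Gaussian activation of every hidden unit for every input vector."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def train_rbf(X, labels, centers, width, n_classes):
    """Fit linear output weights by least squares (assumed training rule)."""
    phi = rbf_features(X, centers, width)
    targets = np.eye(n_classes)[labels]  # one-hot phoneme targets
    weights, *_ = np.linalg.lstsq(phi, targets, rcond=None)
    return weights

def predict_rbf(X, centers, width, weights):
    """Index of the phoneme class with the largest output activation."""
    return (rbf_features(X, centers, width) @ weights).argmax(axis=1)
```

For example, `lpc(frame * np.hamming(len(frame)), 12)` would give a 12th-order coefficient set for one windowed frame; the order 12 and the Hamming window are common choices in speech analysis, not values taken from the paper. The resulting coefficient vectors (or their cepstra) would then form the rows of `X` passed to `train_rbf`.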