{"title":"基于多分类器的语音识别错误鲁棒性语音分类","authors":"Takeshi Homma, Kazuaki Shima, Takuya Matsumoto","doi":"10.1109/SLT.2016.7846291","DOIUrl":null,"url":null,"abstract":"In order to achieve an utterance classifier that not only works robustly against speech recognition errors but also maintains high accuracy for input with no errors, we propose the following techniques. First, we propose a classifier training method in which not only error-free transcriptions but also recognized sentences with errors were used as training data. To maintain high accuracy whether or not input has recognition errors, we adjusted a scaling factor of the number of transcriptions for training data. Second, we introduced three classifiers that utilize different input features: words, phonemes, and words recovered from phonetic recognition errors. We also introduced a selection method that selects the most probable utterance class from outputs of multiple utterance classifiers using recognition results obtained from enhanced and non-enhanced speech signals. Experimental results showed our method cuts 55% of classification errors for speech recognition input while accuracy degradation rate for transcription input is 0.7%.","PeriodicalId":281635,"journal":{"name":"2016 IEEE Spoken Language Technology Workshop (SLT)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Robust utterance classification using multiple classifiers in the presence of speech recognition errors\",\"authors\":\"Takeshi Homma, Kazuaki Shima, Takuya Matsumoto\",\"doi\":\"10.1109/SLT.2016.7846291\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In order to achieve an utterance classifier that not only works robustly against speech recognition errors but also maintains high accuracy for input with no errors, we propose the following techniques. First, we propose a classifier training method in which not only error-free transcriptions but also recognized sentences with errors were used as training data. To maintain high accuracy whether or not input has recognition errors, we adjusted a scaling factor of the number of transcriptions for training data. Second, we introduced three classifiers that utilize different input features: words, phonemes, and words recovered from phonetic recognition errors. We also introduced a selection method that selects the most probable utterance class from outputs of multiple utterance classifiers using recognition results obtained from enhanced and non-enhanced speech signals. Experimental results showed our method cuts 55% of classification errors for speech recognition input while accuracy degradation rate for transcription input is 0.7%.\",\"PeriodicalId\":281635,\"journal\":{\"name\":\"2016 IEEE Spoken Language Technology Workshop (SLT)\",\"volume\":\"34 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 IEEE Spoken Language Technology Workshop (SLT)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SLT.2016.7846291\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Spoken Language Technology Workshop (SLT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SLT.2016.7846291","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Robust utterance classification using multiple classifiers in the presence of speech recognition errors
In order to achieve an utterance classifier that not only works robustly against speech recognition errors but also maintains high accuracy for input with no errors, we propose the following techniques. First, we propose a classifier training method in which not only error-free transcriptions but also recognized sentences with errors were used as training data. To maintain high accuracy whether or not input has recognition errors, we adjusted a scaling factor of the number of transcriptions for training data. Second, we introduced three classifiers that utilize different input features: words, phonemes, and words recovered from phonetic recognition errors. We also introduced a selection method that selects the most probable utterance class from outputs of multiple utterance classifiers using recognition results obtained from enhanced and non-enhanced speech signals. Experimental results showed our method cuts 55% of classification errors for speech recognition input while accuracy degradation rate for transcription input is 0.7%.