Language Modelling Approaches for Turkish Large Vocabulary Continuous Speech Recognition Based on Lattice Rescoring
Authors: E. Arisoy, M. Saraçlar
Published in: 2006 IEEE 14th Signal Processing and Communications Applications, 2006-04-17
DOI: 10.1109/SIU.2006.1659773 (https://doi.org/10.1109/SIU.2006.1659773)
In this paper, we investigate several language modelling approaches for large vocabulary continuous speech recognition (LVCSR) of Turkish. The agglutinative nature of Turkish makes it a challenging language for speech recognition, since it is impossible to include all possible words in the recognition lexicon. Therefore, instead of using words as recognition units, we use data-driven sub-word units called morphs. This method was previously applied to Finnish, Estonian and Turkish, and gave promising recognition results compared to words as recognition units. On our database, we obtained word error rates (WER) of 38.8% for the baseline word-based model and 33.9% for the baseline morph-based model. In addition, we tried some new methods. The recognition lattice outputs of each model were rescored with root-based and root-class-based models in the word-based case, and with a first-morph-based model in the morph-based case. The word-root composition approach achieves a 0.5% improvement in recognition performance. However, the other two approaches fail because of non-robust estimates over the baseline models.
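For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions and insertions) between the recognizer output and the reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the authors' scoring tool. It assumes whitespace tokenization and that any morph-based output has already been joined back into words; the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: both strings are whitespace-tokenized word sequences.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, one substituted word and one deleted word against a four-word reference give a WER of 0.5. Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.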