An unsupervised vocabulary selection technique for Chinese automatic speech recognition
Authors: Yike Zhang, Pengyuan Zhang, Ta Li, Yonghong Yan
Venue: 2016 IEEE Spoken Language Technology Workshop (SLT)
Publication date: 2016-12-01
DOI: 10.1109/SLT.2016.7846298 (https://doi.org/10.1109/SLT.2016.7846298)
Citations: 2
Abstract
The vocabulary is a vital component of automatic speech recognition (ASR) systems. For a specific Chinese speech recognition task, using a large general vocabulary not only lengthens decoding time but also hurts recognition accuracy. In this paper, we propose an unsupervised algorithm that selects task-specific words from a large general vocabulary. The out-of-vocabulary (OOV) rate is a common measure of vocabulary quality and is related to recognition accuracy. However, the OOV rate is hard to compute for a Chinese vocabulary, since OOV words are often segmented into single Chinese characters and most Chinese vocabularies contain all the single Chinese characters. To address this problem, we propose a novel method to estimate the OOV rate of Chinese vocabularies. In experiments, we find that our estimated OOV rate is related to the character error rate (CER) of recognition. Compared with the general vocabulary and a frequency-based vocabulary selection method, our proposed method yields the lowest OOV rate and CER on two Chinese conversational telephone speech (CTS) evaluation sets. In addition, it significantly reduces the size of the language model (LM) and the corresponding weighted finite-state transducer (WFST) network, which leads to more efficient decoding.
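The difficulty with the OOV rate described in the abstract can be illustrated with a small sketch. This is not the estimation method proposed in the paper, which the abstract does not specify. It only shows why the conventional token-level OOV rate becomes uninformative once a vocabulary contains every single character. The toy vocabulary, the sample text, and the helper names `oov_rate` and `greedy_segment` are all assumptions made for illustration, and forward maximum matching stands in for whatever word segmenter a real system would use.

```python
def oov_rate(tokens, vocabulary):
    """Conventional OOV rate: fraction of running tokens not in the vocabulary."""
    if not tokens:
        return 0.0
    missing = sum(1 for tok in tokens if tok not in vocabulary)
    return missing / len(tokens)


def greedy_segment(text, vocabulary, max_len=4):
    """Forward maximum matching (an assumed, simplistic segmenter).

    At each position take the longest vocabulary word; if nothing matches,
    fall back to emitting a single character.
    """
    tokens, i = [], 0
    while i < len(text):
        for length in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + length]
            if length == 1 or piece in vocabulary:
                tokens.append(piece)
                i += length
                break
    return tokens


# Hypothetical toy data: the word "识别" is missing from the vocabulary,
# but every single character is present, as in most Chinese vocabularies.
text = "语音识别系统"
reference_words = ["语音", "识别", "系统"]
vocabulary = {"语音", "系统"} | set(text)

# Against a hand-segmented reference, one of three words is OOV.
reference_oov = oov_rate(reference_words, vocabulary)

# Segmenting with the vocabulary itself splits the missing word into
# single characters, so no token is ever counted as OOV.
segmented = greedy_segment(text, vocabulary)
naive_oov = oov_rate(segmented, vocabulary)

print(segmented, reference_oov, naive_oov)
```

In this toy case the reference-based rate is one third, while the rate measured on the automatically segmented text is zero, because the missing word has been split into in-vocabulary characters. This is the gap that an estimated OOV rate for Chinese vocabularies has to close.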