A. Biswas, E. van der Westhuizen, T. Niesler, F. de Wet
Title: Improving ASR for Code-Switched Speech in Under-Resourced Languages Using Out-of-Domain Data
DOI: 10.21437/SLTU.2018-26
Venue: Workshop on Spoken Language Technologies for Under-resourced Languages
Published: 2018-08-29
Citations: 12
Abstract
We explore the use of out-of-domain monolingual data for the improvement of automatic speech recognition (ASR) of code-switched speech. This is relevant because annotated code-switched speech data is both scarce and very hard to produce, especially when the languages concerned are under-resourced, while monolingual corpora are generally better-resourced. We perform experiments using a recently-introduced small five-language corpus of code-switched South African soap opera speech. We consider specifically whether ASR of English–isiZulu code-switched speech can be improved by incorporating monolingual data from unrelated but larger corpora. TDNN-BLSTM acoustic models are trained using various configurations of training data. The utility of artificially-generated bilingual English–isiZulu text to augment language model training data is also explored. We find that English–isiZulu speech recognition accuracy can be improved by incorporating monolingual out-of-domain data despite the differences between the soap-opera and monolingual speech.
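The abstract mentions artificially-generated bilingual English–isiZulu text for language model augmentation. As a purely hypothetical illustration (not the authors' actual generation method, whose details are not given here), one simple way to synthesize code-switched text from paired monolingual sentences is to draw each word from one of the two languages and flip to the other language at random word boundaries:

```python
import random

def generate_code_switched(english_sents, zulu_sents, switch_prob=0.3, seed=0):
    """Illustrative sketch: splice words from paired English and isiZulu
    sentences, switching language at each position with probability
    switch_prob. A toy scheme, not the method used in the paper."""
    rng = random.Random(seed)
    synthetic = []
    for en, zu in zip(english_sents, zulu_sents):
        words = {"en": en.split(), "zu": zu.split()}
        lang = rng.choice(["en", "zu"])
        out = []
        # Walk position-by-position, taking the word from the current language
        for i in range(max(len(words["en"]), len(words["zu"]))):
            if i < len(words[lang]):
                out.append(words[lang][i])
            if rng.random() < switch_prob:
                lang = "zu" if lang == "en" else "en"
        synthetic.append(" ".join(out))
    return synthetic
```

Text produced this way could then be pooled with in-domain transcriptions to train an n-gram or neural language model, which is the general role such synthetic bilingual text plays in the experiments described above.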