Effective pseudo-relevance feedback for language modeling in speech recognition

2013 IEEE Workshop on Automatic Speech Recognition and Understanding. Pub Date: 2013-12-01. DOI: 10.1109/ASRU.2013.6707698

Berlin Chen, Yi-Wen Chen, Kuan-Yu Chen, E. Jan



Abstract

Language modeling (LM) is an integral part of any automatic speech recognition (ASR) system: it helps to constrain the acoustic analysis, guide the search through multiple candidate word strings, and quantify the acceptability of the final output hypothesis given an input utterance. Although the n-gram model remains the predominant one, a number of novel LM methods have been developed to complement or replace it. A more recent line of research leverages information cues gleaned from pseudo-relevance feedback (PRF) to derive an utterance-regularized language model that complements the n-gram model. This paper continues this line of research, and its main contribution is twofold. First, we explore an alternative and more efficient formulation for constructing such an utterance-regularized language model for ASR. Second, the utilities of various utterance-regularized language models are analyzed and compared extensively. Experiments on a large vocabulary continuous speech recognition (LVCSR) task demonstrate that our proposed language models can offer substantial improvements over the baseline n-gram system and achieve performance competitive with, or better than, some state-of-the-art language models.
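The abstract does not give the paper's formulation, so the following is only a generic sketch of the PRF idea it describes, not the authors' method. It assumes a first-pass hypothesis is used as a query, documents are ranked by a Jelinek-Mercer-smoothed query likelihood, a relevance-model-style unigram is estimated from the top-ranked documents, and that unigram is linearly interpolated with an n-gram probability. The toy corpus, the function names, and the parameters `alpha`, `top_k`, and `lam` are all hypothetical.

```python
import math
from collections import Counter

def unigram(tokens):
    """Maximum-likelihood unigram distribution of a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def query_log_likelihood(query, doc_model, background, alpha=0.5):
    """log P(query | doc), smoothing the document model with the background."""
    return sum(
        math.log(alpha * doc_model.get(w, 0.0) + (1 - alpha) * background.get(w, 1e-9))
        for w in query
    )

def prf_unigram(hypothesis, docs, top_k=2, alpha=0.5):
    """Unigram model estimated from the top_k documents retrieved for the hypothesis."""
    background = unigram([w for d in docs for w in d])
    models = [unigram(d) for d in docs]
    scores = [query_log_likelihood(hypothesis, m, background, alpha) for m in models]
    top = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:top_k]
    # Convert log-likelihoods into normalized document weights.
    max_score = max(scores[i] for i in top)
    weights = {i: math.exp(scores[i] - max_score) for i in top}
    z = sum(weights.values())
    model = Counter()
    for i in top:
        for w, p in models[i].items():
            model[w] += (weights[i] / z) * p
    return dict(model)

def interpolate(p_ngram, p_prf, lam=0.7):
    """Linear interpolation of an n-gram probability with the PRF unigram."""
    return lam * p_ngram + (1 - lam) * p_prf

# Toy corpus and first-pass hypothesis (illustrative data only).
docs = [
    "stocks fell as the market closed lower".split(),
    "the market rallied and stocks rose".split(),
    "the team won the game last night".split(),
]
hypothesis = "stocks market".split()
prf = prf_unigram(hypothesis, docs, top_k=2)
adapted = interpolate(0.01, prf.get("stocks", 0.0))
```

In this toy setting the sports document is not retrieved, so words such as "game" get no mass in the PRF unigram, while words on the hypothesis topic such as "stocks" get a higher interpolated probability than the assumed n-gram value of 0.01.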


