Some properties of Bayesian sensing hidden Markov models

2011 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2011-12-01. DOI: 10.1109/ASRU.2011.6163907

G. Saon, Jen-Tzung Chien


Citations: 9

Abstract

In Bayesian sensing hidden Markov models (BSHMMs), the acoustic feature vectors are represented by a set of state-dependent basis vectors and by time-dependent sensing weights. The Bayesian formulation comes from assuming state-dependent zero-mean Gaussian priors for the weights and from using marginal likelihood functions obtained by integrating out the weights. Here, we discuss two properties of BSHMMs. The first property is that the marginal likelihood is Gaussian with a factor-analyzed covariance matrix, in which the basis provides a low-rank correction to the diagonal covariance of the reconstruction errors. The second property, termed automatic relevance determination, provides a method for discarding basis vectors that are not relevant for encoding feature vectors. This allows model complexity control: one can initially train a large model and then prune it to a smaller size by removing the basis vectors that correspond to the largest precision values of the sensing weights. The second property proved useful in successfully deploying models trained on 1800 hours of data during the 2011 DARPA GALE Arabic broadcast news transcription evaluation.
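The two properties described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a single state whose features follow x = Phi w + eps, with a diagonal prior precision alpha on the weights w and a diagonal reconstruction-error variance noise_var; these names, the diagonal prior, and the function names are assumptions made for illustration. Under those assumptions, integrating out w gives a zero-mean Gaussian whose covariance is the diagonal error covariance plus a low-rank term built from the basis, and pruning keeps the basis vectors with the smallest precisions.

```python
import numpy as np


def marginal_covariance(Phi, alpha, noise_var):
    """Covariance of x = Phi @ w + eps after integrating out w.

    Assumed (hypothetical) notation:
      Phi       : (d, k) basis vectors as columns
      alpha     : (k,) prior precisions, w ~ N(0, diag(1 / alpha))
      noise_var : (d,) reconstruction-error variances, eps ~ N(0, diag(noise_var))
    """
    # Diagonal term plus low-rank correction: sum_j phi_j phi_j^T / alpha_j
    return np.diag(noise_var) + (Phi / alpha) @ Phi.T


def marginal_log_likelihood(x, Phi, alpha, noise_var):
    """Log density of x under the zero-mean Gaussian marginal."""
    cov = marginal_covariance(Phi, alpha, noise_var)
    d = x.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = x @ np.linalg.solve(cov, x)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + quad)


def prune_basis(Phi, alpha, keep):
    """Keep the `keep` basis vectors with the smallest precisions.

    A large precision pins the corresponding weight near zero, so that
    basis vector contributes little to the marginal covariance.
    """
    kept = np.sort(np.argsort(alpha)[:keep])  # preserve original column order
    return Phi[:, kept], alpha[kept]
```

In this sketch, a basis vector with a very large precision adds a term of order 1 / alpha to the covariance, so removing it leaves the marginal likelihood almost unchanged. That is the intuition behind training a large model first and pruning afterwards.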




