Tied posteriors: an approach for effective introduction of context dependency in hybrid NN/HMM LVCSR

2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100). Pub Date: 2000-06-05. DOI: 10.1109/ICASSP.2000.861800

J. Rottland, G. Rigoll


Citations: 34

Abstract

This paper presents a method to improve the recognition rate of hybrid connectionist/HMM speech recognition systems. The approach also allows context-dependent models to be introduced easily into the hybrid framework. It builds on a standard hybrid connectionist/HMM recognizer, in which the neural networks are trained to estimate the a posteriori probabilities of all phones for each input frame. In the approach presented here, the output probabilities of the neural networks replace the codebook of a tied-mixture HMM system; the resulting system is therefore called a tied-posterior system. This structure has two advantages: an arbitrary HMM topology can be used, and all context-dependency and clustering techniques used in tied-mixture systems can be applied to the hybrid recognizer. The approach was evaluated on the Wall Street Journal (WSJ) database, where it outperforms the standard hybrid approach.
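The abstract gives no formulas, so the following is only a minimal sketch of one plausible reading of the idea. It assumes that each HMM state's emission score is a weighted sum of the network's phone posteriors, each divided by the phone prior (the usual scaled likelihood in hybrid systems). The function names, mixture weights, and toy numbers are all hypothetical and are not taken from the paper.

```python
def scaled_likelihoods(posteriors, priors):
    """Divide NN phone posteriors P(k|x) by phone priors P(k).

    Assumption: this is the usual hybrid-system scaling; the abstract
    itself does not state it.
    """
    return [p / q for p, q in zip(posteriors, priors)]


def tied_posterior_emission(weights, posteriors, priors):
    """Emission score of one HMM state: sum_k c_k * P(k|x) / P(k).

    `weights` plays the role of tied-mixture weights; the shared
    "codebook" is the set of NN outputs instead of Gaussian densities.
    """
    scaled = scaled_likelihoods(posteriors, priors)
    return sum(c * s for c, s in zip(weights, scaled))


# Hypothetical toy numbers: 3 phone classes, one input frame.
posteriors = [0.7, 0.2, 0.1]  # NN output for the frame (sums to 1)
priors = [0.5, 0.3, 0.2]      # assumed phone priors

# Standard hybrid: a state is tied to exactly one phone (one-hot weights).
hybrid_state = [1.0, 0.0, 0.0]
# Tied posterior: a context-dependent state with its own mixture weights.
cd_state = [0.8, 0.15, 0.05]

hybrid_score = tied_posterior_emission(hybrid_state, posteriors, priors)
cd_score = tied_posterior_emission(cd_state, posteriors, priors)
print(hybrid_score, cd_score)
```

Under this reading, the standard hybrid recognizer is the special case in which every state has a one-hot weight vector. Context-dependent or clustered states then differ only in their weight vectors, which is what would let tied-mixture techniques carry over.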


