Semi-supervised bootstrapping approach for neural network feature extractor training

2013 IEEE Workshop on Automatic Speech Recognition and Understanding. Pub Date: 2013-12-01. DOI: 10.1109/ASRU.2013.6707775

F. Grézl, M. Karafiát

Citations: 54

Abstract

This paper presents a bootstrapping approach for neural network training. The neural networks serve as a bottle-neck feature extractor for a subsequent GMM-HMM recognizer. The recognizer is also used to transcribe untranscribed data and to assign confidence scores to it. Based on the confidence, segments are selected and mixed with the supervised data, and new NNs are trained. With this approach, it is possible to recover 40-55% of the difference between partially and fully transcribed data (a 3 to 5% absolute improvement over an NN trained on supervised data only). Using the 70-85% of automatically transcribed segments with the highest confidence was found optimal for achieving this result.
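
The selection step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Segment` structure, the function names, and the default `keep_fraction` of 0.8 (a value inside the reported 70-85% range) are assumptions made for illustration, and the abstract does not specify how confidence is computed or how the cutoff is applied. The `recovered_fraction` helper shows one plausible reading of "recovering a share of the difference between partially and fully transcribed data", assuming a higher-is-better score such as accuracy.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One speech segment with a transcript (hypothetical structure)."""
    utt_id: str
    transcript: str
    confidence: float = 1.0  # recognizer confidence; 1.0 assumed for manual transcripts


def select_confident(auto_segments: List[Segment], keep_fraction: float) -> List[Segment]:
    """Keep the given fraction of automatically transcribed segments,
    taking those with the highest confidence first."""
    if not 0.0 <= keep_fraction <= 1.0:
        raise ValueError("keep_fraction must be in [0, 1]")
    ranked = sorted(auto_segments, key=lambda s: s.confidence, reverse=True)
    n_keep = int(len(ranked) * keep_fraction)
    return ranked[:n_keep]


def build_training_set(supervised: List[Segment],
                       auto_segments: List[Segment],
                       keep_fraction: float = 0.8) -> List[Segment]:
    """Mix manually transcribed data with the selected automatic segments.
    The default of 0.8 is an assumption within the reported 70-85% range."""
    return list(supervised) + select_confident(auto_segments, keep_fraction)


def recovered_fraction(supervised_only: float,
                       semi_supervised: float,
                       fully_transcribed: float) -> float:
    """Share of the gap between supervised-only and fully transcribed
    training that the semi-supervised system closes (higher score = better)."""
    gap = fully_transcribed - supervised_only
    if gap == 0:
        raise ValueError("no gap between the two reference systems")
    return (semi_supervised - supervised_only) / gap
```

In a bootstrapping loop, the mixed set returned by `build_training_set` would be used to train a new NN, whose bottle-neck features feed a retrained recognizer that can re-transcribe the untranscribed data for the next round.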


