改进的拼接及其扩展到非立体声数据的噪声鲁棒语音识别

2013 IEEE Workshop on Automatic Speech Recognition and Understanding Pub Date : 2013-07-15 DOI:10.1109/ASRU.2013.6707725

D. S. P. Kumar, N. Prasad, Vikas Joshi, S. Umesh

{"title":"改进的拼接及其扩展到非立体声数据的噪声鲁棒语音识别","authors":"D. S. P. Kumar, N. Prasad, Vikas Joshi, S. Umesh","doi":"10.1109/ASRU.2013.6707725","DOIUrl":null,"url":null,"abstract":"In this paper, a modification to the training process of the popular SPLICE algorithm has been proposed for noise robust speech recognition. The modification is based on feature correlations, and enables this stereo-based algorithm to improve the performance in all noise conditions, especially in unseen cases. Further, the modified framework is extended to work for non-stereo datasets where clean and noisy training utterances, but not stereo counterparts, are required. Finally, an MLLR-based computationally efficient run-time noise adaptation method in SPLICE framework has been proposed. The modified SPLICE shows 8.6% absolute improvement over SPLICE in Test C of Aurora-2 database, and 2.93% overall. Non-stereo method shows 10.37% and 6.93% absolute improvements over Aurora-2 and Aurora-4 baseline models respectively. Run-time adaptation shows 9.89% absolute improvement in modified framework as compared to SPLICE for Test C, and 4.96% overall w.r.t. standard MLLR adaptation on HMMs.","PeriodicalId":265258,"journal":{"name":"2013 IEEE Workshop on Automatic Speech Recognition and Understanding","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2013-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Modified splice and its extension to non-stereo data for noise robust speech recognition\",\"authors\":\"D. S. P. Kumar, N. Prasad, Vikas Joshi, S. Umesh\",\"doi\":\"10.1109/ASRU.2013.6707725\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, a modification to the training process of the popular SPLICE algorithm has been proposed for noise robust speech recognition. The modification is based on feature correlations, and enables this stereo-based algorithm to improve the performance in all noise conditions, especially in unseen cases. Further, the modified framework is extended to work for non-stereo datasets where clean and noisy training utterances, but not stereo counterparts, are required. Finally, an MLLR-based computationally efficient run-time noise adaptation method in SPLICE framework has been proposed. The modified SPLICE shows 8.6% absolute improvement over SPLICE in Test C of Aurora-2 database, and 2.93% overall. Non-stereo method shows 10.37% and 6.93% absolute improvements over Aurora-2 and Aurora-4 baseline models respectively. Run-time adaptation shows 9.89% absolute improvement in modified framework as compared to SPLICE for Test C, and 4.96% overall w.r.t. standard MLLR adaptation on HMMs.\",\"PeriodicalId\":265258,\"journal\":{\"name\":\"2013 IEEE Workshop on Automatic Speech Recognition and Understanding\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-07-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 IEEE Workshop on Automatic Speech Recognition and Understanding\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU.2013.6707725\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 IEEE Workshop on Automatic Speech Recognition and Understanding","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2013.6707725","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

本文对流行的SPLICE算法的训练过程进行了改进，用于噪声鲁棒语音识别。这种改进是基于特征相关性的，使这种基于立体的算法能够提高在所有噪声条件下的性能，特别是在看不见的情况下。此外，改进的框架被扩展到非立体数据集，其中需要干净和嘈杂的训练话语，但不需要立体对口。最后，在SPLICE框架下提出了一种基于mllr的高效运行时噪声自适应方法。改进后的SPLICE在Aurora-2数据库的C测试中比SPLICE的绝对性能提高了8.6%，总体性能提高了2.93%。与Aurora-2和Aurora-4基线模型相比，非立体方法的绝对改进率分别为10.37%和6.93%。与SPLICE测试C相比，修改后的框架的运行时适应性提高了9.89%，在hmm上的w.r.t.标准MLLR适应性提高了4.96%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Modified splice and its extension to non-stereo data for noise robust speech recognition

In this paper, a modification to the training process of the popular SPLICE algorithm has been proposed for noise robust speech recognition. The modification is based on feature correlations, and enables this stereo-based algorithm to improve the performance in all noise conditions, especially in unseen cases. Further, the modified framework is extended to work for non-stereo datasets where clean and noisy training utterances, but not stereo counterparts, are required. Finally, an MLLR-based computationally efficient run-time noise adaptation method in SPLICE framework has been proposed. The modified SPLICE shows 8.6% absolute improvement over SPLICE in Test C of Aurora-2 database, and 2.93% overall. Non-stereo method shows 10.37% and 6.93% absolute improvements over Aurora-2 and Aurora-4 baseline models respectively. Run-time adaptation shows 9.89% absolute improvement in modified framework as compared to SPLICE for Test C, and 4.96% overall w.r.t. standard MLLR adaptation on HMMs.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2013 IEEE Workshop on Automatic Speech Recognition and Understanding

自引率

0.00%

发文量