Multiresolution CNN for reverberant speech recognition
Authors: Sunchan Park, Yongwon Jeong, H. S. Kim
Published in: 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA)
Publication date: 2017-11-01
DOI: 10.1109/ICSDA.2017.8384470 (https://doi.org/10.1109/ICSDA.2017.8384470)
Citations: 18

Abstract

Deep neural network (DNN) acoustic models have greatly improved the performance of automatic speech recognition (ASR). However, DNN-based systems still perform poorly in reverberant environments. Convolutional neural network (CNN) acoustic models have shown lower word error rates (WER) than fully connected DNN acoustic models in distant speech recognition. To improve reverberant speech recognition with CNN acoustic models, we propose a multiresolution CNN with two separate streams: one takes a wideband feature with a wide context window, and the other takes a narrowband feature with a narrow context window. Experiments on the ASR task of the REVERB challenge 2014 showed that the proposed multiresolution CNN-based approach reduced the WER by 8.79% and 8.83% on the simulated test data and the real-condition test data, respectively, compared with the conventional CNN-based method.
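
The two-stream input described in the abstract can be illustrated with a minimal sketch of context-window splicing. The abstract does not give feature dimensions, context sizes, edge handling, or how the streams are merged inside the network, so the function names (`splice`, `two_stream_inputs`), the default window sizes, and the repeat-the-edge-frame padding below are all assumptions for illustration, not the authors' configuration.

```python
from typing import List, Tuple

Frame = List[float]


def splice(frames: List[Frame], context: int) -> List[Frame]:
    """Concatenate each frame with `context` neighbours on each side.

    Edge frames are padded by repeating the first/last frame
    (an assumed convention; the abstract does not specify one).
    """
    n = len(frames)
    spliced: List[Frame] = []
    for t in range(n):
        window: Frame = []
        for offset in range(-context, context + 1):
            idx = min(max(t + offset, 0), n - 1)  # clamp index to [0, n-1]
            window.extend(frames[idx])
        spliced.append(window)
    return spliced


def two_stream_inputs(
    wideband: List[Frame],
    narrowband: List[Frame],
    wide_context: int = 8,    # hypothetical value
    narrow_context: int = 2,  # hypothetical value
) -> Tuple[List[Frame], List[Frame]]:
    """Build per-frame inputs for the two streams.

    The wideband features get the wide context window and the
    narrowband features get the narrow one, as in the abstract.
    """
    if len(wideband) != len(narrowband):
        raise ValueError("both feature sequences must cover the same frames")
    return splice(wideband, wide_context), splice(narrowband, narrow_context)


# Toy example: 5 frames, 2-dim "wideband" and 4-dim "narrowband" features.
wide = [[float(t), float(t) + 0.5] for t in range(5)]
narrow = [[float(t)] * 4 for t in range(5)]
wide_in, narrow_in = two_stream_inputs(wide, narrow, wide_context=2, narrow_context=1)
```

In this toy setup each wide-stream input has 2 × 5 = 10 values and each narrow-stream input has 4 × 3 = 12 values. A real system would feed each spliced stream to its own convolutional layers before combining them, which this sketch leaves out.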