Multilingual BLSTM and speaker-specific vector adaptation in 2016 but babel system

M. Karafiát, M. Baskar, P. Matejka, Karel Veselý, F. Grézl, J. Černocký
{"title":"Multilingual BLSTM and speaker-specific vector adaptation in 2016 but babel system","authors":"M. Karafiát, M. Baskar, P. Matejka, Karel Veselý, F. Grézl, J. Černocký","doi":"10.1109/SLT.2016.7846330","DOIUrl":null,"url":null,"abstract":"This paper provides an extensive summary of BUT 2016 system for the last IARPA Babel evaluations. It concentrates on multi-lingual training of both deep neural network (DNN)-based feature extraction and acoustic models including multilingual training of bidirectional Long Short Term memory networks. Next, two low-dimensional vector approaches to speaker adaptation are investigated: i-vectors and sequence-summarizing neural networks (SSNN). The results provided on three Babel Year 4 languages show clear advantage of both approaches in case limited amount of training data is available. The time necessary for the development of a new system is addressed too, as some of the investigated techniques do not require extensive re-training of the whole system.","PeriodicalId":281635,"journal":{"name":"2016 IEEE Spoken Language Technology Workshop (SLT)","volume":"62 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"23","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE Spoken Language Technology Workshop (SLT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SLT.2016.7846330","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 23

Abstract

This paper provides an extensive summary of BUT 2016 system for the last IARPA Babel evaluations. It concentrates on multi-lingual training of both deep neural network (DNN)-based feature extraction and acoustic models including multilingual training of bidirectional Long Short Term memory networks. Next, two low-dimensional vector approaches to speaker adaptation are investigated: i-vectors and sequence-summarizing neural networks (SSNN). The results provided on three Babel Year 4 languages show clear advantage of both approaches in case limited amount of training data is available. The time necessary for the development of a new system is addressed too, as some of the investigated techniques do not require extensive re-training of the whole system.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
2016年多语言BLSTM和特定说话人的载体适应,但巴别塔系统
本文对上届IARPA巴别塔评估的BUT 2016系统进行了广泛的总结。主要研究基于深度神经网络(DNN)的特征提取和声学模型的多语言训练,包括双向长短期记忆网络的多语言训练。接下来,研究了两种低维向量方法:i向量和序列汇总神经网络(SSNN)。对三种Babel四年级语文的结果表明,在训练数据有限的情况下,这两种方法都有明显的优势。开发新系统所需的时间也得到了解决,因为所研究的一些技术不需要对整个系统进行广泛的重新培训。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Further optimisations of constant Q cepstral processing for integrated utterance and text-dependent speaker verification Learning dialogue dynamics with the method of moments A study of speech distortion conditions in real scenarios for speech processing applications Comparing speaker independent and speaker adapted classification for word prominence detection Influence of corpus size and content on the perceptual quality of a unit selection MaryTTS voice
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1