Coupled Training of Sequence-to-Sequence Models for Accented Speech Recognition

Vinit Unni, Nitish Joshi, P. Jyothi
Published in: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 8254-8258
Publication date: 2020-05-01
DOI: 10.1109/ICASSP40776.2020.9052912
URL: https://doi.org/10.1109/ICASSP40776.2020.9052912
Citation count: 3

Abstract

Accented speech poses significant challenges for state-of-the-art automatic speech recognition (ASR) systems. Accent is a property of speech that lasts throughout an utterance in varying degrees of strength. This makes it hard to isolate the influence of accent on individual speech sounds. We propose coupled training for encoder-decoder ASR models that acts on pairs of utterances corresponding to the same text spoken by speakers with different accents. This training regime introduces an L2 loss between the attention-weighted representations corresponding to pairs of utterances with the same text, thus acting as a regularizer and encouraging representations from the encoder to be more accent-invariant. We focus on recognizing accented English samples from the Mozilla Common Voice corpus. We obtain significant error rate reductions on accented samples from a large set of diverse accents using coupled training. We also show consistent improvements in performance on heavily accented samples (as determined by a standalone accent classifier).
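The coupling term described in the abstract can be illustrated with a minimal sketch. The abstract states only that an L2 loss is applied between attention-weighted representations of two utterances with the same text. Everything else here is an assumption made for illustration: the function names, the averaging over decoder steps, the use of the squared distance, the `weight` hyperparameter, and the way the term is added to the two recognition losses. It is not the authors' implementation.

```python
def context_vectors(attention, encoder_states):
    """Attention-weighted sums of encoder states, one per decoder step.

    attention: T x N weights (one row per decoder step).
    encoder_states: N x D encoder outputs for one utterance.
    """
    dim = len(encoder_states[0])
    return [
        [sum(w * h[d] for w, h in zip(row, encoder_states)) for d in range(dim)]
        for row in attention
    ]


def coupling_l2(contexts_a, contexts_b):
    """Squared L2 distance between paired context vectors, averaged over steps."""
    if len(contexts_a) != len(contexts_b):
        # With teacher forcing on the same text, both utterances are assumed
        # to yield the same number of decoder steps.
        raise ValueError("paired utterances must have equal decoder steps")
    total = sum(
        (x - y) ** 2
        for ca, cb in zip(contexts_a, contexts_b)
        for x, y in zip(ca, cb)
    )
    return total / len(contexts_a)


def coupled_loss(ce_a, ce_b, contexts_a, contexts_b, weight=1.0):
    """Sum of both recognition losses plus the weighted coupling term (assumed form)."""
    return ce_a + ce_b + weight * coupling_l2(contexts_a, contexts_b)


# Two utterances of the same text: different numbers of encoder frames (2 vs 3),
# same number of decoder steps (2).
attn_a = [[1.0, 0.0], [0.0, 1.0]]
enc_a = [[1.0, 2.0], [3.0, 4.0]]
attn_b = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]
enc_b = [[1.0, 2.0], [1.0, 2.0], [3.0, 6.0]]

ctx_a = context_vectors(attn_a, enc_a)
ctx_b = context_vectors(attn_b, enc_b)
penalty = coupling_l2(ctx_a, ctx_b)
total = coupled_loss(1.0, 2.0, ctx_a, ctx_b, weight=0.5)
```

Because the comparison is made on per-step context vectors rather than raw encoder frames, the two utterances may differ in duration. Only the decoder step count, which is fixed by the shared transcript, has to match.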