Transfer Learning of Transformer-based Speech Recognition Models from Czech to Slovak

Jan Lehecka, J. Psutka, J. Psutka

DOI: 10.48550/arXiv.2306.04399
URL: https://doi.org/10.48550/arXiv.2306.04399
Venue: International Conference on Text, Speech and Dialogue
Publication date: 2023-06-07

Abstract

In this paper, we compare several methods of training Slovak speech recognition models based on the Transformer architecture. Specifically, we explore transfer learning from an existing Czech pre-trained Wav2Vec 2.0 model to Slovak. We demonstrate the benefits of the proposed approach on three Slovak datasets. Our Slovak models achieved the best results when their weights were initialized from the Czech model at the beginning of the pre-training phase. Our results show that the knowledge stored in the Czech pre-trained model can be successfully reused to solve tasks in Slovak, outperforming even much larger public multilingual models.
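
The weight initialization mentioned in the abstract can be illustrated with a toy warm start. The sketch below is not the authors' code: it uses plain Python lists in place of real Wav2Vec 2.0 tensors, and the parameter names (`encoder.layer0.weight`, `output.weight`), the helper names (`shape_of`, `warm_start`), and the differing output sizes are all hypothetical. It only shows the general idea of copying every parameter whose name and shape match from a source-language model into a target-language model, leaving the rest at their fresh initialization.

```python
import copy


def shape_of(value):
    """Return the nested-list shape of a toy parameter, e.g. a 2x2 list -> (2, 2)."""
    shape = []
    while isinstance(value, list):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)


def warm_start(target_params, source_params):
    """Copy source parameters into the target where name and shape both match.

    Returns the names that were copied. Parameters that do not match keep
    whatever values the target already had.
    """
    copied = []
    for name, value in source_params.items():
        if name in target_params and shape_of(target_params[name]) == shape_of(value):
            target_params[name] = copy.deepcopy(value)
            copied.append(name)
    return copied


# Hypothetical "Czech" source model: a shared layer plus a 3-row output layer.
czech = {
    "encoder.layer0.weight": [[0.1, 0.2], [0.3, 0.4]],
    "output.weight": [[0.5, 0.6] for _ in range(3)],
}

# Hypothetical "Slovak" target model: same shared layer, 4-row output layer.
slovak = {
    "encoder.layer0.weight": [[0.0, 0.0], [0.0, 0.0]],
    "output.weight": [[0.0, 0.0] for _ in range(4)],
}

copied = warm_start(slovak, czech)
print(copied)
```

In this toy setup only the shared layer is transferred, because the two output layers differ in shape. With a real framework the same idea is usually expressed by loading a checkpoint into the new model before training starts.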