Real-time translation of English speech through speech feature extraction

Pub Date : 2024-05-27 DOI:10.1007/s10015-024-00951-w
Xiaoyan Lei
{"title":"Real-time translation of English speech through speech feature extraction","authors":"Xiaoyan Lei","doi":"10.1007/s10015-024-00951-w","DOIUrl":null,"url":null,"abstract":"<div><p>Real-time English speech translation is useful in numerous situations, including business and travel. The goal of this research is to improve real-time English speech translation efficacy. Initially, filter bank (FBank) features were extracted from English speech. Subsequently, an enhanced Transformer model was introduced, incorporating a causal convolution module in the front end of the encoder to capture English speech features with location information. The performance of the optimized model in translating English speech to different target languages was tested using the MuST-C dataset. The results revealed differences in translation results for different target languages using the improved Transformer. The highest bilingual evaluation understudy (BLEU) score was observed for Spanish text at 20.84, while Russian text obtained the lowest score of 10.56. The average BLEU score was 18.51, with an average lag time delay of 1202.33 ms. Compared to the conventional Transformer model, the improved model exhibited higher BLEU scores, lower time delay, and optimal performance when utilizing a convolutional kernel size of 3 × 3. The results demonstrate the dependability of the improved Transformer model in real-time English speech translation, highlighting its practical usefulness.</p></div>","PeriodicalId":0,"journal":{"name":"","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2024-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.1007/s10015-024-00951-w","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Real-time English speech translation is useful in numerous situations, including business and travel. The goal of this research is to improve real-time English speech translation efficacy. Initially, filter bank (FBank) features were extracted from English speech. Subsequently, an enhanced Transformer model was introduced, incorporating a causal convolution module in the front end of the encoder to capture English speech features with location information. The performance of the optimized model in translating English speech to different target languages was tested using the MuST-C dataset. The results revealed differences in translation results for different target languages using the improved Transformer. The highest bilingual evaluation understudy (BLEU) score was observed for Spanish text at 20.84, while Russian text obtained the lowest score of 10.56. The average BLEU score was 18.51, with an average lag time delay of 1202.33 ms. Compared to the conventional Transformer model, the improved model exhibited higher BLEU scores, lower time delay, and optimal performance when utilizing a convolutional kernel size of 3 × 3. The results demonstrate the dependability of the improved Transformer model in real-time English speech translation, highlighting its practical usefulness.

Abstract Image

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
通过语音特征提取实现英语语音的实时翻译
实时英语语音翻译在商务和旅行等多种场合都非常有用。本研究的目标是提高实时英语语音翻译的效率。最初,从英语语音中提取滤波器库(FBank)特征。随后,引入了增强型变换器模型,在编码器前端加入了因果卷积模块,以捕捉带有位置信息的英语语音特征。我们使用 MuST-C 数据集测试了优化模型将英语语音翻译成不同目标语言的性能。结果显示,使用改进后的转换器,不同目标语言的翻译结果存在差异。西班牙语文本的双语评估劣等(BLEU)得分最高,为 20.84 分,而俄语文本的得分最低,为 10.56 分。平均 BLEU 得分为 18.51,平均延迟时间为 1202.33 毫秒。与传统的 Transformer 模型相比,改进后的模型在使用 3 × 3 的卷积核大小时,显示出更高的 BLEU 分数、更低的时延和最佳性能。这些结果证明了改进的 Transformer 模型在实时英语语音翻译中的可靠性,突出了其实用性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1