Usage of the Speech Disfluency Detection Method for the Machine Translation of the Transcriptions of Spoken Language

A. Kramov, S. Pogorilyy
Journal: NaUKMA Research Papers. Computer Science
Publication type: Journal Article
Publication date: 2023-02-24
DOI: 10.18523/2617-3808.2022.5.54-61
URL: https://doi.org/10.18523/2617-3808.2022.5.54-61

Abstract

Neural machine translation is a natural language processing task. Although many research papers address the quality of machine translation of documents, translating spoken language that contains elements of speech disfluency remains an open problem, especially for low-resource languages such as Ukrainian. This paper considers the neural machine translation of spoken-language transcriptions that contain different elements of speech disfluency, for translation from English to Ukrainian. Different methods and software libraries for detecting elements of speech disfluency in English texts are analyzed. Because open-access corpora of speech disfluency samples are lacking, a new synthetic labeled corpus has been created. The corpus contains both the original version of each document and a version modified according to different types of speech disfluency: filler words (uh, ah, etc.), filler phrases (you know, I mean), and reparandum-repair pairs (cases in which speakers correct themselves while speaking). The effectiveness of using a speech disfluency detection method to improve the machine translation of spoken language has been verified experimentally for the English-Ukrainian language pair. The experiments show that current state-of-the-art neural translation models cannot translate elements of speech disfluency appropriately, especially in reparandum-repair cases. The results indicate that the speech disfluency detection method may be used to preprocess transcriptions of spoken dialogues so that different neural machine translation models can produce coherent translations.
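
The abstract does not say how the synthetic corpus was built, so the following is only a minimal sketch of the general idea, not the authors' procedure. It inserts filler words and phrases in front of randomly chosen tokens, simulates the simplest kind of reparandum-repair pair as a repeated word, and labels every token as fluent or disfluent so that flagged tokens can be dropped before translation. The function names, the label scheme ("F"/"D"), and the rate parameters are all hypothetical.

```python
import random

# Filler items taken from the examples in the abstract.
FILLER_WORDS = ["uh", "ah"]
FILLER_PHRASES = ["you know", "I mean"]


def inject_disfluencies(tokens, rng, filler_rate=0.2, repair_rate=0.1):
    """Return (noisy_tokens, labels) for a fluent token list.

    Labels are hypothetical: "F" marks an original (fluent) token,
    "D" marks an inserted disfluent token.
    """
    noisy, labels = [], []
    for tok in tokens:
        r = rng.random()
        if r < filler_rate:
            # Insert a filler word or phrase before the token.
            filler = rng.choice(FILLER_WORDS + FILLER_PHRASES).split()
            noisy.extend(filler)
            labels.extend(["D"] * len(filler))
        elif r < filler_rate + repair_rate:
            # Simplest reparandum-repair case: the speaker says the word,
            # then repeats it. The first copy is the reparandum.
            noisy.append(tok)
            labels.append("D")
        noisy.append(tok)
        labels.append("F")
    return noisy, labels


def strip_disfluencies(tokens, labels):
    """Keep only tokens labeled fluent, e.g. before sending text to an MT model."""
    return [tok for tok, label in zip(tokens, labels) if label == "F"]


if __name__ == "__main__":
    rng = random.Random(0)
    original = "we will meet on Friday at noon".split()
    noisy, labels = inject_disfluencies(original, rng)
    print(" ".join(noisy))
    print(" ".join(strip_disfluencies(noisy, labels)))
```

In a real pipeline the labels would come from a trained disfluency detector rather than from the generator itself; the synthetic labels here only serve as ground truth for evaluating such a detector.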