Neural Machine Translation Enhancements through Lexical Semantic Network

Quang-Phuoc Nguyen, Anh-Dung Vo, Joon-Choul Shin, Cheolyoung Ock
Proceedings of the 10th International Conference on Computer Modeling and Simulation, published 2018-01-08
DOI: 10.1145/3177457.3177461 (https://doi.org/10.1145/3177457.3177461)
Citations: 2

Abstract

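In most languages, many words have multiple senses, so machine translation systems must choose among several candidates representing different senses of an input word. Although neural machine translation has recently become a dominant paradigm and has made great progress, it still faces the challenge of word sense disambiguation. Neural machine translation models are trained to identify the correct sense of a word as part of an end-to-end translation task, and their performance on word sense disambiguation is not satisfactory. This paper presents a case study of machine translation for the Korean language. We have manually built a Korean lexical semantic network, UWordMap, as a large-scale lexical semantic knowledge base in which each sense of every polysemous word is associated with a sense-code that constitutes a network node. Then, based on UWordMap, we determine the correct sense and tag the appropriate sense-code for polysemous words in the training corpus before training neural machine translation models. Experiments on translation from Korean to English and Vietnamese show that UWordMap can significantly improve the quality of Korean neural machine translation systems in terms of BLEU and TER scores.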
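
The preprocessing idea in the abstract, tagging each polysemous word in the training corpus with a sense-code before training, can be illustrated with a minimal sketch. This is not the authors' method and does not use UWordMap: the toy inventory, the sense codes "01" and "02", the cue words, the `word_code` tag format, and the function name `tag_senses` are all hypothetical, and a simple context-overlap heuristic stands in for the actual disambiguation step. Input sentences are assumed to be already segmented into tokens.

```python
from typing import Dict, List, Set

# Hypothetical toy inventory: surface form -> {sense code -> context cue words}.
# The codes and cue words are invented for illustration; they are NOT taken
# from UWordMap.
SENSE_INVENTORY: Dict[str, Dict[str, Set[str]]] = {
    "배": {
        "01": {"바다", "항구", "타다"},  # "ship"
        "02": {"과일", "먹다", "달다"},  # "pear"
    },
}


def tag_senses(tokens: List[str]) -> List[str]:
    """Append a sense code to each polysemous token, e.g. "배" -> "배_01"."""
    context = set(tokens)
    tagged = []
    for token in tokens:
        senses = SENSE_INVENTORY.get(token)
        if senses is None:
            tagged.append(token)  # not in the inventory: leave unchanged
            continue
        # Pick the sense whose cue words overlap most with the sentence;
        # ties resolve to the lowest sense code.
        best = max(sorted(senses), key=lambda code: len(senses[code] & context))
        tagged.append(f"{token}_{best}")
    return tagged


print(tag_senses(["항구", "에서", "배", "를", "타다"]))
print(tag_senses(["과일", "가게", "에서", "배", "를", "먹다"]))
```

After tagging, the two senses of the same surface form become distinct vocabulary items, so a translation model trained on the tagged corpus can learn a separate representation for each.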