NMT Enhancement based on Knowledge Graph Mining with Pre-trained Language Model

Hao Yang, Ying Qin, Yao Deng, Minghan Wang
{"title":"NMT Enhancement based on Knowledge Graph Mining with Pre-trained Language Model","authors":"Hao Yang, Ying Qin, Yao Deng, Minghan Wang","doi":"10.23919/ICACT48636.2020.9061292","DOIUrl":null,"url":null,"abstract":"Pre-trained language models like Bert, RoBERTa, GPT, etc. have achieved SOTA effects on multiple NLP tasks (e.g. sentiment classification, information extraction, event extraction, etc.). We propose a simple method based on knowledge graph to improve the quality of machine translation. First, we propose a multi-task learning model that learns subjects, objects, and predicates at the same time. Second, we treat different predicates as different fields, and improve the recognition ability of NMT models in different fields through classification labels. Finally, beam search combined with L2R, R2L rearranges results through entities. Based on the CWMT2018 experimental data, using the predicate's domain classification identifier, the BLUE score increased from 33.58% to 37.63%, and through L2R, R2L rearrangement, the BLEU score increased to 39.25%, overall improvement is more than 5 percentage","PeriodicalId":296763,"journal":{"name":"2020 22nd International Conference on Advanced Communication Technology (ICACT)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 22nd International Conference on Advanced Communication Technology (ICACT)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.23919/ICACT48636.2020.9061292","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Pre-trained language models such as BERT, RoBERTa, and GPT have achieved state-of-the-art (SOTA) results on many NLP tasks (e.g., sentiment classification, information extraction, and event extraction). We propose a simple knowledge-graph-based method to improve the quality of machine translation. First, we propose a multi-task learning model that learns subjects, objects, and predicates simultaneously. Second, we treat different predicates as different domains and use classification labels to improve the NMT model's ability to discriminate between domains. Finally, beam search combined with left-to-right (L2R) and right-to-left (R2L) decoding reranks the results using entities. On the CWMT2018 experimental data, the predicate-based domain classification label raises the BLEU score from 33.58% to 37.63%, and the L2R/R2L reranking raises it further to 39.25%, an overall improvement of more than 5 BLEU points.
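
The predicate-as-domain idea described in the abstract amounts to prepending a classification label to each source sentence so the NMT model can condition on it. Below is a minimal sketch of that tagging step, assuming mined knowledge-graph triples are available as (subject, predicate, object) string tuples; the `<dom:...>` tag format and all helper names are illustrative assumptions, not the paper's implementation.

```python
from collections import Counter

def domain_tag_for(sentence, triples):
    """Derive a domain tag from the predicates of triples matched in the sentence."""
    predicates = [p for (s, p, o) in triples if s in sentence or o in sentence]
    if not predicates:
        return "<dom:general>"
    # Majority vote: each predicate is treated as its own "domain" (field).
    top_predicate, _ = Counter(predicates).most_common(1)[0]
    return f"<dom:{top_predicate}>"

def tag_source(sentence, triples):
    """Prepend the classification label so the NMT model can condition on it."""
    return f"{domain_tag_for(sentence, triples)} {sentence}"

# Example:
# tag_source("Aspirin relieves headache.", [("Aspirin", "medicine", "headache")])
# -> "<dom:medicine> Aspirin relieves headache."
```

The final step merges the L2R and R2L beam-search outputs and prefers hypotheses that preserve the mined entities. A hypothetical rescoring sketch follows; the interpolation weight `alpha` and the (hypothesis, model_score) list format are assumptions for illustration, not details given in the paper.

```python
def entity_coverage(hypothesis, entities):
    """Fraction of knowledge-graph entities that survive in the translation."""
    if not entities:
        return 0.0
    return sum(e in hypothesis for e in entities) / len(entities)

def rerank(l2r_nbest, r2l_nbest, entities, alpha=0.5):
    """Merge n-best lists of (hypothesis, model_score) pairs from both decoding
    directions and pick the hypothesis with the best mix of model score and
    entity coverage."""
    merged = l2r_nbest + r2l_nbest
    return max(
        merged,
        key=lambda h: (1 - alpha) * h[1] + alpha * entity_coverage(h[0], entities),
    )
```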