NMT Enhancement based on Knowledge Graph Mining with Pre-trained Language Model
Hao Yang, Ying Qin, Yao Deng, Minghan Wang
2020 22nd International Conference on Advanced Communication Technology (ICACT), published 2020-02-01
DOI: 10.23919/ICACT48636.2020.9061292 (https://doi.org/10.23919/ICACT48636.2020.9061292)
Citations: 1
Abstract
Pre-trained language models such as BERT, RoBERTa, and GPT have achieved state-of-the-art results on multiple NLP tasks (e.g., sentiment classification, information extraction, and event extraction). We propose a simple knowledge-graph-based method to improve the quality of machine translation. First, we propose a multi-task learning model that learns subjects, objects, and predicates at the same time. Second, we treat different predicates as different domains and use classification labels to improve the NMT model's ability to recognize each domain. Finally, we combine beam search with left-to-right (L2R) and right-to-left (R2L) decoding and rerank the results using entities. On the CWMT2018 experimental data, using the predicate's domain classification identifier raised the BLEU score from 33.58% to 37.63%, and L2R/R2L reranking raised it further to 39.25%, an overall improvement of more than 5 percentage points.
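The abstract does not spell out how the domain labels are attached or how the L2R and R2L results are combined, so the following is only a minimal sketch under stated assumptions. It assumes the domain label is prepended to the source sentence as a tag token, and that candidates from beam search are reranked by the sum of their L2R and R2L model scores plus a bonus for covering expected entities. The names `Candidate`, `tag_source`, `entity_coverage`, `rerank`, and the `entity_weight` parameter are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    text: str         # candidate translation produced by beam search
    l2r_score: float  # log-probability under a left-to-right model (assumed)
    r2l_score: float  # log-probability under a right-to-left model (assumed)


def tag_source(sentence: str, domain: str) -> str:
    """Prepend a hypothetical domain token derived from the predicate class."""
    return f"<{domain}> {sentence}"


def entity_coverage(text: str, entities: Sequence[str]) -> float:
    """Fraction of the expected target-side entities found in the candidate."""
    if not entities:
        return 0.0
    return sum(entity in text for entity in entities) / len(entities)


def rerank(
    candidates: Sequence[Candidate],
    entities: Sequence[str],
    entity_weight: float = 1.0,
) -> List[Candidate]:
    """Sort candidates, best first, by L2R + R2L score plus an entity bonus.

    The additive combination is an assumption made for illustration; the
    paper may weight or combine these signals differently.
    """
    def score(candidate: Candidate) -> float:
        bonus = entity_weight * entity_coverage(candidate.text, entities)
        return candidate.l2r_score + candidate.r2l_score + bonus

    return sorted(candidates, key=score, reverse=True)


if __name__ == "__main__":
    print(tag_source("He was born in Beijing", "birthplace"))
    n_best = [
        Candidate("It is the capital", -1.0, -1.4),
        Candidate("Beijing is the capital of China", -1.2, -1.5),
    ]
    for candidate in rerank(n_best, ["Beijing", "China"]):
        print(candidate.text)
```

With `entity_weight=0.0` the ranking falls back to the model scores alone, which makes it easy to compare the effect of the entity term on a given n-best list.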