The Research on Morpheme-Based Mongolian-Chinese Neural Machine Translation
Authors: Siriguleng Wang, Wuyuntana
Venue: 2019 IEEE 2nd International Conference on Information and Computer Technologies (ICICT)
Publication date: 2019-03-01
DOI: 10.1109/INFOCT.2019.8710936 (https://doi.org/10.1109/INFOCT.2019.8710936)
In view of the rich morphology of the Mongolian language and the limited vocabulary of neural machine translation, this paper first segments Mongolian words at two granularities: segmentation of separated morphological suffixes and segmentation of ligature morphological suffixes. For Chinese, we use word segmentation and word division. We then study morpheme-based Mongolian-Chinese end-to-end neural machine translation within a framework that combines a bidirectional encoder with an attention-based decoder. The experimental results show that segmenting Mongolian words effectively addresses the data sparsity of Mongolian, and that the morpheme-based Mongolian-Chinese neural machine translation model improves translation quality. The best NIST and BLEU scores of the morpheme-based Mongolian-Chinese neural machine translation results reached 9.4216 and 0.6320, respectively.
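The link between suffix segmentation and data sparsity can be illustrated with a toy sketch. It is not the authors' segmenter: the suffix list, the `segment` function, and the romanized word forms are all invented for illustration, and a greedy longest-suffix match is assumed where the paper's actual procedure is not described in the abstract. The sketch only shows how splitting suffixes from stems lets many surface forms share a smaller set of vocabulary entries.

```python
# Toy illustration only. The suffixes and word forms below are invented
# romanized examples, not real Mongolian data and not taken from the paper.
HYPOTHETICAL_SUFFIXES = ("aas", "iin", "d")


def segment(word, suffixes=HYPOTHETICAL_SUFFIXES, min_stem=2):
    """Split at most one trailing suffix from a word (greedy, longest first).

    The suffix is returned with a leading '+' so it can be told apart
    from a stem that happens to have the same spelling.
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return [word[: -len(suffix)], "+" + suffix]
    return [word]


# Eight surface forms built from two stems and three suffixes.
corpus = ["ger", "geriin", "gerd", "geraas", "nom", "nomiin", "nomd", "nomaas"]

word_vocab = set(corpus)
morph_vocab = {unit for word in corpus for unit in segment(word)}

print(segment("geriin"))                  # ['ger', '+iin']
print(len(word_vocab), len(morph_vocab))  # 8 5
```

In this toy corpus the word-level vocabulary has eight entries while the segmented vocabulary has five, and the gap widens as more stems combine with the same suffixes. That is the sparsity effect the abstract refers to; the actual reduction on real data depends on the corpus and the segmentation scheme used.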