Enhancing Machine Translation by Integrating Linguistic Knowledge in the Word Alignment Module
Safae Berrichi, A. Mazroui
2020 International Conference on Intelligent Systems and Computer Vision (ISCV), 2020-06-01
DOI: 10.1109/ISCV49265.2020.9204328
The word alignment process, a critical step in statistical machine translation (SMT) systems, has been suggested by several researchers as a promising way to improve neural machine translation (NMT) performance in low-resource settings. Moreover, because the morphological richness and complexity of Arabic relative to English degrade English/Arabic machine translation quality, this study assesses the relevance of integrating morphosyntactic features during the alignment phase. We enriched parallel corpora with morphosyntactic features such as stems, lemmas, roots, and POS tags, and then built new SMT systems, each embedding one of these features in the word alignment phase. The test results confirmed the value of using these features and identified the morphosyntactic information most relevant to the translation system.
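The abstract does not say how a feature is embedded in the alignment phase. One plausible reading, sketched here in Python as an assumption and not as the authors' implementation, is to run the aligner on a feature view of each sentence (for example lemmas in place of surface words) and then reuse the resulting index links on the surface tokens, which works because the projection keeps token order and count unchanged. The token layout, the names `project` and `apply_alignment`, the hand-written annotations, and the alignment links are all hypothetical illustrations; no real morphological analyzer or aligner is called.

```python
from typing import Dict, List, Tuple

# Hypothetical annotation layout: one dict per token with its surface form
# and morphosyntactic features. Values are hand-written for illustration only.
Token = Dict[str, str]

AR: List[Token] = [
    {"surface": "darasa", "lemma": "darasa", "root": "drs", "pos": "VERB"},
    {"surface": "al-kutub", "lemma": "kitab", "root": "ktb", "pos": "NOUN"},
]
EN: List[Token] = [
    {"surface": "he", "lemma": "he", "pos": "PRON"},
    {"surface": "studied", "lemma": "study", "pos": "VERB"},
    {"surface": "the", "lemma": "the", "pos": "DET"},
    {"surface": "books", "lemma": "book", "pos": "NOUN"},
]


def project(sentence: List[Token], feature: str) -> List[str]:
    """Return one feature value per token, preserving order and token count."""
    return [tok[feature] for tok in sentence]


def apply_alignment(
    src: List[Token], tgt: List[Token], links: List[Tuple[int, int]]
) -> List[Tuple[str, str]]:
    """Map index links computed on a feature view back to surface word pairs."""
    return [(src[i]["surface"], tgt[j]["surface"]) for i, j in links]


# Feature views that would be handed to an external word aligner.
ar_view = project(AR, "lemma")
en_view = project(EN, "lemma")

# Assumed aligner output on the lemma view, as (source index, target index).
links = [(0, 1), (1, 3)]
surface_pairs = apply_alignment(AR, EN, links)
```

Because the links are positional, swapping `"lemma"` for `"root"` or `"pos"` on the Arabic side changes only what the aligner sees, which is one way to compare features one at a time as the study describes.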