Multi-word Expressions in English-Latvian Machine Translation
I. Skadina
Balt. J. Mod. Comput., published 2016-12-19
DOI: 10.22364/BJMC.2016.4.4.14 (https://doi.org/10.22364/BJMC.2016.4.4.14)
The paper presents a series of experiments that aim to find the best method for treating multi-word expressions (MWEs) in machine translation. The methods were investigated in a statistical machine translation (SMT) framework for translation from English into Latvian. MWE candidates were extracted using pattern-based and statistical approaches. Different techniques for integrating MWEs into an SMT system are analysed. The best result, +0.59 BLEU points, was achieved by combining two phrase tables: a bilingual MWE dictionary and a phrase table created from the parallel corpus in which statistically extracted MWE candidates are treated as single tokens. When only a bilingual dictionary is used as an additional source of information, the best result (+0.36 BLEU points) is obtained by combining two phrase tables. For statistically obtained MWE lists, the best result (+0.51 BLEU points) is achieved with the largest list of MWE candidates.
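To illustrate what treating MWE candidates as single tokens can look like during corpus preprocessing, here is a minimal Python sketch. The abstract does not describe the matching procedure, so the greedy longest-match strategy, the underscore joiner, the function name `merge_mwes`, and the sample MWE list are all assumptions made for illustration, not the paper's implementation.

```python
def merge_mwes(tokens, mwes, joiner="_"):
    """Merge known MWEs into single tokens using greedy longest match.

    tokens: list of word strings for one sentence.
    mwes:   set of tuples, each tuple being the words of one MWE candidate.
    joiner: string placed between the words of a merged MWE (assumed).
    """
    max_len = max((len(m) for m in mwes), default=1)
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest possible span first, down to two words.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            if tuple(tokens[i:i + n]) in mwes:
                out.append(joiner.join(tokens[i:i + n]))
                i += n
                break
        else:
            # No MWE starts at this position; keep the word unchanged.
            out.append(tokens[i])
            i += 1
    return out


# Hypothetical MWE candidate list, for illustration only.
mwes = {("machine", "translation"), ("phrase", "table")}
merged = merge_mwes("the phrase table for machine translation".split(), mwes)
print(merged)
```

Applying such a step to the training corpus before phrase extraction would make each merged MWE behave as one unit in the resulting phrase table; the same step would then need to be applied to the input at translation time.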