Letz Translate: Low-Resource Machine Translation for Luxembourgish
Yewei Song, Saad Ezzini, Jacques Klein, Tegawendé F. Bissyandé, C. Lefebvre, A. Goujon
{"title":"Letz Translate:卢森堡语的低资源机器翻译","authors":"Yewei Song, Saad Ezzini, Jacques Klein, Tegawendé F. Bissyandé, C. Lefebvre, A. Goujon","doi":"10.1109/ICNLP58431.2023.00036","DOIUrl":null,"url":null,"abstract":"Natural language processing of Low-Resource Languages (LRL) is often challenged by the lack of data. Therefore, achieving accurate machine translation (MT) in a low-resource environment is a real problem that requires practical solutions. Research in multilingual models have shown that some LRLs can be handled with such models. However, their large size and computational needs make their use in constrained environments (e.g., mobile/IoT devices or limited/old servers) impractical. In this paper, we address this problem by leveraging the power of large multilingual MT models using knowledge distillation. Knowledge distillation can transfer knowledge from a large and complex teacher model to a simpler and smaller student model without losing much in performance. We also make use of high-resource languages that are related or share the same linguistic root as the target LRL. For our evaluation, we consider Luxembourgish as the LRL that shares some roots and properties with German. We build multiple resource-efficient models based on German, knowledge distillation from the multilingual No Language Left Behind (NLLB) model, and pseudo-translation. We find that our efficient models are more than 30% faster and perform only 4% lower compared to the large state-of-the-art NLLB model.","PeriodicalId":53637,"journal":{"name":"Icon","volume":"55 1","pages":"165-170"},"PeriodicalIF":0.0000,"publicationDate":"2023-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Letz Translate: Low-Resource Machine Translation for Luxembourgish\",\"authors\":\"Yewei Song, Saad Ezzini, Jacques Klein, Tegawendé F. Bissyandé, C. Lefebvre, A. Goujon\",\"doi\":\"10.1109/ICNLP58431.2023.00036\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Natural language processing of Low-Resource Languages (LRL) is often challenged by the lack of data. Therefore, achieving accurate machine translation (MT) in a low-resource environment is a real problem that requires practical solutions. Research in multilingual models have shown that some LRLs can be handled with such models. However, their large size and computational needs make their use in constrained environments (e.g., mobile/IoT devices or limited/old servers) impractical. In this paper, we address this problem by leveraging the power of large multilingual MT models using knowledge distillation. Knowledge distillation can transfer knowledge from a large and complex teacher model to a simpler and smaller student model without losing much in performance. We also make use of high-resource languages that are related or share the same linguistic root as the target LRL. For our evaluation, we consider Luxembourgish as the LRL that shares some roots and properties with German. We build multiple resource-efficient models based on German, knowledge distillation from the multilingual No Language Left Behind (NLLB) model, and pseudo-translation. 
We find that our efficient models are more than 30% faster and perform only 4% lower compared to the large state-of-the-art NLLB model.\",\"PeriodicalId\":53637,\"journal\":{\"name\":\"Icon\",\"volume\":\"55 1\",\"pages\":\"165-170\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-03-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Icon\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICNLP58431.2023.00036\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Arts and Humanities\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Icon","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICNLP58431.2023.00036","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Arts and Humanities","Score":null,"Total":0}
Citations: 0
Abstract
Natural language processing for Low-Resource Languages (LRLs) is often hampered by the lack of data. Achieving accurate machine translation (MT) in a low-resource setting is therefore a real problem that requires practical solutions. Research on multilingual models has shown that some LRLs can be handled by such models. However, their large size and computational requirements make them impractical in constrained environments (e.g., mobile/IoT devices or limited/old servers). In this paper, we address this problem by leveraging the power of large multilingual MT models through knowledge distillation. Knowledge distillation transfers knowledge from a large, complex teacher model to a simpler, smaller student model without losing much performance. We also make use of high-resource languages that are related to, or share the same linguistic root as, the target LRL. For our evaluation, we consider Luxembourgish as the LRL, which shares some roots and properties with German. We build multiple resource-efficient models based on German, knowledge distillation from the multilingual No Language Left Behind (NLLB) model, and pseudo-translation. We find that our efficient models are more than 30% faster and perform only 4% worse than the large state-of-the-art NLLB model.
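To illustrate the kind of pipeline the abstract describes, the following is a minimal sketch of sequence-level knowledge distillation: a large NLLB teacher produces pseudo-translations that would then serve as training data for a much smaller student model. The teacher checkpoint, language codes, and the tiny in-memory corpus here are assumptions for illustration, not the paper's actual configuration.

```python
# Sketch: use a large NLLB teacher to generate pseudo-parallel data for a small student.
# Assumed checkpoint and language codes; requires the Hugging Face `transformers` library.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

TEACHER = "facebook/nllb-200-distilled-600M"   # assumed teacher checkpoint
SRC_LANG, TGT_LANG = "eng_Latn", "ltz_Latn"    # FLORES-200 codes: English -> Luxembourgish

tokenizer = AutoTokenizer.from_pretrained(TEACHER, src_lang=SRC_LANG)
teacher = AutoModelForSeq2SeqLM.from_pretrained(TEACHER)

def pseudo_translate(sentences, max_length=128):
    """Let the teacher translate monolingual source text; the outputs become
    the pseudo-translation targets used to train a small student model."""
    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    generated = teacher.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(TGT_LANG),
        max_length=max_length,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)

# Hypothetical monolingual source sentences (placeholder data).
source = [
    "The castle is open to visitors every day.",
    "Public transport in Luxembourg is free of charge.",
]
pseudo_targets = pseudo_translate(source)

# The (source, pseudo_target) pairs would then form the training set for a compact
# seq2seq student, which is the standard sequence-level distillation recipe.
for s, t in zip(source, pseudo_targets):
    print(f"{s}  ->  {t}")
```

The same generation step can also be pointed at a related high-resource language (e.g., German, deu_Latn) to exploit the linguistic proximity the abstract mentions.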