Pub Date: 2023-07-21 DOI: 10.48550/arXiv.2307.11457
Zeynep Yirmibeşoğlu, Olgun Dursun, Harun Dalli, Mehmet Şahin, Ena Hodzik, Sabri Gürses, Tunga Güngör
Although machine translation systems are mostly designed to serve the general domain, there is a growing tendency to adapt these systems to other domains such as literary translation. In this paper, we focus on English-Turkish literary translation and develop machine translation models that take into account the stylistic features of translators. We fine-tune a pre-trained machine translation model on the manually aligned works of a particular translator. We analyse in detail the effects of manual and automatic alignments, data augmentation methods, and corpus size on the translations. We propose an approach based on stylistic features to evaluate the style of a translator in the output translations. We show that the human translator's style can be largely recreated in the target machine translations by adapting the models to the style of the translator.
Title: Incorporating Human Translator Style into English-Turkish Literary Machine Translation. European Association for Machine Translation Conferences/Workshops, 2023-07-21. https://doi.org/10.48550/arXiv.2307.11457
Pub Date: 2023-05-31 DOI: 10.48550/arXiv.2305.19757
Mălina Chichirău, Rik van Noord, Antonio Toral
We tackle the task of automatically discriminating between human and machine translations. As opposed to most previous work, we perform experiments in a multilingual setting, considering multiple languages and multilingual pretrained language models. We show that a classifier trained on parallel data with a single source language (in our case German–English) can still perform well on English translations that come from different source languages, even when the machine translations were produced by systems other than the one it was trained on. Additionally, we demonstrate that incorporating the source text in the input of a multilingual classifier improves (i) its accuracy and (ii) its robustness on cross-system evaluation, compared to a monolingual classifier. Furthermore, we find that using training data from multiple source languages (German, Russian and Chinese) tends to improve the accuracy of both monolingual and multilingual classifiers. Finally, we show that bilingual classifiers and classifiers trained on multiple source languages benefit from being trained on longer text sequences rather than on sentences.
Title: Automatic Discrimination of Human and Neural Machine Translation in Multilingual Scenarios. European Association for Machine Translation Conferences/Workshops, 2023-05-31. https://doi.org/10.48550/arXiv.2305.19757
Pub Date: 2023-05-04 DOI: 10.48550/arXiv.2305.03207
Sonal Sannigrahi, Rachel Bawden
Multilingual language models have shown impressive cross-lingual transfer ability across a diverse set of languages and tasks. Strategies proposed to improve the cross-lingual ability of these models include transliteration and finer-grained segmentation into characters as opposed to subwords. In this work, we investigate lexical sharing in multilingual machine translation (MT) from Hindi, Gujarati and Nepali into English. We explore the trade-offs in translation performance between data sampling and vocabulary size, and we explore whether transliteration is useful in encouraging cross-script generalisation. We also verify how the different settings generalise to unseen languages (Marathi and Bengali). We find that transliteration does not give pronounced improvements, and our analysis suggests that our multilingual MT models trained on original scripts are already robust to cross-script differences, even for relatively low-resource languages.
Title: Investigating Lexical Sharing in Multilingual Machine Translation for Indian Languages. European Association for Machine Translation Conferences/Workshops, 2023-05-04. https://doi.org/10.48550/arXiv.2305.03207
Pub Date: 2023-04-25 DOI: 10.48550/arXiv.2304.12776
Ali Vardasbi, Telmo Pires, Robin M. Schmidt, Stephan Peitz
Structured State Spaces for Sequences (S4) is a recently proposed sequence model with successful applications in various tasks, e.g. vision, language modelling, and audio. Thanks to its mathematical formulation, it compresses its input to a single hidden state, and is able to capture long range dependencies while avoiding the need for an attention mechanism. In this work, we apply S4 to Machine Translation (MT), and evaluate several encoder-decoder variants on WMT’14 and WMT’16. In contrast with the success in language modeling, we find that S4 lags behind the Transformer by approximately 4 BLEU points, and that it counter-intuitively struggles with long sentences. Finally, we show that this gap is caused by S4’s inability to summarize the full source sentence in a single hidden state, and show that we can close the gap by introducing an attention mechanism.
Title: State Spaces Aren’t Enough: Machine Translation Needs Attention. European Association for Machine Translation Conferences/Workshops, 2023-04-25. https://doi.org/10.48550/arXiv.2304.12776
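The S4 abstract above hinges on the idea that a state-space model folds the whole input into a single hidden state. As a rough illustration only, the sketch below runs a plain discrete linear recurrence, x_k = A x_{k-1} + B u_k and y_k = C x_k, over a scalar input sequence. It is not the S4 parameterisation (which relies on structured matrices and a convolutional formulation); the function name `ssm_scan`, the matrices, and the state size are all assumptions chosen for the example.

```python
def ssm_scan(A, B, C, inputs):
    """Run x_k = A x_{k-1} + B u_k, y_k = C x_k over a scalar input sequence.

    A is an n x n matrix (list of rows); B and C are length-n vectors.
    Returns the output sequence and the final hidden state.
    """
    n = len(B)
    state = [0.0] * n  # the single hidden state that summarises everything seen so far
    outputs = []
    for u in inputs:
        # State update: every past input influences the output only through `state`.
        state = [
            sum(A[i][j] * state[j] for j in range(n)) + B[i] * u
            for i in range(n)
        ]
        # Readout: project the hidden state to a scalar output.
        outputs.append(sum(C[i] * state[i] for i in range(n)))
    return outputs, state


# Toy 2-dimensional state; the values are hypothetical and chosen for illustration.
A = [[0.5, 0.0], [0.0, 0.25]]
B = [1.0, 1.0]
C = [1.0, 1.0]
outputs, final_state = ssm_scan(A, B, C, [1.0, 0.0, 0.0])
# outputs     -> [2.0, 0.75, 0.3125]
# final_state -> [0.25, 0.0625]
```

In this toy run the contribution of the first input decays at every step, which gives some intuition for why a fixed-size state might struggle to retain a full source sentence, the limitation the abstract reports.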
Pub Date: 2023-04-19 DOI: 10.48550/arXiv.2304.09388
Varun Gumma, Raj Dabre, Pratyush Kumar
Knowledge distillation (KD) is a well-known method for compressing neural models. However, works focusing on distilling knowledge from large multilingual neural machine translation (MNMT) models into smaller ones are practically nonexistent, despite the popularity and superiority of MNMT. This paper bridges this gap by presenting an empirical investigation of knowledge distillation for compressing MNMT models. We take Indic to English translation as a case study and demonstrate that commonly used language-agnostic and language-aware KD approaches yield models that are 4-5x smaller but also suffer from performance drops of up to 3.5 BLEU. To mitigate this, we then experiment with design considerations such as shallower versus deeper models, heavy parameter sharing, multistage training, and adapters. We observe that deeper compact models tend to be as good as shallower non-compact ones and that fine-tuning a distilled model on a high-quality subset slightly boosts translation quality. Overall, we conclude that compressing MNMT models via KD is challenging, indicating immense scope for further research.
Title: An Empirical Study of Leveraging Knowledge Distillation for Compressing Multilingual Neural Machine Translation Models. European Association for Machine Translation Conferences/Workshops, 2023-04-19. https://doi.org/10.48550/arXiv.2304.09388
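For readers unfamiliar with knowledge distillation, the entry above can be made more concrete with a generic token-level objective: the student is trained to match the teacher's temperature-softened output distribution. The sketch below is a minimal illustration of that general idea and is not necessarily the objective used in the paper; the function names and the temperature value are assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits, softened by `temperature`."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened distributions for one token."""
    teacher_probs = softmax(teacher_logits, temperature)
    student_probs = softmax(student_logits, temperature)
    return sum(
        p * math.log(p / q)
        for p, q in zip(teacher_probs, student_probs)
        if p > 0.0
    )


# Hypothetical logits over a three-word vocabulary.
teacher = [1.0, 2.0, 3.0]
matching_student = [1.0, 2.0, 3.0]
mismatched_student = [3.0, 2.0, 1.0]
loss_match = distillation_loss(matching_student, teacher)
loss_mismatch = distillation_loss(mismatched_student, teacher)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge; in practice it is typically averaged over all target tokens and often combined with the usual cross-entropy on the reference translation.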
Pub Date: 2023-04-18 DOI: 10.48550/arXiv.2304.08891
Javad Pourmostafa Roshan Sharami, D. Shterionov, F. Blain, Eva Vanmassenhove, M. D. Sisto, Chris Emmery, P. Spronck
While quality estimation (QE) can play an important role in the translation process, its effectiveness relies on the availability and quality of training data. For QE in particular, high-quality labeled data is often lacking due to the high cost and effort associated with labeling such data. Aside from the data scarcity challenge, QE models should also be generalizable, i.e., they should be able to handle data from different domains, both generic and specific. To alleviate these two main issues (data scarcity and domain mismatch), this paper combines domain adaptation and data augmentation within a robust QE system. Our method is to first train a generic QE model and then fine-tune it on a specific domain while retaining generic knowledge. Our results show a significant improvement for all the language pairs investigated, better cross-lingual inference, and superior performance in zero-shot learning scenarios compared to state-of-the-art baselines.
Title: Tailoring Domain Adaptation for Machine Translation Quality Estimation. European Association for Machine Translation Conferences/Workshops, 2023-04-18. https://doi.org/10.48550/arXiv.2304.08891
Pub Date: 2023-03-03 DOI: 10.48550/arXiv.2303.01911
Rachel Bawden, François Yvon
The NLP community recently saw the release of a new large open-access multilingual language model, BLOOM (BigScience et al., 2022) covering 46 languages. We focus on BLOOM’s multilingual ability by evaluating its machine translation performance across several datasets (WMT, Flores-101 and DiaBLa) and language pairs (high- and low-resourced). Our results show that 0-shot performance suffers from overgeneration and generating in the wrong language, but this is greatly improved in the few-shot setting, with very good results for a number of language pairs. We study several aspects including prompt design, model sizes, cross-lingual transfer and the use of discursive context.
Title: Investigating the Translation Performance of a Large Multilingual Language Model: the Case of BLOOM. European Association for Machine Translation Conferences/Workshops, 2023-03-03. https://doi.org/10.48550/arXiv.2303.01911
Pub Date: 2023-02-28 DOI: 10.48550/arXiv.2302.14520
Tom Kocmi, C. Federmann
We describe GEMBA, a GPT-based metric for assessment of translation quality, which works both with a reference translation and without. In our evaluation, we focus on zero-shot prompting, comparing four prompt variants in two modes, based on the availability of the reference. We investigate seven versions of GPT models, including ChatGPT. We show that our method for translation quality assessment only works with GPT 3.5 and larger models. Compared to results from WMT22’s Metrics shared task, our method achieves state-of-the-art accuracy in both modes when evaluated against MQM-based human labels. Our results are valid on the system level for all three WMT22 Metrics shared task language pairs, namely English into German, English into Russian, and Chinese into English. This provides a first glimpse into the usefulness of pre-trained, generative large language models for quality assessment of translations. We publicly release all our code and prompt templates used for the experiments described in this work, as well as all corresponding scoring results, to allow for external validation and reproducibility.
Title: Large Language Models Are State-of-the-Art Evaluators of Translation Quality. European Association for Machine Translation Conferences/Workshops, 2023-02-28. https://doi.org/10.48550/arXiv.2302.14520
With the advent of deep learning, research in many areas of machine learning is converging towards the same set of methods and models. For example, long short-term memory networks (Hochreiter and Schmidhuber, 1997) are not only popular for various tasks in natural language processing (NLP) such as speech recognition, machine translation, handwriting recognition, and syntactic parsing, but they are also applicable to seemingly unrelated fields such as bioinformatics (Min et al., 2016). Recent contextual word embeddings such as BERT (Devlin et al., 2019) boast state-of-the-art results on 11 NLP tasks with the same model. Before deep learning, a speech recognizer and a syntactic parser had little in common, as systems were much more tailored towards the task at hand. At the core of this development is the tendency to view each task as yet another data mapping problem, neglecting the particular characteristics and (soft) requirements that tasks often have in practice. This often goes along with a sharp break between deep learning methods and previous research in the specific area. This thesis can be understood as an antithesis to the prevailing paradigm. We show how traditional symbolic statistical machine translation (Koehn, 2009) models can still improve neural machine translation (NMT; Kalchbrenner and Blunsom, 2013; Sutskever et al., 2014; Bahdanau et al., 2015) while reducing the risk of common pathologies of NMT such as hallucinations and neologisms. Other external symbolic models such as spell checkers and morphology databases help neural models to correct grammatical errors in text.
Title: The Roles of Language Models and Hierarchical Models in Neural Sequence-to-Sequence Prediction. Author: Felix Stahlberg. European Association for Machine Translation Conferences/Workshops, 2020-05-16. https://doi.org/10.17863/CAM.49422
F. Sánchez-Martínez, V. M. Sánchez-Cartagena, J. A. Pérez-Ortiz, M. Forcada, M. Esplà-Gomis, Andrew Secker, Susie Coleman, J. Wall
This paper describes our approach to creating a neural machine translation system that translates between English and Swahili (in both directions) in the news domain, as well as the process we followed to crawl the necessary parallel corpora from the Internet. We report the results of a pilot human evaluation performed by the news media organisations participating in the H2020 EU-funded project GoURMET.
Title: An English-Swahili parallel corpus and its use for neural machine translation in the news domain. European Association for Machine Translation Conferences/Workshops, 2020-03-31. https://doi.org/10.5281/ZENODO.3923590