Impacto da qualidade de textos de partida criados por utilizadores e agentes e a propagação de erros em sistemas de Tradução Automática [Impact of the quality of source texts created by users and agents, and error propagation in Machine Translation systems]

Madalena Gonçalves, Marianna Buchicchio, Helena Moniz

Revista da Associação Portuguesa de Linguística, 2022-10-25
DOI: https://doi.org/10.26334/2183-9077/rapln9ano2022a10
Abstract
This paper proposes a typology of errors and linguistic structures found in the source text that have an impact on Machine Translation (MT). The main objectives of this project were: first, to compare existing error typologies and assess their suitability; to analyze annotated data and build a data-driven typology, adapting the previously existing typologies; to distinguish between errors produced by users and errors produced by agents in the online Customer Support domain; to test the proposed typology with three case studies; to systematize patterns in the errors found and verify their impact on MT systems; and, finally, to create a production-ready typology for this particular field. First, different typologies were compared, whether they operate at a bilingual or monolingual level (e.g., the Unbabel Error Typology, the MQM Typology (Lommel et al., 2014b), and the SCATE MT Error Taxonomy (Tezcan et al., 2017)). This comparison allowed us to identify the differences and similarities between them, as well as which issue types had been used previously. In order to build a data-driven typology, both sides of Customer Support conversations were analyzed, user and agent, as they present different writing structures and are influenced by different factors. The results of that analysis were assessed through an annotation process with a bilingual error typology and scored with one of the most widely used manual evaluation metrics in translation quality evaluation, the Multidimensional Quality Metrics (MQM) framework, proposed in the EU-funded QTLaunchPad project (2014). Through this analysis, it was then possible to build a data-driven typology, the Source Typology. To aid future annotators of this typology, we provide guidelines for the annotation process and elaborate on the new additions to the typology. To confirm the reliability of the typology, three case studies were conducted in an internal pilot, serving different purposes and covering several language pairs, with a total of 26,855 words, 2,802 errors, and 239 linguistic structures annotated (the latter marked with the 'Neutral' severity, associated with conversational markers, segmentation, emoticons, and other characteristics of oral speech). In these studies, we verified the effectiveness of the new additions, as well as the transfer of source-text errors to the target text. We also analyzed whether the linguistic structures annotated with the 'Neutral' severity had in fact any impact on the MT systems. This testing allowed us to confirm the effectiveness and reliability of the Source Typology and to identify what still needs improvement.
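As a rough illustration of the severity-weighted scoring the abstract refers to, the sketch below (hypothetical Python, not taken from the paper) assumes the common MQM-style weighting of minor = 1, major = 5, and critical = 10 penalty points per error, normalized by word count. Under such a scheme, 'Neutral' annotations would carry a weight of 0, so they can mark linguistic structures without penalizing the quality score; the exact weights and formula used in the study are not stated in the abstract.

```python
# A minimal sketch of MQM-style scoring under an assumed
# QTLaunchPad-derived weighting (minor=1, major=5, critical=10).
SEVERITY_WEIGHTS = {
    "neutral": 0,   # marks structures (emoticons, markers, ...), no penalty
    "minor": 1,
    "major": 5,
    "critical": 10,
}

def mqm_score(annotations: list[str], word_count: int) -> float:
    """Return an MQM-style quality score in [0, 100].

    `annotations` is a list of severity labels for the annotated issues;
    the total penalty is normalized by the length of the evaluated text.
    """
    penalty = sum(SEVERITY_WEIGHTS[sev] for sev in annotations)
    return max(0.0, 100.0 * (1 - penalty / word_count))

# Example: a 500-word conversation with 3 minor errors, 1 major error,
# and 4 'Neutral' structures.
annotations = ["minor"] * 3 + ["major"] + ["neutral"] * 4
print(mqm_score(annotations, 500))  # 98.4 -- 'Neutral' items add no penalty
```

In a setup like this, testing whether 'Neutral' structures affect MT quality amounts to checking whether their presence correlates with errors in the target text, even though they contribute nothing to the score itself.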