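Impacto da qualidade de textos de partida criados por utilizadores e agentes e a propagação de erros em sistemas de Tradução Automática (Impact of the quality of source texts created by users and agents and the propagation of errors in Machine Translation systems)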

Revista da Associação Portuguesa de Linguística. Pub Date: 2022-10-25. DOI: 10.26334/2183-9077/rapln9ano2022a10

Madalena Gonçalves, Marianna Buchicchio, Helena Moniz

Abstract

This paper proposes a typology of errors and linguistic structures in the source text that have an impact on Machine Translation (MT). The project had six main objectives: to compare existing error typologies and assess their suitability; to analyze annotated data and build a data-driven typology by adapting those existing typologies; to distinguish between the errors produced by users and by agents in the online Customer Support domain; to test the proposed typology in three case studies; to systematize the patterns in the errors found and verify their impact on MT systems; and, finally, to create a production-ready typology for this particular field. First, we compared different typologies, whether they operate at a bilingual or a monolingual level (e.g. the Unbabel Error Typology, the MQM Typology (Lommel et al., 2014b) and the SCATE MT Error Taxonomy (Tezcan et al., 2017)). This comparison allowed us to verify the differences and similarities between them and to identify which issue types had been used previously. To build a data-driven typology, we analyzed both sides of Customer Support, user and agent, because they present different writing structures and are influenced by different factors. The results of that analysis were assessed by annotating the data with a bilingual error typology and were scored with one of the most widely used manual metrics in translation quality evaluation: Multidimensional Quality Metrics (MQM), proposed in the QTLaunchPad project (2014), funded by the European Union. This analysis made it possible to build a data-driven typology, the Source Typology. To aid future annotators, we provide guidelines for the annotation process and elaborate on the new additions to the typology.

To confirm the reliability of the typology, three case studies were conducted in an internal pilot, with different purposes and covering several language pairs. In total, 26,855 words were annotated, yielding 2,802 errors and 239 linguistic structures. The latter were annotated with the 'Neutral' severity, which is associated with conversational markers, segmentation, emoticons and other characteristics of oral speech. In these studies, we verified the effectiveness of the new additions, as well as the transfer of source text errors to the target text. We also analyzed whether the linguistic structures annotated with the 'Neutral' severity in fact had any impact on the MT systems. This testing allowed us to confirm the effectiveness and reliability of the Source Typology and to identify what needs improvement.
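As a rough illustration of how annotated errors are typically turned into an MQM-style score, the sketch below applies a severity-weighted penalty normalized by word count. The abstract does not state the weights or the exact formula used in the study, so the weights here (Neutral 0, Minor 1, Major 5, Critical 10), the 0 to 100 scale, and the function names are all assumptions chosen for the example. The second helper simply expresses the pilot totals reported above as an error density.

```python
from collections import Counter

# Hypothetical severity weights; the abstract does not specify them.
# 'neutral' is weighted 0 so that neutral linguistic structures are
# recorded without lowering the score.
SEVERITY_WEIGHTS = {"neutral": 0, "minor": 1, "major": 5, "critical": 10}


def mqm_score(severities, word_count, weights=SEVERITY_WEIGHTS):
    """Return an MQM-style score: 100 minus the weighted penalty per 100 words.

    `severities` is an iterable of severity labels, one per annotation.
    This is an assumed formulation, not the one confirmed by the paper.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    counts = Counter(severities)
    penalty = sum(weights[label] * n for label, n in counts.items())
    return 100.0 - 100.0 * penalty / word_count


def errors_per_thousand_words(error_count, word_count):
    """Return the error density normalized to 1,000 words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    return 1000.0 * error_count / word_count


if __name__ == "__main__":
    # Toy segment of 100 words with four annotations (invented data).
    toy_annotations = ["minor", "minor", "major", "neutral"]
    print(mqm_score(toy_annotations, word_count=100))

    # Error density for the pilot totals reported in the abstract.
    print(round(errors_per_thousand_words(2802, 26855), 2))
```

Under these assumed weights, a 'Neutral' annotation never changes the score, which is why its effect on MT output has to be checked separately, as the case studies did.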
