Are Ellipses Important for Machine Translation?

Computational Linguistics | IF 5.3 | Zone 2 (Computer Science) | Q2 (Computer Science, Artificial Intelligence) | Pub Date: 2021-08-05 | DOI: 10.1162/coli_a_00414

Payal Khullar

Citations: 0

Abstract

This article describes an experiment that evaluates how the different types of ellipsis discussed in theoretical linguistics affect Neural Machine Translation (NMT), with English as the source language and Hindi and Telugu as the target languages. Manual evaluation shows that most of the errors made by Google NMT are located in the clause containing the ellipsis, that such errors are slightly more frequent in Telugu than in Hindi, and that translation adequacy improves when ellipses are reconstructed with their antecedents. These findings not only confirm the importance of ellipses and their resolution for MT, but also hint at a possible correlation between how well discourse devices like ellipses are translated and the morphological incongruity of the source and target languages. We also observe that not all ellipses are translated poorly, and not all benefit from reconstruction, which argues for treating different types of ellipsis differently in MT research.

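Source journal

Computational Linguistics (Engineering & Technology - Computer Science: Interdisciplinary Applications)

CiteScore: 15.80
Self-citation rate: 0.00%
Publication volume: (no value given)
Review time: >12 weeks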

About the journal: Computational Linguistics, the longest-running publication dedicated solely to the computational and mathematical aspects of language and the design of natural language processing systems, provides university and industry linguists, computational linguists, AI and machine learning researchers, cognitive scientists, speech specialists, and philosophers with the latest insights into the computational aspects of language research.