Fertility-based Source-Language-biased Inversion Transduction Grammar for Word Alignment

Int. J. Comput. Linguistics Chin. Lang. Process. Pub Date: 2009-03-01 DOI: 10.30019/IJCLCLP.200903.0001

Chung-Chi Huang, Jason J. S. Chang


Abstract

We propose a variant of the Inversion Transduction Grammar (ITG) model that incorporates the IBM-style notion of fertility to improve word-alignment performance. In our approach, binary context-free grammar rules of the source language, together with orientation preferences of the target language and word fertilities, are used to construct a syntax-based statistical translation model. The model inherits the ITG restrictions while allowing many consecutive words to align to one word and vice versa. It outperforms both the Bracketing Transduction Grammar (BTG) model and GIZA++, a state-of-the-art word aligner, in alignment error rate (23% and 14% error reduction, respectively) and in consistent phrase error rate (13% and 9% error reduction, respectively). Better performance on both metrics suggests that more accurate phrase pairs can be acquired from our word alignments, which may lead to better machine translation quality.
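The abstract reports results as relative reductions in alignment error rate (AER) but does not define the metric. As a rough illustration, the sketch below assumes the commonly used definition with sure (S) and possible (P) gold links, AER = 1 - (|A∩S| + |A∩P|) / (|A| + |S|), and shows how a relative error reduction is computed from two error rates. The function names and the toy links are hypothetical and are not taken from the paper.

```python
def alignment_error_rate(predicted, sure, possible):
    """AER under the assumed definition: 1 - (|A∩S| + |A∩P|) / (|A| + |S|).

    Each argument is an iterable of (source_index, target_index) links.
    Sure links are treated as a subset of the possible links.
    """
    a = set(predicted)
    s = set(sure)
    p = set(possible) | s  # enforce S ⊆ P
    if not a and not s:
        return 0.0  # nothing to align and nothing predicted
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))


def relative_error_reduction(baseline_error, new_error):
    """Fraction of the baseline error that the new system removes."""
    return (baseline_error - new_error) / baseline_error


# Toy example (illustrative links only, not data from the paper).
predicted_links = {(0, 0), (1, 1), (2, 3)}
sure_links = {(0, 0), (1, 1)}
possible_links = {(2, 2)}
toy_aer = alignment_error_rate(predicted_links, sure_links, possible_links)
```

Under this reading, a "23% error reduction" would mean that the proposed model's error rate is 23% lower than the baseline's in relative terms, not 23 percentage points lower.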


