Automatic Meta-evaluation of Low-Resource Machine Translation Evaluation Metrics

Junting Yu, Wuying Liu, Hongye He, Lin Wang
2019 International Conference on Asian Language Processing (IALP), November 2019.
DOI: 10.1109/IALP48816.2019.9037658

Abstract

Meta-evaluation is a method for assessing machine translation (MT) evaluation metrics against established theories and standards. This paper presents an automatic meta-evaluation method for MT evaluation based on ORANGE, called Limited ORANGE, which is designed for low-resource MT evaluation and is adopted when resources are limited. Three n-gram-based metrics, BLEUS, ROUGE-L, and ROUGE-S, are compared experimentally in what we call a horizontal comparison, while a vertical comparison contrasts different forms of the same evaluation metric. Unlike traditional human meta-evaluation, this method evaluates metrics automatically, with no human involvement beyond providing a set of references: it requires only the average rank of the references, and is therefore not influenced by subjective factors. It also costs less and takes less time than the traditional approach, which benefits MT system parameter optimization and shortens the system development period. In this paper, we use this automatic meta-evaluation method to evaluate BLEUS, ROUGE-L, ROUGE-S, and their different Cilin-based forms on a Russian-Chinese dataset. The results agree with those of traditional human meta-evaluation, verifying the consistency and effectiveness of Limited ORANGE.
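The core idea behind ORANGE-style meta-evaluation ("only needs the average rank of the references") can be sketched as follows. This is a minimal illustration, not the paper's Limited ORANGE: the function names (`orange_score`, `unigram_overlap`), the toy overlap metric, and the leave-one-out scoring of each reference are assumptions made for the sketch. A metric under test ranks each human reference among the machine candidates; the better the metric, the higher (numerically lower) the average rank it assigns to references, with no human judgment required.

```python
def unigram_overlap(hyp, refs):
    """Toy stand-in for a metric such as BLEUS or ROUGE:
    fraction of hypothesis unigrams found in any reference."""
    toks = hyp.split()
    ref_vocab = {t for r in refs for t in r.split()}
    return sum(t in ref_vocab for t in toks) / max(len(toks), 1)

def orange_score(metric, candidate_lists, reference_lists):
    """Average normalized rank of the references among machine candidates.

    candidate_lists:  per-sentence lists of machine translations.
    reference_lists:  per-sentence lists of human references.
    metric(hyp, refs) -> float, higher = better.
    Lower returned value = the metric ranks references higher = better metric.
    """
    total, count = 0.0, 0
    for cands, refs in zip(candidate_lists, reference_lists):
        for i, ref in enumerate(refs):
            # Score the held-out reference against the *other* references
            # (leave-one-out; falls back to the full set if only one exists).
            other_refs = refs[:i] + refs[i + 1:] or refs
            pool = cands + [ref]
            ranked = sorted(pool, key=lambda h: metric(h, other_refs),
                            reverse=True)
            total += (ranked.index(ref) + 1) / len(pool)  # normalized 1-based rank
            count += 1
    return total / count
```

Comparing two metrics then reduces to comparing their `orange_score` values on the same candidate and reference lists, which is what makes the horizontal (BLEUS vs. ROUGE-L vs. ROUGE-S) and vertical (plain vs. Cilin-based forms) comparisons fully automatic.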