Using cognates to align sentences in bilingual corpora

Michel Simard, George F. Foster, P. Isabelle
Venue: Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages
Published: 1993-10-24
DOI: 10.1145/962411 (https://doi.org/10.1145/962411)
Citations: 384

Abstract

In a recent paper, Gale and Church describe an inexpensive method for aligning bitext, based exclusively on sentence lengths [3]. While this method produces surprisingly good results (a success rate of around 96%), even better results are required for tasks such as the computer-assisted revision of translations. In this paper, we examine some of the weaknesses of Gale and Church's program and explain how a small amount of linguistic knowledge would help to overcome them. We discuss how cognates provide a cheap and reasonably reliable source of linguistic knowledge. To illustrate this, we describe a modification to the program in which the alignment criterion is cognates rather than sentence lengths. Finally, we show how better results may be obtained, more efficiently, by combining the two criteria: length and "cognateness". Our method can be generalized to accommodate other sources of linguistic knowledge, and experimentation shows that it produces better results than alignments based on length alone, at minimal cost.
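The idea of combining length and "cognateness" can be illustrated with a small sketch. This is not the authors' program or their probabilistic model: the prefix-based cognate test, the greedy pairing, the character-length ratio, and the `weight` parameter are all assumptions chosen for illustration, and every function name here is hypothetical.

```python
import re


def tokens(sentence):
    # Lowercased word and number tokens; punctuation is ignored in this sketch.
    return re.findall(r"\w+", sentence.lower())


def is_cognate(a, b, prefix_len=4):
    # Assumed heuristic: tokens containing digits must be identical;
    # other tokens count as cognates if both are at least prefix_len
    # characters long and share their first prefix_len characters.
    if any(c.isdigit() for c in a) or any(c.isdigit() for c in b):
        return a == b
    return (
        len(a) >= prefix_len
        and len(b) >= prefix_len
        and a[:prefix_len] == b[:prefix_len]
    )


def cognateness(src, tgt):
    # Share of tokens that take part in a greedy one-to-one cognate pairing.
    s, t = tokens(src), tokens(tgt)
    if not s or not t:
        return 0.0
    unused = list(t)
    pairs = 0
    for a in s:
        for b in unused:
            if is_cognate(a, b):
                unused.remove(b)  # each target token is paired at most once
                pairs += 1
                break
    return 2 * pairs / (len(s) + len(t))


def length_score(src, tgt):
    # Ratio of the shorter to the longer character length, between 0 and 1.
    longer = max(len(src), len(tgt))
    if longer == 0:
        return 1.0
    return min(len(src), len(tgt)) / longer


def combined_score(src, tgt, weight=0.5):
    # Assumed linear mix of the two criteria; weight is a free parameter.
    return weight * length_score(src, tgt) + (1 - weight) * cognateness(src, tgt)


if __name__ == "__main__":
    en = "The parliament adopted the resolution in 1993."
    fr_good = "Le parlement a adopté la résolution en 1993."
    fr_bad = "Il pleut beaucoup aujourd'hui dans cette région."
    print(combined_score(en, fr_good), combined_score(en, fr_bad))
```

In this example the two French candidates are almost the same length as the English sentence, so length alone barely separates them; the cognate term is what favors the true translation. Note that an unaccented prefix does not match an accented one here (for instance "resolution" and "résolution"), which a more careful version would handle by normalizing characters first.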