Effect of Diacritics on Machine Translation Performance: A Case Study of Yemeni Literature

Saleh Abduh Naji Ali, Ibraheem Nagi Ahmed Tagaddeen
Journal: International Journal of Language and Literary Studies
Publication type: Journal Article
Published: 2023-07-20
DOI: 10.36892/ijlls.v5i2.1342 (https://doi.org/10.36892/ijlls.v5i2.1342)

Abstract

Many Arabic texts are written without diacritics. In some contexts, however, this raises the level of homography and in turn creates difficulties for machine translation programs. Homographs are words that are spelled identically but have different meanings and, in most cases, different pronunciations. To avoid the problem of homography, words need to be diacriticized. The main objective of this study is therefore to assess the ability of machine translation (henceforth MT) to render diacriticized words from Arabic into English, with special reference to the translation of Yemeni literature. The study also compares the output of three MT programs (Reverso, Systran Translate and Free Translation Online) to find out which comes closest to the original meaning of the source-language texts, and it aims to identify some of the causes behind the errors these programs make when translating diacriticized words. To achieve these aims, the study followed descriptive, analytical and comparative methods. The three programs, all common and modern MT tools, were selected to translate a set of diacriticized words. Excerpts, together with their contexts, were taken from two Yemeni works: The Hostage (Ar-rahinah, الرهينة) by the well-known Yemeni writer Zayd Muttee Dammaj, and Yemeni Wealth from Popular Proverbs (الثروة اليمنية من الأمثال الشعبية) by the Yemeni writer Muhammad Al-Adimi. The chosen samples were entered into the MT programs, and the resulting translations were analyzed and discussed qualitatively and quantitatively. The study concludes that diacritics in Arabic texts pose a problem for MT: most of the time, the programs failed to recognize the diacritics on letters. As a result, most of their translations were incorrect and did not agree with the original meaning. Free Translation Online produced the fewest errors of the three programs, while Systran mistranslated all the diacriticized excerpts. These errors can be attributed to the absence of programs that incorporate the Arabic diacritic system.
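The homography problem described in the abstract can be illustrated with a minimal Python sketch. It assumes that Arabic diacritics are encoded as Unicode nonspacing marks and that removing them approximates undiacriticized writing. The three-word set below is a standard textbook illustration chosen for this sketch, not one of the study's samples, and the helper name `strip_diacritics` is hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Drop Unicode nonspacing marks (category Mn), which cover the Arabic harakat."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


# Illustrative homograph set (NOT taken from the study's samples).
# Escapes keep the exact code points visible:
# U+064E fatha, U+0650 kasra, U+0652 sukun.
DIACRITIZED = {
    "\u0639\u064E\u0644\u0650\u0645\u064E": "he knew",  # 'alima
    "\u0639\u0650\u0644\u0652\u0645": "knowledge",      # 'ilm
    "\u0639\u064E\u0644\u064E\u0645": "flag",           # 'alam
}

# Collect the distinct spellings that remain once the diacritics are removed.
undiacritized_forms = {strip_diacritics(word) for word in DIACRITIZED}

for word, gloss in DIACRITIZED.items():
    print(f"{word} ({gloss}) -> {strip_diacritics(word)}")
print(f"distinct spellings after stripping: {len(undiacritized_forms)}")
```

All three words reduce to the same three-letter string once the marks are removed, so a program that ignores or discards diacritics has only context left to choose among the meanings.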