An efficient machine translation model for Dravidian language
Chandramma, Piyush Kumar Pareek, K. Swathi, Puneet Shetteppanavar
2017 2nd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT), published 2017-05-01
DOI: 10.1109/RTEICT.2017.8256970 (https://doi.org/10.1109/RTEICT.2017.8256970)
Citations: 24
Abstract
In a linguistically diverse country like India, language translation plays a significant role in text processing applications such as information extraction, machine learning, natural language understanding, information retrieval, and machine translation. Translation between languages, especially the Dravidian languages, faces many challenges, including ambiguity, lexical divergence, syntactic and lexical mismatches, and semantic issues. The n-gram language model (LM) performs very well in machine translation. However, existing approaches are not efficient at generating n-grams from large bilingual parallel corpora, and most are limited to monolingual corpora when minimizing redundant n-grams. To overcome this, this work presents an efficient machine translation model using machine learning techniques. No prior work has considered machine translation of Kannada and Telugu. Experiments conducted on a Wikipedia dataset show significant performance in terms of accuracy and computational complexity of alignment across different thresholds.
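
To make the n-gram terminology concrete, the following is a minimal Python sketch of n-gram extraction and frequency-based pruning. It is not the method proposed in the paper, which the abstract does not specify. The function names (`extract_ngrams`, `count_ngrams`, `prune`), the whitespace tokenization, the toy English corpus, and the `min_count` threshold are all assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

NGram = Tuple[str, ...]


def extract_ngrams(tokens: List[str], n: int) -> List[NGram]:
    """Return every contiguous run of n tokens, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(sentences: Iterable[str], n: int) -> Counter:
    """Count n-grams over a corpus, tokenizing on whitespace (an assumption)."""
    counts: Counter = Counter()
    for sentence in sentences:
        counts.update(extract_ngrams(sentence.split(), n))
    return counts


def prune(counts: Counter, min_count: int) -> Dict[NGram, int]:
    """Keep only n-grams seen at least min_count times (hypothetical threshold)."""
    return {gram: c for gram, c in counts.items() if c >= min_count}


# Toy corpus, purely illustrative; a real system would use parallel text.
corpus = ["the cat sat", "the cat ran", "a dog sat"]
bigrams = count_ngrams(corpus, 2)
frequent = prune(bigrams, min_count=2)
print(frequent)
```

In a bilingual setting the same counting would be run on each side of a parallel corpus before alignment. How the paper reduces redundant n-grams across the two languages is not described in the abstract, so the sketch leaves that step out.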