Transformational generative grammar (TGG): An efficient way of parsing Bangla sentences

Mohammad Kamrul Huq Maroof, Lamia Alam, M. M. Hoque
2016 2nd International Conference on Electrical, Computer & Telecommunication Engineering (ICECTE), pp. 1-6
Published: 2016-12-01
DOI: 10.1109/ICECTE.2016.7879583 (https://doi.org/10.1109/ICECTE.2016.7879583)

Abstract

Natural language processing (NLP) refers to the ability of systems to process sentences in a natural language such as Bangla, rather than in a specialized artificial computer language. Computer processing of Bangla is challenging because of its varied word formation and manner of speaking. The same meaning can be expressed in different ways, which poses a great challenge for automatic machine translation systems. With the advent of internet technology and e-commerce, the demand for automatic machine translation has increased. Parsing is essential for any type of natural language processing, and a Bangla parser can serve as a subsystem for machine-aided translation from Bangla to another language. A parser usually checks the validity of a sentence using grammatical rules. In this paper, we propose a set of transformational generative grammar (TGG) rules, used in conjunction with phrase structure grammar, to generate parse trees and to recognize assertive, interrogative, imperative, optative and exclamatory sentences in Bangla. The approach applies to many sentences that cannot be parsed using phrase structure grammars alone. The process analyzes a Bangla sentence morphologically and syntactically: tokens and grammatical information are passed through the parsing stage to produce the final output. A lexicon is used that contains some syntactic, semantic, and possibly some pragmatic information. We tested our system on different kinds of Bangla sentences, and the experimental results show an overall success rate of 84.4%.
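
The general idea of pairing a transformation with phrase-structure rules can be illustrated with a very small sketch. It is not the grammar, lexicon, or implementation from the paper: the romanized word list, the part-of-speech tags, the three rules (S -> NP VP, NP -> PRON | NOUN, VP -> NP VERB | VERB), and all function names are assumptions made for illustration. The sketch tags tokens from a toy lexicon, undoes one surface transformation (the question particle "ki") to recover a kernel sentence while recording the sentence type, and then parses the kernel with the phrase-structure rules.

```python
# Toy illustration only: the lexicon, rules, and names below are hypothetical,
# not the grammar or dictionary used in the paper.

# Hypothetical lexicon: romanized Bangla word -> part-of-speech tag.
LEXICON = {
    "ami": "PRON",   # I
    "tumi": "PRON",  # you
    "bhat": "NOUN",  # rice
    "khai": "VERB",  # eat (first person)
    "khao": "VERB",  # eat (second person)
    "ki": "QP",      # question particle
}


def tag(sentence):
    """Tokenize on whitespace and look each token up in the lexicon."""
    tokens = sentence.lower().rstrip("?.!").split()
    unknown = [t for t in tokens if t not in LEXICON]
    if unknown:
        raise ValueError(f"words missing from lexicon: {unknown}")
    return [(t, LEXICON[t]) for t in tokens]


def undo_question_transformation(tagged):
    """Strip the question particle and report the sentence type."""
    kernel = [(w, p) for w, p in tagged if p != "QP"]
    kind = "interrogative" if len(kernel) < len(tagged) else "assertive"
    return kernel, kind


def parse_np(tagged, i):
    """NP -> PRON | NOUN. Returns (subtree, next index) or None."""
    if i < len(tagged) and tagged[i][1] in ("PRON", "NOUN"):
        return ("NP", tagged[i][0]), i + 1
    return None


def parse_vp(tagged, i):
    """VP -> NP VERB | VERB (the object precedes the verb, SOV order)."""
    obj = parse_np(tagged, i)
    if obj and obj[1] < len(tagged) and tagged[obj[1]][1] == "VERB":
        return ("VP", obj[0], ("V", tagged[obj[1]][0])), obj[1] + 1
    if i < len(tagged) and tagged[i][1] == "VERB":
        return ("VP", ("V", tagged[i][0])), i + 1
    return None


def parse(sentence):
    """Return (sentence type, parse tree); the tree is None if the rules reject the input."""
    kernel, kind = undo_question_transformation(tag(sentence))
    subj = parse_np(kernel, 0)
    if subj:
        vp = parse_vp(kernel, subj[1])
        if vp and vp[1] == len(kernel):
            return kind, ("S", subj[0], vp[0])
    return kind, None
```

Calling `parse("tumi ki bhat khao?")` with this toy grammar labels the sentence as interrogative and parses the kernel "tumi bhat khao" with the same rules that handle the assertive "ami bhat khai". A realistic system would need morphological analysis, a much larger lexicon, and further transformations for imperative, optative, and exclamatory sentences.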