Mohammad Kamrul Huq Maroof, Lamia Alam, M. M. Hoque
Title: Transformational generative grammar (TGG): An efficient way of parsing Bangla sentences
Published in: 2016 2nd International Conference on Electrical, Computer & Telecommunication Engineering (ICECTE), pp. 1-6
Publication date: 2016-12-01
DOI: 10.1109/ICECTE.2016.7879583 (https://doi.org/10.1109/ICECTE.2016.7879583)
Citation count: 0
Abstract
Natural language processing (NLP) refers to the ability of systems to process sentences in a natural language such as Bangla rather than in a specialized artificial computer language. Computer processing of Bangla is challenging because of its varied word formation and ways of speaking. The same meaning can be expressed in different ways, which poses a major challenge for automatic machine translation systems. With the advent of internet technology and e-commerce, the demand for automatic machine translation has increased. Parsing is essential for any type of natural language processing, and a Bangla parser can serve as a subsystem for machine-aided translation from Bangla into another language. A parser usually checks the validity of a sentence using grammatical rules. In this paper, we propose a set of transformational generative grammar (TGG) rules, used in conjunction with phrase structure grammar, to generate parse trees and to recognize assertive, interrogative, imperative, optative, and exclamatory Bangla sentences. The approach applies to many sentences that cannot be parsed using phrase structure grammars alone. The process analyzes a Bangla sentence morphologically and syntactically: tokens and their grammatical information are passed to the parsing stage, which produces the final output. A lexicon is used that contains syntactic, semantic, and possibly some pragmatic information. We tested our system on different kinds of Bangla sentences, and the experimental results show an overall success rate of 84.4%.
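The pipeline described in the abstract (lexicon lookup, a transformation step, then phrase-structure parsing) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the transliterated toy lexicon, the two phrase-structure rules (S -> NP VP, VP -> NP V), and the single transformation that treats a yes/no question as an assertive kernel with the particle "ki" inserted. The paper's actual rule set and lexicon format are not given in the abstract.

```python
# Toy lexicon (hypothetical, transliterated Bangla): token -> part-of-speech tag.
LEXICON = {
    "ami": "PRON",   # I
    "tumi": "PRON",  # you
    "bhat": "N",     # rice
    "khai": "V",     # eat (first person)
    "khao": "V",     # eat (second person)
    "ki": "Q",       # yes/no question particle
}

def tag(sentence):
    """Split on whitespace and look each token up in the lexicon."""
    tokens = sentence.lower().rstrip("?.!").split()
    return [(tok, LEXICON[tok]) for tok in tokens]  # KeyError on unknown words

def parse_np(tagged, i):
    """NP -> PRON | N. Returns (subtree, next index) or (None, i)."""
    if i < len(tagged) and tagged[i][1] in ("PRON", "N"):
        return ("NP", tagged[i]), i + 1
    return None, i

def parse_kernel(tagged):
    """Phrase-structure rules: S -> NP VP, VP -> NP V (subject-object-verb)."""
    subj, i = parse_np(tagged, 0)
    obj, j = parse_np(tagged, i) if subj else (None, i)
    if subj and obj and j == len(tagged) - 1 and tagged[j][1] == "V":
        return ("S", subj, ("VP", obj, ("V", tagged[j])))
    return None

def parse(sentence):
    """Undo the assumed question transformation, then parse the kernel."""
    tagged = tag(sentence)
    # Assumed transformation: interrogative = assertive kernel + particle "ki".
    kernel = [t for t in tagged if t[1] != "Q"]
    kind = "interrogative" if len(kernel) != len(tagged) else "assertive"
    tree = parse_kernel(kernel)
    return (kind, tree) if tree else None

if __name__ == "__main__":
    print(parse("ami bhat khai"))       # assertive sentence
    print(parse("tumi ki bhat khao?"))  # interrogative sentence
    print(parse("bhat khai ami"))       # rejected by the toy rules
```

The design point is that the phrase-structure rules only ever see the kernel sentence. The question particle is handled by a separate transformation, so the same two rules cover both the assertive and the interrogative form, which is the kind of coverage gain the abstract attributes to combining TGG with phrase structure grammar.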