Capturing the Common Syntactical Rules for the Holy Quran: A Data Mining Approach
M. Alsaheb, Dia AbuZeina
2013 Taibah University International Conference on Advances in Information Technology for the Holy Quran and Its Sciences, 2013-12-01
DOI: 10.1109/NOORIC.2013.105 (https://doi.org/10.1109/NOORIC.2013.105)
Citations: 4
Abstract
This paper presents a novel approach to capturing the common syntactical rules of the Holy Quran. By syntactical rules, we mean the relationships between word tags that occur frequently in the Quran. Arabic, like other languages, has a number of tags, including nouns, verbs, and pronouns, each with a number of sub-types. We use a data mining approach to extract the common syntactical rules, which can then be offered to natural language processing applications. The Stanford part-of-speech tagger (29 tags) is used to tag the Quran words. Then the data mining tool WEKA (PredictiveApriori algorithm) is used to find the most frequent syntactical rules. The extracted rules do not require the word tags to be adjacent; that is, they can express long-distance relations. The most common syntactical rule found is: tag1=RP tag2=NN tag3=WP 91 ⇒ tag4=VBD 90 acc:(0.97912), which can be seen in the Quran sentence. This phrase (which is part of an ayah) appeared in 89 ayahs in 20 different surahs; the study used Mushaf Al-Madinah Al-Munawwarah (published by the King Fahd Complex for Printing the Holy Quran).
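To make the rule notation concrete, the following is a minimal Python sketch of how a rule such as tag1=RP tag2=NN tag3=WP ⇒ tag4=VBD could be counted over tagged text. It is not the authors' WEKA pipeline: the four-tag sliding window, the function names tag_windows and rule_stats, and the toy tag sequences are all assumptions made for illustration. In WEKA's rule output, the number after the antecedent (91) is the count of instances matching it, and the number after the consequent (90) is the count matching both sides. The raw confidence would then be 90/91 ≈ 0.989, while the reported acc value (0.97912) is PredictiveApriori's predictive accuracy, which is understood here to be a corrected estimate rather than the raw ratio.

```python
def tag_windows(tags, size=4):
    """Slide a fixed-size window over one tag sequence (assumed instance format)."""
    return [tuple(tags[i:i + size]) for i in range(len(tags) - size + 1)]


def matches(window, pattern):
    """True if the window has the required tag at every position in the pattern."""
    return all(window[pos] == tag for pos, tag in pattern.items())


def rule_stats(windows, antecedent, consequent):
    """Return (antecedent count, count matching both sides, raw confidence).

    antecedent and consequent map a 0-based window position to a tag, so a
    rule may skip positions and relate non-adjacent tags.
    """
    n_antecedent = sum(1 for w in windows if matches(w, antecedent))
    n_both = sum(
        1 for w in windows if matches(w, antecedent) and matches(w, consequent)
    )
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return n_antecedent, n_both, confidence


# Hypothetical tag sequences, NOT real Quran data.
toy_sequences = [
    ["RP", "NN", "WP", "VBD", "NN"],
    ["RP", "NN", "WP", "VBD"],
    ["RP", "NN", "WP", "VBP"],
    ["NN", "VBD", "NN", "DT"],
]
windows = [w for seq in toy_sequences for w in tag_windows(seq)]

# Rule in the abstract's form: tag1=RP tag2=NN tag3=WP => tag4=VBD
full_rule = rule_stats(windows, {0: "RP", 1: "NN", 2: "WP"}, {3: "VBD"})

# Long-distance variant relating only tag1 and tag4 (non-adjacent positions).
long_distance_rule = rule_stats(windows, {0: "RP"}, {3: "VBD"})

print(full_rule)
print(long_distance_rule)
```

Expressing each side of a rule as a position-to-tag mapping is one simple way to allow the long-distance relations the abstract mentions, since positions that are absent from the mapping are left unconstrained.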