Linguistic integration information in the AABATAS Arabic text analysis system
S. Kanoun, A. Ennaji, Y. Lecourtier, A. Alimi
Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition, 2002-08-06
DOI: 10.1109/IWFHR.2002.1030941 (https://doi.org/10.1109/IWFHR.2002.1030941)
An Arabic text analysis system called AABATAS (affixal approach-based Arabic text analysis system) is proposed. AABATAS recognizes and categorizes words while identifying their morphological and grammatical characteristics. It is based on a new approach to Arabic word recognition called the affixal approach, which is guided by the structural properties of the language. The system uses a dynamic decomposition-recognition mechanism that generates a set of reliable solutions for each word. This mechanism attempts to identify the basic morphemes of the word (the prefix, the infix, the suffix and the root), in contrast to existing approaches, which are usually based on recognition of the whole word, the pseudo-word or the letter. In this paper, we briefly present the general characteristics of Arabic texts as well as a succinct survey of the existing approaches used for their recognition. We then describe the structural properties of the Arabic language and the two systems based on these properties. The first concerns a word recognition process and the second is devoted to text analysis. We finally show two experimental results: one on a data set of 545 words and another on a text example.
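The idea of decomposing a word into prefix, infix, suffix and root, and keeping every decomposition that survives, can be illustrated with a minimal string-level sketch. This is not the AABATAS mechanism itself, which the abstract does not specify and which operates within a recognition system; it only assumes that candidate affixes are tried in turn and that a split is accepted when the remainder matches a known root. The affix inventories, the root lexicon, the Latin transliteration and the names `decompose`, `strip_infix` and `Decomposition` are all hypothetical and chosen for illustration.

```python
from typing import List, NamedTuple

# Hypothetical toy inventories in Latin transliteration; not taken from the paper.
PREFIXES = ("", "Al", "w")
SUFFIXES = ("", "wn", "At")
INFIXES = ("", "A")
ROOTS = {"ktb", "drs"}


class Decomposition(NamedTuple):
    prefix: str
    infix: str
    suffix: str
    root: str


def strip_infix(stem: str, infix: str) -> List[str]:
    """Return every string obtained by deleting one internal occurrence of infix."""
    if not infix:
        return [stem]
    results = []
    # Start at 1 and stop before the last character so the infix is strictly internal.
    for i in range(1, len(stem) - len(infix)):
        if stem[i:i + len(infix)] == infix:
            results.append(stem[:i] + stem[i + len(infix):])
    return results


def decompose(word: str) -> List[Decomposition]:
    """Enumerate all (prefix, infix, suffix, root) splits whose root is in ROOTS."""
    solutions = []
    for prefix in PREFIXES:
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in SUFFIXES:
            # Require a non-empty stem between prefix and suffix.
            if not rest.endswith(suffix) or len(rest) <= len(suffix):
                continue
            stem = rest[:len(rest) - len(suffix)]
            for infix in INFIXES:
                for candidate in strip_infix(stem, infix):
                    if candidate in ROOTS:
                        solutions.append(
                            Decomposition(prefix, infix, suffix, candidate)
                        )
    return solutions


if __name__ == "__main__":
    print(decompose("AlkAtbwn"))
```

With these toy inventories, `decompose("AlkAtbwn")` returns a single solution with prefix `Al`, infix `A`, suffix `wn` and root `ktb`. The function returns a list rather than one answer so that several competing decompositions can be kept when the inventories allow them, mirroring the "set of solutions for each word" described above.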