A Named Entities Recognition System for Modern Standard Arabic using Rule-Based Approach
Hala Elsayed, T. Elghazaly
2015 First International Conference on Arabic Computational Linguistics (ACLing), 2015-04-17
DOI: 10.1109/ACLING.2015.14 (https://doi.org/10.1109/ACLING.2015.14)
Citation count: 6
Abstract
Named Entity Recognition (NER) is a task in Information Extraction (IE) that has become very important for Natural Language Processing (NLP). In this paper, we design a system that improves named entity recognition for Arabic by combining noun extraction with entity extraction. The noun extraction component is based on Arabic morphology and Arabic grammar rules, many of which have not been used before. It uses no gazetteers, and it is combined with an entity extraction component that depends on gazetteers. The system extracts nouns according to Arabic morphology and classifies them into proper noun, title, currency, percentage, country, city, nationality, number, place, date, and time entities. It applies algorithms to generate nationality entities from country entities and uses regular expressions (RE) to extract numbers in digit format. The text does not need to be normalized before extraction. The system was tested on an open-text corpus in Modern Standard Arabic (MSA) and achieves an average recall of 85%.
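
The abstract does not give the regular expressions themselves, so the following is only a minimal sketch of how numbers in digit format might be extracted and labeled as number or percentage entities. The pattern, the `extract_numbers` function, and the `NUMBER`/`PERCENTAGE` labels are assumptions made for illustration, not the authors' implementation. The sketch accepts both Western digits and Arabic-Indic digits and works on raw text without a normalization step.

```python
import re

# Assumed pattern, not taken from the paper.
# Western digits 0-9 and Arabic-Indic digits U+0660 to U+0669.
DIGIT = r"[0-9\u0660-\u0669]"
# Optional fractional part after '.', ',' or the Arabic decimal separator U+066B.
NUMBER = rf"{DIGIT}+(?:[.,\u066B]{DIGIT}+)?"
# Optional trailing percent sign: '%' or the Arabic percent sign U+066A.
PATTERN = re.compile(rf"(?P<number>{NUMBER})(?P<percent>\s*[%\u066A])?")


def extract_numbers(text):
    """Return (surface form, label) pairs for digit-format numbers in text.

    The labels are hypothetical stand-ins for the paper's number and
    percentage entity classes.
    """
    entities = []
    for match in PATTERN.finditer(text):
        label = "PERCENTAGE" if match.group("percent") else "NUMBER"
        entities.append((match.group(0), label))
    return entities
```

Calling `extract_numbers` on a raw MSA sentence returns the matched spans in order of appearance, which a later stage could merge with the gazetteer-based entities.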