A new-arabic-text classification system using a Hidden Markov Model
Zied Kechaou, S. Kanoun
Int. J. Knowl. Based Intell. Eng. Syst. (Journal Article), published 2014-10-01
DOI: 10.3233/KES-140297 (https://doi.org/10.3233/KES-140297)
Citations: 10
Abstract
Recent years have witnessed rapid growth in the quantity of Arabic-language information available in electronic format on both the Internet and corporate intranets. As a result, users are overwhelmed by this mass of information and face the question of how to locate or retrieve the information they need. To this end, several automatic classification systems have been developed, both on the Internet and within companies. This paper presents a thorough examination of the effectiveness of applying a specific machine-learning technique to the Arabic text classification problem. In addition, we explore and identify the major benefits of a Hidden Markov Model (HMM) classifier for Arabic text classification, based on our newly designed stemming approach. The experimental results show that our HMM-based model achieves high classification accuracy on Arabic electronic text corpora.