{"title":"COMPARATIVE STUDY OF FONT RECOGNITION USING CONVOLUTIONAL NEURAL NETWORKS AND TWO FEATURE EXTRACTION METHODS WITH SUPPORT VECTOR MACHINE","authors":"Aveen Jalal Mohammed, Jwan Abdulkhaliq Mohammed, Amera Ismail Melhum","doi":"10.25195/ijci.v49i2.434","DOIUrl":null,"url":null,"abstract":"Font recognition is one of the essential issues in document recognition and analysis, and is frequently a complex and time-consuming process. Many techniques of optical character recognition (OCR) have been suggested and some of them have been marketed, however, a few of these techniques considered font recognition. The issue of OCR is that it saves copies of documents to make them searchable, but the documents stop having the original appearance. To solve this problem, this paper presents a system for recognizing three and six English fonts from character images using Convolution Neural Network (CNN), and then compare the results of proposed system with the two studies. The first study used NCM features and SVM as a classification method, and the second study used DP features and SVM as classification method. The data of this study were taken from Al-Khaffaf dataset [21]. The two types of datasets have been used: the first type is about 27,620 sample for the three fonts classification and the second type is about 72,983 sample for the six fonts classification and both datasets are English character images in gray scale format with 8 bits. The results showed that CNN achieved the highest recognition rate in the proposed system compared with the two studies reached 99.75% and 98.329 % for the three and six fonts recognition, respectively. In addition, CNN got the least time required for creating model about 6 minutes and 23- 24 minutes for three and six fonts recognition, respectively. Based on the results, we can conclude that CNN technique is the best and most accurate model for recognizing fonts.","PeriodicalId":53384,"journal":{"name":"Iraqi Journal for Computers and Informatics","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Iraqi Journal for Computers and Informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.25195/ijci.v49i2.434","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Font recognition is one of the essential issues in document recognition and analysis, and is frequently a complex and time-consuming process. Many techniques of optical character recognition (OCR) have been suggested and some of them have been marketed, however, a few of these techniques considered font recognition. The issue of OCR is that it saves copies of documents to make them searchable, but the documents stop having the original appearance. To solve this problem, this paper presents a system for recognizing three and six English fonts from character images using Convolution Neural Network (CNN), and then compare the results of proposed system with the two studies. The first study used NCM features and SVM as a classification method, and the second study used DP features and SVM as classification method. The data of this study were taken from Al-Khaffaf dataset [21]. The two types of datasets have been used: the first type is about 27,620 sample for the three fonts classification and the second type is about 72,983 sample for the six fonts classification and both datasets are English character images in gray scale format with 8 bits. The results showed that CNN achieved the highest recognition rate in the proposed system compared with the two studies reached 99.75% and 98.329 % for the three and six fonts recognition, respectively. In addition, CNN got the least time required for creating model about 6 minutes and 23- 24 minutes for three and six fonts recognition, respectively. Based on the results, we can conclude that CNN technique is the best and most accurate model for recognizing fonts.