Title: Proposed Deep Learning System for Arabic Text Detection and Recognition
Authors: Ghufran Jafar Salman, M. S. M. Altaei
Venue: 2023 15th International Conference on Developments in eSystems Engineering (DeSE)
Publication date: 2023-01-09
DOI: 10.1109/DeSE58274.2023.10100235
URL: https://doi.org/10.1109/DeSE58274.2023.10100235
Citation count: 0
Abstract
Building a system that recognizes Arabic words or texts is challenging, and the task becomes harder when the text appears in various sizes and fonts, some of them visually complex. This work builds a system that recognizes Arabic words and texts by creating a dataset and training a deep learning model on it. The system converts scanned text into machine-readable text. Each of the 1,000 words in the dataset was rendered in 24 different Arabic fonts. Words in images are located and cropped using image processing methods. Finally, a deep learning algorithm (a convolutional neural network, CNN) extracts features from each cropped word and retrieves the text words that are visually most similar to it. In experiments, the system achieved 99% accuracy in word detection and 96% accuracy in recognition.
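The abstract does not say which image processing method locates the words, so the following is only a minimal sketch of one common approach, a vertical projection profile over a binarized text line. It is not the authors' implementation. The function name `segment_words` and the parameter `min_gap` are hypothetical, and the input is assumed to be a list of rows in which 1 marks ink and 0 marks background.

```python
from typing import List, Optional, Tuple


def segment_words(image: List[List[int]], min_gap: int = 3) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column spans of words, right to left.

    Assumption (not from the paper): words are separated by runs of at
    least `min_gap` blank columns. Narrower gaps are treated as spacing
    inside a word, since Arabic words can contain unconnected letters.
    """
    if not image:
        return []
    width = len(image[0])
    # Vertical projection profile: amount of ink in each column.
    profile = [sum(row[x] for row in image) for x in range(width)]

    spans: List[Tuple[int, int]] = []
    start: Optional[int] = None
    # The trailing 0 is a sentinel that closes a run touching the right edge.
    for x, ink in enumerate(profile + [0]):
        if ink and start is None:
            start = x  # a run of inked columns begins
        elif not ink and start is not None:
            if spans and start - spans[-1][1] < min_gap:
                # Small gap: extend the previous span instead of starting a new word.
                spans[-1] = (spans[-1][0], x)
            else:
                spans.append((start, x))
            start = None
    # Arabic is read right to left, so report the rightmost word first.
    return spans[::-1]


# Hypothetical one-row "image": two ink runs 1 column apart, then a 3-column gap.
line = [[0, 1, 1, 0, 1, 0, 0, 0, 1, 1, 1, 0]]
words = segment_words(line, min_gap=3)
```

Each returned span could then be cropped from the image and passed to the CNN recognizer. The `min_gap` threshold would need tuning per font and size, which is one reason a fixed-threshold profile is only a starting point.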