Handwritten Geez Digit Recognition Using Deep Learning

Mukerem Ali Nur, Mesfin Abebe, Rajesh Sharma Rajendran
Journal: Appl. Comput. Intell. Soft Comput., vol. 502 1, pp. 8515810:1-8515810:12
Published: 2022-11-08
DOI: 10.1155/2022/8515810 (https://doi.org/10.1155/2022/8515810)
Citations: 2

Abstract

Amharic is the second most widely spoken Semitic language after Arabic, with more than 100 million speakers in Ethiopia and neighboring countries. Many historical documents are written in the Geez script. Digitizing historical handwritten documents and recognizing their handwritten characters is essential to preserving these valuable documents, and handwritten digit recognition is one of the tasks involved in digitizing handwritten documents from different sources. Currently, there is very little research on handwritten Geez digit recognition, and no organized dataset is publicly available to researchers. Convolutional neural networks (CNNs) are well suited to pattern recognition tasks such as handwritten document recognition because they extract features from different styles of writing. In this work, we propose a CNN model to recognize Geez digits. Deep neural networks, which have recently shown exceptional performance in numerous pattern recognition and machine learning applications, are used here to recognize handwritten Geez digits, something that has not previously been attempted for Ethiopic scripts. Our dataset, which contains 51,952 images of handwritten Geez digits collected from 524 individuals, is used to train and evaluate the CNN model. The CNN significantly improves on the performance of several machine-learning classification methods. The proposed CNN model achieves an accuracy of 96.21% and a loss of 0.2013. Compared with earlier research on handwritten Geez digit recognition, the developed CNN model attains higher recognition accuracy.
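
For readers unfamiliar with the two reported figures, the sketch below shows how classification accuracy and a mean cross-entropy loss are commonly computed from a classifier's predicted class probabilities. It assumes the reported loss is categorical cross-entropy, which the abstract does not state. The function names, the three-class setup, and the toy predictions are illustrative only and are not taken from the paper.

```python
import math


def accuracy(probs, labels):
    """Fraction of samples whose highest-probability class equals the true label."""
    correct = sum(
        1
        for p, y in zip(probs, labels)
        if max(range(len(p)), key=p.__getitem__) == y
    )
    return correct / len(labels)


def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return -total / len(labels)


# Toy predictions over 3 hypothetical classes (NOT data from the paper).
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.3, 0.4, 0.3],
    [0.2, 0.2, 0.6],
]
labels = [0, 1, 0, 2]

acc = accuracy(probs, labels)
loss = cross_entropy(probs, labels)
print(f"accuracy={acc:.2f} loss={loss:.4f}")
```

Under this assumption the two numbers measure different things: accuracy counts only whether the top-ranked class is correct, while the loss also penalizes correct predictions that were made with low confidence.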