RRConvNet: Recursive-residual Network for Real-life Character Image Recognition
Tadele Mengiste, B. Belay, Bezawork Tilahun, Tsiyon Worku, Tesfa Tegegne
DOI: 10.5220/0011270400003277
Abstract: Variations in fonts, styles, and ways of writing a character have been major bottlenecks in OCR research. Such problems have been tackled rapidly through advances in deep neural networks (DNNs). However, the number of network parameters and limited feature reusability remain issues when applying deep convolutional neural networks (DCNNs) to character image recognition. To address these challenges, this paper proposes an extensible, recursive-residual ConvNet architecture (RRConvNet) for real-life character image recognition. Unlike standard DCNNs, RRConvNet incorporates two extensions: recursive supervision and skip connections. To enhance recognition performance without the parameter cost of extra convolutions, layers with up to three recursions are proposed; the feature maps produced after each recursion are used to reconstruct the target character, and the same reconstruction layers are shared across all recursions. The second extension is a short skip connection from the input to the reconstruction output layer, which reuses the character feature maps already learned by the prior layer and also provides an alternative path for gradients when they become too small. With an overall character recognition accuracy of 98.2 percent, the proposed method achieves state-of-the-art results on both publicly available and private test datasets.
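Although the abstract gives no implementation details, the two extensions it describes (weight-shared recursion with per-recursion supervision, plus a skip connection from the input features into each reconstruction) can be sketched in a few lines. The following PyTorch snippet is a minimal, hypothetical illustration rather than the authors' code: the channel width, the classifier-style reconstruction head, and all identifiers (RRBlock, embed, recursive, reconstruct) are assumptions made for the example.

    import torch.nn as nn

    class RRBlock(nn.Module):
        """Sketch of a recursive-residual block: one conv is applied up to
        `recursions` times with shared weights, and a shared reconstruction
        head is applied after every recursion (recursive supervision)."""
        def __init__(self, channels: int = 64, num_classes: int = 10, recursions: int = 3):
            super().__init__()
            self.recursions = recursions
            self.embed = nn.Conv2d(1, channels, 3, padding=1)              # input feature extractor
            self.recursive = nn.Conv2d(channels, channels, 3, padding=1)   # shared across recursions
            self.relu = nn.ReLU(inplace=True)
            # Reconstruction head, reused (same weights) after every recursion.
            self.reconstruct = nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, num_classes)
            )

        def forward(self, x):
            h0 = self.relu(self.embed(x))   # features reused by the skip connection
            h, outputs = h0, []
            for _ in range(self.recursions):
                h = self.relu(self.recursive(h))
                # Short skip connection: reuse the input feature maps at the
                # reconstruction stage; also gives gradients a direct path.
                outputs.append(self.reconstruct(h + h0))
            return outputs  # one prediction per recursion (recursive supervision)

    # Usage sketch (assumed training setup, with F = torch.nn.functional):
    # outs = RRBlock()(images)  # images: (N, 1, H, W) grayscale characters
    # loss = sum(F.cross_entropy(o, labels) for o in outs) / len(outs)

Under this reading, averaging the loss over the per-recursion outputs supervises every recursion while the shared weights keep the parameter count independent of the recursion depth, which matches the abstract's stated goals.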