RRConvNet: Recursive-residual Network for Real-life Character Image Recognition
Tadele Mengiste, B. Belay, Bezawork Tilahun, Tsiyon Worku, Tesfa Tegegne
DOI: 10.5220/0011270400003277
Abstract: Variations in fonts, styles, and ways of writing a character have been major bottlenecks in OCR research. Such problems have been tackled rapidly through advances in deep neural networks (DNNs). However, the number of network parameters and limited feature reusability remain issues when applying deep convolutional neural networks (DCNNs) to character image recognition. To address these challenges, this paper proposes an extensible, recursive-residual ConvNet architecture (RRConvNet) for real-life character image recognition. Unlike standard DCNNs, RRConvNet incorporates two extensions: recursive supervision and skip connections. To enhance recognition performance without the parameter cost of extra convolutions, layers with up to three recursions are proposed; the feature maps produced after each recursion are used to reconstruct the target character, and the same reconstruction layers are shared across all recursions. The second extension is a short skip connection from the input to the reconstruction output layer, which reuses the character feature maps already learned by the prior layer and also provides an alternative path for gradients when they become too small. With an overall character recognition accuracy of 98.2 percent, the proposed method achieves state-of-the-art results on both publicly available and private test datasets.
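Although the abstract gives no implementation details, the two extensions it describes (weight-shared recursion with per-recursion supervision, plus a skip connection from the input features into each reconstruction) can be sketched in a few lines. The following PyTorch snippet is a minimal, hypothetical illustration rather than the authors' code: the channel width, the classifier-style reconstruction head, and all identifiers (RRBlock, embed, recursive, reconstruct) are assumptions made for the example.

    import torch.nn as nn

    class RRBlock(nn.Module):
        """Sketch of a recursive-residual block: one conv is applied up to
        `recursions` times with shared weights, and a shared reconstruction
        head is applied after every recursion (recursive supervision)."""
        def __init__(self, channels: int = 64, num_classes: int = 10, recursions: int = 3):
            super().__init__()
            self.recursions = recursions
            self.embed = nn.Conv2d(1, channels, 3, padding=1)              # input feature extractor
            self.recursive = nn.Conv2d(channels, channels, 3, padding=1)   # shared across recursions
            self.relu = nn.ReLU(inplace=True)
            # Reconstruction head, reused (same weights) after every recursion.
            self.reconstruct = nn.Sequential(
                nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(channels, num_classes)
            )

        def forward(self, x):
            h0 = self.relu(self.embed(x))   # features reused by the skip connection
            h, outputs = h0, []
            for _ in range(self.recursions):
                h = self.relu(self.recursive(h))
                # Short skip connection: reuse the input feature maps at the
                # reconstruction stage; also gives gradients a direct path.
                outputs.append(self.reconstruct(h + h0))
            return outputs  # one prediction per recursion (recursive supervision)

    # Usage sketch (assumed training setup, with F = torch.nn.functional):
    # outs = RRBlock()(images)  # images: (N, 1, H, W) grayscale characters
    # loss = sum(F.cross_entropy(o, labels) for o in outs) / len(outs)

Under this reading, averaging the loss over the per-recursion outputs supervises every recursion while the shared weights keep the parameter count independent of the recursion depth, which matches the abstract's stated goals.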