A novel ensemble deep network framework for scene text recognition
Authors: Sunil Kumar Dasari, S. Mehta, D. Steffi
Journal: International Journal of Reconfigurable and Embedded Systems (IJRES)
Publication type: Journal Article
Publication date: 2024-07-01
DOI: 10.11591/ijres.v13.i2.pp403-413 (https://doi.org/10.11591/ijres.v13.i2.pp403-413)
Citation count: 0
Abstract
In recent years, scene text recognition (STR) has generally been treated as a sequence-to-sequence problem, and it remains one of the most important and difficult challenges in image-based sequence recognition. Attention-based techniques have greater potential for modelling context and semantics, but they tend to overfit when training data are inadequate. This paper proposes a novel framework, the ensemble deep network (EDN), which comprises a customized convolutional neural network (CNN) and a deep autoencoder. The customized CNN introduces an optimal spatial transformation module that normalizes irregular input text to a uniform size before it is read. The deep autoencoder adds an effective attention mechanism that exploits the inherent features. In simulations, the ensemble deep network-proposed system (EDN-PS) outperforms existing state-of-the-art techniques on both regular and irregular scene text, with better results on the IIIT5K, ICDAR-13, ICDAR-15, and CUTE datasets.
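The abstract does not specify how the attention mechanism is computed. As a rough illustration only, the following sketch shows generic dot-product attention over a sequence of feature vectors. It is a textbook formulation, not the authors' EDN-PS design, and the names `softmax`, `attend`, `features`, and `query` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, query):
    """Generic dot-product attention (an assumption, not the paper's method).

    Scores each feature vector by its dot product with the query,
    normalizes the scores with softmax, and returns the weights together
    with the weighted sum of the feature vectors (the context vector).
    """
    scores = [sum(f_i * q_i for f_i, q_i in zip(f, query)) for f in features]
    weights = softmax(scores)
    dim = len(features[0])
    context = [
        sum(w * f[d] for w, f in zip(weights, features)) for d in range(dim)
    ]
    return weights, context


# Hypothetical toy input: three 2-dimensional feature vectors, standing in
# for a sequence of visual features extracted from a text image.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]
weights, context = attend(features, query)
```

In this toy input the third feature vector is the most similar to the query, so it receives the largest weight and dominates the context vector. A recognizer of the kind described would typically compute one such context vector per decoded character.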