Adaptive Importance Pooling Network for Scene Text Recognition

Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence. Pub Date: 2020-04-23. DOI: 10.1145/3404555.3404614

Peng Ren, Qingsong Yu, Xuanqi Wu, Ziyang Wang


Citations: 1

Abstract

Scene text recognition (STR) has attracted extensive attention in the pattern recognition community. With the development of deep learning, object detection and sequence recognition schemes based on deep neural networks have been widely used for this task. Discriminative features play a vital role when text appears against complex scene backgrounds. However, for specific tasks, an inappropriate pooling strategy may lose feature details. To tackle this problem, this paper proposes an end-to-end adaptive importance pooling network (AIPN). Concretely, we embed the novel adaptive importance pooling (AIP) strategy into the feature extraction stage. Additionally, we adopt an attention-based LSTM as the decoder so that it automatically focuses on the informative regions of the image features while predicting the final recognition results. Furthermore, to reduce the burden of feature representation on the subsequent recognition stage, a text rectification network (TRN), supervised by the text recognition part, is used to normalize the input text images. Experimental results show that our model achieves encouraging performance on the STR benchmark datasets IIIT5K, SVT, ICDAR-2003 and ICDAR-2013.
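The abstract does not define how AIP computes its pooled values, so the following is only a minimal sketch of one generic way to make pooling "importance-aware": each 2x2 window is reduced by a softmax-weighted average of its activations instead of a hard max or a plain mean. The function name `importance_pool_2x2` and the `temperature` parameter are hypothetical, and the softmax weighting itself is an assumption for illustration, not the authors' method.

```python
import math


def importance_pool_2x2(feature_map, temperature=1.0):
    """Softmax-weighted 2x2 pooling over a 2-D list of floats.

    Hypothetical stand-in for an importance-based pooling layer; the
    weighting rule is an assumption, not taken from the paper.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    # Step over non-overlapping 2x2 windows; a trailing odd row/column is dropped.
    for r in range(0, rows - 1, 2):
        out_row = []
        for c in range(0, cols - 1, 2):
            window = [
                feature_map[r][c], feature_map[r][c + 1],
                feature_map[r + 1][c], feature_map[r + 1][c + 1],
            ]
            # Importance weights: softmax of the activations, shifted by the
            # window maximum for numerical stability.
            peak = max(window)
            exps = [math.exp((v - peak) / temperature) for v in window]
            total = sum(exps)
            out_row.append(sum(v * e / total for v, e in zip(window, exps)))
        pooled.append(out_row)
    return pooled
```

Under this assumption, a large `temperature` makes the result approach average pooling and a small one makes it approach max pooling, so the pooled value always lies between the window mean and the window maximum. This illustrates the trade-off the abstract alludes to: max pooling discards weaker activations, while average pooling dilutes strong ones.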


