Adaptive Importance Pooling Network for Scene Text Recognition

Peng Ren, Qingsong Yu, Xuanqi Wu, Ziyang Wang
Published in: Proceedings of the 2020 6th International Conference on Computing and Artificial Intelligence
Publication date: 2020-04-23
DOI: 10.1145/3404555.3404614 (https://doi.org/10.1145/3404555.3404614)
Citation count: 1

Abstract

Scene text recognition (STR) has attracted extensive attention in the pattern recognition community. With the development of deep learning, object detection and sequence recognition schemes based on deep neural networks have been widely used for this task. Discriminative features play a vital role when text appears against complex scene backgrounds. However, for specific tasks, inappropriate pooling strategies may lose feature details. To tackle this problem, this paper proposes an end-to-end adaptive importance pooling network (AIPN). Concretely, we embed the novel adaptive importance pooling (AIP) strategy into the feature extraction stage. Additionally, we adopt an attention-based LSTM as the decoder so that the model automatically focuses on informative image feature regions while predicting the final recognition results. Furthermore, to reduce the burden of feature representation on the subsequent recognition stage, a text rectification network (TRN), supervised by the text recognition part, is used to normalize the input text images. Experimental results show that our model achieves encouraging performance on the STR benchmark datasets IIIT5K, SVT, ICDAR-2003 and ICDAR-2013.
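The abstract does not give the formulation of AIP, so the following is only a minimal sketch of the general idea that a pooling window can weight its entries by importance instead of keeping only the maximum or a plain average. It uses softmax-weighted pooling as a stand-in technique; the function name `softmax_weighted_pool` and the parameters `window` and `beta` are assumptions made for illustration and do not come from the paper.

```python
import math


def softmax_weighted_pool(feature_map, window=2, beta=1.0):
    """Pool non-overlapping window x window patches with softmax weights.

    Illustrative stand-in only, not the paper's AIP strategy.
    beta = 0 reduces to average pooling; a large beta approaches max pooling.
    """
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - window + 1, window):
        out_row = []
        for c in range(0, cols - window + 1, window):
            patch = [feature_map[r + i][c + j]
                     for i in range(window) for j in range(window)]
            peak = max(patch)  # subtract the max for numerical stability
            weights = [math.exp(beta * (v - peak)) for v in patch]
            total = sum(weights)
            # Importance-weighted sum of the patch values.
            out_row.append(sum(w * v for w, v in zip(weights, patch)) / total)
        pooled.append(out_row)
    return pooled


if __name__ == "__main__":
    fm = [[1.0, 2.0, 0.0, 0.0],
          [3.0, 4.0, 0.0, 8.0]]
    print(softmax_weighted_pool(fm, beta=0.0))   # behaves like average pooling
    print(softmax_weighted_pool(fm, beta=50.0))  # behaves almost like max pooling
```

In this sketch `beta` is a fixed hyperparameter; an adaptive variant would presumably learn the weighting from data, but how the paper does so is not stated in the abstract.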