Text detection based on convolutional neural networks with spatial pyramid pooling

2016 IEEE International Conference on Image Processing (ICIP) Pub Date: 2016-09-01 DOI: 10.1109/ICIP.2016.7532514

Rui Zhu, Xiao-Jiao Mao, Qi-Hai Zhu, Ning Li, Yubin Yang

Citations: 15

Abstract

Text detection is a difficult task due to the significant diversity of the texts appearing in natural scene images. In this paper, we propose a novel text descriptor, SPP-net, extracted by equipping the Convolutional Neural Network (CNN) with spatial pyramid pooling. We first compute the feature maps from the original text lines without any cropping or warping, and then generate the fixed-size representations for text discrimination. Experimental results on the latest ICDAR 2011 and 2013 datasets have proven that the proposed descriptor outperforms the state-of-the-art methods by a noticeable margin on F-measure with its merit of incorporating multi-scale text information and its flexibility of describing text regions with different sizes and shapes.
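The fixed-size representation mentioned in the abstract can be illustrated with a minimal sketch of spatial pyramid pooling in plain Python. This is not the authors' implementation: the pyramid levels (1, 2, 4), the use of max pooling, and the names `spp_max_pool` and `make_map` are all assumptions chosen for illustration, since the abstract does not give the network configuration. The sketch only shows how feature maps of different heights and widths are reduced to vectors of the same length.

```python
import math


def spp_max_pool(feature_maps, levels=(1, 2, 4)):
    """Pool C feature maps of any H x W size into a fixed-length vector.

    feature_maps: list of C channels, each a list of H rows of W floats.
    levels: assumed pyramid levels; level n splits each map into n x n bins.
    The output length is C * sum(n * n for n in levels), independent of H, W.
    """
    pooled = []
    for fmap in feature_maps:  # one 2D map per channel
        h, w = len(fmap), len(fmap[0])
        for n in levels:
            for i in range(n):
                # Floor/ceil bin edges keep every bin non-empty, even if n > h.
                r0 = (i * h) // n
                r1 = math.ceil((i + 1) * h / n)
                for j in range(n):
                    c0 = (j * w) // n
                    c1 = math.ceil((j + 1) * w / n)
                    pooled.append(
                        max(fmap[r][c]
                            for r in range(r0, r1)
                            for c in range(c0, c1))
                    )
    return pooled


def make_map(h, w):
    """Hypothetical stand-in for one CNN feature map of size h x w."""
    return [[float(r * w + c) for c in range(w)] for r in range(h)]


# Two "text lines" with different sizes and aspect ratios, one channel each.
short_line = [make_map(3, 10)]
long_line = [make_map(5, 40)]

vec_a = spp_max_pool(short_line)
vec_b = spp_max_pool(long_line)
print(len(vec_a), len(vec_b))  # both lengths are 1 * (1 + 4 + 16) = 21
```

Because the vector length depends only on the number of channels and the pyramid levels, a classifier placed after this step can accept text regions of any size without cropping or warping them first.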
