A Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images

Ninth International Conference on Document Analysis and Recognition (ICDAR 2007). Pub Date: 2007-09-23. DOI: 10.1109/ICDAR.2007.35

Marc'Aurelio Ranzato, Yann LeCun

Citations: 27

Abstract

We describe an unsupervised learning algorithm for extracting sparse and locally shift-invariant features. We also devise a principled procedure for learning hierarchies of invariant features. Each feature detector is composed of a set of trainable convolutional filters followed by a max-pooling layer over non-overlapping windows, and a point-wise sigmoid non-linearity. A second stage of more invariant features is fed with patches provided by the first stage feature extractor, and is trained in the same way. The method is used to pre-train the first four layers of a deep convolutional network which achieves state-of-the-art performance on the MNIST dataset of handwritten digits. The final testing error rate is equal to 0.42%. Preliminary experiments on compression of bitonal document images show very promising results in terms of compression ratio and reconstruction error.
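
The feature detector described in the abstract (a bank of convolutional filters, max-pooling over non-overlapping windows, then a point-wise sigmoid) can be sketched as a forward pass in NumPy. This is only an illustrative sketch, not the authors' implementation: the filter count, the 5x5 filter size, the 2x2 pooling window, and the random (untrained) filter values are assumptions that the abstract does not specify, and the unsupervised training procedure is not shown.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation of one image with one kernel."""
    kh, kw = kernel.shape
    # windows has shape (H - kh + 1, W - kw + 1, kh, kw)
    windows = np.lib.stride_tricks.sliding_window_view(image, (kh, kw))
    return np.einsum("ijkl,kl->ij", windows, kernel)


def max_pool_nonoverlap(fmap, size):
    """Max over non-overlapping size x size windows (ragged edges are cropped)."""
    h, w = fmap.shape
    h, w = h - h % size, w - w % size
    blocks = fmap[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))


def sigmoid(x):
    """Point-wise logistic non-linearity."""
    return 1.0 / (1.0 + np.exp(-x))


def feature_stage(image, filters, pool_size):
    """Filter bank -> max-pooling -> sigmoid, producing one map per filter."""
    return np.stack([
        sigmoid(max_pool_nonoverlap(conv2d_valid(image, f), pool_size))
        for f in filters
    ])


# Hypothetical sizes, chosen only for illustration.
rng = np.random.default_rng(0)
image = rng.random((28, 28))               # MNIST-sized input
filters = rng.standard_normal((4, 5, 5))   # 4 untrained 5x5 filters
features = feature_stage(image, filters, pool_size=2)
```

With these assumed sizes, `features` has shape `(4, 12, 12)`: each 5x5 filter turns the 28x28 input into a 24x24 map, and 2x2 pooling halves each dimension. The local shift invariance comes from the pooling step, since a response that moves within one pooling window leaves the pooled output unchanged. A second stage in the sense of the abstract would apply the same kind of function to patches of the first stage's output.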

