A Weighted Finite-State Framework for Correcting Errors in Natural Scene OCR
Richard Beaufort, C. Mancas-Thillou
doi: 10.1109/ICDAR.2007.41
With the growing market for cheap cameras, natural scene text has to be handled efficiently. Some works deal with text detection in the image, while more recent ones point out the challenges of text extraction and recognition. We propose an OCR correction system that handles both traditional recognizer errors and errors specific to natural scene images: cut characters, artistic display, incomplete sentences (common in advertisements), and out-of-vocabulary (OOV) words such as acronyms. The main algorithm is based on finite-state machines (FSMs) that deal with learned OCR confusions, capital/accented letters, and lexicon look-up. Moreover, since the OCR is not treated as a black box, several of its outputs are taken into account to intermingle the recognition and correction steps. Detailed results on a public database of natural scene words are presented, along with directions for future work.
{"title":"A Weighted Finite-State Framework for Correcting Errors in Natural Scene OCR","authors":"Richard Beaufort, C. Mancas-Thillou","doi":"10.1109/ICDAR.2007.41","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.41","url":null,"abstract":"With the increasing market of cheap cameras, natural scene text has to be handled in an efficient way. Some works deal with text detection in the image while more recent ones point out the challenge of text extraction and recognition. We propose here an OCR correction system to handle traditional issues of recognizer errors but also the ones due to natural scene images, i.e. cut characters, artistic display, incomplete sentences (present in advertisements) and out- of-vocabulary (OOV) words such as acronyms and so on. The main algorithm bases on finite-state machines (FSMs) to deal with learned OCR confusions, capital/accented letters and lexicon look-up. Moreover, as OCR is not considered as a black box, several outputs are taken into account to intermingle recognition and correction steps. Based on a public database of natural scene words, detailed results are also presented along with future works.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126581409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Decompose Document Image Using Integer Linear Programming
Dashan Gao, Yizhou Wang, Haitham A. Hindi, Minh Do
doi: 10.1109/ICDAR.2007.97
Document decomposition is a basic but crucial step for many document-related applications. This paper proposes a novel approach to decomposing document images into zones. It first generates overlapping zone hypotheses based on generic visual features; each candidate zone is then evaluated quantitatively by a learned generative zone model. We formulate zone inference as a constrained optimization problem, so as to select an optimal set of non-overlapping zones that cover a given document image. Experimental results demonstrate that the proposed method is robust to document structure variation and noise.
{"title":"Decompose Document Image Using Integer Linear Programming","authors":"Dashan Gao, Yizhou Wang, Haitham A. Hindi, Minh Do","doi":"10.1109/ICDAR.2007.97","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.97","url":null,"abstract":"Document decomposition is a basic but crucial step for many document related applications. This paper proposes a novel approach to decompose document images into zones. It first generates overlapping zone hypotheses based on generic visual features. Then, each candidate zone is evaluated quantitatively by a learned generative zone model. We formulate the zone inference problem into a constrained optimization problem, so as to select an optimal set of non-overlapping zones that cover a given document image. The experimental results demonstrate that the proposed method is very robust to document structure variation and noise.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131415135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images
Marc'Aurelio Ranzato, Yann LeCun
doi: 10.1109/ICDAR.2007.35
We describe an unsupervised learning algorithm for extracting sparse and locally shift-invariant features, and devise a principled procedure for learning hierarchies of such features. Each feature detector is composed of a set of trainable convolutional filters, a max-pooling layer over non-overlapping windows, and a point-wise sigmoid non-linearity. A second stage of more invariant features is fed with patches produced by the first-stage feature extractor and is trained in the same way. The method is used to pre-train the first four layers of a deep convolutional network, which achieves state-of-the-art performance on the MNIST dataset of handwritten digits with a final test error rate of 0.42%. Preliminary experiments on compression of bitonal document images show very promising results in terms of compression ratio and reconstruction error.
{"title":"A Sparse and Locally Shift Invariant Feature Extractor Applied to Document Images","authors":"Marc'Aurelio Ranzato, Yann LeCun","doi":"10.1109/ICDAR.2007.35","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.35","url":null,"abstract":"We describe an unsupervised learning algorithm for extracting sparse and locally shift-invariant features. We also devise a principled procedure for learning hierarchies of invariant features. Each feature detector is composed of a set of trainable convolutional filters followed by a max-pooling layer over non-overlapping windows, and a point-wise sigmoid non-linearity. A second stage of more invariant features is fed with patches provided by the first stage feature extractor, and is trained in the same way. The method is used to pre-train the first four layers of a deep convolutional network which achieves state-of-the-art performance on the MNIST dataset of handwritten digits. The final testing error rate is equal to 0.42%. Preliminary experiments on compression of bitonal document images show very promising results in terms of compression ratio and reconstruction error.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"2014 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127428249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Circular Noises Removal from Scanned Document Images
Gaofeng Meng, N. Zheng, Yuanlin Zhang, Yonghong Song
doi: 10.1109/ICDAR.2007.80
Defect inspection and correction is an important topic in scanned-document preprocessing. In this paper, a fast and robust algorithm is proposed for locating and removing a particular kind of circular noise caused by scanning documents with punched holes. First, the original image is downscaled by a carefully selected ratio, after which the punched holes leave small, distinctive regions. By examining these regions, hole noise can be detected and located quickly. To reduce false detections, a Hough transform is applied to the roughly located regions to confirm the holes. Finally, the circular noise is removed by fitting a bilinearly blended Coons surface that interpolates along the four edges of the noisy region. Experiments on a variety of scanned documents with punched holes demonstrate the feasibility and efficiency of the proposed algorithm.
{"title":"Circular Noises Removal from Scanned Document Images","authors":"Gaofeng Meng, N. Zheng, Yuanlin Zhang, Yonghong Song","doi":"10.1109/ICDAR.2007.80","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.80","url":null,"abstract":"Defects inspection and correction is an important topic in the fields of scanned documents preprocessing. In this paper, a very fast and robust algorithm is proposed for locating and removing a special kind of circular noises caused by scanning documents with punched holes. Firstly, original image is reduced according to an elaborately selected ratio. Punched holes after reduction will leave some distinctive small regions. By examining such small regions, holes noises can be fast detected and located. To diminish false detections, Hough transformation is applied to the roughly located regions to further confirm the located holes. Finally, circular noise is eliminated by fitting a bi-linear blending Coons surface which interpolates along the four edges of noisy region. Experiments on a variety of scanned documents with punched holes demonstrate the feasibility and efficiency of the proposed algorithm.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115466849","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Generating Copybooks from Consistent Handwriting Styles
R. Niels, L. Vuurpijl
doi: 10.1109/ICDAR.2007.123
The automatic extraction of handwriting styles is an important process with various applications in handwriting processing. We propose a novel method that employs hierarchical clustering to explore prominent clusters of handwriting. So-called membership vectors are introduced to describe a writer's handwriting: each membership vector records the frequency of occurrence of prototypical characters in that writer's handwriting. By clustering these vectors, consistent handwriting styles can be extracted, similar to the exemplar handwritings documented in copybooks. The results presented here are encouraging: the most prominent handwriting styles detected correspond to the broad style categories cursive, mixed, and print.
{"title":"Generating Copybooks from Consistent Handwriting Styles","authors":"R. Niels, L. Vuurpijl","doi":"10.1109/ICDAR.2007.123","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.123","url":null,"abstract":"The automatic extraction of handwriting styles is an important process that can be used for various applications in the processing of handwriting. We propose a novel method that employs hierarchical clustering to explore prominent clusters of handwriting. So-called membership vectors are introduced to describe the handwriting of a writer. Each membership vector reveals the frequency of occurrence of prototypical characters in a writer's handwriting. By clustering these vectors, consistent handwriting styles can be extracted, similar to the exemplar handwritings documented in copybooks. The results presented here are challenging. The most prominent handwriting styles detected correspond to the broad style categories cursive, mixed, and print.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115579193","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Multi-Agent System for Hand-drawn Diagram Recognition
G. Casella, V. Deufemia, V. Mascardi
doi: 10.1109/ICDAR.2007.21
In this paper we present AgentSketch, an agent-based system for the on-line recognition of hand-drawn diagrams. Agents manage the activity of the symbol recognizers and provide the user with efficient interpretations of the sketch, exploiting contextual information to resolve ambiguities. The system can be applied to a variety of domains by supplying recognizers for the symbols of each domain. A first experimental evaluation has been performed on UML use case diagrams to verify the effectiveness of the proposed approach.
{"title":"A Multi-Agent System for Hand-drawn Diagram Recognition","authors":"G. Casella, V. Deufemia, V. Mascardi","doi":"10.1109/ICDAR.2007.21","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.21","url":null,"abstract":"In this paper we present AgentSketch, an agent- based system for on-line recognition of hand-drawn diagrams. Agents are used for managing the activity of symbol recognizers and for providing efficient interpretations of the sketch to the user thanks to the use of contextual information for ambiguity resolution. The system can be applied to a variety of domains by providing recognizers of the symbols in that domain. A first experimental evaluation has been performed on the domain of UML use case diagrams to verify the effectiveness of the proposed approach.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115927093","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Multi-Stage Strategy to Perspective Rectification for Mobile Phone Camera-Based Document Images
Xu-Cheng Yin, Jun Sun, S. Naoi, K. Fujimoto, Y. Fujii, Koji Kurokawa, Hiroaki Takebe
doi: 10.1109/ICDAR.2007.22
Document images captured by a mobile phone camera often suffer from perspective distortion. Efficiency and accuracy are two key issues in designing a rectification system for such documents. In this paper, we propose a new perspective rectification system based on vanishing point detection. The system achieves both the desired efficiency and accuracy through a multi-stage strategy: in the first stage, document boundaries and straight lines are used to compute vanishing points; in the second, text baselines and block alignments are utilized; and in the last, character tilt orientations vote for the vertical vanishing point. A profit function evaluates the reliability of the vanishing points detected at each stage: if they are reliable, rectification ends at that stage; otherwise, the method continues to seek more reliable vanishing points in the next stage. We have tested the method on more than 400 images, including paper documents, signboards, and posters. The image acceptance rate is above 98.5%, with an average processing time of only about 60 ms.
{"title":"A Multi-Stage Strategy to Perspective Rectification for Mobile Phone Camera-Based Document Images","authors":"Xu-Cheng Yin, Jun Sun, S. Naoi, K. Fujimoto, Y. Fujii, Koji Kurokawa, Hiroaki Takebe","doi":"10.1109/ICDAR.2007.22","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.22","url":null,"abstract":"Document images captured by a mobile phone camera often have perspective distortions. Efficiency and accuracy are two important issues in designing a rectification system for such perspective documents. In this paper, we propose a new perspective rectification system based on vanishing point detection. This system achieves both the desired efficiency and accuracy using a multi-stage strategy: at the first stage, document boundaries and straight lines are used to compute vanishing points; at the second stage, text baselines and block aligns are utilized; and at the last stage, character tilt orientations are voted for the vertical vanishing point. A profit function is introduced to evaluate the reliability of detected vanishing points at each stage. If vanishing points at one stage are reliable, then rectification is ended at that stage. Otherwise, our method continues to seek more reliable vanishing points in the next stage. We have tested this method with more than 400 images including paper documents, signboards and posters. The image acceptance rate is more than 98.5% with an average speed of only about 60 ms.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"395 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114865217","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mathematical Formulae Recognition Using 2D Grammars
D. Prusa, Václav Hlaváč
doi: 10.1109/ICDAR.2007.165
We present a method for the off-line recognition of mathematical formulae based on the structural construction paradigm and two-dimensional grammars. In general, this approach can be applied successfully to the analysis of images containing objects that exhibit rich structural relations. An important benefit of structural construction is that symbol segmentation and structural analysis of the image are treated as a single intertwined process, which allows the system to avoid the errors that usually arise in a separate segmentation phase. We have developed and tested a pilot implementation, showing that the method is computationally efficient, practical, and able to cope with noise.
{"title":"Mathematical Formulae Recognition Using 2D Grammars","authors":"D. Prusa, Václav Hlaváč","doi":"10.1109/ICDAR.2007.165","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.165","url":null,"abstract":"We present a method for off-line mathematical formulae recognition based on the structural construction paradigm and two-dimensional grammars. In general, this approach can be successfully used in the analysis of images containing objects that exhibit rich structural relations. An important benefit of the structural construction is in treating the symbol segmentation in the image and its structural analysis as a single intertwined process. This allows the system to avoid errors usually appearing during the segmentation phase. We have developed and tested a pilot study proving that the method is computationally efficient, practical and able to cope with noise.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121914514","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On Using Classical Poetry Structure for Indian Language Post-Processing
A. Namboodiri, P J Narayanan, C. V. Jawahar
doi: 10.1109/ICDAR.2007.199
Post-processors are critical to the performance of language recognizers such as OCR and speech recognition systems. Dictionary-based post-processing commonly employs either an algorithmic or a statistical approach; other linguistic features are not exploited for this purpose, and the language analysis is largely limited to prose. This paper proposes a framework that uses the rich metric and formal structure of classical poetic forms in Indian languages to post-process a recognizer such as an OCR engine. We show that the structure present in the vrtta and prasa can be used efficiently to disambiguate cases that may be difficult for an OCR. The approach is efficient, complementary to other post-processing approaches, and can be used in conjunction with them.
{"title":"On Using Classical Poetry Structure for Indian Language Post-Processing","authors":"A. Namboodiri, P J Narayanan, C. V. Jawahar","doi":"10.1109/ICDAR.2007.199","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.199","url":null,"abstract":"Post-processors are critical to the performance of language recognizers like OCRs, speech recognizers, etc. Dictionary-based post-processing commonly employ either an algorithmic approach or a statistical approach. Other linguistic features are not exploited for this purpose. The language analysis is also largely limited to the prose form. This paper proposes a framework to use the rich metric and formal structure of classical poetic forms in Indian languages for post-processing a recognizer like an OCR engine. We show that the structure present in the form of the vrtta and prasa can be efficiently used to disambiguate some cases that may be difficult for an OCR. The approach is efficient, and complementary to other post-processing approaches and can be used in conjunction with them.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"105 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117207054","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
XML Data Representation in Document Image Analysis
A. Belaïd, Ingrid Falk, Yves Rangoni
doi: 10.1109/ICDAR.2007.272
This paper presents the XML-based formats ALTO, TEI, and METS used in digital libraries, and their value for data representation in a document image analysis and recognition (DIAR) process. The first part briefly presents these formats, focusing on their adequacy for the structural representation and modeling of DIAR data. The second part shows how they can be used in a reverse-engineering process and presents their implementation as a data representation framework.
{"title":"XML Data Representation in Document Image Analysis","authors":"A. Belaïd, Ingrid Falk, Yves Rangoni","doi":"10.1109/ICDAR.2007.272","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.272","url":null,"abstract":"This paper presents the XML-based formats ALTO, TEI, METS used for digital libraries and their interest for data representation in a document image analysis and recognition (DIAR) process. In the first part we briefly present these formats with focus on their adequacy for structural representation and modeling of DIAR data. The second part shows how these formats can be used in a reverse engineering process. Their implementation as a data representation framework will be shown.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129507616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}