Ninth International Conference on Document Analysis and Recognition (ICDAR 2007): Latest Literature

Content-level Annotation of Large Collection of Printed Document Images

Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.89

Anand Kumar, C. V. Jawahar

A large annotated corpus is critical to the development of robust optical character recognizers (OCRs). However, creating annotated corpora is a tedious task, and it is especially laborious when the annotation is at the character level. In this paper, we propose an efficient hierarchical approach for annotating a large collection of printed document images. We align document images with independently keyed-in text. The method is model-driven and is intended to annotate large collections of documents, scanned at three different resolutions, at the character level. We employ an XML representation for storing the annotation information. APIs are provided for access at the content level for easy use in training and evaluating OCRs and in other document understanding tasks.
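As a rough illustration of the XML storage and content-level access described in the abstract above, the following Python sketch parses a small annotation fragment and iterates over its character-level entries. The element names (`page`, `line`, `word`, `char`), the `bbox` attribute, and the helper functions are all hypothetical; the abstract does not specify the actual schema or API.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation layout (an assumption, not the authors' schema):
# page -> line -> word -> char, each carrying a bounding box "x0 y0 x1 y1".
SAMPLE = """
<page image="page_001.tif" dpi="300">
  <line id="l1" bbox="10 20 400 60">
    <word id="w1" text="OCR" bbox="10 20 90 60">
      <char text="O" bbox="10 20 35 60"/>
      <char text="C" bbox="37 20 62 60"/>
      <char text="R" bbox="64 20 90 60"/>
    </word>
  </line>
</page>
"""


def parse_bbox(value):
    """Convert a 'x0 y0 x1 y1' string into a tuple of four integers."""
    return tuple(int(v) for v in value.split())


def iter_chars(xml_text):
    """Yield (character, bounding box) pairs in document order."""
    root = ET.fromstring(xml_text)
    for char in root.iter("char"):
        yield char.get("text"), parse_bbox(char.get("bbox"))


def iter_words(xml_text):
    """Yield (word text, bounding box) pairs in document order."""
    root = ET.fromstring(xml_text)
    for word in root.iter("word"):
        yield word.get("text"), parse_bbox(word.get("bbox"))


if __name__ == "__main__":
    for text, bbox in iter_chars(SAMPLE):
        print(text, bbox)
```

A hierarchical layout of this kind would let an OCR training script request labeled samples at whichever level it needs (line, word, or character) without re-parsing image coordinates by hand.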

Citations: 43

An Efficient Thresholding Algorithm for Brazilian Bank Checks

Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.50

C. Mello, B. Bezerra, C. Zanchettin, V. Macário

An algorithm for thresholding images of bank checks is presented herein. These images have complex background elements, and some of the patterns make it very hard to distinguish between the text and the texture pattern defined by the bank. For the binarization process, an adaptive global thresholding algorithm based on ROC curves is proposed and compared to several well-known algorithms. The images generated by the new algorithm achieved a hit rate of 97% for recognition of the CMC7 code.
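For readers unfamiliar with global thresholding, the sketch below shows Otsu's method, one widely known global thresholding algorithm. It is not the ROC-based algorithm proposed in the paper, and the abstract does not say which algorithms were used for comparison; it is included only to show what choosing a single threshold for a whole grayscale image looks like. The function names and the sample pixel values are assumptions.

```python
def otsu_threshold(pixels):
    """Return the gray level t (0..255) that maximizes between-class variance.

    `pixels` is a flat sequence of integer gray levels in the range 0..255.
    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_low = 0   # number of pixels at or below the candidate threshold
    sum_low = 0      # sum of gray levels of those pixels
    best_t, best_var = 0, -1.0

    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        weight_high = total - weight_low
        if weight_low == 0:
            continue          # no pixels in the low class yet
        if weight_high == 0:
            break             # all pixels are in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        # Between-class variance, up to a constant factor of 1 / total**2.
        between_var = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(pixels, t):
    """Mark dark (ink) pixels with 1 and light (background) pixels with 0."""
    return [1 if p <= t else 0 for p in pixels]


if __name__ == "__main__":
    # Toy image: dark ink values and a lighter textured background.
    sample = [20] * 50 + [30] * 50 + [200] * 100 + [220] * 100
    t = otsu_threshold(sample)
    print("threshold:", t, "ink pixels:", sum(binarize(sample, t)))
```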

Citations: 4

An Approach to Improve Accuracy Rate of On-line Signature Verification Systems of Different Sizes

Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.46

R. Araujo, George D. C. Cavalcanti, E. C. B. C. Filho

This paper discusses the problem of size variation in on-line signature verification systems. The main idea of the article is to investigate the influence of size variation on the feature extraction techniques and how this distortion can affect the final classification performance of the systems. In this study, a new classification approach based on the work of Kholmatov and Yanikoglu is suggested in order to measure this performance. In addition, a feature selection technique is applied to the description of the patterns in order to overcome the size variation problem. All the experiments were performed on a database constructed with signatures of three different sizes and skilled forgeries. This kind of study plays an important role in the implementation of systems that use different signature sources.

Citations: 5

Text Segmentation from Complex Background Using Sparse Representations

Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.246

Wumo Pan, T. D. Bui, C. Suen

A novel method for segmenting text from complex backgrounds is presented in this paper. The idea is inspired by recent developments in searching for a sparse signal representation among a family of over-complete atoms, which is called a dictionary. We assume that the image under investigation is composed of two components: the foreground text and the complex background. We further assume that the latter can be modeled as a piece-wise smooth function. We then choose two dictionaries, where the first gives a sparse representation of one component and a non-sparse representation of the other, while the second does the opposite. By looking for the sparse representations in each dictionary, we can decompose the image into its two components. After that, text segmentation can be achieved easily by applying simple thresholding to the text component. Preliminary experiments show some promising results.

Citations: 15

Table Recognition and Understanding from PDF Files

Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.241

Tamir Hassan, Robert Baumgartner

We propose a flexible method for detecting and understanding tables in PDF files, which is not reliant upon one particular feature being present, for example ruling lines or indentations, and is therefore applicable to a wide variety of visual presentations. We describe the steps required in transforming the low-level PDF instructions into text segments, lines and boxes on a page. We propose three different classifications for published tables, and develop methods to detect these tables and correctly identify their respective rows and columns. We also explain how to recognize spanning rows and columns, and multi-line rows. Experimental results show that our algorithm is effective in converting a wide variety of tabular presentations into HTML for information extraction purposes.

Citations: 70

Automatic Document Logo Detection

Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.68

Guangyu Zhu, D. Doermann

Automatic logo detection and recognition continues to be of great interest to the document retrieval community as it enables effective identification of the source of a document. In this paper, we propose a new approach to logo detection and extraction in document images that robustly classifies and precisely localizes logos using a boosting strategy across multiple image scales. At a coarse scale, a trained Fisher classifier performs initial classification using features from document context and connected components. Each logo candidate region is further classified at successively finer scales by a cascade of simple classifiers, which allows false alarms to be discarded and the detected region to be refined. Our approach is segmentation-free and layout-independent. We define a meaningful evaluation metric to measure the quality of logo detection using labeled ground truth. We demonstrate the effectiveness of our approach using a large collection of real-world documents.

Citations: 120

Identification of Non-Black Inks Using HSV Colour Space

Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.138

Haritha Dasari, C. Bhagvati

An important problem in questioned document examination is the detection of alterations made by inserting words or additional lines of text. In this paper, we present a statistical pattern recognition driven approach that views it as a two-class problem. Given two sample words, one of which is a suspected alteration, it is necessary to determine whether the two belong to the same class or to different classes. Our approach is defined in two stages. We start with an 11-dimensional vector that comprises colour features defined in HSV space and texture features. During the training phase, we derive within-class and between-class L1 distance distributions and identify an optimal threshold that minimizes Type I and Type II errors. During the second, or test, phase, we take a pair of unknown samples and use the threshold value obtained from the training phase to decide whether the two belong to the same class or to distinct classes. Our experimental results involving more than 95000 pairs of word images show that the approach gives an accuracy of over 90% for gel and roller pens and an accuracy of 85% for ball pen writings.
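The two-stage procedure in the abstract above can be sketched roughly as follows, assuming the distance between feature vectors is the L1 (city-block) distance and that "optimal" means minimizing the total count of the two error types on the training pairs. The function names, the error-counting rule, and the toy distances are assumptions made for illustration; the abstract does not give the authors' exact criterion or features.

```python
def l1_distance(a, b):
    """City-block distance between two equal-length feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def best_threshold(within, between):
    """Pick the threshold with the fewest training errors.

    within  -- distances between pairs known to be written with the same ink
    between -- distances between pairs known to be written with different inks

    Decision rule (an assumption): distance <= t means "same class".
    Returns (threshold, number of misclassified training pairs).
    """
    best = None
    for t in sorted(set(within) | set(between)):
        false_different = sum(d > t for d in within)    # same ink, called different
        false_same = sum(d <= t for d in between)       # different ink, called same
        errors = false_different + false_same
        if best is None or errors < best[1]:
            best = (t, errors)
    return best


def same_ink(a, b, t):
    """Test phase: decide whether two unknown samples belong to the same class."""
    return l1_distance(a, b) <= t


if __name__ == "__main__":
    # Toy training distances, invented for illustration only.
    within = [0.5, 1.0, 1.5, 2.0]
    between = [1.8, 3.0, 4.0, 5.0]
    threshold, errors = best_threshold(within, between)
    print("threshold:", threshold, "training errors:", errors)
    print("same ink?", same_ink([0.0, 0.0], [1.0, 0.4], threshold))
```

Only observed distances are tried as candidate thresholds, which is sufficient because the error count can change only at those values.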

Citations: 21

Off-line Signature Verification Using Enhanced Modified Direction Features in Conjunction with Neural Classifiers and Support Vector Machines

Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.192

Vu Nguyen, M. Blumenstein, V. Muthukkumarasamy, G. Leedham

As a biometric, signatures have been widely used to identify people. In the context of static image processing, the lack of dynamic information such as velocity, pressure and the direction and sequence of strokes has made the realization of accurate off-line signature verification systems more challenging as compared to their on-line counterparts. In this paper, we propose an effective method to perform off-line signature verification based on intelligent techniques. Structural features are extracted from the signature's contour using the modified direction feature (MDF) and its extended version: the Enhanced MDF (EMDF). Two neural network-based techniques and Support Vector Machines (SVMs) were investigated and compared for the process of signature verification. The classifiers were trained using genuine specimens and other randomly selected signatures taken from a publicly available database of 3840 genuine signatures from 160 volunteers and 4800 targeted forged signatures. A distinguishing error rate (DER) of 17.78% was obtained with the SVM whilst keeping the false acceptance rate for random forgeries (FARR) below 0.16%.

Citations: 78

How to Reduce the Size of Bank Check Image Archive?

Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.134

V. Shapiro

TIFF group 4 is the dominant format for storing bitonal bank check images in digital archives. Kept for at least seven years, these images demand a huge volume of disk space. Image size also becomes a critical resource from the bandwidth perspective in the era of widespread Internet banking and image exchange between financial institutions. This paper proposes an approach for lossy compression of TIFF group 4 images that is able to reduce the compressed size by an extra 30%. The approach consists of two separate steps: generic preprocessing, applicable to any document, and a check context-sensitive step that is specific to check image compression. Compared to the widely used resolution reduction method, the proposed approach preserves the essential information in the image while providing comparable compression rates.

Citations: 0

Language Models for Handwritten Short Message Services

Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.153

Emmanuel Prochasson, C. Viard-Gaudin, E. Morin

Handwriting is an alternative method for entering the text of short message service (SMS) messages. However, the texts produced feature a whole new language: they include, for instance, abbreviations and consonantal writing, which sprang up to save time and follow fashion. We have collected and processed a significant number of such handwritten SMS messages, and used various strategies to tackle this challenging area of handwriting recognition. We propose to study three phenomena more specifically: consonant skeleton, rebus, and phonetic writing. For each of them, we compare the raw results produced by a standard recognition system with those obtained when using a specific language model.

Citations: 16
