Proceedings of Sixth International Conference on Document Analysis and Recognition: Latest Articles

Recognition of unconstrained handwritten numeral strings using decision value generator

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953746

K. Kim, Y. Chung, Jinho Kim, C. Suen

This paper presents a method for recognizing unconstrained handwritten numeral strings using a decision value generator. The numeral string recognition system is composed of three modules: pre-segmentation, segmentation and recognition. The pre-segmentation module classifies a numeral string into sub-images, such as isolated digits, touching digits or broken digits, based on the confidence value of the decision value generator. The segmentation module splits the touching digits using the reliability value of the decision value generator. Both segmentation-based and segmentation-free methods are used in classification and segmentation. To evaluate the proposed method, experiments were conducted using the handwritten numeral strings of NIST SD19, and a higher recognition performance than in previous works was obtained.

Citations: 7

On the influence of vocabulary size and language models in unconstrained handwritten text recognition

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953795

Urs-Viktor Marti, H. Bunke

In this paper we present a system for unconstrained handwritten text recognition. The system consists of three components: preprocessing, feature extraction and recognition. In the preprocessing phase, a page of handwritten text is divided into its lines and the writing is normalized by means of skew and slant correction, positioning and scaling. From a normalized text line image, features are extracted using a sliding window technique. At each position of the window, nine geometrical features are computed. The core of the system, the recognizer, is based on hidden Markov models. For each individual character, a model is provided. The character models are concatenated into words using a vocabulary. Moreover, the word models are concatenated into models that represent full lines of text. Thus the difficult problem of segmenting a line of text into its individual words can be overcome. To enhance the recognition capabilities of the system, a statistical language model is integrated into the hidden Markov model framework. To preselect useful language models and compare them, perplexity is used. Both perplexity as originally proposed and normalized perplexity are considered.
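The use of perplexity to compare language models can be illustrated with a minimal sketch. The function name `perplexity`, the toy bigram table and the floor probability for unseen bigrams are assumptions made for illustration only; they are not taken from the paper, and the normalized perplexity the authors consider is not reproduced here.

```python
import math


def perplexity(sentence, bigram_prob, floor=1e-6):
    """Perplexity of a word sequence under a bigram language model.

    bigram_prob maps (previous_word, word) -> P(word | previous_word).
    Unseen bigrams receive the small `floor` probability (an assumption
    of this sketch, standing in for proper smoothing).
    """
    if not sentence:
        raise ValueError("sentence must contain at least one word")
    words = ["<s>"] + list(sentence)  # "<s>" marks the sentence start
    log_sum = 0.0
    for prev, cur in zip(words, words[1:]):
        log_sum += math.log2(bigram_prob.get((prev, cur), floor))
    n = len(words) - 1  # number of predicted words
    return 2 ** (-log_sum / n)


# Hypothetical toy model: lower perplexity means the model predicts
# the text better, which is why it can be used to preselect models.
toy_model = {
    ("<s>", "the"): 0.5,
    ("the", "cat"): 0.25,
    ("cat", "sat"): 0.5,
}
pp = perplexity(["the", "cat", "sat"], toy_model)
```

In this toy case the average log-probability per word is -4/3 bits, so `pp` equals 2 raised to the power 4/3.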

Citations: 48

Speeding up on-line recognition of handwritten characters by pruning the prototype set

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953840

V. Vuori, Jorma T. Laaksonen, E. Oja, J. Kangas

This work describes a prototype-based online handwritten character recognition system and a two-phase recognition scheme aimed at speeding up recognition. In the first phase, the prototype set is pruned and ordered on the basis of preclassification performed with heavily down-sampled characters and prototypes. In the second phase, the final classification is performed without down-sampling by using the reduced set of prototypes. Two down-sampling methods, one linear and one nonlinear, are analyzed with respect to recognition time and accuracy.
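The two-phase scheme described above can be sketched roughly as follows. This is an illustrative simplification, not the authors' implementation: the helper names, the evenly spaced down-sampling, the point-wise Euclidean distance and the assumption that all strokes have the same number of points are choices made for this sketch, and the paper's actual distance measure and down-sampling methods may differ.

```python
import math


def downsample(points, n):
    """Keep n points at evenly spaced indices (a simple linear scheme)."""
    if n >= len(points):
        return list(points)
    if n == 1:
        return [points[0]]
    step = (len(points) - 1) / (n - 1)
    return [points[round(i * step)] for i in range(n)]


def distance(a, b):
    """Sum of point-wise Euclidean distances between equal-length strokes."""
    return sum(math.dist(p, q) for p, q in zip(a, b))


def classify(char, prototypes, coarse_n=4, keep=2):
    """Two-phase nearest-prototype classification.

    prototypes is a list of (label, points) pairs; every point list is
    assumed to have the same length as `char`.
    """
    coarse_char = downsample(char, coarse_n)
    # Phase 1: order prototypes by a cheap distance on down-sampled data
    # and keep only the `keep` best candidates (the pruned set).
    ranked = sorted(
        prototypes,
        key=lambda proto: distance(coarse_char, downsample(proto[1], coarse_n)),
    )
    shortlist = ranked[:keep]
    # Phase 2: full-resolution comparison against the pruned set only.
    best_label, _ = min(shortlist, key=lambda proto: distance(char, proto[1]))
    return best_label


# Hypothetical prototypes: three 8-point strokes.
prototypes = [
    ("-", [(float(i), 0.0) for i in range(8)]),
    ("|", [(0.0, float(i)) for i in range(8)]),
    ("/", [(float(i), float(i)) for i in range(8)]),
]
sample = [(float(i), 0.1) for i in range(8)]  # a slightly shifted "-"
label = classify(sample, prototypes)
```

The speed-up comes from phase 2 evaluating the expensive full-resolution distance on only `keep` prototypes instead of the whole set; the trade-off is that a prototype pruned in phase 1 can no longer be chosen.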

Citations: 26

Adaptation of an address reading system to local mail streams

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953911

A. Brakensiek, J. Rottland, F. Wallhoff, G. Rigoll

A scheme for handwriting adaptation for post offices is described to improve recognition performance of German addresses. The recognition system is based on a tied-mixture hidden Markov model, whose parameters are updated using the expectation maximization technique, the maximum likelihood linear regression algorithm and a new discriminative adaptation technique, the scaled likelihood linear regression. Contrary to the usual approach of adapting a writer-independent system to a specific writer, we propose to adapt the system to the writer-independent data of a specific post office. The resulting system for each post office yields up to 16% fewer word recognition errors.

Citations: 7

Separating handwritten material from machine printed text using hidden Markov models

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953828

J. Guo, Matthew Y. Ma

In this paper, we address the problem of separating handwritten annotations from machine-printed text within a document. We present an algorithm that is based on the theory of hidden Markov models (HMMs) to distinguish between machine-printed and handwritten materials. No OCR results are required prior to or during the process, and the classification is performed at the word level. Handwritten annotations are not limited to marginal areas: the approach can deal with document images having handwritten annotations overlaid on machine-printed text, and it has shown promise in our experiments. Experimental results show that the proposed method can achieve 72.19% recall for fully extracted handwritten words and 90.37% for partially extracted words. The precision of extracting handwritten words has reached 92.86%.

Citations: 115

Alignment of free layout color texts for character recognition

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953922

H. Hase, M. Yoneda, Toshiyuki Shinokawa, C. Suen

A realignment algorithm for irregular character strings on color documents is proposed. Color documents often contain poorly aligned texts such as inclined or curved texts sometimes with distortion. In order to recognize them, we classify these texts into five types. After determining the type, we realign all the characters in a text horizontally, then test them with an ordinary character recognition method. Lastly, we show some experimental results for texts extracted from real color documents and discuss some causes of misrecognition.

Citations: 18

A new stroke-based directional feature extraction approach for handwritten Chinese character recognition

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953867

Xue Gao, Lianwen Jin, Junxun Yin, Jiancheng Huang

A directional feature extraction approach based on stroke directional decomposition of a Chinese character is proposed. Without extracting the skeleton or contour of the character, the four directional sub-patterns, namely, horizontal (-), vertical (|), left up diagonal (/) and right up diagonal (\) sub-patterns could be obtained directly from analyzing the stroke directional characteristics of the character. Five kinds of line-density based elastic meshing methods are presented to extract cellular directional features. Experimentation on a total of 18800 handwritten samples from 940 categories produces a recognition rate of 92.71%, showing the effectiveness of the proposed approach.

Citations: 8

A graph grammar to recognize textured symbols

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953833

Gemma Sánchez, J. Lladós

This paper describes a graph grammar to model textured symbols in a graphics recognition framework. A textured symbol means a symbol consisting of repetitive structured patterns. We propose a method to infer a graph grammar from a structured texture detected in a document, and the subsequent parser to decide whether a symbol is accepted by the grammar. The grammar is based on a region adjacency graph representation of the vectorized document and the productions are based on the neighboring relations of the patterns forming the textured symbol. The syntactic framework is applied to an architectural plan understanding application.

Citations: 10

On-line handwritten signature verification using wavelets and back-propagation neural networks

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953934

Dariusz Z. Lejtman, Susan E. George

This paper investigates dynamic handwritten signature verification (HSV) using the wavelet transform, with verification by a backpropagation neural network (NN). It is yet another avenue in the approach to HSV, and it is found to produce excellent results when compared with other methods of dynamic, or on-line, HSV. Using a database of dynamic signatures collected from 41 Chinese writers and 7 from Latin script, we extract features (including pen pressure, x and y velocity, angle of pen movement and angular velocity) from the signature and apply the Daubechies-6 wavelet transform, using the coefficients as input to a NN, which learns to verify signatures with a False Rejection Rate (FRR) of 0.0% and a False Acceptance Rate (FAR) of less than 0.1.
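For readers unfamiliar with the two error measures quoted above, the following minimal sketch shows how FRR and FAR are commonly computed from verifier scores. The function name `error_rates`, the score lists and the acceptance threshold are hypothetical values chosen for illustration; they are not data or code from the paper.

```python
def error_rates(genuine_scores, forgery_scores, threshold):
    """Return (FRR, FAR) as fractions between 0 and 1.

    A signature is accepted when its score is >= threshold.
    FRR: share of genuine signatures that are wrongly rejected.
    FAR: share of forgeries that are wrongly accepted.
    """
    if not genuine_scores or not forgery_scores:
        raise ValueError("both score lists must be non-empty")
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    false_accepts = sum(1 for s in forgery_scores if s >= threshold)
    return (
        false_rejects / len(genuine_scores),
        false_accepts / len(forgery_scores),
    )


# Hypothetical verifier outputs for four genuine and four forged signatures.
frr, far = error_rates(
    genuine_scores=[0.9, 0.8, 0.95, 0.4],
    forgery_scores=[0.1, 0.3, 0.6, 0.2],
    threshold=0.5,
)
```

Raising the threshold lowers FAR at the cost of a higher FRR, so the two rates are usually reported together for a fixed operating point.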

Citations: 28

Web-based cooperative document understanding

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953815

Nicolas Roussel, O. Hitz, R. Ingold

The paper presents ongoing work on the design of a Web-based framework for cooperative document understanding. The authors begin by explaining their motivations for designing a new document understanding environment. They then describe the different levels of cooperation they intend to support and how Web technologies can help in this respect. Finally, the authors present Edelweiss, the framework they are currently developing based on this approach.

Citations: 16
