Proceedings of Sixth International Conference on Document Analysis and Recognition: Latest Articles

Lexicon-driven handwritten character string recognition for Japanese address reading

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953912

Cheng-Lin Liu, Masashi Koga, H. Fujisawa

This paper proposes a handwritten character string recognition method for Japanese mail address reading with a very large vocabulary. The recognition is performed by classification-embedded lexicon matching based on over-segmentation. The lexicon contains 111,349 address phrases and is represented in a trie structure. In recognition, the input text line image is matched with all lexicon entries by beam search to obtain reliable character segmentation and retrieve valid phrases. A classifier is embedded in lexicon matching to select from a dynamic set the characters matched with a candidate pattern. The beam search and the character classification jointly enable accurate phrase identification in real time. In experiments on 3,589 live mail images, the proposed method achieved a correct rate of 83.68% with an error rate of less than 1%.
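
The combination of a trie lexicon, over-segmentation, and beam search described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: the names `TrieNode`, `build_trie`, `beam_match`, the `SHAPES` table, and the stand-in `toy_classify` scorer are all hypothetical, and a real system would score image patterns with a trained character classifier. The sketch only shows how the classifier is restricted to the characters allowed by the current trie node (the "dynamic set") while consecutive primitive segments are merged into candidate patterns.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    children: dict = field(default_factory=dict)
    is_phrase: bool = False


def build_trie(phrases):
    """Store every lexicon phrase in a character trie."""
    root = TrieNode()
    for phrase in phrases:
        node = root
        for ch in phrase:
            node = node.children.setdefault(ch, TrieNode())
        node.is_phrase = True
    return root


def beam_match(segments, root, classify, max_merge=2, beam_width=3):
    """Match over-segmented input against the trie with beam search.

    beams[p] holds hypotheses (score, text, trie_node) that have consumed
    the first p primitive segments. Only characters that are children of
    the hypothesis's trie node are scored (the dynamic candidate set).
    """
    n = len(segments)
    beams = [[] for _ in range(n + 1)]
    beams[0] = [(0.0, "", root)]
    for pos in range(n):
        # Prune: keep only the best hypotheses ending at this position.
        kept = sorted(beams[pos], key=lambda h: h[0], reverse=True)[:beam_width]
        for score, text, node in kept:
            for k in range(1, max_merge + 1):
                if pos + k > n:
                    break
                pattern = tuple(segments[pos:pos + k])  # merged candidate pattern
                for ch, child in node.children.items():
                    beams[pos + k].append(
                        (score + classify(pattern, ch), text + ch, child)
                    )
    # A result is valid only if it ends exactly on a complete lexicon phrase.
    finals = [(s, t) for s, t, node in beams[n] if node.is_phrase]
    return max(finals, default=None)


# Hypothetical stand-in classifier: each character has one known "shape",
# given as the tuple of primitive segments it is composed of.
SHAPES = {"A": ("a",), "B": ("b1", "b2"), "C": ("c",)}


def toy_classify(pattern, ch):
    """Return a log-score: 0.0 for a perfect match, -5.0 otherwise."""
    return 0.0 if SHAPES[ch] == pattern else -5.0


lexicon = build_trie(["AB", "AC", "BC"])
best = beam_match(["a", "b1", "b2"], lexicon, toy_classify)
print(best)
```

In this toy input the character "B" is split into two primitive segments, so the correct reading is found only when the search merges them into one candidate pattern.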

Citations: 9

Web-based cooperative document understanding

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953815

Nicolas Roussel, O. Hitz, R. Ingold

The paper presents ongoing work on the design of a Web-based framework for cooperative document understanding. The authors begin by explaining their motivations for designing a new document understanding environment. They then describe the different levels of cooperation they intend to support and how Web technologies can help in this respect. Finally, the authors present Edelweiss, the framework they are currently developing based on this approach.

Citations: 16

Prediction of handwriting legibility

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953935

M. E. Dehkordi, N. Sherkat, Tony Allen

This paper describes an independent handwriting style classifier that has been designed to select the best recognizer for a given style of writing. For this purpose, handwriting legibility has been defined and a method has been implemented that can predict this legibility. The technique consists of two phases. In the feature extraction phase, a set of 16 features is extracted from the image contour. These features have been selected from amongst a set of pre-recognition features as those that contribute the most (95%) to a discriminant between legible and illegible words. In the classification phase, a Probability Neural Network based on Bayesian decision is introduced to predict the legibility of unknown handwriting, using a Parzen method to estimate a class conditional density function from the available training data.
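
The classification phase combines a Parzen-window density estimate with a Bayesian decision, which can be sketched in a few lines. The following is a generic illustration, not the authors' classifier: the function names, the Gaussian kernel, the smoothing width `sigma`, and the two-dimensional toy feature vectors are assumptions (the paper uses 16 contour features).

```python
import math


def parzen_density(x, samples, sigma):
    """Gaussian Parzen-window estimate of p(x | class) from one class's samples."""
    d = len(x)
    norm = (2 * math.pi * sigma ** 2) ** (d / 2)
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2 * sigma ** 2))
    return total / (len(samples) * norm)


def pnn_classify(x, training, priors=None, sigma=0.5):
    """Bayes decision: pick the class maximising prior * estimated density."""
    labels = list(training)
    if priors is None:
        # Assumption: equal priors when none are supplied.
        priors = {c: 1.0 / len(labels) for c in labels}
    scores = {c: priors[c] * parzen_density(x, training[c], sigma) for c in labels}
    return max(scores, key=scores.get), scores


# Hypothetical 2-D feature vectors standing in for the real contour features.
training = {
    "legible": [(0.0, 0.0), (0.2, 0.1)],
    "illegible": [(2.0, 2.0), (1.8, 2.1)],
}
label, scores = pnn_classify((0.1, 0.0), training)
print(label)
```

The width `sigma` controls how far each training sample's influence extends; in practice it would be tuned on held-out data.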

Citations: 3

Visual exploration and functional document labeling

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953901

V. Eglin, Antoine Gagneux

This paper presents a new approach to textual data labeling based on texture analysis. Texture is used here to show the impact of document composition on visual exploration. We demonstrate how textural properties are well adapted to typography characterization by categorizing document regions into visual text classes (such as headings, head- and footnotes, paragraphs, abstracts, etc.). We reference and classify different types of text fonts according to their visual aspect and the visual impression that emerges from the textual data. Experiments on a set of various document images show a good accuracy and robustness for our method.

Citations: 7

A recognition method of machine-printed monetary amounts based on the two-dimensional segmentation and the bottom-up parsing

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953929

Masashi Koga, Ryuji Mine, H. Sako, H. Fujisawa

We propose a new method to recognize machine-printed monetary amounts based on two-dimensional segmentation and bottom-up parsing. In conventional segmentation-based methods, the system segments the image only along the direction of the character line. This new method segments the image both horizontally and vertically, and extracts candidate character segments correctly even if the image is noisy or the characters are fragmented. A parsing module detects the optimal sequence of candidate segments using linguistic knowledge. In our method, a context-free grammar describes the linguistic constraints on the monetary amounts. We devised a new bottom-up parsing technique that interprets the results of character classification of the two-dimensionally segmented sub-images. We tested the validity of the new method using 1,314 images, and found that it improves the recognition accuracy significantly.
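
To make the idea of checking classified characters against a context-free grammar concrete, here is a minimal bottom-up recogniser using the standard CYK algorithm. It is a swapped-in textbook technique, not the parsing method devised in the paper, and it works on a plain character sequence rather than on two-dimensionally segmented sub-images. The toy grammar (a currency sign followed by one or more digits) and all names are assumptions made for illustration.

```python
def cyk_accepts(tokens, unary, binary, start):
    """Bottom-up CYK recognition for a grammar in Chomsky normal form.

    unary:  list of (lhs, terminal) rules
    binary: list of (lhs, (left_nonterminal, right_nonterminal)) rules
    table[i][l] holds the nonterminals that derive tokens[i:i + l].
    """
    n = len(tokens)
    if n == 0:
        return False
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        table[i][1] = {lhs for lhs, term in unary if term == tok}
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for lhs, (b, c) in binary:
                    if b in left and c in right:
                        table[i][length].add(lhs)
    return start in table[0][n]


# Hypothetical toy grammar: AMOUNT -> CUR NUM, NUM -> DIG NUM | digit
UNARY = [("CUR", "$")]
for d in "0123456789":
    UNARY += [("DIG", d), ("NUM", d)]
BINARY = [("AMOUNT", ("CUR", "NUM")), ("NUM", ("DIG", "NUM"))]

print(cyk_accepts(list("$120"), UNARY, BINARY, "AMOUNT"))
print(cyk_accepts(list("1$20"), UNARY, BINARY, "AMOUNT"))
```

A string is accepted only when the whole sequence can be built up from smaller valid pieces, which is how grammar constraints can reject character sequences that a classifier alone would allow.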

Citations: 1

Active radical modeling for handwritten Chinese characters

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953790

D. Shi, S. Gunn, R. Damper

Handwritten Chinese character recognition is one of the most difficult problems of pattern recognition. Since the majority of Chinese characters are made up from just a small set of primitive structures (radicals), this paper describes an approach to active radical modeling for such handwritten characters. The most significant characteristic of our method is that radicals can be found robustly without stroke extraction, and the principal variations of the radical can be encoded in a small number of parameters. In the training phase, the example radicals are represented by manually-labeled 'landmark' points. Then a small number of principal components of the eigenvectors are calculated to capture the main variation of the training examples from the mean radical. In the matching phase, each radical model is fitted to the image evidence by adjusting the shape parameters in terms of chamfer distance minimization. Initial experiments are conducted on 1,100 loosely-constrained Chinese character categories written by 200 different writers. The correct matching rate is 95.8%, showing that our radical modeling is effective and capable of forming a sound basis for handwritten Chinese character recognition.
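
The matching phase, in which shape parameters are adjusted to minimise chamfer distance, can be sketched with a one-parameter toy model. This is an illustrative simplification and not the authors' procedure: the function names, the single variation mode, the grid search over candidate parameter values, and the brute-force nearest-point distance are all assumptions (a real system would typically use several principal components and a precomputed distance transform).

```python
import math


def chamfer_distance(model_points, edge_points):
    """Mean distance from each model point to its nearest image edge point."""
    return sum(
        min(math.dist(p, e) for e in edge_points) for p in model_points
    ) / len(model_points)


def shape_from_params(mean_shape, mode, b):
    """Generate landmark points as mean shape + b * one variation mode."""
    return [
        (mx + b * vx, my + b * vy)
        for (mx, my), (vx, vy) in zip(mean_shape, mode)
    ]


def fit_shape(mean_shape, mode, edge_points, candidates):
    """Pick the shape parameter whose generated shape best fits the edges."""
    return min(
        candidates,
        key=lambda b: chamfer_distance(
            shape_from_params(mean_shape, mode, b), edge_points
        ),
    )


# Hypothetical three-landmark stroke whose middle point can move vertically.
mean_shape = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
mode = [(0.0, 0.0), (0.0, 1.0), (0.0, 0.0)]
edges = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.0)]
best_b = fit_shape(mean_shape, mode, edges, [-1.0, -0.5, 0.0, 0.5, 1.0])
print(best_b)
```

Because the model is described by a small number of parameters, fitting reduces to a low-dimensional search rather than a search over every landmark position.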

Citations: 10

Application of slant correction to handwritten Japanese address recognition

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953874

Yimei Ding, Masayuki Okada, F. Kimura, Y. Miyake

We describe the application of slant correction to handwritten Japanese address recognition. Horizontal strokes of Japanese characters in handwritten Japanese addresses often slant toward the upper right. We propose an iterative, modified 8-directional chain code method that is adapted to slant estimation for these addresses. Recognition experiments on the IPTP cdrom2 database are performed to evaluate the effectiveness of slant correction. Comparative results show that slant correction improves the accuracy of both character segmentation and character recognition, which in turn improves the accuracy of address recognition.

Citations: 4

Recognition of unconstrained handwritten numeral strings using decision value generator

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953746

K. Kim, Y. Chung, Jinho Kim, C. Suen

This paper presents recognition of unconstrained handwritten numeral strings using a decision value generator. The numeral string recognition system is composed of three modules: pre-segmentation, segmentation and recognition. The pre-segmentation module classifies a numeral string into sub-images, such as isolated digits, touching digits or broken digits, based on the confidence value of the decision value generator. The segmentation module splits the touching digits using the reliability value of the decision value generator. Both segmentation-based and segmentation-free methods are used in classification and segmentation. To evaluate the proposed method, experiments were conducted using the handwritten numeral strings of NIST SD19, and a higher recognition performance than in previous works was obtained.

Citations: 7

A new stroke-based directional feature extraction approach for handwritten Chinese character recognition

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953867

Xue Gao, Lianwen Jin, Junxun Yin, Jiancheng Huang

A directional feature extraction approach based on stroke directional decomposition of a Chinese character is proposed. Without extracting the skeleton or contour of the character, the four directional sub-patterns, namely, horizontal (-), vertical (|), left up diagonal (/) and right up diagonal (\) sub-patterns could be obtained directly from analyzing the stroke directional characteristics of the character. Five kinds of line-density based elastic meshing methods are presented to extract cellular directional features. Experimentation on a total of 18,800 handwritten samples from 940 categories produces a recognition rate of 92.71%, showing the effectiveness of the proposed approach.

Citations: 8

Character recognition experiments using Unipen data

Proceedings of Sixth International Conference on Document Analysis and Recognition

Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953836

M. Parizeau, Alexandre Lemieux, Christian Gagné

This paper presents experiments that compare the performances of several versions of a regional-fuzzy representation (RFR) developed for cursive handwriting recognition (CHR). These experiments are conducted using a common neural network classifier, namely a multilayer perceptron (MLP) trained with backpropagation. Results are given for isolated digits, isolated lower-case letters and lower-case letters extracted from phrases, from the Unipen database. Data set Train-R01/V07 is used for training while DevTest-R01/V02 is used for testing. The best overall representation yields recognition rates of 97.0% and 85.6% for isolated digits and isolated lower-case letters, respectively, and 84.4% for lower-case letters extracted from phrases.

Citations: 47
