Proceedings of the Fourth International Conference on Document Analysis and Recognition: latest articles

Handwritten ZIP code recognition

Proceedings of the Fourth International Conference on Document Analysis and Recognition

Pub Date : 1997-08-18 DOI: 10.1109/ICDAR.1997.620613

Gregory I. Dzuba, Alexander Filatov, A. Volgunin

The encoding of the delivery point code (DPC) for a handwritten address is one of the most complex problems in US mail delivery automation. This paper describes a real-time system intended to recognize the 5-digit ZIP code part of the DPC. To increase system performance, the results of ZIP code recognition are cross-validated with those of city and state name recognition. The main principles of the handwritten word recognizer, which provides the core of the system, are explained. The system throughput is 40,000 address blocks per hour. Experimental results on live mail pieces are presented. The ZIP code recognition rate is 73% with a 1% error rate.

Citations: 30
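
The cross-validation step mentioned in the ZIP code abstract above can be illustrated with a minimal sketch. The abstract does not say how the recognizers are combined, so everything here is an assumption: the toy `ZIP_DIRECTORY`, the product-of-confidences score, and the `accept_threshold` parameter are hypothetical names chosen for illustration only.

```python
# Toy ZIP -> (city, state) directory for illustration; a real system would use a postal database.
ZIP_DIRECTORY = {
    "14260": ("BUFFALO", "NY"),
    "14620": ("ROCHESTER", "NY"),
    "80301": ("BOULDER", "CO"),
}

def cross_validate(zip_candidates, city_state_candidates, accept_threshold=0.5):
    """Return the best ZIP consistent with a city/state hypothesis, or None to reject.

    zip_candidates: list of (zip_string, confidence)
    city_state_candidates: list of ((city, state), confidence)
    """
    best_zip, best_score = None, 0.0
    for zip_code, zip_conf in zip_candidates:
        expected = ZIP_DIRECTORY.get(zip_code)
        if expected is None:
            continue  # ZIP hypothesis not in the directory
        for place, place_conf in city_state_candidates:
            if place == expected:
                # Assumed scoring rule: product of the two recognizer confidences.
                score = zip_conf * place_conf
                if score > best_score:
                    best_zip, best_score = zip_code, score
    return best_zip if best_score >= accept_threshold else None

zip_cands = [("14620", 0.55), ("14260", 0.45)]
place_cands = [(("BUFFALO", "NY"), 0.9)]
print(cross_validate(zip_cands, place_cands, accept_threshold=0.4))
```

In this toy run the lower-ranked ZIP hypothesis wins because it is the only one that agrees with the recognized city and state, which is the kind of correction cross-validation is meant to provide.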

Indexing and classification of TV news articles based on telop recognition

Proceedings of the Fourth International Conference on Document Analysis and Recognition

Pub Date : 1997-08-18 DOI: 10.1109/ICDAR.1997.619882

Y. Ariki, T. Teranishi

In accumulating and retrieving multimedia information such as images, speech and text, it is necessary to compress and retrieve the information efficiently and accurately. The purpose of this paper is to construct a multimedia database of TV news images based on telop character recognition. The first step is to detect telop frames and to segment the characters by differentiating the telop frames, based on the fact that character regions have high brightness and the character edges are clear. The second step is telop character recognition, performed by a subspace method using direction histogram features. The third step is indexing, by extracting nouns after morphological analysis of the recognized telop characters. These nouns correspond to keywords and are given to TV news articles as their indices. Finally, TV news articles are classified into 10 topics such as politics, economics, culture, amusements and sports, based on the extracted indices. We employed an index-topic table to classify the articles using indices. The telop character recognition rate was 65.7% and the article classification rate was 67.3%.

Citations: 20
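
The final classification step in the telop abstract above relies on an index-topic table. A minimal sketch of how such a table could be used follows. The table entries, the one-vote-per-index rule, and the alphabetical tie-break are all assumptions made for illustration; the abstract does not describe the table's contents or the decision rule.

```python
from collections import Counter

# Hypothetical index-topic table: each index noun maps to the topics it suggests.
INDEX_TOPIC_TABLE = {
    "election": ["politics"],
    "minister": ["politics"],
    "budget": ["politics", "economics"],
    "stock": ["economics"],
    "tournament": ["sports"],
}

def classify_article(indices, table=INDEX_TOPIC_TABLE):
    """Vote for a topic using one article's index nouns; return None if no index is known."""
    votes = Counter()
    for noun in indices:
        for topic in table.get(noun, []):
            votes[topic] += 1
    if not votes:
        return None
    # Highest vote count wins; ties are broken alphabetically so the result is deterministic.
    return min(votes, key=lambda topic: (-votes[topic], topic))

print(classify_article(["election", "minister", "budget"]))
```

Because the indices come from recognized telop characters, recognition errors tend to produce nouns that are missing from the table; in this sketch such nouns simply cast no vote.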

Confidence computation improvement in an optical field reading system

Proceedings of the Fourth International Conference on Document Analysis and Recognition

Pub Date : 1997-08-18 DOI: 10.1109/ICDAR.1997.620629

A. Benedetti, Z. Kovács-Vajna

An expression in closed form is derived for the recognition error vs. rejection rate of optical character or word recognition systems. This expression makes it possible to define a lower bound for the error rate of any recognition system employing a rejection process based on a confidence threshold. The relation has also proved useful for making a quantitative comparison between two confidence computation methods implemented in a system for reading USA Census '90 hand-written forms. The newly proposed method is based upon a confidence model integrating single-character confidence levels, digram statistics and other information from the dictionary matching phase. At a 50% rejection rate, the field error rate calculated using the new confidence computation algorithm decreased from 47.7% to 44.6%, which represents a considerable improvement, given a theoretical lower bound of 40.8% on the error rate.

Citations: 1
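
The trade-off between error rate and rejection rate under a confidence threshold, as discussed in the abstract above, can be made concrete with a small sketch. The function names are hypothetical, and `ideal_error_bound` implements only the elementary bound obtained when every rejected item is an error; it is not claimed to be the closed-form expression derived in the paper.

```python
def error_reject_point(results, threshold):
    """results: list of (confidence, is_correct). Returns (rejection_rate, error_rate).

    Items with confidence below the threshold are rejected; the error rate is
    measured over the accepted items only.
    """
    accepted = [ok for conf, ok in results if conf >= threshold]
    rejection_rate = 1.0 - len(accepted) / len(results)
    if not accepted:
        return rejection_rate, 0.0
    error_rate = sum(1 for ok in accepted if not ok) / len(accepted)
    return rejection_rate, error_rate

def ideal_error_bound(raw_error_rate, rejection_rate):
    """Error rate of a perfect rejector that discards only errors until none remain."""
    if rejection_rate >= 1.0:
        return 0.0
    return max(0.0, (raw_error_rate - rejection_rate) / (1.0 - rejection_rate))

# Toy recognizer output: (confidence, whether the field was read correctly).
toy = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False), (0.1, False)]
print(error_reject_point(toy, threshold=0.5))
print(ideal_error_bound(raw_error_rate=0.5, rejection_rate=0.5))
```

Comparing the measured point with the ideal bound at the same rejection rate shows how much room a better confidence computation could still recover.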

Global interpolation method II for handwritten numbers overlapping a border by automatic knowledge acquisition of overlapped conditions

Proceedings of the Fourth International Conference on Document Analysis and Recognition

Pub Date : 1997-08-18 DOI: 10.1109/ICDAR.1997.620558

S. Naoi, Maki Yabuki

The global interpolation method we propose can extract a handwritten alpha-numeric character pattern even if it overlaps a border. After borders are removed, our method interpolates the blank segments in a character by globally evaluating segment label connectivity and connectedness, producing characters with smooth edges. However, the method cannot interpolate missing superpositioning segments, such as an overlapping horizontal line in the number "2". To solve this problem, we propose global interpolation method II, which adds top-down recognition processing to the bottom-up processing of the existing global interpolation method by automatically acquiring knowledge of the relationship between the overlapped condition and recognition reliability. Experiments on overlapping characters generated from the ETL database showed that global interpolation method II achieves almost the same accuracy as is obtained on the original ETL database.

Citations: 2

Hand-printed Chinese character recognition via machine learning

Proceedings of the Fourth International Conference on Document Analysis and Recognition

Pub Date : 1997-08-18 DOI: 10.1109/ICDAR.1997.619839

A. Amin, Seung-Gwon Kim, C. Sammut

Recognition of Chinese characters has been an area of great interest for many years, and a large number of research papers and reports have already been published in this area. There are several major problems with Chinese character recognition: Chinese characters are distinct and ideographic, the character size is very large and a lot of structurally similar characters exist in the character set. Thus, classification criteria are difficult to generate. This paper presents a new technique for the recognition of hand-printed Chinese characters using the C4.5 machine learning algorithm. Conventional methods have relied on hand-constructed dictionaries, which are tedious to construct and difficult to make tolerant to variation in writing styles. The paper also discusses Chinese character recognition using dominant point feature extraction and C4.5. The system was tested with 900 characters (each character has 40 samples) and the recognition rate obtained was 84%.

Citations: 8

A neural-based architecture for spot-noisy logo recognition

Proceedings of the Fourth International Conference on Document Analysis and Recognition

Pub Date : 1997-08-18 DOI: 10.1109/ICDAR.1997.619836

F. Cesarini, E. Francesconi, M. Gori, S. Marinai, Jianqing Sheng, G. Soda

Much attention has recently been paid to the recognition of graphical objects, such as company logos and trademarks. Recognizing these objects facilitates the recognition of document classes. Some promising results have been achieved by using autoassociator-based artificial neural networks (AANN) in the presence of homogeneously distributed noise. However, the performance drops significantly when dealing with spot-noisy logos, where strips or blobs produce a partial obstruction of the pictures. We propose a new approach for training AANNs especially conceived for dealing with spot noise. The basic idea is to introduce new metrics for assessing the reproduction error in AANNs. The proposed algorithm, referred to as spot-backpropagation (S-BP), is significantly more robust with respect to spot-noise than classical Euclidean norm-based backpropagation (BP). Our experimental results are based on a database of 88 real logos that are artificially corrupted by spot-noise.

Citations: 49

On-line handwritten signature verification using hidden Markov model features

Proceedings of the Fourth International Conference on Document Analysis and Recognition

Pub Date : 1997-08-18 DOI: 10.1109/ICDAR.1997.619851

R. Kashi, Jianying Hu, W. Nelson, William Turin

A method for the automatic verification of on-line handwritten signatures using both global and local features is described. The global and local features capture various aspects of signature shape and the dynamics of signature production. The authors demonstrate that adding to the global features a local feature based on the signature likelihood obtained from hidden Markov models (HMM) significantly improves signature verification performance. The current version of the program has a 2.5% equal error rate. At the 1% false rejection (FR) point, the addition of the local information to the algorithm with only global features reduced the false acceptance (FA) rate from 13% to 5%.

Citations: 7
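
The signature verification abstract above reports a false acceptance (FA) rate, a false rejection (FR) rate and an equal error rate. A minimal sketch of how these quantities are typically computed from verification scores follows. The function names and the toy scores are hypothetical, and the convention that a higher score means "more likely genuine" is an assumption.

```python
def fa_fr_rates(genuine_scores, forgery_scores, threshold):
    """Accept a signature when score >= threshold. Returns (FA rate, FR rate)."""
    fr = sum(1 for s in genuine_scores if s < threshold) / len(genuine_scores)
    fa = sum(1 for s in forgery_scores if s >= threshold) / len(forgery_scores)
    return fa, fr

def approximate_eer(genuine_scores, forgery_scores):
    """Scan observed scores as thresholds and return (threshold, rate) where FA and FR are closest."""
    best = None
    for t in sorted(set(genuine_scores) | set(forgery_scores)):
        fa, fr = fa_fr_rates(genuine_scores, forgery_scores, t)
        gap = abs(fa - fr)
        if best is None or gap < best[0]:
            best = (gap, t, (fa + fr) / 2)
    return best[1], best[2]

# Toy scores, made up for illustration.
genuine = [0.9, 0.8, 0.7, 0.4]
forgery = [0.6, 0.3, 0.2, 0.1]
print(fa_fr_rates(genuine, forgery, threshold=0.6))
print(approximate_eer(genuine, forgery))
```

Raising the threshold lowers FA at the cost of FR, so an operating point such as "1% FR" is found by choosing the threshold first and then reading off the FA rate.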

Off-line handwritten Chinese character recognition based on crossing line feature

Proceedings of the Fourth International Conference on Document Analysis and Recognition

Pub Date : 1997-08-18 DOI: 10.1109/ICDAR.1997.619842

Youbin Chen, Xiaoqing Ding, Youshou Wu

A new method to extract crossing line features for off-line handwritten Chinese character recognition is proposed in this paper. Firstly, the input pattern is nonlinearly normalized in order to compensate for shape variations. Secondly, the normalized pattern is separated into four subpatterns according to the four kinds of elementary strokes. Thirdly, each of the four subpatterns is uniformly divided into M × M cells. In every cell, the crossing lines are counted. Then a 4M²-dimensional feature vector is generated. An off-line handwritten Chinese character recognition system is built based on this feature. Our experiments have demonstrated the effectiveness of the method proposed in this paper.

Citations: 14
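
The cell-counting step in the crossing line abstract above can be sketched as follows, starting from four binary subpatterns that are already normalized and separated by stroke type. The abstract does not define how a crossing is counted, so this sketch assumes a crossing is a background-to-stroke transition along a horizontal scan line inside a cell; the function names are hypothetical.

```python
def crossing_counts(subpattern, m):
    """Count background-to-stroke transitions per cell for one binary subpattern.

    subpattern: square list of lists of 0/1 whose side length is divisible by m.
    Returns m*m counts in row-major cell order.
    """
    size = len(subpattern)
    cell = size // m
    counts = [0] * (m * m)
    for r in range(size):
        for c in range(size):
            # Left neighbour within the same cell; the cell's left border counts as background.
            prev = subpattern[r][c - 1] if c % cell != 0 else 0
            if subpattern[r][c] == 1 and prev == 0:
                counts[(r // cell) * m + (c // cell)] += 1
    return counts

def crossing_feature_vector(subpatterns, m):
    """Concatenate the per-cell counts of the four subpatterns into a 4*m*m vector."""
    features = []
    for sub in subpatterns:
        features.extend(crossing_counts(sub, m))
    return features

# Toy 4x4 input: one subpattern holds a vertical stroke, the other three are empty.
vertical = [[0, 1, 0, 0] for _ in range(4)]
empty = [[0] * 4 for _ in range(4)]
vector = crossing_feature_vector([vertical, empty, empty, empty], m=2)
print(len(vector), vector[:4])
```

With M = 2 the vector has 4 × 2² = 16 entries, matching the 4M² dimensionality stated in the abstract.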

A system for automatically reading IATA flight coupons

Proceedings of the Fourth International Conference on Document Analysis and Recognition

Pub Date : 1997-08-18 DOI: 10.1109/ICDAR.1997.619832

J. Mao, R. Lorie, K. Mohiuddin

We describe a prototype system for reading IATA flight coupons. The system exploits various characteristics of IATA coupons to reliably determine coupon types and field boundaries, and to minimize the amount of manual keying. In particular, we propose a method for extracting and recognizing fixed-pitch characters on noisy images with a complex background. The method does not require a complete drop-out of background, pre-printed text, or lines before recognition, and allows for recovering partially damaged characters (e.g., overlap with form content, handwritten annotations, etc.).

Citations: 12

A Chinese bank check recognition system based on the fault tolerant technique

Proceedings of the Fourth International Conference on Document Analysis and Recognition

Pub Date : 1997-08-18 DOI: 10.1109/ICDAR.1997.620667

Wang Song, Ma Feng, X. Shaowei

The contradiction between the high recognition accuracy and the low rejection rate in automatic bank check recognition has not been solved successfully. In this paper, a fault-tolerant Chinese bank check recognition system is presented to solve the contradiction between the need for low-error-recognition probability and the need for low-refused-recognition probability. The main idea is to use a dynamic cipher code (which is to be widely applied in China) to lower both of them. This system achieves a high recognition rate and a high reliability simultaneously when automatically processing Chinese bank checks with dynamic cipher codes. A practical scheme of fault-tolerant recognition of bank checks is given in this paper, and experiments show the performance of our fault-tolerant technique.

Citations: 1
