
Proceedings of Sixth International Conference on Document Analysis and Recognition: Latest Publications

A scanning n-tuple classifier for online recognition of handwritten digits
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953747
E. Ratzlaff
A scanning n-tuple classifier is applied to the task of recognizing online handwritten isolated digits. Various aspects of preprocessing, feature extraction, training and application of the scanning n-tuple method are examined. These include: distortion transformations of training data, test data perturbations, variations in bitmap generation and scaling, chain code extraction and concatenation, various static and dynamic features, and scanning n-tuple combinations. Results are reported for both the UNIPEN Train-R01/V07 and DevTest-R01/V02 subset 1a isolated digits databases.
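As a rough illustration of the technique named in the abstract, the sketch below scans a chain-code string with a spaced n-tuple window and scores each digit class by summed log-likelihoods. The window parameters, smoothing constant and toy chain codes are assumptions for illustration, not details taken from the paper.

```python
# Minimal scanning n-tuple sketch, assuming each digit sample has already been
# reduced to a chain-code string over directions 0-7. All names and the toy
# data below are illustrative, not from the paper.
import math
from collections import defaultdict

N = 3        # tuple size
GAP = 2      # spacing between sampled positions within the scanning window
ALPHA = 1.0  # Laplace smoothing

def tuples(code):
    """Yield n-tuples sampled at positions i, i+GAP, i+2*GAP, ... of the chain code."""
    span = (N - 1) * GAP
    for i in range(len(code) - span):
        yield tuple(code[i + j * GAP] for j in range(N))

class ScanningNTuple:
    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))  # class -> tuple -> count
        self.totals = defaultdict(int)

    def train(self, code, label):
        for t in tuples(code):
            self.counts[label][t] += 1
            self.totals[label] += 1

    def classify(self, code):
        def score(label):
            table, total = self.counts[label], self.totals[label]
            v = 8 ** N  # size of the tuple alphabet, used for smoothing
            return sum(math.log((table[t] + ALPHA) / (total + ALPHA * v))
                       for t in tuples(code))
        return max(self.counts, key=score)

clf = ScanningNTuple()
clf.train([0, 1, 2, 2, 3, 4, 4, 5, 6, 7, 0, 1], "0")   # toy chain codes
clf.train([6, 6, 6, 5, 6, 6, 7, 6, 6, 6, 5, 6], "1")
print(clf.classify([6, 6, 5, 6, 6, 6, 6, 7, 6, 6, 6, 6]))  # expected: "1"
```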
Citations: 15
Multi-branch and two-pass HMM modeling approaches for off-line cursive handwriting recognition
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953789
Wenwei Wang, A. Brakensiek, A. Kosmala, G. Rigoll
Because of large shape variations in human handwriting, cursive handwriting recognition remains a challenging task. Usually, the recognition performance depends crucially upon the pre-processing steps, e.g. the word baseline detection and segmentation process. Hidden Markov models (HMMs) have the ability to model similarities and variations among samples of a class. In this paper, we present a multi-branch HMM modeling method and an HMM-based two-pass modeling approach. Whereas the multi-branch HMM method makes the resulting system more robust with respect to word baseline detection, the two-pass recognition approach exploits the segmentation ability of the Viterbi algorithm, creates another HMM set, and carries out a second recognition pass. The total performance is enhanced by combining the two recognition passes. Experiments recognizing cursive handwritten words with a 30,000-word lexicon have been carried out. The results demonstrate that our novel approaches achieve better recognition performance and reduce the relative error rate significantly.
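The second pass described above hinges on the Viterbi algorithm's ability to return a best state path, and hence a segmentation, along with its score. The following minimal decoder, with an invented two-state toy model, is a sketch of that machinery only; it is not the authors' word or line models.

```python
# A minimal Viterbi decoder with backtracking, assuming discrete observations.
# The model topology, observations and probabilities below are hypothetical.
import math

def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best log-probability, best state path) for an observation sequence."""
    V = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            best_prev = max(states, key=lambda p: V[t - 1][p] + log_trans[p][s])
            V[t][s] = V[t - 1][best_prev] + log_trans[best_prev][s] + log_emit[s][obs[t]]
            back[t][s] = best_prev
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return V[-1][last], path[::-1]

# Toy two-state model: state changes in the decoded path mark segment boundaries.
lg = math.log
states = ["a", "b"]
log_start = {"a": lg(0.6), "b": lg(0.4)}
log_trans = {"a": {"a": lg(0.7), "b": lg(0.3)}, "b": {"a": lg(0.3), "b": lg(0.7)}}
log_emit = {"a": {0: lg(0.8), 1: lg(0.2)}, "b": {0: lg(0.2), 1: lg(0.8)}}
print(viterbi([0, 0, 1, 1, 1], states, log_start, log_trans, log_emit))
```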
Citations: 15
Character extraction and recognition in natural scene images
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953953
Xuewen Wang, Xiaoqing Ding, Changsong Liu
With the proposal of the concept of a "smart camera", character recognition in natural scene images has become an interesting but difficult task nowadays. In this paper, we propose an algorithm for extracting characters from text regions of natural scene images with complex backgrounds. Our method first clusters the color feature vectors of the text regions into a number of color classes by applying a modified coarse-fine fuzzy c-means algorithm. Then, different slices are constructed according to these color classes. Characters are eventually extracted from the images using the information of segmentation and recognition. Some experiments have shown that this method is a promising starting point for such applications.
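As a sketch of the color-clustering step, the code below implements plain fuzzy c-means on RGB vectors; the coarse-fine modification described in the paper is not reproduced, and the pixel values are invented.

```python
# A compact fuzzy c-means sketch for clustering RGB feature vectors, using the
# standard FCM update rules; the paper's coarse-fine refinement is not modeled.
import numpy as np

def fuzzy_c_means(X, c, m=2.0, iters=50, eps=1e-5, seed=0):
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), c))
    U /= U.sum(axis=1, keepdims=True)          # fuzzy memberships, rows sum to 1
    for _ in range(iters):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        new_U = 1.0 / (d ** (2 / (m - 1)))     # memberships fall off with distance
        new_U /= new_U.sum(axis=1, keepdims=True)
        if np.abs(new_U - U).max() < eps:
            U = new_U
            break
        U = new_U
    return centers, U

# Toy pixels: two reddish and two bluish colors should fall into separate classes.
pixels = np.array([[250, 10, 10], [240, 30, 20], [10, 20, 240], [25, 15, 250]], float)
centers, U = fuzzy_c_means(pixels, c=2)
print(np.round(centers), np.round(U, 2))
```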
Citations: 11
PenCalc: a novel application of on-line mathematical expression recognition technology
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953893
Kam-Fai Chan, D. Yeung
Most of the calculator programs found in existing pen-based mobile computing devices, such as personal digital assistants (PDAs) and other handheld devices, do not take full advantage of the pen technology offered by these devices. Instead, input of expressions is still done through a virtual keypad shown on the screen, and the stylus (i.e., electronic pen) is simply used as a pointing device. In this paper we propose an intelligent handwriting-based calculator program with which the user can enter expressions simply by writing them on the screen using a stylus. In addition, variables can be defined to store intermediate results for subsequent calculations, as in ordinary algebraic calculations. The proposed software is the result of a novel application of on-line mathematical expression recognition technology, which has previously been used mostly in mathematical expression editor programs.
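The calculator side of such a system reduces to evaluating recognized expression strings and keeping named variables. The sketch below shows that back end only, with a hypothetical evaluate() helper acting after recognition has produced a plain text string; it is not PenCalc's implementation.

```python
# A small calculator back end of the kind a handwriting recognizer could feed:
# it evaluates already-recognized arithmetic strings and stores named variables.
import ast
import operator

OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv,
       ast.Pow: operator.pow, ast.USub: operator.neg}

def evaluate(expr, env):
    """Evaluate an arithmetic expression, or 'name = expr' to define a variable."""
    if "=" in expr:
        name, expr = (s.strip() for s in expr.split("=", 1))
        env[name] = evaluate(expr, env)
        return env[name]

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp):
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp):
            return OPS[type(node.op)](walk(node.operand))
        if isinstance(node, ast.Constant):
            return node.value
        if isinstance(node, ast.Name):
            return env[node.id]          # variables defined in earlier inputs
        raise ValueError(f"unsupported expression element: {node!r}")

    return walk(ast.parse(expr, mode="eval"))

env = {}
print(evaluate("r = 3", env))            # 3
print(evaluate("2 * r ** 2 + 1", env))   # 19
```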
Citations: 37
A model guided document image analysis scheme
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953963
Gaurav Harit, S. Chaudhury, Puneet Gupta, Neeti Vohra, S. Joshi
This paper presents a new model-based document image segmentation scheme that uses XML-DTDs (eXtensible Markup Language Document Type Definitions). Given a document image, the algorithm has the ability to select the appropriate model. A new wavelet-based tool has been designed for distinguishing text from non-text regions and characterization of font sizes. Our model-based analysis scheme makes use of this tool for identifying the logical components of a document image.
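A DTD-based document model can be exercised with standard XML tooling. The sketch below uses lxml (assumed available) to check an invented logical structure against an invented DTD; it only illustrates the kind of model-guided validation the abstract refers to.

```python
# Checking a hypothesised logical structure against a document-model DTD.
# The DTD and XML below are invented stand-ins, not the paper's models.
from io import StringIO
from lxml import etree

dtd = etree.DTD(StringIO("""
<!ELEMENT article (title, author+, section+)>
<!ELEMENT title   (#PCDATA)>
<!ELEMENT author  (#PCDATA)>
<!ELEMENT section (#PCDATA)>
"""))

candidate = etree.fromstring(
    "<article><title>A model guided scheme</title>"
    "<author>Anon</author><section>Body text</section></article>"
)

# A logical-labelling hypothesis is accepted only if it satisfies the model.
ok = dtd.validate(candidate)
print(ok or dtd.error_log)   # True, or the list of structural violations
```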
Citations: 16
Why table ground-truthing is hard
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953768
Jianying Hu, R. Kashi, D. Lopresti, G. Wilfong, G. Nagy
The principle that for every document analysis task there exists a mechanism for creating well-defined ground-truth is a widely held tenet. Past experience with standard datasets providing ground-truth for character recognition and page segmentation tasks supports this belief. In the process of attempting to evaluate several table recognition algorithms we have been developing, however, we have uncovered a number of serious hurdles connected with the ground-truthing of tables. This problem may, in fact, be much more difficult than it appears. We present a detailed analysis of why table ground-truthing is so hard, including the notions that there may exist more than one acceptable "truth" and/or incomplete or partial "truths".
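To make the "more than one acceptable truth" point concrete, the toy example below encodes two defensible ground truths for the same small table; the table content and the annotation schema are invented for illustration.

```python
# Two equally defensible ground truths for the same printed table.
table_text = [["Region", "Q1", "Q2"],
              ["North",  "10", "12"],
              ["South",  "8",  "9"]]

# Truth A: first row is a header, first column is a stub of row labels.
truth_a = {"header_rows": [0], "stub_cols": [0],
           "data_cells": [(r, c) for r in (1, 2) for c in (1, 2)]}

# Truth B: an annotator who treats the stub column as ordinary data.
truth_b = {"header_rows": [0], "stub_cols": [],
           "data_cells": [(r, c) for r in (1, 2) for c in (0, 1, 2)]}

# A naive cell-level comparison flags disagreement even though both readings
# are acceptable, which is exactly the evaluation problem raised above.
print(truth_a["data_cells"] == truth_b["data_cells"])  # False
```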
Citations: 98
Constructing Web-based legacy index card archives-architectural design issues and initial data acquisition
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953908
A. Downton, A. Tams, G. Wells, A. C. Holmes, S. Lucas, G. Beccaloni, M. Scoble, G. S. Robinson
Presents a progress report (after 1 year of a 3 year project) on the overall design for a flexible archive conversion system, intended eventually for widespread use as a tool to convert legacy typescript and handwritten archive card indexes into Internet-accessible and searchable databases. The VIADOCS system is being developed and evaluated on a demonstrator archive of 30,000 pyraloid moth cards at the UK Natural History Museum, and has already demonstrated a successful and efficient mechanism for image acquisition using a modified bank cheque scanner. Document image processing and analysis techniques, defined by an XML validating document type definition (DTD), are being used to correct defects in the acquired images and parse card sequences to match the hierarchical taxonomy of pyraloid moth species. Parsed data is processed by offline OCR engines augmented by field-specific subject dictionaries to produce a 'draft' online archive. This archive will then be validated interactively via a Web browser as it is used. It is hoped eventually to provide an efficient and configurable legacy archive document conversion system not only for the Natural History Museum, but also for all museums, libraries and archives where there is a need to interrogate legacy documents via computer.
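One step in such a pipeline, correcting raw OCR fields against a field-specific subject dictionary, can be sketched with nothing more than standard-library fuzzy matching. The genus list, OCR strings and cutoff below are illustrative assumptions, not data from the VIADOCS project.

```python
# Snapping noisy OCR fields to a field-specific dictionary with difflib.
import difflib

GENUS_DICTIONARY = ["Pyralis", "Ostrinia", "Galleria", "Chilo", "Plodia"]

def correct_field(ocr_text, dictionary, cutoff=0.6):
    """Return the closest dictionary entry if it is close enough, else the raw text."""
    match = difflib.get_close_matches(ocr_text, dictionary, n=1, cutoff=cutoff)
    return match[0] if match else ocr_text   # unmatched fields are left for review

for raw in ["Pyra1is", "Ostrlnia", "Gallerla", "Xylophanes"]:
    print(f"{raw:12s} -> {correct_field(raw, GENUS_DICTIONARY)}")
```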
Citations: 9
An investigation on MPEG audio segmentation by evolutionary algorithms
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953926
C. Stefano, A. D. Cioppa, A. Marcelli
The recent research efforts in the field of video parsing and analysis have recognized that the soundtrack represents an important supplementary source of content information. In this framework, one of the most relevant topics is that of detecting homogeneous segments within the audio stream, in that changes in the audio very often coincide with scene changes. We present some preliminary results obtained by using different evolutionary algorithms for detecting music and speech audio segments. The experiments have been carried out on MPEG encoded sequences to avoid the computational cost of the decoding procedures.
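A minimal flavour of evolutionary boundary search: the sketch below evolves a single segment boundary on a synthetic one-dimensional feature track, rewarding internally homogeneous segments. The representation, fitness and data are invented and far simpler than anything operating on MPEG audio.

```python
# Toy evolutionary search for one segment boundary in a 1-D feature track
# (e.g. per-frame energy); the fitness and data are illustrative only.
import random

random.seed(0)
# Synthetic track: 40 "speech-like" frames followed by 40 "music-like" frames.
track = [random.gauss(0.2, 0.05) for _ in range(40)] + \
        [random.gauss(0.8, 0.05) for _ in range(40)]

def fitness(boundary):
    """Reward boundaries that make both segments internally homogeneous."""
    def var(seg):
        mu = sum(seg) / len(seg)
        return sum((x - mu) ** 2 for x in seg) / len(seg)
    left, right = track[:boundary], track[boundary:]
    return -(var(left) * len(left) + var(right) * len(right))

population = [random.randint(1, len(track) - 1) for _ in range(20)]
for generation in range(30):
    population.sort(key=fitness, reverse=True)
    parents = population[:10]
    children = []
    for _ in range(10):
        a, b = random.sample(parents, 2)
        child = (a + b) // 2                       # crossover: midpoint of parents
        child += random.choice([-2, -1, 0, 1, 2])  # mutation: small shift
        children.append(min(max(child, 1), len(track) - 1))
    population = parents + children

print("estimated boundary:", max(population, key=fitness))  # close to frame 40
```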
Citations: 2
Character-like region verification for extracting text in scene images
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953927
Hao Wang, J. Kangas
This paper proposes a method of identifying character-like regions in order to extract and recognize characters in natural color scene images automatically. After connected component extraction based on a multi-group decomposition scheme, alignment analysis is used to check the block candidates, namely the character-like regions, in each binary image layer and in the final composed image. Priority adaptive segmentation (PAS) is implemented to obtain accurate foreground pixels of the character in each block. Heuristic criteria such as statistical features, recognition confidence and alignment properties are then employed to verify the segmented characters. The algorithms are robust across a wide range of character fonts, shooting conditions and color backgrounds. The results of our experiments are promising for real applications.
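The alignment analysis mentioned above can be pictured as a consistency check on component bounding boxes. The sketch below keeps boxes whose vertical centre and height agree with the group medians; the boxes, tolerance and rule are illustrative assumptions rather than the paper's exact criteria.

```python
# A sketch of alignment checking on connected-component candidates: components
# whose vertical position or size deviates strongly from the group are rejected.
# Bounding boxes are (x, y, w, h) with y growing downward; values are made up.
def aligned(boxes, tol=0.35):
    """Keep boxes whose vertical centre and height agree with the group medians."""
    centers = sorted(y + h / 2 for _, y, _, h in boxes)
    heights = sorted(h for _, _, _, h in boxes)
    med_c, med_h = centers[len(centers) // 2], heights[len(heights) // 2]
    return [b for b in boxes
            if abs((b[1] + b[3] / 2) - med_c) < tol * med_h
            and abs(b[3] - med_h) < tol * med_h]

candidates = [(10, 50, 18, 30), (32, 52, 16, 28), (55, 49, 17, 31),
              (80, 10, 40, 90)]       # last box: background clutter
print(aligned(candidates))            # the clutter box is filtered out
```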
Citations: 18
On the influence of vocabulary size and language models in unconstrained handwritten text recognition
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953795
Urs-Viktor Marti, H. Bunke
In this paper we present a system for unconstrained handwritten text recognition. The system consists of three components: preprocessing, feature extraction and recognition. In the preprocessing phase, a page of handwritten text is divided into its lines and the writing is normalized by means of skew and slant correction, positioning and scaling. From a normalized text line image, features are extracted using a sliding window technique. From each position of the window, nine geometrical features are computed. The core of the system, the recognizer, is based on hidden Markov models. For each individual character, a model is provided. The character models are concatenated to words using a vocabulary. Moreover, the word models are concatenated to models that represent full lines of text. Thus the difficult problem of segmenting a line of text into its individual words can be overcome. To enhance the recognition capabilities of the system, a statistical language model is integrated into the hidden Markov model framework. To preselect useful language models and compare them, perplexity is used. Both perplexity as originally proposed and normalized perplexity are considered.
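Perplexity itself is easy to compute once a language model supplies probabilities. The sketch below evaluates a toy bigram model on one sentence and then applies a per-character normalization; the probabilities are invented, and the normalization shown is a common variant that may differ from the paper's exact definition.

```python
# Perplexity of a bigram model on a word sequence, plus a length-normalized
# variant; the probabilities here are invented toy values.
import math

bigram_prob = {("<s>", "the"): 0.4, ("the", "cat"): 0.1, ("cat", "sat"): 0.3,
               ("sat", "</s>"): 0.5}

def perplexity(words, probs):
    """PP = P(w_1..w_N)^(-1/N), computed in log space for numerical stability."""
    logp = sum(math.log(probs[(a, b)]) for a, b in zip(words, words[1:]))
    n = len(words) - 1                      # number of predicted tokens
    return math.exp(-logp / n)

sentence = ["<s>", "the", "cat", "sat", "</s>"]
pp = perplexity(sentence, bigram_prob)
print(round(pp, 2))

# One common normalization divides log PP by the average word length, giving a
# per-character figure that is comparable across vocabularies (this may not be
# the paper's exact definition).
avg_word_len = sum(len(w) for w in ["the", "cat", "sat"]) / 3
print(round(math.exp(math.log(pp) / avg_word_len), 2))
```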
Citations: 48