Pub Date: 2001-09-10, DOI: 10.1109/ICDAR.2001.953939
Hanaa Barakat, D. Blostein
It is quite common in document analysis and symbol recognition to rely on a priori knowledge about the nature of the document in order to locate candidate symbols. It is desirable, but less common, for a segmentation procedure to rely on "a posteriori" feedback from a non-human-guided process to adjust for segmentation errors. For this method to succeed, the feedback must come from a reliable classifier (one that is able to reject negative symbols, including mis-segmented symbols). This paper examines the use of positive and negative training data with a nearest-neighbour classifier for hand-drawn geometric shapes. We explore the issues involved in the development of a reliable classifier using this method, and we discuss the trade-off between reliability and correctness.
Title: Training with positive and negative data samples: effects on a classifier for hand-drawn geometric shapes. Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition.
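The role of negative training data in a nearest-neighbour classifier can be illustrated with a minimal sketch. It is not the authors' implementation: the two-dimensional feature vectors, the shape labels, the `classify` function and the optional `max_distance` cut-off are all assumptions made for illustration. Negative samples are stored with the label `None`, so a query whose nearest neighbour is a negative sample is rejected.

```python
import math
from typing import List, Optional, Sequence, Tuple

# A training sample is (feature vector, label); label None marks a negative sample.
Sample = Tuple[Sequence[float], Optional[str]]


def nearest_neighbour(query: Sequence[float],
                      training: List[Sample]) -> Tuple[Optional[str], float]:
    """Return the label and Euclidean distance of the closest training sample."""
    best_label: Optional[str] = None
    best_dist = math.inf
    for features, label in training:
        d = math.dist(query, features)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label, best_dist


def classify(query: Sequence[float], training: List[Sample],
             max_distance: float = math.inf) -> Optional[str]:
    """Return a shape label, or None when the query is rejected.

    Rejection happens when the nearest sample is negative, or when it lies
    further away than max_distance (a hypothetical extra safeguard).
    """
    label, dist = nearest_neighbour(query, training)
    if dist > max_distance:
        return None
    return label


# Hypothetical feature vectors, for illustration only.
positives: List[Sample] = [([1.0, 0.0], "circle"), ([0.0, 1.0], "square")]
negatives: List[Sample] = [([0.5, 0.5], None)]  # e.g. a mis-segmented stroke

ambiguous = [0.55, 0.5]
print(classify(ambiguous, positives))              # circle (forced decision)
print(classify(ambiguous, positives + negatives))  # None (rejected)
```

With positive samples only, the ambiguous query is forced into the nearest shape class. Adding the negative sample lets the classifier reject it, which is the kind of reliability a segmentation feedback loop depends on.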
Pub Date: 2001-09-10, DOI: 10.1109/ICDAR.2001.953754
M. Seeger, C. Dance
We describe a binarisation method designed specifically for OCR of low-quality camera images: background surface thresholding (BST). This method is robust to lighting variations and produces images with very little noise and consistent stroke width. BST computes a "surface" of background intensities at every point in the image and performs adaptive thresholding based on this result. The surface is estimated by identifying regions of low-resolution text and interpolating neighbouring background intensities into these regions. The final threshold is a combination of this surface and a global offset. According to our evaluation, BST produces considerably fewer OCR errors than Niblack's local average method while also being more runtime efficient.
Title: Binarising camera images for OCR. Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition.
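The idea of thresholding against an interpolated background surface can be sketched on a single scanline. This is a simplified illustration, not the BST algorithm as published: the text mask `is_text` is taken as given (the abstract does not say how text regions are identified), linear interpolation is an assumed choice, and combining surface and offset by subtraction assumes dark text on a lighter background.

```python
from typing import List


def interpolate_background(row: List[int], is_text: List[bool]) -> List[float]:
    """Estimate background intensity at every pixel of one scanline.

    Background pixels keep their own intensity; each run of text pixels is
    filled by linear interpolation between the background pixels on either side.
    """
    n = len(row)
    surface = [float(v) for v in row]
    i = 0
    while i < n:
        if not is_text[i]:
            i += 1
            continue
        start = i
        while i < n and is_text[i]:
            i += 1
        end = i  # index of the first background pixel after the run, or n
        left = surface[start - 1] if start > 0 else None
        right = surface[end] if end < n else None
        if left is None and right is None:
            continue  # the whole row is text: nothing to interpolate from
        if left is None:
            left = right
        if right is None:
            right = left
        span = end - start + 1
        for j in range(start, end):
            t = (j - start + 1) / span
            surface[j] = left + t * (right - left)
    return surface


def binarise_row(row: List[int], is_text: List[bool], offset: float) -> List[int]:
    """Mark a pixel as ink (1) when it is darker than surface minus offset."""
    surface = interpolate_background(row, is_text)
    return [1 if v < s - offset else 0 for v, s in zip(row, surface)]


row = [200, 198, 60, 62, 190, 188]               # hypothetical grey levels
mask = [False, False, True, True, False, False]  # hypothetical text mask
print(binarise_row(row, mask, offset=40))        # [0, 0, 1, 1, 0, 0]
```

Because the threshold follows the local background estimate, a gradual change in lighting across the row shifts the threshold with it, which a single global threshold cannot do.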
Pub Date: 2001-09-10, DOI: 10.1109/ICDAR.2001.953813
E. Lank, Jeb S. Thorley, Sean Chen, D. Blostein
Unified Modeling Language (UML) diagrams are widely used by software engineers to describe the structure of software systems. Early in the software design cycle, software engineers informally sketch initial UML diagrams on paper or whiteboards. The information provided by these UML diagrams needs to be made available to computer assisted software engineering (CASE) tools. In order to smooth this transition from paper to electronic form, we have developed an online recognition system for UML diagrams. The system accepts input from an electronic whiteboard, a data tablet or a mouse. Efforts have been made to separate the domain-independent and domain-specific parts of the recognition system. The kernel of the system is retargetable, providing a general front end for online recognition of any glyph-based diagram notation. The kernel is extended with UML-specific routines for segmentation, recognition of glyphs, and recognition of glyph relationships.
Title: On-line recognition of UML diagrams. Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition.
Pub Date: 2001-09-10, DOI: 10.1109/ICDAR.2001.953832
Jean-Philippe Valois, Myriam Côté, M. Cheriet
In this paper, a model-based scheme for recognizing and beautifying online hand-drawn sketches of electrical diagrams is presented. The system uses a matching mechanism based on structural and topological relations, which allows scale-, translation-, and rotation-invariant recognition. A simple prototype was developed, and preliminary experimental results show that this technique, although simple, is efficient in recognizing such sketches.
Title: Online recognition of sketched electrical diagrams. Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition.
Pub Date: 2001-09-10, DOI: 10.1109/ICDAR.2001.953872
Alessandro Lameiras Koerich, R. Sabourin, C. Suen
Many offline handwritten word recognition systems have been proposed since the early nineties. Most systems reported high recognition rates; however, they overlooked a very important factor in the process: speed. The authors explore the potential for speeding up an offline handwritten word recognition system via concurrency. The goal of the system is to achieve both full accuracy and high speed when taking into account large vocabularies. This was accomplished by integrating the recognition process with multiprocessing and distributed computing concepts. Experimental results showed that the multiprocessing environment is very promising for enhancing the performance of a sequential offline handwritten word recognition system.
Title: A distributed scheme for lexicon-driven handwritten word recognition and its application to large vocabulary problems. Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition.
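One common way to apply concurrency to lexicon-driven recognition is to split the lexicon into partitions, score each partition independently, and merge the per-partition winners. The sketch below illustrates that pattern only. The abstract does not describe the authors' partitioning or scoring, so several parts are stand-ins: `score` uses a string-similarity ratio from Python's `difflib` in place of a real word recogniser, threads stand in for separate processes or machines, and all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher
from typing import List, Tuple


def score(observed: str, word: str) -> float:
    """Stand-in for a recogniser's word score (higher is better)."""
    return SequenceMatcher(None, observed, word).ratio()


def best_in_partition(observed: str, partition: List[str]) -> Tuple[float, str]:
    """Score every word of one lexicon partition and return the best (score, word)."""
    return max((score(observed, w), w) for w in partition)


def partition_lexicon(lexicon: List[str], n_parts: int) -> List[List[str]]:
    """Deal the lexicon into n_parts interleaved partitions of similar size."""
    return [lexicon[i::n_parts] for i in range(n_parts)]


def recognise(observed: str, lexicon: List[str], n_workers: int = 3) -> Tuple[float, str]:
    """Search a non-empty lexicon concurrently and merge the partial results."""
    parts = [p for p in partition_lexicon(lexicon, n_workers) if p]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(lambda p: best_in_partition(observed, p), parts))
    return max(partial)


lexicon = ["recognition", "reception", "cognition", "recreation", "region", "recondition"]
print(recognise("recogniton", lexicon)[1])  # recognition
```

Since every lexicon entry is still scored exactly once, the merged result is the same as that of a sequential search, so accuracy is unchanged and only the elapsed time depends on the number of workers.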
Pub Date: 2001-09-10, DOI: 10.1109/ICDAR.2001.953744
G. Nagy
ECSE 6610 Advanced Character Recognition. Principles and practice of the recognition of isolated or connected typeset, hand-printed, and cursive characters. Review of optical digitization, supervised and unsupervised estimation of classifier parameters, bias and variance, expectation maximization, the curse of dimensionality. Advanced classification techniques including classifier combinations, support vector machines, hidden Markov methods, styles, language context, adaptation, segmentation-free classifiers, indirect symbolic correlation. Prereq: ECSE 2610, Probability, Linear Algebra. Spring term annually.
Title: Advanced character recognition 6610. Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition.
Pub Date: 2001-09-10, DOI: 10.1109/ICDAR.2001.953812
Frédéric Grandidier, R. Sabourin, M. Gilloux, C. Suen
During the development of a hidden Markov model based handwriting recognition system, the testing phase takes a non-negligible amount of computation time. This is especially true for real applications where the lexicon size is large. In order to shorten the development process, we propose an indicator of the system's discrimination power. This indicator is calculated during training, and its final value is obtained at the end of the training phase without further calculation. Its definition consists of a modification of the observation probability of the validation corpus by the trained system. Experiments were carried out, and the results clearly show the correlation between this indicator and recognition rates.
Title: An a priori indicator of the discrimination power of discrete hidden Markov models. Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition.
Pub Date: 2001-09-10, DOI: 10.1109/ICDAR.2001.953965
B. Gatos, N. Papamarkos
The run length smoothing algorithm (RLSA) and projection profiles are among the fundamental algorithms in binary image processing, mainly used for segmentation of monochrome images. In this paper, fast RLSA and projection profiles are applied to binary images represented by a set of non-overlapping rectangular blocks. The representation of binary images using rectangular blocks as primitives has been used with great success for several image processing tasks, such as image compression, fast implementation of the Hough transform, and skeletonization. We show that this representation can also be applied successfully to fast RLSA application and fast evaluation of projection profiles. The experimental results demonstrate that, starting from a block-represented binary image, we can apply RLSA and evaluate projection profiles in significantly less CPU time. The average time gain is recorded at 60% and 88%, respectively.
Title: Applying fast segmentation techniques at a binary image represented by a set of non-overlapping blocks. Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition.
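For readers unfamiliar with the two operations, the sketch below shows a plain pixel-based horizontal RLSA and a row projection profile, followed by a projection profile computed directly from rectangular blocks. It is an illustration under assumptions, not the authors' algorithm: the `(x0, y0, x1, y1)` block encoding with inclusive corners is hypothetical, and this RLSA variant fills only white gaps that have black pixels on both sides.

```python
from typing import List, Tuple

Block = Tuple[int, int, int, int]  # (x0, y0, x1, y1), inclusive corners of a black rectangle


def rlsa_row(row: List[int], max_gap: int) -> List[int]:
    """Horizontal RLSA on one row (1 = black).

    White runs of length <= max_gap lying between two black pixels become black.
    """
    out = list(row)
    last_black = None
    for x, v in enumerate(row):
        if v == 1:
            if last_black is not None and 0 < x - last_black - 1 <= max_gap:
                for k in range(last_black + 1, x):
                    out[k] = 1
            last_black = x
    return out


def row_profile_pixels(image: List[List[int]]) -> List[int]:
    """Projection profile onto the vertical axis: black pixels per row."""
    return [sum(r) for r in image]


def row_profile_blocks(blocks: List[Block], height: int) -> List[int]:
    """Same profile computed from non-overlapping blocks, one addition per block row."""
    profile = [0] * height
    for x0, y0, x1, y1 in blocks:
        width = x1 - x0 + 1
        for y in range(y0, y1 + 1):
            profile[y] += width
    return profile


print(rlsa_row([1, 0, 0, 1, 0, 0, 0, 1, 0], max_gap=2))  # [1, 1, 1, 1, 0, 0, 0, 1, 0]

image = [[1, 1, 0, 0],
         [1, 1, 0, 1],
         [0, 0, 0, 1]]
blocks = [(0, 0, 1, 1), (3, 1, 3, 2)]  # the same image as two black rectangles
print(row_profile_pixels(image))       # [2, 3, 1]
print(row_profile_blocks(blocks, 3))   # [2, 3, 1]
```

The block-based profile touches each block row once instead of each pixel, which suggests why a block representation can reduce CPU time when blocks are large relative to single pixels.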
Pub Date: 2001-09-10, DOI: 10.1109/ICDAR.2001.953894
Bertrand Coüasnon, L. Pasquer
In this paper we present a real-world evaluation of DMOS, a new generic document recognition method. This method uses a new grammatical formalism (EPF) and an associated parser able to introduce context in segmentation. We have implemented this DMOS method to build an automatic generator of structured document recognition systems. We have already produced three recognition systems by changing only the EPF grammar: one for musical scores, one for mathematical formulae and one for recursive table structures. We present here a specific light grammar to automatically recognize badly damaged 19th century military forms. The quality of those forms is far from perfect: table lines are not well printed, and the paper is so thin that there are transparency problems (the forms are two-sided), but the biggest problem comes from small paper sheets hiding part of the structure. The evaluation of this system was carried out on 5268 images, and the results show that the system made no mistakes. Moreover, it can recognize the entire structure in 97.2% of the forms (the other 2.8% are automatically set apart).
Title: A real-world evaluation of a generic document recognition method applied to a military form of the 19th century. Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition.
Pub Date: 2001-09-10, DOI: 10.1109/ICDAR.2001.953967
V. Märgner, M. Pechwitz
A system for the automatic generation of synthetic databases for the development or evaluation of Arabic word or text recognition systems (Arabic OCR) is presented. The proposed system works without any scanning of printed paper. First, Arabic text is typeset using a standard typesetting system. Second, a noise-free bitmap of the document and the corresponding ground truth (GT) are automatically generated. Finally, an image distortion can be superimposed on the character or word image to simulate the expected real-world noise of the intended application. All necessary modules are presented together with some examples. Special problems caused by specific features of Arabic, such as printing from right to left, many diacritical points, variation in the height of characters, and changes in the relative position to the writing line, are addressed. The synthetic data set was used to train and test, on Arabic printed words, a recognition system based on hidden Markov models (HMMs) that was originally developed for German cursive script. Recognition results with different synthetic data sets are presented.
Title: Synthetic data for Arabic OCR system development. Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition.
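The final step of such a pipeline, superimposing a distortion on a noise-free bitmap while keeping its ground truth, can be sketched as follows. The abstract does not specify the distortion model, so salt-and-pepper noise is used here purely as an assumed example. The tiny bitmap, the `add_salt_and_pepper` function and the ground-truth placeholder string are likewise hypothetical.

```python
import random
from typing import Dict, List

Bitmap = List[List[int]]  # 1 = ink, 0 = background


def add_salt_and_pepper(bitmap: Bitmap, flip_probability: float, seed: int = 0) -> Bitmap:
    """Return a copy of the bitmap in which each pixel is flipped with the given probability."""
    rng = random.Random(seed)  # seeded so that a data set can be regenerated exactly
    return [[1 - p if rng.random() < flip_probability else p for p in row]
            for row in bitmap]


def make_sample(clean: Bitmap, ground_truth: str, flip_probability: float, seed: int) -> Dict:
    """Pair a distorted image with the unchanged ground truth of its clean source."""
    return {"image": add_salt_and_pepper(clean, flip_probability, seed),
            "ground_truth": ground_truth}


# Hypothetical noise-free word image and ground-truth label.
clean = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
sample = make_sample(clean, ground_truth="<word label>", flip_probability=0.05, seed=42)
print(sample["ground_truth"])  # <word label>
```

Because the ground truth comes from the typeset source rather than from manual labelling of scans, any number of distorted variants can be produced from one clean bitmap by varying the seed and the noise level.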