Improvement of matching and evaluation in handwritten numeral recognition using flexible standard patterns
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227672
Hirokazu Muramatsu, Takashi Kobayashi, Takahiro Sugiyama, K. Abe
The purpose of this study is to develop a flexible matching method for recognizing handwritten numerals based on statistics of shapes and structures learned from training samples. Our previously reported recognition method had problems in the matching of feature points and in the evaluation of matches. To solve them, we propose a new matching method that supplements contour orientations with convex/concave information, and a new evaluation method that takes the structure of strokes into account. With these improvements the recognition rate rose to 96.0% from the earlier 91.9%. We also conducted a recognition experiment on samples from the ETL-1 database and obtained a recognition rate of 95.2%.
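The paper's algorithm is not reproduced here; the sketch below (Python, with an assumed function name and neighborhood offset k) only illustrates the kind of feature the abstract refers to: a local contour orientation at each feature point, supplemented with a convex/concave label taken from the turn direction of the contour. Matching could then require candidate point pairs to agree in both orientation and convexity, which is the sort of supplementary constraint described above.

```python
import numpy as np

def contour_descriptors(points, k=3):
    """For each point of a closed, ordered contour (N x 2 array), return the
    local orientation (radians) and a convex/concave label taken from the sign
    of the turn between the incoming and outgoing contour segments.
    Illustrative sketch; names and the offset k are assumptions."""
    n = len(points)
    orientations = np.empty(n)
    convexity = np.empty(n, dtype=int)   # +1 convex, -1 concave, 0 straight
    for i in range(n):
        prev_pt = points[(i - k) % n]
        next_pt = points[(i + k) % n]
        v_in = points[i] - prev_pt
        v_out = next_pt - points[i]
        dx, dy = next_pt - prev_pt
        orientations[i] = np.arctan2(dy, dx)
        cross = v_in[0] * v_out[1] - v_in[1] * v_out[0]
        convexity[i] = int(np.sign(cross))
    return orientations, convexity
```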
{"title":"Improvement of matching and evaluation in handwritten numeral recognition using flexible standard patterns","authors":"Hirokazu Muramatsu, Takashi Kobayashi, Takahiro Sugiyama, K. Abe","doi":"10.1109/ICDAR.2003.1227672","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227672","url":null,"abstract":"The purpose of this study is to develop a flexible matching method for recognizing handwritten numerals based on the statistics of shapes and structures learned from learning samples. In the recognition method we reported before, there were problems in matching of the feature points and evaluation of matching. To solve them, we propose a new matching method supplementing contour orientations with convex/concave information and a new evaluation method considering the structure of strokes. With these improvements the recognition rate rose to 96.0% from the earlier figure 91.9%. We also made a recognition experiment on samples from the ETL-1 database and obtained the recognition rate 95.2%.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130087455","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Combining multiple classifiers based on third-order dependency
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227621
H. Kang
Without an independence assumption, combining multiple classifiers involves a high-order probability distribution over the classifier decisions and the class label. Storing and estimating this high-order distribution is exponentially complex and unmanageable in theoretical analysis, so we rely on an approximation scheme that exploits dependencies among the variables. In this paper, as an extension of the second-order dependency approach, the probability distribution is optimally approximated by third-order dependencies and the multiple classifiers are combined accordingly. The proposed method is evaluated on the recognition of unconstrained handwritten numerals from Concordia University and the University of California, Irvine. Experimental results support the proposed method as a promising approach.
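The abstract leaves the form of the approximation implicit. As a hedged sketch in generic notation (not the paper's), the idea extends the classical dependency-tree factorization of the joint distribution of the class label and the classifier decisions from one conditioning parent per variable to two:

```latex
% Sketch only: x_0 denotes the class label and x_1,...,x_m the classifier
% decisions; j(i) and k(i) index previously ordered variables (the first
% variable in the ordering has no parents).
% Second-order (tree) approximation: one conditioning parent per variable.
P(x_0, x_1, \ldots, x_m) \;\approx\; \prod_{i=0}^{m} P\bigl(x_i \mid x_{j(i)}\bigr)
% Third-order extension: at most two conditioning parents per variable, chosen
% (for example, by maximizing summed mutual information) so that the
% approximation stays as close as possible to the true joint distribution.
P(x_0, x_1, \ldots, x_m) \;\approx\; \prod_{i=0}^{m} P\bigl(x_i \mid x_{j(i)}, x_{k(i)}\bigr)
```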
{"title":"Combining multiple classifiers based on third-order dependency","authors":"H. Kang","doi":"10.1109/ICDAR.2003.1227621","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227621","url":null,"abstract":"Without an independence assumption, combining multiple classifiers deals with a high order probability distribution composed of classifiers and a class label. Storing and estimating the high order probability distribution is exponentially complex and unmanageable in theoretical analysis, so we rely on an approximation scheme using the dependency. In this paper, as an extension of the second-order dependency approach, the probability distribution is optimally approximated by the third-order dependency and multiple classifiers are combined. The proposed method is evaluated on the recognition of unconstrained handwritten numerals from Concordia University and the University of California, Irvine. Experimental results support the proposed method as a promising approach.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114552338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A low-cost parallel K-means VQ algorithm using cluster computing
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227780
A. Britto, P. L. D. Souza, R. Sabourin, S. Souza, D. Borges
In this paper we propose a parallel approach for the K-means Vector Quantization (VQ) algorithm used in a two-stage Hidden Markov Model (HMM)-based system for recognizing handwritten numeral strings. With this parallel algorithm, based on the master/slave paradigm, we overcome two drawbacks of the sequential version: a) the time taken to create the codebook; and b) the amount of memory necessary to work with large training databases. Distributing the training samples over the slaves' local disks reduces the overhead associated with the communication process. In addition, models predicting computation and communication time have been developed. These models are useful to predict the optimal number of slaves taking into account the number of training samples and codebook size.
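The paper's implementation is not given here. The following is a minimal single-machine sketch (Python processes standing in for cluster slaves) of the master/slave decomposition described above: each slave computes partial assignment sums over its own shard of training vectors, and the master merges them into new centroids, so only small statistics cross the master/slave boundary. All names and parameters are illustrative.

```python
import numpy as np
from multiprocessing import Pool

def local_stats(args):
    """Slave step: assign each vector of one shard to its nearest centroid and
    return per-centroid partial sums and counts (no raw data is sent back)."""
    shard, centroids = args
    d = np.linalg.norm(shard[:, None, :] - centroids[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    k, dim = centroids.shape
    sums = np.zeros((k, dim))
    counts = np.zeros(k)
    for j in range(k):
        members = shard[labels == j]
        counts[j] = len(members)
        if len(members):
            sums[j] = members.sum(axis=0)
    return sums, counts

def parallel_kmeans(shards, k, iters=20, seed=0):
    """Master loop: broadcast centroids, gather partial statistics, update.
    Run from a main guard (if __name__ == "__main__") on platforms that spawn."""
    rng = np.random.default_rng(seed)
    centroids = rng.choice(np.vstack(shards), size=k, replace=False).astype(float)
    with Pool(processes=len(shards)) as pool:
        for _ in range(iters):
            stats = pool.map(local_stats, [(s, centroids) for s in shards])
            sums = sum(s for s, _ in stats)
            counts = sum(c for _, c in stats)
            nonempty = counts > 0
            centroids = centroids.copy()
            centroids[nonempty] = sums[nonempty] / counts[nonempty, None]
    return centroids
```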
{"title":"A low-cost parallel K-means VQ algorithm using cluster computing","authors":"A. Britto, P. L. D. Souza, R. Sabourin, S. Souza, D. Borges","doi":"10.1109/ICDAR.2003.1227780","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227780","url":null,"abstract":"In this paper we propose a parallel approach for the K-meansVector Quantization (VQ) algorithm used in a two-stageHidden Markov Model (HMM)-based system forrecognizing handwritten numeral strings. With thisparallel algorithm, based on the master/slave paradigm,we overcome two drawbacks of the sequential version: a)the time taken to create the codebook; and b) the amountof memory necessary to work with large trainingdatabases. Distributing the training samples over theslaves' local disks reduces the overhead associated withthe communication process. In addition, modelspredicting computation and communication time havebeen developed. These models are useful to predict theoptimal number of slaves taking into account the numberof training samples and codebook size.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115098005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Character recognition by adaptive statistical similarity
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227651
T. Breuel
Handwriting recognition and OCR systems need to cope with a wide variety of writing styles and fonts, many of them possibly not previously encountered during training. This paper describes a notion of Bayesian statistical similarity and demonstrates how it can be applied to rapid adaptation to new styles. The ability to generalize across different problem instances is illustrated in the Gaussian case, and the use of statistical similarity in the Gaussian case is shown to be related to adaptive metric classification methods. The relationship to prior approaches, including multitask learning, variable or adaptive metric classification, and hierarchical Bayesian methods, is discussed. Experimental results on character recognition from the NIST3 database are presented.
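The abstract does not spell out the definition of the similarity. One common way to formalize a Bayesian similarity, given here only as a hedged sketch in generic notation (an assumption, not taken from the paper), is a marginal-likelihood ratio that asks whether two observations are better explained by a single shared latent prototype than by two independent ones:

```latex
% Sketch only: theta is a latent style/prototype with prior p(theta).
S(x, x') \;=\;
  \frac{\int p(x \mid \theta)\, p(x' \mid \theta)\, p(\theta)\, d\theta}
       {\int p(x \mid \theta)\, p(\theta)\, d\theta \,\cdot\, \int p(x' \mid \theta)\, p(\theta)\, d\theta}
% In the Gaussian case, with p(x | theta) = N(x; mu, Sigma) and a Gaussian prior
% on mu, every integral is Gaussian and S(x, x') reduces to a closed form that
% involves a Mahalanobis-type distance between x and x', which is one way the
% link to adaptive metric classification becomes visible.
```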
{"title":"Character recognition by adaptive statistical similarity","authors":"T. Breuel","doi":"10.1109/ICDAR.2003.1227651","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227651","url":null,"abstract":"Handwriting recognition and OCR systems need to cope with a wide variety of writing styles and fonts, many of them possibly not previously encountered during training. This paper describes a notion of Bayesian statistical similarity and demonstrates how it can be applied to rapid adaptation to new styles. The ability to generalize across different problem instances is illustrated in the Gaussian case, and the use of statistical similarity Gaussian case is shown to be related to adaptive metric classification methods. The relationship to prior approaches to multitask learning, as well as variable or adaptive metric classification, and hierarchical Bayesian methods, are discussed. Experimental results on character recognition from the NIST3 database are presented.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114510256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Correcting the document layout: a machine learning approach
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227635
D. Malerba, F. Esposito, O. Altamura, Michelangelo Ceci, Margherita Berardi
In this paper, a machine learning approach to supporting the user during the correction of layout analysis results is proposed. Layout analysis is the process of extracting a hierarchical structure describing the layout of a page. In our approach, layout analysis is performed in two steps: first, a global analysis determines possible areas containing paragraphs, sections, columns, figures and tables; second, a local analysis groups together blocks that possibly fall within the same area. The result of the local analysis therefore depends strongly on the quality of the results of the first step. We investigate the possibility of supporting the user during the correction of the results of the global analysis. This is done by allowing the user to correct the results of the global analysis and then learning rules for layout correction from the sequence of user actions. Experimental results on a set of multi-page documents are reported and discussed.
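The abstract does not describe the data representation used for rule learning. The sketch below, with assumed names and attributes, shows one plausible way a single user correction of the global analysis could be turned into an attribute-value example for a rule learner:

```python
from dataclasses import dataclass

# Hypothetical representation (names and attributes assumed, not the paper's):
# every user correction of the global layout analysis becomes one training
# example, described by simple geometric attributes of the area it acted on
# and labelled with the corrective action.

@dataclass
class Area:
    x1: float
    y1: float
    x2: float
    y2: float

@dataclass
class Correction:
    area: Area
    action: str               # e.g. "split_horizontal", "merge_with_next"

def to_example(c: Correction) -> dict:
    """Attribute-value encoding of one logged correction."""
    width = c.area.x2 - c.area.x1
    height = c.area.y2 - c.area.y1
    return {
        "width": width,
        "height": height,
        "aspect": width / max(height, 1e-6),
        "action": c.action,   # class label to be predicted by the learned rules
    }
```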
{"title":"Correcting the document layout: a machine learning approach","authors":"D. Malerba, F. Esposito, O. Altamura, Michelangelo Ceci, Margherita Berardi","doi":"10.1109/ICDAR.2003.1227635","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227635","url":null,"abstract":"In this paper, a machine learning approach to support the user during the correction of the layout analysis is proposed. Layout analysis is the process of extracting a hierarchical structure describing the layout of a page. In our approach, the layout analysis is performed in two steps: firstly, the global analysis determines possible areas containing paragraphs, sections, columns, figures and tables, and secondly, the local analysis groups together blocks that possibly fall within the same area. The result of the local analysis process strongly depends on the quality of the results of the first step. We investigate the possibility of supporting the user during the correction of the results of the global analysis. This is done by allowing the user to correct the results of the global analysis and then by learning rules for layout correction from the sequence of user actions. Experimental results on a set of multi-page documents are reported and commented.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134234561","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A line drawings degradation model for performance characterization
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227813
Jian Zhai, Wenyin Liu, D. Dori, Qing Li
Line detection algorithms constitute the basis for technical document analysis and recognition. The performance of these algorithms decreases as the quality of the documents degrades. To test the robustness of line detection algorithms under noisy circumstances, we propose a document degradation model, which simulates noise types that drawings may undergo during their production, storage, photocopying, or scanning. Using our model, a series of document images at various noise levels and types can be generated for testing the performance of line detection algorithms. To illustrate that our model is consistent with real-world noise types, we validated the method by applying it to three line recognition algorithms.
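The specific noise types of the model are not listed in the abstract, so the sketch below uses two illustrative stand-ins (background speckle and broken strokes along edges) merely to show the shape of such a degradation step: a binary line drawing goes in, a noise-level parameter controls the damage, and a degraded image comes out.

```python
import numpy as np

def degrade(img, level=0.1, seed=0):
    """Hypothetical degradation step for a binary line drawing (2-D array of
    0/1 pixels). `level` controls noise strength. The two noise types here are
    illustrative stand-ins, not the paper's actual noise model."""
    rng = np.random.default_rng(seed)
    out = img.astype(np.uint8).copy()
    # Background speckle: turn on a small random fraction of white pixels.
    speckle = (rng.random(img.shape) < 0.02 * level) & (img == 0)
    out[speckle] = 1
    # Broken strokes: erase boundary pixels (foreground pixels with at least
    # one background 4-neighbour) with probability proportional to `level`.
    nb_bg = np.zeros(img.shape, dtype=bool)
    nb_bg[1:, :]  |= img[:-1, :] == 0
    nb_bg[:-1, :] |= img[1:, :] == 0
    nb_bg[:, 1:]  |= img[:, :-1] == 0
    nb_bg[:, :-1] |= img[:, 1:] == 0
    erase = (img == 1) & nb_bg & (rng.random(img.shape) < level)
    out[erase] = 0
    return out
```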
{"title":"A line drawings degradation model for performance characterization","authors":"Jian Zhai, Wenyin Liu, D. Dori, Qing Li","doi":"10.1109/ICDAR.2003.1227813","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227813","url":null,"abstract":"Line detection algorithms constitute the basis fortechnical document analysis and recognition. Theperformance of these algorithms decreases as the qualityof the documents degrades. To test the robustness of linedetection algorithms under noisy circumstance, wepropose a document degradation mode, which simulatesnoise types that drawings may undergo during theirproduction, storage, photocopying, or scanning. Using ourmodel, a series of document images at various noise levelsand types can be generated for testing the performance ofline detection algorithms. To illustrate that our model isconsistent with real world noise types, we validated themethod by applying it to three line recognition algorithms.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133864035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recognition of folding process from origami drill books
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227725
H. Shimanuki, Jien Kato, Toyohide Watanabe
This paper describes a framework for recognizing and recreating the folding process of origami based on the illustrations in origami drill books. Illustration images acquired from origami books are heterogeneous and not presented as a clean sequence. Moreover, the information obtained from 2D illustrations is so superficial and incomplete that the folding operations cannot be determined uniquely. To solve these problems, a highly flexible and reliable recognition mechanism is proposed. The paper additionally covers the following points. First, an algorithm for revising the positions of folding operations extracted from illustrations is proposed, making our recognition approach more reliable. Second, methods are outlined that enable feasible folding operations to be generated from only the superficial and incomplete information extracted from the illustrations. Finally, updating procedures are proposed to maintain the consistency of the data (called the internal model) that records the transformation of origami models in 3D virtual space during a folding process. Several examples demonstrating the validity of the proposed algorithms and methods are also given.
{"title":"Recognition of folding process from origami drill books","authors":"H. Shimanuki, Jien Kato, Toyohide Watanabe","doi":"10.1109/ICDAR.2003.1227725","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227725","url":null,"abstract":"This paper describes a framework to recognizing and recreating folding process of origami based on illustrations of origami drill books. Illustration images acquired from origami books are motley and not sequenced. Moreover, the information obtained from 2D illustrations is so superficial and incomplete that the folding operations cannot be determined uniquely. To solve these problems, a highly flexible and reliable recognition mechanism is proposed. The paper additionally includes the content as follows. Firstly, an algorithm for revising the positions of folding operations extracted from illustrations is proposed so as to make our recognition approach more reliable. Secondly, the outline of the methods which enable feasible folding operations to be generated based only on superficial and incomplete information extracted from illustrations is described. Finally, some updating procedures are proposed to maintain consistency of data (called internal model) which record the transformation of origami models in 3D virtual space during a folding process. Several examples that prove the validness of proposed algorithms/methods are also given in this paper.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134108598","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Spatial alphanumerical attributes for graphical treatings
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227775
J. Lecoq, M. Mainguenaud
This article introduces a general way to bring alphanumerical attributes into a graphical system for managing treatings. This addition of semantics is achieved by a spatial operator closure using two spatial concepts: topology and granularity. Two graphical operator improvements, derived from the spatial operators, are proposed.
{"title":"Spatial alphanumerical attributes for graphical treatings","authors":"J. Lecoq, M. Mainguenaud","doi":"10.1109/ICDAR.2003.1227775","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227775","url":null,"abstract":"This article introduces a general way to bring alphanumericalattributes into a graphical system to managetreatings. This semantics supply is done by a spatialoperator closure using two spatial concepts: Topology andGranularity. Two graphical operator improvements areproposed, derived from spatial operators.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134278380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Intelligent zoning design using multi-objective evolutionary algorithms
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227777
P. Radtke, Luiz Oliveira, R. Sabourin, Tony Wong
This paper discusses the use of multi-objective evolutionary algorithms for the engineering of zoning in handwriting recognition. Usually a task carried out by a human expert, zoning design relies on specific domain knowledge and a trial-and-error process to select an adequate design. Our proposed approach to automatically defining the zoning was tested and was able to find zoning strategies that performed better than our former, manually defined strategy.
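The abstract does not detail the encoding or the objectives. As a hedged illustration, the sketch below shows two plausible ingredients of such an approach: a zoning candidate encoded as grid cut positions with a pixel-density feature per zone, and the Pareto-dominance test a multi-objective evolutionary algorithm would use to compare candidates; none of this is taken from the paper.

```python
import numpy as np

def zoning_features(img, h_cuts, v_cuts):
    """Pixel-density feature vector for the grid induced by the cut positions
    (fractions of the image height/width, each in (0, 1)). Illustrative only."""
    rows = [0] + sorted(int(c * img.shape[0]) for c in h_cuts) + [img.shape[0]]
    cols = [0] + sorted(int(c * img.shape[1]) for c in v_cuts) + [img.shape[1]]
    feats = []
    for r0, r1 in zip(rows[:-1], rows[1:]):
        for c0, c1 in zip(cols[:-1], cols[1:]):
            zone = img[r0:r1, c0:c1]
            feats.append(zone.mean() if zone.size else 0.0)
    return np.array(feats)

def dominates(a, b):
    """Pareto dominance for minimization: a dominates b if it is no worse in
    every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
```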
{"title":"Intelligent zoning design using multi-objective evolutionary algorithms","authors":"P. Radtke, Luiz Oliveira, R. Sabourin, Tony Wong","doi":"10.1109/ICDAR.2003.1227777","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227777","url":null,"abstract":"This paper discusses the use of multi objective evolutionaryalgorithms applied to the engineering of zoning forhandwriten recognition. Usually a task fulfilled by an humanexpert, zoning design relies on specific domain knowledgeand a trial and error process to select an adequatedesign. Our proposed approach to automatically define thezone design was tested and was able to define zoning strategiesthat performed better than our former strategy definedmanually.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133264316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A novel feature extraction technique for the recognition of segmented handwritten characters
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227647
F. Kimura, N. Kayahara, Y. Miyake, M. Shridhar
High accuracy character recognition techniques can provide useful information for segmentation-based handwritten word recognition systems. This research describes neural network-based techniques for segmented character recognition that may be applied to the segmentation and recognition components of an off-line handwritten word recognition system. Two neural architectures along with two different feature extraction techniques were investigated. A novel technique for character feature extraction is discussed and compared with others in the literature. Recognition results above 80% are reported using characters automatically segmented from the CEDAR benchmark database as well as standard CEDAR alphanumerics.
{"title":"A novel feature extraction technique for the recognition of segmented handwritten characters","authors":"F. Kimura, N. Kayahara, Y. Miyake, M. Shridhar","doi":"10.1109/ICDAR.2003.1227647","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227647","url":null,"abstract":"High accuracy character recognition techniques can provide useful information for segmentation-based handwritten word recognition systems. This research describes neural network-based techniques for segmented character recognition that may be applied to the segmentation and recognition components of an off-line handwritten word recognition system. Two neural architectures along with two different feature extraction techniques were investigated. A novel technique for character feature extraction is discussed and compared with others in the literature. Recognition results above 80% are reported using characters automatically segmented from the CEDAR benchmark database as well as standard CEDAR alphanumerics.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"64 9","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133136122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}