N-gram and N-class models for on line handwriting recognition
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227818
Freddy Perraud, C. Viard-Gaudin, E. Morin, P. Lallican
This paper highlights the interest of a language model in increasing the performance of on-line handwriting recognition systems. Models based on statistical approaches, trained on written corpora, have been investigated. Two kinds of models have been studied: n-gram models and n-class models. In the latter case, the classes result from either a syntactic criterion or a contextual criterion. In order to integrate the model into small-capacity systems (mobile devices), an n-class model has been designed by combining these criteria; it outperforms bulkier n-gram-based models. Integration into an on-line handwriting recognition system demonstrates a substantial performance improvement due to the language model.
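A class-based model of this kind can be sketched concretely. Below is a minimal class bigram model in Python, assuming the usual n-class factorization P(w_i | w_{i-1}) ≈ P(w_i | c_i) · P(c_i | c_{i-1}); the toy word-to-class map, the corpus, and the probability floor standing in for real smoothing are all illustrative assumptions, not the paper's design.

```python
from collections import defaultdict

# Toy word-to-class map; the paper derives classes from syntactic and/or
# contextual criteria, which we only mimic here with hand-picked groups.
word_class = {"the": "DET", "a": "DET", "cat": "NOUN", "dog": "NOUN",
              "sleeps": "VERB", "runs": "VERB"}

def train_class_bigram(corpus):
    """Count class-to-class transitions and class-to-word emissions."""
    trans = defaultdict(lambda: defaultdict(int))   # P(c_i | c_{i-1})
    emit = defaultdict(lambda: defaultdict(int))    # P(w_i | c_i)
    for sentence in corpus:
        prev = "<s>"
        for w in sentence:
            c = word_class[w]
            trans[prev][c] += 1
            emit[c][w] += 1
            prev = c
    return trans, emit

def score(sentence, trans, emit, alpha=1e-6):
    """Class-bigram probability of a sentence, with a tiny probability
    floor instead of proper smoothing (the paper's smoothing is not given)."""
    p, prev = 1.0, "<s>"
    for w in sentence:
        c = word_class[w]
        t_total = sum(trans[prev].values()) or 1
        e_total = sum(emit[c].values()) or 1
        p *= max(trans[prev][c] / t_total, alpha) * \
             max(emit[c][w] / e_total, alpha)
        prev = c
    return p

corpus = [["the", "cat", "sleeps"], ["a", "dog", "runs"]]
trans, emit = train_class_bigram(corpus)
print(score(["the", "dog", "sleeps"], trans, emit))
```

Note how compact the model is: the lexicon only needs one class label per word plus a small class-transition table, which is what makes n-class models attractive on memory-constrained mobile devices.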
{"title":"N-gram and N-class models for on line handwriting recognition","authors":"Freddy Perraud, C. Viard-Gaudin, E. Morin, P. Lallican","doi":"10.1109/ICDAR.2003.1227818","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227818","url":null,"abstract":"This paper highlights the interest of a language modelin increasing the performances of on-line handwritingrecognition systems. Models based on statisticalapproaches, trained on written corpora, have beeninvestigated. Two kinds of models have been studied: n-grammodels and n-class models. In the latter case, theclasses result either from a syntactic criteria or acontextual criteria. In order to integrate it into smallcapacity systems (mobile device), an n-class model hasbeen designed by combining these criteria. It outperformsbulkier models based on n-gram. Integration into an on-linehandwriting recognition system demonstrates asubstantial performance improvement due to the languagemodel.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130460461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A vector approach for automatic interpretation of the French cadastral map
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227678
Jean-Marc Viglino, M. Pierrot-Deseilligny
This paper deals with a cadastral map interpretation system. The challenge is to propose a complete reconstruction of the parcel areas and buildings for use with geographic information systems. The approach is based on low-level primitive extraction and classification. As this low level may be quite noisy, an interpretation process classifies medium-level objects and applies the appropriate processing to each extracted shape. A reconstruction step is then used to label the parcel areas and determine the final land partition. We first present the vectorization strategy in our particular context, then discuss the different tools used to reach the higher level.
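As a rough illustration of the final reconstruction step (labeling parcel areas once boundaries are known), a raster shortcut can be sketched; the paper's actual reconstruction works on vector primitives and is more involved than this.

```python
import numpy as np
from scipy import ndimage as ndi

# Illustrative stand-in for the reconstruction step: rasterize the
# (already vectorized) parcel boundaries and label the enclosed regions,
# yielding one label per parcel area.
boundaries = np.zeros((12, 12), dtype=bool)
boundaries[0, :] = boundaries[-1, :] = True     # outer frame
boundaries[:, 0] = boundaries[:, -1] = True
boundaries[:, 6] = True                         # one internal boundary line
parcels, n = ndi.label(~boundaries)             # connected regions = parcels
print(n)                                        # -> 2 parcel areas
```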
{"title":"A vector approach for automatic interpretation of the French cadastral map","authors":"Jean-Marc Viglino, M. Pierrot-Deseilligny","doi":"10.1109/ICDAR.2003.1227678","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227678","url":null,"abstract":"This paper deals with cadastral maps interpretation device. The challenge is to propose a complete reconstruction of the parcel's areas and buildings to use with geographic information systems. It is based on a low level primitives extraction and classification. As this low level may be quite noisy, an interpretation process classifies medium level objects and manages convenient processes to the particular extracted shape. Then, a reconstruction step is used to label the parcels areas and determine the final land partition. We present at first the vectorization strategy in our particular context then we will discuss the different tools used to reach the higher level.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121432170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Combining model-based and discriminative classifiers: application to handwritten character recognition
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227623
L. Prevost, C. Michel-Sendis, A. Moises, L. Oudot, M. Milgram
Handwriting recognition is such a complex classification problem that it is now quite usual to combine several classification methods at the pre-processing stage or at the classification stage. In this paper, we present an original two-stage recognizer. The first stage is a model-based classifier that stores an exhaustive set of character models. The second stage is a discriminative classifier that separates the most ambiguous pairs of classes. This hybrid architecture is based on the idea that the correct class almost systematically belongs to the two most relevant classes found by the first classifier. Experiments on the Unipen database show a 30% improvement on a 62-class recognition problem.
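A minimal sketch of such a two-stage architecture follows, assuming a nearest-prototype classifier for the model-based stage and least-squares linear discriminants for the pairwise stage; the paper's actual classifiers are not specified here.

```python
import numpy as np

class TwoStageRecognizer:
    """Stage 1 ranks classes with a model-based (nearest-prototype)
    classifier; stage 2 lets a pairwise discriminative classifier decide
    between the two most relevant classes."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        # Stage 1: one prototype (class mean) per class -- "model-based".
        self.protos_ = {c: X[y == c].mean(axis=0) for c in self.classes_}
        # Stage 2: a linear discriminant per class pair (least squares),
        # standing in for the paper's discriminative pairwise classifiers.
        self.pairwise_ = {}
        for i, a in enumerate(self.classes_):
            for b in self.classes_[i + 1:]:
                mask = (y == a) | (y == b)
                Xp = np.hstack([X[mask], np.ones((mask.sum(), 1))])
                t = np.where(y[mask] == a, 1.0, -1.0)
                w, *_ = np.linalg.lstsq(Xp, t, rcond=None)
                self.pairwise_[(a, b)] = w

    def predict_one(self, x):
        # Stage 1: keep the two closest prototypes.
        ranked = sorted(self.classes_,
                        key=lambda c: np.linalg.norm(x - self.protos_[c]))
        a, b = sorted(ranked[:2])
        # Stage 2: the pairwise discriminant settles the ambiguity.
        w = self.pairwise_[(a, b)]
        return a if np.append(x, 1.0) @ w > 0 else b

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.5, (20, 2)) for m in ([0, 0], [2, 0], [0, 2])])
y = np.repeat([0, 1, 2], 20)
clf = TwoStageRecognizer(); clf.fit(X, y)
print(clf.predict_one(np.array([1.9, 0.1])))   # expected: class 1
```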
{"title":"Combining model-based and discriminative classifiers: application to handwritten character recognition","authors":"L. Prevost, C. Michel-Sendis, A. Moises, L. Oudot, M. Milgram","doi":"10.1109/ICDAR.2003.1227623","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227623","url":null,"abstract":"Handwriting recognition is such a complex classification problem that it is quite usual now to make co-operate several classification methods at the pre-processing stage or at the classification stage. In this paper, we present an original two stages recognizer. The first stage is a model-based classifier that stores an exhaustive set of character models. The second stage is a discriminative classifier that separates the most ambiguous pairs of classes. This hybrid architecture is based on the idea that the correct class almost systematically belongs to the two more relevant classes found by the first classifier. Experiments on the Unipen database show a 30% improvement on a 62 class recognition problem.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123085334","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fast lexicon-based word recognition in noisy index card images
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227708
S. Lucas, Gregory Patoulas, A. Downton
This paper describes a complete system for reading type-written lexicon words in noisy images - in this case, museum index cards. The system is conceptually simple and straightforward to implement. It involves three stages of processing. The first stage extracts row-regions from the image, where each row is a hypothesized line of text. The next stage scans an OCR classifier over each row image, creating a character hypothesis graph in the process. This graph is then searched using a priority-queue-based algorithm for the best matches with a set of words (the lexicon). Performance evaluation on a set of museum archive cards indicates competitive accuracy and reasonable throughput. The priority-queue algorithm is over two hundred times faster than flat dynamic programming on these graphs.
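The priority-queue search can be sketched on a toy character hypothesis graph. The graph, the costs, and the prefix-pruning detail below are illustrative assumptions rather than the paper's exact algorithm.

```python
import heapq

# A toy character-hypothesis graph: edges[node] holds
# (next_node, character, cost) triples, as an OCR stage might emit.
edges = {
    0: [(1, "c", 0.2), (1, "e", 0.9)],
    1: [(2, "a", 0.1), (2, "o", 0.8)],
    2: [(3, "t", 0.3), (3, "r", 0.7)],
}
END = 3
lexicon = {"cat", "car", "cot", "ear"}

def best_word(edges, end, lexicon):
    """Best-first (priority-queue) search for the cheapest path whose
    spelled-out string is a lexicon word. With non-negative edge costs,
    the first lexicon word popped at the end node is the global best,
    so most of the graph is never expanded -- the intuition behind the
    large speedup over flat dynamic programming."""
    prefixes = {w[:i] for w in lexicon for i in range(len(w) + 1)}
    heap = [(0.0, 0, "")]                  # (cost so far, node, spelled string)
    while heap:
        cost, node, s = heapq.heappop(heap)
        if node == end and s in lexicon:
            return s, cost
        for nxt, ch, c in edges.get(node, []):
            if s + ch in prefixes:         # prune paths off the lexicon
                heapq.heappush(heap, (cost + c, nxt, s + ch))
    return None

print(best_word(edges, END, lexicon))      # -> ('cat', 0.6...)
```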
{"title":"Fast lexicon-based word recognition in noisy index card images","authors":"S. Lucas, Gregory Patoulas, A. Downton","doi":"10.1109/ICDAR.2003.1227708","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227708","url":null,"abstract":"This paper describes a complete system for reading type-written lexicon words in noisy images - in this case museum index cards. The system is conceptually simple, and straightforward to implement. It involves three stages of processing. The first stage extracts row-regions from the image, where each row is a hypothesized line of text. The next stage scans an OCR classifier over each row image, creating a character hypothesis graph in the process. This graph is then searched using a priority-queue based algorithm for the best matches with a set of words (lexicon). Performance evaluation on a set of museum archive cards indicates competitive accuracy and also reasonable throughput. The priority queue algorithm is over two hundred times faster than using flat dynamic programming on these graphs.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"94 17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126056467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detection of matrices and segmentation of matrix elements in scanned images of scientific documents
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227704
T. Kanahori, M. Suzuki
In earlier work (2002), we proposed a method for recognizing matrices containing abbreviation symbols, along with a format for representing the structure of matrices, and reported experimental results. The method consisted of four processes: detection of matrices, segmentation of elements, construction of networks, and analysis of the matrix structure. That paper focused on the construction of networks and the analysis of the matrix structure; however, we concluded that improvements in the other two processes were very important for obtaining a high recognition accuracy. In this paper, we describe the two improved processes, the detection of matrices and the segmentation of elements, and report the experimental results.
{"title":"Detection of matrices and segmentation of matrix elements in scanned images of scientific documents","authors":"T. Kanahori, M. Suzuki","doi":"10.1109/ICDAR.2003.1227704","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227704","url":null,"abstract":"We proposed a method for recognizing matrices which contain abbreviation symbols, and a format for representing the structure of matrices, and reported experimental results in our paper (2002). The method consisted of 4 processes: detection of matrices, segmentation of elements, construction of networks and analysis of the matrix structure. In the paper, our work is described with a focus on the construction of networks and the analysis of the matrix structure. However, we concluded that improvements in the other two processes were very important for obtaining a high accuracy rate for recognition. In this paper, we describe the two improved processes, the detection of matrices and the segmentation of elements, and we report the experimental results.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121118519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Postal envelope segmentation by 2-D histogram clustering through watershed transform
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227685
Eduardo Akira Yonekura, J. Facon
In this paper we present a new postal envelope segmentation method based on 2-D histogram clustering and the watershed transform. The segmentation task consists of detecting the modes associated with homogeneous regions in envelope images, such as the handwritten address block, postmarks, stamps, and background. The homogeneous modes in the 2-D histogram are segmented through the morphological watershed transform. Our approach is applied to complex Brazilian postal envelopes and requires very little a priori knowledge of the envelope images. The advantages of this approach are described and illustrated with tests carried out on 300 different images in which there is no fixed position for the handwritten address block, postmarks, and stamps.
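A hedged sketch of this pipeline follows, pairing each pixel's gray level with its local mean as the two histogram axes and picking markers at histogram maxima; the paper's actual 2-D feature pair and marker-selection rule may differ.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

def segment_by_2d_histogram(gray, bins=64):
    """Assign each pixel to a mode of a 2-D (gray, local mean) histogram
    split by the watershed transform."""
    local_mean = ndi.uniform_filter(gray, size=5)
    # 2-D histogram over (gray, local mean), lightly smoothed.
    hist, xe, ye = np.histogram2d(gray.ravel(), local_mean.ravel(),
                                  bins=bins, range=[[0, 256], [0, 256]])
    hist = ndi.gaussian_filter(hist, sigma=1.5)
    # Markers at local maxima of the histogram (the "modes").
    peaks = (hist == ndi.maximum_filter(hist, size=9)) & (hist > hist.mean())
    markers, _ = ndi.label(peaks)
    # Watershed on the inverted histogram assigns each bin to a mode...
    bin_labels = watershed(-hist, markers)
    # ...and each pixel inherits the label of its histogram bin.
    ix = np.clip(np.digitize(gray.ravel(), xe) - 1, 0, bins - 1)
    iy = np.clip(np.digitize(local_mean.ravel(), ye) - 1, 0, bins - 1)
    return bin_labels[ix, iy].reshape(gray.shape)

rng = np.random.default_rng(0)
img = np.where(rng.random((80, 80)) < 0.1, 30.0, 220.0)  # dark marks on paper
print(np.unique(segment_by_2d_histogram(img)))
```

Clustering in histogram space rather than image space is what keeps the a priori knowledge requirement low: no assumption is made about where the address block, postmarks, or stamps sit on the envelope.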
{"title":"Postal envelope segmentation by 2-D histogram clustering through watershed transform","authors":"Eduardo Akira Yonekura, J. Facon","doi":"10.1109/ICDAR.2003.1227685","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227685","url":null,"abstract":"In this paper we present a new postal envelope segmentation method based on 2-D histogram clustering and watershed transform. Segmentation task consists in detecting the modes associated with homogeneous regions in envelope images such as handwritten address block, postmarks, stamps and background. The homogeneous modes in 2-D histogram are segmented through the morphological watershed transform. Our approach is applied to complex Brazilian postal envelopes. Very little a priori knowledge of the envelope images is required. The advantages of this approach will be described and illustrated with tests carried out on 300 different images where there are no fixed position for the handwritten address block, postmarks and stamps.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131807875","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
String extraction from color airline coupon image using statistical approach
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227675
Yi Li, Zhiyan Wang, Haizan Zeng
A novel technique is presented in this paper to extract strings from color images of both business settlement plan (BSP) and non-BSP airline coupons. The essential concept is to remove non-text pixels from complex coupon images, rather than extract strings directly. First, we transfer the color images from RGB to HSV space, which is approximately uniform, and then remove the black component of the images using the properties of HSV space. A statistical approach, principal components analysis (PCA), is applied to extract strings by removing the decorative background pattern based on a priori knowledge of the environment. Finally, a method to validate and improve performance is presented.
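The pipeline can be sketched as below; the value threshold for the black component, the use of the first principal component as the background-pattern direction, and the final mask logic are all illustrative assumptions rather than the paper's values.

```python
import numpy as np
from skimage.color import rgb2hsv

def extract_text_mask(rgb, black_v=0.25, pc_thresh=0.0):
    """Keep dark (ink) pixels, then use PCA on the remaining pixel colors
    to peel off the dominant decorative background pattern."""
    hsv = rgb2hsv(rgb)                        # RGB -> HSV
    black = hsv[..., 2] < black_v             # dark pixels: candidate text ink
    rest = rgb[~black].reshape(-1, 3).astype(float)
    # PCA on the non-black pixels: the decorative background pattern is
    # assumed (here) to dominate the first principal component.
    mu = rest.mean(axis=0)
    _, _, vt = np.linalg.svd(rest - mu, full_matrices=False)
    proj = (rgb.reshape(-1, 3) - mu) @ vt[0]
    background = (proj.reshape(rgb.shape[:2]) > pc_thresh) & ~black
    return black | ~background                # keep ink, drop the pattern

rgb = np.random.default_rng(1).random((60, 60, 3))
print(extract_text_mask(rgb).shape)           # boolean mask, (60, 60)
```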
{"title":"String extraction from color airline coupon image using statistical approach","authors":"Yi Li, Zhiyan Wang, Haizan Zeng","doi":"10.1109/ICDAR.2003.1227675","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227675","url":null,"abstract":"A novel technique is presented in this paper to extract strings in color images of both business settlement plan (BSP) and non-BSP airline coupon. The essential concept is to remove non-text pixels from complex coupon images, rather than extract strings directly. First we transfer color images from RGB to HSV space, which is approximate uniformed, and then remove the black component of images using the property of HSV space. A statistical approach called principal components analysis (PCA) is applied to extract strings by removing the background decorative pattern based on priori environment. Finally, a method to validate and improve performance is present.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114296911","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Using tree-grammars for training set expansion in page classification
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227778
Stefano Baldi, S. Marinai, G. Soda
In this paper we describe a method for the expansion of training sets made of XY trees representing page layout. This approach is appropriate when dealing with page classification based on MXY tree page representations. The basic idea is the use of tree grammars to model the variations in the tree which are caused by segmentation algorithms. A set of general grammatical rules is defined and used to expand the training set. Pages are classified with a k-NN approach where the distance between pages is computed by means of tree-edit distance.
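One such grammatical rule can be sketched directly: a segmentation algorithm may oversplit a block into two siblings, so the rule generates that variant for every block node. The Node structure and the single rule below are illustrative assumptions; the paper defines a fuller rule set and classifies the expanded set with tree-edit distance.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                      # e.g. "Hcut", "Vcut", "block"
    children: list = field(default_factory=list)

def split_block(tree):
    """Grammar-rule sketch: for each "block" node, emit a tree variant in
    which a segmenter has oversplit that block into two sibling blocks."""
    variants = []
    for i, ch in enumerate(tree.children):
        if ch.label == "block":
            kids = (tree.children[:i] + [Node("block"), Node("block")]
                    + tree.children[i + 1:])
            variants.append(Node(tree.label, kids))
        for v in split_block(ch):   # apply the rule deeper in the tree too
            kids = tree.children[:i] + [v] + tree.children[i + 1:]
            variants.append(Node(tree.label, kids))
    return variants

page = Node("Hcut", [Node("block"),
                     Node("Vcut", [Node("block"), Node("block")])])
expanded = [page] + split_block(page)
print(len(expanded))                # original page plus 3 oversplit variants
```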
{"title":"Using tree-grammars for training set expansion in page classi .cation","authors":"Stefano Baldi, S. Marinai, G. Soda","doi":"10.1109/ICDAR.2003.1227778","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227778","url":null,"abstract":"In this paper we describe a method for the expansionof training sets made by XY trees representing page layout.This approach is appropriate when dealing with page classificationbased on MXY tree page representations. The basicidea is the use of tree grammars to model the variationsin the tree which are caused by segmentation algorithms.A set of general grammatical rules are defined and used toexpand the training set. Pages are classified with a k - nnapproach where the distance between pages is computed bymeans of tree-edit distance.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114690657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A multiclass classification method based on multiple pairwise classifiers
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227774
Tomoyuki Hamamura, H. Mizutani, Bunpei Irie
In this paper, a new method of composing a multi-class classifier using pairwise classifiers is proposed. A "Resemblance Model" is exploited to calculate the a posteriori probability for combining pairwise classifiers. We prove the validity of this model using an approximation of the a posteriori probability formula. Using this theory, we can obtain the optimal decision. An experimental result on handwritten numeral recognition is presented, supporting the effectiveness of our method.
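The paper's Resemblance Model formula is not reproduced here; as a generic stand-in, the sketch below combines pairwise outputs by a log-probability sum, which conveys the mechanics of turning an n(n-1)/2 pairwise matrix into a multiclass decision.

```python
import numpy as np

def combine_pairwise(p):
    """Combine pairwise probabilities p[i, j] = P(class i | sample is i or j)
    into one multiclass decision by summing log-probabilities per class.
    (A generic coupling rule, not the paper's Resemblance Model.)"""
    n = p.shape[0]
    scores = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        scores[i] = np.sum(np.log(np.clip(p[i, others], 1e-12, 1.0)))
    probs = np.exp(scores) / np.exp(scores).sum()
    return np.argmax(scores), probs

# Pairwise matrix for 3 classes (p[i, j] + p[j, i] = 1 off-diagonal).
p = np.array([[0.5, 0.9, 0.8],
              [0.1, 0.5, 0.6],
              [0.2, 0.4, 0.5]])
print(combine_pairwise(p))          # class 0 wins
```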
{"title":"A multiclass classification method based on multiple pairwise classifiers","authors":"Tomoyuki Hamamura, H. Mizutani, Bunpei Irie","doi":"10.1109/ICDAR.2003.1227774","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227774","url":null,"abstract":"In this paper, a new method of composing a multi-classclassifier using pairwise classifiers is proposed. A\"Resemblance Model\" is exploited to calculate aposteriori probability for combining pairwise classifiers.We proved the validity of this model by usingapproximation of a posteriori probability formula. Usingthis theory, we can obtain the optimal decision. Anexperimental result of handwritten numeral recognition ispresented, supporting the effectiveness of our method.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"217 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114852357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Stippling data on backgrounds of pages - toward seamless integration of paper and electronic documents
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227850
K. Kise, Yasuo Miki, Keinosuke Matsumoto
In order to realize seamless integration of paper and electronic documents, it is at least necessary to assure error-free conversion from one to the other. In general, the conversion from paper to electronic documents is the task of document image understanding. Although this research has made remarkable progress, it is still a hard task without limiting the type of documents. This paper presents a completely different approach to this task, on the condition that printed documents have their originals in electronic form. The proposed method employs fine dots to represent the data of electronic documents and places the dots on the white space (backgrounds) of pages. Since the data is encoded with an error-correcting code, it is guaranteed to be correctly recovered from the scanned images of documents. Experimental results show that a page with normal foreground objects (characters and other things) can contain more than 4 KB of data, even when errors of up to 20% of the data are permitted.
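The embed/recover round trip can be sketched with a toy 3x repetition code standing in for the paper's error-correcting code. For brevity, this decoder consults the clean page to locate background cells, whereas the paper recovers the data from the scanned image alone.

```python
import numpy as np

def embed(page, bits, cell=4):
    """Encode bits redundantly and stipple one fine dot per 1-bit into
    cells of pure white background, skipping foreground objects."""
    coded = np.repeat(bits, 3)                     # toy error-correcting code
    out, k = page.copy(), 0
    h, w = page.shape
    for y in range(0, h - cell, cell):
        for x in range(0, w - cell, cell):
            if k >= len(coded):
                return out
            if page[y:y+cell, x:x+cell].min() == 255:  # pure background cell
                if coded[k]:
                    out[y + cell//2, x + cell//2] = 0  # stipple a fine dot
                k += 1
    return out

def recover(page, stippled, nbits, cell=4):
    """Read the dot pattern back and majority-vote each repeated bit."""
    h, w = page.shape
    coded, k = [], 0
    for y in range(0, h - cell, cell):
        for x in range(0, w - cell, cell):
            if k >= 3 * nbits:
                break
            if page[y:y+cell, x:x+cell].min() == 255:
                coded.append(int(stippled[y + cell//2, x + cell//2] == 0))
                k += 1
    coded = np.array(coded).reshape(-1, 3)
    return (coded.sum(axis=1) >= 2).astype(int)    # majority vote per bit

page = np.full((40, 40), 255, dtype=np.uint8)
page[5:10, 5:30] = 0                               # some foreground "text"
bits = np.array([1, 0, 1, 1, 0])
stippled = embed(page, bits)
print(recover(page, stippled, len(bits)))          # -> [1 0 1 1 0]
```

Because the dots occupy only otherwise-empty background cells, foreground text is untouched, and the redundancy of the code is what tolerates the dot-detection errors the paper reports (up to 20% of the data).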
{"title":"Stippling data on backgrounds of pages-toward seamless integration of paper and electronic documents","authors":"K. Kise, Yasuo Miki, Keinosuke Matsumoto","doi":"10.1109/ICDAR.2003.1227850","DOIUrl":"https://doi.org/10.1109/ICDAR.2003.1227850","url":null,"abstract":"In order to realize seamless integration of paper andelectronic documents, it is at least necessary to assure errorfree conversion from one to the other. In general, theconversion from paper to electronic documents is the taskof document image understanding. Although its researchhas made remarkable progress, it is still a hard task withoutlimiting the type of documents. This paper presents acompletely different approach to this task on condition thatprinted documents have their originals in electronic form.The proposed method employs fine dots to represent dataof electronic documents and places the dots on white space(backgrounds) of pages. Since the data is encoded with anerror correcting code, it is guaranteed to be correctly recoveredfrom the scanned images of documents. Experimentalresults show that a page with normal foreground objects(characters and other things) can contain more than 4KB ofdata, even when errors up to 20% of the data are permitted.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116721495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}