N-gram and N-class models for on line handwriting recognition
Freddy Perraud, C. Viard-Gaudin, E. Morin, P. Lallican
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227818
This paper highlights the interest of a language model in increasing the performance of on-line handwriting recognition systems. Models based on statistical approaches, trained on written corpora, have been investigated. Two kinds of models have been studied: n-gram models and n-class models. In the latter case, the classes result from either a syntactic criterion or a contextual criterion. In order to integrate the model into small-capacity systems (mobile devices), an n-class model has been designed by combining these criteria; it outperforms bulkier n-gram models. Integration into an on-line handwriting recognition system demonstrates a substantial performance improvement due to the language model.

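The contrast drawn above between n-gram and n-class models can be made concrete with a toy sketch: a word bigram model next to a class bigram model, both with add-alpha smoothing. The class map below is a hypothetical hand-made syntactic one, not the authors' automatically derived classes; the point is only that the class model needs far fewer transition parameters.

```python
from collections import defaultdict

def train_counts(corpus, mapper=lambda w: w):
    """Unigram/bigram counts over tokens (optionally mapped to classes)."""
    uni, bi = defaultdict(int), defaultdict(int)
    for sent in corpus:
        toks = [mapper(w) for w in ["<s>"] + sent]
        for t in toks:
            uni[t] += 1
        for a, b in zip(toks, toks[1:]):
            bi[(a, b)] += 1
    return uni, bi

def prob(uni, bi, a, b, v, alpha=1.0):
    """Add-alpha smoothed P(b | a) with vocabulary size v."""
    return (bi[(a, b)] + alpha) / (uni[a] + alpha * v)

# Toy corpus and a hand-made class map (for illustration only).
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"], ["a", "cat", "ran"]]
cls = {"<s>": "S", "the": "DET", "a": "DET",
       "cat": "N", "dog": "N", "sat": "V", "ran": "V"}

w_uni, w_bi = train_counts(corpus)                  # word bigram model
c_uni, c_bi = train_counts(corpus, mapper=cls.get)  # class bigram model

def class_prob(a, b):
    """n-class estimate: P(class(b) | class(a)) * P(b | class(b))."""
    p_cc = prob(c_uni, c_bi, cls[a], cls[b], len(set(cls.values())))
    p_wc = w_uni[b] / c_uni[cls[b]]                 # in-class word frequency
    return p_cc * p_wc

p_word = prob(w_uni, w_bi, "the", "cat", len(w_uni))
p_class = class_prob("the", "cat")
```

On this toy corpus the class model stores 3 distinct transitions against 8 for the word bigram model, which is the kind of saving that matters on a small-capacity mobile device.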
A vector approach for automatic interpretation of the French cadastral map
Jean-Marc Viglino, M. Pierrot-Deseilligny
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227678
This paper deals with a cadastral map interpretation device. The challenge is to propose a complete reconstruction of the parcel areas and buildings for use with geographic information systems. The approach is based on low-level primitive extraction and classification. As this low level may be quite noisy, an interpretation process classifies medium-level objects and applies the processes suited to each extracted shape. A reconstruction step then labels the parcel areas and determines the final land partition. We first present the vectorization strategy in our particular context, then discuss the different tools used to reach the higher level.

Combining model-based and discriminative classifiers: application to handwritten character recognition
L. Prevost, C. Michel-Sendis, A. Moises, L. Oudot, M. Milgram
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227623
Handwriting recognition is such a complex classification problem that it is now quite usual to make several classification methods cooperate at the pre-processing stage or at the classification stage. In this paper, we present an original two-stage recognizer. The first stage is a model-based classifier that stores an exhaustive set of character models. The second stage is a discriminative classifier that separates the most ambiguous pairs of classes. This hybrid architecture is based on the idea that the correct class almost systematically belongs to the two most relevant classes found by the first classifier. Experiments on the Unipen database show a 30% improvement on a 62-class recognition problem.

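The two-stage decision described above can be sketched as follows: a first-stage scorer proposes classes, and a pairwise discriminator, when one has been trained for the top two candidates, makes the final call. The `model_scores` and `pairwise` interfaces are assumptions for illustration, not the authors' actual ones.

```python
def two_stage_classify(x, model_scores, pairwise):
    """Classify x: rank classes with the model-based stage, then let a
    pairwise discriminative classifier separate the two best candidates.
    model_scores: callable x -> {class: score}
    pairwise: {(class_a, class_b): callable x -> class_a or class_b}
    """
    scores = model_scores(x)
    top2 = sorted(scores, key=scores.get, reverse=True)[:2]
    key = tuple(sorted(top2))        # canonical order for the pair lookup
    disc = pairwise.get(key)
    return disc(x) if disc else top2[0]
```

With first-stage scores such as `{"O": 0.9, "0": 0.85, "Q": 0.1}` and a discriminator trained for the ambiguous pair `("0", "O")`, the second stage can overturn the first stage's ranking.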
Fast lexicon-based word recognition in noisy index card images
S. Lucas, Gregory Patoulas, A. Downton
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227708
This paper describes a complete system for reading typewritten lexicon words in noisy images, in this case museum index cards. The system is conceptually simple and straightforward to implement, and involves three stages of processing. The first stage extracts row regions from the image, where each row is a hypothesized line of text. The next stage scans an OCR classifier over each row image, creating a character hypothesis graph in the process. This graph is then searched with a priority-queue-based algorithm for the best matches against a set of words (the lexicon). Performance evaluation on a set of museum archive cards indicates competitive accuracy and reasonable throughput: the priority-queue algorithm is over two hundred times faster than flat dynamic programming on these graphs.

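The priority-queue search over the character hypothesis graph can be sketched as a best-first expansion pruned by lexicon prefixes. The graph encoding below (node mapped to a list of (next node, character, cost) edges) is an assumed format for illustration, not the paper's exact data structure.

```python
import heapq

def best_word(graph, start, end, lexicon):
    """Return the lexicon word spelled by the cheapest start-to-end path
    through a character hypothesis graph, together with its total cost."""
    prefixes = {w[:i] for w in lexicon for i in range(len(w) + 1)}
    heap = [(0.0, start, "")]
    seen = set()
    while heap:
        cost, node, prefix = heapq.heappop(heap)  # cheapest state first
        if node == end and prefix in lexicon:
            return prefix, cost                   # first complete word is optimal
        if (node, prefix) in seen:
            continue
        seen.add((node, prefix))
        for nxt, ch, c in graph.get(node, []):
            cand = prefix + ch
            if cand in prefixes:                  # prune dead-end spellings
                heapq.heappush(heap, (cost + c, nxt, cand))
    return None, float("inf")

# Hypothetical 4-node graph with competing character hypotheses per position.
graph = {0: [(1, "c", 0.1), (1, "e", 0.3)],
         1: [(2, "a", 0.2), (2, "o", 0.1)],
         2: [(3, "t", 0.1), (3, "r", 0.5)]}
word, cost = best_word(graph, 0, 3, {"cat", "cot", "ear"})
```

Here the search settles on "cot" (total cost 0.3); because states are popped in cost order, the first complete lexicon word reached is guaranteed to be the cheapest, so most of the graph is never expanded.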
Detection of matrices and segmentation of matrix elements in scanned images of scientific documents
T. Kanahori, M. Suzuki
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227704
In an earlier paper (2002), we proposed a method for recognizing matrices containing abbreviation symbols, together with a format for representing matrix structure, and reported experimental results. The method consists of four processes: detection of matrices, segmentation of elements, construction of networks, and analysis of the matrix structure. That paper focused on the construction of networks and the analysis of the matrix structure; however, we concluded that improvements to the other two processes were very important for obtaining high recognition accuracy. In this paper, we describe the two improved processes, the detection of matrices and the segmentation of elements, and report experimental results.

Postal envelope segmentation by 2-D histogram clustering through watershed transform
Eduardo Akira Yonekura, J. Facon
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227685
In this paper we present a new postal envelope segmentation method based on 2-D histogram clustering and the watershed transform. The segmentation task consists in detecting the modes associated with homogeneous regions in envelope images, such as the handwritten address block, postmarks, stamps, and background. The homogeneous modes in the 2-D histogram are segmented through the morphological watershed transform. Our approach is applied to complex Brazilian postal envelopes and requires very little a priori knowledge of the envelope images. Its advantages are described and illustrated with tests carried out on 300 different images in which there is no fixed position for the handwritten address block, postmarks, or stamps.

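The abstract does not spell out which two features index the 2-D histogram; a common choice pairs each pixel's gray level with its 3x3 neighbourhood mean, so that modes of homogeneous regions stand out near the diagonal. A minimal sketch under that assumption:

```python
def two_d_histogram(img, bins=8, max_val=256):
    """Joint histogram of (pixel gray level, 3x3 neighbourhood mean) for a
    2-D image given as a list of rows; border pixels are skipped."""
    hist = [[0] * bins for _ in range(bins)]
    step = max_val / bins
    h, w = len(img), len(img[0])
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            mean = sum(img[y + dy][x + dx]
                       for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9
            hist[int(img[y][x] // step)][int(mean // step)] += 1
    return hist
```

Clustering such a histogram (in the paper, via the morphological watershed transform) then assigns each (gray level, local mean) cell to one region mode, which in turn labels the pixels.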
Best practices for convolutional neural networks applied to visual document analysis
P. Simard, David Steinkraus, John C. Platt
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227801
Neural networks are a powerful technology for classification of visual inputs arising from documents. However, there is a confusing plethora of different neural network methods used in the literature and in industry. This paper describes a set of concrete best practices that document analysis researchers can use to get good results with neural networks. The most important practice is getting a training set as large as possible: we expand the training set by adding a new form of distorted data. The next most important practice is that convolutional neural networks are better suited to visual document tasks than fully connected networks. We propose that a simple "do-it-yourself" implementation of convolution with a flexible architecture is suitable for many visual document problems. This simple convolutional neural network does not require complex methods such as momentum, weight decay, structure-dependent learning rates, averaging layers, tangent prop, or even fine-tuning of the architecture. The end result is a very simple yet general architecture which can yield state-of-the-art performance for document analysis. We illustrate our claims on the MNIST set of English digit images.

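The first practice above, growing the training set with distorted copies, is easy to illustrate. The paper's own distortions are elastic deformations; the sketch below substitutes a much cruder random-translation jitter just to show the mechanics of the expansion.

```python
import random

def translate(img, dx, dy, fill=0):
    """Shift a 2-D image (list of rows) by (dx, dy), padding with fill."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                out[ny][nx] = img[y][x]
    return out

def augment(dataset, n_copies=4, max_shift=1, seed=0):
    """Expand (image, label) pairs with randomly shifted copies; the label
    is unchanged, since a jittered digit is still the same digit."""
    rng = random.Random(seed)
    out = list(dataset)
    for img, label in dataset:
        for _ in range(n_copies):
            out.append((translate(img,
                                  rng.randint(-max_shift, max_shift),
                                  rng.randint(-max_shift, max_shift)),
                        label))
    return out
```

On MNIST-style data this turns N labeled images into N * (1 + n_copies) training examples without any new labeling effort, which is the cheap route to the "training set as large as possible" that the paper puts first.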
Directional wavelet approach to remove document image interference
Qian Wang, Tao Xia, C. Tan, Lida Li
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227759
In this paper, we propose a directional wavelet approach to remove images of interfering strokes that show through from the back of a historical handwritten document, caused by ink seeping through during long periods of storage. Our previous work required mapping both sides of the document in order to identify the interfering strokes to be eliminated. Perfect mapping, however, is difficult because of document skew, differing resolutions, non-availability of the reverse side, and pages warped during scanning. The new approach does not require double-sided mapping; instead, it uses a directional wavelet transform to distinguish foreground strokes from reverse-side strokes. Experiments have shown that the directional wavelet operation effectively removes the interfering strokes.

Writer identification based on the fractal construction of a reference base
A. Seropian, M. Grimaldi, N. Vincent
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227840
Our aim is to achieve writer identification through a fractal analysis of handwriting style. For each writer, a set of characteristics specific to that writer is extracted, taking advantage of the self-similarity properties present in one's handwriting. To do so, some invariant patterns characterizing the writing are extracted. During the training step, these invariant patterns emerge during a fractal compression process and are then organized into a reference base that can be associated with the writer. This base makes it possible to analyze an unknown piece of writing whose writer has to be identified. A pattern-matching process is performed using each of the reference bases in turn, and the results of this analysis are evaluated through the signal-to-noise ratio. The signal-to-noise ratio over the set of bases thus identifies the writer of the unknown text.

Model length adaptation of an HMM based cursive word recognition system
M. Schambach
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227642
On the basis of a well-accepted, HMM-based cursive script recognition system, an algorithm is proposed which automatically adapts the length of the models representing letter writing variants. An average improvement in recognition performance of about 2.72 percent could be obtained. Two initialization methods for the algorithm have been tested; they show quite different behaviors, and both prove useful in different application areas. To gain deeper insight into the functioning of the algorithm, a method for visualizing letter HMMs is developed. It shows the plausibility of most results, but also the limitations of the proposed method; these, however, are mostly due to restrictions imposed by the training and recognition method of the underlying system.