Correcting document image warping based on regression of curved text lines
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227732
Zheng Zhang, C. Tan
Image warping is a common problem when one scans or photocopies a page from a thick bound volume, resulting in shading and curved text lines near the spine. This not only impairs readability but also reduces OCR accuracy. Following our earlier attempt to correct such images, this paper proposes a simpler technique based on connected component analysis and regression. Compared with our earlier method, the present system is computationally less expensive and is also resolution independent. The implementation of the new system and the resulting improvement in OCR accuracy are presented.
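As a rough illustration of the regression step, the sketch below fits a quadratic to the centroids of a text line's connected components and shifts each pixel column to flatten the baseline. The quadratic model, the vertical-shift correction, and the function names are assumptions for illustration, not details taken from the paper.

```python
import numpy as np

def flatten_line(image: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """image: 2-D binary line region; centroids: (N, 2) array of (x, y)."""
    x, y = centroids[:, 0], centroids[:, 1]
    coeffs = np.polyfit(x, y, deg=2)       # regress the curved baseline (assumed quadratic)
    baseline = np.polyval(coeffs, np.arange(image.shape[1]))
    target = baseline.mean()               # flatten toward the mean baseline height
    out = np.zeros_like(image)
    for col in range(image.shape[1]):
        shift = int(round(target - baseline[col]))
        out[:, col] = np.roll(image[:, col], shift)  # circular shift; a real system would clip
    return out
```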
Mathematical formulas extraction
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227834
Jianming Jin, Xionghu Han, Qingren Wang
As a universal technical language, mathematics is widely used in many fields and describes information more precisely than any other language. Numerous mathematical formulas therefore appear in all kinds of documents. Automatic processing of mathematical formulas is clearly important and necessary, and extracting formulas from document images is its first step. This paper presents formula extraction methods that do not depend on recognition results: isolated formulas are extracted using a Parzen window, and embedded expressions are extracted by detecting their 2-D structures. Experiments show that our methods are very effective for formula extraction.
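A minimal sketch of the Parzen-window idea for isolated formulas: estimate the density of a line-level feature over ordinary text lines and flag lines that fall in low-density regions as formula candidates. The feature (line height), bandwidth, and threshold are illustrative assumptions; the paper's actual features are not reproduced here.

```python
import numpy as np

def parzen_density(samples: np.ndarray, x: np.ndarray, h: float) -> np.ndarray:
    """Gaussian-kernel Parzen estimate of p(x) from 1-D samples."""
    diffs = (x[:, None] - samples[None, :]) / h
    kernel = np.exp(-0.5 * diffs ** 2) / np.sqrt(2 * np.pi)
    return kernel.mean(axis=1) / h

# Toy example: ordinary text lines are ~12-14 px tall; a 27 px line is unusual.
text_line_heights = np.array([12.0, 13.0, 12.0, 14.0, 13.0, 12.0, 13.0])
candidates = np.array([13.0, 27.0])
p = parzen_density(text_line_heights, candidates, h=1.5)
is_formula_candidate = p < 0.01   # low density under the plain-text model
```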
Handwritten character recognition using elastic matching based on a class-dependent deformation model
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227652
S. Uchida, H. Sakoe
For handwritten character recognition, a new elastic image matching (EM) technique based on a class-dependent deformation model is proposed. In the deformation model, any deformation of a class is described by a linear combination of eigen-deformations, which are intrinsic deformation directions of the class. The eigen-deformations can be estimated statistically from the actual deformations of handwritten characters. Experimental results show that the proposed technique can attain higher recognition rates than conventional EM techniques based on class-independent deformation models. The results also show that the proposed technique is more computationally efficient than these conventional EM techniques.
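A sketch of how eigen-deformations might be estimated, assuming per-sample 2-D displacement fields from elastically matching training samples against a class reference are already available (the matching step itself is not shown):

```python
import numpy as np

def eigen_deformations(fields: np.ndarray, k: int):
    """fields: (N, H, W, 2) displacement fields for one class; returns mean and top-k modes."""
    n = fields.shape[0]
    flat = fields.reshape(n, -1)
    mean = flat.mean(axis=0)
    # PCA via SVD of the centered deformation matrix
    _, _, vt = np.linalg.svd(flat - mean, full_matrices=False)
    return mean, vt[:k]

# A class deformation d is then approximated as a linear combination:
#   d ~ mean + sum_i a_i * modes[i]
```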
Recognition of container code characters through gray-level feature extraction and gradient-based classifier optimization
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227804
M. Goccia, M. Bruzzo, C. Scagliola, S. Dellepiane
This paper describes the recognition of container code characters in the Mocont-II project, where container images are taken under widely varying lighting conditions. The recognition system has to deal with gray-level characters showing a wide variability in brightness and contrast, varying inclination, segmentation uncertainties, damaged characters, and the presence of shadows. Different sets of features were extracted directly from the gray-level images, and a minimum distance classifier with a weighted metric was used for recognition. To achieve good recognition performance, the feature weights and the prototype sets were optimized by a new gradient-based learning algorithm that maximizes a fuzzy recognition-rate functional.
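The recognition step reduces to a nearest-prototype decision under a weighted metric; a minimal sketch with a diagonal (per-feature) weighting follows. The gradient-based learning of the weights and prototypes is not reproduced here.

```python
import numpy as np

def classify(x: np.ndarray, prototypes: np.ndarray, labels: list, weights: np.ndarray):
    """prototypes: (P, D) feature vectors; weights: (D,) learned per-feature weights."""
    d2 = ((prototypes - x) ** 2 * weights).sum(axis=1)  # weighted squared distance
    return labels[int(np.argmin(d2))]                   # minimum-distance decision
```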
Evaluating SEE: a benchmarking system for document page segmentation
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227739
S. Agne, A. Dengel, B. Klein
The decomposition of a document into segments such as text regions and graphics is a significant part of the document analysis process. The basic requirement for rating and improving page segmentation algorithms is systematic evaluation. Approaches known from the literature have the disadvantage that manually generated reference data (zoning ground truth) are needed for the evaluation task, and the effort and cost of creating these data are very high. This paper describes the evaluation system SEE and presents an assessment of its quality. The system requires only the OCR-generated text and the original text of the document in correct reading order (text ground truth) as input; no manually generated zoning ground truth is needed. The implicit structure information contained in the text ground truth is used to evaluate the automatic zoning. To this end, an assignment (matching) between the corresponding text regions in the text ground truth and those in the OCR-generated text is sought. The method is based on a fault-tolerant string matching algorithm that can cope with OCR errors in the text. Segmentation errors are determined by evaluating the matching. Subsequently, the edit operations necessary to correct the recognized segmentation errors are computed to estimate the correction costs. Furthermore, SEE provides a version of the OCR-generated text in which the detected page segmentation errors have been corrected.
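At the core of the matching is fault-tolerant string comparison; a minimal sketch using plain Levenshtein distance follows. SEE's actual algorithm and cost model are richer; unit costs here are an assumption.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit insertion/deletion/substitution costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]
```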
Bayesian network modeling of Hangul characters for online handwriting recognition
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227660
Sung-Jung Cho, J. H. Kim
In this paper we propose a Bayesian network framework for explicitly modeling the components of Korean Hangul characters and their relationships. A Hangul character is modeled with hierarchical components: a syllable model, grapheme models, stroke models, and point models. Each model is constructed from subcomponents and their relationships, except the point model, the primitive, which is represented by a 2-D Gaussian over the X-Y coordinates of point instances. Relationships between components are modeled by their positional dependencies. For online handwritten Hangul characters, the proposed system achieves higher recognition rates than an HMM system with chain-code features: 95.7% vs. 92.9% on average.
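A sketch of the primitive point model: the log-likelihood of a point under a 2-D Gaussian. In the paper the Gaussian's parameters depend on related components' positions through the network structure; that conditioning is omitted in this sketch.

```python
import numpy as np

def point_log_likelihood(pt: np.ndarray, mean: np.ndarray, cov: np.ndarray) -> float:
    """Log-density of a 2-D point under a Gaussian with the given mean and covariance."""
    d = pt - mean
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    return float(-0.5 * (d @ inv @ d + logdet + 2.0 * np.log(2.0 * np.pi)))
```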
Segmentation of Bangla unconstrained handwritten text
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227832
U. Pal, S. Datta
To accommodate the variability in the writing styles of different individuals, this paper proposes a robust scheme to segment unconstrained handwritten Bangla text into lines, words, and characters. For line segmentation, we first divide the text into vertical stripes; the stripe width of a document is computed by statistical analysis of the text height in the document. We then compute the horizontal projection histogram of each stripe, and the relationship between the minima of these histograms is used to segment the text lines. Lines are segmented into words based on the vertical projection profile. Segmenting characters from a handwritten word is very tricky, as the characters are seldom vertically separable. We use a concept based on the water reservoir principle for this purpose. We first identify isolated and connected (touching) characters in a word; touching characters are then segmented based on the reservoir base-area points and the structural features of the component.
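A minimal sketch of the stripe-wise line segmentation: compute each stripe's horizontal projection histogram and take its minima as candidate line separators. Linking minima across stripes (the relationship step) and the water-reservoir character segmentation are not shown.

```python
import numpy as np

def stripe_line_cuts(binary: np.ndarray, stripe_w: int):
    """binary: 2-D page image, 1 = ink. Returns candidate separator rows per stripe."""
    cuts = []
    for x0 in range(0, binary.shape[1], stripe_w):
        hist = binary[:, x0:x0 + stripe_w].sum(axis=1)  # horizontal projection
        cuts.append(np.where(hist == hist.min())[0])    # flattest rows: inter-line gaps
    return cuts
```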
A high accuracy rate commercial flight coupon recognition system
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227632
Shanheng Zhao, Zhiyan Wang
In this paper we introduce a practical automatic processing system for scanning and recognizing flight coupons. We discuss coupon classification, character location, and binarization, and emphasize a high-performance character segmentation and recognition engine, which has proved very effective. Results from experiments and from commercial deployment of the system are presented.
Automatic thresholding of gray-level using multistage approach
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227714
S. Wu, A. Amin
A multistage approach is presented for thresholding document images, along with its application. The proposed method is based on two stages. Global thresholding is used in the first stage to give a preliminary result; a second stage then refines the threshold value based on local spatial characteristics of the regions formed in the first stage. It automatically customizes the thresholding of regions that have specific and consistent characteristics but differ from other regions in the image. This technique works well both for simple images, in which background and foreground are distinct and separable, and for complex images containing multiple regions with different shading and textures. A typical application is postal envelope analysis. Evaluation results show significant improvement over several other global and local thresholding techniques.
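A sketch of the two-stage structure, with Otsu's method standing in for the global stage and fixed tiles standing in for the regions formed from the first-stage result; both substitutions are assumptions, not the paper's method.

```python
import numpy as np

def otsu(gray: np.ndarray) -> int:
    """Global Otsu threshold for an 8-bit grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()
    w = np.cumsum(p)                      # class-0 weight
    mu = np.cumsum(p * np.arange(256))    # class-0 cumulative mean
    mu_t = mu[-1]
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * w - mu) ** 2 / (w * (1 - w))  # between-class variance
    return int(np.nanargmax(sigma_b))

def two_stage(gray: np.ndarray, tile: int = 64) -> np.ndarray:
    t_global = otsu(gray)
    out = np.zeros_like(gray, dtype=bool)
    for y in range(0, gray.shape[0], tile):
        for x in range(0, gray.shape[1], tile):
            block = gray[y:y + tile, x:x + tile]
            # refine toward local statistics, anchored at the global value
            t = 0.5 * t_global + 0.5 * block.mean()
            out[y:y + tile, x:x + tile] = block < t
    return out
```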
A new online signature verification algorithm using variable length segmentation and hidden Markov models
Pub Date: 2003-08-03 | DOI: 10.1109/ICDAR.2003.1227706
M. Shafiei, H. Rabiee
In this paper, a new online handwritten signature verification system using hidden Markov models (HMMs) is presented. The proposed system segments each signature at its perceptually important points and then computes, for each segment, a number of features that are scale and displacement invariant. The resulting sequence is used to train an HMM for signature verification. Our database includes 622 genuine signatures and 1010 forgeries collected from 69 human subjects. Our verification system achieved a false acceptance rate (FAR) of 4% and a false rejection rate (FRR) of 12%.
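One plausible way to make segment features scale- and displacement-invariant is to center each segment at its centroid and normalize by its RMS radius, as sketched below; the paper's actual feature set is not reproduced here.

```python
import numpy as np

def invariant_features(points: np.ndarray) -> np.ndarray:
    """points: (T, 2) pen trajectory of one signature segment."""
    centered = points - points.mean(axis=0)               # displacement invariant
    scale = np.sqrt((centered ** 2).sum(axis=1).mean())   # RMS radius of the segment
    return centered / max(scale, 1e-9)                    # scale invariant
```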