In this paper we present a method to improve the performance of individual page segmentation engines based on the combination of the output of several engines. The rules of combination are designed after analyzing the results of each individual method. This analysis is performed using a performance evaluation framework that aims at characterizing each method according to its strengths and weaknesses rather than computing a single performance measure telling which is the "best" segmentation method.
{"title":"Combination of OCR Engines for Page Segmentation Based on Performance Evaluation","authors":"Miquel A. Ferrer, Ernest Valveny","doi":"10.1109/ICDAR.2007.83","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.83","url":null,"abstract":"In this paper we present a method to improve the performance of individual page segmentation engines based on the combination of the output of several engines. The rules of combination are designed after analyzing the results of each individual method. This analysis is performed using a performance evaluation framework that aims at characterizing each method according to its strengths and weaknesses rather than computing a single performance measure telling which is the \"best\" segmentation method.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132733743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper, we present an approach for writer identification using off-line Arabic handwriting. The proposed method explores handwriting texture analysis by 2D discrete wavelet transforms using the lifting scheme. A comparative evaluation of textural features extracted by 9 different wavelet transform functions was carried out. A modular multilayer perceptron classifier was used. Experiments, carried out on a database of 180 text samples, show that writer identification accuracy reaches its best level at an average rate of 95.68%. The text was chosen to guarantee the involvement of the various internal shapes and letter locations within an Arabic subword.
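The lifting scheme mentioned above can be illustrated with a minimal sketch. The abstract does not name the 9 wavelets compared, so the Haar wavelet is assumed here purely for illustration, and the function names and the sub-band energy feature are hypothetical rather than taken from the paper.

```python
def haar_lifting_1d(signal):
    """One level of the Haar transform via lifting: split, predict, update."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    even, odd = signal[0::2], signal[1::2]
    # Predict step: each odd sample is predicted by its even neighbour.
    detail = [o - e for o, e in zip(odd, even)]
    # Update step: correct the even samples so they hold the pair means.
    approx = [e + d / 2 for e, d in zip(even, detail)]
    return approx, detail


def haar_lifting_1d_inverse(approx, detail):
    """Undo the lifting steps in reverse order."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out


def _lift_columns(block):
    """Apply the 1D lifting transform to every column of a 2D block."""
    lo_cols, hi_cols = [], []
    for col in zip(*block):
        a, d = haar_lifting_1d(list(col))
        lo_cols.append(a)
        hi_cols.append(d)
    # Transpose back so the results are lists of rows again.
    return [list(r) for r in zip(*lo_cols)], [list(r) for r in zip(*hi_cols)]


def haar_lifting_2d(image):
    """One 2D level: rows first, then columns. Returns (ll, lh, hl, hh)."""
    row_lo, row_hi = [], []
    for row in image:
        a, d = haar_lifting_1d(row)
        row_lo.append(a)
        row_hi.append(d)
    ll, lh = _lift_columns(row_lo)
    hl, hh = _lift_columns(row_hi)
    return ll, lh, hl, hh


def subband_energy(band):
    """Mean squared coefficient of a sub-band (a simple texture descriptor)."""
    values = [v for row in band for v in row]
    return sum(v * v for v in values) / len(values)
```

Each lifting step is trivially invertible, which is why the inverse simply replays the steps backwards with the signs flipped. A texture feature vector could then be formed from the sub-band energies, although the features actually used in the paper may differ.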
{"title":"Arabic Handwriting Texture Analysis for Writer Identification Using the DWT-Lifting Scheme","authors":"S. Gazzah, N. Amara","doi":"10.1109/ICDAR.2007.62","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.62","url":null,"abstract":"In this paper, we present an approach for writer identification using off-line Arabic handwriting. The proposed method explores the handwriting texture analysis by 2D discrete wavelet transforms using lifting scheme. A comparative evaluation between textural features extracted by 9 different wavelet transform functions was done. A modular multilayer perceptron classifier was used. Experiments have shown that writer identification accuracies reach best performance levels with an average rate of 95.68%. Experiments have been carried out using a database of 180 text samples. The chosen text was made to guarantee the involvement of the various internal shapes and letter locations within an Arabic subword.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122328463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Document image segmentation algorithms primarily aim at separating text and graphics in the presence of complex layouts. However, for many non-Latin scripts, segmentation becomes a challenge due to the characteristics of the script. In this paper, we empirically demonstrate that successful algorithms for Latin scripts may not be very effective for Indic and complex scripts. We explain this based on the differences in the spatial distribution of symbols in the scripts. We argue that the visual information used for segmentation needs to be enhanced with other information like script models for accurate results.
{"title":"On Segmentation of Documents in Complex Scripts","authors":"K. S. S. Kumar, S. Kumar, C. V. Jawahar","doi":"10.1109/ICDAR.2007.194","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.194","url":null,"abstract":"Document image segmentation algorithms primarily aim at separating text and graphics in presence of complex layouts. However, for many non-Latin scripts, segmentation becomes a challenge due to the characteristics of the script. In this paper, we empirically demonstrate that successful algorithms for Latin scripts may not be very effective for Indic and complex scripts. We explain this based on the differences in the spatial distribution of symbols in the scripts. We argue that the visual information used for segmentation needs to be enhanced with other information like script models for accurate results.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120954414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tong-Hua Su, Tian-Wen Zhang, Hu-Jie Huang, Yu Zhou
This paper proposes a skew detection method for real Chinese handwritten documents. After analyzing the characteristics of Chinese characters, it utilizes the horizontal stroke histogram. Its accuracy, its ability to increase the recall rate of text line separation, and its CPU time consumption are investigated using 853 real Chinese handwritten documents. The results show that: 1) the method can identify 98.83% of the skew angles within one degree, an improvement of 8.44% over the Wigner-Ville distribution (WVD) method; 2) when incorporated into text line separation, the recall rate improves by 2.54% over the WVD method; 3) the method consumes only one-twentieth of the CPU time of the WVD method in the same test environment.
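The abstract does not spell out how the horizontal stroke histogram is built, so the sketch below illustrates a different, generic technique instead: projection-profile skew estimation. Foreground points are projected onto rows for each candidate angle, and the angle that gives the most sharply peaked row histogram is kept. The function name, the angle range, and the step size are assumptions made for illustration only.

```python
import math


def estimate_skew(points, max_angle=10.0, step=0.5):
    """Return the candidate angle (degrees) with the most peaked row histogram.

    points: iterable of (x, y) foreground coordinates.
    This is a generic projection-profile search, not the paper's method.
    """
    best_angle, best_score = 0.0, -1.0
    n_steps = int(round(2 * max_angle / step))
    for i in range(n_steps + 1):
        angle = -max_angle + i * step
        a = math.radians(angle)
        cos_a, sin_a = math.cos(a), math.sin(a)
        hist = {}
        for x, y in points:
            # Row index of the point after rotating the page back by `angle`.
            row = round(y * cos_a - x * sin_a)
            hist[row] = hist.get(row, 0) + 1
        # The sum of squared bin counts is largest when strokes line up in rows.
        score = sum(c * c for c in hist.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle
```

The exhaustive search over candidate angles is simple but costs one pass over all points per angle.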
{"title":"Skew Detection for Chinese Handwriting by Horizontal Stroke Histogram","authors":"Tong-Hua Su, Tian-Wen Zhang, Hu-Jie Huang, Yu Zhou","doi":"10.1109/ICDAR.2007.233","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.233","url":null,"abstract":"This paper proposes a skew detection method for real Chinese handwritten documents. After analyzing the characteristics of Chinese characters, it utilizes the horizontal stroke histogram. Its accuracy, ability to increase the recall rate of text line separation, and CPU time consuming are investigated using 853 real Chinese handwritten documents. The results show that: 1) the method can identify 98.83% of the skew angles within one degree, with an improvement of 8.44% than Wigner-Ville distribution (WVD) method; 2) when incorporated into text line separation, the recall rate has an improvement of 2.54% than WVD method; 3) the method only consumes one-twentieth of WVD method on the same test environment.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117177738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We propose a method of information extraction from HTML documents based on modelling the visual information in the document. A page segmentation algorithm is used for detecting the document layout; the extraction process is then based on the analysis of the mutual positions of the detected blocks and their visual features. This approach is more robust than the traditional DOM-based methods, and it opens new possibilities for the extraction task specification.
{"title":"Layout Based Information Extraction from HTML Documents","authors":"Radek Burget","doi":"10.1109/ICDAR.2007.155","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.155","url":null,"abstract":"We propose a method of information extraction from HTML documents based on modelling the visual information in the document. A page segmentation algorithm is used for detecting the document layout and subsequently, the extraction process is based on the analysis of mutual positions of the detected blocks and their visual features. This approach is more robust that the traditional DOM-based methods and it opens new possibilities for the extraction task specification.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115696483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Seki, Masakazu Fujio, T. Nagasaki, Hiroshi Shinjo, K. Marukawa
An information management system based on document structure analysis is presented. Its purpose is the simultaneous management of information in various paper and electronic documents. The system comprises image document analysis, PDF document analysis, and HTML document analysis. Two applications are presented and the developed prototypes are described. One application is document summarization. The other is table understanding, which correlates data with items.
{"title":"Information Management System Using Structure Analysis of Paper/Electronic Documents and Its Applications","authors":"M. Seki, Masakazu Fujio, T. Nagasaki, Hiroshi Shinjo, K. Marukawa","doi":"10.1109/ICDAR.2007.144","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.144","url":null,"abstract":"An information management system using analyzing document structure is presented. The purpose is simultaneous management of information in various paper and electronic documents. The system contains image document analysis, PDF document analysis, and HTML document analysis. The two applications are presented and the developed prototypes are described. One application is document summarization. The other application is table understanding to correlate data to items.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115082142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A. Toselli, Verónica Romero, Luis Rodríguez, E. Vidal
To date, automatic handwriting recognition systems are far from perfect, and they often need post-editing in which a human checks and corrects their results. We propose a new interactive, on-line framework which, rather than full automation, aims at assisting the human in the recognition-transcription process itself; that is, at facilitating and speeding up the transcription of handwritten texts. This framework combines the efficiency of automatic handwriting recognition systems with the accuracy of the human transcriber. The end result is a cost-effective, perfect transcription of the handwritten text images.
{"title":"Computer Assisted Transcription of Handwritten Text Images","authors":"A. Toselli, Verónica Romero, Luis Rodríguez, E. Vidal","doi":"10.1109/ICDAR.2007.86","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.86","url":null,"abstract":"To date, automatic handwriting recognition systems are far from being perfect and often they need a post editing where a human intervention is required to check and correct the results of such systems. We propose to have a new interactive, on-line framework which, rather than full automation, aims at assisting the human in the proper recognition- transcription process; that is, facilitate and speed up their transcription task of handwritten texts. This framework combines the efficiency of automatic handwriting recognition systems with the accuracy of the human transcriptor. The best result is a cost-effective perfect transcription of the handwriting text images.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116024356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In the present article, we describe a novel direction code based feature extraction approach for recognition of online Bangla handwritten basic characters. We have implemented the proposed approach on a database of 7043 online handwritten Bangla (a major script of the Indian subcontinent) character samples, which has been developed by us. This is a 50-class recognition problem and we achieved 93.90% and 83.61% recognition accuracies respectively on its training and test sets.
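As a rough illustration of what a direction code is for online handwriting, the sketch below quantizes the pen movement between consecutive sampled points into one of 8 directions and summarizes a stroke as a normalized histogram of those codes. The number of directions, the function names, and the histogram feature are assumptions; the abstract does not give the paper's exact feature definition.

```python
import math


def direction_codes(points, n_dirs=8):
    """Quantize each pen movement into one of n_dirs direction codes.

    Code 0 points along +x; codes increase with the atan2 angle.
    """
    sector = 2 * math.pi / n_dirs
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx == 0 and dy == 0:
            continue  # repeated sample, no direction
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # Shift by half a sector so each code is centred on its direction.
        codes.append(int((angle + sector / 2) // sector) % n_dirs)
    return codes


def direction_histogram(points, n_dirs=8):
    """Normalized histogram of direction codes for one stroke."""
    codes = direction_codes(points, n_dirs)
    hist = [0.0] * n_dirs
    for c in codes:
        hist[c] += 1.0
    total = len(codes)
    return [h / total for h in hist] if total else hist
```

For a square traced through `(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)`, the four segments receive four distinct codes, two apart from each other.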
{"title":"Direction Code Based Features for Recognition of Online Handwritten Characters of Bangla","authors":"U. Bhattacharya, B. K. Gupta, S. K. Parui","doi":"10.1109/ICDAR.2007.100","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.100","url":null,"abstract":"In the present article, we describe a novel direction code based feature extraction approach for recognition of online Bangla handwritten basic characters. We have implemented the proposed approach on a database of 7043 online handwritten Bangla (a major script of the Indian subcontinent) character samples, which has been developed by us. This is a 50-class recognition problem and we achieved 93.90% and 83.61% recognition accuracies respectively on its training and test sets.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114552642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In this paper we present a system towards the recognition of off-line handwritten characters of Devnagari, the most popular script in India. The features used for recognition are mainly based on directional information obtained from the arc tangent of the gradient. To compute the features, a 2×2 mean filter is first applied 4 times to the gray-level image, and a non-linear size normalization is performed. The normalized image is then segmented into 49×49 blocks and a Roberts filter is applied to obtain the gradient image. Next, the arc tangent of the gradient (the gradient direction) is initially quantized into 32 directions, and the gradient strength is accumulated for each quantized direction. Finally, the blocks and the directions are downsampled using a Gaussian filter to obtain a 392-dimensional feature vector. A modified quadratic classifier is applied to these features for recognition. We used 36,172 handwritten samples to test our system and obtained 94.24% accuracy using a 5-fold cross-validation scheme.
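The gradient step described above can be sketched for a single block as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the image is a plain list of rows of gray levels, and the mean filtering, size normalization, block segmentation, and Gaussian downsampling stages are omitted.

```python
import math


def roberts_direction_histogram(image, n_dirs=32):
    """Accumulate gradient strength per quantized gradient direction.

    Uses the Roberts cross operator on each 2x2 neighbourhood.
    """
    sector = 2 * math.pi / n_dirs
    hist = [0.0] * n_dirs
    for y in range(len(image) - 1):
        for x in range(len(image[0]) - 1):
            # Roberts cross: differences along the two diagonals.
            gx = image[y + 1][x + 1] - image[y][x]
            gy = image[y + 1][x] - image[y][x + 1]
            strength = math.hypot(gx, gy)
            if strength == 0:
                continue  # flat region, no direction
            angle = math.atan2(gy, gx) % (2 * math.pi)
            # Nearest of the n_dirs directions (bins centred on each direction).
            code = int((angle + sector / 2) // sector) % n_dirs
            hist[code] += strength
    return hist
```

In a full pipeline one such histogram would be computed per block and the results concatenated and downsampled. The sketch stops at the per-block histogram.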
{"title":"Off-Line Handwritten Character Recognition of Devnagari Script","authors":"U. Pal, N. Sharma, T. Wakabayashi, F. Kimura","doi":"10.1109/ICDAR.2007.189","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.189","url":null,"abstract":"In this paper we present a system towards the recognition of off-line handwritten characters of Devnagari, the most popular script in India. The features used for recognition purpose are mainly based on directional information obtained from the arc tangent of the gradient. To get the feature, at first, a 2times2 mean filtering is applied 4 times on the gray level image and a non-linear size normalization is done on the image. The normalized image is then segmented to 49times49 blocks and a Roberts filter is applied to obtain gradient image. Next, the arc tangent of the gradient (direction of gradient) is initially quantized into 32 directions and the strength of the gradient is accumulated with each of the quantized direction. Finally, the blocks and the directions are down sampled using Gaussian filter to get 392 dimensional feature vector. A modified quadratic classifier is applied on these features for recognition. We used 36172 handwritten data for testing our system and obtained 94.24% accuracy using 5-fold cross-validation scheme.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123999361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The main contribution of the paper is a suffix tree based data structure for automatic handwritten Chinese address reading. Since many papers have discussed destination address block (DAB) location for Chinese, we do not elaborate on it in this paper. Instead, we pay more attention to improving the address matching performance after DAB location. As in some conventional methods, the extracted text lines are pre-segmented into a series of radicals. We then build a hierarchical structure of sub-strings from the recognized characters of valid radical combinations. Coarse address candidates are selected at the same time. In address matching, we incorporate postcode information to filter redundant addresses. The pre-segmented radicals are compared with the candidate addresses, and a cost function combining recognition and structural cost is evaluated for the final decision. In the system, character segmentation, recognition, string searching and matching are considered synchronously by taking advantage of lexicon knowledge. The suffix tree greatly facilitates the substring generation process and enables the matching process to start from any character to collect potentially fragmentary information. Therefore, our algorithm is more robust to intervening noise and irregular writing styles. Finally, we test 1,000 handwritten Chinese envelopes and achieve a correct rate of 85.30% at an average of 3.0 seconds per mail piece.
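The idea that matching can start from any character can be illustrated with a naive suffix trie over an address lexicon. This is a simplified stand-in, not the paper's data structure: a real suffix tree compresses single-child paths and is built far more efficiently, and the function names and sample addresses below are hypothetical.

```python
def build_suffix_index(addresses):
    """Insert every suffix of every address into a trie.

    Each node records the ids of the addresses that contain the substring
    spelled out by the path from the root to that node.
    """
    root = {"children": {}, "ids": set()}
    for idx, addr in enumerate(addresses):
        for start in range(len(addr)):
            node = root
            for ch in addr[start:]:
                node = node["children"].setdefault(
                    ch, {"children": {}, "ids": set()}
                )
                node["ids"].add(idx)
    return root


def find_candidates(root, fragment):
    """Return ids of all addresses containing `fragment` as a substring."""
    node = root
    for ch in fragment:
        node = node["children"].get(ch)
        if node is None:
            return set()
    return set(node["ids"])
```

A recognized fragment from the middle of a text line, such as a district name, can then retrieve coarse address candidates without the leading characters having been recognized correctly.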
{"title":"A Suffix Tree Based Handwritten Chinese Address Recognition System","authors":"Y. Jiang, X. Ding, Z. Ren","doi":"10.1109/ICDAR.2007.36","DOIUrl":"https://doi.org/10.1109/ICDAR.2007.36","url":null,"abstract":"The main contribution of the paper is that it presents a suffix tree based data structure for automatic handwritten Chinese address reading. Since lots of papers have discussed the destination address block (DAB) location for Chinese, we will not extend it in this paper. Instead, we pay more attention to improve the address matching performance after DAB location. As some conventional methods, the extracted text lines are pre-segmented into a series of radicals. We then build a hierarchical structure of sub-strings from the recognized characters of valid radical combinations. Coarse address candidates are selected at the same time. In address maching, we incorporate postcode information to filter redundant addresses. The pre- segmented radicals are compared with candidate address and a cost function combining recognition and structrual cost is evaluated for final decision. In the system, character segmentation, recognition, string searching and matching are considered synchronously by taking advantage of lexicon knowledge. Suffix tree can greatly facilitate the substring generation process and enable the matching process to start from any character to collect potentially bitty information. Therefore, our algorithms is more robust to the intervening noises and irregular writing styles. Finallly, we test 1,000 handwritten Chinese envelopes and achieve a correct rate of 85.30% in 3.0 seconds per mail averagely.","PeriodicalId":279268,"journal":{"name":"Ninth International Conference on Document Analysis and Recognition (ICDAR 2007)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2007-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122396803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}