Pub Date: 2001-09-10 | DOI: 10.1109/ICDAR.2001.953760
Bayesian network modeling of strokes and their relationships for on-line handwriting recognition
Sung-Jung Cho, J. H. Kim
Proceedings of Sixth International Conference on Document Analysis and Recognition
It is important to model strokes and their relationships for on-line handwriting recognition, because they reflect character structures. We propose to model them explicitly and statistically with Bayesian networks. A character is modeled with stroke models and their relationships. Strokes, parts of handwriting traces that are approximately linear, are modeled with a set of point models and their relationships. Points are modeled with conditional probability tables and distributions for pen status and X, Y positions in the 2-D space, given the information of related points. A Bayesian network is adopted to represent a character model, whose nodes correspond to point models and whose arcs correspond to their dependencies. The proposed system was tested on the recognition of on-line handwritten digits. It showed higher recognition rates than the HMM-based recognizer with chaincode features and was comparable to other published systems.
Pub Date: 2001-09-10 | DOI: 10.1109/ICDAR.2001.953798
A Bezier curve-based approach to shape description for Chinese calligraphy characters
Hsi-Ming Yang, Jainn-Jyh Lu, Hsi-Jian Lee
Proceedings of Sixth International Conference on Document Analysis and Recognition
In this paper, we propose a method to vectorize Chinese characters in calligraphy documents. Our system prevents zigzag artifacts when the characters are enlarged. The system contains two modules: contour segment extraction and contour segment description. In the former, high-curvature points on contours are detected as corner points, which divide the contour into several segments. In the latter, a contour segment is described either by a straight line or by a cubic Bezier curve. According to relations between the contour segment and the Bezier curve, control points are adjusted to fit the contour segment better. When the curve fitness cost is small enough, the shape is described well. The processing time of our curve fitting is about five seconds for an A4 image containing 4488 contour segments. Experimental results demonstrate that our system is efficient and promising.
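The abstract above describes fitting a cubic Bezier curve to a contour segment and checking a fitness cost. The following is a minimal sketch of that idea, not the authors' procedure: it swaps in an ordinary least-squares fit with chord-length parameterization in place of the paper's control-point adjustment, which the abstract does not specify. The function names (`fit_cubic_bezier`, `fit_cost`) and the maximum-distance cost are assumptions made for illustration, and the input points are assumed to be distinct.

```python
import math


def bezier_basis(t):
    """Return the four cubic Bernstein basis values at parameter t."""
    s = 1.0 - t
    return s * s * s, 3.0 * s * s * t, 3.0 * s * t * t, t * t * t


def bezier_point(ctrl, t):
    """Evaluate the cubic Bezier curve with control points ctrl at t in [0, 1]."""
    basis = bezier_basis(t)
    x = sum(w * p[0] for w, p in zip(basis, ctrl))
    y = sum(w * p[1] for w, p in zip(basis, ctrl))
    return x, y


def chord_length_params(points):
    """Assign each point a parameter in [0, 1] proportional to path length."""
    dists = [0.0]
    for a, c in zip(points, points[1:]):
        dists.append(dists[-1] + math.hypot(c[0] - a[0], c[1] - a[1]))
    total = dists[-1]
    return [d / total for d in dists]


def fit_cubic_bezier(points):
    """Least-squares fit; the end control points are pinned to the segment ends.

    Assumption: this replaces the paper's (unspecified) adjustment rule.
    """
    if len(points) < 4:
        raise ValueError("need at least four distinct points")
    ts = chord_length_params(points)
    p0, p3 = points[0], points[-1]
    a11 = a12 = a22 = 0.0
    r1 = [0.0, 0.0]
    r2 = [0.0, 0.0]
    for pt, t in zip(points, ts):
        b0, b1, b2, b3 = bezier_basis(t)
        a11 += b1 * b1
        a12 += b1 * b2
        a22 += b2 * b2
        for k in range(2):
            # Part of the point not explained by the fixed end control points.
            resid = pt[k] - b0 * p0[k] - b3 * p3[k]
            r1[k] += b1 * resid
            r2[k] += b2 * resid
    # Solve the 2x2 normal equations for the two inner control points.
    det = a11 * a22 - a12 * a12
    p1 = tuple((a22 * r1[k] - a12 * r2[k]) / det for k in range(2))
    p2 = tuple((a11 * r2[k] - a12 * r1[k]) / det for k in range(2))
    return [p0, p1, p2, p3]


def fit_cost(points, ctrl):
    """Hypothetical fitness cost: largest distance between samples and curve."""
    ts = chord_length_params(points)
    worst = 0.0
    for (px, py), t in zip(points, ts):
        bx, by = bezier_point(ctrl, t)
        worst = max(worst, math.hypot(px - bx, py - by))
    return worst
```

A segment whose cost stays above a chosen tolerance could be split at another corner point and refitted; that tolerance is a tuning parameter not given in the abstract.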
Pub Date: 2001-09-10 | DOI: 10.1109/ICDAR.2001.953945
Automatic signature verification based on the dynamic feature of pressure
K. Tanabe, M. Yoshihara, H. Kameya, S. Mori, S. Omata, Tatsuro Ito
Proceedings of Sixth International Conference on Document Analysis and Recognition
A feasibility experiment on an automatic signature verification system based on writing pressure was conducted using a new device that senses the z-axis component of writing pressure. Data were acquired over half a year to check the stability of both the subjects and the device. The DP (dynamic programming) matching method yielded error rates of 6% for type I and type II errors. The nature of signature writing pressure was also investigated.
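DP matching, as named in the abstract above, aligns two sequences of different lengths before comparing them. The sketch below shows one common form of it (dynamic time warping over pressure samples with an absolute-difference cost). The abstract does not give the authors' cost function, path constraints, or decision rule, so the local cost, the `threshold` parameter, and the function names are assumptions.

```python
def dp_matching_distance(reference, candidate):
    """Accumulated cost of the best monotonic alignment of two sequences."""
    n, m = len(reference), len(candidate)
    inf = float("inf")
    # table[i][j]: best cost of aligning the first i and first j samples.
    table = [[inf] * (m + 1) for _ in range(n + 1)]
    table[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(reference[i - 1] - candidate[j - 1])
            table[i][j] = cost + min(
                table[i - 1][j],      # reference sample repeated
                table[i][j - 1],      # candidate sample repeated
                table[i - 1][j - 1],  # both advance
            )
    return table[n][m]


def verify_signature(reference, candidate, threshold):
    """Accept the candidate if its DP distance to the reference is small.

    The threshold is hypothetical; raising it lowers type I errors
    (false rejections) at the price of more type II errors (false
    acceptances).
    """
    return dp_matching_distance(reference, candidate) <= threshold
```

For example, `dp_matching_distance([1, 2, 3], [1, 2, 2, 3])` is zero because the repeated sample is absorbed by the alignment, which is what makes the measure tolerant of variation in writing speed.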
Pub Date: 2001-09-10 | DOI: 10.1109/ICDAR.2001.953864
A graph-based segmentation and feature extraction framework for Arabic text recognition
A. Elgammal, M. Ismail
Proceedings of Sixth International Conference on Document Analysis and Recognition
This paper presents a graph-based framework for the segmentation of Arabic text. The same framework is used to extract font-independent structural features from the text for use in recognition. The major contribution of this paper is a new graph-based structural segmentation approach based on the topological relation between the baseline and the line adjacency graph representation of the text. The text is segmented into sub-character units that we call "scripts". A structure analysis approach is used for recognition of these units. A different classifier is used to recognize dots and diacritic signs. The final character recognition is achieved by using a regular grammar that describes how characters are composed from scripts.
Pub Date: 2001-09-10 | DOI: 10.1109/ICDAR.2001.953841
Document understanding using probabilistic relaxation: application on tables of contents of periodicals
Frank Lebourgeois, H. Emptoz, S. Souafi-Bensafi
Proceedings of Sixth International Conference on Document Analysis and Recognition
This paper describes a statistical model for a document understanding system, which uses both text attributes and document layouts. Probabilistic relaxation is used as a recognition scheme to find the hierarchical structure of the logical layout. This approach, commonly used for pixel classification in image analysis, can be applied to classify text blocks into logical classes according to local compatibility with other neighboring blocks at different hierarchical levels. It provides a logical layout that is globally compatible with the training model. We have tested this approach on reading tables of contents of periodicals for document indexing. Probabilistic relaxation has interesting properties such as high-speed training and an 'a priori' recognition rate, which indicates the consistency of the model with respect to the features used and the samples chosen from the training set.
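To make the relaxation scheme in the abstract above concrete, here is a minimal sketch of one iteration of the classic probabilistic relaxation update (the Rosenfeld, Hummel, and Zucker form), applied to text blocks instead of pixels. The abstract does not say which variant the authors use, so the update rule, the single shared compatibility table, and all names here are assumptions; in a real system the compatibilities would be estimated from training data.

```python
def relaxation_step(probs, compat, weights):
    """Run one relaxation iteration over all blocks.

    probs[i][a]   probability that block i has logical label a
    compat[a][b]  compatibility in [-1, 1] of label a next to label b
                  (assumed identical for every pair of neighbors)
    weights[i][j] influence of block j on block i; each row sums to 1
    """
    n_blocks = len(probs)
    n_labels = len(probs[0])
    updated = []
    for i in range(n_blocks):
        unnormalized = []
        for a in range(n_labels):
            # Support that the neighbors of block i give to label a.
            support = 0.0
            for j in range(n_blocks):
                if j == i or weights[i][j] == 0.0:
                    continue
                support += weights[i][j] * sum(
                    compat[a][b] * probs[j][b] for b in range(n_labels)
                )
            unnormalized.append(probs[i][a] * (1.0 + support))
        total = sum(unnormalized)
        updated.append([u / total for u in unnormalized])
    return updated
```

With two neighboring blocks and two hypothetical labels that favor agreement (`compat = [[1, -1], [-1, 1]]`), an undecided block with probabilities `[0.5, 0.5]` next to a confident block with `[0.9, 0.1]` moves to roughly `[0.9, 0.1]` after one step. Repeating the step until the probabilities stop changing gives the labeling.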
Pub Date: 2001-09-10 | DOI: 10.1109/ICDAR.2001.953824
Off-line hand-written character recognition using integrated 1D HMMs based on feature extraction filters
Hiromitsu Nishimura, Masayoshi Tsutsumi, M. Maruyama, H. Miyao, Y. Nakano
Proceedings of Sixth International Conference on Document Analysis and Recognition
The purpose of our research is to improve the recognition rate of an off-line handwritten character recognition system using HMMs (hidden Markov models), so that the system can be used for practical applications. Due to the insufficient recognition rate of 1D HMM character recognition systems and the requirement for a huge number of learning samples to construct 2D HMM character recognition systems, HMM-based character recognition systems have not yet achieved sufficient recognition performance for practical use. In this research, we propose a character recognition method that integrates four simply structured 1D HMMs, all of which are based on feature extraction using linear filters. The results of our evaluation experiment using the Hand-Printed Character Database (ETL6) showed that the first-rank recognition rate of the test samples was 98.5% and that the cumulative recognition rate of the top 3 candidates was 99.3%. Although our method is relatively easy to implement, it can work even better than the 2D HMM method. These results show the proposed method is very effective.
Pub Date: 2001-09-10 | DOI: 10.1109/ICDAR.2001.953808
A multi-net local learning framework for pattern recognition
Jian-xiong Dong, A. Krzyżak, C. Suen
Proceedings of Sixth International Conference on Document Analysis and Recognition
This paper proposes a general local learning framework that alleviates the complexity of classifier design by means of the "divide and conquer" principle and ensemble methods. The learning framework consists of a quantization layer and an ensemble layer. After GLVQ and MLP are applied to the framework, the proposed method is tested on the MNIST handwritten digit database. The obtained performance is very promising: an error rate of 0.99%, which is comparable to that of LeNet5, one of the best classifiers on this database. Further, in contrast to LeNet5, our method is especially suitable for large-scale real-world classification problems.
Pub Date: 2001-09-10 | DOI: 10.1109/ICDAR.2001.953892
Baseline structure analysis of handwritten mathematics notation
R. Zanibbi, D. Blostein, J. Cordy
Proceedings of Sixth International Conference on Document Analysis and Recognition
The structure of mathematics notation is particularly difficult to recognize in handwritten notation because irregular symbol placements are common. We present an efficient and robust method of parsing handwritten and typeset mathematics notation without backtracking. The system is designed to be easily adaptable to various dialects of mathematics notation. The following strategies are used: (1) separate the analysis of layout, syntax, and semantics, (2) recursively apply search functions and image partitioning to recognize dominant and nested baselines, and (3) use tree transformations to express computations in a compact, efficiently executable form.
Pub Date: 2001-09-10 | DOI: 10.1109/ICDAR.2001.953954
Improvement of video text recognition by character selection
T. Mita, O. Hori
Proceedings of Sixth International Conference on Document Analysis and Recognition
This paper proposes a new method for improving the recognition accuracy of video text by exploiting the temporal redundancy of video. The proposed method divides the video into short segments and obtains several recognition results from some of these segments. The video segments have various backgrounds because the background image changes over time due to camera work or object motion. These recognition results from diverse backgrounds are integrated into a single text string after selecting the best recognition results of individual characters. The proposed method was tested on a large set of news video sequences. Experimental results show that the proposed method increased the number of correct characters by 3.1% and the number of strings which do not include any recognition errors by 8.1%.
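The integration step in the abstract above, selecting the best result for each character across video segments, can be illustrated with a short sketch. The abstract does not say how "best" is measured or how characters from different segments are aligned, so this example assumes each segment yields one `(character, confidence)` pair per position, that all segments produce strings of equal length, and that the highest confidence wins. These are illustrative assumptions, not the authors' procedure.

```python
def select_characters(segment_results):
    """Merge per-segment recognition results into one text string.

    segment_results is a list with one entry per video segment; each
    entry is a list of (character, confidence) pairs, one per character
    position. Positions are assumed to be aligned across segments.
    """
    if not segment_results:
        return ""
    length = len(segment_results[0])
    if any(len(result) != length for result in segment_results):
        raise ValueError("all segments must yield the same number of characters")
    chosen = []
    for pos in range(length):
        # Keep the candidate with the highest confidence at this position.
        best_char, _ = max(
            (result[pos] for result in segment_results),
            key=lambda pair: pair[1],
        )
        chosen.append(best_char)
    return "".join(chosen)
```

Selecting per character, instead of picking one whole string, lets a character that is clean against one background compensate for the same character being corrupted against another.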
Pub Date: 2001-09-10 | DOI: 10.1109/ICDAR.2001.953888
Line detection and segmentation in historical church registers
Markus Feldbach, Klaus D. Tönnies
Proceedings of Sixth International Conference on Document Analysis and Recognition
To automatically acquire the information recorded in church registers and other historical scriptures, the writing on these documents has to be recognized. This paper describes algorithms for transforming the paper documents into a representation of text suitable as input for an automatic text recognizer. The automatic recognition of old handwritten scriptures is difficult for two main reasons: lines of text are generally not straight, and ascenders and descenders of adjacent lines interfere. The algorithms described in this paper reconstruct the path of the lines of text by gradually constructing line segments until a unique line of text is formed. In addition, the single lines are segmented and the output is provided in the form of a raster image. The method was applied to church registers written between the 17th and 19th centuries. Line segmentation was found to be successful in 97% of all samples.