Pub Date: 2002-08-06 DOI: 10.1109/IWFHR.2002.1030958
N. Mezghani, A. Mitiche, M. Cheriet
Neural networks have been applied to various pattern classification and recognition problems for their learning ability, discrimination power and generalization ability. The neural networks most referenced in the pattern recognition literature are the multi-layer perceptron, the Kohonen associative memory and the Carpenter-Grossberg ART network. The Kohonen memory runs an unsupervised clustering algorithm. It is easily trained and has attractive properties such as topological ordering and good generalization. In this study, an on-line system for the recognition of handwritten Arabic characters using a Kohonen network is investigated. The input of the neural network is a feature vector of elliptic Fourier coefficients extracted from the dynamic representation of the handwriting. Experimental results show that the network recognizes both clearly and roughly written characters with good performance.
Title: On-line recognition of handwritten Arabic characters using a Kohonen neural network
Venue: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition
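The unsupervised training of a Kohonen map described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' system: the grid size, the linear learning-rate and neighbourhood schedules, and the names `train_som` and `best_matching_unit` are all hypothetical, and the input vectors merely stand in for elliptic Fourier coefficient features.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position (row, col) of the unit whose weights are closest to x."""
    best, best_d2 = None, float("inf")
    for pos, w in weights.items():
        d2 = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
        if d2 < best_d2:
            best, best_d2 = pos, d2
    return best


def train_som(data, rows=3, cols=3, epochs=30, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small Kohonen map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Assumption: each unit starts from a randomly chosen training sample.
    weights = {(r, c): list(rng.choice(data)) for r in range(rows) for c in range(cols)}
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(data)
        rng.shuffle(order)
        for x in order:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)              # learning rate decays linearly
            sigma = 0.3 + sigma0 * (1.0 - frac)  # neighbourhood radius shrinks
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood measured on the grid, not in feature space.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights
```

To use such a map as a classifier, each unit would still need a class label, for example by majority vote over the training samples it wins; the abstract does not say how the authors handle this step.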
Pub Date: 2002-08-06 DOI: 10.1109/IWFHR.2002.1030948
T. Breuel
Segmentation is a key step in many off-line handwriting recognition systems but, to date, there are almost no ground truth segmentation databases and no widely accepted and formally defined metrics for segmentation performance. This paper proposes a representation of segmentations and presegmentations in terms of color images. Such representations allow convenient interchange of ground truth and hypothesized segmentations in the form of standard image formats. The paper formally defines the notions of oversegmentation and undersegmentation in terms of the maximal bipartite match between corresponding pixels. It also defines a number of metrics that quantify the frequency and extent of events in handwriting like kerning, splitting, and merging of characters. It is hoped that these metrics and representations will find wider use in the community and serve as a basis for creating standard training and test databases of segmentation data.
Title: Representations and metrics for off-line handwriting segmentation
Venue: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition
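The idea of storing segmentations as label images and comparing them pixel by pixel can be illustrated with a small sketch. It is a simplified overlap count, not the maximal bipartite match that the paper formally defines: the function names, the `min_pixels` threshold and the convention that label 0 means background are assumptions made for the example.

```python
from collections import Counter, defaultdict


def overlap_counts(truth, hypo, background=0):
    """Count pixels shared by each (truth label, hypothesis label) pair."""
    counts = Counter()
    for t_row, h_row in zip(truth, hypo):
        for t, h in zip(t_row, h_row):
            if t != background and h != background:
                counts[(t, h)] += 1
    return counts


def segmentation_events(truth, hypo, min_pixels=1):
    """Return (over, under): truth labels split across several hypothesis
    segments, and hypothesis labels that cover several truth segments."""
    t_to_h, h_to_t = defaultdict(set), defaultdict(set)
    for (t, h), n in overlap_counts(truth, hypo).items():
        if n >= min_pixels:
            t_to_h[t].add(h)
            h_to_t[h].add(t)
    over = sorted(t for t, hs in t_to_h.items() if len(hs) > 1)
    under = sorted(h for h, ts in h_to_t.items() if len(ts) > 1)
    return over, under
```

Here each image is a list of rows of integer labels, which plays the role of the colour values in the proposed representation. Raising `min_pixels` would ignore overlaps of only a few pixels, which may be useful for touching characters.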
Pub Date: 2002-08-06 DOI: 10.1109/IWFHR.2002.1030932
Yong Ge, Qiang Huo
We have investigated how to use Gaussian mixture continuous-density hidden Markov models (CDHMMs) for handwritten Chinese character modeling and recognition. We have identified and developed a set of techniques that can be used to construct a practical CDHMM-based off-line recognition system for a large vocabulary of handwritten Chinese characters. We have reported elsewhere the key techniques that contribute to the high recognition accuracy. In this paper, we describe how to make our recognizer compact without sacrificing too much recognition accuracy. We also report the results of a series of experiments performed to guide our choice among several design options.
Title: A study on the use of CDHMM for large vocabulary off-line recognition of handwritten Chinese characters
Venue: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition
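For readers unfamiliar with the model class, the standard state-output density of a Gaussian mixture CDHMM is given below. The symbols are the usual textbook ones and are not taken from the paper: state j, observation vector o of dimension D, and K mixture components per state with weights c_jk.

```latex
b_j(\mathbf{o}) = \sum_{k=1}^{K} c_{jk}\,
  \mathcal{N}\!\left(\mathbf{o};\,\boldsymbol{\mu}_{jk},\,\boldsymbol{\Sigma}_{jk}\right),
\qquad \sum_{k=1}^{K} c_{jk} = 1,\quad c_{jk} \ge 0 .
```

As one illustration of the size/accuracy trade-off, a state with diagonal covariances stores K(2D + 1) numbers, whereas full covariances need K(1 + D + D(D + 1)/2). The abstract does not say which compaction techniques the authors actually use.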
Pub Date: 2002-08-06 DOI: 10.1109/IWFHR.2002.1030914
T. Artières, P. Gallinari
The recent development of new terminals such as phones, mobile computers, e-books, etc., raises the need for new interface modalities to replace or complement the traditional mouse/keyboard interface. Ideally, these new interfaces should use limited computing resources and should be easy to adapt to a specific user and to a large variety of user needs. We propose here a new handwriting recognition system that attempts to handle these constraints. We evaluate its performance and its ability to adapt to new users on a part of the Unipen database.
Title: Stroke level HMMs for on-line handwriting recognition
Venue: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition
Pub Date: 2002-08-06 DOI: 10.1109/IWFHR.2002.1030951
José Josemar de Oliveira, J. Carvalho, C. Freitas, R. Sabourin
This paper presents a baseline system used to evaluate feature sets for word recognition. The main goal is to determine an optimum feature set to represent the handwritten names of the months of the year in Brazilian Portuguese. Three kinds of features are evaluated: perceptual, directional and topological. The evaluation shows that, taken in isolation, the perceptual feature set produces the best results for the lexicon used. These results can be further improved by combining the feature sets. The baseline system obtains an average recognition rate of 87%, which can be considered a good result given that no explicit segmentation is performed.
Title: Feature sets evaluation for handwritten word recognition
Venue: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition
Pub Date: 2002-08-06 DOI: 10.1109/IWFHR.2002.1030886
A. El-Nasan, M. Perrone
This paper describes an adaptive, partial-word-level, writer-dependent handwriting recognition system that utilizes the character n-gram statistics of the English language. The system exploits the linguistic property that very few pairs of English words share exactly the same set of character bigrams. This property is used to bring linguistic context to the recognition stage. The recognition is based on estimating the probability of bigram co-occurrences between words. Preliminary experiments using naive features and limited training sets show that the system can recognize over 60% of words it has never seen before in handwritten form. The system has only a few trainable parameters. In addition, incremental training is computationally inexpensive.
Title: On-line handwriting recognition using character bigram match vectors
Venue: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition
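The linguistic property the system relies on, that few word pairs share exactly the same set of character bigrams, is easy to check on a word list. The sketch below is only an illustration of that property: the function names are hypothetical, and `match_score` is a simple overlap ratio, not the probability estimate used in the paper.

```python
def char_bigrams(word):
    """Return the set of adjacent character pairs in a word."""
    w = word.lower()
    return frozenset(w[i:i + 2] for i in range(len(w) - 1))


def colliding_groups(lexicon):
    """Group the words of a lexicon that share exactly the same bigram set."""
    groups = {}
    for word in lexicon:
        groups.setdefault(char_bigrams(word), []).append(word)
    return [words for words in groups.values() if len(words) > 1]


def match_score(candidate, observed_bigrams):
    """Fraction of the candidate's bigrams found among the observed ones."""
    grams = char_bigrams(candidate)
    return len(grams & observed_bigrams) / len(grams) if grams else 0.0
```

In this toy setting, "hand" and "band" share two of their three bigrams ("an" and "nd"), so a recognizer that detects bigrams learned from one word has partial evidence for the other, which is how unseen words can become recognizable.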
Pub Date: 2002-08-06 DOI: 10.1109/IWFHR.2002.1030933
C. K. Tan
Existing Chinese educational software tools teach students at the primary or kindergarten level the stroke movements and stroke order of each Chinese character by animating the individual strokes of the whole character, one stroke at a time. In this paper, the author presents an algorithm that allows the student to hand-write a specified character online through the computer and that checks whether the individual stroke movements and the stroke order are correct. The proposed algorithm uses low-integer-valued coding to represent two categories of features: primitive-stroke features and character features. It is capable of identifying typical errors such as incorrect movement for a primitive stroke, incorrect stroke type, incorrect relative lengths, incorrect position of a stroke relative to the rest, an incorrect character that looks similar in appearance, incorrect order of strokes, and insufficient or extra strokes.
Title: An algorithm for online strokes verification of Chinese characters using discrete features
Venue: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition
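A few of the error categories listed above can be detected by comparing small integer codes, as the following sketch suggests. It covers only stroke count, stroke order and stroke type; the code values (1 for a horizontal stroke, 2 for a vertical stroke, and so on) and the function name `check_strokes` are hypothetical and are not the coding scheme of the paper.

```python
from collections import Counter


def check_strokes(template, written):
    """Compare a written sequence of stroke-type codes with the template
    sequence and return a list of error messages (empty if correct)."""
    if len(written) < len(template):
        return ["insufficient strokes"]
    if len(written) > len(template):
        return ["extra strokes"]
    if written == template:
        return []
    if Counter(written) == Counter(template):
        # Same strokes, different sequence.
        return ["incorrect stroke order"]
    return [
        f"stroke {i}: expected type {want}, got {got}"
        for i, (want, got) in enumerate(zip(template, written), start=1)
        if want != got
    ]
```

For a two-stroke character written as a horizontal stroke followed by a vertical one, the template would be `[1, 2]` under these assumed codes, and `check_strokes([1, 2], [2, 1])` reports an order error. Checks on movement direction, relative length and position would need additional per-stroke features.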
Pub Date: 2002-08-06 DOI: 10.1109/IWFHR.2002.1030939
D. Doermann, N. Intrator, E. Rivlin, T. Steinherz
One significant challenge in the recognition of off-line handwriting is the interpretation of loop structures. Although this information is readily available in the online representation, in off-line images the close proximity of strokes often merges the loops' centers, making the loops difficult to identify. In this paper, a novel approach to the recovery of hidden loops in off-line scanned document images is presented. The proposed algorithm seeks blobs that resemble truncated ellipses. We use a sophisticated form analysis method based on mutual distance measurements between the two sides of a symmetric shape. The experimental results are compared with the ground truth of the online representations of each off-line word image. More than 86% of the meaningful loops are handled correctly.
Title: Hidden loop recovery for handwriting recognition
Venue: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition
Pub Date: 2002-08-06 DOI: 10.1109/IWFHR.2002.1030924
A. Vinciarelli, Samy Bengio
This work presents the application of HMM adaptation techniques to the problem of off-line cursive script recognition. Instead of training a new model for each writer, one first creates a single model with a mixed database and then adapts it to each writer using that writer's own small dataset. Experiments on a publicly available benchmark database show that an adapted system has an accuracy higher than 80% even when fewer than 30 word samples are used during adaptation, while a system trained using only the data of a single writer needs at least 200 words (the estimate is a lower bound) to achieve the same performance as the adapted models.
Title: Writer adaptation techniques in off-line cursive word recognition
Venue: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition
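The abstract does not name the adaptation techniques used. Purely as an illustration of why a small writer-specific dataset can suffice, the sketch below shows the standard MAP update of a single Gaussian mean, where a prior mean from the writer-independent model is interpolated with the writer's samples. The function name, the prior weight `tau` and the assumption that every sample is fully assigned to this Gaussian are simplifications made for the example.

```python
def map_adapt_mean(prior_mean, samples, tau=10.0):
    """MAP-style update of a Gaussian mean.

    prior_mean: mean from the writer-independent model.
    samples:    writer-specific feature vectors assigned to this Gaussian
                (assumption: each with occupancy 1).
    tau:        prior weight; larger values trust the prior model more.
    """
    n = len(samples)
    if n == 0:
        return list(prior_mean)  # no adaptation data: keep the prior
    sample_sum = [sum(column) for column in zip(*samples)]
    return [(tau * m + s) / (tau + n) for m, s in zip(prior_mean, sample_sum)]
```

With few samples the result stays close to the writer-independent mean, and it moves toward the writer's own average as more data arrives, which is consistent with the behaviour reported for small adaptation sets.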
Pub Date: 2002-08-06 DOI: 10.1109/IWFHR.2002.1030922
A. Bensefia, A. Nosary, T. Paquet, L. Heutte
This communication deals with the problem of writer identification. If the assumption of writing individuality is true, then the graphical fragments that constitute a piece of writing should be individual too. We therefore propose a morphological, grapheme-based analysis to perform writer identification. Template matching is the core of the approach. The redundancy of the individual patterns in a piece of writing, defined as the writer's invariants, makes it possible to compress the handwritten texts while maintaining good identification performance. Two series of tests are reported. The first series evaluates the relevance of our identification approach on a database of 88 writers by measuring the influence of the text representation (with or without invariants) on the quality of the method. The method gives about 97.7% correct identification when using large compressed samples of handwriting. The second series evaluates the influence of the sample size of the writing to be identified on the quality of the method. It is shown that writer identification can reach a correct identification rate of 92.9% using only samples of 50 graphemes from each piece of writing.
Title: Writer identification by writer's invariants
Venue: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition