Pub Date: 2002-08-06; DOI: 10.1109/IWFHR.2002.1030888
Jun-ichi Tokuno, Nobuhito Inami, Shigeki Matsuda, M. Nakai, H. Shimodaira, S. Sagayama
Describes context-dependent substroke hidden Markov models (HMMs) for on-line handwriting recognition of cursive Kanji and Hiragana characters. To tackle this problem, we have proposed the substroke HMM approach, in which a modeling unit, the "substroke", that is much smaller than a whole character is employed, and each character is modeled as a concatenation of only 25 kinds of substroke HMMs. One drawback of this approach is that recognition accuracy deteriorates for scribbled characters and for characters in which the shape of the substrokes varies a lot. We show that context-dependent substroke modeling, which depends on how a substroke connects to its adjacent substrokes, is effective for achieving robust recognition of low-quality characters. The successive state splitting algorithm, which was developed mainly for speech recognition, is employed to construct the context-dependent substroke HMMs. Experimental results show that the correct recognition rate improved from 88% to 92% for cursive Kanji handwriting and from 90% to 98% for Hiragana handwriting.
Title: Context-dependent substroke model for HMM-based on-line handwriting recognition. Published in: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition. URL: https://doi.org/10.1109/IWFHR.2002.1030888
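The accuracy gains quoted in the substroke HMM abstract can also be read as relative reductions in error rate. The Python sketch below is only arithmetic on the four percentages given in that abstract; the function name `relative_error_reduction` is introduced here for illustration and does not come from the paper.

```python
def relative_error_reduction(accuracy_before: float, accuracy_after: float) -> float:
    """Fraction of the original errors removed, given accuracies in percent."""
    error_before = 100.0 - accuracy_before
    error_after = 100.0 - accuracy_after
    return (error_before - error_after) / error_before


# Figures quoted in the abstract (percent correct recognition).
kanji = relative_error_reduction(88.0, 92.0)     # errors fall from 12% to 8%
hiragana = relative_error_reduction(90.0, 98.0)  # errors fall from 10% to 2%

print(f"Kanji: {kanji:.0%} fewer errors")
print(f"Hiragana: {hiragana:.0%} fewer errors")
```

Read this way, the Kanji result removes about a third of the errors and the Hiragana result removes about four fifths of them.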
Pub Date: 2002-08-06; DOI: 10.1109/IWFHR.2002.1030920
Sung-Hyuk Cha, C. Tappert
We investigated the detection of handwriting forgery by both human and machine. We obtained experimental handwriting data from subjects who wrote samples in their natural style and wrote forgeries of other subjects' handwriting. These handwriting samples were digitally scanned and stored in an image database. We investigated the ease of forging handwriting and found that many subjects can successfully forge the handwriting of others, in terms of shape and size, by tracing the authentic handwriting. Our hypothesis is that authentic handwriting samples provided by subjects in their own natural writing style will have smooth ink traces, while forged handwriting will have wrinkly traces. We believe the reason is that forged handwriting is often either traced or copied slowly and is therefore more likely to appear wrinkly when scanned with a high-resolution scanner. Using seven handwriting distance features, we trained an artificial neural network to achieve 89% accuracy on test samples.
Title: Automatic detection of handwriting forgery. Published in: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition. URL: https://doi.org/10.1109/IWFHR.2002.1030920
Pub Date: 2002-08-06; DOI: 10.1109/IWFHR.2002.1030927
Cheng-Lin Liu, H. Sako, H. Fujisawa
In integrated segmentation and recognition (ISR) of handwritten character strings, the underlying classifier should be accurate in character classification and resistant to non-character patterns (also called garbage or outliers). This paper compares the performance of a number of statistical and neural classifiers in ISR. Each classifier has several variants depending on the learning method: maximum likelihood estimation (MLE), discriminative learning (DL) under the minimum square error (MSE) or minimum classification error (MCE) criterion, or enhanced DL (EDL) with outlier samples. A heuristic pre-segmentation method is proposed to generate candidate cuts and character patterns. The methods were tested on the 5-digit Zip code images in CEDAR CDROM-1. The results show that training with outliers is crucial for neural classifiers in ISR. The best result was given by the learning quadratic discriminant function (LQDF) classifier.
Title: Integrated segmentation and recognition of handwritten numerals: comparison of classification algorithms. Published in: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition. URL: https://doi.org/10.1109/IWFHR.2002.1030927
Pub Date: 2002-08-06; DOI: 10.1109/IWFHR.2002.1030946
J. Allan, Tony Allen, N. Sherkat
This paper highlights the research issues associated with the automated assessment of handwritten scripts and introduces the concept of theoretical scoring confidence. Using this concept, in a three-word response environment, we prove that it is theoretically possible to achieve a scoring confidence greater than 98% using recognition rates as low as 81% to produce actual response yields of 50%. These results are verified by experiment.
Title: Automated assessment: how confident are we? Published in: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition. URL: https://doi.org/10.1109/IWFHR.2002.1030946
Pub Date: 2002-08-06; DOI: 10.1109/IWFHR.2002.1030964
G. Dimauro, S. Impedovo, R. Modugno, G. Pirlo
This paper presents a new database for off-line handwriting recognition. The database, which is devoted primarily to research on bank-check recognition, currently includes instances of isolated digits and characters, basic words of worded amounts, and signatures. Pattern images are stored in a standard image format, so they can be used readily by several commercial and scientific image processing packages.
Title: A new database for research on bank-check processing. Published in: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition. URL: https://doi.org/10.1109/IWFHR.2002.1030964
Pub Date: 2002-08-06; DOI: 10.1109/IWFHR.2002.1030963
S. Jeong, Kil-Taek Lim, Yun-Seok Nam
This paper describes some research results related to the development of a character recognizer suitable for fast reading of handwritten Korean addresses. Our goal is to design a handwritten Korean character recognizer with the following three characteristics: reliable recognition scores indicating probability, high speed, and naturally acceptable cumulative recognition rates. We have adopted two statistical classifiers to satisfy the first characteristic and have proposed methods for coupling the two classifiers to meet the second and third. The superiority of the proposed combination methods has been demonstrated through experiments with the PE92 database.
Title: A combination method of two classifiers based on the information of confusion matrix. Published in: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition. URL: https://doi.org/10.1109/IWFHR.2002.1030963
Pub Date: 2002-08-06; DOI: 10.1109/IWFHR.2002.1030885
A. Biem
We describe an application of the minimum classification error (MCE) training criterion to online unconstrained-style word recognition. The described system uses allograph-HMMs to handle writer variability. Results on vocabularies of 5k to 10k show that MCE training achieves a word error rate reduction of around 17% compared with the baseline maximum likelihood system.
Title: Minimum classification error training for online handwritten word recognition. Published in: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition. URL: https://doi.org/10.1109/IWFHR.2002.1030885
Pub Date: 2002-08-06; DOI: 10.1109/IWFHR.2002.1030935
Su Yang, G. Dai
We propose a new dominant point detection method with the following advantages: robustness, computational efficiency, and real-time response to pen movement. We construct a variable that is the ratio of the height to the width of an imagined rectangle whose bottom coincides with the polygon enclosed by the pen movement trace and whose area equals the polygonal area. By monitoring online whether the value of this variable exceeds a given threshold, one can find dominant points in real time. The value can exceed the threshold only when the fluctuation is large enough relative to the scale of the curve. In this way, pseudo turning points are rejected. As each new point comes in, only a small number of computations are needed to update the value of the variable. The effectiveness of the method was confirmed by experiments.
Title: Detecting dominant points on online scripts with a simple approach. Published in: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition. URL: https://doi.org/10.1109/IWFHR.2002.1030935
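One possible reading of the height-to-width variable in the dominant point abstract can be sketched in Python. The abstract does not give the construction in detail, so several things here are assumptions rather than the authors' method: the rectangle's base is taken to be the chord from the start of the current segment to the newest pen point, the polygon is the trace closed by that chord, the threshold of 0.1 is arbitrary, and the segment restarts at each detected point. The function name `dominant_points` is hypothetical.

```python
from math import hypot


def dominant_points(trace, threshold=0.1):
    """Return indices at which the height/width ratio first exceeds threshold.

    trace is a list of (x, y) pen positions in arrival order.
    """
    found = []
    start = 0          # index where the current segment begins (assumption)
    twice_area = 0.0   # running shoelace sum, measured from the segment start
    for i in range(1, len(trace)):
        x0, y0 = trace[start]
        px, py = trace[i - 1][0] - x0, trace[i - 1][1] - y0
        qx, qy = trace[i][0] - x0, trace[i][1] - y0
        # Constant-time update: add the signed area of one new triangle.
        twice_area += px * qy - py * qx
        width = hypot(qx, qy)  # length of the chord used as the rectangle base
        if width == 0.0:
            continue
        # height = area / width, so height / width = area / width**2
        ratio = (abs(twice_area) / 2.0) / (width * width)
        if ratio > threshold:
            found.append(i)
            start = i
            twice_area = 0.0
    return found


straight = [(0, 0), (1, 0), (2, 0), (3, 0)]
corner = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
print(dominant_points(straight))
print(dominant_points(corner))
```

In this sketch the flagged index is the point at which the threshold is first exceeded, which can lag the actual corner by a sample. Because the shoelace sum is signed, a trace that crosses itself may partly cancel its own area; a fuller implementation would need to decide how to handle that case.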
Pub Date: 2002-08-06; DOI: 10.1109/IWFHR.2002.1030894
M. Morita, R. Sabourin, Flávio Bortolozzi, C. Suen
Presents an HMM-MLP hybrid system for recognizing complex date images written on Brazilian bank cheques. The system first implicitly segments a date image into sub-fields through an HMM-based recognition process. The three obligatory date sub-fields (day, month and year) are then processed by the system. A neural approach has been adopted to handle strings of digits, and a Markovian strategy to recognize and verify words. We also introduce the concept of meta-classes of digits, which is used to reduce the lexicon size of the day and year and to improve the precision of their segmentation and recognition. Experiments show interesting results on date recognition.
Title: Segmentation and recognition of handwritten dates. Published in: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition. URL: https://doi.org/10.1109/IWFHR.2002.1030894
Pub Date: 2002-08-06; DOI: 10.1109/IWFHR.2002.1030897
H. Xue, V. Govindaraju
Contextual character geometry is the geometric information available only when a character appears in the context of a word. Such information includes the character's location and relative size within the entire word image, which together form a bounding box for the character. The differences between the geometry of an image segment and the expected geometry of a candidate character are used as additional features to refine the recognition of individual characters. A typical word recognizer based on over-segmentation and segment combination is used to illustrate the use of these new features, and experimental results show a significant improvement in recognition accuracy, especially on large lexicons.
Title: Incorporating contextual character geometry in word recognition. Published in: Proceedings Eighth International Workshop on Frontiers in Handwriting Recognition. URL: https://doi.org/10.1109/IWFHR.2002.1030897
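The idea of comparing a segment's geometry with the expected geometry of a candidate character can be illustrated with a minimal Python sketch. It is not the authors' feature definition: the `Box` type, the normalization by the word's bounding box, the choice of four edge differences as features, and the example numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    left: float
    top: float
    right: float
    bottom: float


def normalize(segment: Box, word: Box) -> Box:
    """Express a segment's bounding box in word-relative coordinates (0 to 1)."""
    width = word.right - word.left
    height = word.bottom - word.top
    return Box(
        (segment.left - word.left) / width,
        (segment.top - word.top) / height,
        (segment.right - word.left) / width,
        (segment.bottom - word.top) / height,
    )


def geometry_features(segment: Box, word: Box, expected: Box) -> list:
    """Differences between observed and expected word-relative box edges."""
    observed = normalize(segment, word)
    return [
        observed.left - expected.left,
        observed.top - expected.top,
        observed.right - expected.right,
        observed.bottom - expected.bottom,
    ]


# Hypothetical numbers: a segment whose top sits lower than the candidate
# character would be expected to reach within this word image.
word = Box(0, 0, 100, 50)
segment = Box(10, 10, 30, 50)
expected = Box(0.1, 0.0, 0.3, 1.0)
print(geometry_features(segment, word, expected))
```

Features of this kind would be appended to whatever shape features the character classifier already uses, so that a candidate whose expected position or size disagrees with the segment is penalized.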