Recognising letters in on-line handwriting using hierarchical fuzzy inference
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620648
A. Hennig, N. Sherkat, R. Whitrow
The recognition of unconstrained handwriting has to cope with the ambiguity and variability of cursive script. Pre-processing techniques are often applied to on-line data before the script is represented as basic primitives, so that errors introduced during pre-processing propagate into the primitives. This paper therefore combines pre-processing of the data (i.e. tangential smoothing) and encoding into primitives (Partial Strokes) in a single step. Finding the correct character at the correct place (i.e. letter spotting) is the main problem in non-holistic recognition approaches. Many cursive letters are composed of common shapes of varying complexity that can in turn consist of other subshapes. In this paper, we present a production rule system using Hierarchical Fuzzy Inference to exploit this hierarchical property of cursive script. Shapes of increasing complexity are found on a page of handwriting until letters are finally spotted. Zoning is then applied to verify their vertical position. The performance of letter spotting is compared with an alternative method.
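For intuition, here is a minimal Python sketch (not the authors' system) of how hierarchical fuzzy inference can chain production rules: primitive memberships feed intermediate shapes, which in turn feed letter hypotheses. The shape names, membership values, and the min/max operators are illustrative assumptions.

```python
# A minimal sketch of hierarchical fuzzy inference over shape hierarchies.
# Shape names, membership degrees, and operators are assumptions, not the
# paper's rule base.

def fuzzy_and(*degrees):
    return min(degrees)   # conjunction of rule antecedents

def fuzzy_or(*degrees):
    return max(degrees)   # alternative productions for the same shape

# Level 0: memberships of primitives (e.g. partial strokes) in basic shapes.
primitives = {"cup": 0.8, "cusp": 0.6, "loop": 0.9}

# Level 1: an intermediate shape built from primitives.
def arch(p):
    return fuzzy_and(p["cup"], p["cusp"])

# Level 2: a letter hypothesis built from intermediate shapes.
def letter_a(p):
    # 'a' as a loop followed by an arch, or (down-weighted) as two cups.
    return fuzzy_or(fuzzy_and(p["loop"], arch(p)),
                    fuzzy_and(p["cup"], p["cup"]) * 0.5)

print(f"degree('a') = {letter_a(primitives):.2f}")  # -> 0.60
```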
{"title":"Recognising letters in on-line handwriting using hierarchical fuzzy inference","authors":"A. Hennig, N. Sherkat, R. Whitrow","doi":"10.1109/ICDAR.1997.620648","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620648","url":null,"abstract":"The recognition of unconstrained handwriting has to cope with the ambiguity and variability of cursive script. Preprocessing techniques are often applied to on-line data before representing the script as basic primitives, resulting in the propagation of errors introduced during pre-processing. This paper therefore combines pre-processing of the data (i.e. tangential smoothing) and encoding into primitives (Partial Strokes) in a single step. Finding the correct character at the correct place (i.e. letter spotting) is the main problem in non-holistic recognition approaches. Many cursive letters are composed of common shapes of varying complexity that can in turn consist of other subshapes. In this paper, we present a production rule system using Hierarchical Fuzzy Inference in order to exploit this hierarchical property of cursive script. Shapes of increasing complexity are found on a page of handwriting until letters are finally spotted. Zoning is then applied to verify their vertical position. The performance of letter spotting is compared with an alternative method.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123487547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Finding straight lines in drawings
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620618
Juan F. Arias, A. K. Chhabra, Vishal Misra
We have developed an efficient method to extract straight lines at any orientation from a line drawing. The method works by extracting the horizontal and vertical lines using the FAST method, detecting the angles of the remaining lines, and applying the FAST method again after rotating the image to each detected angle. The method is efficient because the line-finding, transposition, and rotation operations all work directly on the run-length representation of the line drawing.
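As a toy illustration of why run-length representations make horizontal line finding cheap (this is not the FAST algorithm itself), the sketch below scans row runs and keeps those long enough to be line segments; vertical lines would be handled the same way after transposing the run-length data, and oblique ones after rotation. The threshold and data layout are assumptions.

```python
# Sketch: horizontal line segments found directly on a run-length encoding.
# Each row maps to a list of (start_col, length) runs of black pixels.
rle_image = {
    0: [(10, 3)],
    1: [(5, 40), (50, 2)],   # the long run is likely part of a horizontal line
    2: [(5, 41)],
}

MIN_LINE_LENGTH = 30  # runs shorter than this are treated as text or noise

def horizontal_segments(rle, min_len=MIN_LINE_LENGTH):
    """Yield (row, start_col, length) for runs long enough to be line segments."""
    for row, runs in sorted(rle.items()):
        for start, length in runs:
            if length >= min_len:
                yield (row, start, length)

for seg in horizontal_segments(rle_image):
    print("horizontal segment:", seg)   # (1, 5, 40) and (2, 5, 41)
```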
{"title":"Finding straight lines in drawings","authors":"Juan F. Arias, A. K. Chhabra, Vishal Misra","doi":"10.1109/ICDAR.1997.620618","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620618","url":null,"abstract":"We have developed an efficient method to extract straight lines at any orientation from a line drawing. The method works by extracting the horizontal and vertical lines using the FAST method, detecting the angles of the other lines and applying the FAST method again while the image is rotated to each corresponding angle. The method is efficient because it is based on very efficient line finding, transposition, and rotation operations which work over the run-length representation of the line drawing.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122035375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Logo and word matching using a general approach to signal registration
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619814
P. Suda, C. Bridoux, B. Kämmerer, G. Maderlechner
The paper presents work in the field of logo and word recognition. The approach is based on a general theory of signal registration and is thus applicable to a broad variety of signal-processing domains. It has been applied successfully to speech and handwriting recognition, as well as to tasks in the field of document analysis.
{"title":"Logo and word matching using a general approach to signal registration","authors":"P. Suda, C. Bridoux, B. Kämmerer, G. Maderlechner","doi":"10.1109/ICDAR.1997.619814","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619814","url":null,"abstract":"The paper presents work in the field of logo and word recognition. The approach is based on a general theory for signal registration and is thus applicable to a broad variety of signal processing domains. It has been fruitfully applied to solve speech and handwriting recognition as well as tasks in the field of document analysis.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124448982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The function of documents
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620674
D. Doermann, E. Rivlin, A. Rosenfeld
The purpose of a document is to facilitate the transfer of information from its author to its readers. It is the author's job to design the document so that the information it contains can be interpreted accurately and efficiently. To do this, the author can make use of a set of stylistic tools. In this paper, we introduce the concept of document functionality, which attempts to describe the roles of documents and their components in the process of transferring information. A functional description of a document provides insight into the type of the document, into its intended uses, and into strategies for automatic document interpretation and retrieval. To demonstrate these ideas, we define a taxonomy of functional document components and show how functional descriptions can be used to reverse-engineer the intentions of the author, to navigate in document space, and to provide important contextual information to aid in interpretation.
{"title":"The function of documents","authors":"D. Doermann, E. Rivlin, A. Rosenfeld","doi":"10.1109/ICDAR.1997.620674","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620674","url":null,"abstract":"The purpose of a document is to facilitate the transfer of information from its author to its readers. It is the author's job to design the document so that the information it contains can be interpreted accurately and efficiently. To do this, the author can make use of a set of stylistic tools. In this paper, we introduce the concept of document functionality, which attempts to describe the roles of documents and their components in the process of transferring information. A functional description of a document provides insight into the type of the document, into its intended uses, and into strategies for automatic document interpretation and retrieval. To demonstrate these ideas, we define a taxonomy of functional document components and show how functional descriptions can be used to reverse-engineer the intentions of the author, to navigate in document space, and to provide important contextual information to aid in interpretation.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126161990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The HOVER system for rapid holistic verification of off-line handwritten phrases
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620633
S. Madhvanath, Evelyn Kleinberg, V. Govindaraju, S. Srihari
The authors describe ongoing research on a system for rapid verification of unconstrained off-line handwritten phrases using perceptual holistic features of the handwritten phrase image. The system is used to verify handwritten street names, automatically extracted from live US mail, against the recognition results of analytical classifiers. The system rejects errors with 98% accuracy at the 30% accept level, while consuming approximately 20 ms per image on average on a 150 MHz SPARC 10.
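A minimal sketch of the general idea of holistic verification (not the HOVER feature set): compare cheap holistic features measured on the phrase image with features predicted from the hypothesized spelling, and reject mismatches. The feature choices and tolerance below are assumptions.

```python
# Sketch: verify a recognition hypothesis against holistic image features.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")

def predicted_features(word):
    """Holistic features predicted from a hypothesized spelling."""
    return {
        "length": len(word),
        "ascenders": sum(c in ASCENDERS for c in word),
        "descenders": sum(c in DESCENDERS for c in word),
    }

def verify(image_features, hypothesis, tol=1):
    """Accept the hypothesis only if every feature is within tol of prediction."""
    pred = predicted_features(hypothesis)
    return all(abs(image_features[k] - pred[k]) <= tol for k in pred)

# Features measured from a phrase image (hand-picked example values).
measured = {"length": 8, "ascenders": 2, "descenders": 1}
print(verify(measured, "lakeview"))   # plausible hypothesis -> True
print(verify(measured, "main"))       # wrong word length   -> False
```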
{"title":"The HOVER system for rapid holistic verification of off-line handwritten phrases","authors":"S. Madhvanath, Evelyn Kleinberg, V. Govindaraju, S. Srihari","doi":"10.1109/ICDAR.1997.620633","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620633","url":null,"abstract":"The authors describe ongoing research on a system for rapid verification of unconstrained off-line handwritten phrases using perceptual holistic features of the handwritten phrase image. The system is used to verify handwritten street names automatically extracted from live US mail against recognition results of analytical classifiers. The system rejects errors with 98% accuracy at the 30% accept level, while consuming approximately 20 msec per image on the average on a 150 MHz SPARC 10.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127535138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Markov model order optimization for text recognition
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620560
C. Olivier, F. Jouzel, M. Avila
Markov models are commonly used for printed or handwritten word recognition. The order k is a very important parameter of these models. The aim of this paper is to use model-selection criteria to estimate the order of a Markov model. Akaike (1973) suggested the AIC criterion for estimating the order k of a parameterized statistical model, penalizing the likelihood function with a term in k. Yet selection according to this criterion asymptotically leads to a strict overestimation of the order. That is why we suggest the use of other, consistent criteria in the Markovian case: the Bayesian information criterion and the Hannan-Quinn criterion (BIC and ρ, respectively). The performance of the criteria is analysed on simulated data and on a real case: a handwritten word description. We discuss the limits of these methods in relation to the number of states in the model.
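The criteria themselves are standard for an order-k chain over an m-symbol alphabet: with maximized log-likelihood log L̂, n observed transitions, and p = m^k(m−1) free parameters, AIC = −2 log L̂ + 2p, BIC = −2 log L̂ + p log n, and Hannan-Quinn replaces log n by 2 log log n. The sketch below fits each candidate order to a toy sequence and evaluates all three; these are the textbook forms, and the paper's exact variants may differ.

```python
# Sketch: estimate Markov order by minimizing AIC / BIC / Hannan-Quinn.
from collections import Counter
from math import log

def fit_criteria(seq, k, m):
    """AIC, BIC, HQ for an order-k Markov model fitted to seq (alphabet size m)."""
    ctx = Counter()     # counts of k-length contexts
    trans = Counter()   # counts of (context, next-symbol) pairs
    for i in range(len(seq) - k):
        c = tuple(seq[i:i + k])
        ctx[c] += 1
        trans[(c, seq[i + k])] += 1
    # Maximized log-likelihood under the fitted transition probabilities.
    loglik = sum(n * log(n / ctx[c]) for (c, _), n in trans.items())
    n_obs = len(seq) - k
    p = (m ** k) * (m - 1)            # free parameters of an order-k chain
    aic = -2 * loglik + 2 * p
    bic = -2 * loglik + p * log(n_obs)
    hq = -2 * loglik + 2 * p * log(log(n_obs))
    return aic, bic, hq

seq = "abaababaabaababaabab" * 5      # toy two-symbol sequence
for k in (1, 2, 3):
    print(k, ["%.1f" % v for v in fit_criteria(seq, k, m=2)])
```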
{"title":"Markov model order optimization for text recognition","authors":"C. Olivier, F. Jouzel, M. Avila","doi":"10.1109/ICDAR.1997.620560","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620560","url":null,"abstract":"Markov models are currently used for printed or handwritten word recognition. The order k is a very important parameter of these models. The aim of this paper is to use model selection criteria in order to estimate the order of a Markov model. Akaike (1973) suggested the AIC criterion for the estimation of the order k of a parameterized statistical model, including the term k as penalization of the likelihood function. Yet, selection according to this criterion leads asymptotically to a strict overestimation of the order. That is why we suggest the use of other consistent criteria in a Markovian case: the Bayesian and the Hannan and Quinn information criteria (BIC and /spl rho/, respectively). The performance of the criteria are analysed on simulated data and on a real case: a handwritten word description. We discuss the limit of these methods in relation to the number of states in the model.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127254671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High speed rough classification for handwritten characters using hierarchical learning vector quantization
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619807
Yuji Waizumi, N. Kato, Kazuki Saruta, Y. Nemoto
Today, high character recognition accuracy is attainable with a neural network for problems with a relatively small number of categories. For problems with many categories, such as Chinese characters, however, it is difficult to make the network converge because of the local-minima problem and the large amount of computation required. Studies address this by splitting the neural network into small modules. The effectiveness of combining learning vector quantization (LVQ) and back-propagation (BP) has been reported: LVQ is used for rough classification and BP for fine recognition. However, it is difficult to obtain high rough-classification accuracy with LVQ alone. To deal with this problem, we propose hierarchical learning vector quantization (HLVQ). HLVQ divides the categories in feature space hierarchically during learning, and adjacent feature subspaces overlap each other near their borders. HLVQ achieves both classification speed and accuracy thanks to the hierarchical architecture and the overlapping technique. In an experiment using ETL9B, the largest database of handwritten characters in Japan (3036 categories, 607,200 samples), the effectiveness of HLVQ was verified.
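A minimal sketch of the hierarchical idea (not the paper's HLVQ training procedure): a coarse codebook routes a feature vector to one or more clusters, an overlap margin keeps borderline vectors in several clusters, and a fine nearest-codebook search runs only inside those candidates. The codebooks here are random placeholders and the margin is an assumption.

```python
# Sketch: two-level rough classification with overlapping coarse clusters.
import numpy as np

rng = np.random.default_rng(0)

# Level 1: coarse cluster centroids; level 2: per-cluster fine codebooks.
coarse = rng.normal(size=(4, 16))                         # 4 clusters, 16-d features
fine = {c: rng.normal(size=(10, 16)) for c in range(4)}   # 10 codes per cluster

OVERLAP = 0.1  # also search clusters within 10% of the best coarse distance

def classify(x):
    d = np.linalg.norm(coarse - x, axis=1)
    best = d.min()
    candidates = np.where(d <= best * (1 + OVERLAP))[0]   # overlapping borders
    # Fine nearest-codebook search only inside the candidate clusters.
    winner = min(((c, np.linalg.norm(fine[c] - x, axis=1).min())
                  for c in candidates), key=lambda t: t[1])
    return winner[0]

print(classify(rng.normal(size=16)))
```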
{"title":"High speed rough classification for handwritten characters using hierarchical learning vector quantization","authors":"Yuji Waizumi, N. Kato, Kazuki Saruta, Y. Nemoto","doi":"10.1109/ICDAR.1997.619807","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619807","url":null,"abstract":"Today, high accuracy of character recognition is attainable using a neural network for problems with a relatively small number of categories. But for large categories, like Chinese characters, it is difficult to reach the neural network convergence because of the \"local minima problem\" and a large number of calculations. Studies are being done to solve the problem by splitting the neural network into some small modules. The effectiveness of the combination of learning vector quantization (LVQ) and back propagation (BP) has been reported. LVQ is used for rough classification and BP is used for fine recognition. It is difficult to obtain high accuracy for rough classification by LVQ itself. To deal with this problem, we propose hierarchical learning vector quantization (HLVQ). HLVQ divides categories in feature space hierarchically in the learning procedure. The adjacent feature spaces overlap each other near the borders. HLVQ possesses both classification speed and accuracy due to the hierarchical architecture and the overlapping technique. In the experiment using ETL9B, the largest database of handwritten characters in Japan, (includes 3036 categories, 607,200 samples), the effectiveness of HLVQ was verified.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130586946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic separation of words in multi-lingual multi-script Indian documents
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620567
U. Pal, B. B. Chaudhuri
In a multi-lingual country like India, a document may contain more than one script. For such a document it is necessary to separate the different scripts before feeding them to script-specific OCR systems. In this paper an automatic word segmentation approach is described which can separate the Roman, Bangla and Devnagari scripts present in a single document. The approach has a tree structure: at first, Roman-script words are separated using the 'headline' feature, the headline being common to Bangla and Devnagari but absent in Roman. Next, Bangla and Devnagari words are separated using finer characteristics of their character sets, while recognition of individual characters is avoided. At present, the system has an overall accuracy of 96.09%.
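For illustration, here is how a headline test might look as a projection-profile check (the abstract does not give the paper's actual decision rule): a Bangla/Devnagari word has a near-fully-inked row near its top, while a Roman word does not. The 0.7 density threshold and the upper-half restriction are assumptions.

```python
# Sketch: detect the 'headline' bar of Bangla/Devnagari words via a
# row-wise projection profile on a binary word image.
import numpy as np

def has_headline(word_img, frac=0.7):
    """word_img: 2-D binary array (1 = ink). True if some row in the upper
    half of the word is inked across >= frac of the word's width."""
    rows = word_img.sum(axis=1) / word_img.shape[1]   # row-wise ink density
    upper = rows[: word_img.shape[0] // 2]            # headline sits near the top
    return bool((upper >= frac).max())

# A toy 'word': a solid top row joining two vertical strokes.
img = np.zeros((8, 20), dtype=int)
img[1, :] = 1                      # headline row
img[2:, 3] = img[2:, 15] = 1       # two strokes hanging from it
print(has_headline(img))           # True -> route to the Bangla/Devnagari branch
```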
{"title":"Automatic separation of words in multi-lingual multi-script Indian documents","authors":"U. Pal, B. B. Chaudhuri","doi":"10.1109/ICDAR.1997.620567","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620567","url":null,"abstract":"In a multi-lingual country like India, a document may contain more than one script forms. For such a document it is necessary to separate different script forms before feeding them to OCRs of individual script. In this paper an automatic word segmentation approach is described which can separate Roman, Bangla and Devnagari scripts present in a single document. The approach has a tree structure where at first Roman script words are separated using the 'headline' feature. The headline is common in Bangla and Devnagari but absent in Roman. Next, Bangla and Devnagari words are separated using some finer characteristics of the character set although recognition of individual character is avoided. At present, the system has an overall accuracy of 96.09%.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132057149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recognition of printed Arabic text using neural networks
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620576
A. Amin, W. Mansoor
The main theme of the paper is the automatic recognition of printed Arabic text using artificial neural networks in addition to conventional techniques. This approach has a number of advantages: it combines rule-based (structural) and classification tests; feature extraction is inexpensive; and execution time is independent of character font and size. The technique can be divided into three major steps. First, preprocessing, in which the original image, captured with a 300 dpi scanner, is transformed into a binary image and connected components are formed. Second, global features of the input Arabic word are extracted, such as the number of subwords, the number of peaks within each subword, and the number and positions of the complementary characters. Finally, an artificial neural network is used for character classification. The algorithm was implemented in C on an MS-DOS microcomputer.
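As a sketch of the first global feature, under the assumption that subwords correspond to large connected components (with dots and diacritics filtered out by a size threshold), a plain flood-fill component counter:

```python
# Sketch: count subwords of a binarized word as 8-connected components,
# ignoring small components (dots/diacritics). min_size is an assumption.
def count_subwords(img, min_size=4):
    """img: list of lists, 1 = ink. Count 8-connected components >= min_size."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    n = 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                stack, size = [(y, x)], 0     # flood-fill this component
                seen[y][x] = True
                while stack:
                    cy, cx = stack.pop()
                    size += 1
                    for dy in (-1, 0, 1):
                        for dx in (-1, 0, 1):
                            ny, nx = cy + dy, cx + dx
                            if (0 <= ny < h and 0 <= nx < w
                                    and img[ny][nx] and not seen[ny][nx]):
                                seen[ny][nx] = True
                                stack.append((ny, nx))
                if size >= min_size:
                    n += 1
    return n

word = [
    [0, 1, 1, 0, 0, 0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0, 0, 1, 0, 1, 0],
    [0, 0, 0, 0, 1, 0, 1, 1, 1, 0],   # lone pixel is a 'dot', filtered out
]
print(count_subwords(word))   # -> 2
```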
{"title":"Recognition of printed Arabic text using neural networks","authors":"A. Amin, W. Mansoor","doi":"10.1109/ICDAR.1997.620576","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620576","url":null,"abstract":"The main theme of the paper is the automatic recognition of Arabic printed text using artificial neural networks in addition to conventional techniques. This approach has a number of advantages: it combines rule based (structural) and classification tests; feature extraction is inexpensive; and execution time is independent of character font and size. The technique can be divided into three major steps: The first step is preprocessing in which the original image is transformed into a binary image utilizing a 300 dpi scanner and then forming the connected component. Second, global features of the input Arabic word are then extracted such as number of subwords, number of peaks within the subword, number and position of the complementary character, etc. Finally, an artificial neural network is used for character classification. The algorithm was implemented on a powerful MS-DOS microcomputer and written in C.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132200786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Constraints on handwriting Korean characters to improve the machine readability
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619840
Choi Baek Young, S. Bang
Realizing that a practical recognition system for fully unconstrained Korean handwritten characters is still a long way off, we have attempted to find a set of writing constraints that significantly improves machine readability. Based on our observation that the majority of reported misrecognitions are caused by ambiguous characters, we have developed a set of writing constraints that maximally disambiguates those characters. Through experiments, we have confirmed that the recognition rate on handwritten data can be improved significantly by applying the proposed constraints.
{"title":"Constraints on handwriting Korean characters to improve the machine readability","authors":"Choi Baek Young, S. Bang","doi":"10.1109/ICDAR.1997.619840","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619840","url":null,"abstract":"Realizing that the availability of a practical recognition system for Korean handwritten characters without any constraints has a long way to go, we have attempted to find a set of writing constraints which significantly improves the machine-readability. Based on our observation that the majority of the misrecognitions reported are caused by ambiguous characters, we have developed a set of writing constraints which maximally disambiguate those characters. Through experiments, we have confirmed that the recognition rate of those handwritten data could be improved significantly by applying the proposed set of constraints.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126706590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}