Title: How conditional independence assumption affects handwritten character segmentation
Authors: M. Maragoudakis, E. Kavallieratou, N. Fakotakis, G. Kokkinakis
Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition
Pub Date: 2001-09-10
DOI: 10.1109/ICDAR.2001.953792
Abstract: This paper deals with the use of Bayesian belief networks to improve the accuracy and training time of character segmentation for unconstrained handwritten text. Comparative experimental results are evaluated against Naive Bayes classification, which assumes that the parameters are independent, and against two other commonly used methods. The results show that capturing the inferential dependencies of the training data reduces the required training time and training-set size by 55%. Moreover, the achieved accuracy in detecting segment boundaries exceeds 86%, and even limited training data proves sufficient for very satisfactory results.
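The independence assumption the abstract argues against can be made concrete with a minimal sketch (a generic illustration, not the authors' system; the toy binary features and Laplace smoothing are my assumptions). Naive Bayes scores each class as the prior times a product of per-feature likelihoods, which is exactly where dependencies between features are discarded:

```python
from collections import Counter, defaultdict

def train_nb(samples):
    """samples: list of (binary_feature_tuple, label). Returns model parts."""
    labels = Counter(lbl for _, lbl in samples)
    feat_counts = defaultdict(Counter)  # feat_counts[label][i] = times feature i was 1
    for feats, lbl in samples:
        for i, v in enumerate(feats):
            if v:
                feat_counts[lbl][i] += 1
    return labels, feat_counts, len(samples)

def classify_nb(feats, labels, feat_counts, n):
    best, best_p = None, -1.0
    for lbl, cnt in labels.items():
        p = cnt / n  # prior P(c)
        for i, v in enumerate(feats):
            # Laplace-smoothed per-feature likelihood; the product treats
            # the features as conditionally independent given the class.
            p1 = (feat_counts[lbl][i] + 1) / (cnt + 2)
            p *= p1 if v else (1 - p1)
        if p > best_p:
            best, best_p = lbl, p
    return best
```

A Bayesian belief network, as used in the paper, replaces this flat product with per-feature conditionals on the feature's parents in a learned dependency graph.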
Title: Applying the T-Recs table recognition system to the business letter domain
Authors: T. Kieninger, A. Dengel
Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition
Pub Date: 2001-09-10
DOI: 10.1109/ICDAR.2001.953843
Abstract: This paper summarizes the core idea of the T-Recs table recognition system, an integrated system covering block segmentation, table location and a model-free structural analysis of tables. T-Recs works on the output of commercial OCR systems that provide the word bounding-box geometry together with the text itself (e.g. Xerox ScanWorX). While T-Recs performs well on a number of document categories, business letters remained a challenging domain because the T-Recs location heuristics are misled by headers or footers, resulting in low recognition precision. Business letters such as invoices are a very interesting domain for industrial applications, owing to the large number of documents to be analyzed and the importance of the data carried within their tables. Hence, we developed a more restrictive approach, implemented in the T-Recs++ prototype. This paper describes the ideas behind T-Recs++ location and also proposes a quality evaluation measure that reflects the bottom-up strategy of both T-Recs and T-Recs++. Finally, some results comparing the two systems on a collection of business letters are given.
Title: Substroke approach to HMM-based on-line Kanji handwriting recognition
Authors: M. Nakai, N. Akira, H. Shimodaira, S. Sagayama
Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition
Pub Date: 2001-09-10
DOI: 10.1109/ICDAR.2001.953838
Abstract: A new method is proposed for online handwriting recognition of Kanji characters. The method employs substroke HMMs as the minimum units constituting Japanese Kanji characters and utilizes the direction of pen motion. The main motivation is to fully exploit continuous speech recognition algorithms by relating sentences to Kanji characters, phonemes to substrokes, and grammar to Kanji structure. The proposed system consists of input feature analysis, substroke HMMs, a character structure dictionary and a decoder. The approach has the following advantages over conventional methods that employ whole-character HMMs: 1) a much smaller memory requirement for the dictionary and models; 2) fast recognition through an efficient substroke network search; 3) the ability to recognize characters not included in the training data, provided they are defined as substroke sequences in the dictionary; 4) the ability to recognize characters written with various stroke orders, via multiple definitions per character in the dictionary; and 5) easy adaptation of the HMMs to a user from a few sample characters.
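The dictionary idea behind advantage 3) — recognising characters absent from the training data once their substroke "spelling" is defined — can be sketched roughly as follows. The substroke labels and entries below are hypothetical simplifications for illustration, not the paper's actual substroke inventory:

```python
# Each Kanji is spelled as a sequence of substroke labels (here, coarse pen
# directions). Only the substroke models need training, so a new character
# becomes recognisable the moment its spelling is added to the dictionary.
# Entries are illustrative, not from the paper.
SUBSTROKE_DICT = {
    "一": ["right"],             # single horizontal stroke
    "二": ["right", "right"],
    "十": ["right", "down"],
}

def decode(substrokes, dictionary):
    """Return all characters whose substroke spelling matches exactly."""
    return [ch for ch, seq in dictionary.items() if seq == substrokes]
```

In the real system the decoder searches a substroke network with HMM scores rather than matching exact label sequences, but the dictionary lookup plays the same role as a pronunciation lexicon in speech recognition.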
Title: Measuring HMM similarity with the Bayes probability of error and its application to online handwriting recognition
Authors: Claus Bahlmann, H. Burkhardt
Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition
Pub Date: 2001-09-10
DOI: 10.1109/ICDAR.2001.953822
Abstract: We propose a novel similarity measure for hidden Markov models (HMMs). The measure calculates the Bayes probability of error for HMM state correspondences and propagates it along the Viterbi path, in a manner similar to HMM Viterbi scoring. It can be applied as a tool to interpret misclassifications, as a stop criterion in iterative HMM training, or as a distance measure for HMM clustering. The similarity measure is evaluated in the context of online handwriting recognition on lower-case character models trained from the UNIPEN database. We compare the similarities with experimental classifications; the results show that similar and misclassified class pairs are highly correlated. The measure is not limited to handwriting recognition and can be used in other applications that rely on HMM-based methods.
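For a single pair of corresponding states, the Bayes probability of error the abstract mentions can be sketched as below. This shows only the per-state-pair term under my assumption of discrete emission distributions and equal state priors; the paper's measure propagates such terms along the Viterbi path:

```python
def state_bayes_error(p, q):
    """Bayes probability of error when deciding between two HMM states with
    equal priors and discrete emission distributions p and q (same alphabet).
    The optimal decision picks the larger likelihood at each symbol, so the
    residual error mass at symbol i is min(p[i], q[i]) / 2. Identical
    distributions give 0.5 (indistinguishable states); disjoint supports
    give 0.0 (perfectly separable states)."""
    return 0.5 * sum(min(pi, qi) for pi, qi in zip(p, q))
```

The value thus ranges from 0.0 (maximally dissimilar) to 0.5 (identical), which is what makes it usable as a bounded state-level similarity.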
Title: Character pre-classification based on fuzzy typographical analysis
Authors: Lu Da, Pu Wei, B. McCane
Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition
Pub Date: 2001-09-10
DOI: 10.1109/ICDAR.2001.953758
Abstract: This paper presents a new fuzzy-logic approach to character pre-classification. It provides a precise baseline detection algorithm with tolerance analysis, derived by analyzing the typographical structure of textual blocks; the other virtual reference lines are extracted using clustering techniques. To ensure correct pre-classification, the fuzzy-logic approach assigns ambiguous classes a membership in each typographical category. The results show that typographical categorization improves the character recognition rate, and that the fuzzy typographical analysis correctly pre-classifies characters while processing more than 10000 characters per second.
Title: An improved learning scheme for the moving window classifier
Authors: Sanaul Hoque, M. Fairhurst
Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition
Pub Date: 2001-09-10
DOI: 10.1109/ICDAR.2001.953861
Abstract: The moving window classifier (MWC) is a simple and efficient classifier structure. Although it has shown promising performance in a variety of tasks such as face recognition, it is most commonly applied as a tool in text recognition. Various measures have previously been proposed to improve MWC classification speed and to reduce its memory requirements. This paper introduces techniques for improving MWC classification accuracy without sacrificing any of the gains previously achieved. These performance enhancement schemes are readily applicable to a range of related classifiers and hence provide a generalized method for enhancement across a variety of tasks.
Title: AIDAS: incremental logical structure discovery in PDF documents
Authors: A. Anjewierden
Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition
Pub Date: 2001-09-10
DOI: 10.1109/ICDAR.2001.953816
Abstract: AIDAS is part of a research project in which the aim is to turn technical manuals into a database of indexed training material. We describe the approach AIDAS uses to extract the logical document structure from PDF documents. The approach is based on the idea that the layout structure contains cues about the logical structure and that the logical structure can be discovered incrementally.
Title: Handwritten country name identification using vector quantisation and hidden Markov model
Authors: G. Leedham, W. Tan, Weng Lee Yap
Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition
Pub Date: 2001-09-10
DOI: 10.1109/ICDAR.2001.953877
Abstract: This paper is a study of keyword recognition using vector quantisation and a hidden Markov model, with the aim of identifying a word holistically. The study considers the problem of identifying a handwritten country name from the 189 different country names registered with the Universal Postal Union. The method divides the words in the last line of the address image into 16×16 pixel blocks, which are fed into a vector quantiser; the VQ outputs are classified using an HMM. Some presorting is carried out based on the letter-length of the word. On a set of 415 handwritten country names the method is 85.3% correct, with the majority of errors caused by mis-estimated letter-lengths and by VQ output distorted by sloping and slanted words/letters.
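The block-extraction front end can be sketched roughly as follows. This is a generic illustration, not the authors' code: the zero-padding at image edges and the Hamming-distance codeword match on binary blocks are my assumptions (VQ codebooks are often matched by Euclidean distance instead):

```python
def blocks_16x16(img):
    """Split a binary image (list of equal-length rows of 0/1) into 16x16
    blocks in raster order, zero-padding the right and bottom edges."""
    h, w = len(img), len(img[0])
    out = []
    for by in range(0, h, 16):
        for bx in range(0, w, 16):
            block = [[img[y][x] if y < h and x < w else 0
                      for x in range(bx, bx + 16)]
                     for y in range(by, by + 16)]
            out.append(block)
    return out

def quantise(block, codebook):
    """Index of the nearest codeword by Hamming distance over the 256 pixels."""
    def dist(a, b):
        return sum(av != bv for ra, rb in zip(a, b) for av, bv in zip(ra, rb))
    return min(range(len(codebook)), key=lambda i: dist(block, codebook[i]))
```

The resulting sequence of codeword indices (one per block) is the discrete observation sequence the HMM classifies.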
Title: Web sites thematic classification using hidden Markov models
Authors: Lyonel Serradura, M. Slimane, N. Vincent
Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition
Pub Date: 2001-09-10
DOI: 10.1109/ICDAR.2001.953955
Abstract: More and more information is available on the Internet, and tools are needed to help extract the right piece of information. We have developed a classification algorithm that tackles this issue for French: it distinguishes web pages by classifying their text content into themes. Hidden Markov models (HMMs) are used to build this method, named STCoL (Supervised Thematic Corpus Learning). Once themes are modeled with HMMs, STCoL is able to classify documents from different sources. The method is not only efficient but also robust.
Title: Newspaper page decomposition using a split and merge approach
Authors: K. Hadjar, O. Hitz, R. Ingold
Published in: Proceedings of Sixth International Conference on Document Analysis and Recognition
Pub Date: 2001-09-10
DOI: 10.1109/ICDAR.2001.953972
Abstract: Indexing large newspaper archives requires automatic page decomposition algorithms with high accuracy. In this paper, we present our approach to an automatic page decomposition algorithm developed for the First International Newspaper Segmentation Contest. Our approach decomposes the newspaper image into image regions, horizontal and vertical lines, text regions and title areas. Experimental results are obtained from the data set of the contest.