Evaluating OCR and non-OCR text representations for learning document classifiers
Markus Junker, R. Hoch
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620671
In the literature, many feature types and learning algorithms have been proposed for document classification, but an extensive and systematic evaluation of the various approaches has not yet been carried out. In order to investigate different text representations for document classification, we have developed a tool which transforms documents into feature-value representations suitable for standard learning algorithms. In this paper, we investigate seven document representations for German texts based on n-grams and single words. We compare their effectiveness in classifying OCR texts and the corresponding correct ASCII texts in two domains: business letters and abstracts of technical reports. Our results indicate that n-grams are an attractive technique that compares well even with techniques relying on morphological analysis. This holds for OCR texts as well as for correct ASCII texts.
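The abstract includes no code; purely as an illustration of the kind of representation under study, here is a minimal sketch of character n-gram feature extraction. The names, the n-gram length, and the count-valued features are assumptions of this sketch, not the paper's seven representations:

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Extract overlapping character n-grams from a text."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def to_feature_values(text, vocabulary, n=3):
    """Map a document to sparse feature-value pairs over a fixed n-gram vocabulary."""
    counts = Counter(char_ngrams(text, n))
    return {g: counts[g] for g in vocabulary if counts[g] > 0}

# Usage: build a vocabulary from a corpus, then vectorize each document.
corpus = ["Sehr geehrte Damen und Herren", "Betreff: Rechnung Nr. 4711"]
vocabulary = sorted({g for doc in corpus for g in char_ngrams(doc)})
print(to_feature_values(corpus[0], vocabulary))
```

Such a representation needs no morphological analysis and degrades gracefully when OCR corrupts individual characters, which is presumably part of why it is attractive for OCR text.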
{"title":"Evaluating OCR and non-OCR text representations for learning document classifiers","authors":"Markus Junker, R. Hoch","doi":"10.1109/ICDAR.1997.620671","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620671","url":null,"abstract":"In the literature, many feature types and learning algorithms have been proposed for document classification. However, an extensive and systematic evaluation of the various approaches has not been done yet. In order to investigate different text representations for document classification, we have developed a tool which transforms documents into feature-value representations that are suitable for standard learning algorithms. In this paper, we investigate seven document representations for German texts based on n-grams and single words. We compare their effectiveness in classifying OCR texts and the corresponding correct ASCII texts in two domains: business letters and abstracts of technical reports. Our results indicate that the use of n-grams is an attractive technique which can even compare to techniques relying on a morphological analysis. This holds for OCR texts as well as for correct ASCII texts.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"14 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115071972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An evolutionary neuro-fuzzy approach to recognize on-line Arabic handwriting
A. Alimi
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619875
The author describes a system that recognizes on-line Arabic cursive handwriting. In this system, a genetic algorithm is used to select the best combination of characters recognized by a fuzzy neural network. The handwritten words used in this system are modelled by a theory of movement generation. Based on this motor theory, the features extracted from each character are the neuro-physiological and biomechanical parameters of the equation describing the curvilinear velocity of the script. The proposed evolutionary approach permits the recognition of cursive handwriting with a segmentation procedure that allows for overlapped strokes carrying neuro-physiological meaning.
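The abstract gives no algorithmic detail, so the following toy sketch only illustrates the selection step in the abstract's own terms: a genetic algorithm picks one candidate character per segment so as to maximize a combined recognition score. The candidate table stands in for the fuzzy neural network's outputs and is entirely hypothetical:

```python
import random

# Hypothetical per-segment character candidates with recognizer confidences;
# in the paper these scores would come from the fuzzy neural network.
candidates = [
    [("a", 0.7), ("d", 0.6)],
    [("l", 0.8), ("i", 0.5)],
    [("m", 0.9), ("w", 0.3)],
]

def fitness(genome):
    """Total confidence of the chosen candidate in each segment."""
    return sum(candidates[i][g][1] for i, g in enumerate(genome))

def evolve(pop_size=20, generations=50, mutation_rate=0.1):
    pop = [[random.randrange(len(c)) for c in candidates] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                  # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = random.sample(parents, 2)
            cut = random.randrange(1, len(candidates))  # one-point crossover
            child = p1[:cut] + p2[cut:]
            if random.random() < mutation_rate:         # point mutation
                k = random.randrange(len(candidates))
                child[k] = random.randrange(len(candidates[k]))
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return "".join(candidates[i][g][0] for i, g in enumerate(best))

print(evolve())  # -> "alm" for the stub scores above
```

A real system would presumably also fold lexical validity into the fitness function rather than score characters independently.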
{"title":"An evolutionary neuro-fuzzy approach to recognize on-line Arabic handwriting","authors":"A. Alimi","doi":"10.1109/ICDAR.1997.619875","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619875","url":null,"abstract":"The author describes a system that recognizes on-line Arabic cursive handwriting. In this system, a genetic algorithm is used to select the best combination of characters recognized by a fuzzy neural network. The handwritten words used in this system are modelled by a theory of movement generation. Based on this motor theory, the features extracted from each character are the neuro-physiological and biomechanical parameters of the equation describing the curvilinear velocity of the script. The evolutionary approach proposed permits the recognition of cursive handwriting with a segmentation procedure allowing overlapped strokes having neuro-physiological meaning.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114658407","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Layout and language: preliminary investigations in recognizing the structure of tables
Matthew F. Hurst, Shona Douglas
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620668
Describes a prototype system for assigning table cells to their proper place in the logical structure of the table, based on a simple model of table structure combined with a number of measures of cohesion between cells. A framework is presented for examining the effect of particular variables on the performance of the system, and preliminary results are presented showing the effect of cohesion measures based on the simplest domain-independent analyses, with the aim of allowing future comparison with more knowledge-intensive analyses based on natural language processing. These baseline results suggest that very simple string-based cohesion measures are not sufficient to support the extraction of tuples as we require. Future work will pursue more adequate approximations to a notional subtype/supertype definition of the relationship between value cells and label cells.
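The paper's cohesion measures are not specified in the abstract; as a hedged illustration of what a "very simple string-based" measure might look like, the sketch below compares cells by the coarse character-class shape of their contents. All names and the bigram/Jaccard choices are invented for the example:

```python
def signature(cell):
    """Map a cell string to a coarse character-class signature."""
    classes = []
    for ch in cell:
        if ch.isdigit():
            classes.append("D")
        elif ch.isalpha():
            classes.append("A")
        elif not ch.isspace():
            classes.append("P")   # punctuation and symbols
    return "".join(classes)

def cohesion(cell_a, cell_b):
    """Jaccard overlap of class bigrams: 1.0 = same shape, 0.0 = disjoint."""
    def bigrams(s):
        return {s[i:i + 2] for i in range(len(s) - 1)}
    a, b = bigrams(signature(cell_a)), bigrams(signature(cell_b))
    return len(a & b) / len(a | b) if a | b else 1.0

print(cohesion("12.50", "9.99"))   # high: both look like decimal numbers
print(cohesion("12.50", "Total"))  # low: numeric vs. alphabetic shape
```

Measures of this kind see only surface shape, which is consistent with the baseline finding that they cannot by themselves support reliable tuple extraction.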
{"title":"Layout and language: preliminary investigations in recognizing the structure of tables","authors":"Matthew F. Hurst, Shona Douglas","doi":"10.1109/ICDAR.1997.620668","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620668","url":null,"abstract":"Describes a prototype system for assigning table cells to their proper place in the logical structure of the table, based on a simple model of table structure combined with a number of measures of cohesion between cells. A framework is presented for examining the effect of particular variables on the performance of the system, and preliminary results are presented showing the effect of cohesion measures based on the simplest domain-independent analyses, with the aim allowing future comparison with more knowledge-intensive analyses based on natural language processing. These baseline results suggest that very simple string-based cohesion measures are not sufficient to support the extraction of tuples as we require. Future work will pursue the aim of more adequate approximations to a notional subtype/supertype definition of the relationship between value cells and label cells.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128239473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
High accuracy handwritten Chinese character recognition by improved feature matching method
Cheng-Lin Liu, In-Jung Kim, J. H. Kim
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620666
Proposes some strategies to improve the recognition performance of a feature matching method for handwritten Chinese character recognition (HCCR). Favorable modifications are made at every stage of the recognition process. In pre-processing, we devised a modified nonlinear normalization algorithm and a connectivity-preserving smoothing algorithm. For feature extraction, an efficient directional decomposition algorithm and a systematic approach to designing the blurring mask are presented. Finally, a modified LVQ3 algorithm is applied to optimize the reference vectors for classification. The combined effect of these strategies significantly improves recognition performance. Recognition results on the large-vocabulary databases ETL8B2 and ETL9B are promising.
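The abstract does not spell out the authors' modification to LVQ3, so the sketch below shows only the standard LVQ3 update (Kohonen) that the classifier stage builds on; parameter values and names are arbitrary choices of this sketch:

```python
import numpy as np

def lvq3_step(x, y, protos, labels, alpha=0.05, eps=0.3, window=0.2):
    """One LVQ3 update on the two prototypes nearest to sample x (class y)."""
    d = np.linalg.norm(protos - x, axis=1)
    i, j = np.argsort(d)[:2]                   # two nearest prototypes, d[i] <= d[j]
    if d[j] == 0 or d[i] / d[j] <= (1 - window) / (1 + window):
        return                                 # x is not near the decision border
    ci, cj = labels[i] == y, labels[j] == y
    if ci and cj:                              # both correct: gentle attraction
        for k in (i, j):
            protos[k] += eps * alpha * (x - protos[k])
    elif ci != cj:                             # exactly one correct
        c, w = (i, j) if ci else (j, i)
        protos[c] += alpha * (x - protos[c])   # pull the correct prototype in
        protos[w] -= alpha * (x - protos[w])   # push the wrong prototype away

# Usage: prototypes as float rows, integer class labels.
protos = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
labels = np.array([0, 1, 1])
lvq3_step(np.array([0.0, 0.5]), 0, protos, labels)
```

Iterating such steps over the training set sharpens the reference vectors near class boundaries, which is where feature-matching HCCR errors concentrate.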
{"title":"High accuracy handwritten Chinese character recognition by improved feature matching method","authors":"Cheng-Lin Liu, In-Jung Kim, J. H. Kim","doi":"10.1109/ICDAR.1997.620666","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620666","url":null,"abstract":"Proposes some strategies to improve the recognition performance of a feature matching method for handwritten Chinese character recognition (HCCR). Favorable modifications are given to all stages throughout the recognition. In pre-processing, we devised a modified nonlinear normalization algorithm and a connectivity-preserving smoothing algorithm. For feature extraction, an efficient directional decomposition algorithm and a systematic approach to design a blurring mask are presented. Finally, a modified LVQ3 algorithm is applied to optimize the reference vectors for classification. The integrated effect of these strategies significantly improves the recognition performance. Recognition results on the large-vocabulary databases ETL8B2 and ETL9B are promising.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128627510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Revealing the hidden Markov recognizer
Claus Aufmuth
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620563
The article describes a tool for visualizing hidden Markov recognizers (HMRs) which allows the developer to get a detailed view of the recognition process. Improvements to a hidden Markov recognizer, identified with the aid of such a processing and visualization tool, are suggested.
{"title":"Revealing the hidden Markov recognizer","authors":"Claus Aufmuth","doi":"10.1109/ICDAR.1997.620563","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620563","url":null,"abstract":"The article describes a tool for visualizing hidden Markov recognizers (HMR) which allows the developer to get a detailed view of the recognition process. Improvements are suggested for a hidden Markov recognizer using an appropriate processing and visualization tool.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"216 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134187712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Moby Dick meets GEOCR: lexical considerations in word recognition
A. Spitz
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619845
The author has previously (Proc. Int. Conf. on Doc. Anal. and Recognition, Montreal, pp. 723-728, 1995) described a high-speed, lexically driven OCR system called GEOCR (Good Enough Optical Character Recognition). This paper expands on that work by examining the effects of lexical content, structure and processing on the performance of GEOCR as a word recognition engine, using the recognition of a particular text, Moby Dick, as a case study. Word recognition performance is shown to be enhanced by the application of an appropriate lexicon. Recognition speed is essentially independent of the details of lexical content, provided that the overlap between the words occurring in the document and the lexicon is high. Word recognition accuracy depends on both this overlap and the specificity of the lexicon.
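The speed and accuracy claims both turn on how well the lexicon covers the document's word occurrences; a trivial sketch of that coverage statistic (the tokenization and names are illustrative, not the paper's):

```python
import re

def lexicon_coverage(document_text, lexicon):
    """Fraction of word occurrences in the document found in the lexicon."""
    words = re.findall(r"[A-Za-z']+", document_text.lower())
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in lexicon)
    return hits / len(words)

lexicon = {"call", "me", "ishmael", "some", "years", "ago"}
print(lexicon_coverage("Call me Ishmael. Some years ago...", lexicon))  # 1.0
```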
{"title":"Moby Dick meets GEOCR: lexical considerations in word recognition","authors":"A. Spitz","doi":"10.1109/ICDAR.1997.619845","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619845","url":null,"abstract":"The author has previously (Proc. Int. Conf. on Doc. Anal. and Recognition, Montreal, pp. 723-728, 1995) described a high-speed, lexically driven OCR called GEOCR (Good Enough Optical Character Recognition). This paper expands on that work by describing the effects of lexical content, structure and processing on the performance of GEOCR as a word recognition engine, describing the recognition of a particular text, Moby Dick. Word recognition performance is shown to be enhanced by the application of an appropriate lexicon. Recognition speed is essentially independent of the details of lexical content, provided that the intersection of the occurrences of words in the document and the lexicon is high. Word recognition accuracy is dependent on both the intersection and specificity of the lexicon.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"19 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134404616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scalable image coding by spline approximation for a gray-scale image
R. Haruki, T. Horiuchi
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619879
The proposed method represents a gray-scale image by parametric spline functions for edge components and by two-variable spline functions for low-frequency components. It can reconstruct the image while preserving its quality under basic shape transformations. If a binary image is input as a special case, the proposed method can generate a scalable vector font automatically. The performance of the proposed method is verified by experiments.
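A minimal sketch of the parametric-spline idea for edge components, using SciPy's generic spline fitting rather than the authors' coder; the contour is synthetic and the smoothing factor is an arbitrary choice:

```python
import numpy as np
from scipy.interpolate import splprep, splev

# A synthetic closed edge contour standing in for one extracted from an image.
t = np.linspace(0, 2 * np.pi, 40)
contour = [100 + 50 * np.cos(t), 100 + 30 * np.sin(t)]

# Fit a parametric spline; the smoothing factor s trades fidelity for compactness.
tck, u = splprep(contour, s=5.0, per=True)

# The representation is resolution-independent: sample at any density to rescale.
xs, ys = splev(np.linspace(0, 1, 400), tck)
print(len(tck[1][0]), "spline coefficients encode", len(t), "contour points")
```

Because the coded curve can be evaluated at any resolution, the binary-image special case yields a scalable vector font essentially for free.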
{"title":"Scalable image coding by spline approximation for a gray-scale image","authors":"R. Haruki, T. Horiuchi","doi":"10.1109/ICDAR.1997.619879","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619879","url":null,"abstract":"The proposed method expresses a gray-scale image by parametric spline functions for edge components and by two-variable spline functions for low frequency components. It can reconstruct the image keeping its quality for the basic shape transformation. If a binary image is input as a special case, the proposed method can make a scalable vector font automatically. The performance of the proposed method is verified by some experiments.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134016161","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Recognition of facsimile documents using a database of robust features
G. Raza, A. Hennig, N. Sherkat, R. Whitrow
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619886
A method for the recognition of poor-quality documents containing touching characters is presented. The method is based on the extraction of independent and robust features from each object of a sample word, where an object consists of a single letter or of several touching ones. By thus avoiding letter segmentation, the method eliminates errors frequently introduced by segmentation-based approaches. Features are attributed with their position and extent in order to facilitate discrimination between different classes of objects. A method for the automatic construction of a comprehensive database is presented: from a given dictionary, every possible letter combination is obtained and images of the artificially touching letters are created. These images are subjected to noise and their features extracted. For recognition, alternatives for each object are found based on the database. Object alternatives are then combined into valid word alternatives using lexicon lookup. The developed method has been observed to be effective for the recognition of poor-quality documents.
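A rough sketch of the database-construction step as the abstract describes it: render every letter pair occurring in a dictionary with negative spacing so the glyphs touch, then degrade the images with noise. Pillow is assumed, the font path is a placeholder, and the feature extraction itself is omitted:

```python
import random
from PIL import Image, ImageDraw, ImageFont

FONT = ImageFont.truetype("/path/to/serif.ttf", 32)   # placeholder font path

def render_touching(pair, overlap=4, noise=0.02):
    """Render two letters with negative kerning, then add salt-and-pepper noise."""
    img = Image.new("L", (64, 48), 255)
    draw = ImageDraw.Draw(img)
    draw.text((4, 4), pair[0], font=FONT, fill=0)
    w = draw.textlength(pair[0], font=FONT)
    draw.text((4 + w - overlap, 4), pair[1], font=FONT, fill=0)  # glyphs touch
    px = img.load()
    for _ in range(int(noise * img.width * img.height)):         # degrade image
        px[random.randrange(img.width), random.randrange(img.height)] = \
            random.choice((0, 255))
    return img

# Every letter pair occurring in a (toy) dictionary:
dictionary = ["fine", "find", "final"]
pairs = {word[i:i + 2] for word in dictionary for i in range(len(word) - 1)}
samples = {p: render_touching(p) for p in pairs}
```

Training on synthetically touching pairs is what lets recognition treat a merged blob as a single object instead of forcing an unreliable segmentation.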
{"title":"Recognition of facsimile documents using a database of robust features","authors":"G. Raza, A. Hennig, N. Sherkat, R. Whitrow","doi":"10.1109/ICDAR.1997.619886","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619886","url":null,"abstract":"A method for the recognition of poor quality documents containing touching characters is presented. The method is based on extraction of independent and robust features of each object of a sample word, where objects consist of single letters or of several touching ones. Thus avoiding letter segmentation the method eliminates errors frequently introduced in segmentation based approaches. Features are attributed by their position and extent in order to facilitate discrimination between different classes of objects. A method for automatic construction of a comprehensive database is presented. From a given dictionary every possible letter combination is obtained and the images of the artificially touching letters created. These images are subjected to noise and their features extracted. For recognition, alternatives for each object are found based on the database. Object alternatives are then combined into valid word alternatives using lexicon lookup. It has been observed that the developed method is effective for the recognition of poor quality documents.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132996863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Construction of retrieval system for pictorial book of flora
Yasuhiko Watanabe, M. Nagao
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620653
Pattern information and natural language information used together can complement and reinforce each other, enabling more effective communication than either medium alone. A good example is a pictorial book of flora (PBF), in which readable explanations combine texts and pictures. However, it is difficult to retrieve explanation text and pictures from the PBF when we do not know the names of flowers. To solve this problem, we propose a retrieval method for the PBF using the color feature of each flower and fruit, and construct an experimental retrieval system for the PBF. In obtaining the color feature of each flower and fruit, we analysed the PBF pictures and found several problems: PBF pictures contain many kinds of objects, so that in addition to flowers and fruits there are leaves, stems, sky, soil, and sometimes humans; the position, size, and orientation of flowers and fruits vary widely from picture to picture; and each flower and fruit has its own shape, color, and texture, generally different from those of the others. Because of these problems, it is difficult to build a general and precise model for analyzing the PBF pictures in advance. We therefore propose a method for image analysis that uses natural language information. Our method works as follows: first, we analyse the PBF explanation texts to extract the color information for each flower and fruit; then we analyse the PBF pictures using the results of this natural language processing, finally obtaining the color feature of each flower and fruit.
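A toy sketch of the two-step method: extract color terms from the explanation text, then measure how strongly the picture supports each term. The color table, tolerance, and stub image are illustrative assumptions, not the paper's model:

```python
import re
import numpy as np

# Hypothetical mapping from color terms to reference RGB values.
COLOR_TERMS = {"white": (255, 255, 255), "yellow": (255, 220, 0),
               "purple": (128, 0, 160), "red": (200, 30, 40)}

def colors_in_text(text):
    """Extract known color terms from an explanation text."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w in COLOR_TERMS]

def color_support(image_rgb, term, tol=80.0):
    """Fraction of pixels within tol of the term's reference color."""
    ref = np.array(COLOR_TERMS[term], dtype=float)
    dist = np.linalg.norm(image_rgb.astype(float) - ref, axis=-1)
    return float((dist < tol).mean())

text = "The flowers are white or pale purple, blooming in May."
img = np.full((100, 100, 3), (250, 250, 245), dtype=np.uint8)  # stub picture
for term in colors_in_text(text):
    print(term, round(color_support(img, term), 2))
```

The point of the design is that the text narrows the image analysis to a few expected colors, sidestepping the need for a general model of the pictures.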
{"title":"Construction of retrieval system for pictorial book of flora","authors":"Yasuhiko Watanabe, M. Nagao","doi":"10.1109/ICDAR.1997.620653","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620653","url":null,"abstract":"Pattern information and natural language information used together can complement and reinforce each other to enable more effective communication than can either medium alone. A good example is a pictorial book of flora (PBF). In the PBF, readable explanations combine texts and pictures. However, it is difficult to retrieve explanation text and pictures from the PBF when we don't know the names of flowers. To solve this problem, we propose a retrieval method for the PBF using the color feature of each flower and fruit, and construct an experimental retrieval system for the PBF. For obtaining the color feature of each flower and fruit, we analysed the PBF pictures and found several problems as follows: Pictures of the PBF contain many kinds of objects. In addition to flowers and fruits, there are leaves, stems, skies, soils, and sometimes humans in the PBF pictures. The position, size, and direction of flowers and fruits vary quite widely in each picture. Each flower and fruit has its unique shape, color, and texture which are commonly different from those of the others. Because of these problems, it is difficult to build the general and precise model for analyzing the PBF pictures in advance. We propose a method for image analysis using natural language information. Our method works as follows. First, we analyse the PBF explanation texts for extracting the color information on each flower and fruit. Then, we analyse the PBF pictures by using the results of the natural language processing, and finally obtain the color feature of each flower and fruit.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133154770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Skew and slant correction for document images using gradient direction
Changming Sun, Deyi Si
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619830
A fast algorithm is presented for skew and slant correction in printed document images. The algorithm employs only gradient information. The skew angle is obtained by searching for a peak in the histogram of the gradient orientation of the input grey-level image, and the skew of the document is corrected by a rotation through this angle. The slant of characters can be detected with the same technique and corrected by a shear operation. A second method for character slant correction, which fits parallelograms to the connected components, is also described. Document images with different contents (tables, figures, and photos) have been tested for skew correction; the algorithm gives accurate results on all test images and is very easy to implement.
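A condensed sketch of the skew-detection idea, assuming SciPy for gradients and rotation; the bin count, the assumption that upright character strokes dominate the gradient histogram, and the sign convention are all choices of this sketch, not details from the paper:

```python
import numpy as np
from scipy import ndimage

def estimate_skew(gray, bins=360):
    """Estimate skew from the peak of the gradient-orientation histogram."""
    g = gray.astype(float)
    gy, gx = ndimage.sobel(g, axis=0), ndimage.sobel(g, axis=1)
    mag = np.hypot(gx, gy)
    # Fold orientations into (-90, 90]; 0 degrees = horizontal gradient,
    # i.e. a vertical edge such as an upright character stroke.
    theta = (np.degrees(np.arctan2(gy, gx)) + 90.0) % 180.0 - 90.0
    hist, edges = np.histogram(theta, bins=bins, range=(-90, 90), weights=mag)
    k = np.argmax(hist)                      # magnitude-weighted peak bin
    return 0.5 * (edges[k] + edges[k + 1])   # deviation from upright = skew

def deskew(gray):
    """Rotate to cancel the estimated skew (flip the sign if your coordinate
    convention differs)."""
    return ndimage.rotate(gray, -estimate_skew(gray), reshape=False, cval=255)
```

Weighting the histogram by gradient magnitude keeps flat background pixels, whose orientations are noise, from washing out the stroke-direction peak.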
{"title":"Skew and slant correction for document images using gradient direction","authors":"Changming Sun, Deyi Si","doi":"10.1109/ICDAR.1997.619830","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619830","url":null,"abstract":"A fast algorithm is presented for skew and slant correction in printed document images. The algorithm employs only the gradient information. The skew angle is obtained by searching for a peak in the histogram of the gradient orientation of the input grey-level image. The skewness of the document is corrected by a rotation at such an angle. The slant of characters can also be detected using the same technique, and can be corrected by a shear operation. A second method for character slant correction by fitting parallelograms to the connected components is also described. Document images with different contents (tables, figures, and photos) have been tested for skew correction and the algorithm gives accurate results on all the test images, and the algorithm is very easy to implement.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"31 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132390563","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}