On-line cursive script recognition using an island-driven search technique
Seung-Ho Lee, Hyunkyu Lee, J. H. Kim
Proceedings of 3rd International Conference on Document Analysis and Recognition
Pub Date: 1995-08-14 · DOI: 10.1109/ICDAR.1995.602043

A new approach for on-line cursive script recognition is presented that combines a letter spotting technique with an island-driven lattice search algorithm. Initially, all plausible letter components within an input pattern are detected using a letter spotting technique based on hidden Markov models. The letter spotting produces a word hypothesis lattice. An island-driven search algorithm then finds the optimal path through the lattice, which corresponds to the most probable word among the dictionary words. The experimental results suggest that the proposed method is effective in recognizing English cursive words: in a word recognition test, an average word accuracy of 85.4% was obtained.
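The idea of searching a word hypothesis lattice can be made concrete with a small sketch. The hypothesis format (start, end, letter, log-score) and the left-to-right dynamic program below are illustrative stand-ins, not the authors' method: a true island-driven search expands outward from high-confidence letter "islands" rather than scanning left to right.

```python
# Score each dictionary word by the best-scoring chain of letter
# hypotheses that spells it and exactly tiles the input interval.
# Hypothesis: (start, end, letter, log_score); higher score = better.

def best_word_score(word, hypotheses, length):
    """Best total log-score of a hypothesis chain spelling `word`
    over [0, length), or None if no such chain exists."""
    # best[(i, t)]: best score spelling word[:i] while covering [0, t)
    best = {(0, 0): 0.0}
    for i, letter in enumerate(word):
        for (j, t), score in list(best.items()):
            if j != i:
                continue
            for (s, e, l, ls) in hypotheses:
                if l == letter and s == t:  # extend chain contiguously
                    key = (i + 1, e)
                    if key not in best or score + ls > best[key]:
                        best[key] = score + ls
    return best.get((len(word), length))

def recognize(dictionary, hypotheses, length):
    """Return the dictionary word with the best lattice score."""
    scored = [(best_word_score(w, hypotheses, length), w) for w in dictionary]
    scored = [(s, w) for s, w in scored if s is not None]
    return max(scored)[1] if scored else None
```

With hypotheses for "c-a-t" scoring better than "c-o-w" over the same interval, `recognize` picks "cat".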
ODIL: an SGML description language of the layout structure of documents
P. Lefèvre, François Reynaud
Pub Date: 1995-08-14 · DOI: 10.1109/ICDAR.1995.599040

This paper describes an SGML coding format for the output of a document recognition prototype. The proposal is a DTD named ODIL (Office Document Image description Language) that precisely describes the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined as SGML elements, and their characteristics as SGML attributes. The basic objects are blocks, each containing homogeneous information. The ODIL language supports five types of information: text, photos, line graphics, tables, and mathematical formulas. The ODIL representation of the recognition results is well suited to subsequent logical structure recognition: starting from the ODIL DTD and using the RAINBOW transit DTD permits the use of SGML tools for logical structure recognition, which is viewed as an SGML up-conversion problem.
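To illustrate the general pattern of "layout objects as SGML elements, characteristics as attributes", here is a tiny serialization sketch. The element name, attribute names, and block types are hypothetical stand-ins; they are not taken from the actual ODIL DTD.

```python
# Hypothetical sketch: a layout block serialized as an SGML-style
# element whose geometry is carried in attributes. Names are invented
# for illustration only.

from dataclasses import dataclass

@dataclass
class Block:
    kind: str      # e.g. TEXT, PHOTO, GRAPHIC, TABLE, FORMULA
    x: int
    y: int
    w: int
    h: int
    content: str = ""

def to_sgml(block):
    """Render one block as an element with its geometry as attributes."""
    return (f'<BLOCK TYPE="{block.kind}" X="{block.x}" Y="{block.y}" '
            f'W="{block.w}" H="{block.h}">{block.content}</BLOCK>')
```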
(Chem)DeTeX automatic generation of a markup language description of (chemical) documents from bitmap images
A. Simon, Jean-Christophe Pret, A.P. Johnson
Pub Date: 1995-08-14 · DOI: 10.1109/ICDAR.1995.599035

This paper presents a novel view of document processing as the reverse of the TeX process. This concept simplifies the analysis of the physical structure of documents and suggests the use of a style file for layout recognition. Algorithms are given for both phases, layout analysis and layout recognition. The bottom-up layout analysis method is based on Kruskal's algorithm and uses the distances between components to construct the physical page structure; the algorithm is linear in the number of connected components. For layout recognition, a document style description language (DSDL) is introduced, which helps a fault-tolerant, recursive parsing algorithm label the blocks of the document. The presented methods were designed for scientific publications (papers, reports, books) but could be applied to a broader range of documents.
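The Kruskal-style bottom-up grouping can be sketched as single-linkage clustering: connected components (reduced here to centroid points) are merged in order of increasing distance, and merges beyond a gap threshold are rejected, leaving one cluster per physical block. Note the sketch enumerates all O(n²) pairs for clarity; the paper's linear bound must come from restricting the candidate edges, which this illustration does not attempt.

```python
# Kruskal-style grouping of component centroids with union-find:
# union pairs closer than `gap`, then read off the clusters.

import math

def group_components(centroids, gap):
    parent = list(range(len(centroids)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # All pairwise edges, sorted by distance (illustrative, O(n^2)).
    edges = sorted(
        (math.dist(centroids[i], centroids[j]), i, j)
        for i in range(len(centroids)) for j in range(i + 1, len(centroids))
    )
    for d, i, j in edges:
        if d <= gap:
            parent[find(i)] = find(j)  # Kruskal union step
    clusters = {}
    for i in range(len(centroids)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(sorted(c) for c in clusters.values())
```

Two runs of close centroids separated by a wide gap come out as two blocks.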
Gray scale filtering for line and word segmentation
Yi Lu, A. Tisler
Pub Date: 1995-08-14 · DOI: 10.1109/ICDAR.1995.601979

The extraction of lines, words and characters from a digital document image is a necessary computational step preceding character recognition. Much has been published on character segmentation and recognition, but little has been done in the area of line and word segmentation. The authors present two special filters, minimum difference filters (MDF) and average difference filters (ADF), to facilitate line and word segmentation. They discuss how to select the scales of these filters dynamically and how to use the filters to eliminate crossing lines from a text image.
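The abstract does not define the filters, so the sketch below assumes plausible forms purely for illustration: MDF keeps the minimum absolute gray-level difference between a sample and its neighbors within a window of the given scale, ADF the average. These definitions are guesses, not the authors'.

```python
# Assumed (illustrative) forms of the two difference filters over a 1-D
# gray-level profile; `scale` is the half-width of the window.

def _window_diffs(row, i, scale):
    lo, hi = max(0, i - scale), min(len(row), i + scale + 1)
    return [abs(row[i] - row[j]) for j in range(lo, hi) if j != i]

def mdf(row, scale):
    """Minimum difference filter: small output = locally flat region."""
    return [min(d) if (d := _window_diffs(row, i, scale)) else 0
            for i in range(len(row))]

def adf(row, scale):
    """Average difference filter: smoothed local contrast measure."""
    return [sum(d) / len(d) if (d := _window_diffs(row, i, scale)) else 0
            for i in range(len(row))]
```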
Drawing capturing system using image enhancement
Norio Nakamura, K. Hosaka, Masakazu Nagura
Pub Date: 1995-08-14 · DOI: 10.1109/ICDAR.1995.601980

The paper describes the properties of Ueda's (1985) image enhancement method for line drawings and its merits for practical use. The method can remove line discontinuities and mis-connections caused by scanning errors. It is applied to simple images to evaluate its effect quantitatively. The authors confirm that it is more efficient than other methods, and propose a drawing capturing system based on it that can build high-quality drawing databases faster than other systems.
Recovering decorative patterns of ceramic objects from a monocular image using a genetic algorithm
H. Tanahashi, K. Sakaue, Kazuhiko Yamamoto
Pub Date: 1995-08-14 · DOI: 10.1109/ICDAR.1995.599008

To develop a database of the shapes and decorative patterns of old pottery and ceramic objects, it is necessary to obtain the surface pattern on the 3-dimensional shape of the object. This paper describes the recovery of the surface of revolution from a monocular image with unknown camera parameters and the retrieval of the 2-dimensional pattern. The camera parameters are obtained using a genetic algorithm (GA). After the surfaces of revolution are reconstructed, they are developed onto a 2-dimensional plane. We show that a scanner-digitized image of an old ceramic object can be analyzed by the GA to reconstruct the surface of revolution and to develop the 2-dimensional image from the 3-dimensional object.
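A GA parameter search of this kind can be sketched generically. In the real system the fitness would compare a candidate set of camera parameters against the observed silhouette of the surface of revolution; here `fitness` is an arbitrary stand-in objective and the operators (truncation selection, one-point crossover, single-coordinate Gaussian mutation) are textbook choices, not the authors' exact design.

```python
# Generic real-valued GA minimizing `fitness` over a box given by
# `bounds` = [(lo, hi), ...]; lower fitness = better candidate.

import random

def run_ga(fitness, bounds, pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        elite = pop[: pop_size // 2]          # truncation selection
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, dim) if dim > 1 else 1
            child = a[:cut] + b[cut:]          # one-point crossover
            k = rng.randrange(dim)             # mutate one coordinate
            lo, hi = bounds[k]
            child[k] = min(hi, max(lo, child[k] + rng.gauss(0, (hi - lo) * 0.02)))
            children.append(child)
        pop = elite + children                 # elitist replacement
    return min(pop, key=fitness)
```

On a simple quadratic objective the search settles close to the optimum within the default budget.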
A language model based on semantically clustered words in a Chinese character recognition system
Hsi-Jian Lee, Cheng-Huang Tung
Pub Date: 1995-08-14 · DOI: 10.1109/ICDAR.1995.599033

This paper presents a new method for clustering the words in a dictionary into word groups, applied in a Chinese character recognition system whose language model describes contextual information. The Chinese synonym dictionary Tong2yi4ci2 ci2lin2, which provides the semantic features, is used to train the weights of the semantic attributes of the character-based word classes. These weights are then updated according to the words of the behavior dictionary, which has a rather complete word set. The updated word classes are clustered into m groups according to a semantic similarity measure using a greedy method, and the words in the behavior dictionary are finally assigned to the m groups. The parameter space for the bigram contextual information of the character recognition system is m². The experimental results show that the recognition system with the proposed model performs better than one with a character-based bigram language model.
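A simple greedy stand-in for the clustering step: each word carries a vector of semantic-attribute weights and is assigned, one by one, to whichever of the m groups has the nearest running centroid (new groups are opened until m exist). The distance measure and processing order are illustrative choices, not the paper's.

```python
# Greedy online assignment of semantic-weight vectors to m groups;
# returns the group index chosen for each input vector.

def greedy_cluster(vectors, m):
    centroids, counts, assign = [], [], []
    for v in vectors:
        if len(centroids) < m:                # open a new group
            centroids.append(list(v))
            counts.append(1)
            assign.append(len(centroids) - 1)
            continue
        # squared Euclidean distance to each group centroid
        dists = [sum((a - c) ** 2 for a, c in zip(v, cen)) for cen in centroids]
        g = dists.index(min(dists))
        counts[g] += 1
        # incremental running-mean centroid update
        centroids[g] = [c + (a - c) / counts[g] for a, c in zip(v, centroids[g])]
        assign.append(g)
    return assign
```

The payoff described in the abstract is the bigram table: with m groups the contextual parameter space is m² instead of the square of the vocabulary size.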
Improved binarization algorithm for document image by histogram and edge detection
Moon-Soo Chang, S. Kang, Woo-Sik Rho, Heok-Gu Kim, Duck-Jin Kim
Pub Date: 1995-08-14 · DOI: 10.1109/ICDAR.1995.601976

A binarization method is presented to counter the stroke connectivity problems of characters arising from mid-level-quality binary image scanning systems. In the output of such a system, separate strokes may appear connected if the point size is small and the character strokes are complex, while strokes may lose connectivity if they are rendered at low intensity. Erroneous recognition may also result if a blemished document surface distorts the image. To counter these problems and further improve the quality of character recognition, the authors have developed an integrated binarization scheme exploiting the synergistic use of an adaptive thresholding technique and variable histogram equalization. The algorithm is composed of two components: the first removes background noise via gray-level histogram equalization, while the second raises the gray level of characters above the surrounding background via an edge image composition technique.
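The two-stage idea can be sketched under simplifying assumptions: a global histogram equalization to flatten background variation, followed by a local-mean adaptive threshold. The edge-composition step of the paper is omitted, and the window size and offset are illustrative, not the authors' parameters.

```python
# Stage 1: global histogram equalization of a gray-level image
# (list of rows of ints in [0, levels)).
def equalize(img, levels=256):
    flat = [p for row in img for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:                       # cumulative histogram
        total += h
        cdf.append(total)
    n = len(flat)
    return [[round((levels - 1) * cdf[p] / n) for p in row] for row in img]

# Stage 2: local adaptive threshold against the window mean;
# output 0 = ink, 1 = background.
def adaptive_binarize(img, half=1, offset=0):
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            win = [img[j][i]
                   for j in range(max(0, y - half), min(h, y + half + 1))
                   for i in range(max(0, x - half), min(w, x + half + 1))]
            mean = sum(win) / len(win)
            row.append(0 if img[y][x] < mean - offset else 1)
        out.append(row)
    return out
```

A dark pixel surrounded by bright background falls below its local mean and is kept as ink.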
A rule learning method for academic document image processing
A. Takasu, S. Satoh, E. Katsura
Pub Date: 1995-08-14 · DOI: 10.1109/ICDAR.1995.598985

A syntactic rule learning method is presented for analyzing document images and constructing a database from them. The method is used in a digital library system named CyberMagazine, where document images are converted into database tuples by block segmentation, rough classification, and syntactic analysis. The syntactic rules can analyze symbols located in a two-dimensional plane and have a syntax similar to an ordinary context-free grammar except for the concatenation of symbols. In the presented learning method, the syntactic rules are generated from a set of parse trees by decomposing the trees according to nonterminal symbols, generalizing the decomposed trees into syntactic rules, and merging them.
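The decompose-and-merge step can be sketched for plain context-free productions: each parse tree is decomposed at its nonterminal nodes into one production per node, and identical productions are merged. The paper's generalization step and its two-dimensional concatenation are beyond this illustration.

```python
# Extract one production per nonterminal node of a parse tree and
# merge duplicates via a set. A tree is (nonterminal, [children]);
# a child is either a subtree tuple or a terminal string.

def extract_rules(tree, rules=None):
    if rules is None:
        rules = set()
    lhs, children = tree
    # right-hand side: child nonterminals and terminal symbols in order
    rhs = tuple(c[0] if isinstance(c, tuple) else c for c in children)
    rules.add((lhs, rhs))            # merging = set union of productions
    for c in children:
        if isinstance(c, tuple):
            extract_rules(c, rules)  # recurse into subtrees
    return rules
```

Two identical PARA subtrees in the example below contribute a single merged rule.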
A Markovian random field approach to information retrieval
D. Bouchaffra, J. Meunier
Pub Date: 1995-08-14 · DOI: 10.1109/ICDAR.1995.602070

A Markovian random field approach is proposed for automatic information retrieval in full-text documents. We draw an analogy between a flow of query/document-image connections and statistical-mechanics systems. The Markovian flow process machine (MFP) models the interaction between queries and document images as a dynamical system: it seeks to fit the user's queries by changing the set of descriptors contained in the document images, so the informational states of the document fund are constantly transformed. For each state, a certain degradation of the system is considered. We use a simulated annealing algorithm to isolate low-energy states, which correspond to the best "matching", in some sense, between queries and images.
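The annealing step can be sketched generically: a state stands for a descriptor assignment, the energy measures its mismatch against the queries, and a standard Metropolis acceptance rule with a geometric cooling schedule searches for low-energy states. The energy function and move set below are illustrative stand-ins, not the paper's formulation.

```python
# Generic simulated annealing minimizer: `energy` maps a state to a
# cost, `neighbor(state, rng)` proposes a random move.

import math
import random

def anneal(energy, neighbor, state, t0=1.0, cooling=0.95, steps=500, seed=0):
    rng = random.Random(seed)
    t = t0
    cur, cur_e = state, energy(state)
    best, best_e = cur, cur_e
    for _ in range(steps):
        cand = neighbor(cur, rng)
        cand_e = energy(cand)
        # accept downhill moves always, uphill with Boltzmann probability
        if cand_e <= cur_e or rng.random() < math.exp((cur_e - cand_e) / t):
            cur, cur_e = cand, cand_e
        if cur_e < best_e:
            best, best_e = cur, cur_e
        t *= cooling                      # geometric cooling schedule
    return best, best_e
```

A toy one-dimensional energy with ±1 moves settles in its zero-energy ground state well within the default step budget.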