Proceedings of 3rd International Conference on Document Analysis and Recognition
Pub Date: 1995-08-14 | DOI: 10.1109/ICDAR.1995.602070
A Markovian random field approach to information retrieval
D. Bouchaffra, J. Meunier
A Markovian random field approach is proposed for automatic information retrieval in full-text documents. We draw an analogy between the flow of connections between queries and document images and systems in statistical mechanics. The Markovian flow process (MFP) machine models the interaction between queries and document images as a dynamical system. It seeks to fit the user's queries by changing the set of descriptors contained in the document images, so the informational states of the collection are constantly transformed. A certain degradation of the system is associated with each state. We use a simulated annealing algorithm to isolate low-energy states, which correspond, in a certain sense, to the best "matching" between queries and images.
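The low-energy search the abstract relies on can be sketched generically. This is a minimal simulated-annealing loop; the bit-vector descriptors and Hamming-distance energy below are toy stand-ins, not the paper's actual MFP formulation.

```python
import math
import random

def simulated_annealing(energy, neighbor, state, t0=1.0, cooling=0.95, steps=2000):
    """Generic simulated annealing: accept worse states with probability
    exp(-dE/T) so the search can escape local minima, then cool T."""
    t = t0
    e = energy(state)
    best, best_e = state, e
    for _ in range(steps):
        cand = neighbor(state)
        ce = energy(cand)
        if ce <= e or random.random() < math.exp((e - ce) / t):
            state, e = cand, ce
            if e < best_e:
                best, best_e = state, e
        t *= cooling
    return best, best_e

# Toy "matching" energy: Hamming distance between a query's descriptor
# vector and a document's descriptor vector (both hypothetical).
random.seed(0)
query = [1, 0, 1, 1, 0, 1, 0, 0]

def energy(doc):
    return sum(q != d for q, d in zip(query, doc))

def neighbor(doc):
    flipped = list(doc)
    flipped[random.randrange(len(flipped))] ^= 1
    return flipped

best, e = simulated_annealing(energy, neighbor, [0] * 8)
```

At the end of the cooling schedule the best state found has zero mismatches with the query, i.e. the "best matching" state of this toy system.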
Pub Date: 1995-08-14 | DOI: 10.1109/ICDAR.1995.599030
Extracting individual features from moments for Chinese writer identification
Cheng-Lin Liu, Ru-Wei Dai, Ying-Jian Liu
For writer identification (WI) with indeterminate classes (writers) and objects (characters), a good strategy is to extract individual features with clear physical meanings and small dynamic ranges. This paper presents a new moment-based feature method for identifying Chinese writers, in which normalized individual features are derived from the geometric moments of character images. The extracted features are invariant to translation, scaling, and stroke width; they correspond explicitly to the human perception of shape, and their values fall within small dynamic ranges. Writer recognition and verification experiments demonstrate the effectiveness of the method, with promising results.
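The translation and scale normalization of geometric moments can be illustrated in a few lines. This is a pure-Python sketch of the normalized central moments eta_pq on a toy square image; the feature choice and test images are illustrative, not the paper's actual individual features.

```python
def moments_features(img):
    """Translation- and scale-normalized moments (eta20, eta02, eta11) of a
    binary image given as a list of rows of 0/1 ints."""
    pts = [(x, y) for y, row in enumerate(img) for x, v in enumerate(row) if v]
    m00 = len(pts)                              # zeroth moment = pixel count
    cx = sum(x for x, _ in pts) / m00           # centroid removes translation
    cy = sum(y for _, y in pts) / m00

    def eta(p, q):
        mu = sum((x - cx) ** p * (y - cy) ** q for x, y in pts)
        # Dividing by m00^((p+q)/2 + 1) removes the dependence on scale.
        return mu / m00 ** ((p + q) / 2 + 1)

    return eta(2, 0), eta(0, 2), eta(1, 1)

def square(side, ox=0, oy=0, size=16):
    """A side x side filled square at offset (ox, oy) in a size x size image."""
    img = [[0] * size for _ in range(size)]
    for y in range(oy, oy + side):
        for x in range(ox, ox + side):
            img[y][x] = 1
    return img

f_small = moments_features(square(4))
f_shifted = moments_features(square(4, ox=5, oy=7))
f_large = moments_features(square(8))
```

Shifting the square leaves the features exactly unchanged; doubling its size changes them only by small discretization effects, which is the sense in which such features have small, stable dynamic ranges.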
Pub Date: 1995-08-14 | DOI: 10.1109/ICDAR.1995.601980
Drawing capturing system using image enhancement
Norio Nakamura, K. Hosaka, Masakazu Nagura
The paper describes the properties of Ueda's (1985) image enhancement method for line drawings and its merits for practical use. The method can remove line discontinuities and mis-connections caused by scanning errors. It is applied to simple images to evaluate its effect quantitatively. The authors confirm that it is more efficient than other methods, and propose a drawing capturing system based on it that can build high-quality drawing databases faster than other systems.
Pub Date: 1995-08-14 | DOI: 10.1109/ICDAR.1995.602096
A simplified attributed graph grammar for high-level music recognition
S. Baumann
This paper describes a simplified attributed programmed graph grammar for representing and processing a priori knowledge about common music notation. The approach serves as a high-level recognition stage and is interlocked with the preceding low-level recognition phases of our complete optical music recognition system (DOREMIDI++). The implemented grammar rules and control diagrams form a declarative knowledge base that drives a transformation algorithm, which converts the results of the symbol recognition stages into a symbolic representation of the musical score.
Pub Date: 1995-08-14 | DOI: 10.1109/ICDAR.1995.602082
A system for scanning and segmenting cursively handwritten words into basic strokes
C. Privitera, R. Plamondon
This paper presents a segmentation method that partly mimics the cognitive-behavioral process by which human subjects recover motor-temporal information from the image of a handwritten word. The approach does not rely on any thinning procedure; instead it manipulates a different type of information, the curvature of the word contour. Starting from the maximum-curvature points, which roughly correspond to the beginnings of strokes, the algorithm scans the word following the natural course of the line and attempts to repeat the movement executed by the writer while generating the word. At each maximum-curvature point, the line is segmented and reconstructed by smooth interpolation of the innermost points of the line just covered. At the end of the scanning process, a temporal sequence of motor strokes is obtained that plausibly composes the originally intended movement.
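The role of maximum-curvature points as stroke boundaries can be sketched on a polyline. The turning-angle threshold below is a crude stand-in for the contour-curvature maxima the paper uses; the threshold value and the L-shaped test path are arbitrary.

```python
import math

def segment_at_curvature_maxima(points, angle_thresh=1.0):
    """Split a polyline at vertices where the turning angle (radians)
    exceeds a threshold -- a toy proxy for curvature-maximum points."""
    cuts = [0]
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        a1 = math.atan2(y1 - y0, x1 - x0)
        a2 = math.atan2(y2 - y1, x2 - x1)
        # Wrap the angle difference into (-pi, pi] before taking |.|.
        turn = abs(math.atan2(math.sin(a2 - a1), math.cos(a2 - a1)))
        if turn > angle_thresh:
            cuts.append(i)
    cuts.append(len(points) - 1)
    return [points[a:b + 1] for a, b in zip(cuts, cuts[1:])]

# An "L"-shaped trace: the sharp corner at (4, 0) becomes a cut point,
# yielding two strokes that share that point.
path = [(x, 0) for x in range(5)] + [(4, y) for y in range(1, 5)]
strokes = segment_at_curvature_maxima(path)
```

Segmenting at such points and keeping the shared boundary point mirrors how the recovered stroke sequence stays connected along the original trace.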
Pub Date: 1995-08-14 | DOI: 10.1109/ICDAR.1995.598966
False hits of tri-syllabic queries in a Chinese signature file
Tyne Liang, Suh-Yin Lee, Wei-Pang Yang
In applying the superimposed coding method to character-based Chinese text retrieval, we find two kinds of false hits for a multi-syllabic (multi-character) query. The first is a random false hit (RFH), caused by bits in a document signature being set accidentally by irrelevant characters. The other is an adjacency false hit (AFH), caused by the loss of character-sequence information during signature creation. Since many query terms are proper nouns and Chinese names, which often contain three characters (tri-syllabic), we derive a formula to estimate the RFH for tri-syllabic queries. As the AFH cannot be reduced by the single-character (monogram) hashing method, a method that hashes consecutive character pairs (bigrams) is designed to reduce both the AFH and the RFH. We find that an optimal weight assignment exists that minimizes the false hit rate in a combined scheme encoding both monogram and bigram keys in document signatures.
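Superimposed coding and the two false-hit types are easy to demonstrate. The hash scheme, signature width, and Latin-letter stand-ins for Chinese characters below are all hypothetical choices for illustration, not the paper's parameters.

```python
import hashlib

SIG_BITS = 64

def _hash_bits(key, nbits=2):
    """Map a key to nbits bit positions (a hypothetical hashing scheme)."""
    h = hashlib.md5(key.encode("utf-8")).digest()
    return {h[i] % SIG_BITS for i in range(nbits)}

def signature(text, use_bigrams=True):
    """Superimposed coding: OR together the bit sets of every character
    (monogram) and, optionally, every adjacent pair (bigram)."""
    bits = set()
    for ch in text:
        bits |= _hash_bits(ch)
    if use_bigrams:
        for a, b in zip(text, text[1:]):
            bits |= _hash_bits(a + b)
    return bits

def matches(query, doc_sig, use_bigrams=True):
    """A signature match: every query bit is set in the document signature."""
    return signature(query, use_bigrams) <= doc_sig

# True hit: "AB" really occurs in "XABY", so its bits are always covered.
hit = matches("AB", signature("XABY"))

# Adjacency false hit: "CADB" contains 'A' and 'B' but not adjacently, yet
# a monogram-only signature still matches. Encoding bigram keys as well
# typically rejects this case (not guaranteed: bit collisions can occur,
# which is the residual RFH the paper's weight assignment trades off).
afh = matches("AB", signature("CADB", use_bigrams=False), use_bigrams=False)
```

Because a signature match only checks bit coverage, it can never miss a true occurrence; the design question the paper studies is how to allocate bits between monogram and bigram keys to minimize the false hits.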
Pub Date: 1995-08-14 | DOI: 10.1109/ICDAR.1995.598986
A high quality vectorization combining local quality measures and global constraints
M. Röösli, G. Monagan
We present a vectorization system that generates vector data corresponding to the line structures of a raster image. The vector data consists of two primitives: straight line segments and circular arcs. The system measures the quality of each primitive it generates, so the vectorization not only produces high-quality vector data but also gives a precise description of that quality. This is crucial if the requirements of industrial applications are to be met. So that the quality of the vector data is not lost when primitives are assembled into line objects, geometric constraints are incorporated already at the vectorization level: constraints such as requiring segments to be parallel or perpendicular, circular arcs to be concentric, or the tangents of primitives to be equal at their connection point. After the constraints have been satisfied, the resulting primitives still fulfil the quality requirements they met before the constraints were imposed. The possibility of refitting the generated vector data under adapted constraints allows efficient interactive postprocessing of the data.
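A per-primitive quality measure of the kind described can be as simple as a fit residual. The least-squares line fit and RMS-residual score below are a toy version of quality scoring for the "straight line segment" primitive; the paper's actual measures are not specified here.

```python
def fit_line_quality(points):
    """Fit y = a*x + b by least squares and report the RMS residual as the
    primitive's quality measure (smaller is better)."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    rms = (sum((y - (a * x + b)) ** 2 for x, y in points) / n) ** 0.5
    return a, b, rms

# Collinear raster points fit the segment primitive with zero residual.
a, b, rms = fit_line_quality([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)])
```

Keeping `rms` alongside each fitted primitive is what lets a later constrained refit (e.g. forcing two segments perpendicular) verify that the result still meets the original quality bound.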
Pub Date: 1995-08-14 | DOI: 10.1109/ICDAR.1995.599040
ODIL: an SGML description language of the layout structure of documents
P. Lefèvre, François Reynaud
This paper describes an SGML coding format for the output of a document recognition prototype. Our proposal is a DTD named "ODIL" (Office Document Image description Language) that precisely describes the layout structure of a document after all recognition phases, including OCR. All layout objects of a document are defined as SGML elements, and their characteristics are defined by SGML attributes. The basic objects are blocks containing homogeneous information. Five types of information are supported by the ODIL language: text, photos, line graphics, tables, and mathematical formulas. The ODIL representation of the recognition results is well adapted to subsequent logical structure recognition. Starting from the ODIL DTD and using the RAINBOW transit DTD will make it possible to use SGML tools for logical structure recognition, which is viewed as an SGML up-conversion problem.
Pub Date: 1995-08-14 | DOI: 10.1109/ICDAR.1995.601963
Description and recognition of form and automated form data entry
Jinhui Liu, Xiaoqing Ding, Youshou Wu
In this paper we present a form description method in which frame lines constitute a so-called frame template that reflects the structure of a form either topologically or geometrically. An item-traversal algorithm is then proposed to locate and label the form's items. We have also developed a robust and fast frame-line detection method that makes this form description practical for form recognition. Experimental results show that our approach provides an effective way to convert printed forms into a computerized format, or to collect database information from printed forms.
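Frame-line detection of the kind the abstract mentions is often built on projection profiles. The row-fill-ratio detector below is a simple stand-in, not the paper's method; the threshold and toy form image are arbitrary.

```python
def detect_frame_lines(img, min_fill=0.8):
    """Find horizontal frame lines as rows whose black-pixel ratio exceeds
    min_fill (a projection-profile stand-in for frame-line detection)."""
    width = len(img[0])
    return [y for y, row in enumerate(img) if sum(row) / width >= min_fill]

# A toy 8x10 binary form image with horizontal rules at rows 0, 4 and 7.
W = 10
img = [[0] * W for _ in range(8)]
for y in (0, 4, 7):
    img[y] = [1] * W
img[2][3] = 1  # stray content pixel, stays below the fill threshold
rules = detect_frame_lines(img)
```

The detected rules (with an analogous pass over columns) are what would populate the frame template, and the cells they bound are the items the traversal algorithm then labels.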
Pub Date: 1995-08-14 | DOI: 10.1109/ICDAR.1995.599038
Knowledge-based derivation of document logical structure
Debashish Niyogi, S. Srihari
Analyzing a document image to derive a symbolic description of its structure and contents involves using spatial domain knowledge to classify the different printed blocks (e.g., text paragraphs), group them into logical units (e.g., newspaper stories), and determine the reading order of the text blocks within each unit. These steps convert the physical structure of a document into its logical structure. We have developed a computational model for document logical structure derivation in which a rule-based control strategy uses the data obtained from analyzing a digitized document image and makes inferences with a multi-level knowledge base of document layout rules. The knowledge-based document logical structure derivation system (DeLoS) based on this model consists of a hierarchical rule-based control system that guides the block classification, grouping, and read-ordering operations; a global data structure that stores the document image data and incremental inferences; and a domain knowledge base that encodes the rules governing document layout.