Segmentation of Arabic cursive script
Deya Motawa, A. Amin, R. Sabourin
Pub Date: 1997-08-18 · DOI: 10.1109/ICDAR.1997.620580
The main theme of the paper is the automatic segmentation of Arabic words using mathematical morphology tools. The proposed algorithm has been tested on a set of Arabic words written by different writers, ranging from poor to acceptable quality. The initial experimental results are encouraging.
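The abstract does not spell out which morphological operations are used. As a rough illustration only (our own construction, not the authors' algorithm), a binary opening with a horizontal structuring element keeps thin horizontal runs, which in cursive script tend to be the baseline ligatures joining characters; columns covered only by such runs are candidate cut points:

```python
# Hypothetical sketch: morphological opening with a 1xN horizontal structuring
# element to expose connecting ligatures. Names and parameters are assumptions.
import numpy as np
from scipy.ndimage import binary_opening

def connection_mask(binary_word, ligature_len=5):
    """Mask of horizontal runs at least `ligature_len` pixels wide."""
    selem = np.ones((1, ligature_len), dtype=bool)
    return binary_opening(binary_word, structure=selem)

# Toy "word": two 5x5 character bodies joined by a 1-pixel-high stroke.
word = np.zeros((7, 16), dtype=bool)
word[1:6, 0:5] = True       # left character body
word[1:6, 11:16] = True     # right character body
word[3, 5:11] = True        # connecting ligature
mask = connection_mask(word, ligature_len=5)

# Candidate cut columns: covered by the mask but only one pixel of ink tall.
col_ink = word.sum(axis=0)
cuts = np.where(mask.any(axis=0) & (col_ink == 1))[0]
```

On this toy input the cut columns are exactly the six ligature columns between the two bodies; a real system would of course need baseline estimation and quality filtering on top.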
Character string extraction by multi-stage relaxation
H. Hase, Toshiyuki Shinokawa, M. Yoneda, M. Sakai, H. Maruyama
Pub Date: 1997-08-18 · DOI: 10.1109/ICDAR.1997.619860
An extraction algorithm for character strings is proposed. We first obtain a set of eight-connected components from a document image and then apply a relaxation method to them. The method strengthens or weakens mutual connections between components depending on the state of the neighboring components. As the relaxation is applied repeatedly, the process proceeds from local connections to global connections, and finally character strings are extracted. We call this process multi-stage relaxation. The advantages of this algorithm are that it does not need to nominate character components from an image beforehand, it adapts to character size and font, and it can handle documents containing strings with various orientations. In our experiments we use a color image of a magazine cover and a monochromatic image of a graph. For the color image, the multi-stage relaxation was executed on each binary image obtained by color segmentation. Lastly, we show the results of the experiments and discuss the effectiveness of our method.
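The paper's exact update rule is not given in the abstract; the following is a hedged sketch of the general idea (our own toy formulation): candidate links join nearby component centroids, and each iteration reinforces a link when both endpoints already participate in other links, so chains of closely spaced components (text strings) survive while isolated components drop out:

```python
# Illustrative relaxation over component centroids; all constants are assumptions.
import numpy as np

def group_components(centroids, max_gap=2.0, iters=3):
    pts = np.asarray(centroids, dtype=float)
    n = len(pts)
    d = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
    np.fill_diagonal(d, np.inf)
    s = np.where(d < max_gap, 1.0, 0.0)          # stage 0: purely local links
    for _ in range(iters):                        # later stages: context spreads
        degree = (s > 0.5).sum(axis=1)
        boost = (degree[:, None] > 0) & (degree[None, :] > 0)
        s = np.clip(s + 0.2 * np.where(boost, 1, -1), 0.0, 1.0) * (d < max_gap * 1.5)
    # Connected components over surviving links = extracted strings.
    groups, seen = [], set()
    for i in range(n):
        if i in seen:
            continue
        stack, comp = [i], []
        while stack:
            j = stack.pop()
            if j in seen:
                continue
            seen.add(j)
            comp.append(j)
            stack.extend(np.where(s[j] > 0.5)[0].tolist())
        groups.append(sorted(comp))
    return groups

# Four evenly spaced components form a string; a distant fifth stays alone.
groups = group_components([(0, 0), (1, 0), (2, 0), (3, 0), (10, 10)])
```

Note how the repeated passes let second-neighbor links (distance 2 here) grow above threshold through context, which a single local pass would miss; that is the "local to global" progression the abstract describes.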
Robust stroke segmentation method for handwritten Chinese character recognition
Ke Liu, Yea-Shuan Huang, C. Suen
Pub Date: 1997-08-18 · DOI: 10.1109/ICDAR.1997.619843
The paper presents a robust thinning-based method for segmenting strokes from handwritten Chinese characters. A new set of feature points is proposed for the analysis of skeleton images, and a geometrical graph-based approach is developed for the analysis of strokes. A novel criterion is proposed for identifying the fork points in a skeleton image that correspond to the same joint point in the original character image. Experimental results show that the proposed method is effective.
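A common baseline definition of a fork point, which the paper's criterion refines, is a skeleton pixel with three or more 8-connected skeleton neighbors. A minimal sketch (our illustration, not the paper's method) also shows why a criterion is needed: a single joint often yields several adjacent fork pixels:

```python
# Baseline fork-point detector on a binary skeleton (assumed formulation).
import numpy as np

def fork_points(skel):
    """Skeleton pixels with three or more 8-connected neighbours."""
    s = np.pad(skel.astype(int), 1)
    # Sum of the 8 neighbours for every pixel (padding supplies the zero border).
    nbrs = sum(np.roll(np.roll(s, dy, 0), dx, 1)
               for dy in (-1, 0, 1) for dx in (-1, 0, 1)
               if (dy, dx) != (0, 0))[1:-1, 1:-1]
    return np.argwhere((skel > 0) & (nbrs >= 3))

# A small "T"-shaped skeleton: one joint where the stem meets the bar.
skel = np.zeros((5, 5), dtype=int)
skel[1, 1:4] = 1    # horizontal bar
skel[2:4, 2] = 1    # vertical stem
forks = fork_points(skel)
```

Here the single T-joint produces two adjacent fork pixels, (1,2) and (2,2); deciding which fork pixels correspond to the same joint in the original character is exactly the ambiguity the paper's novel criterion addresses.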
A fast HMM algorithm for on-line handwritten character recognition
K. Takahashi, H. Yasuda, T. Matsumoto
Pub Date: 1997-08-18 · DOI: 10.1109/ICDAR.1997.619873
A fast HMM algorithm is proposed for on-line handwritten character recognition. After preprocessing, input strokes are discretized so that a discrete HMM can be used. This particular discretization naturally leads to a simple procedure for assigning initial state and state transition probabilities. In the training phase, complete marginalization with respect to state is not performed (constrained Viterbi); a simple smoothing/flooring procedure yields fast and robust learning. A criterion based on the normalized maximum likelihood ratio decides when to create a new model for the same character during learning, in order to cope with stroke-order variations and large shape variations. Preliminary experiments are conducted on the new Kuchibue database from the Tokyo University of Agriculture and Technology, and the results are encouraging.
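The Viterbi decoding at the heart of such a discrete HMM recognizer can be sketched compactly. This is the standard algorithm, not the paper's specific constrained-training variant; the toy model and symbol meanings are our assumptions:

```python
# Standard Viterbi decoding for a discrete-output HMM, in log space.
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely state path given observation symbols `obs`."""
    log_pi, log_A, log_B = (np.log(m + 1e-12) for m in (pi, A, B))
    T, n = len(obs), len(pi)
    delta = log_pi + log_B[:, obs[0]]
    back = np.zeros((T, n), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_A      # scores[i, j]: best path entering j via i
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Two-state left-to-right model; symbols 0/1 stand for two discretized directions.
pi = np.array([1.0, 0.0])
A = np.array([[0.5, 0.5], [0.0, 1.0]])       # left-to-right: no return to state 0
B = np.array([[0.9, 0.1], [0.1, 0.9]])       # state 0 favors symbol 0, state 1 symbol 1
path = viterbi([0, 0, 1, 1], pi, A, B)
```

The paper's "constrained Viterbi" training reuses exactly this best-path computation in place of a full forward-backward sum over states, which is where the speed comes from.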
Recognition of character strings from color urban map images on the basis of validation mechanism
Toyohide Watanabe, Rui Zhang
Pub Date: 1997-08-18 · DOI: 10.1109/ICDAR.1997.620622
Map recognition aims to extract meaningful information automatically from map images and to build resource data for information systems such as geographic information systems (GISs). However, it is difficult to identify individual pieces of information because map components mutually intersect or overlap, and because their properties are not well defined. In this paper, we propose a method for extracting character strings from color urban map images. The characteristic of our method is to validate the identified character strings against map composition rules. Our recognition process consists of two phases: extraction of character strings and classification of character strings. The extraction phase follows a bottom-up approach based on measurement and estimation among pixels, while the classification phase follows a top-down approach based on interpretation and validation among map components.
A knowledge-based image understanding environment for document processing
Y. Li, M. Lalonde, E. Reiher, Jean-Francois Rizand, Chong Zhu
Pub Date: 1997-08-18 · DOI: 10.1109/ICDAR.1997.620656
The CRIM Image Mining Environment (CIME) is an image understanding environment integrated with knowledge engineering technologies. The image understanding technology developed for CIME supports recognition tasks such as symbol recognition, contour line recognition, and general image processing operations. CIME is composed of three parts: an Object Model Description Language (OMDL), which allows for the semantic description of knowledge about objects and their context; an inference engine, which performs image understanding based on the model descriptions in OMDL to achieve the recognition of complex objects; and a task builder, which supports the acquisition of image understanding knowledge and helps the application designer build strategies for solving application problems through knowledge reuse.
Form registration: a computer vision approach
R. Safari, N. Narasimhamurthi, M. Shridhar, M. Ahmadi
Pub Date: 1997-08-18 · DOI: 10.1109/ICDAR.1997.620611
In this paper, a technique for extracting the distortion parameters of filled-in forms is presented. The technique determines the transformations required to match a filled-in form to a known master and then extracts the filled-in information. The method involves finding corresponding lines and key points between the master and the filled-in form and using the correspondence to determine the appropriate transformation. The correspondence problem is solved using results from affine geometry.
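Once corresponding key points are found, the affine transformation itself is an ordinary least-squares fit. A minimal sketch (the correspondence step, which is the paper's real contribution, is assumed already solved here):

```python
# Least-squares affine fit from point correspondences; the toy data are ours.
import numpy as np

def fit_affine(src, dst):
    """Fit dst ~ src @ M.T + t from matched point pairs."""
    src, dst = np.asarray(src, float), np.asarray(dst, float)
    A = np.hstack([src, np.ones((len(src), 1))])      # rows [x, y, 1]
    params, *_ = np.linalg.lstsq(A, dst, rcond=None)  # 3x2: M.T stacked over t
    M, t = params[:2].T, params[2]
    return M, t

# Recover a known scale + shift from four matched key points.
src = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], float)
M_true = np.array([[2.0, 0.0], [0.0, 3.0]])
t_true = np.array([5.0, -1.0])
dst = src @ M_true.T + t_true
M, t = fit_affine(src, dst)
```

With at least three non-collinear correspondences the fit is exact; extra matched points make it robust to localization noise in the scanned form.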
A system for extracting and recognizing numeral strings on maps
Liang-Hua Chen, Jiing-Yuh Wang
Pub Date: 1997-08-18 · DOI: 10.1109/ICDAR.1997.619867
The paper presents a complete procedure for the extraction and recognition of handprinted numeral strings on maps. The character extraction algorithm can segment even slanted and touching characters into individual components. The feature-based recognition algorithm can recognize numeral characters of any size, position, and orientation. The features used for discrimination are simple and easily detectable. Experimental results on utility and cadastral maps show that the proposed technique is effective for automatic data capture in geographic information systems.
Weighted Hough transform on a gridded image plane
K. Sugawara
Pub Date: 1997-08-18 · DOI: 10.1109/ICDAR.1997.620598
Skew detection is an important preprocessing step in document image analysis. We propose a novel skew detection method based on splitting an image both vertically and horizontally. The center of gravity of the black pixels in each section serves as a sample point for the Hough transform, which gives the skew angle of the document image. We extend the Hough transform with a notion of weight and apply it to reduce spurious line detection caused by large non-text areas with many black pixels. The weighting function is defined according to the proportion of black pixels in an area, giving a low weight to areas with an extremely high proportion of black pixels. In skew detection experiments on several document images, our method ran four times faster than the conventional approach of applying the Hough transform to connected components, and it places no restriction on the range of detectable skew angles.
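The core idea, one weighted sample point per image section, can be sketched as follows. This is our own simplified variant: it uses vertical strips only, and scores each candidate angle by the weighted spread of the centroids' normal distances instead of a full Hough accumulator; the density-based down-weighting mirrors the abstract's weighting function, but the constants are assumptions:

```python
# Simplified strip-centroid skew estimator inspired by the abstract's scheme.
import numpy as np

def estimate_skew(img, n_strips=8):
    """Skew angle in degrees, from weighted per-strip centroids."""
    h, w = img.shape
    cx, cy, wts = [], [], []
    for cols in np.array_split(np.arange(w), n_strips):
        ys, xs = np.nonzero(img[:, cols])
        if len(xs) == 0:
            continue
        density = len(xs) / (h * len(cols))
        cx.append(xs.mean() + cols[0])
        cy.append(ys.mean())
        wts.append(0.1 if density > 0.5 else 1.0)   # down-weight near-solid areas
    cx, cy, wts = map(np.asarray, (cx, cy, wts))
    best_angle, best_spread = 0.0, np.inf
    for deg in range(-45, 46):
        phi = np.deg2rad(deg)
        rho = cy * np.cos(phi) - cx * np.sin(phi)   # distance along the line normal
        mean = np.average(rho, weights=wts)
        spread = np.average((rho - mean) ** 2, weights=wts)
        if spread < best_spread:                     # centroids most collinear here
            best_angle, best_spread = float(deg), spread
    return best_angle

# Toy document: a single "text line" rising 1 px every 4 columns (about 14 deg).
img = np.zeros((64, 128))
img[32 + np.arange(128) // 4, np.arange(128)] = 1
angle = estimate_skew(img)
```

Reducing each section to one sample point is what makes the approach fast: the vote loop runs over a handful of centroids rather than every black pixel or connected component.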
Perceptually-based representation of network diagrams
Diana Galindo, C. Faure
Pub Date: 1997-08-18 · DOI: 10.1109/ICDAR.1997.619870
A pen-based interactive editor for network diagrams is proposed. The network diagrams are composed of geometrical figures and connecting lines. The user may hand-sketch a first diagram and then modify it by adding new components or by erasing, replacing or moving existing ones. The machine beautifies the draft and updates the whole structure of the diagram whenever the user makes a local modification. Graphic communication implies that the layout of the diagram follows principles grounded in visual perception, so the machine must be able to detect perceptual constraints (alignments, equal sizes, etc.) for beautification and updating. A perceptually structured representation (PSR) of the diagram is built automatically. A formal model of diagram perception is defined as a set of rules applied during a global analysis that follows the local analysis in which the figures are detected and recognised.
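One such perceptual rule, horizontal alignment, can be made concrete with a small sketch (our own illustration of the kind of rule involved, not the paper's formal model): two figures are considered aligned when their centers agree on one axis within a tolerance proportional to figure size, and the editor can then snap them during beautification:

```python
# Hypothetical alignment-constraint detector over figure bounding boxes.
def aligned_pairs(boxes, tol=0.25):
    """boxes: (x, y, w, h) rects; return index pairs whose centres share a y
    coordinate within tol * average height."""
    out = []
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            xi, yi, wi, hi = boxes[i]
            xj, yj, wj, hj = boxes[j]
            if abs((yi + hi / 2) - (yj + hj / 2)) <= tol * (hi + hj) / 2:
                out.append((i, j))
    return out

# Two roughly level figures and one well below them.
boxes = [(0, 0, 10, 10), (20, 1, 10, 10), (40, 30, 10, 8)]
pairs = aligned_pairs(boxes)
```

Scaling the tolerance to figure size, rather than using a fixed pixel threshold, is what lets the same rule work for freehand sketches of any drawing scale.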