Pub Date: 1997-08-18, DOI: 10.1109/ICDAR.1997.620602
J. Cullen, J. Hull, P. Hart
A system is presented that uses texture to retrieve and browse images stored in a large document image database. A method of graphically generating a candidate search image is used that shows the visual layout and content of a target document. All images similar to this candidate are returned for the purpose of browsing or further query. The system is accessed using a World Wide Web (Web) browser. Applications include the retrieval and browsing of document images including newspapers, faxes and business letters.
Title: Document image database retrieval and browsing using texture analysis. In: Proceedings of the Fourth International Conference on Document Analysis and Recognition.
Pub Date: 1997-08-18, DOI: 10.1109/ICDAR.1997.619817
D. Dori, Wenyin Liu
Accurate arc segmentation, essential for high-level engineering drawing understanding, is very difficult due to noise, clutter, tangency, and intersections with other geometric objects. We present an application of a generic methodology for recognition of multicomponent graphic objects in engineering drawings to the segmentation of circular arcs. The underlying mechanism is a sequential stepwise recovery of components that are segmented as wire fragments during the sparse pixel vectorization process and meet a set of continuity conditions. Proper threshold selection and consistent checking of co-circularity of the assumed arc pieces result in an accurate arc segmentation method.
Title: Arc segmentation from complex line environments: a vector-based stepwise recovery algorithm. In: Proceedings of the Fourth International Conference on Document Analysis and Recognition.
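The arc segmentation abstract above relies on checking the co-circularity of assumed arc pieces. The following is only a minimal sketch of one way such a check could look, not the authors' algorithm: it fits a circle through the first, middle and last points of a candidate and accepts the candidate if every point lies within a tolerance of that circle. The function names and the `tolerance` value are hypothetical.

```python
import math


def circle_through(p, q, r):
    """Return (cx, cy, radius) of the circle through three points, or None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = p, q, r
    d = 2.0 * (x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2))
    if abs(d) < 1e-12:
        return None  # collinear points define no finite circle
    s1, s2, s3 = x1 * x1 + y1 * y1, x2 * x2 + y2 * y2, x3 * x3 + y3 * y3
    cx = (s1 * (y2 - y3) + s2 * (y3 - y1) + s3 * (y1 - y2)) / d
    cy = (s1 * (x3 - x2) + s2 * (x1 - x3) + s3 * (x2 - x1)) / d
    return cx, cy, math.hypot(x1 - cx, y1 - cy)


def is_cocircular(points, tolerance=0.5):
    """Check whether all points lie near the circle through the first, middle and last point.

    The tolerance (in pixels) is an assumed threshold, not a value from the paper.
    """
    if len(points) < 3:
        return False
    circle = circle_through(points[0], points[len(points) // 2], points[-1])
    if circle is None:
        return False
    cx, cy, radius = circle
    return all(abs(math.hypot(x - cx, y - cy) - radius) <= tolerance for x, y in points)
```

A candidate such as `[(5, 0), (3, 4), (0, 5), (-3, 4), (-5, 0)]` passes the check, while replacing one point with `(0, 0)` makes it fail. Choosing the tolerance is the kind of threshold selection the abstract says matters.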
Pub Date: 1997-08-18, DOI: 10.1109/ICDAR.1997.619828
H. Nishida
A novel method is presented for extracting closed boundaries of document components, such as characters and symbols, directly from gray-scale document images based on surface data structures along with structural features. The method is based on a simple model which assumes that a closed boundary of a document component can be approximated as a series of horizontal line segments and can be extracted by linking surface components with steep gradients which share commonly intersecting horizontal planes. The proposed algorithm is compared with some binarization algorithms and is shown to be effective for improving recognition accuracy for very poor quality data.
Title: Boundary feature extraction from gray-scale document images. In: Proceedings of the Fourth International Conference on Document Analysis and Recognition.
Pub Date: 1997-08-18, DOI: 10.1109/ICDAR.1997.620561
S. Madhvanath, V. Krpasundar
We present a technique for pruning large lexicons for recognition of cursive script words. The technique involves extraction and representation of downward pen-strokes from the cursive word (off-line or on-line) to obtain a generalized descriptor which provides a coarse characterization of word shape. The descriptor is matched with ideal descriptors of lexicon entries organized as a trie. When used with a static lexicon of 21,000 words, the accuracy of reduction to 1,000 words exceeds 95%.
Title: Pruning large lexicons using generalized word shape descriptors. In: Proceedings of the Fourth International Conference on Document Analysis and Recognition.
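The lexicon-pruning abstract above matches a coarse word-shape descriptor against ideal descriptors stored in a trie. As a rough illustration of that idea only, the sketch below swaps in a much simpler, hypothetical descriptor (one symbol per letter: `A` for ascender, `D` for descender, `x` otherwise) in place of the paper's downward pen-stroke descriptor, and allows a small number of symbol mismatches during the trie walk. All names and letter classes here are assumptions.

```python
# Hypothetical letter classes; a stand-in for the paper's stroke-based descriptor.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def coarse_descriptor(word):
    """Map a word to a coarse shape string: A = ascender, D = descender, x = neither."""
    return "".join(
        "A" if ch in ASCENDERS else "D" if ch in DESCENDERS else "x"
        for ch in word.lower()
    )


class DescriptorTrie:
    """Trie keyed by descriptor symbols; each node stores the words that end there."""

    def __init__(self):
        self.children = {}
        self.words = []

    def insert(self, word):
        node = self
        for symbol in coarse_descriptor(word):
            node = node.children.setdefault(symbol, DescriptorTrie())
        node.words.append(word)

    def search(self, descriptor, max_mismatches=0):
        """Return lexicon words whose descriptor differs in at most max_mismatches symbols."""
        results = []

        def walk(node, depth, mismatches):
            if depth == len(descriptor):
                results.extend(node.words)
                return
            for symbol, child in node.children.items():
                cost = 0 if symbol == descriptor[depth] else 1
                if mismatches + cost <= max_mismatches:
                    walk(child, depth + 1, mismatches + cost)

        walk(self, 0, 0)
        return sorted(results)


lexicon = DescriptorTrie()
for entry in ["dog", "cat", "bag", "pen", "hat"]:
    lexicon.insert(entry)
```

With this toy lexicon, `lexicon.search("AxD")` keeps only `bag` and `dog`, and allowing one mismatch also admits `hat`. Pruning whole subtrees once the mismatch budget is exhausted is what makes the trie organization attractive for large lexicons.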
Pub Date: 1997-08-18, DOI: 10.1109/ICDAR.1997.619820
John T. Favata
Focuses on the problem of isolated off-line general word recognition using an approximate stroke-segment/string matching algorithm. Several recently proposed word recognition algorithms use the strategy of directly matching the stroke segments (with OCR estimates) to the sequence of characters in each lexicon word. This idea works very well under ideal conditions; however, many applications require the recognition of text in the presence of document noise, poor handwriting and lexicon errors. These factors require careful design of the matching strategy such that a moderate amount of any form of degradation does not cause a recognition failure. A segment-to-string matching algorithm is proposed which robustly recovers from moderate levels of noise and system errors. This algorithm is developed in the context of a complete word recognition system and serves as its final post-processing module.
Title: General word recognition using approximate segment-string matching. In: Proceedings of the Fourth International Conference on Document Analysis and Recognition.
Pub Date: 1997-08-18, DOI: 10.1109/ICDAR.1997.619850
Jiangying Zhou, D. Lopresti
The authors examine the problem of locating and extracting text from images on the World Wide Web. They describe a text detection algorithm which is based on color clustering and connected component analysis. The algorithm first quantizes the color space of the input image into a number of color classes using a parameter-free clustering procedure. It then identifies text-like connected components in each color class based on their shapes. Finally, a post-processing procedure aligns text-like components into text lines. Experimental results suggest this approach is promising despite the challenging nature of the input data.
Title: Extracting text from WWW images. In: Proceedings of the Fourth International Conference on Document Analysis and Recognition.
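The text-extraction abstract above describes finding text-like connected components within each color class. The sketch below illustrates only that middle step, under stated assumptions: the image has already been quantized into a grid of integer color-class labels (the parameter-free clustering is not reproduced), components are 4-connected, and the shape test uses hypothetical thresholds on bounding-box size, fill ratio and aspect ratio rather than the authors' criteria.

```python
from collections import deque


def connected_components(labels):
    """Return a list of (color_class, pixels) for each 4-connected region of equal class."""
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c]:
                continue
            color_class = labels[r][c]
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:  # breadth-first flood fill over same-class neighbours
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and labels[ny][nx] == color_class):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append((color_class, pixels))
    return components


def looks_like_text(pixels, max_size=20, min_fill=0.2, max_aspect=5.0):
    """Hypothetical shape filter: small, reasonably filled, not extremely elongated."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    height = max(ys) - min(ys) + 1
    width = max(xs) - min(xs) + 1
    fill = len(pixels) / (height * width)
    aspect = max(height, width) / min(height, width)
    return (height <= max_size and width <= max_size
            and fill >= min_fill and aspect <= max_aspect)
```

On a small label grid with a background class and two separate foreground blobs, the background component is rejected by the size limit while both blobs pass. Grouping the surviving components into text lines, the final step in the abstract, is not sketched here.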
Pub Date: 1997-08-18, DOI: 10.1109/ICDAR.1997.619823
Myriam Côté, M. Cheriet, É. Lecolinet, C. Suen
Presents a model for reading cursive scripts which has an architecture inspired by a reading model and which is based on perceptual concepts. We limit the scope of our study to the off-line recognition of isolated cursive words. First of all, we justify why we chose McClelland & Rumelhart's (1981) reading model as the inspiration for our system. A brief résumé of the method's behavior is presented and the main originalities of our model are underlined. After this, we focus on the new updates added to the original system: a new baseline extraction module, a new feature extraction module and a new generation, validation and hypothesis insertion process. After implementation of our method, new results have been obtained on real images from a training set of 184 images, and a testing set of 100 images, and are discussed. We are concentrating now on validating the model using a larger database.
Title: Automatic reading of cursive scripts using human knowledge. In: Proceedings of the Fourth International Conference on Document Analysis and Recognition.
Pub Date: 1997-08-18, DOI: 10.1109/ICDAR.1997.620557
S. Madhvanath, V. Govindaraju
The one-dimensional nature of contour representations presents interesting challenges for processing of images for handwritten word recognition. In this paper, we discuss the issues of determination of upper and lower contours of the word, determination of significant local extrema on the contour, and determination of reference lines from contour representations of handwritten words.
Title: Contour-based image preprocessing for holistic handwritten word recognition. In: Proceedings of the Fourth International Conference on Document Analysis and Recognition.
Pub Date: 1997-08-18, DOI: 10.1109/ICDAR.1997.620644
Ioannis T. Pavlidis, Rahul Singh, N. Papanikolopoulos
We propose a novel user-dependent method for the recognition of on-line handwritten notes. The method employs as a dissimilarity measure the "degree of morphing" between an input curve and a template curve. A physics-based approach substantiates the "degree of morphing" as a deformation energy and casts the problem as an energy minimization problem. The method operates upon key segmentation points that are provided by an appropriate segmentation algorithm. The segmentation objective is not to locate letters, but instead to locate corners and some key low curvature points (an easier task). This is part of the method's strategy to see the word as a generic on-line curve. Due to this strategy, the proposed method can handle collectively both cursive words and hand-drawn line figures, the two key ingredients of handwritten notes. Most importantly, the proposed system achieves high recognition rates without ever resorting to statistical models.
Title: An on-line handwritten note recognition method using shape metamorphosis. In: Proceedings of the Fourth International Conference on Document Analysis and Recognition.
Pub Date: 1997-08-18, DOI: 10.1109/ICDAR.1997.620672
G. Menier, G. Lorette
In a lexical analyzer, scanning a whole dictionary using an edit distance measure has a very high computational cost. We present a lexical analyzer designed to focus on a very limited subset of the whole dictionary. The system is based on a self-organizing feature map which maps the dictionary onto a 2D space. The neighborhood relationships in this space are then used to define a short list of hypotheses. We introduce a multi-stage pyramidal network to speed up the access, and we present the performance of the system. These results are then interpreted.
Title: Lexical analyzer based on a self-organizing feature map. In: Proceedings of the Fourth International Conference on Document Analysis and Recognition.
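The lexical analyzer abstract above starts from the cost of scanning a whole dictionary with an edit distance measure. To make that baseline concrete, here is a minimal sketch of the standard Levenshtein distance and of the brute-force scan the paper sets out to avoid. It does not implement the self-organizing feature map, and `brute_force_shortlist` is a hypothetical helper name.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (unit-cost insert, delete, substitute)."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute or match
            ))
        previous = current
    return previous[-1]


def brute_force_shortlist(query, dictionary, size=3):
    """Scan every entry and return the closest ones; ties are broken alphabetically."""
    return sorted(dictionary, key=lambda word: (edit_distance(query, word), word))[:size]
```

Each distance costs time proportional to the product of the two word lengths, and the brute-force scan pays that cost once per dictionary entry. Restricting the scan to a small neighborhood of the dictionary, as the abstract proposes, shrinks the number of distance computations rather than the cost of each one.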