Low quality string recognition for factory automation
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620543
K. Sawa, S. Tsuruoka, T. Wakabayashi, F. Kimura, Y. Miyake
Describes a method for dot-printed character string recognition on a piece of steel for factory automation. The scanned string consists of alphanumerics and the '-' character, and its length varies from 6 to 12 characters. We propose a new recognition procedure for low-quality strings. The procedure includes image emphasis with a Gaussian Laplacian filter, extraction of the string subimage, segmentation-recognition with dynamic programming, and fine character recognition. We evaluated its accuracy on a UNIX workstation with 1036 images (8806 characters) scanned by a monochrome video camera on the actual production line of a steel-producing factory; the average recognition rates were 99.2% for character recognition and 91.6% for string recognition.
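The abstract's "image emphasis" step uses a Gaussian Laplacian (Laplacian-of-Gaussian) filter. Below is a minimal illustrative sketch of such a step, not the authors' code; the sigma value, sign convention, and rescaling are assumptions chosen for illustration.

```python
# Sketch: emphasising dot-printed strokes with a Laplacian-of-Gaussian filter.
# Assumed parameters (sigma, sign, rescaling), not the paper's actual settings.
import numpy as np
from scipy import ndimage

def emphasize(gray: np.ndarray, sigma: float = 1.5) -> np.ndarray:
    """Return a contrast-emphasised version of a grayscale image."""
    log = ndimage.gaussian_laplace(gray.astype(float), sigma=sigma)
    # Dot-printed strokes are assumed darker than the steel background, so the
    # negative LoG response highlights them; rescale to 0..255 for display.
    emphasized = -log
    emphasized -= emphasized.min()
    if emphasized.max() > 0:
        emphasized *= 255.0 / emphasized.max()
    return emphasized.astype(np.uint8)
```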
{"title":"Low quality string recognition for factory automation","authors":"K. Sawa, S. Tsuruoka, T. Wakabayashi, F. Kimura, Y. Miyake","doi":"10.1109/ICDAR.1997.620543","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620543","url":null,"abstract":"Describes a method for dot-printed character string recognition on a piece of steel for factory automation. Our scanned string consists of alphanumerics and the '-' character, and the number of characters is variable from 6 to 12 characters. We propose a new recognition procedure for low-quality strings. The procedure includes image emphasis with a Gaussian Laplacian filter, the extraction of the string subimage, segmentation-recognition with dynamic programming, and fine character recognition. We evaluated its accuracy on a UNIX workstation for 1036 images (8806 characters) scanned by a monochrome video camera in the actual production line at a steel-producing factory, and the average recognition rates were 99.2% for the character recognition and 91.6% for the string recognition.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132745175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
More versatile scientific documents
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620680
R. Fateman
The electronic representation of scientific documents (journals, technical reports, program documentation, laboratory notebooks, etc.) presents challenges in several distinct communities. We see five distinct groups who are concerned with electronic versions of scientific documents: (1) publishers of journals, texts and reference works, and their authors; (2) software publishers for OCR/document analysis and document formatting; (3) software publishers whose products access "contents semantics" from documents, including library keyword search programs, natural language search programs, database systems, visual presentation systems, mathematical computation systems, etc.; (4) institutions maintaining access to electronic libraries, which must be broadly construed to include data and programs of all sorts; and (5) individuals and programs acting as their agents who need to use these libraries to identify, locate and retrieve relevant documents. It would be good to have a convergence in design and standards for encoding new or pre-existing (typically paper-based) documents in order to meet the needs of all these groups. Various efforts, some loosely coordinated, but just as often competing, are trying to set standards and build tools. This paper discusses where we are headed.
{"title":"More versatile scientific documents","authors":"R. Fateman","doi":"10.1109/ICDAR.1997.620680","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620680","url":null,"abstract":"The electronic representation of scientific documents (journals, technical reports, program documentation, laboratory notebooks, etc.) presents challenges in several distinct communities. We see five distinct groups who are concerned with electronic versions of scientific documents: (1) publishers of journals, texts and reference works, and their authors; (2) software publishers for OCR/document analysis and document formatting; (3) software publishers whose products access \"contents semantics\" from documents, including library keyword search programs, natural language search programs, database systems, visual presentation systems, mathematical computation systems, etc.; (4) institutions maintaining access to electronic libraries, which must be broadly construed to include data and programs of all sorts; and (5) individuals and programs acting as their agents who need to use these libraries to identify, locate and retrieve relevant documents. It would be good to have a convergence in design and standards for encoding new or pre-existing (typically paper-based) documents in order to meet the needs of all these groups. Various efforts, some loosely coordinated, but just as often competing, are trying to set standards and build tools. This paper discusses where we are headed.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133086589","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dynamic word based text compression
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619880
K. Ng, L. Cheng, C. H. Wong
We propose a dynamic text compression technique with a back-searching algorithm and a new storage protocol. The codes produced are divided into three types, namely copy, literal, and hybrid codes. Multiple dictionaries are adopted, and each of them has a linked sub-dictionary. Each dictionary has a portion of pre-defined words, i.e. the most frequent words, and the rest of the entries depend on the message. A hashing function developed by Pearson (1990) is adopted. It serves two purposes: firstly, it is used to initialize the dictionary; secondly, it is used for quick lookup of a particular word. With this scheme, the spaces between words do not need to be considered; at the decoding side, a space character is appended after each word is decoded. Therefore, the redundancy of spaces can also be compressed. The results show that the original message will not be expanded even with a poor dictionary design.
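The Pearson (1990) hash mentioned above is an 8-bit hash driven by a byte permutation table. The sketch below shows the standard construction and a hypothetical use as a word-bucket index; the seed, table, and bucket scheme are assumptions, not the paper's actual dictionary layout.

```python
# Minimal sketch of a Pearson (1990) 8-bit hash used for dictionary
# initialisation and fast word lookup. Illustrative assumptions only.
import random

random.seed(0)                      # fixed seed so the table is reproducible
TABLE = list(range(256))
random.shuffle(TABLE)               # a permutation of 0..255

def pearson_hash(word: str) -> int:
    h = 0
    for byte in word.encode("ascii", errors="ignore"):
        h = TABLE[h ^ byte]
    return h                        # 0..255, usable as a bucket index

# Hypothetical use: seed buckets with the most frequent words, then look a
# word up by hashing it instead of scanning the whole dictionary.
buckets = {pearson_hash(w): w for w in ("the", "of", "and", "to", "a")}
print(pearson_hash("the"), buckets.get(pearson_hash("the")))
```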
{"title":"Dynamic word based text compression","authors":"K. Ng, L. Cheng, C. H. Wong","doi":"10.1109/ICDAR.1997.619880","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619880","url":null,"abstract":"We propose a dynamic text compression technique with a back searching algorithm and a new storage protocol. Codes being encoded are divided into three types namely copy, literal and hybrid codes. Multiple dictionaries are adopted and each of them has a linked sub-dictionary. Each dictionary has a portion of pre-defined words i.e. the most frequent words and the rest of the entries will depend on the message. A hashing function developed by Pearson (1990) is adopted. It serves two purposes. Firstly, it is used to initialize the dictionary. Secondly, it is used as a quick search to a particular word. By using this scheme, the spaces between words do not need to be considered. At the decoding side, a space character will be appended after each word is decoded. Therefore, the redundancy of space can also be compressed. The result shows that the original message will not be expanded even if we have poor dictionary design.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130954123","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A new method for segmenting handwritten Chinese characters
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620565
L. Tseng, R. Chen
A new approach is proposed to segment off-line handwritten Chinese characters. Many papers have been published on the off-line recognition of Chinese characters, and almost all of them focus on the recognition of isolated characters; the segmentation of text into characters has rarely been discussed. Segmentation is an important preprocessing step in off-line Chinese character recognition, because correct recognition of characters relies on their correct segmentation. In handwriting, characters may touch or overlap each other, so the segmentation problem is not an easy one. In this paper, we present a novel method that first uses strokes to build stroke bounding boxes. Knowledge-based merging operations are then used to merge those stroke bounding boxes, and finally a dynamic programming method is applied to find the best segmentation boundaries. A series of experiments shows that our method is very effective for off-line handwritten Chinese character segmentation.
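As a generic illustration of the final step, the sketch below picks segmentation boundaries by dynamic programming: given candidate cut positions and a score for the region between two cuts, it chooses the cut sequence with the best total score. This is not the authors' implementation; the cut list and the hypothetical score_region function stand in for the stroke-bounding-box analysis and the character recogniser.

```python
# Sketch: choose segmentation boundaries by dynamic programming.
# `cuts` must include the left and right borders of the text line;
# `score_region(a, b)` is a hypothetical recogniser score for the image
# region between x-positions a and b.
from functools import lru_cache
from typing import Callable, List, Tuple

def best_segmentation(cuts: List[int],
                      score_region: Callable[[int, int], float]
                      ) -> Tuple[float, List[int]]:
    n = len(cuts)

    @lru_cache(maxsize=None)
    def best_from(i: int) -> Tuple[float, Tuple[int, ...]]:
        if i == n - 1:                        # reached the rightmost cut
            return 0.0, (cuts[i],)
        candidates = []
        for j in range(i + 1, n):             # try every possible next cut
            tail_score, tail_cuts = best_from(j)
            candidates.append((score_region(cuts[i], cuts[j]) + tail_score,
                               (cuts[i],) + tail_cuts))
        return max(candidates)

    score, path = best_from(0)
    return score, list(path)
```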
{"title":"A new method for segmenting handwritten Chinese characters","authors":"L. Tseng, R. Chen","doi":"10.1109/ICDAR.1997.620565","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620565","url":null,"abstract":"A new approach is proposed to segment off-line handwritten Chinese characters. Many papers have been published on the off-line recognition of Chinese characters, and almost all of them focus on the recognition of isolated Chinese characters. The segmentation of text into characters was rarely discussed. The segmentation is an important preprocess of the off-line Chinese character recognition because correct recognition of characters relies on correct segmentation of characters. In handwritten Chinese characters, characters may be written to touch each other or to overlap with each other, therefore, the segmentation problem is not an easy one. In this paper, we present a novel method which uses strokes to build stroke bounding boxes first. Then, the knowledge-based merging operations are used to merge those stroke bounding boxes and finally, a dynamic programming method is applied to find the best segmentation boundaries. A series of experiments show that our method is very effective for off-line handwritten Chinese character segmentation.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134130155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Font recognition and contextual processing for more accurate text recognition
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619810
Hongwei Shi, T. Pavlidis
Font recognition and contextual processing are developed as two components that enhance the recognition accuracy of a text recognition system presented in a previous paper (H. Shi and T. Pavlidis, 1996). Font information is extracted from two sources: one is the global page properties, and the other is the graph-matching result of recognized short words such as a, it, and of. Contextual processing is done by first composing word candidates from the recognition results and then checking each candidate against a dictionary through a spelling checker. Positional binary trigrams and word affixes are used to prune the search for word candidates.
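A positional binary trigram table of the kind named above can be read as: "has this letter trigram ever been seen at this position in a dictionary word?" The sketch below builds such a table and uses it to prune implausible word candidates; the boundary markers and the tiny sample dictionary are assumptions for illustration.

```python
# Sketch: positional binary trigrams for pruning word candidates before a
# full spelling-checker pass. Illustrative dictionary and padding scheme.
from collections import defaultdict

def build_positional_trigrams(dictionary):
    allowed = defaultdict(set)                 # position -> set of trigrams
    for word in dictionary:
        padded = f"^{word.lower()}$"           # assumed word-boundary markers
        for pos in range(len(padded) - 2):
            allowed[pos].add(padded[pos:pos + 3])
    return allowed

def plausible(candidate, allowed):
    padded = f"^{candidate.lower()}$"
    return all(padded[pos:pos + 3] in allowed[pos]
               for pos in range(len(padded) - 2))

allowed = build_positional_trigrams(["font", "form", "found"])
print(plausible("font", allowed))   # True
print(plausible("fonn", allowed))   # False: 'onn' never seen at that position
```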
{"title":"Font recognition and contextual processing for more accurate text recognition","authors":"Hongwei Shi, T. Pavlidis","doi":"10.1109/ICDAR.1997.619810","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619810","url":null,"abstract":"Font recognition and contextual processing are developed as two components that enhance the recognition accuracy of a text recognition system presented in a previous paper ((H. Shi and T. Pavlidis, 1996). Font information is extracted from two sources: one is the global page properties, and the other is the graph matching result of recognized short words such as a, it and of etc. Contextual processing is done by first composing word candidates from the recognition results and then checking each candidate with a dictionary through a spelling checker. Positional binary trigrams and word affixes are used to prune the search for word candidates.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133650078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rectangle labelling for an invoice understanding system
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619865
F. Cesarini, E. Francesconi, M. Gori, S. Marinai, Jianqing Sheng, G. Soda
We present a method for the logical labelling of physical rectangles extracted from invoices, based on a conceptual model which describes, as generally as possible, the invoice universe. This general knowledge is used in the semi-automatic construction of a model for each class of invoices. Once the model is constructed, it can be applied to understand an invoice instance, whose class is uniquely identified by its logo. This approach is used to design a flexible system which is able to learn, from a nucleus of general knowledge, a monotonically growing body of specific knowledge for each class of invoices (document models), in terms of the physical coordinates of each rectangle and the related semantic label.
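As a rough picture of what a per-class model of "coordinates plus semantic label" can look like, the sketch below labels an extracted rectangle by the model entry it overlaps most. The labels, coordinates, and overlap criterion are illustrative assumptions, not the paper's actual model or matching rule.

```python
# Sketch: label an extracted rectangle against a hypothetical class model
# that stores one expected rectangle per semantic label.
def overlap_area(a, b):
    # rectangles are (x0, y0, x1, y1)
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(w, 0) * max(h, 0)

class_model = {                       # hypothetical model for one invoice class
    "invoice_number": (400, 40, 560, 70),
    "total_amount":   (420, 700, 560, 740),
}

def label_rectangle(rect, model):
    label, score = max(((name, overlap_area(rect, box))
                        for name, box in model.items()), key=lambda t: t[1])
    return label if score > 0 else None

print(label_rectangle((405, 45, 550, 68), class_model))  # -> "invoice_number"
```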
{"title":"Rectangle labelling for an invoice understanding system","authors":"F. Cesarini, E. Francesconi, M. Gori, S. Marinai, Jianqing Sheng, G. Soda","doi":"10.1109/ICDAR.1997.619865","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619865","url":null,"abstract":"We present a method for the logical labelling of physical rectangles, extracted from invoices, based on a conceptual model which describes, as generally as possible, the invoice universe. This general knowledge is used in the semi automatic construction of a model for each class of invoices. Once the model is constructed, it can be applied to understand an invoice instance, whose class is univocally identified by its logo. This approach is used to design a flexible system which is able to learn, from a nucleus of general knowledge, a monotonic set of specific knowledge for each class of invoices (document models), in terms of physical coordinates for each rectangle and related semantic label.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131998408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
An efficient fully parallel thinning algorithm
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.619829
N.H. Han, C. La, P. Rhee
The paper addresses an efficient parallel thinning algorithm based on weight-values. The weight-value of a black pixel is calculated by observing its neighboring pixels, and it provides an efficient way to decide whether the pixel should be deleted or not. Owing to weight-values, the proposed algorithm uses only 3×3 templates. Furthermore, it examines only the elimination conditions corresponding to the weight-value of the boundary pixels, so not all elimination conditions have to be searched, as in most other parallel iterative thinning algorithms. Thus, the execution time is considerably reduced compared to previous approaches. The weight-value also allows typical troublesome patterns to be handled efficiently. Without smoothing before thinning, the algorithm produces robust thinned images even in the presence of two-pixel-wide noise. The authors obtain encouraging results from extensive experiments.
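The core idea of a weight-value is to encode a pixel's 3×3 neighbourhood as a single number so that deletion conditions can be looked up rather than re-tested. The sketch below shows one common power-of-two encoding; the weight layout is an assumption, and the paper's actual elimination conditions are not reproduced here.

```python
# Sketch: encode the 8-neighbourhood of a black pixel as an integer in 0..255
# so that thinning decisions can be driven by a lookup on that value.
import numpy as np

# clockwise neighbour offsets starting at north, with weights 1, 2, 4, ..., 128
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def weight_value(img: np.ndarray, r: int, c: int) -> int:
    """Return the assumed weight-value of the pixel at (r, c)."""
    value = 0
    for k, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr, c + dc]:
            value |= 1 << k
    return value

img = np.zeros((5, 5), dtype=np.uint8)
img[2, 1:4] = 1                       # a one-pixel-thick horizontal stroke
print(weight_value(img, 2, 2))        # east (4) + west (64) set -> 68
```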
{"title":"An efficient fully parallel thinning algorithm","authors":"N.H. Han, C. La, P. Rhee","doi":"10.1109/ICDAR.1997.619829","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.619829","url":null,"abstract":"The paper addresses an efficient parallel thinning algorithm based on weight-values. The weight-value of a black pixel is calculated by observing neighboring pixels, and it gives one an efficient way to decide whether the pixel is deleted or not. Owing to weight-values, the proposed algorithm uses only 3/spl times/3 templates. Furthermore, it examines only the elimination conditions corresponding the weight-value of boundary pixels, and all elimination conditions will not be searched as most other parallel iterative thinning algorithms. Thus, the execution time can be reduced a lot compared to that of previous approaches. The weight-value also allow one to deal with typical troublesome patterns efficiently. Without smoothing before thinning, the algorithm produces robust thinned images even in the presence of two pixel-width noises. The authors obtain encouraging results from extensive experiments.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"101 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132223944","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The performance evaluation of thresholding algorithms for optical character recognition
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620597
A. T. Abak, U. Baris, B. Sankur
The paper presents a performance evaluation of thresholding algorithms in the context of document analysis and character recognition systems. Several thresholding algorithms are comparatively evaluated on the basis of the original bitmaps of the characters. Different distance measures, such as Hausdorff, Jaccard, and Yule, are used to measure the similarity between the thresholded bitmaps and the original bitmaps of the characters.
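Two of the measures named above, Jaccard and Yule, can be computed directly on the flattened binary bitmaps. The sketch below uses SciPy's boolean-vector distances; the toy bitmaps are illustrative only and the evaluation protocol is not the paper's.

```python
# Sketch: Jaccard and Yule dissimilarity between an original character bitmap
# and its thresholded version, using SciPy on the flattened boolean images.
import numpy as np
from scipy.spatial.distance import jaccard, yule

original = np.array([[0, 1, 1, 0],
                     [0, 1, 1, 0],
                     [0, 1, 1, 0]], dtype=bool)
thresholded = np.array([[0, 1, 1, 0],
                        [0, 1, 0, 0],      # one foreground pixel lost
                        [0, 1, 1, 0]], dtype=bool)

print("Jaccard distance:", jaccard(original.ravel(), thresholded.ravel()))
print("Yule distance:   ", yule(original.ravel(), thresholded.ravel()))
```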
{"title":"The performance evaluation of thresholding algorithms for optical character recognition","authors":"A. T. Abak, U. Baris, B. Sankur","doi":"10.1109/ICDAR.1997.620597","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620597","url":null,"abstract":"The paper presents performance evaluation of thresholding algorithms in the context of document analysis and character recognition systems. Several thresholding algorithms are comparatively evaluated on the basis of the original bitmaps of characters. Different distance measures such as Hausdorff, Jaccard, and Yule are used to measure the similarity between thresholded bitmaps and original bitmaps of characters.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"694 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132790979","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optical character recognition without segmentation
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620545
M.A. Ozdil, F. Vural
A segmentation-free approach for off-line optical character recognition is presented. The proposed method performs the recognition by extracting the characters from the whole word, avoiding the segmentation process. A control point set which includes position and attribute vectors is selected for the features. In the training mode, each sample character is mapped to a set of control points and is stored in an archive which belongs to an alphabet. In the recognition mode, the control points of the input image are first extracted. Then, each control point is matched to the control points in the alphabet according to its attributes. During the matching process, a probability matrix is constructed which holds some matching measures (probabilities) for identifying the characters. Experimental results indicate that the proposed method is very robust in extracting the characters from a cursive script.
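The matching stage described above can be pictured as filling a matrix of similarities between input control points and the control points stored for each character. The sketch below is a loose stand-in under stated assumptions: the exponential similarity, the nearest-point rule, and the row normalisation are illustrative choices, not the paper's measures.

```python
# Sketch: build a (input control point x character) probability matrix by
# comparing attribute vectors. Assumed similarity and normalisation.
import numpy as np

def match_probabilities(input_points, alphabet):
    """input_points: (n, d) attribute vectors; alphabet: {char: (m, d) array}."""
    chars = sorted(alphabet)
    probs = np.zeros((len(input_points), len(chars)))
    for j, ch in enumerate(chars):
        ref = alphabet[ch]
        for i, p in enumerate(input_points):
            # similarity of this input point to its closest reference point
            d = np.linalg.norm(ref - p, axis=1).min()
            probs[i, j] = np.exp(-d)          # assumed similarity measure
    # normalise each row so the entries behave like probabilities
    probs /= probs.sum(axis=1, keepdims=True)
    return chars, probs
```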
{"title":"Optical character recognition without segmentation","authors":"M.A. Ozdil, F. Vural","doi":"10.1109/ICDAR.1997.620545","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620545","url":null,"abstract":"A segmentation-free approach for off-line optical character recognition is presented. The proposed method performs the recognition by extracting the characters from the whole word, avoiding the segmentation process. A control point set which includes position and attribute vectors is selected for the features. In the training mode, each sample character is mapped to a set of control points and is stored in an archive which belongs to an alphabet. In the recognition mode, the control points of the input image are first extracted. Then, each control point is matched to the control points in the alphabet according to its attributes. During the matching process, a probability matrix is constructed which holds some matching measures (probabilities) for identifying the characters. Experimental results indicate that the proposed method is very robust in extracting the characters from a cursive script.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130026715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A method for connecting disappeared junction patterns on frame lines in form documents
Pub Date: 1997-08-18 | DOI: 10.1109/ICDAR.1997.620590
Hiroshi Shinjo, Kazuki Nakashima, Masashi Koga, K. Marukawa, Y. Shima, Eiichi Hadano
Form document structure analysis is an essential technique for recognizing the positions of characters in general forms. However, a fundamental problem is that interruptions in lines, as well as noise, lead to incorrect analysis. The paper focuses on a method for connecting junction patterns in which portions of the horizontal and vertical lines are not visible, referred to as "disappeared junction patterns". Our method has two key stages for making correct connections. The first is noise elimination, in which lines whose two end points meet no other lines and which are shorter than the minimum line length parameter are eliminated. The second is object line selection, where only the frame lines of tables are selected as object lines for connection. Experiments with 39 form images demonstrated the feasibility of this method.
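The first stage, noise elimination, can be sketched as a simple predicate over detected line segments: drop a segment when neither end point meets another line and the segment is shorter than the minimum-length parameter. The segment representation, tolerance, and threshold below are assumptions for illustration, not the paper's values.

```python
# Sketch: noise-elimination test for a detected line segment.
# A line is a pair of end points ((x0, y0), (x1, y1)).
import math

def point_segment_distance(p, a, b):
    ax, ay = a; bx, by = b; px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.dist(p, a)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)))
    return math.dist(p, (ax + t * dx, ay + t * dy))

def is_noise(line, other_lines, min_length=15.0, tol=2.0):
    a, b = line
    touches = any(point_segment_distance(end, *other) <= tol
                  for end in (a, b) for other in other_lines)
    return (not touches) and math.dist(a, b) < min_length
```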
{"title":"A method for connecting disappeared junction patterns on frame lines in form documents","authors":"Hiroshi Shinjo, Kazuki Nakashima, Masashi Koga, K. Marukawa, Y. Shima, Eiichi Hadano","doi":"10.1109/ICDAR.1997.620590","DOIUrl":"https://doi.org/10.1109/ICDAR.1997.620590","url":null,"abstract":"Form document structure analysis is an essential technique for recognizing the positions of characters in general forms. However, it has a fundamental problem that interruptions of lines, as well as noise, lead to incorrect analysis. The paper focuses on a method for connecting junction patterns in which portions of the horizontal and vertical lines are not visible, referred to as \"disappeared junction patterns\". Our method has two key stages for making correct connections. The first is noise elimination, in which lines whose two end points meet no other lines and which are shorter than the minimum line length parameter, are eliminated. The second is object line selection, where only frame lines of tables are selected as object lines for connection. Experiments with 39 form images demonstrated the feasibility of this method.","PeriodicalId":435320,"journal":{"name":"Proceedings of the Fourth International Conference on Document Analysis and Recognition","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1997-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130250597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}