Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953896
U. Pal, B. Chaudhuri
In general, a document page may contain several script forms. For optical character recognition (OCR) of such a page, it is necessary to separate the scripts before feeding them to their individual OCR systems. An automatic technique for the identification of printed Roman, Chinese, Arabic, Devnagari and Bangla text lines from a single document is proposed. Shape-based features, statistical features and some features obtained from the concept of a water reservoir are used for script identification. The proposed scheme has an accuracy of about 97.33%.
Title : Automatic identification of English, Chinese, Arabic, Devnagari and Bangla script line
Published in : Proceedings of Sixth International Conference on Document Analysis and Recognition
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953752
Ruini Cao, C. Tan
The separation of overlapping text from graphics is a challenging problem in document image analysis. This paper proposes a specific method for detecting and extracting characters that are touching graphics. It is based on the observation that the constituent strokes of characters are usually short segments in comparison with those of graphics. It combines line continuation with the feature line width to decompose and reconstruct segments underlying the region of intersection. Experimental results showed that the proposed method improved the percentage of correctly detected text as well as the accuracy of character recognition significantly.
Title : Separation of overlapping text from graphics
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953970
Fei Liu, Yupin Luo, D. Hu, Masataka Yoshikawa
The aim of layout analysis is to extract the geometric structure from a document image; it is a process of labeling homogeneous regions of the image. To analyze complex newspaper layouts, this paper proposes a new component-based bottom-up algorithm. Using a novel homogeneity-related definition of distance, it maintains a dynamic minimal-distance mechanism to decide the order in which components are merged. Under restricting rules derived heuristically from newspaper layouts, we derive the preferred analysis result. Experimental results indicate that the proposed approach is effective.
Title : A new component based algorithm for newspaper layout analysis
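The bottom-up merging idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not define their homogeneity-related distance or their newspaper-specific rules, so this sketch assumes a plain geometric gap between bounding boxes and a single hypothetical `max_gap` stopping threshold. The names `gap`, `union` and `merge_components` are illustrative only.

```python
from itertools import combinations

Box = tuple[int, int, int, int]  # (left, top, right, bottom)


def gap(a: Box, b: Box) -> float:
    """Euclidean gap between two axis-aligned boxes (0 if they touch or overlap).

    Assumption: a stand-in for the paper's homogeneity-related distance.
    """
    dx = max(a[0] - b[2], b[0] - a[2], 0)
    dy = max(a[1] - b[3], b[1] - a[3], 0)
    return (dx * dx + dy * dy) ** 0.5


def union(a: Box, b: Box) -> Box:
    """Smallest box enclosing both inputs."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_components(boxes: list[Box], max_gap: float) -> list[Box]:
    """Repeatedly merge the closest pair until no pair is within max_gap."""
    regions = list(boxes)
    while len(regions) > 1:
        # Recompute the minimal distance after every merge, since a merged
        # region may now be close to components that were far away before.
        i, j = min(
            combinations(range(len(regions)), 2),
            key=lambda p: gap(regions[p[0]], regions[p[1]]),
        )
        if gap(regions[i], regions[j]) > max_gap:
            break
        merged = union(regions[i], regions[j])
        regions = [r for k, r in enumerate(regions) if k not in (i, j)] + [merged]
    return regions


if __name__ == "__main__":
    components = [(0, 0, 10, 10), (12, 0, 20, 10), (100, 100, 110, 110)]
    print(merge_components(components, max_gap=5))
```

Recomputing all pairwise gaps on every pass is quadratic per merge; it keeps the sketch short, whereas a real system would more likely maintain a priority queue of candidate pairs.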
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953786
Bertrand Coüasnon
Genericity in structured document recognition is a difficult challenge. We therefore propose a new generic document recognition method, called DMOS (Description and MOdification of Segmentation), that is made up of a new grammatical formalism, called EPF (Enhanced Position Formalism) and an associated parser which is able to introduce context in segmentation. We implement this method to obtain a generator of document recognition systems. This generator can automatically produce new recognition systems. It is only necessary to describe the document with an EPF grammar, which is then simply compiled. In this way, we have developed various recognition systems: one on musical scores, one on mathematical formulae and one on recursive table structures. We have also defined a specific application to damaged military forms of the 19th Century. We have been able to test the generated system on 5,000 of these military forms. This has permitted us to validate the DMOS method on a real-world application.
Title : DMOS: a generic document recognition method, application to an automatic generator of musical scores, mathematical formulae and table structures recognition systems
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953764
S. Srihari, Sung-Hyuk Cha, Hina Arora, Sangjik Lee
Motivated by several rulings in United States courts concerning expert testimony in general and handwriting testimony in particular, we undertook a study to objectively validate the hypothesis that handwriting is individualistic. Handwriting samples of 1500 individuals, representative of the US population with respect to gender, age, ethnic groups, etc., were obtained. Analyzing differences in handwriting was done by using computer algorithms for extracting features from scanned images of handwriting. Attributes characteristic of the handwriting were obtained, e.g., line separation, slant, character shapes, etc. These attributes, which are a subset of attributes used by expert document examiners, were used to quantitatively establish individuality by using machine learning approaches. Using global attributes of handwriting and very few characters in the writing, the ability to determine the writer with a high degree of confidence was established. The work is a step towards providing scientific support for admitting handwriting evidence in court. The mathematical approach and the resulting software also have the promise of aiding the expert document examiner.
Title : Individuality of handwriting: a validation study
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953980
Faten Hussein, R. Ward, N. Kharma
Our aim is: a) to present a comprehensive survey of previous attempts at using genetic algorithms (GA) for feature selection in pattern recognition applications, with a special focus on character recognition; and b) to report on work that uses GA to optimize the weights of the classification module of a character recognition system. The main purpose of feature selection is to reduce the number of features, by eliminating irrelevant and redundant features, while simultaneously maintaining or enhancing classification accuracy. Many search algorithms have been used for feature selection. Among those, GA have proven to be an effective computational method, especially in situations where the search space is mathematically uncharacterized, not fully understood, and/or high-dimensional.
Title : Genetic algorithms for feature selection and weighting, a review and study
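To make the feature-selection setting in the abstract above concrete, here is a minimal sketch of a genetic algorithm over bit-mask chromosomes. It is a generic textbook GA (tournament selection, one-point crossover, bit-flip mutation, elitism), not the configuration surveyed or used by the authors. The function `ga_select`, its default parameters and the toy fitness function are all assumptions made for illustration; in practice the fitness would be a classifier's validation accuracy on the selected features.

```python
import random


def ga_select(n_features, fitness, pop_size=20, generations=30,
              crossover_rate=0.8, mutation_rate=0.05, seed=0):
    """Return the best feature mask found (list of bools, True = keep)."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]

    def tournament():
        # Binary tournament: pick two masks at random, keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        elite = max(pop, key=fitness)
        children = [elite[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = tournament(), tournament()
            if rng.random() < crossover_rate:
                cut = rng.randrange(1, n_features)  # one-point crossover
                child = p1[:cut] + p2[cut:]
            else:
                child = p1[:]
            # Bit-flip mutation: each gene flips with probability mutation_rate.
            child = [(not g) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
    return max(pop, key=fitness)


# Toy fitness (assumption): features 0, 2 and 5 are "relevant"; every kept
# feature costs 0.1, which penalizes irrelevant and redundant features.
RELEVANT = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(1 for i, keep in enumerate(mask) if keep and i in RELEVANT)
    return hits - 0.1 * sum(mask)


if __name__ == "__main__":
    best = ga_select(8, toy_fitness)
    print([i for i, keep in enumerate(best) if keep])
```

The size penalty in the toy fitness mirrors the stated goal of reducing the number of features while maintaining accuracy. Feature weighting, the second topic of the paper, would replace the boolean genes with real-valued weights.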
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953951
Gerald Penn, Jianying Hu, Hengbin Luo, Ryan T. McDonald
We propose a set of baseline heuristics for identifying genuinely tabular information and news links in HTML documents. A prototype implementation of these heuristics is described for delivering content from news providers' home pages to a narrow-bandwidth device such as a portable digital assistant or cellular phone display. Its evaluation on 75 Web sites is provided, along with a discussion of topics for future research.
Title : Flexible Web document analysis for delivery to narrow-bandwidth devices
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953881
Kuo-Chin Fan, Mei-Lin Chang, Yuan-Kai Wang
Form recognition is one of the special applications of document analysis (DA). We present a novel form recognition method that analyzes the line structure embedded in an input form document. First, all vertical and horizontal lines embedded in the form image are extracted. By analyzing the crossing relationships among horizontal and vertical lines, a line crossing relationship matrix can be built, with each row corresponding to one horizontal line and each column corresponding to one vertical line. Moreover, two line distance relationship matrices, one horizontal and one vertical, are built by analyzing the distance relationships among horizontal lines and among vertical lines, respectively. Last, the recognition task is performed by matching these three matrices. Experimental results show the feasibility and efficiency of the proposed method in recognizing form documents.
Title : Form document identification using line structure based features
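The line crossing relationship matrix described in the abstract above (one row per horizontal line, one column per vertical line) can be sketched as follows. The line representation, the pixel tolerance `tol` and the function name `crossing_matrix` are assumptions for illustration; the abstract does not say how lines are extracted or how the three matrices are matched, so neither step is shown.

```python
HLine = tuple[int, int, int]  # (y, x_start, x_end)
VLine = tuple[int, int, int]  # (x, y_start, y_end)


def crossing_matrix(h_lines: list[HLine], v_lines: list[VLine],
                    tol: int = 2) -> list[list[int]]:
    """Entry [i][j] is 1 if horizontal line i meets vertical line j, else 0.

    tol (assumption) absorbs small gaps left by scanning or line extraction.
    """
    matrix = []
    for y, x0, x1 in h_lines:
        row = []
        for x, y0, y1 in v_lines:
            crosses = (x0 - tol <= x <= x1 + tol) and (y0 - tol <= y <= y1 + tol)
            row.append(1 if crosses else 0)
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    # A toy form: three full-width rules, two full-height borders, and one
    # vertical divider that spans only the upper half.
    horizontals = [(0, 0, 100), (50, 0, 100), (100, 0, 100)]
    verticals = [(0, 0, 100), (100, 0, 100), (50, 0, 50)]
    for row in crossing_matrix(horizontals, verticals):
        print(row)
```

For the toy form, the last row is `[1, 1, 0]` because the half-height divider never reaches the bottom rule. Differences of this kind in the crossing pattern are what allow one form layout to be told apart from another.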
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953799
Abderrazak Zahour, B. Taconet, P. Mercy, Said Ramdane
This paper describes a method for text-line extraction. The typical segmentation of a printed binary document is based on horizontal projection analysis followed by regrouping of the connected components. These techniques cannot be used for unconstrained handwritten text because the data frequently contain undulations and shifts in the baseline, baseline-skew variability and inter-line distance variability. We therefore consider that the border line for an unconstrained handwritten document should be a collection of horizontal line segments. From this point of view, we use a method based on partial contour following to detect the separating lines. In the current version of our algorithm, we first detect the text slant and estimate the number of text lines using partial projection. Then we carry out a partial contour following of every line, first in the direction of the writing, then in the opposite direction. After this processing, the adjacent lines are separated. In the experimental section, we describe the application of the algorithm to text-line extraction. The image database contains about one hundred handwritten Arabic texts written by different writers. Results on the assignment of diacritical points are also reported.
Title : Arabic hand-written text-line extraction
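For context, the horizontal projection analysis that the abstract above cites as the typical approach for printed documents can be sketched as follows. This is the baseline the authors argue is unsuitable for unconstrained handwriting, not their partial contour following method. The image representation (rows of 0/1 pixels, 1 = ink), the `min_ink` threshold and the function name `text_line_bands` are assumptions.

```python
def text_line_bands(image: list[list[int]], min_ink: int = 1) -> list[tuple[int, int]]:
    """Return (first_row, last_row) for each run of rows containing ink.

    image is a binary page: one list per pixel row, 1 for ink, 0 for background.
    """
    profile = [sum(row) for row in image]  # ink pixels per row
    bands, start = [], None
    for y, ink in enumerate(profile):
        if ink >= min_ink and start is None:
            start = y  # a text line begins
        elif ink < min_ink and start is not None:
            bands.append((start, y - 1))  # a blank row closes the line
            start = None
    if start is not None:  # text runs to the bottom edge of the page
        bands.append((start, len(profile) - 1))
    return bands


if __name__ == "__main__":
    page = [
        [0, 0, 0],
        [1, 1, 0],
        [0, 1, 0],
        [0, 0, 0],
        [1, 0, 1],
        [0, 0, 0],
    ]
    print(text_line_bands(page))
```

The sketch relies on blank rows separating the lines. When baselines undulate or adjacent lines overlap vertically, as the abstract notes for handwriting, the profile has no clean valleys and the bands merge, which is the motivation for computing projections only over partial regions of the page.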
Pub Date : 2001-09-10 DOI: 10.1109/ICDAR.2001.953757
H. Miled, N. Amara
In this paper, we show how planar hidden Markov models (PHMM) offer great potential for solving difficult Arabic character recognition problems, especially the cursiveness of the script. A convenient architecture is defined for printed Arabic sub-words. It yields a simple way to model the different morphological variations of Arabic writing, i.e., vertical and variable horizontal linkages. A more flexible architecture, developed for Arabic handwritten words, is under test. The proposed structure is able to absorb the variability of handwriting. Indeed, the experiments have shown promising results and directions for further improvements. In the present paper, we describe both retained architectures, showing the applicability of the PHMM to the complexities of Arabic. This is due precisely to the definition of the PHMM, which makes it possible to follow efficiently the natural variations in bands of the Arabic script.
Title : Planar Markov modeling for Arabic writing recognition: advancement state