Sketching is a natural way to express ideas during the early phases of design. For this reason, many efforts have been made to develop user interfaces and recognizers that enable users to create sketches with pen-based devices. In some domains, however, such as architecture and engineering, the drawing process can be particularly tedious and time-consuming, since the symbols to be drawn may have complex shapes and recur many times in a sketch. In this paper we present a symbol completion technique that allows users to rapidly draw diagrammatic sketches. The technique recovers the missing strokes by interacting with symbol recognizers that are automatically generated from grammar specifications. Moreover, to keep the sketch layout familiar to the user, the added strokes are rendered in the user's own drawing style.
"Using Grammar-Based Recognizers for Symbol Completion in Diagrammatic Sketches." G. Costagliola, V. Deufemia, M. Risi. Ninth International Conference on Document Analysis and Recognition (ICDAR 2007). doi:10.1109/ICDAR.2007.259
Lexical text correction relies on a central step in which approximate search in a dictionary is used to select the best correction suggestions for an ill-formed input token. In previous work we introduced the concept of a universal Levenshtein automaton and showed how to use these automata to efficiently select from a dictionary all entries within a fixed Levenshtein distance of the garbled input word. In this paper we look at refinements of the basic Levenshtein distance that yield more sensible notions of similarity in particular text correction applications, e.g. OCR. We show that the concept of a universal Levenshtein automaton can be adapted to these refinements. In this way we obtain a method for selecting correction candidates that is very efficient while selecting small candidate sets with high recall.
"Fast Selection of Small and Precise Candidate Sets from Dictionaries for Text Correction Tasks." K. Schulz, S. Mihov, Petar Mitankin. ICDAR 2007. doi:10.1109/ICDAR.2007.119
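The candidate-selection task above can be illustrated with a minimal sketch: classic dynamic-programming Levenshtein distance plus a brute-force dictionary filter. This is the naive baseline that the paper's universal Levenshtein automata accelerate, not the automaton construction itself; function names are illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance
    # (unit cost for insertion, deletion, substitution).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

def candidates(word: str, dictionary, max_dist: int = 1):
    # Brute-force stand-in for automaton-based filtering: keep every
    # dictionary entry within the fixed distance bound of the input.
    return [w for w in dictionary if levenshtein(word, w) <= max_dist]
```

For example, `candidates("tge", ["the", "dog", "toe"], 1)` returns `["the", "toe"]`; the automaton approach computes the same set without scoring every entry.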
With the emergence of digital pen and paper interfaces, there is a need for gesture recognition tools for digital pen input. While a variety of gesture recognition frameworks exist, none of them supports both application developers and the designers of new recognition algorithms while also integrating with new forms of input devices such as digital pens. We introduce iGesture, a Java-based gesture recognition framework focused on extensibility and cross-application reusability. It provides an integrated solution that includes tools for gesture recognition as well as for the creation and management of gesture sets used to evaluate and optimise new or existing gesture recognition algorithms. In addition to traditional screen-based interaction, iGesture provides a digital pen and paper interface.
"iGesture: A General Gesture Recognition Framework." B. Signer, U. Kurmann, M. Norrie. ICDAR 2007. doi:10.1109/ICDAR.2007.139
To distinguish similar characters, it is preferable to construct a classifier in a projective feature space that differentiates the two similar categories. The compound Mahalanobis function (CMF) has been proposed as a discriminant function for similar-character recognition. In the CMF, a subspace spanned by the eigenvectors corresponding to the smallest eigenvalues is used as the projective feature space. The difference vector of the two class-mean feature vectors is taken to represent the difference between the two similar categories, and the CMF is computed by projecting a feature vector onto this difference vector. In this paper, we propose a new discriminant function that extends the CMF: we treat the difference subspace, i.e., the difference between the two class subspaces, as the difference between the two similar categories. The effectiveness of the proposed discriminant function is demonstrated through extensive similar-character recognition experiments on handwritten Japanese characters from the ETL9B database.
"A Classifier of Similar Characters using Compound Mahalanobis Function based on Difference Subspace." J. Hirayama, Hidehisa Nakayama, N. Kato. ICDAR 2007. doi:10.1109/ICDAR.2007.4
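The core idea of projecting onto the difference of two class means can be sketched as follows. This is a simplified Euclidean stand-in, not the compound Mahalanobis function itself (which additionally involves the eigenstructure of the covariance); the function name and midpoint convention are illustrative assumptions.

```python
import numpy as np

def mean_difference_score(x, mean_a, mean_b):
    # Project a feature vector onto the normalized difference of the two
    # class means, measured from their midpoint: a positive score leans
    # toward class A, a negative score toward class B.
    d = mean_a - mean_b
    d = d / np.linalg.norm(d)
    midpoint = (mean_a + mean_b) / 2.0
    return float(np.dot(x - midpoint, d))
```

The paper's extension replaces this single difference vector with a difference *subspace* between the two class subspaces, capturing more than one discriminative direction.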
M. Seki, Masakazu Fujio, T. Nagasaki, Hiroshi Shinjo, K. Marukawa
An information management system based on document structure analysis is presented. Its purpose is the unified management of information held in various paper and electronic documents. The system comprises image document analysis, PDF document analysis, and HTML document analysis. Two applications and their developed prototypes are described: one is document summarization; the other is table understanding, which correlates data to items.
"Information Management System Using Structure Analysis of Paper/Electronic Documents and Its Applications." M. Seki, Masakazu Fujio, T. Nagasaki, Hiroshi Shinjo, K. Marukawa. ICDAR 2007. doi:10.1109/ICDAR.2007.144
We propose a method of information extraction from HTML documents based on modelling the visual information in the document. A page segmentation algorithm is used to detect the document layout; subsequently, the extraction process is based on the analysis of the mutual positions of the detected blocks and their visual features. This approach is more robust than traditional DOM-based methods and opens new possibilities for specifying extraction tasks.
"Layout Based Information Extraction from HTML Documents." Radek Burget. ICDAR 2007. doi:10.1109/ICDAR.2007.155
In the present article, we describe a novel direction-code-based feature extraction approach for the recognition of online handwritten basic characters of Bangla, a major script of the Indian subcontinent. We implemented the proposed approach on a database of 7043 online handwritten Bangla character samples, which we developed ourselves. This is a 50-class recognition problem, and we achieved 93.90% and 83.61% recognition accuracy on its training and test sets, respectively.
"Direction Code Based Features for Recognition of Online Handwritten Characters of Bangla." U. Bhattacharya, B. K. Gupta, S. K. Parui. ICDAR 2007. doi:10.1109/ICDAR.2007.100
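Direction-code features of the kind described above are typically computed by quantizing the direction of consecutive pen points along a stroke. The following is a generic sketch of that step, assuming 8 direction bins and a counter-clockwise convention; the paper's exact code set and any subsequent normalization are not reproduced here.

```python
import math

def direction_codes(points, n_dirs=8):
    # Quantize the direction of each consecutive pen-point pair into one
    # of n_dirs codes (code 0 = east, increasing counter-clockwise).
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.atan2(y1 - y0, x1 - x0) % (2 * math.pi)
        codes.append(int(round(angle / (2 * math.pi / n_dirs))) % n_dirs)
    return codes
```

A stroke moving right and then up, e.g. `[(0, 0), (1, 0), (1, 1)]`, yields the code sequence `[0, 2]`.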
We present a statistical approach to skew detection in which the textual features of a document image are modeled as a mixture of straight lines in Gaussian noise. The EM algorithm is used to estimate the parameters of the mixture model, and the skew angle estimate is extracted from the estimated parameters. Experiments show that our method has advantages over existing methods in terms of accuracy and efficiency.
"An EM Based Algorithm for Skew Detection." A. Egozi, I. Dinstein, J. Chapran, M. Fairhurst. ICDAR 2007. doi:10.1109/ICDAR.2007.52
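A minimal sketch of the line-mixture idea: model feature points as lying on parallel lines y = m·x + bₖ that share one slope m (the skew) with Gaussian vertical noise, and alternate E and M steps. This is an assumed simplification for illustration; the paper's exact parameterization, initialization, and noise model may differ.

```python
import numpy as np

def em_skew(points, n_lines=2, n_iter=50, sigma=1.0):
    # EM for a mixture of parallel lines y = m*x + b_k with a shared
    # slope m (the skew) and Gaussian vertical noise of scale sigma.
    x, y = points[:, 0], points[:, 1]
    m = 0.0
    b = np.linspace(y.min(), y.max(), n_lines)   # spread initial intercepts
    for _ in range(n_iter):
        # E-step: responsibility of each line component for each point.
        resid = y[:, None] - (m * x[:, None] + b[None, :])
        log_p = -0.5 * (resid / sigma) ** 2
        r = np.exp(log_p - log_p.max(axis=1, keepdims=True))
        r /= r.sum(axis=1, keepdims=True)
        # M-step: weighted least squares for intercepts, then shared slope.
        w = r.sum(axis=0)
        b = (r * (y[:, None] - m * x[:, None])).sum(axis=0) / w
        num = (r * (x[:, None] * (y[:, None] - b[None, :]))).sum()
        den = (r * (x[:, None] ** 2)).sum()
        m = num / den
    return np.degrees(np.arctan(m))   # skew angle in degrees
```

On two clean parallel lines with a 5-degree slope, the estimate converges to 5 degrees within a few iterations.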
Ming Ye, Paul A. Viola, Sashi Raghupathy, H. Sutanto, Chengyang Li
This paper proposes a machine learning approach to grouping problems in ink parsing. Starting from an initial segmentation, hypotheses are generated by perturbing local configurations and processed in a high-confidence-first fashion, where the confidence of each hypothesis is produced by a data-driven AdaBoost decision-tree classifier with a set of intuitive features. This framework has been successfully applied to grouping text lines and regions in complex freeform digital ink notes from real TabletPC users. It holds great potential for solving many other grouping problems in the ink parsing and document image analysis domains.
"Learning to Group Text Lines and Regions in Freeform Handwritten Notes." Ming Ye, Paul A. Viola, Sashi Raghupathy, H. Sutanto, Chengyang Li. ICDAR 2007. doi:10.1109/ICDAR.2007.159
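The high-confidence-first processing order above is essentially a greedy schedule over scored hypotheses, which a max-heap captures directly. The sketch below assumes hypotheses arrive as (confidence, hypothesis) pairs and that a caller-supplied predicate decides whether a popped hypothesis is still compatible with earlier decisions; both names are illustrative.

```python
import heapq

def process_high_confidence_first(hypotheses, apply_fn):
    # Greedy high-confidence-first scheduling: repeatedly pop the most
    # confident remaining hypothesis and apply it if still compatible.
    # `hypotheses` is a list of (confidence, hypothesis) pairs; the index
    # in the tuple breaks ties without comparing hypothesis objects.
    heap = [(-conf, i, h) for i, (conf, h) in enumerate(hypotheses)]
    heapq.heapify(heap)
    applied = []
    while heap:
        _, _, h = heapq.heappop(heap)
        if apply_fn(h):          # apply_fn returns False when the hypothesis
            applied.append(h)    # conflicts with decisions already made
    return applied
```

With a trivial always-accept predicate, the hypotheses are simply applied in decreasing confidence order.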
In this paper we present a system for the recognition of off-line handwritten characters of Devnagari, the most popular script in India. The features used for recognition are mainly based on directional information obtained from the arc tangent of the gradient. To compute the features, a 2×2 mean filter is first applied 4 times to the gray-level image, and a non-linear size normalization is performed. The normalized image is then segmented into 49×49 blocks, and a Roberts filter is applied to obtain the gradient image. Next, the arc tangent of the gradient (the gradient direction) is quantized into 32 directions, and the gradient strength is accumulated for each quantized direction. Finally, the blocks and directions are down-sampled with a Gaussian filter to obtain a 392-dimensional feature vector. A modified quadratic classifier is applied to these features for recognition. We used 36,172 handwritten samples to test our system and obtained 94.24% accuracy with a 5-fold cross-validation scheme.
"Off-Line Handwritten Character Recognition of Devnagari Script." U. Pal, N. Sharma, T. Wakabayashi, F. Kimura. ICDAR 2007. doi:10.1109/ICDAR.2007.189
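The gradient-direction feature described above can be sketched as: compute Roberts cross gradients, quantize the gradient angle into 32 bins, and accumulate the gradient strength per bin. This sketch covers only the histogram step; the block partitioning and Gaussian down-sampling to the 392-dimensional vector are omitted, and the function name is illustrative.

```python
import numpy as np

def gradient_direction_histogram(img, n_dirs=32):
    # Roberts cross gradients on a gray-level image: each 2x2 window
    # yields one diagonal and one anti-diagonal difference.
    img = img.astype(float)
    gx = img[:-1, :-1] - img[1:, 1:]    # main-diagonal difference
    gy = img[:-1, 1:] - img[1:, :-1]    # anti-diagonal difference
    mag = np.hypot(gx, gy)              # gradient strength
    ang = np.arctan2(gy, gx) % (2 * np.pi)
    # Quantize the direction into n_dirs bins and accumulate strength.
    bins = (ang / (2 * np.pi / n_dirs)).astype(int) % n_dirs
    hist = np.zeros(n_dirs)
    np.add.at(hist, bins.ravel(), mag.ravel())
    return hist
```

In the full pipeline, one such histogram per block is computed, then smoothed and down-sampled to form the final feature vector.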