Proceedings of the Fourth International Conference on Document Analysis and Recognition: Latest Articles

Evaluating OCR and non-OCR text representations for learning document classifiers
Markus Junker, R. Hoch
In the literature, many feature types and learning algorithms have been proposed for document classification. However, an extensive and systematic evaluation of the various approaches has not been done yet. In order to investigate different text representations for document classification, we have developed a tool which transforms documents into feature-value representations that are suitable for standard learning algorithms. In this paper, we investigate seven document representations for German texts based on n-grams and single words. We compare their effectiveness in classifying OCR texts and the corresponding correct ASCII texts in two domains: business letters and abstracts of technical reports. Our results indicate that the use of n-grams is an attractive technique which can even compare to techniques relying on a morphological analysis. This holds for OCR texts as well as for correct ASCII texts.
DOI: https://doi.org/10.1109/ICDAR.1997.620671
Published: 1997-08-18
Citations: 18
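The n-gram representation mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the seven representations or the tool's interface, so the function name `char_ngrams`, the choice of character trigrams, and the lowercasing and padding steps are all assumptions made for illustration only.

```python
from collections import Counter


def char_ngrams(text: str, n: int = 3) -> Counter:
    """Count overlapping character n-grams in a text.

    Hypothetical helper: the normalization (lowercasing, collapsing
    whitespace, padding with one space on each side) is an assumption,
    not a step taken from the paper.
    """
    normalized = " ".join(text.lower().split())
    padded = f" {normalized} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def shared_ngrams(a: str, b: str, n: int = 3) -> int:
    """Number of n-gram occurrences two texts have in common."""
    return sum((char_ngrams(a, n) & char_ngrams(b, n)).values())
```

One plausible reason n-grams suit OCR text is that a single misrecognized character only disturbs the n-grams that overlap it. For example, `shared_ngrams("Rechnung", "Rechnnng")` still finds several common trigrams, whereas a single-word feature would treat the two strings as unrelated.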
An evolutionary neuro-fuzzy approach to recognize on-line Arabic handwriting
A. Alimi
The author describes a system that recognizes on-line Arabic cursive handwriting. In this system, a genetic algorithm is used to select the best combination of characters recognized by a fuzzy neural network. The handwritten words used in this system are modelled by a theory of movement generation. Based on this motor theory, the features extracted from each character are the neuro-physiological and biomechanical parameters of the equation describing the curvilinear velocity of the script. The evolutionary approach proposed permits the recognition of cursive handwriting with a segmentation procedure allowing overlapped strokes having neuro-physiological meaning.
DOI: https://doi.org/10.1109/ICDAR.1997.619875
Published: 1997-08-18
Citations: 98
Layout and language: preliminary investigations in recognizing the structure of tables
Matthew F. Hurst, Shona Douglas
Describes a prototype system for assigning table cells to their proper place in the logical structure of the table, based on a simple model of table structure combined with a number of measures of cohesion between cells. A framework is presented for examining the effect of particular variables on the performance of the system, and preliminary results are presented showing the effect of cohesion measures based on the simplest domain-independent analyses, with the aim of allowing future comparison with more knowledge-intensive analyses based on natural language processing. These baseline results suggest that very simple string-based cohesion measures are not sufficient to support the extraction of tuples as we require. Future work will pursue the aim of more adequate approximations to a notional subtype/supertype definition of the relationship between value cells and label cells.
DOI: https://doi.org/10.1109/ICDAR.1997.620668
Published: 1997-08-18
Citations: 42
High accuracy handwritten Chinese character recognition by improved feature matching method
Cheng-Lin Liu, In-Jung Kim, J. H. Kim
Proposes some strategies to improve the recognition performance of a feature matching method for handwritten Chinese character recognition (HCCR). Favorable modifications are given to all stages throughout the recognition. In pre-processing, we devised a modified nonlinear normalization algorithm and a connectivity-preserving smoothing algorithm. For feature extraction, an efficient directional decomposition algorithm and a systematic approach to design a blurring mask are presented. Finally, a modified LVQ3 algorithm is applied to optimize the reference vectors for classification. The integrated effect of these strategies significantly improves the recognition performance. Recognition results on the large-vocabulary databases ETL8B2 and ETL9B are promising.
DOI: https://doi.org/10.1109/ICDAR.1997.620666
Published: 1997-08-18
Citations: 31
Revealing the hidden Markov recognizer
Claus Aufmuth
The article describes a tool for visualizing hidden Markov recognizers (HMR) which allows the developer to get a detailed view of the recognition process. Improvements are suggested for a hidden Markov recognizer using an appropriate processing and visualization tool.
DOI: https://doi.org/10.1109/ICDAR.1997.620563
Published: 1997-08-18
Citations: 2
Moby Dick meets GEOCR: lexical considerations in word recognition
A. Spitz
The author has previously (Proc. Int. Conf. on Doc. Anal. and Recognition, Montreal, pp. 723-728, 1995) described a high-speed, lexically driven OCR called GEOCR (Good Enough Optical Character Recognition). This paper expands on that work by describing the effects of lexical content, structure and processing on the performance of GEOCR as a word recognition engine, describing the recognition of a particular text, Moby Dick. Word recognition performance is shown to be enhanced by the application of an appropriate lexicon. Recognition speed is essentially independent of the details of lexical content, provided that the intersection of the occurrences of words in the document and the lexicon is high. Word recognition accuracy is dependent on both the intersection and specificity of the lexicon.
DOI: https://doi.org/10.1109/ICDAR.1997.619845
Published: 1997-08-18
Citations: 12
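The "intersection" between a document's words and the lexicon, which the abstract above ties to both speed and accuracy, can be made concrete with a small sketch. The abstract gives no formula, so measuring it as the fraction of word tokens found in the lexicon is an assumption, as are the function name `lexicon_coverage` and the simple tokenizer.

```python
import re


def lexicon_coverage(document: str, lexicon: set) -> float:
    """Fraction of word occurrences in the document that appear in the lexicon.

    Assumed definition for illustration: tokens are lowercase runs of
    letters and apostrophes, and every occurrence counts, so frequent
    words weigh more than rare ones.
    """
    tokens = re.findall(r"[a-z']+", document.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in lexicon)
    return hits / len(tokens)


# Hypothetical usage with a tiny lexicon.
coverage = lexicon_coverage("Call me Ishmael. Call me.", {"call", "me"})
```

Counting occurrences rather than distinct words follows the abstract's phrase "occurrences of words in the document"; a type-based variant would divide distinct matched words by distinct words instead.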
Scalable image coding by spline approximation for a gray-scale image
R. Haruki, T. Horiuchi
The proposed method expresses a gray-scale image by parametric spline functions for edge components and by two-variable spline functions for low-frequency components. It can reconstruct the image while keeping its quality under basic shape transformations. If a binary image is input as a special case, the proposed method can make a scalable vector font automatically. The performance of the proposed method is verified by some experiments.
DOI: https://doi.org/10.1109/ICDAR.1997.619879
Published: 1997-08-18
Citations: 3
Recognition of facsimile documents using a database of robust features
G. Raza, A. Hennig, N. Sherkat, R. Whitrow
A method for the recognition of poor quality documents containing touching characters is presented. The method is based on extraction of independent and robust features of each object of a sample word, where objects consist of single letters or of several touching ones. By avoiding letter segmentation, the method eliminates errors frequently introduced in segmentation-based approaches. Features are attributed by their position and extent in order to facilitate discrimination between different classes of objects. A method for automatic construction of a comprehensive database is presented. From a given dictionary every possible letter combination is obtained and the images of the artificially touching letters are created. These images are subjected to noise and their features are extracted. For recognition, alternatives for each object are found based on the database. Object alternatives are then combined into valid word alternatives using lexicon lookup. It has been observed that the developed method is effective for the recognition of poor quality documents.
DOI: https://doi.org/10.1109/ICDAR.1997.619886
Published: 1997-08-18
Citations: 5
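The final step described in the abstract above, combining per-object alternatives into valid words by lexicon lookup, can be sketched as follows. The data layout (one list of candidate strings per object, where a candidate may be a single letter or several touching letters) and the function name `combine_alternatives` are assumptions; the abstract does not describe how the authors rank or prune candidates.

```python
from itertools import product


def combine_alternatives(object_alternatives, lexicon):
    """Join one candidate per object and keep combinations found in the lexicon.

    object_alternatives: list of lists of candidate strings, one inner
    list per object in the word image (hypothetical layout).
    lexicon: set of valid words.
    """
    words = []
    for combination in product(*object_alternatives):
        candidate = "".join(combination)
        if candidate in lexicon and candidate not in words:
            words.append(candidate)
    return words


# Hypothetical usage: the second object is a pair of touching letters.
valid = combine_alternatives([["c", "e"], ["at", "ar"]], {"cat", "car", "ear"})
```

Exhaustive enumeration grows multiplicatively with the number of objects and alternatives, so a practical system would likely prune with a prefix structure such as a trie; that refinement is not taken from the source.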
Construction of retrieval system for pictorial book of flora
Yasuhiko Watanabe, M. Nagao
Pattern information and natural language information used together can complement and reinforce each other to enable more effective communication than can either medium alone. A good example is a pictorial book of flora (PBF). In the PBF, readable explanations combine texts and pictures. However, it is difficult to retrieve explanation text and pictures from the PBF when we don't know the names of flowers. To solve this problem, we propose a retrieval method for the PBF using the color feature of each flower and fruit, and construct an experimental retrieval system for the PBF. For obtaining the color feature of each flower and fruit, we analysed the PBF pictures and found several problems as follows: Pictures of the PBF contain many kinds of objects. In addition to flowers and fruits, there are leaves, stems, skies, soils, and sometimes humans in the PBF pictures. The position, size, and direction of flowers and fruits vary quite widely in each picture. Each flower and fruit has its unique shape, color, and texture which are commonly different from those of the others. Because of these problems, it is difficult to build the general and precise model for analyzing the PBF pictures in advance. We propose a method for image analysis using natural language information. Our method works as follows. First, we analyse the PBF explanation texts for extracting the color information on each flower and fruit. Then, we analyse the PBF pictures by using the results of the natural language processing, and finally obtain the color feature of each flower and fruit.
DOI: https://doi.org/10.1109/ICDAR.1997.620653
Published: 1997-08-18
Citations: 1
Skew and slant correction for document images using gradient direction
Changming Sun, Deyi Si
A fast algorithm is presented for skew and slant correction in printed document images. The algorithm employs only the gradient information. The skew angle is obtained by searching for a peak in the histogram of the gradient orientation of the input grey-level image. The skewness of the document is corrected by a rotation at such an angle. The slant of characters can also be detected using the same technique, and can be corrected by a shear operation. A second method for character slant correction by fitting parallelograms to the connected components is also described. Document images with different contents (tables, figures, and photos) have been tested for skew correction, and the algorithm gives accurate results on all the test images. The algorithm is also very easy to implement.
DOI: https://doi.org/10.1109/ICDAR.1997.619830
Published: 1997-08-18
Citations: 93
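The skew estimate described in the abstract above, a peak search in the histogram of gradient orientations, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: central differences for the gradient, magnitude-weighted voting, one-degree bins, and the names `estimate_skew_degrees` and `striped_image` are all choices made here.

```python
import math


def estimate_skew_degrees(image, bins=180):
    """Estimate skew from the peak of a gradient-orientation histogram.

    image: list of rows of grey levels. Angles are measured in image
    coordinates (x to the right, y downwards). The result has a
    resolution of one histogram bin.
    """
    height, width = len(image), len(image[0])
    histogram = [0.0] * bins
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Central-difference gradient (assumed; the abstract names no operator).
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            magnitude = math.hypot(gx, gy)
            if magnitude == 0.0:
                continue
            # Fold orientation into [0, 180) so opposite gradients vote together.
            theta = math.degrees(math.atan2(gy, gx)) % 180.0
            histogram[int(theta * bins / 180.0) % bins] += magnitude
    peak = max(range(bins), key=histogram.__getitem__)
    # Gradients across text lines are perpendicular to the line direction.
    return peak * 180.0 / bins - 90.0


def striped_image(width, height, skew_deg, k=0.5):
    """Synthetic test image: sinusoidal stripes tilted by skew_deg."""
    t = math.radians(skew_deg)
    return [[math.sin(k * (y * math.cos(t) - x * math.sin(t)))
             for x in range(width)] for y in range(height)]


# Hypothetical usage on a synthetic image skewed by five degrees.
angle = estimate_skew_degrees(striped_image(40, 40, 5.0))
```

The abstract then corrects the skew by rotating the image through the estimated angle; that step is omitted here. On real scans, smoothing the histogram before the peak search would probably make the estimate more robust, though the abstract does not say whether the authors do so.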