
Ninth International Conference on Document Analysis and Recognition (ICDAR 2007): Latest Publications

Language-Based Feature Extraction Using Template-Matching in Farsi/Arabic Handwritten Numeral Recognition
M. Ziaratban, K. Faez, F. Faradji
A recognition system based on template matching for identifying handwritten Farsi/Arabic numerals is developed in this paper. Template matching is a fundamental method for detecting the presence of objects in an image and identifying them. In the proposed method, the templates are chosen so that they represent the characteristic features of the Farsi/Arabic writing style as closely as possible. Experimental results show that the proposed language-based method outperforms other common feature extraction approaches. An NM-MLP classifier is trained with 6,000 samples and tested on 4,000 samples. A recognition rate of 97.65% was obtained, which is 0.64% higher than the Zernike moment approach.
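The abstract gives no implementation details, so the following is only a rough numpy sketch of the general idea of template-matching-based feature extraction: each (assumed) stroke template is slid over a binarized digit image and its best normalized cross-correlation score becomes one feature. The template shapes, sizes, and toy data are illustrative assumptions, and the NM-MLP classifier is not reproduced here.

```python
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation between two equally sized arrays."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return (p * t).sum() / denom if denom > 0 else 0.0

def template_features(digit, templates):
    """Feature vector: best match score of each template anywhere in the digit image."""
    h, w = digit.shape
    feats = []
    for tpl in templates:
        th, tw = tpl.shape
        best = -1.0
        for y in range(h - th + 1):
            for x in range(w - tw + 1):
                best = max(best, ncc(digit[y:y + th, x:x + tw], tpl))
        feats.append(best)
    return np.array(feats)

# Toy usage: a 16x16 "digit" and two hypothetical 5x5 stroke templates.
rng = np.random.default_rng(0)
digit = (rng.random((16, 16)) > 0.7).astype(float)
templates = [np.eye(5), np.fliplr(np.eye(5))]   # diagonal stroke shapes (assumed)
print(template_features(digit, templates))      # one score per template, in [-1, 1]
```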
Pub Date : 2007-12-17 DOI: 10.1109/ICDAR.2007.273
Citations: 49
A Method of Annotation Extraction from Paper Documents Using Alignment Based on Local Arrangements of Feature Points
Pub Date : 2007-11-12 DOI: 10.1109/ICDAR.2007.4378669
T. Nakai, K. Kise, M. Iwamura
Annotations on paper documents carry important information, which can be exploited by extracting and analyzing them. In this paper, we propose a method of annotation extraction from paper documents. Unlike previous methods, which restrict the colors or types of annotations that can be extracted, the proposed method removes these limitations by comparing a document image of the annotated document with the image of its original. The proposed method is characterized by fast matching and flexible subtraction of images, both of which are essential for annotation extraction by comparison. Experimental results show that color annotations can be extracted from color documents.
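The paper's alignment uses local arrangements of feature points; the sketch below skips that step and assumes the two binarized pages are already registered, illustrating only the "flexible subtraction" idea: a pixel counts as annotation if it is inked in the annotated page but has no inked original pixel within a small tolerance window. The tolerance radius and toy images are assumptions.

```python
import numpy as np

def dilate(binary, radius=1):
    """Binary dilation with a (2*radius+1)^2 square structuring element."""
    h, w = binary.shape
    out = np.zeros_like(binary)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            shifted = np.zeros_like(binary)
            ys, ye = max(dy, 0), min(h + dy, h)
            xs, xe = max(dx, 0), min(w + dx, w)
            shifted[ys:ye, xs:xe] = binary[ys - dy:ye - dy, xs - dx:xe - dx]
            out |= shifted
    return out

def extract_annotations(annotated, original, tolerance=1):
    """Pixels present in the annotated page but not near any original foreground pixel."""
    return annotated & ~dilate(original, tolerance)

# Toy usage with True = ink, False = background.
original = np.zeros((8, 8), dtype=bool); original[3, 1:7] = True   # printed line
annotated = original.copy(); annotated[5, 2:5] = True               # added underline
print(extract_annotations(annotated, original).astype(int))         # only the underline remains
```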
Citations: 20
Removing Shading Distortions in Camera-based Document Images Using Inpainting and Surface Fitting With Radial Basis Functions
Li Zhang, A. Yip, C. Tan
Shading distortions are often perceived in geometrically distorted document images due to the change of the surface normal with respect to the illumination direction. Such distortions are undesirable because they hamper OCR performance tremendously even when the geometric distortions are corrected. In this paper, we propose an effective method that removes shading distortions in images of documents with various geometric shapes, based on the notion of intrinsic images. We first derive the shading image using an inpainting technique with an automatic mask generation routine, and then apply a surface fitting procedure with radial basis functions to remove pepper noise in the inpainted image and return a smooth shading image. Once the shading image is extracted, the reflectance image can be obtained automatically. Experiments on a wide range of distorted document images demonstrate robust performance. Moreover, we also show potential applications to the restoration of historical handwritten documents.
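As a minimal stand-in for the inpainting-plus-RBF pipeline (not the authors' implementation), the sketch below masks out dark text pixels, fits a smooth shading surface to the remaining background with Gaussian radial basis functions on a coarse grid via least squares, and divides the image by that surface to estimate the reflectance. The masking threshold, number of centers, and kernel width are assumptions.

```python
import numpy as np

def fit_rbf_shading(img, n_centers=4, sigma=None):
    """Fit a smooth shading surface to non-text pixels with Gaussian RBFs + lstsq."""
    h, w = img.shape
    mask = img > 0.3                             # crude text mask for this toy (assumed threshold)
    ys, xs = np.nonzero(mask)
    pts = np.stack([ys / h, xs / w], axis=1)     # normalized coordinates of background samples
    cy, cx = np.meshgrid(np.linspace(0, 1, n_centers), np.linspace(0, 1, n_centers))
    centers = np.stack([cy.ravel(), cx.ravel()], axis=1)
    sigma = sigma or 1.0 / n_centers
    def design(p):
        d2 = ((p[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        return np.hstack([np.exp(-d2 / (2 * sigma ** 2)), np.ones((len(p), 1))])
    coef, *_ = np.linalg.lstsq(design(pts), img[ys, xs], rcond=None)
    gy, gx = np.meshgrid(np.arange(h) / h, np.arange(w) / w, indexing="ij")
    grid = np.stack([gy.ravel(), gx.ravel()], axis=1)
    return (design(grid) @ coef).reshape(h, w)

def remove_shading(img):
    shading = np.clip(fit_rbf_shading(img), 1e-3, None)
    return np.clip(img / shading, 0, 1)          # reflectance estimate

# Toy usage: flat page with one text line, darkened toward the right edge.
h, w = 64, 64
page = np.ones((h, w)); page[20:24, 10:50] = 0.1            # a "text line"
shaded = page * np.linspace(1.0, 0.5, w)[None, :]           # synthetic shading
flat = remove_shading(shaded)
print(float(flat[5, 5]), float(flat[5, 60]))                # roughly equal, both near 1.0
```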
Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.217
Citations: 17
New Strategy for the On-Line Handwriting Modelling
H. Boubaker, M. Kherallah, A. Alimi
In this article, we first present arguments supporting the approximation of a cursive handwriting trajectory by arcs of ellipses. We then introduce a new strategy that improves the dynamic and geometric features of online handwritten trajectory modelling. We show that the curvilinear velocity can be reconstructed as the superposition of two components, named the "Beta" model and the "Carrying" dragged component. Finally, we integrate the geometric characteristics, represented as arcs of ellipses, into the layout modelling.
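The beta-elliptic formulation is only hinted at in the abstract; the sketch below shows a generic beta-shaped velocity profile and a superposition of two overlapping strokes on top of a slowly varying stand-in for the "Carrying" component. All parameter values, and the exact form of the carrying term, are assumptions.

```python
import numpy as np

def beta_profile(t, t0, t1, p, q, amplitude=1.0):
    """Beta-shaped velocity profile, nonzero only on (t0, t1), peaking at tc."""
    tc = (p * t1 + q * t0) / (p + q)             # instant of peak velocity
    v = np.zeros_like(t)
    inside = (t > t0) & (t < t1)
    v[inside] = (((t[inside] - t0) / (tc - t0)) ** p *
                 ((t1 - t[inside]) / (t1 - tc)) ** q)
    return amplitude * v

# Curvilinear velocity of a two-stroke trace: overlapping beta pulses
# superposed on a slowly varying "carrying" component (assumed form).
t = np.linspace(0.0, 1.0, 200)
carrying = 0.2 + 0.05 * np.sin(2 * np.pi * t)                    # assumption
velocity = (carrying
            + beta_profile(t, 0.05, 0.55, p=3.0, q=3.0, amplitude=1.0)
            + beta_profile(t, 0.45, 0.95, p=3.0, q=3.0, amplitude=0.8))
print(float(velocity.max()), float(velocity[0]))                 # peak vs. start value
```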
Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.177
Citations: 21
Contribution of Multiresolution Description for Archive Document Structure Recognition
Aurélie Lemaitre, J. Camillerapp, Bertrand Coüasnon
When reading a document, we intuitively take a first global view to determine its overall structure before reading its parts in detail. We propose to apply the same kind of mechanism by introducing the concept of multiresolution into an existing generic method for structured document recognition. This new combination of different vision levels makes it possible to recognize weakly structured documents. We present our work on an example: the multiresolution description of archive documents, namely naturalization decree registers from the 19th and 20th centuries. Validation was carried out on 85,088 images. Integrated into a platform for archive documents, the located elements offer users a fast way to leaf through the naturalization decrees.
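The grammar-based recognizer itself is not reproduced; the sketch below only illustrates the multiresolution idea under simple assumptions: build a coarse view of the page by block averaging, find candidate dark horizontal bands from the coarse row profile, then refine each candidate's position at full resolution.

```python
import numpy as np

def downsample(img, factor):
    """Block-average the image by an integer factor (a crude low-resolution view)."""
    h, w = img.shape
    h2, w2 = h - h % factor, w - w % factor
    return img[:h2, :w2].reshape(h2 // factor, factor, w2 // factor, factor).mean(axis=(1, 3))

def coarse_to_fine_rows(img, factor=4, threshold=0.9):
    """Find dark horizontal bands at low resolution, then refine at full resolution."""
    coarse = downsample(img, factor)
    candidates = np.nonzero(coarse.mean(axis=1) < threshold)[0]    # dark coarse rows
    refined = []
    for r in candidates:
        band = img[r * factor:(r + 1) * factor]                    # full-resolution slice
        refined.append(r * factor + int(np.argmin(band.mean(axis=1))))
    return refined

# Toy usage: white page (1.0) with one thin dark rule line at row 10.
page = np.ones((32, 32))
page[10, :] = 0.0
print(coarse_to_fine_rows(page, factor=4))    # -> [10]
```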
Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.92
Citations: 12
An Effective and Practical Classifier Fusion Strategy for Improving Handwritten Character Recognition
Qiang Fu, X. Ding, T. Li, C. Liu
In this paper, we propose a classifier fusion strategy that trains MQDF (modified quadratic discriminant function) classifiers in a cascade structure and combines the classifiers at the measurement level to improve handwritten character recognition performance. A generalized confidence is introduced to compute recognition scores, and maximum-rule fusion is applied. The proposed fusion strategy is practical and effective. Its performance is evaluated by handwritten Chinese character recognition experiments on different databases. Experimental results show that the proposed algorithm achieves at least a 10% reduction in classification error, and an even higher 24% reduction on poor-quality samples.
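MQDF training is outside the scope of this sketch; it assumes each classifier already returns per-class scores, uses a softmax as a stand-in for the paper's generalized confidence, combines classifiers with the maximum rule at the measurement level, and shows the cascade idea of consulting later classifiers only when the earlier ones are not confident. The acceptance threshold and toy score functions are assumptions.

```python
import numpy as np

def confidence(scores):
    """Map raw per-class scores to [0, 1] confidences (softmax stand-in, an assumption)."""
    e = np.exp(scores - scores.max())
    return e / e.sum()

def max_rule_fusion(score_lists):
    """Measurement-level fusion: per class, keep the maximum confidence over classifiers."""
    confs = np.stack([confidence(s) for s in score_lists])   # (n_classifiers, n_classes)
    fused = confs.max(axis=0)
    return int(fused.argmax()), float(fused.max())

def cascade_classify(sample, classifiers, accept=0.9):
    """Run classifiers in order; stop as soon as one is confident, else fuse all."""
    collected = []
    for clf in classifiers:
        scores = clf(sample)
        collected.append(scores)
        c = confidence(scores)
        if c.max() >= accept:                     # early exit for easy samples
            return int(c.argmax()), float(c.max())
    return max_rule_fusion(collected)             # hard sample: combine everything

# Toy usage with two hypothetical score functions over 3 classes.
clf_a = lambda x: np.array([2.0, 0.5, 0.1])      # fairly confident about class 0
clf_b = lambda x: np.array([1.0, 1.1, 0.2])      # nearly undecided
print(cascade_classify(None, [clf_a, clf_b]))    # -> (0, ~0.73)
```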
Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.48
Citations: 7
Writer Identification in Handwritten Documents
I. Siddiqi, N. Vincent
This work presents an effective method for writer identification in handwritten documents. We have developed a local approach based on the extraction of characteristics that are specific to a writer. To exploit the redundant patterns within a handwriting, the writing is divided into a large number of small sub-images, and the sub-images that are morphologically similar are grouped into the same classes. The patterns that occur frequently for a writer are thus extracted. The author of an unknown document is then identified by a Bayesian classifier. The system, trained and tested on 50 documents from as many authors, reported an identification rate of 94%.
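The sketch below is a simplified stand-in for the described pipeline: small handwriting windows (here just synthetic vectors) are grouped with a tiny k-means in place of the morphological grouping, each writer is summarized by a smoothed histogram of group frequencies, and a query document is assigned to the writer with the highest multinomial (naive Bayes) log-likelihood. Window size, codebook size, and the data are assumptions.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Tiny k-means returning centroids (a stand-in for the morphological grouping)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids

def assign(X, centroids):
    return np.argmin(((X[:, None] - centroids[None]) ** 2).sum(-1), axis=1)

def writer_histograms(docs, centroids, k):
    """One smoothed pattern-frequency histogram per writer (Laplace smoothing)."""
    hists = []
    for windows in docs:                                   # windows: (n, d) per writer
        counts = np.bincount(assign(windows, centroids), minlength=k) + 1.0
        hists.append(counts / counts.sum())
    return np.array(hists)

def identify(query_windows, hists, centroids):
    labels = assign(query_windows, centroids)
    loglik = np.log(hists[:, labels]).sum(axis=1)          # multinomial log-likelihood
    return int(loglik.argmax())

# Toy usage: 2 writers, windows flattened to 4-D vectors (purely synthetic data).
rng = np.random.default_rng(1)
writer0 = rng.normal(0.0, 0.3, (50, 4))
writer1 = rng.normal(1.0, 0.3, (50, 4))
centroids = kmeans(np.vstack([writer0, writer1]), k=4)
hists = writer_histograms([writer0, writer1], centroids, k=4)
print(identify(rng.normal(1.0, 0.3, (20, 4)), hists, centroids))   # -> 1 (second writer)
```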
Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.270
Citations: 72
Camera-Based Graphical Symbol Detection
Marçal Rusiñol, J. Lladós, P. Dosch
In this paper we present a method to locate and recognize graphical symbols appearing in real images. A vectorial signature is defined to describe graphical symbols. It is formulated in terms of accumulated length and angular information computed from a polygonal approximation of the contours. The proposed method aims to locate and recognize graphical symbols in cluttered environments simultaneously, without needing a segmentation step. The symbol signature is tolerant to rotation, scaling, translation, and to distortions such as weak perspective, blurring, and illumination changes that are usually present in scenes acquired with low-resolution cameras in open environments.
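The exact signature is not specified in the abstract; the sketch below builds one plausible variant from a polygonal approximation: histograms of turning angles and of relative segment lengths, weighted so the descriptor is unchanged by translation and uniform scaling, compared with an L1 distance. Bin counts and the toy shapes are assumptions.

```python
import numpy as np

def vectorial_signature(polygon, n_bins=8):
    """Concatenated histograms of turning angles and relative segment lengths."""
    pts = np.asarray(polygon, dtype=float)
    seg = np.diff(np.vstack([pts, pts[:1]]), axis=0)          # closed-contour segments
    length = np.linalg.norm(seg, axis=1)
    rel = length / length.sum()                               # scale-invariant weights
    angle = np.arctan2(seg[:, 1], seg[:, 0])
    turning = np.mod(np.diff(np.concatenate([angle, angle[:1]])) + np.pi,
                     2 * np.pi) - np.pi                       # relative angles in (-pi, pi]
    a_hist = np.histogram(turning, bins=n_bins, range=(-np.pi, np.pi), weights=rel)[0]
    l_hist = np.histogram(rel, bins=n_bins, range=(0.0, 1.0), weights=rel)[0]
    return np.concatenate([a_hist, l_hist])

def signature_distance(a, b):
    return float(np.abs(a - b).sum())                         # simple L1 comparison

# Toy usage: a square versus a scaled, translated copy and versus a thin rectangle.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
big_square = [(10, 10), (14, 10), (14, 14), (10, 14)]
rectangle = [(0, 0), (4, 0), (4, 1), (0, 1)]
s1, s2, s3 = map(vectorial_signature, (square, big_square, rectangle))
print(signature_distance(s1, s2), signature_distance(s1, s3))  # copy matches; rectangle does not
```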
Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.76
Citations: 5
A New Physically Motivated Warping Model for Form Drop-Out
G. Rosman, A. Tzadok, D. Tal
Documents scanned by sheet-fed scanners often exhibit distortions due to the feeding and scanning mechanism. This paper presents a new model motivated by the distortions observed in such documents. Numerical problems affecting the use of this model are addressed with an approximate model that is easier to estimate correctly. We demonstrate results showing the robustness and accuracy of this model on sheet-fed scanner output, and relate it to existing techniques for registration and drop-out of structured forms.
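The paper's physically motivated warp is not reproduced here; as an assumed surrogate, the sketch fits a low-order polynomial mapping from scanned row positions to template row positions using a few matched reference rows, resamples the scan accordingly, and then drops out the preprinted form by keeping only ink that the aligned template does not explain. The warp form, matched points, and tolerance are assumptions.

```python
import numpy as np

def fit_row_warp(rows_template, rows_scanned, degree=1):
    """Least-squares polynomial mapping scanned row -> template row (assumed warp form)."""
    return np.polyfit(rows_scanned, rows_template, degree)

def unwarp_rows(img, coeffs):
    """Resample each row of the scan at its estimated template position (nearest neighbor)."""
    h, w = img.shape
    out = np.ones_like(img)
    for y in range(h):
        ty = int(round(np.polyval(coeffs, y)))
        if 0 <= ty < h:
            out[ty] = img[y]
    return out

def drop_out(scanned, template, coeffs, tol=0.2):
    """Remove preprinted form pixels: keep ink the unwarped template does not explain."""
    aligned = unwarp_rows(scanned, coeffs)
    return np.where(np.abs(aligned - template) > tol, aligned, 1.0)

# Toy usage: template with a rule line at row 10; the scan is shifted down by 2 rows
# and carries handwriting at (22, 5:9). Two matched reference rows give the warp.
h, w = 32, 32
template = np.ones((h, w)); template[10, :] = 0.0
scanned = np.ones((h, w)); scanned[12, :] = 0.0; scanned[22, 5:9] = 0.0
coeffs = fit_row_warp(rows_template=np.array([10.0, 25.0]),
                      rows_scanned=np.array([12.0, 27.0]))
filled_only = drop_out(scanned, template, coeffs)
print(filled_only[10].min(), filled_only[20].min())   # rule line removed (1.0), writing kept (0.0)
```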
Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.25
Citations: 1
A Shared Parts Model for Document Image Recognition
Mithun Das Gupta, Prateek Sarkar
We address document image classification by visual appearance. An image is represented by a variable-length list of visually salient features. A hierarchical Bayesian network is used to model the joint density of these features. This model promotes generalization from a few samples by sharing component probability distributions among different categories, and by factoring out a common displacement vector shared by all features within an image. The Bayesian network is implemented as a factor graph, and parameter estimation and inference are both done by loopy belief propagation. We explain and illustrate our model on a simple shape classification task. We obtain close to 90% accuracy in distinguishing journal articles from memos in the UWASH-II dataset, as well as on other classification tasks on a home-grown dataset of technical articles.
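The factor graph and loopy belief propagation are not reproduced; the sketch below keeps only two ingredients named in the abstract: part distributions shared across categories (with category-specific mixing weights) and a per-image displacement factored out by centering feature positions, then classifies a variable-length feature list by its mixture log-likelihood. All means, variances, weights, and category names here are illustrative assumptions.

```python
import numpy as np

# Shared part distributions: K isotropic Gaussians over (x, y, appearance) feature vectors.
# Means, variance, and per-category mixing weights are purely illustrative.
PART_MEANS = np.array([[-0.3, 0.0, 0.2], [0.3, 0.0, 0.8], [0.0, 0.4, 0.5]])
PART_VAR = 0.05
CATEGORY_WEIGHTS = {"article": np.array([0.6, 0.2, 0.2]),
                    "memo": np.array([0.1, 0.7, 0.2])}

def log_gauss(x, mean, var):
    d = x.shape[-1]
    return -0.5 * (((x - mean) ** 2).sum(-1) / var + d * np.log(2 * np.pi * var))

def score(features, weights):
    """Log-likelihood of a variable-length feature list under one category."""
    f = np.array(features, dtype=float)
    f[:, :2] -= f[:, :2].mean(axis=0)            # factor out the common displacement
    comp = np.stack([np.log(w) + log_gauss(f, m, PART_VAR)
                     for w, m in zip(weights, PART_MEANS)])      # (K, n_features)
    return float(np.logaddexp.reduce(comp, axis=0).sum())        # sum over features

def classify(features):
    return max(CATEGORY_WEIGHTS, key=lambda c: score(features, CATEGORY_WEIGHTS[c]))

# Toy usage: salient features as (x, y, appearance), shifted by an arbitrary page offset.
doc = [(4.7, 9.0, 0.21), (5.3, 9.0, 0.19), (5.0, 9.4, 0.52)]     # mostly "part 0"-like
print(classify(doc))                                             # -> article
```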
Pub Date : 2007-09-23 DOI: 10.1109/ICDAR.2007.34
Citations: 6