Latest publications from the 2016 12th IAPR Workshop on Document Analysis Systems (DAS)
Historical Document Dating Using Unsupervised Attribute Learning
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.38
Sheng He, P. Samara, J. Burgers, Lambert Schomaker
The date of a historical document is important metadata for scholars, who need to know the document's historical context. This paper presents a novel attribute representation for medieval documents that automatically estimates date information, i.e., the year in which a document was written. Non-semantic attributes are discovered in the low-level feature space using an unsupervised attribute learning method. A negative data set is included in the attribute learning to ensure that the system rejects documents that are neither from the Middle Ages nor from the same archives. Experimental results on the Medieval Paleographic Scale (MPS) data set demonstrate that the proposed method achieves state-of-the-art results.
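The unsupervised attribute discovery described above can be sketched, hypothetically, as clustering low-level descriptors and then representing each document by its attribute histogram. The paper's actual features and learning procedure are not reproduced here; `learn_attributes`, `attribute_histogram`, and the random data are illustrative stand-ins:

```python
import numpy as np
from sklearn.cluster import KMeans

def learn_attributes(patches, n_attributes=8, seed=0):
    """Discover non-semantic 'attributes' as clusters in a low-level feature space."""
    km = KMeans(n_clusters=n_attributes, n_init=10, random_state=seed)
    km.fit(patches)
    return km

def attribute_histogram(km, doc_patches):
    """Represent one document by the normalized histogram of its patch attributes."""
    labels = km.predict(doc_patches)
    hist = np.bincount(labels, minlength=km.n_clusters).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(0)
train = rng.normal(size=(500, 16))          # stand-in for low-level descriptors
km = learn_attributes(train)
h = attribute_histogram(km, rng.normal(size=(40, 16)))
```

A year estimate could then, for example, be taken from the labeled training document whose histogram is nearest to `h`.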
Cited by: 12
RNN Based Uyghur Text Line Recognition and Its Training Strategy
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.20
Pengchao Li, Jiadong Zhu, Liangrui Peng, Yunbiao Guo
Uyghur is written in a modified Arabic script. Due to its cursive nature and the lack of sufficient labeled training samples, Uyghur document recognition remains a challenging problem. In this paper, we propose a new Recurrent Neural Network (RNN) based Uyghur text line recognition method that combines a Gated Recurrent Unit (GRU) and a Restricted Boltzmann Machine (RBM) with a pretraining mechanism. We also present a novel curriculum learning technique guided by sample distribution information. Experimental results on a practical Uyghur printed document image dataset show that the proposed network architecture and training strategy not only achieve better recognition accuracy than traditional methods but also accelerate training.
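For context on the GRU component, a single GRU recurrence step follows the standard update-gate/reset-gate equations. This NumPy sketch is generic, not the authors' network; the dimensions and `gru_step` helper are illustrative:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, params):
    """One GRU step: update gate z, reset gate r, candidate state h_tilde."""
    Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh = params
    z = sigmoid(Wz @ x + Uz @ h + bz)
    r = sigmoid(Wr @ x + Ur @ h + br)
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h) + bh)
    return (1 - z) * h + z * h_tilde    # convex blend of old and candidate state

rng = np.random.default_rng(1)
d_in, d_h = 8, 16
params = [rng.normal(scale=0.1, size=s)
          for s in [(d_h, d_in), (d_h, d_h), (d_h,)] * 3]
h = np.zeros(d_h)
for t in range(5):                      # unroll over a short feature sequence
    h = gru_step(rng.normal(size=d_in), h, params)
```

In a text line recognizer, such a cell would be unrolled over sliding-window features of the line image and trained with a sequence loss such as CTC.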
Cited by: 9
Delaunay Triangulation-Based Features for Camera-Based Document Image Retrieval System
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.66
Quoc Bao Dang, Marçal Rusiñol, Mickaël Coustaty, M. Luqman, De Cao Tran, J. Ogier
In this paper, we propose a new feature vector, named DElaunay TRIangulation-based Features (DETRIF), for real-time camera-based document image retrieval. DETRIF is computed from the geometrical constraints of each pair of adjacent triangles in a Delaunay triangulation constructed from the centroids of connected components. In addition, we employ a hashing-based indexing system to evaluate the performance of DETRIF and to compare it with other systems such as LLAH and SRIF. The experiments are carried out on two datasets comprising 400 complex linguistic map images with heterogeneous content (at a large resolution of 9800 × 11768 pixels) and 700 textual document images.
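The adjacency structure that DETRIF builds on can be obtained directly with SciPy: triangulate the connected-component centroids and enumerate edge-sharing triangle pairs. The centroids below are random stand-ins, and the actual per-pair geometric constraints of DETRIF are not reproduced:

```python
import numpy as np
from scipy.spatial import Delaunay

# Hypothetical centroids of connected components on a document image.
rng = np.random.default_rng(2)
centroids = rng.uniform(0, 100, size=(30, 2))

tri = Delaunay(centroids)

def adjacent_triangle_pairs(tri):
    """Enumerate pairs of triangles sharing an edge. tri.neighbors holds, per
    edge of each simplex, the index of the adjacent simplex (-1 on the hull)."""
    pairs = set()
    for i, nbrs in enumerate(tri.neighbors):
        for j in nbrs:
            if j != -1:
                pairs.add((min(i, j), max(i, j)))
    return sorted(pairs)

pairs = adjacent_triangle_pairs(tri)
```

Each pair in `pairs` shares exactly two vertices (the common edge), which is the geometry a per-pair feature would be computed from.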
Cited by: 4
Analysis of Stroke Intersection for Overlapping PGF Elements
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.11
Yan Chen, Xiaoqing Lu, J. Qu, Zhi Tang
Query-by-figure is an effective retrieval approach for educational documents. However, the complex geometric diagrams used in mathematics education remain obstacles for current retrieval systems. This study explores a query method for plane geometric figures (PGFs) via figures sketched on smart mobile devices. We adopt an undirected graph model to describe PGFs and a divide-and-conquer strategy to analyze the relationships among strokes. Our main contribution is a detailed analysis of the stroke intersections that frequently occur in PGFs. Numerous accurate elements obtained through overlapping analysis are then selected to construct strong descriptors for PGFs. Only the compressed query features, instead of a query figure, are transmitted to an image-based retrieval system on a remote server, where the sketched PGF is finally recognized with a low-delay response. Experiments show that the proposed method achieves high efficiency and provides users with a good interactive experience.
Cited by: 1
Searching Corrupted Document Collections
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.28
Jason J. Soo, O. Frieder
Historical documents are typically digitized using Optical Character Recognition (OCR). While effective, the results are not always accurate and depend heavily on the input; consequently, degraded documents are often corrupted. Our focus is finding flexible, reliable methods to correct for such degradation in the face of limited resources. We extend our substring- and context-fusion-based retrieval system, known as Segments, to consider metadata. By extracting topics from documents, and by supplementing and weighting our lexicon with co-occurring terms found in documents with those topics, we achieve a statistically significant improvement over the state of the art in all but one test configuration. Our mean reciprocal rank, measured on two free, publicly available, independently judged datasets, is 0.7657 and 0.5382, respectively.
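Mean reciprocal rank, the metric reported above, averages the reciprocal rank of the first relevant result over all queries. A generic sketch with toy query and relevance data:

```python
def mean_reciprocal_rank(ranked_results, relevant):
    """MRR: mean over queries of 1/rank of the first relevant result
    (a query with no relevant result in its ranking contributes 0)."""
    total = 0.0
    for results, rel in zip(ranked_results, relevant):
        for rank, doc in enumerate(results, start=1):
            if doc in rel:
                total += 1.0 / rank
                break
    return total / len(ranked_results)

# Two toy queries: first hit at rank 1 and at rank 4 -> (1 + 0.25) / 2 = 0.625
mrr = mean_reciprocal_rank(
    [["d1", "d2"], ["d9", "d8", "d7", "d3"]],
    [{"d1"}, {"d3"}],
)
```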
Cited by: 1
Page Segmentation for Historical Document Images Based on Superpixel Classification with Unsupervised Feature Learning
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.13
Kai Chen, Cheng-Lin Liu, Mathias Seuret, M. Liwicki, J. Hennebert, R. Ingold
In this paper, we present an efficient page segmentation method for historical document images. Many existing methods either rely on hand-crafted features or run rather slowly because they treat segmentation as a pixel-level assignment problem. To create a method feasible for real applications, we propose using superpixels as the basic units of segmentation, with features learned directly from pixels. An image is first oversegmented into superpixels with the simple linear iterative clustering (SLIC) algorithm. Each superpixel is then represented by the features of its central pixel. The features are learned from pixel intensity values with stacked convolutional autoencoders in an unsupervised manner. A support vector machine (SVM) classifier classifies superpixels into four classes: periphery, background, text block, and decoration. Finally, the segmentation results are refined by a connected-component-based smoothing procedure. Experiments on three public datasets demonstrate that, compared to our previous method, the proposed method is much faster and achieves comparable segmentation results. Additionally, far fewer pixels are needed for classifier training.
Cited by: 29
Removal of Gray Rubber Stamps
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.26
Soumyadeep Dey, J. Mukhopadhyay, S. Sural
Rubber stamps often overlap with the original text content of a document and can obscure text regions badly. Removing these stamp regions is a necessity for successfully converting such documents into electronic format. Stamp removal becomes more difficult when the stamps are grayscale, or when text and stamp share the same color. In this paper, we propose a technique to remove such stamps from overlapped regions by identifying stamp regions and stamp pixels.
Cited by: 1
Camera-Based System for User Friendly Annotation of Documents
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.62
Yusuke Oguma, K. Kise
We propose a system for document annotation using a smartphone camera. Thanks to document image retrieval, it works on both paper and electronic documents. An important characteristic of the system is how documents are annotated: it employs simple character stickers representing the user's opinions ("hard to understand", "interesting", "boring", "surprising", "doubtful") for friendly annotation. We evaluated the system by varying the annotation method and found that users most liked the proposed sticker-based annotation, though it sometimes caused confusion about how the stickers should be interpreted. Based on an analysis of the experimental results, we discuss possible ways of resolving this issue.
Cited by: 1
Keyword Retrieval Using Scale-Space Pyramid
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.16
Irina Rabaev, K. Kedem, Jihad El-Sana
We propose a pyramid-based method for keyword spotting in historical document images. Documents are represented by a scale-space pyramid of their features. The search for a query keyword begins at the highest level of the pyramid, where the initial matching candidates are located; the candidates are then refined at each subsequent level. The number of levels is adaptive and depends on the length of the query word. The results from all document images are combined and ranked. We compare two feature representations, grid-based and continuous, and show that the continuous representation outperforms the grid-based one. To reduce the memory used to store the scale-space pyramid of features, we discuss and compare two compression approaches. The proposed method was evaluated on four different collections of historical documents, achieving state-of-the-art results.
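Coarse-to-fine search over a scale-space pyramid, as used above, can be sketched in plain NumPy: locate the query exhaustively at the coarsest level, then refine the position within a small window at each finer level. Mean pooling and SSD matching here are simplifications standing in for the paper's actual features and distances:

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 mean pooling (one pyramid level)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_pyramid(img, levels):
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(downsample(pyr[-1]))
    return pyr                                      # pyr[0] finest ... pyr[-1] coarsest

def match_coarse_to_fine(page, query, levels=3):
    """Find the query exhaustively at the coarsest level, then refine the
    position in a +/-2 window at each finer level."""
    p_pyr, q_pyr = build_pyramid(page, levels), build_pyramid(query, levels)
    pos = None
    for p, q in zip(p_pyr[::-1], q_pyr[::-1]):      # coarsest -> finest
        qh, qw = q.shape
        if pos is None:
            ys = range(p.shape[0] - qh + 1)
            xs = range(p.shape[1] - qw + 1)
        else:
            cy, cx = pos[0] * 2, pos[1] * 2         # project to the finer level
            ys = range(max(0, cy - 2), min(p.shape[0] - qh, cy + 2) + 1)
            xs = range(max(0, cx - 2), min(p.shape[1] - qw, cx + 2) + 1)
        best = min((np.sum((p[y:y + qh, x:x + qw] - q) ** 2), (y, x))
                   for y in ys for x in xs)         # sum-of-squared-differences
        pos = best[1]
    return pos

rng = np.random.default_rng(4)
page = rng.random((64, 96))
query = page[24:40, 40:72].copy()                   # plant the query at (24, 40)
pos = match_coarse_to_fine(page, query)
```

Only the coarsest level is searched exhaustively; every finer level inspects a handful of positions, which is what makes pyramid search fast.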
Cited by: 4
Word Segmentation Using the Student's-t Distribution
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.35
G. Louloudis, Giorgos Sfikas, N. Stamatopoulos, B. Gatos
Word segmentation is the process of defining the word regions of a text line. It is a critical stage for word and character recognition as well as word spotting, and mainly involves three basic stages: preprocessing, distance computation, and gap classification. In this paper, we propose a novel word segmentation method that uses the Student's-t distribution for the gap classification stage. The main advantage of the Student's-t distribution is its robustness to outliers. To test the efficiency of the proposed method, we used the four benchmark datasets of the ICDAR/ICFHR Handwriting Segmentation Contests as well as a historical typewritten dataset of Greek polytonic text. We observe that using mixtures of Student's-t distributions for word segmentation outperforms other gap classification methods in terms of Recognition Accuracy and F-Measure. Moreover, across all examined benchmarks, the Student's-t model produces a perfect segmentation result in significantly more cases than the state-of-the-art Gaussian mixture model.
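The role of the Student's-t in gap classification can be illustrated with SciPy: fit the heavy-tailed distribution to gap distances, then flag gaps in the upper tail as between-word. This is a single t fit rather than the paper's mixture, and the gap data are synthetic:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
# Toy gap distances: many small within-word gaps, a few large between-word gaps.
within = rng.normal(4.0, 1.0, size=80)
between = rng.normal(15.0, 2.0, size=12)
gaps = np.concatenate([within, between])

# Fit a Student's-t to the outlier-contaminated gap distances. Its heavy tails
# keep the fit centred on the within-word mode despite the large gaps.
df, loc, scale = stats.t.fit(gaps)

# Classify a gap as between-word when it lies in the upper tail of the fit.
threshold = stats.t.ppf(0.95, df, loc=loc, scale=scale)
is_between = gaps > threshold
```

A Gaussian fit to the same data would be pulled toward the outliers, inflating the threshold; the t's robustness to outliers is exactly the advantage the abstract describes.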
Cited by: 2