
Latest publications: 2016 12th IAPR Workshop on Document Analysis Systems (DAS)

Document Image Quality Assessment Using Discriminative Sparse Representation
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.24
Xujun Peng, Huaigu Cao, P. Natarajan
The goal of document image quality assessment (DIQA) is to build a computational model that predicts the degree of degradation of document images. Based on the estimated quality scores, document processing and analysis systems can provide immediate feedback, which helps to maintain, organize, recognize and retrieve the information in document images. Recently, bag-of-visual-words (BoV) based approaches have gained increasing attention from researchers for the task of quality assessment, but how to use BoV to represent images more accurately remains a challenging problem. In this paper, we propose a sparse representation based method to estimate a document image's quality with respect to OCR capability. Unlike conventional sparse representation approaches, we introduce the target quality scores into the training phase of the sparse representation. The proposed method improves the discriminability of the system and ensures that the obtained codebook is more suitable for the assessment task. Experimental results on a public dataset show that the proposed method outperforms other hand-crafted and BoV based DIQA approaches.
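The central idea, injecting target quality scores into codebook training, can be illustrated with a deliberately simplified sketch (not the authors' actual algorithm): patch features are augmented with their image's quality score before the codebook is learned, and a new image's score is predicted from the quality components of the atoms its patches activate. Here k-means with 1-sparse coding stands in for true sparse dictionary learning, and all names are hypothetical:

```python
import random


def dist2(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20, seed=0):
    """Plain k-means; each vector is a list of floats."""
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            j = min(range(k), key=lambda i: dist2(v, centers[i]))
            clusters[j].append(v)
        for i, c in enumerate(clusters):
            if c:
                centers[i] = [sum(col) / len(c) for col in zip(*c)]
    return centers


def train_codebook(patches, scores, k=4):
    # Append the target quality score to each patch feature, so learned
    # atoms carry a quality component (the "discriminative" part of the idea).
    augmented = [p + [s] for p, s in zip(patches, scores)]
    return kmeans(augmented, k)


def predict_quality(patches, codebook):
    # 1-sparse coding: each patch activates its nearest atom; the image
    # score is the mean of the activated atoms' quality components.
    feat_dim = len(patches[0])
    preds = []
    for p in patches:
        atom = min(codebook, key=lambda c: dist2(p, c[:feat_dim]))
        preds.append(atom[-1])
    return sum(preds) / len(preds)
```

With synthetic degraded (score 0.2) and clean (score 0.9) patch populations, patches resembling the clean population are predicted near 0.9.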
Citations: 14
Increasing Robustness of Handwriting Recognition Using Character N-Gram Decoding on Large Lexica
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.43
M. Schall, M. Schambach, M. Franz
Offline handwriting recognition systems often include a decoding step, i.e. retrieving the most likely character sequence from the underlying machine learning algorithm. Decoding is sensitive to ranges of weakly predicted characters, caused e.g. by obstructions in the scanned document. We present a new algorithm for robust decoding of handwriting recognizer outputs using character n-grams. Multidimensional hierarchical subsampling artificial neural networks with Long Short-Term Memory cells have been successfully applied to offline handwriting recognition. Output activations from such networks, trained with Connectionist Temporal Classification, can be decoded with several different algorithms in order to retrieve the most likely literal string. Our new algorithm decodes the network output while restricting the possible strings to a large lexicon. The index used in this work is an n-gram index; tri-grams are used for the experimental comparisons. N-grams are extracted from the network output using a backtracking algorithm, and each n-gram is assigned a mean probability. The decoding result is obtained by intersecting the n-gram hit lists while computing the total probability for each matched lexicon entry. We conclude with an experimental comparison of different decoding algorithms on a large lexicon.
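A toy sketch of the lexicon-restricted decoding step (hypothetical names; the real system extracts the n-grams and their mean probabilities by backtracking through the CTC activations, which is omitted here): a tri-gram inverted index over the lexicon is intersected with the n-grams recovered from the network output, accumulating each entry's probability mass.

```python
from collections import defaultdict


def build_trigram_index(lexicon):
    # Inverted index: tri-gram -> set of words containing it.
    # Padding markers make word boundaries visible to the index.
    index = defaultdict(set)
    for word in lexicon:
        padded = f"##{word}#"
        for i in range(len(padded) - 2):
            index[padded[i:i + 3]].add(word)
    return index


def decode(ngram_hits, index, lexicon):
    # ngram_hits: {tri-gram: mean probability} assumed to come from the
    # recognizer output. Intersect hit lists by accumulating per-word mass.
    scores = defaultdict(float)
    for gram, prob in ngram_hits.items():
        for word in index.get(gram, ()):
            scores[word] += prob
    # Normalise by the number of tri-grams in the padded word, so long
    # lexicon entries are not favoured unduly.
    best = max(lexicon, key=lambda w: scores[w] / (len(w) + 1))
    return best, scores[best]
```

For instance, noisy evidence for "cart" still outscores the near misses "cat" and "card" because more of its tri-gram hit lists intersect.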
Citations: 3
Automatic Handwritten Character Segmentation for Paleographical Character Shape Analysis
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.74
Théodore Bluche, D. Stutzmann, Christopher Kermorvant
Written texts are both physical objects (signs, shapes and graphical systems) and abstract ones (ideas), whose meanings and social connotations evolve through time. To study this dual nature of texts, palaeographers need to analyse large-scale corpora at the finest granularity, such as character shape. This goal can only be reached through an automatic segmentation process. In this paper, we present a method, based on Handwritten Text Recognition, to automatically align images of digitized manuscripts with texts from scholarly editions at the levels of page, column, line, word, and character. It has been successfully applied to two datasets of medieval manuscripts, which are now almost fully segmented at character level. The quality of the word and character segmentations is evaluated, and further palaeographical analyses are presented.
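The character-level alignment at the heart of such a system can be sketched with plain edit-distance dynamic programming (a simplification: the actual method works hierarchically from page down to character level on recognizer output, and the names below are illustrative):

```python
def align(recognized, edition):
    # Classic edit-distance DP table between the recognized character
    # sequence and the scholarly edition text.
    n, m = len(recognized), len(edition)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if recognized[i - 1] == edition[j - 1] else 1
            cost[i][j] = min(cost[i - 1][j - 1] + sub,
                             cost[i - 1][j] + 1,
                             cost[i][j - 1] + 1)
    # Backtrace to aligned pairs (recognized_char, edition_char); None marks a gap.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and cost[i][j] ==
                cost[i - 1][j - 1] + (0 if recognized[i - 1] == edition[j - 1] else 1)):
            pairs.append((recognized[i - 1], edition[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            pairs.append((recognized[i - 1], None))
            i -= 1
        else:
            pairs.append((None, edition[j - 1]))
            j -= 1
    return pairs[::-1]
```

Aligning a misrecognized "karta" against the edition's "carta" pairs the substituted character while keeping the rest in step, which is what lets each edition character inherit an image position.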
Citations: 1
Globally Optimal Text Line Extraction Based on K-Shortest Paths Algorithm
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.12
Liuan Wang, S. Uchida, Wei-liang Fan, Jun Sun
Text line extraction in images is a crucial prerequisite for content-based image understanding applications. In this paper, we propose a novel text line extraction method based on k-shortest-paths global optimization. Firstly, candidate connected components are extracted as Maximal Stable Extremal Regions (MSERs). Then, a directed graph is built upon the connected-component nodes, with edges weighted by unary and pairwise cost functions. Finally, the text line extraction problem is solved with the k-shortest-paths optimization algorithm, taking advantage of the particular structure of the directed graph. Experimental results on a public dataset demonstrate the effectiveness of the proposed method in comparison with state-of-the-art methods.
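The graph formulation can be sketched as follows, using repeated Dijkstra searches with node removal as a simple stand-in for the paper's specialised k-shortest-paths algorithm (components are nodes; a source and sink bracket each candidate line; edge costs would combine the unary and pairwise terms):

```python
import heapq


def dijkstra(graph, source, sink):
    # graph: {node: [(neighbour, cost), ...]}; returns (path, cost).
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == sink:
            break
        if d > dist.get(u, float("inf")):
            continue
        for v, c in graph.get(u, []):
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if sink not in dist:
        return None, float("inf")
    path, node = [], sink
    while node != source:
        path.append(node)
        node = prev[node]
    path.append(source)
    return path[::-1], dist[sink]


def k_disjoint_paths(graph, source, sink, k):
    # Greedy variant: repeatedly take the current shortest path and delete
    # its interior nodes, so each extracted text line claims its components.
    paths = []
    g = {u: list(es) for u, es in graph.items()}
    for _ in range(k):
        path, cost = dijkstra(g, source, sink)
        if path is None:
            break
        paths.append((path, cost))
        interior = set(path[1:-1])
        g = {u: [(v, c) for v, c in es if v not in interior]
             for u, es in g.items() if u not in interior}
    return paths
```

On a toy graph with two component chains between source S and sink T, the first extracted path claims one chain and the second search recovers the other, mimicking one text line per path.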
Citations: 4
Word Spotting in Historical Document Collections with Online-Handwritten Queries
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.41
Christian Wieprecht, Leonard Rothacker, G. Fink
Pen-based systems are becoming more and more important due to the growing availability of touch-sensitive devices in various forms and sizes. Their interfaces offer the possibility of interacting with a system directly by natural handwriting. In contrast to other input modalities, there is no need to switch to special modes such as software keyboards. In this paper we propose a new method for querying digital archives of historical documents. Word images are retrieved with respect to search terms that users write by hand on a pen-based system. The captured trajectory is used as a query, which we call query-by-online-trajectory word spotting. By using attribute embeddings for both online-trajectory and visual features, word images are retrieved based on their distance to the query in a common subspace. The system is therefore robust, as no explicit transcription of queries or word images is required. We evaluate our approach in writer-dependent as well as writer-independent scenarios, presenting highly accurate retrieval results in the former case and compelling retrieval results in the latter. Our performance is very competitive in comparison to related methods from the literature.
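The retrieval step can be sketched as distance ranking in a shared attribute space. As a stand-in for the learned embedding models (which map pen trajectories and word images into that space without any transcription), this sketch computes a toy PHOC-style attribute vector from transcriptions; everything here is illustrative, not the paper's implementation:

```python
import math


def phoc(word, alphabet="abcdefghijklmnopqrstuvwxyz", levels=(1, 2)):
    # Toy pyramidal histogram of characters: for each pyramid level, mark
    # which letters occur in each region of the word.
    vec = []
    n = len(word)
    for level in levels:
        for region in range(level):
            lo, hi = region * n / level, (region + 1) * n / level
            chunk = {word[i] for i in range(n) if lo <= i + 0.5 < hi}
            vec.extend(1.0 if ch in chunk else 0.0 for ch in alphabet)
    return vec


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)


def retrieve(query_emb, image_embs):
    # image_embs: {image_id: attribute vector in the common subspace}.
    # Rank word images by their distance to the query embedding.
    return sorted(image_embs,
                  key=lambda i: cosine_distance(query_emb, image_embs[i]))
```

A query whose attribute vector matches a word image's vector ranks that image first; in the real system both vectors come from separately trained trajectory and image models rather than from transcriptions.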
Citations: 3
OCR Error Correction Using Character Correction and Feature-Based Word Classification
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.44
Ido Kissos, N. Dershowitz
This paper explores the use of a learned classifier for post-OCR text correction. Experiments with the Arabic language show that this approach, which integrates a weighted confusion matrix and a shallow language model, corrects the vast majority of segmentation and recognition errors, the most frequent types of error in our dataset.
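A minimal sketch of the idea, assuming single-character substitutions drawn from a learned confusion matrix and rescored by a unigram word-frequency model (the paper's classifier uses richer features and also handles segmentation errors; all names below are hypothetical):

```python
def correct(token, confusions, lexicon_freq):
    # confusions: {(wrong, right): probability}, e.g. learned by aligning
    # OCR output with ground truth. lexicon_freq: shallow unigram LM.
    candidates = {token: 1.0}
    for i, ch in enumerate(token):
        for (wrong, right), p in confusions.items():
            if ch == wrong:
                cand = token[:i] + right + token[i + 1:]
                candidates[cand] = max(candidates.get(cand, 0.0), p)
    # Score = confusion likelihood x language-model probability.
    scored = {c: p * lexicon_freq.get(c, 0.0) for c, p in candidates.items()}
    best = max(scored, key=scored.get)
    return best if scored[best] > 0 else token
```

A token like "tbe" is repaired to "the" when the b/h confusion is plausible and the candidate is frequent in the lexicon; tokens with no plausible in-lexicon candidate are left unchanged.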
Citations: 67
Text Extraction in Document Images: Highlight on Using Corner Points
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.67
Vikas Yadav, N. Ragot
In recent years, text extraction in document images has been widely studied in the general context of Document Image Analysis (DIA), and especially within the framework of layout analysis. Many existing techniques rely on complex processes based on preprocessing, image transforms, or component/edge extraction and analysis. At the same time, text extraction in videos has received increasing interest, where the use of corner or key points has proven very effective. Since very few studies have examined the use of corner points for text extraction in document images, we propose in this paper to evaluate the possibilities of this kind of approach for DIA. To do so, we designed a very simple technique based on FAST key points. A first stage divides the image into blocks and computes the density of points inside each one. The denser blocks are kept as text blocks. Then, the connectivity of blocks is checked in order to group them and obtain complete text blocks. This technique has been evaluated on different kinds of images: different languages (Telugu, Arabic, French), handwritten as well as typewritten, skewed documents, and images at different resolutions with different kinds and amounts of noise (deformations, ink dots, bleed-through, acquisition artifacts such as blur and low resolution). Even with fixed parameters across all these kinds of document images, precision and recall are close to or higher than 90%, which already makes this basic method effective. Consequently, even if the proposed approach does not represent a theoretical breakthrough, it highlights that accurate text extraction can be achieved without a complex approach. Moreover, this approach could easily be improved to be more precise, robust and useful for more complex layout analysis.
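The two stages described above, block-wise key-point density followed by connectivity grouping, can be sketched as follows. The FAST detection itself is assumed to have already produced the (x, y) points, and the block size and density threshold are illustrative:

```python
from collections import deque


def text_blocks(keypoints, width, height, block=10, min_pts=3):
    # Stage 1: count key points per block; keep blocks dense enough
    # to be considered text.
    cols, rows = width // block, height // block
    density = [[0] * cols for _ in range(rows)]
    for x, y in keypoints:
        density[min(y // block, rows - 1)][min(x // block, cols - 1)] += 1
    dense = {(r, c) for r in range(rows) for c in range(cols)
             if density[r][c] >= min_pts}
    # Stage 2: group 4-connected dense blocks into complete text regions.
    regions, seen = [], set()
    for start in dense:
        if start in seen:
            continue
        region, queue = [], deque([start])
        seen.add(start)
        while queue:
            r, c = queue.popleft()
            region.append((r, c))
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in dense and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        regions.append(region)
    return regions
```

Two horizontally adjacent dense blocks merge into one text region, while a stray isolated corner point elsewhere is discarded by the density threshold.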
Citations: 18
Text Detection in Arabic News Video Based on SWT Operator and Convolutional Auto-Encoders
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.80
Oussama Zayene, Mathias Seuret, Sameh Masmoudi Touj, J. Hennebert, R. Ingold, N. Amara
Text detection in videos is a challenging problem due to the variety of text specificities and the presence of complex backgrounds and anti-aliasing/compression artifacts. In this paper, we present an approach for detecting horizontally aligned artificial text in Arabic news video. The novelty of this method lies in the combination of two techniques: an adapted version of the Stroke Width Transform (SWT) algorithm and a convolutional auto-encoder (CAE). First, the SWT extracts candidate text components, which are then filtered and grouped using geometric constraints and stroke-width information. Second, the CAE is used as an unsupervised feature learning method to classify the obtained text-line candidates as text or non-text. We assess the proposed approach on the public Arabic-Text-in-Video database (AcTiV-DB) using different evaluation protocols, including data from several TV channels. Experiments indicate that the use of learned features significantly improves the text detection results.
Citations: 24
Keyword Spotting in Handwritten Documents Using Projections of Oriented Gradients
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.61
George Retsinas, G. Louloudis, N. Stamatopoulos, B. Gatos
In this paper, we present a novel approach to segmentation-based handwritten keyword spotting. The proposed approach relies upon the extraction of a simple yet efficient descriptor based on projections of oriented gradients. To this end, a global and a local word image descriptor, together with their combination, are proposed. Retrieval is performed using the Euclidean distance between the descriptors of a query image and the segmented word images. The proposed methods have been evaluated on the dataset of the ICFHR 2014 Competition on handwritten keyword spotting. Experimental results demonstrate the efficiency of the proposed methods compared to several state-of-the-art techniques.
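A toy version of a projections-of-oriented-gradients descriptor (a guess at the general construction, not the authors' exact one): quantise each pixel's gradient orientation, accumulate gradient magnitude into a per-column histogram, concatenate the column histograms, and L2-normalise. Word images are then compared by Euclidean distance, so the sketch assumes images normalised to the same size:

```python
import math


def pog_descriptor(img, bins=4):
    # img: 2-D list of grey values. For each interior pixel, quantise the
    # (unsigned) gradient orientation and add its magnitude to the
    # histogram of the pixel's column; concatenate and normalise.
    h, w = len(img), len(img[0])
    desc = [[0.0] * bins for _ in range(w)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = img[y][x + 1] - img[y][x - 1]
            gy = img[y + 1][x] - img[y - 1][x]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi  # unsigned orientation
            b = min(int(angle / math.pi * bins), bins - 1)
            desc[x][b] += mag
    flat = [v for col in desc for v in col]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Identical images yield zero distance, while images dominated by different stroke orientations (a vertical versus a horizontal edge) land in different orientation bins and are far apart.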
Citations: 20
CNN Based Transfer Learning for Historical Chinese Character Recognition
Pub Date : 2016-04-11 DOI: 10.1109/DAS.2016.52
Yejun Tang, Liangrui Peng, Qianxiong Xu, Yanwei Wang, Akio Furuhata
Historical Chinese character recognition suffers from a lack of sufficient labeled training samples. A transfer learning method based on a Convolutional Neural Network (CNN) for historical Chinese character recognition is proposed in this paper. A CNN model L is trained on printed Chinese character samples in the source domain. The network structure and weights of model L are used to initialize another CNN model T, which serves as the feature extractor and classifier in the target domain. Model T is then fine-tuned on a few labeled historical or handwritten Chinese character samples and used for the final evaluation in the target domain. Several experiments on essential factors of the CNN-based transfer learning method are conducted, showing that the proposed method is effective.
Citations: 42