2012 International Conference on Frontiers in Handwriting Recognition: Latest Papers

A Courtesy Amount Recognition System for Chinese Bank Checks
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.154
Dong Liu, Youbin Chen
In this paper, we present a complete courtesy amount recognition system for Chinese bank checks. The system takes color bank check images as input and consists of three main processing steps: numeral string extraction, segmentation and recognition, and post-processing. These steps handle, in order, the detection and extraction of the numeral string; the segmentation and recognition of the string; and further analysis of the recognition results for acceptance or rejection. Three principles guide the algorithm design in the first two modules: information fusion, method complementarity, and multi-hypothesis generation followed by evaluation. Logistic regression is used for post-processing. A large number of real checks collected from different banks are used to test the system. A read rate of around 82% is observed when the substitution rate is set to 1%, which corresponds to that of a human operator. The performance can also be tuned further toward a suitable balance between inaccuracy and rejection, in accordance with user preference.
Citations: 2
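The accept-or-reject trade-off described in the abstract above can be illustrated with a minimal Python sketch. It assumes a logistic model has already turned each recognition result into a confidence, and that read rate and substitution rate are both measured against the total number of checks. These definitions, the function names, and the toy data are illustrative assumptions, not details taken from the paper.

```python
import math


def logistic_confidence(features, weights, bias):
    """Map a feature vector to an acceptance confidence in (0, 1).

    The features and weights are hypothetical; the paper does not list them.
    """
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def rates_at_threshold(scored, threshold):
    """Return (read_rate, substitution_rate) for a given acceptance threshold.

    scored is a list of (confidence, is_correct) pairs. Both rates are
    assumed to be fractions of all checks, accepted or rejected.
    """
    total = len(scored)
    read = sum(1 for conf, ok in scored if conf >= threshold and ok)
    subs = sum(1 for conf, ok in scored if conf >= threshold and not ok)
    return read / total, subs / total


def pick_threshold(scored, max_substitution):
    """Find the lowest threshold whose substitution rate fits the budget.

    Raising the threshold can only lower the substitution rate, so the
    first candidate that fits (in ascending order) accepts the most checks.
    """
    for threshold in sorted({conf for conf, _ in scored}):
        read, subs = rates_at_threshold(scored, threshold)
        if subs <= max_substitution:
            return threshold, read, subs
    # No threshold fits the budget: reject everything.
    return float("inf"), 0.0, 0.0
```

Calling `pick_threshold` with a smaller `max_substitution` trades read rate for fewer accepted errors, which is the kind of tuning between inaccuracy and rejection that the abstract mentions.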
A Novel Approach for Stroke Extraction of Off-Line Chinese Handwritten Characters Based on Optimum Paths
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.165
J. Tan, J. Lai, Weishi Zheng, C. Suen
In the recognition of off-line handwritten characters and signatures, stroke extraction is often a crucial step. Given the large number of Chinese handwritten characters, pattern matching based on structural decomposition and analysis is useful and essential for reducing ambiguity in off-line Chinese recognition. Two challenging problems for stroke extraction are: 1) how to extract primary strokes and 2) how to resolve the segmentation ambiguities at intersection points. In this paper, we introduce a novel approach based on Optimum Paths (AOP) to solve these problems. The optimum paths are derived from the degree information and the continuation property, and we use them to tackle both problems. Compared with other methods, the proposed approach extracts strokes from off-line Chinese handwritten characters with better performance.
Citations: 2
Learning Text-Line Segmentation Using Codebooks and Graph Partitioning
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.228
Le Kang, J. Kumar, Peng Ye, D. Doermann
In this paper, we present a codebook-based method for handwritten text-line segmentation which uses image patches in the training data to learn a graph-based similarity for clustering. We first construct a codebook of image patches using K-medoids, and obtain exemplars which encode local evidence. We then obtain the corresponding codewords for all patches extracted from a given image and construct a similarity graph using the learned evidence, which is then partitioned to obtain text-lines. Our learning-based approach performs well on a field dataset containing degraded and unconstrained handwritten Arabic document images. Results on the ICDAR 2009 segmentation contest dataset show that the method is competitive with previous approaches.
Citations: 11
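The codebook step in the abstract above (K-medoids over image patches, then mapping each new patch to its codeword) can be sketched in plain Python. This is a simplified alternating K-medoids with a naive initialization, and patches are stood in for by small numeric tuples; the patch features, the distance, and all function names are assumptions for illustration and are not specified in the abstract.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_medoids(points, k, dist=euclidean, max_iter=100):
    """Return the sorted indices of k medoids (exemplars) among points.

    Alternates between assigning points to their nearest medoid and
    re-picking, inside each cluster, the member with the smallest total
    distance to the other members. Initialization is the first k points.
    """
    medoids = list(range(k))
    for _ in range(max_iter):
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # can happen with duplicate points
                new_medoids.append(m)
                continue
            best = min(
                members,
                key=lambda c: sum(dist(points[c], points[j]) for j in members),
            )
            new_medoids.append(best)
        if sorted(new_medoids) == sorted(medoids):
            break
        medoids = new_medoids
    return sorted(medoids)


def nearest_codeword(patch, codebook, dist=euclidean):
    """Return the index of the codebook entry closest to patch."""
    return min(range(len(codebook)), key=lambda i: dist(patch, codebook[i]))
```

Because medoids are actual data points, each codeword is a real exemplar patch, which matches the abstract's description of exemplars that encode local evidence. The graph construction and partitioning stages are not covered by this sketch.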
Modeling Writing Styles for Online Writer Identification: A Hierarchical Bayesian Approach
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.235
Arti Shivram, Chetan Ramaiah, U. Porwal, V. Govindaraju
With the explosive growth of the tablet form factor and greater availability of pen-based direct input, writer identification in online environments is increasingly becoming critical for a variety of downstream applications such as intelligent and adaptive user environments, search, retrieval, indexing and digital forensics. Extant research has approached writer identification by using writing styles as a discriminative function between writers. In contrast, we model writing styles as a shared component of an individual's handwriting. We develop a theoretical framework for this conceptualization and model this using a three-level hierarchical Bayesian model (Latent Dirichlet Allocation). In this text-independent, unsupervised model each writer's handwriting is modeled as a distribution over finite writing styles that are shared amongst writers. We test our model on a novel online/offline handwriting dataset IBM UB 1 which is being made available to the public. Our experiments show comparable results to current benchmarks and demonstrate the efficacy of explicitly modeling shared writing styles.
Citations: 22
Improving Handwritten Signature-Based Identity Prediction through the Integration of Fuzzy Soft-Biometric Data
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.221
Márjory Da Costa-Abreu, M. Fairhurst
Automated identification of individuals using biometric technologies is finding increasing application in diverse areas, yet designing practical systems can still present significant challenges. Choice of the modality to adopt, the classification/matching techniques best suited to the application, the most effective sensors to use, and so on, are all important considerations, and can help to ameliorate factors which might detract from optimal performance. Less well researched, however, is how to optimise performance by means of exploiting broader-based information often available in a specific task and, in particular, the exploitation of so-called "soft" biometric data is often overlooked. This paper proposes a novel approach to the integration of soft biometric data into an effective processing structure for an identification task by adopting a fuzzy representation of information which is inherently continuous, using subject age as a typical example. Our results show this to be a promising methodology with possible benefits in a number of potentially difficult practical scenarios.
Citations: 5
Binarization of First Temple Period Inscriptions: Performance of Existing Algorithms and a New Registration Based Scheme
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.187
Arie Shaus, Eli Turkel, E. Piasetzky
The discipline of First Temple Period epigraphy (the study of writing) relies heavily on manually-drawn facsimiles (black and white images) of ancient inscriptions. This practice may unintentionally mix up documentation and interpretation. As an alternative, this article surveys the performance of several existing binarization techniques. The quality of their results is found to be inadequate for our purpose. A new method for automatically creating a facsimile is then suggested. The technique is based on a connected-component oriented elastic registration of an already existing imperfect facsimile to the inscription image. Some empirical results, supporting the methodology, are presented. The procedure is also relevant to the creation of facsimiles for other types of inscriptions.
Citations: 25
Reduction of Bleed-Through Effect in Images of Chinese Bank Items
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.260
Bingyu Chi, Youbin Chen
Because of the possible presence of carbon and seals, images of financial documents such as Chinese bank checks often suffer from bleed-through effects, which affect the performance of automatic financial document processing such as seal verification and OCR. This paper presents an effective algorithm to deal with bleed-through effects in the images of financial documents. Double-sided images scanned simultaneously are used as inputs, and the bleed-through effect is detected and removed after the registration of the recto and verso side images. Our work makes two major contributions. First, our algorithm can deal with images with complex backgrounds from real-life financial documents, while most other algorithms only deal with images with simple backgrounds. Second, we combine the fast ICA algorithm with Gatos' local adaptive thresholding algorithm [1] to deal with the bleed-through effects. Experiments show that our proposed algorithm is very promising.
Citations: 4
Statistical Machine Translation as a Language Model for Handwriting Recognition
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.273
Jacob Devlin, M. Kamali, Krishna Subramanian, R. Prasad, P. Natarajan
When performing handwriting recognition on natural language text, the use of a word-level language model (LM) is known to significantly improve recognition accuracy. The most common type of language model, the n-gram model, decomposes sentences into short, overlapping chunks. In this paper, we propose a new type of language model which we use in addition to the standard n-gram LM. Our new model uses the likelihood score from a statistical machine translation system as a reranking feature. In general terms, we automatically translate each OCR hypothesis into another language, and then create a feature score based on how "difficult" it was to perform the translation. Intuitively, the difficulty of translation correlates with how well-formed the input sentence is. In an Arabic handwriting recognition task, we were able to obtain a 0.4% absolute improvement in word error rate (WER) on top of a powerful 5-gram LM.
Citations: 14
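The reranking idea in the abstract above, where a translation-based score is one more feature next to the recognizer and n-gram scores, can be sketched as a weighted sum over an n-best list. The score names (`hwr`, `ngram`, `smt`), the weights, and the toy log-scores below are hypothetical and only illustrate the mechanism; the paper's actual feature set and weight tuning are not given in the abstract.

```python
def rerank(hypotheses, weights):
    """Return the hypothesis with the highest weighted sum of feature scores.

    Each hypothesis is a dict with a "text" field and a "scores" dict of
    log-domain feature values (higher is better). Feature names are
    illustrative placeholders.
    """
    def total(hyp):
        return sum(weights[name] * hyp["scores"][name] for name in weights)

    return max(hypotheses, key=total)


# Toy n-best list: hypothesis B is slightly worse for the recognizer and
# the n-gram LM, but much "easier" to translate.
N_BEST = [
    {"text": "hypothesis A", "scores": {"hwr": -1.0, "ngram": -2.0, "smt": -5.0}},
    {"text": "hypothesis B", "scores": {"hwr": -1.2, "ngram": -2.1, "smt": -1.0}},
]
```

With the `smt` weight set to zero the baseline choice is hypothesis A; giving the translation score a positive weight lets a hypothesis that translates more easily overtake it.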
Confused Distance Maximization for Large Category Dimensionality Reduction
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.196
Xu-Yao Zhang, Cheng-Lin Liu
The Fisher linear discriminant analysis (FDA) is the most well-known supervised dimensionality reduction model. However, when the number of classes is much larger than the reduced dimensionality, FDA suffers from the class separation problem in that it will preserve the distances of the already well-separated classes and cause a large overlap of neighboring classes. To cope with this problem, we propose a new model called confused distance maximization (CDM). The objective of CDM is to maximize the distance of the most confusable classes, according to the confusion matrix estimated from the training data with a pre-learned classifier. Compared with FDA, which maximizes the sum of the distances of all class pairs, CDM is more relevant to the classification accuracy because it weights the pairwise distances according to the confusion matrix. Furthermore, CDM is computationally inexpensive, which makes it efficient and effective for large category problems. Experiments on two large-scale 3,755-class Chinese handwriting databases (offline and online) demonstrate that CDM can achieve the best performance compared with FDA and other competitive weighting-based criteria.
Citations: 5
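The abstract above says that CDM weights pairwise class distances by the confusion matrix. One way such a weighting could look is a confusion-weighted between-class scatter matrix, sketched below in plain Python. The symmetric weight (the sum of the two off-diagonal confusion counts) and the scatter form are assumptions chosen for illustration; the abstract does not give the exact CDM objective, so this should not be read as the paper's formulation.

```python
def confusion_weights(confusion):
    """Return {(i, j): weight} for i < j from a square confusion matrix.

    Assumed weight: how often class i is mistaken for j plus how often
    j is mistaken for i. Well-separated pairs therefore get weight 0.
    """
    n = len(confusion)
    return {
        (i, j): confusion[i][j] + confusion[j][i]
        for i in range(n)
        for j in range(i + 1, n)
    }


def weighted_between_scatter(means, weights):
    """Sum of w_ij * (m_i - m_j)(m_i - m_j)^T over all class pairs.

    means is a list of class mean vectors; weights comes from
    confusion_weights. Returns a d x d matrix as nested lists.
    """
    d = len(means[0])
    scatter = [[0.0] * d for _ in range(d)]
    for (i, j), w in weights.items():
        diff = [a - b for a, b in zip(means[i], means[j])]
        for r in range(d):
            for c in range(d):
                scatter[r][c] += w * diff[r] * diff[c]
    return scatter
```

With uniform weights this reduces to an unweighted sum over all class pairs, which corresponds to the FDA behaviour the abstract contrasts against; with confusion-based weights, only pairs that the pre-learned classifier actually confuses contribute to the scatter.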
Page Segmentation Based on Steerable Pyramid Features
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.253
Mohamed Benjelil, R. Mullot, A. Alimi
Page segmentation and classification are very important in a document layout analysis system before a document is presented to an OCR system or to any other subsequent processing step. In this paper, we propose an accurate and suitably designed system for complex document segmentation. This system is based on the steerable pyramid transform. The features extracted from pyramid sub-bands serve to locate and classify regions into text (either machine printed or handwritten) and non-text (images, graphics, drawings or paintings) in some noise-infected, deformed, multilingual, multi-script document images. These documents contain tabular structures, logos, stamps, handwritten script blocks, photos, etc. The encouraging and promising results obtained on a data set of 1,000 official complex document images are presented in this paper.
Citations: 4