
2012 International Conference on Frontiers in Handwriting Recognition: Latest Papers

Hindi Off-Line Signature Verification
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.212
S. Pal, M. Blumenstein, U. Pal
Handwritten signatures are among the most widely used biometrics for document authentication and human authorization. This paper presents an off-line signature verification system for Hindi signatures. Signature verification is the process of examining a questioned signature in detail to determine whether it belongs to the claimed person. Despite substantial research on signature verification involving Western signatures, very little attention has been given to non-Western signatures such as Chinese, Japanese, Arabic, and Persian. In this paper, the performance of an off-line signature verification system on Hindi signatures, whose style is distinct from Western scripts, is investigated. Gradient and Zernike moment features were employed, and Support Vector Machines (SVMs) were used for verification. To the best of the authors' knowledge, this is the first report of using Hindi signatures for signature verification. The Hindi signature database used for experimentation consisted of 840 (35×24) genuine signatures and 1050 (35×30) forgeries. With the gradient features, encouraging error rates of 7.42% FRR (false rejection rate) and 4.28% FAR (false acceptance rate) were obtained.
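The two error rates quoted in this abstract can be illustrated with a short sketch. It assumes only the standard definitions: FRR is the share of genuine signatures that are rejected, and FAR is the share of forgeries that are accepted. The function name and the toy decision lists are hypothetical and are not taken from the paper.

```python
def far_frr(genuine_accepted, forgery_accepted):
    """Return (FAR, FRR) in percent from per-signature accept decisions.

    genuine_accepted: list of bools, True if a genuine signature was accepted.
    forgery_accepted: list of bools, True if a forgery was accepted.
    Both argument names are illustrative assumptions.
    """
    # FRR: genuine signatures that the verifier wrongly rejected.
    frr = 100.0 * sum(not ok for ok in genuine_accepted) / len(genuine_accepted)
    # FAR: forgeries that the verifier wrongly accepted.
    far = 100.0 * sum(forgery_accepted) / len(forgery_accepted)
    return far, frr


# Toy decisions (made up): 1 of 4 genuine rejected, 1 of 5 forgeries accepted.
far, frr = far_frr([True, True, False, True],
                   [False, True, False, False, False])
print(f"FAR = {far:.1f}%, FRR = {frr:.1f}%")
```

Because the two rates are computed over different pools (forgeries versus genuine signatures), they cannot be combined into one accuracy figure without knowing the size of each pool.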
Citations: 14
Evolution Maps for Connected Components in Text Documents
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.201
Ofer Biller, K. Kedem, I. Dinstein, Jihad El-Sana
For highly degraded text documents, common tasks such as binarization and line extraction remain difficult. Equipped with reliable information about the distribution of character dimensions in the document, one can improve the results of these algorithms significantly. We introduce a novel perspective on the image data that maps the evolution of connected components as the gray-scale threshold changes. We use these maps to provide a robust algorithm for extracting information about character dimensions in degraded documents, and demonstrate improved binarization results using this information. We statistically analyze the characteristics of the evolution maps for text documents and compare our results with ground-truth data.
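The idea of tracking connected components across gray-scale thresholds can be sketched roughly as follows. This is a simplified illustration, not the authors' algorithm: it assumes dark text on a light background, 4-connectivity, and a plain list-of-lists image, and it records only the bounding-box size of each component at each threshold. The names `component_sizes` and `evolution_map` are hypothetical.

```python
from collections import deque


def component_sizes(image, threshold):
    """Sorted (width, height) boxes of 4-connected regions with gray <= threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] > threshold:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            r0 = r1 = r
            c0 = c1 = c
            while queue:
                y, x = queue.popleft()
                r0, r1 = min(r0, y), max(r1, y)
                c0, c1 = min(c0, x), max(c1, x)
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] <= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append((c1 - c0 + 1, r1 - r0 + 1))
    return sorted(sizes)


def evolution_map(image, thresholds):
    """Map each threshold to the component sizes found at that threshold."""
    return {t: component_sizes(image, t) for t in thresholds}


# Toy 3x3 image: two dark strokes joined by a mid-gray pixel.
image = [
    [10, 200, 10],
    [10, 120, 10],
    [10, 200, 10],
]
print(evolution_map(image, [50, 150]))
```

In the toy image the two strokes are separate components at the low threshold and merge into one wider component once the threshold admits the mid-gray pixel, which is the kind of change in component dimensions such a map is meant to expose.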
Citations: 7
Semi-supervised learning for cursive handwriting recognition using keyword spotting
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.268
Volkmar Frinken, Markus Baumgartner, Andreas Fischer, H. Bunke
State-of-the-art handwriting recognition systems are learning-based systems that require large sets of training data. The creation of training data, and consequently of a well-performing recognition system, therefore requires a substantial amount of human work. This effort can be reduced with semi-supervised learning, which uses unlabeled text lines for training as well. Current approaches estimate the correct transcription of the unlabeled data via handwriting recognition, which is not only computationally very demanding but also requires a good model of the target language. In this paper, we propose a different approach that makes use of keyword spotting, which is significantly faster and does not need any language model. In a set of experiments we demonstrate its superiority over existing approaches.
Citations: 13
A System for Recognition of On-Line Handwritten Mathematical Expressions
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.172
Fotini Simistira, V. Papavassiliou, V. Katsouros, G. Carayannis
We present a system for recognizing online handwritten mathematical expressions (MEs). Symbol recognition is based on a template elastic matching distance between pen direction features. The structural analysis of the ME is based on extracting the baseline of the ME and then classifying symbols into levels above and below the baseline. The symbols are then sequentially analyzed using six spatial relations, and the corresponding 2D structure is processed to give the resulting MathML representation of the ME. The system was evaluated on the Competition on Recognition of Online Handwritten Mathematical Expressions (CROHME) 2011 datasets and demonstrates promising results.
Citations: 9
Structural Learning for Writer Identification in Offline Handwriting
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.277
U. Porwal, Chetan Ramaiah, Arti Shivram, V. Govindaraju
The availability of sufficient labeled data is key to the performance of any learning algorithm. In document analysis, however, obtaining a large amount of labeled data is difficult, and the scarcity of labeled samples is often a main bottleneck in the performance of algorithms. Unlabeled data samples, by contrast, are abundant. We propose a semi-supervised framework for writer identification in offline handwritten documents that leverages the information hidden in the unlabeled samples. Writer identification is a complex task, and our framework tries to model the nuances of handwriting with the use of structural learning. The framework models the complexity of the learning problem by breaking the main task into several sub-tasks and selecting the best hypothesis space. All the hypothesis spaces pertaining to the sub-tasks are used for model selection by retrieving a common optimal sub-structure that has high correspondence with all of the candidate hypothesis spaces. We use the publicly available IAM data set to show the efficacy of our method.
Citations: 8
Persian Signature Verification Based on Fractal Dimension Using Testing Hypothesis
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.254
A. Foroozandeh, Y. Akbari, M. Jalili, J. Sadri
This paper presents a new approach for verifying off-line Persian signatures. In our method, the feature extraction step is based on the estimated Fractal Dimension (FD) of signature images, and the decision to accept or reject a test signature is formulated as a hypothesis test, which is used here for the first time to verify offline Persian signatures. The proposed method has been tested on our newly created database, which includes 1000 genuine signatures and 200 skilled forgeries collected from a population of 100 human subjects with different educational backgrounds. The results obtained confirm the effectiveness of the presented method.
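The abstract does not say how the fractal dimension is estimated. As one common possibility, the sketch below uses box counting: it counts how many boxes of side s contain ink and fits the slope of log N(s) against log(1/s). Both the choice of box counting and the name `box_counting_dimension` are assumptions made for illustration, not the authors' stated method.

```python
import math


def box_counting_dimension(binary, box_sizes=(1, 2, 4, 8)):
    """Estimate fractal dimension of a binary image (list of 0/1 rows).

    Fits a least-squares line to log N(s) versus log(1/s), where N(s) is
    the number of s-by-s boxes containing at least one foreground pixel.
    """
    rows, cols = len(binary), len(binary[0])
    xs, ys = [], []
    for s in box_sizes:
        # Each foreground pixel falls into exactly one box index pair.
        boxes = {(r // s, c // s)
                 for r in range(rows) for c in range(cols) if binary[r][c]}
        if not boxes:
            raise ValueError("image has no foreground pixels")
        xs.append(math.log(1.0 / s))
        ys.append(math.log(len(boxes)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den  # slope of the fitted line


# Sanity checks on shapes with known dimension.
filled_square = [[1] * 8 for _ in range(8)]
single_line = [[1] * 8] + [[0] * 8 for _ in range(7)]
print(box_counting_dimension(filled_square))  # close to 2
print(box_counting_dimension(single_line))    # close to 1
```

A real signature image would give a value between these two extremes, and it is a scalar of this kind that could then feed a statistical accept/reject decision.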
Citations: 9
The role of the users in handwritten word spotting applications: query fusion and relevance feedback
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.282
Marçal Rusiñol, J. Lladós
In this paper we present the importance of including the user in the loop in a handwritten word spotting framework. Several off-the-shelf query fusion and relevance feedback strategies have been tested in the handwritten word spotting context. The increase in terms of precision when the user is included in the loop is assessed using two datasets of historical handwritten documents and a baseline word spotting approach based on a bag-of-visual-words model.
Citations: 6
Model-Based Tabular Structure Detection and Recognition in Noisy Handwritten Documents
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.233
Jin Chen, D. Lopresti
Tabular structure detection and recognition can be a valuable step in the analysis of unstructured documents. The noisy handwritten documents we try to analyze may contain pre-printed rulings as the substrate, hand-drawn rulings, machine-printed text, handwritten text, and signatures, in addition to the tabular structures that we wish to decompose into basic cells, rows, and columns. Although work has been done on machine-printed documents, noisy handwritten documents may require modified and/or new techniques. In this work, we try to detect tabular structures and decompose them into 2-D grids of table cells simultaneously. First, we detect "key points" that help determine the physical and logical structure of tables. Then, we make use of the 2-D grid assumption to build grids of key points. Finally, we extract structural features for the Min-Cut/Max-Flow algorithm to recognize tabular structures. Experiments on 22 tables containing 584 table cells show a cell precision of 100% and a cell recall of 93.3%.
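The cell precision and cell recall reported here can be read as follows, in a minimal sketch that assumes a detected cell counts as correct only when it exactly matches a ground-truth cell. Cells are represented as hypothetical (row, column) pairs; the paper's actual matching criterion is not given in the abstract.

```python
def cell_precision_recall(detected, ground_truth):
    """Return (precision, recall) for detected table cells.

    detected, ground_truth: iterables of hashable cell identifiers,
    here illustrative (row, column) tuples.
    """
    detected, ground_truth = set(detected), set(ground_truth)
    correct = len(detected & ground_truth)
    # Precision: share of detected cells that are real cells.
    precision = correct / len(detected) if detected else 0.0
    # Recall: share of real cells that were detected.
    recall = correct / len(ground_truth) if ground_truth else 0.0
    return precision, recall


# Toy 2x2 table (made up): one cell is missed, none is invented.
truth = {(0, 0), (0, 1), (1, 0), (1, 1)}
found = {(0, 0), (0, 1), (1, 0)}
print(cell_precision_recall(found, truth))
```

The toy case mirrors the pattern in the reported figures: perfect precision means no spurious cells were produced, while recall below 100% means some true cells were missed.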
Citations: 8
Character Image Patterns as Big Data
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.190
S. Uchida, R. Ishida, A. Yoshida, Wenjie Cai, Yaokai Feng
The ambitious goal of this research is to understand the real distribution of character patterns. Ideally, if we could collect all possible character patterns, we could fully understand how they are distributed in the image space. We would also have a perfect character recognizer, because we would know the correct class for any character image. Of course, it is practically impossible to collect all those patterns; however, if we collect character patterns on a massive scale and analyze how the distribution changes as the number of patterns increases, we will be able to estimate the real distribution asymptotically. For this purpose, we use 822,714 manually ground-truthed 32×32 handwritten digit patterns in this paper. The distribution of those patterns is observed by nearest neighbor analysis and network analysis, neither of which makes any approximation (such as a low-dimensional representation), and thus neither corrupts the details of the distribution.
Citations: 11
Benchmarking of update learning strategies on digit classifier systems
Pub Date : 2012-09-18 DOI: 10.1109/ICFHR.2012.186
D. Barbuzzi, D. Impedovo, G. Pirlo
Three different strategies for re-training classifiers when new labeled data become available are presented in a multi-expert scenario. The first uses the entire new dataset. The second rests on the consideration that each individual classifier can select new samples from among those it misclassifies. In the third, which inspects the behavior of the multi-expert system, a sample misclassified by an expert is used to update that classifier only if the ensemble of classifiers also misclassifies it. This paper compares the three approaches under different conditions on two state-of-the-art classifiers (SVM and Naive Bayes), taking into account four different combination techniques. Experiments were performed on the CEDAR (handwritten digit) database. The results are shown to depend on the number of new training samples, as well as on the specific combination decision scheme and on the classifiers in the ensemble.
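The three sample-selection strategies described in this abstract can be sketched as a single function. This is an illustrative reading of the abstract rather than the authors' code: the function name, the strategy labels, and the toy predictions are all hypothetical, and the re-training step itself is left out.

```python
def select_update_samples(labels, expert_preds, ensemble_preds, strategy):
    """Return indices of new samples used to re-train one expert.

    labels: true classes of the new labeled samples.
    expert_preds: that expert's predictions on the same samples.
    ensemble_preds: the combined multi-expert predictions.
    strategy: "all", "expert_errors", or "ensemble_errors" (assumed labels).
    """
    indices = range(len(labels))
    if strategy == "all":
        # Strategy 1: use the entire new dataset.
        return list(indices)
    if strategy == "expert_errors":
        # Strategy 2: keep only samples this expert misclassifies.
        return [i for i in indices if expert_preds[i] != labels[i]]
    if strategy == "ensemble_errors":
        # Strategy 3: keep samples misclassified by the expert
        # and also by the ensemble.
        return [i for i in indices
                if expert_preds[i] != labels[i]
                and ensemble_preds[i] != labels[i]]
    raise ValueError(f"unknown strategy: {strategy}")


# Toy digits (made up): the expert errs on samples 1 and 3,
# the ensemble errs only on sample 3.
labels = [3, 7, 1, 9]
expert = [3, 1, 1, 4]
ensemble = [3, 7, 1, 4]
for name in ("all", "expert_errors", "ensemble_errors"):
    print(name, select_update_samples(labels, expert, ensemble, name))
```

Each strategy selects a subset of the previous one, so the third re-trains on the fewest samples; how that trades off against accuracy is what the paper's benchmark examines.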
Citations: 7