
Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings. Latest papers

Recovery of writing sequence of static images of handwriting using UWM
K. K. Lau, P. Yuen, Y. Tang
It is generally agreed that an on-line recognition system is always more reliable than an off-line one. This is due to the availability of dynamic information, especially the writing sequence of the strokes. This paper presents a new statistical method to reconstruct the writing order of a handwritten script from a two-dimensional static image. The reconstruction process consists of two phases: the training phase and the testing phase. In the training phase, the writing order, together with other attributes such as length and direction, is extracted statistically from a set of training on-line handwritten scripts to form a universal writing model (UWM). In the testing phase, the UWM is applied to reconstruct the drawing order of off-line handwritten scripts by finding the highest total probability. 300 off-line signatures with ground truth are used for evaluation. Experimental results show that the writing sequence reconstructed using the UWM is close to the actual writing sequence.
DOI: 10.1109/ICDAR.2003.1227831
Citations: 12
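The testing phase in this abstract selects the stroke order with the highest total probability. The sketch below illustrates only that selection step. It assumes a hypothetical model made of a start-probability table and a pairwise transition table; the abstract does not describe the actual form of the UWM, and the brute-force search over permutations is chosen for clarity, not taken from the paper.

```python
from itertools import permutations
from math import log

def best_order(strokes, start_prob, trans_prob):
    """Return the stroke order with the highest total log-probability.

    start_prob[s]      -- assumed probability that stroke s is written first
    trans_prob[(a, b)] -- assumed probability that stroke b follows stroke a
    Both tables are hypothetical stand-ins for a trained model and must
    contain strictly positive values (log is undefined at zero).
    """
    best, best_score = None, float("-inf")
    for order in permutations(strokes):
        score = log(start_prob[order[0]])
        for a, b in zip(order, order[1:]):
            score += log(trans_prob[(a, b)])
        if score > best_score:
            best, best_score = order, score
    return list(best)

# Toy tables, invented for illustration only.
start = {"A": 0.7, "B": 0.2, "C": 0.1}
trans = {("A", "B"): 0.8, ("A", "C"): 0.2,
         ("B", "A"): 0.3, ("B", "C"): 0.7,
         ("C", "A"): 0.5, ("C", "B"): 0.5}
print(best_order(["A", "B", "C"], start, trans))  # ['A', 'B', 'C']
```

Exhaustive search grows factorially with the number of strokes, so a practical system would need a cheaper search strategy.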
Localization, extraction and recognition of text in Telugu document images
A. Negi, K. Shanker, C. K. Chereddi
In this paper we present a system to locate, extract and recognize Telugu text. The circular nature of Telugu script is exploited for segmenting text regions using the Hough Transform. First, the Hough Transform for circles is performed on the Sobel gradient magnitude of the image to locate text. The located circles are filled to yield text regions, followed by Recursive XY Cuts to segment the regions into paragraphs, lines and word regions. A region merging process with a bottom-up approach envelops individual words. Local binarization of the word MBRs (minimum bounding rectangles) yields connected components containing glyphs for recognition. The recognition process first identifies candidate characters by a zoning technique and then constructs structural feature vectors by cavity analysis. Finally, if required, crossing-count-based non-linear normalization and scaling is performed before template matching. The segmentation process succeeds in extracting text from images with complex non-Manhattan layouts. The recognition process gave a character recognition accuracy of 97%-98%.
DOI: 10.1109/ICDAR.2003.1227846
Citations: 31
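The Recursive XY Cut step named in this abstract can be illustrated with a small sketch. It assumes a binary image given as a list of 0/1 rows and splits on empty runs in the row and column projection profiles, alternating direction. The function names and the `min_gap` parameter are illustrative choices, not details from the paper, which applies the cuts to filled-circle text regions.

```python
def _segments(profile, min_gap):
    """Return (start, end) runs of non-empty bins, end-exclusive,
    merging runs separated by fewer than min_gap empty bins."""
    runs, start = [], None
    for i, v in enumerate(profile):
        if v and start is None:
            start = i
        elif not v and start is not None:
            runs.append([start, i])
            start = None
    if start is not None:
        runs.append([start, len(profile)])
    merged = []
    for r in runs:
        if merged and r[0] - merged[-1][1] < min_gap:
            merged[-1][1] = r[1]
        else:
            merged.append(r)
    return [tuple(r) for r in merged]

def xy_cut(img, box=None, min_gap=1, by_rows=True, stalled=False):
    """Recursively split img into blocks (top, bottom, left, right)."""
    if box is None:
        box = (0, len(img), 0, len(img[0]))
    top, bottom, left, right = box
    if by_rows:
        profile = [sum(img[r][left:right]) for r in range(top, bottom)]
    else:
        profile = [sum(img[r][c] for r in range(top, bottom))
                   for c in range(left, right)]
    segs = _segments(profile, min_gap)
    if not segs:
        return []  # region contains no foreground pixels
    if by_rows:
        parts = [(top + s, top + e, left, right) for s, e in segs]
    else:
        parts = [(top, bottom, left + s, left + e) for s, e in segs]
    if len(parts) == 1:
        if stalled:  # neither direction can split this block further
            return parts
        return xy_cut(img, parts[0], min_gap, not by_rows, stalled=True)
    blocks = []
    for p in parts:
        blocks.extend(xy_cut(img, p, min_gap, not by_rows))
    return blocks

page = [[1, 1, 0, 1, 1],
        [1, 1, 0, 1, 1],
        [0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1]]
print(xy_cut(page))  # [(0, 2, 0, 2), (0, 2, 3, 5), (3, 4, 0, 5)]
```

Raising `min_gap` keeps narrow gaps (for example, between characters) from triggering a cut, which is one way to stop at word or line level.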
Bank-check processing system: modifications due to the new European currency
N. Greco, D. Impedovo, M. Lucchese, A. Salzo, L. Sarcinella
The introduction of a new currency in Europe has changed the way both the courtesy amount and the legal amount are written on checks. This paper presents the most important modifications made to the bank-check processing system to solve the related problems, and proposes the software tools that must be used. The Computer Aided Software Engineering tools provided by the "Khoros" system are used to support the improvement of the system prototype. A visual programming environment is used to assemble the bank-check processing system, which can be easily modified and extended. The experimental results allow the adjournment of the improved system, as the modifications are introduced.
DOI: 10.1109/ICDAR.2003.1227686
Citations: 7
Determination of the number of writing variants with an HMM based cursive word recognition system
M. Schambach
An important parameter for building a cursive script model is the number of different, relevant letter writing variants. An algorithm performing this task automatically by optimizing the number of letter models in an HMM-based script recognition system is presented. The algorithm iteratively modifies selected letter models; for selection, quality measures like HMM distance and emission weight entropy are developed, and their correlation with recognition performance is shown. Theoretical measures for the selection of overall model complexity are presented, but best results are obtained by direct selection criteria: likelihood and recognition rate of training data. With the optimized models, an average improvement in recognition rate of up to 5.8 percent could be achieved.
DOI: 10.1109/ICDAR.2003.1227644
Citations: 12
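The abstract names emission weight entropy as one quality measure but does not define it. As a rough illustration only, the sketch below assumes it resembles the Shannon entropy of a state's normalized emission (mixture) weights. That reading is an assumption, and the function name `weight_entropy` is hypothetical.

```python
from math import log2

def weight_entropy(weights):
    """Shannon entropy (in bits) of a list of non-negative weights.

    Assumed interpretation, not the paper's definition: weights are
    normalized to sum to 1, and zero weights contribute nothing.
    """
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return sum(-p * log2(p) for p in probs)

# Evenly spread weights give high entropy; one dominant weight gives low entropy.
print(weight_entropy([0.25, 0.25, 0.25, 0.25]))
print(weight_entropy([1.0, 0.0, 0.0, 0.0]))
```

Under this reading, a letter model whose weights are spread evenly might be covering several writing variants at once, which is the kind of signal a selection step could use.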
Offline character recognition using online character writing information
Hiromitsu Nishimura, Takehiko Timikawa
Recognition of variously deformed character patterns is a salient subject for offline hand-printed character recognition. Sufficient recognition performance for practical use has not been achieved despite reports of many recognition techniques. Our research examines effective recognition techniques for deformed characters, extending conventional recognition techniques using online character writing information containing writing pressure data. A recognition system using simple pattern matching and HMM was built for evaluation experiments using common hand-printed English character patterns from the ETL6 database to determine the effectiveness of the proposed extended recognition method. Character recognition performance increased in both extended recognition methods using online writing information.
DOI: 10.1109/ICDAR.2003.1227653
Citations: 16
Generalized Hough transform for Arabic optical character recognition
S. Touj, N. Amara, H. Amiri
The Generalized Hough Transform is a technique used to detect arbitrary objects in a given image. This technique is known for its capacity to absorb distortions as well as noise. In the present paper, we describe an approach showing the efficiency of using the Generalized Hough Transform to recognize Arabic printed characters in their different shapes.
DOI: 10.1109/ICDAR.2003.1227856
Citations: 22
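For readers unfamiliar with the Generalized Hough Transform, the following minimal sketch shows its core voting idea: an R-table built from a template, then accumulator voting over image edge points. It handles translation only and assumes edge orientations are already quantized to integer labels. The function names and toy data are illustrative and are not drawn from the paper.

```python
from collections import defaultdict

def build_r_table(template_edges, ref):
    """Map each orientation label to offsets from edge point to reference point.

    template_edges -- iterable of (x, y, orientation_label)
    ref            -- (x, y) reference point of the template
    """
    table = defaultdict(list)
    for x, y, phi in template_edges:
        table[phi].append((ref[0] - x, ref[1] - y))
    return table

def vote(image_edges, r_table):
    """Accumulate votes for candidate reference-point locations."""
    acc = defaultdict(int)
    for x, y, phi in image_edges:
        for dx, dy in r_table.get(phi, []):
            acc[(x + dx, y + dy)] += 1
    return acc

# Toy template: four edge points around reference point (1, 1).
template = [(0, 0, 0), (2, 0, 90), (2, 2, 180), (0, 2, 270)]
r_table = build_r_table(template, ref=(1, 1))

# The same shape shifted by (5, 3), plus one spurious edge point as noise.
scene = [(x + 5, y + 3, phi) for x, y, phi in template] + [(9, 9, 90)]
acc = vote(scene, r_table)
peak = max(acc, key=acc.get)
print(peak, acc[peak])  # (6, 4) 4
```

The noise point casts a single stray vote that does not disturb the peak, which illustrates the tolerance to noise that the abstract mentions.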
An approach to extracting the target text line from a document image captured by a pen scanner
Zhenlong Bai, Qiang Huo
In this paper, we present a new approach to extracting the target text line from a document image captured by a pen scanner. Given the binary image, a set of possible text lines is first formed by nearest-neighbor grouping of connected components (CCs). They are then refined by merging text lines and adding the missed CCs. The possible target text line is identified by using a geometric-feature-based score function and fed to an OCR engine for character recognition. If the recognition result is confident enough, the target text line is accepted. Otherwise, all the remaining text lines are fed to the OCR engine to verify whether an alternative target text line exists or the whole image should be rejected. The effectiveness of the above approach is confirmed by experiments on a testing database consisting of 117 document images captured by C-Pen and ScanEye pen scanners.
DOI: 10.1109/ICDAR.2003.1227631
Citations: 8
Recognition of rotated characters by Eigen-space
H. Hase, Toshiyuki Shinokawa, M. Yoneda, C. Suen
In this paper, we present a method of recognizing inclined, rotated characters. First we construct an eigen sub-space for each category using the covariance matrix, which is calculated from a sufficient number of rotated characters. Next, we can obtain a locus by projecting their rotated characters onto the eigen sub-space and interpolating between their projected points. An unknown character is also projected onto the eigen sub-space of each category. Then, the verification is carried out by calculating the distance between the projected point of the unknown character and the locus. In our experiment, we obtained quite good results for the CENTURY font of 26 capital letters of the English alphabet (A, B, ..., Z). This method has the added advantage of obtaining the recognition result (category) and angle of inclination at the same time.
DOI: 10.1109/ICDAR.2003.1227758
Citations: 27
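The verification step in this abstract measures the distance from a projected point to a category's locus and yields an angle at the same time. The sketch below illustrates that step under simplifying assumptions: the locus is treated as an open piecewise-linear curve through sampled projections, and the angle is interpolated linearly along the nearest segment. The paper's actual interpolation scheme is not stated in the abstract, and the function name is hypothetical.

```python
def nearest_on_locus(point, locus):
    """Return (distance, interpolated_angle) from point to a polyline locus.

    locus -- list of (angle_degrees, coords) samples ordered by angle, where
             coords is a tuple in the (assumed) eigen sub-space.
    The locus is treated as open; a full 360-degree locus would also need
    the segment joining the last sample back to the first.
    """
    best_dist, best_angle = float("inf"), None
    for (a0, p0), (a1, p1) in zip(locus, locus[1:]):
        seg = [b - a for a, b in zip(p0, p1)]
        rel = [q - a for a, q in zip(p0, point)]
        denom = sum(s * s for s in seg)
        # Parameter of the closest point on the segment, clamped to [0, 1].
        t = 0.0 if denom == 0 else max(
            0.0, min(1.0, sum(s * r for s, r in zip(seg, rel)) / denom))
        closest = [a + t * s for a, s in zip(p0, seg)]
        dist = sum((q - c) ** 2 for q, c in zip(point, closest)) ** 0.5
        if dist < best_dist:
            best_dist, best_angle = dist, a0 + t * (a1 - a0)
    return best_dist, best_angle

# Toy 2-D locus sampled at 0, 90 and 180 degrees (values invented).
locus = [(0, (0.0, 0.0)), (90, (2.0, 0.0)), (180, (2.0, 2.0))]
print(nearest_on_locus((1.0, 0.5), locus))  # (0.5, 45.0)
```

Running this once per category and keeping the smallest distance would give both a category and an estimated rotation angle, matching the advantage claimed in the abstract.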
Recognition of printed Urdu script
Zaheer Ahmad, Jehanzeb Khan Orakzai, Inam Shamsher
This paper deals with an Optical Character Recognition system for printed Urdu, a popular Indian script. The development of OCR for this script is difficult because (i) a large number of characters have to be recognized and (ii) there are many similarly shaped characters. In the proposed system, individual characters are recognized using a combination of topological, contour and water-reservoir-concept-based features. The feature detection methods are simple and robust. A prototype of the system has been tested on printed Urdu characters and currently achieves 97.8% character-level accuracy on average.
DOI: 10.1109/ICDAR.2003.1227844
Citations: 113
Feature selection for ensembles: a hierarchical multi-objective genetic algorithm approach
Luiz Oliveira, R. Sabourin, Flávio Bortolozzi, C. Suen
Feature selection for ensembles has been shown to be an effective strategy for ensemble creation. In this paper we present an ensemble feature selection approach based on a hierarchical multi-objective genetic algorithm. The first level performs feature selection in order to generate a set of good classifiers, while the second one combines them to provide a set of powerful ensembles. The proposed method is evaluated in the context of handwritten digit recognition, using three different feature sets and neural networks (MLP) as classifiers. Experiments conducted on NIST SD19 demonstrated the effectiveness of the proposed strategy.
DOI: 10.1109/ICDAR.2003.1227748
Citations: 51
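Multi-objective genetic algorithms such as the one named in this abstract rank candidates by Pareto dominance rather than by a single score. The sketch below shows only that dominance filter. The two objectives used here (error rate and number of features, both minimized) are assumptions for illustration, since the abstract does not list the objectives actually optimized.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly
    better in at least one (all objectives are minimized)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical (error_rate, feature_count) pairs for candidate feature subsets.
cands = [(0.05, 40), (0.04, 60), (0.06, 35), (0.05, 50), (0.07, 80)]
print(pareto_front(cands))  # [(0.05, 40), (0.04, 60), (0.06, 35)]
```

Keeping the whole front instead of a single winner preserves a diverse pool of classifiers, which is useful when a second stage has to combine them into ensembles.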