
Latest publications from MOCR '13

Bag-of-features HMMs for segmentation-free Bangla word spotting
Pub Date : 2013-08-24 DOI: 10.1145/2505377.2505384
Leonard Rothacker, G. Fink, P. Banerjee, U. Bhattacharya, B. Chaudhuri
In this paper we present how Bag-of-Features Hidden Markov Models can be applied to printed Bangla word spotting. These statistical models allow for an easy adaptation to different problem domains, owing to the integration of automatically estimated visual appearance features with Hidden Markov Models for spatial sequential modeling. In our evaluation we report high retrieval scores on a new printed Bangla dataset. Furthermore, we outperform state-of-the-art results on the well-known George Washington word-spotting benchmark. Both results were achieved using an almost identical parametric configuration of the method.
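The abstract names a bag-of-features representation without detailing it. As a minimal illustrative sketch (not the authors' code), the standard construction hard-quantizes local descriptors against a learned visual codebook and pools them into a normalized histogram; the codebook and toy descriptors below are hypothetical.

```python
import numpy as np

def bof_histogram(descriptors, codebook):
    """Quantize local descriptors against a visual codebook and
    return a normalized bag-of-features histogram."""
    # Distance from every descriptor to every codebook entry.
    d = np.linalg.norm(descriptors[:, None, :] - codebook[None, :, :], axis=2)
    assignments = d.argmin(axis=1)  # hard vector quantization
    hist = np.bincount(assignments, minlength=len(codebook)).astype(float)
    return hist / hist.sum()        # normalize to a distribution

# Toy example: 2-D descriptors, codebook of 3 visual words.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
desc = np.array([[0.1, 0.0], [0.9, 1.1], [1.9, 0.1], [0.0, 0.1]])
h = bof_histogram(desc, codebook)
```

In a BoF-HMM, such histograms (computed in a sliding window along the writing direction) would serve as the frame-wise observations of the HMM.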
Citations: 16
HMM-based script identification for OCR
Pub Date : 2013-08-24 DOI: 10.1145/2505377.2505382
Dmitriy Genzel, Ashok Popat, R. Teunen, Yasuhisa Fujii
While current OCR systems are able to recognize text in an increasing number of scripts and languages, typically they still need to be told in advance what those scripts and languages are. We propose an approach that repurposes the same HMM-based system used for OCR to the task of script/language ID, by replacing character labels with script class labels. We apply it in a multi-pass overall OCR process which achieves "universal" OCR over 54 tested languages in 18 distinct scripts, over a wide variety of typefaces in each. For comparison we also consider a brute-force approach, wherein a single HMM-based OCR system is trained to recognize all considered scripts. Results are presented on a large and diverse evaluation set extracted from book images, both for script identification accuracy and for overall OCR accuracy. On this evaluation data, the script ID system provided a script ID error rate of 1.73% for 18 distinct scripts. The end-to-end OCR system with the script ID system achieved a character error rate of 4.05%, an increase of 0.77% over the case where the languages are known a priori.
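The character error rates quoted above are conventionally defined as the Levenshtein edit distance between the recognized text and the reference, divided by the reference length. A self-contained sketch of that metric (an illustration, not the authors' evaluation code):

```python
def char_error_rate(reference, hypothesis):
    """Levenshtein edit distance between two strings, divided by
    the reference length: the character error rate (CER)."""
    m, n = len(reference), len(hypothesis)
    dp = list(range(n + 1))  # rolling row of the DP table
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dp[j] = min(dp[j] + 1,      # deletion
                        dp[j - 1] + 1,  # insertion
                        prev + cost)    # substitution or match
            prev = cur
    return dp[n] / m

# One substitution in a 9-character reference -> CER of 1/9.
cer = char_error_rate("universal", "unixersal")
```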
Citations: 8
Ruling-based table analysis for noisy handwritten documents
Pub Date : 2013-08-24 DOI: 10.1145/2505377.2505392
Jin Chen, D. Lopresti
Table analysis can be a valuable step in document image analysis. In the case of noisy handwritten documents, various artifacts complicate the task of locating tables on a page and segmenting them into cells. Our ruling-based approach first detects line segments to ensure high recall of table rulings, and then computes the intersections of horizontal and vertical rulings as key points. We then employ an optimization procedure to select the most probable subset of these key points which constitute the table structure. Finally, we decompose a table into a 2-D arrangement of cells using the key points. Experimental evaluation involving 61 handwritten pages from 17 table classes shows a table-cell precision of 89% and a recall of 88%.
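The key-point step above can be sketched concretely. Assuming (hypothetically) that detected rulings are reduced to axis-aligned segments, the crossings of horizontal and vertical rulings are the candidate key points; a tolerance absorbs small gaps from noisy strokes. This is an illustration, not the paper's implementation:

```python
def ruling_key_points(h_rulings, v_rulings, tol=2.0):
    """Intersect horizontal and vertical ruling segments.
    Each horizontal ruling is (y, x0, x1); each vertical is (x, y0, y1).
    Returns the (x, y) crossings, i.e. candidate table key points."""
    points = []
    for y, x0, x1 in h_rulings:
        for x, y0, y1 in v_rulings:
            # A crossing exists where the vertical's x lies on the
            # horizontal span and vice versa (within a tolerance).
            if x0 - tol <= x <= x1 + tol and y0 - tol <= y <= y1 + tol:
                points.append((x, y))
    return points

# A 2x2 grid: two horizontal and two vertical rulings -> 4 crossings.
pts = ruling_key_points([(0, 0, 100), (50, 0, 100)],
                        [(0, 0, 50), (100, 0, 50)])
```

The paper's optimization procedure would then select the most probable subset of such points; that selection step is not reproduced here.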
Citations: 6
Multi-script robust reading competition in ICDAR 2013
Pub Date : 2013-08-24 DOI: 10.1145/2505377.2505390
D. Kumar, M. Prasad, A. Ramakrishnan
A competition was organized by the authors to detect text in scene images. The motivation was to look for script-independent algorithms that detect and extract text from scene images, which may then be applied directly to an unknown script. The competition had four distinct tasks: (i) text localization and (ii) segmentation from scene images containing one or more of Kannada, Tamil, Hindi, Chinese and English words, and (iii) English and (iv) Kannada word recognition from scene word images. There were four submissions in total for the text localization and segmentation tasks. For the other two tasks, we evaluated two algorithms we had published earlier, namely nonlinear enhancement and selection of plane, and midline analysis and propagation of segmentation. The relative standing of each algorithm is discussed, and suggestions are provided to improve the quality of the algorithms. A graphical depiction of the f-score of individual images, in the form of benchmark values, is proposed to show the strength of an algorithm.
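The f-score used to rank entries is the standard harmonic combination of precision and recall. For reference, a minimal definition (the competition's exact evaluation protocol is not reproduced here):

```python
def f_score(precision, recall, beta=1.0):
    """F-measure combining precision and recall; beta > 1 weights
    recall more heavily, beta < 1 weights precision."""
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Example: balanced precision and recall give the same f-score.
f = f_score(0.5, 0.5)
```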
Citations: 32
An approach for Bangla and Devanagari video text recognition
Pub Date : 2013-08-24 DOI: 10.1145/2505377.2505389
P. Banerjee, B. Chaudhuri
Extraction and recognition of Bangla text from video frame images is challenging due to variation in font type and style, complex color backgrounds, low resolution, and low contrast. In this paper, we propose an algorithm for extraction and recognition of Bangla and Devanagari text from video frames with complex backgrounds. A two-step approach is proposed. After text localization, the text line is segmented into words using information based on line contours. First-order gradient values of the text blocks are used to find the word gaps. Next, an adaptive SIS binarization technique is applied to each word, and the binarized text block is sent to a state-of-the-art OCR engine for recognition.
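The abstract does not detail how gradients yield word gaps; one plausible reading, sketched below as a hypothetical illustration (not the authors' algorithm), is that columns of a text-line image with near-zero first-order gradient energy over a sufficiently long run mark inter-word whitespace. The threshold and minimum run length are assumed parameters:

```python
import numpy as np

def word_gaps(gray_line, thresh=1.0, min_run=3):
    """Locate candidate word gaps in a text-line image: columns whose
    first-order horizontal gradient energy stays near zero for at
    least `min_run` consecutive pixels."""
    grad = np.abs(np.diff(gray_line.astype(float), axis=1))
    energy = grad.sum(axis=0)          # per-column gradient energy
    low = energy < thresh
    gaps, start = [], None
    for i, flag in enumerate(low):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                gaps.append((start, i))
            start = None
    if start is not None and len(low) - start >= min_run:
        gaps.append((start, len(low)))
    return gaps

# Toy line: two "words" of vertical strokes separated by a flat region.
line = np.zeros((4, 20))
line[:, 0:5:2] = 255    # strokes of word one (columns 0, 2, 4)
line[:, 14:19:2] = 255  # strokes of word two (columns 14, 16, 18)
gaps = word_gaps(line)
```

The adaptive SIS binarization step named in the abstract is a separate, published thresholding technique and is not reproduced here.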
Citations: 11
Unconstrained handwritten Devanagari character recognition using convolutional neural networks
Pub Date : 2013-08-24 DOI: 10.1145/2505377.2505386
Kapil Mehrotra, Saumya Jetley, Akash Deshmukh, S. Belhe
In this paper, we introduce a novel offline strategy for recognition of online handwritten Devanagari characters entered in an unconstrained manner. Unlike previous approaches based on standard classifiers (SVM, HMM, ANN) trained on statistical, structural or spectral features, our CNN-based method allows writers to enter characters in any number or order of strokes and is also robust to a certain amount of overwriting. The CNN architecture supports an increased set of 42 Devanagari character classes. Experiments with 10 different CNN configurations, using both exponential-decay and inverse-scale-annealing approaches to convergence, show highly promising results. In a further improvement, the final-layer neuron outputs of the top 3 configurations are averaged and used to make the classification decision, achieving an accuracy of 99.82% on the training data and 98.19% on the test data. This marks an improvement of 0.2% and 5.81%, for the training and test sets respectively, over the existing state of the art in unconstrained input. The data used for building the system were obtained from different parts of the Devanagari-writing states of India, in the form of isolated words. Character-level data are extracted from the collected words using a hybrid approach and cover all possible variations owing to the different writing styles and varied parent-word structures.
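The final improvement described above is output averaging over the top configurations. A minimal sketch of that ensembling step, on hypothetical per-class probability outputs (the networks themselves are not reproduced):

```python
import numpy as np

def ensemble_predict(prob_outputs):
    """Average the final-layer class probabilities of several trained
    configurations and return the argmax class and the averaged
    distribution."""
    avg = np.mean(prob_outputs, axis=0)
    return int(avg.argmax()), avg

# Three hypothetical configurations voting over 4 classes.
probs = np.array([[0.6, 0.2, 0.1, 0.1],
                  [0.5, 0.3, 0.1, 0.1],
                  [0.2, 0.5, 0.2, 0.1]])
cls, avg = ensemble_predict(probs)
```

Averaging probabilities rather than taking a majority vote lets a confident minority outvote two weakly confident models, which is one common motivation for this design.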
Citations: 31
Global and local features for recognition of online handwritten numerals and Tamil characters
Pub Date : 2013-08-24 DOI: 10.1145/2505377.2505391
A. Ramakrishnan, K. Urala
Feature extraction is a key step in the recognition of online handwritten data and is well investigated in the literature. For Tamil online handwritten characters, global features such as those derived from the discrete Fourier transform (DFT), discrete cosine transform (DCT) and wavelet transform have been used to capture overall information about the data. On the other hand, local features such as (x, y) coordinates, nth derivatives, curvature and angular features have also been used. In this paper, we investigate the efficacy of using global features alone (DFT, DCT), local features alone (preprocessed (x, y) coordinates), and a combination of both global and local features. Our classifier, a support vector machine (SVM) with a radial basis function (RBF) kernel, is trained and tested on the IWFHR 2006 Tamil handwritten character recognition competition dataset. We obtain more than 95% accuracy on the test dataset, which exceeds the best score reported in the literature. Further, using a combination of global and local features on a publicly available database of Indo-Arabic numerals, we obtain an accuracy of more than 98%.
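The paper's exact DFT features are not specified in the abstract; a common construction for online strokes, sketched here as an assumption rather than the authors' definition, treats the resampled (x, y) trajectory as a complex signal and keeps the magnitudes of the leading Fourier coefficients (dropping the DC term makes the features translation-invariant):

```python
import numpy as np

def dft_features(xs, ys, n_coeffs=5):
    """Global DFT features of an online stroke: magnitudes of the
    first few Fourier coefficients of the complex trajectory
    x + i*y, skipping the DC term."""
    z = np.asarray(xs, float) + 1j * np.asarray(ys, float)
    spec = np.fft.fft(z)
    return np.abs(spec[1:n_coeffs + 1])  # no DC -> translation-invariant

# A unit-circle trajectory concentrates all energy in one coefficient.
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
feats = dft_features(np.cos(t), np.sin(t))
```

Such a fixed-length global vector can be concatenated with local per-point features before being fed to the SVM.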
Citations: 28
Re-targeting of multi-script document images for handheld devices
Pub Date : 2013-08-24 DOI: 10.1145/2505377.2505388
Soumyadeep Dey, J. Mukhopadhyay, S. Sural, Partha Bhowmick
We propose here a technique for transforming the layout of a printed document image into a new, user-conducive layout. Its objective is to render the document better on a low-resolution screen, providing comfort and convenience to the reader. Re-targeting starts by analyzing the document image in the spatial domain to identify its paragraphs. Text lines, words, characters and hyphenations are then recognized within each paragraph, and the necessary word stitching is performed to reproduce the paragraph at a width appropriate to the resolution of the display device. Test results and the related subjective evaluation on different datasets, especially pages scanned from Bengali and English magazines, demonstrate the strength and effectiveness of the proposed technique.
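The word-stitching step amounts to re-flowing word images into lines that fit the target screen. A minimal greedy sketch of that re-flow, operating on word pixel widths instead of actual images (all values hypothetical, not from the paper):

```python
def reflow(word_widths, screen_width, space=1):
    """Greedily pack words into lines no wider than `screen_width`.
    Each word is represented only by its pixel width; `space` is the
    inter-word gap. Returns a list of lines (lists of word widths)."""
    lines, cur, used = [], [], 0
    for w in word_widths:
        need = w if not cur else used + space + w
        if cur and need > screen_width:
            lines.append(cur)       # close the current line
            cur, used = [w], w      # start a new line with this word
        else:
            cur.append(w)
            used = need
    if cur:
        lines.append(cur)
    return lines

# Five words re-flowed for a 100-pixel-wide screen.
lines = reflow([40, 30, 50, 20, 60], screen_width=100)
```

A fuller implementation would also re-join hyphenated word halves before packing and scale the stitched line images to the device resolution, as described in the abstract.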
Citations: 0