OCR from Video Stream of Book Flipping

2013 2nd IAPR Asian Conference on Pattern Recognition. Pub Date: 2013-11-05. DOI: 10.1109/ACPR.2013.24

Dibyayan Chakraborty, P. Roy, J. Álvarez, U. Pal


Citations: 4

Abstract

Optical Character Recognition (OCR) in a video stream of flipping pages is a challenging task because pages flipped at random speeds make it difficult to identify the frames that contain the open page image (OPI), which offer better readability. In addition, low resolution, blur, and shadows add significant noise to the selection of suitable frames for OCR. In this work, we focus on the problem of identifying the set of optimal representative frames for each OPI in a video stream of flipping pages and then performing OCR without using any dedicated hardware. To the best of our knowledge, this is the first work in this area. We present an algorithm that exploits cues from the edge information of flipping pages. These cues, extracted from the region of interest (ROI) of each frame, indicate whether a page is in the flipping state or the open state. An SVM classifier is trained on the edge cues to make this determination. For each OPI we obtain a set of frames, and we choose the central frame of that set as the representative frame of the corresponding OPI and perform OCR on it. Experiments on video documents recorded with a standard-resolution camera validate the frame selection algorithm, which achieves 88% accuracy. On the selected frames, we obtain a character recognition accuracy of 82% and a word recognition accuracy of 77%.
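The central-frame selection step described in the abstract can be illustrated with a minimal sketch. It assumes that each frame has already been labelled as open (True) or flipping (False), for example by the SVM classifier, and it is not the authors' implementation. The function names and the `min_len` parameter (a threshold for discarding very short runs) are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def open_page_runs(is_open: Sequence[bool], min_len: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end inclusive, of each run of open-page frames.

    is_open[i] is True when frame i was classified as showing an open page.
    min_len is a hypothetical filter: runs shorter than this are dropped.
    """
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current open-page run began, if any
    for i, flag in enumerate(is_open):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last frame of the video.
    if start is not None and len(is_open) - start >= min_len:
        runs.append((start, len(is_open) - 1))
    return runs


def central_frames(is_open: Sequence[bool], min_len: int = 1) -> List[int]:
    """Pick the middle frame index of each open-page run as its representative."""
    return [(start + end) // 2 for start, end in open_page_runs(is_open, min_len)]


if __name__ == "__main__":
    labels = [False, True, True, True, False, False,
              True, True, True, True, True, False, True]
    print(central_frames(labels))             # one index per open-page run
    print(central_frames(labels, min_len=2))  # ignores the single-frame run
```

Each returned index identifies the frame that would be passed to the OCR engine for the corresponding open page. For runs with an even number of frames, this sketch takes the lower of the two middle frames.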


