Document Image Dewarping using Deep Learning

Vijaya Kumar Bajjer Ramanna, S. S. Bukhari, A. Dengel
International Conference on Pattern Recognition Applications and Methods, published 2019-02-19
DOI: 10.5220/0007368405240531 (https://doi.org/10.5220/0007368405240531)
Citations: 15

Abstract

Distorted images have long been a major problem for Optical Character Recognition (OCR), and dewarping has become a principal preprocessing step for performing OCR on them. This paper presents a new document dewarping method that removes curl and geometric distortion from modern and historical documents. Most traditional dewarping algorithms are based on text-line feature extraction and segmentation. However, extracting and segmenting textual content can be complicated. The proposed technique therefore does not need any complicated text-line processing. It is based on Deep Learning and can be applied to all types of text documents, including documents with images and graphics. Moreover, no preprocessing is required before applying the method to warped images. In the proposed system, document distortion is treated as an image-to-image translation problem. The method is implemented with the pix2pixHD network, which uses Conditional Generative Adversarial Networks (CGAN). The network is trained on the UW3 dataset, with a distorted document as the input and the clean image as the target. The images generated by the proposed method are cleanly dewarped and of high resolution, and they can be used to perform OCR. Finally, the proposed method is evaluated and compared with an existing computer-vision-based method.
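
The image-to-image formulation can be made concrete with the standard conditional GAN objective. The sketch below is the generic textbook objective, not a formula taken from this paper; the symbols x (warped document image), y (clean target image), G (generator), and D (discriminator) are notation assumed here for illustration.

```latex
% Generic conditional GAN objective (assumed notation, not from the paper):
%   x = warped document image (network input)
%   y = clean, flat document image (training target)
%   G = generator mapping x to a dewarped image
%   D = discriminator scoring (input, output) pairs as real or generated
\min_{G} \max_{D} \;
  \mathbb{E}_{(x,y)}\bigl[\log D(x, y)\bigr]
  + \mathbb{E}_{x}\bigl[\log\bigl(1 - D(x, G(x))\bigr)\bigr]
```

pix2pixHD extends this objective with multi-scale discriminators and a feature-matching loss; how these are configured for dewarping is not stated in the abstract.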