A preprocessing method for printed Tamil documents: Skew correction and textual classification

2015 IEEE Seventh International Conference on Intelligent Computing and Information Systems (ICICIS), pp. 495-500. Pub Date: 2015-12-01. DOI: 10.1109/INTELCIS.2015.7397266

M. Ramanan, A. Ramanan, E. Charles


Citations: 8

Abstract




Optical character recognition (OCR) consists of four phases: preprocessing and segmentation, feature extraction, classification, and post-processing. This paper focuses on the preprocessing and segmentation tasks, which play a major role in the subsequent stages of an OCR system. The objective of preprocessing and segmentation is to improve the quality of the input image. In addition, this phase removes unnecessary portions of the input image that would otherwise complicate the subsequent steps of OCR and reduce the overall recognition rate. The preprocessing and segmentation step consists of many sub-processes, namely image binarisation, noise removal, skew detection and correction, page segmentation, text or non-text classification, line segmentation, word segmentation, and character segmentation. This paper proposes a new method to calculate the skew angle for skew correction. In addition, it proposes a more accurate method to segment the input image into blocks and classify the blocks as text or non-text. The skew angle is calculated on the scanned document using a Wiener filter, a smearing technique, and the Radon transform. The document image is segmented into blocks using the run-length smearing algorithm and connected component analysis. Basic, density, and histogram of oriented gradients (HOG) features are extracted from each block for text and non-text classification. The proposed methods are tested on 54 documents. The test results show a recognition rate of 96.30% for skew detection and correction, and a recognition rate of 99.18% for text or non-text classification with binary SVMs using an RBF kernel.
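The block segmentation step named in the abstract (run-length smearing followed by connected component analysis) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `rlsa_horizontal` and `connected_blocks`, the horizontal-only smearing, the rule that only gaps bounded by foreground on both sides are filled, the threshold value, the 8-connectivity, and the toy binary page are all assumptions chosen for illustration.

```python
from collections import deque


def rlsa_horizontal(image, threshold):
    """Horizontal run-length smearing (illustrative variant).

    image: list of rows, each a list of 0 (background) / 1 (foreground).
    A run of background pixels is filled when it lies between two
    foreground pixels in the same row and its length is <= threshold.
    Returns a new image; the input is left unchanged.
    """
    smeared = [row[:] for row in image]
    for row in smeared:
        last_fg = None  # column of the most recent foreground pixel
        for x, value in enumerate(row):
            if value == 1:
                if last_fg is not None and 0 < x - last_fg - 1 <= threshold:
                    for k in range(last_fg + 1, x):
                        row[k] = 1
                last_fg = x
    return smeared


def connected_blocks(image):
    """Bounding boxes (top, left, bottom, right) of 8-connected
    foreground components, in row-major order of discovery."""
    height = len(image)
    width = len(image[0]) if image else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1 or seen[y][x]:
                continue
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            seen[y][x] = True
            while queue:  # breadth-first flood fill of one component
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy page (hypothetical data): two "words" on the first text line,
# one on the second, separated by an empty row.
page = [
    [1, 0, 1, 0, 0, 0, 0, 1, 0, 1],
    [1, 0, 1, 0, 0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 1, 0, 1, 0, 0, 0, 0],
]
blocks = connected_blocks(rlsa_horizontal(page, threshold=2))
print(blocks)  # [(0, 0, 1, 2), (0, 7, 1, 9), (3, 0, 3, 5)]
```

Without smearing, the toy page splits into seven separate components; smearing with a threshold of 2 merges neighbouring strokes into three candidate blocks while leaving the wider four-pixel gap open. In a full pipeline, each bounding box would then be passed to feature extraction and the text or non-text classifier.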
