利用不规则金字塔对灰度图像进行文本分割和二值化

Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings. Pub Date : 2003-08-03 DOI:10.1109/ICDAR.2003.1227733

Poh Kok Loo, C. Tan

{"title":"利用不规则金字塔对灰度图像进行文本分割和二值化","authors":"Poh Kok Loo, C. Tan","doi":"10.1109/ICDAR.2003.1227733","DOIUrl":null,"url":null,"abstract":"Compared to binary images that most text extraction methods work on, gray scale images provide much more information for the extraction task. On the other hand complication also arises in determining the subject textual content from its background region (i.e. thresholding) before the actual text extraction process can begin. Differing from the usual sequence of processes where document images are binarized before the actual text extraction, this paper proposes a new method by first segmenting individual subject areas with the help of an irregular pyramid to be followed by the binarization process. This permits the focus of attention only on the appropriate subject areas for the binarization process before text recognition. Our method overcomes the difficulty in global binarization to find a single value to fit all. It also avoids the common problem in most local thresholding technique of finding a suitable window size. As shown in our experimented result, our method performed well in both text segmentation and binarization by varying the sequence of processing.","PeriodicalId":249193,"journal":{"name":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","volume":"67 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2003-08-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Using irregular pyramid for text segmentation and binarization of gray scale images\",\"authors\":\"Poh Kok Loo, C. Tan\",\"doi\":\"10.1109/ICDAR.2003.1227733\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Compared to binary images that most text extraction methods work on, gray scale images provide much more information for the extraction task. On the other hand complication also arises in determining the subject textual content from its background region (i.e. thresholding) before the actual text extraction process can begin. Differing from the usual sequence of processes where document images are binarized before the actual text extraction, this paper proposes a new method by first segmenting individual subject areas with the help of an irregular pyramid to be followed by the binarization process. This permits the focus of attention only on the appropriate subject areas for the binarization process before text recognition. Our method overcomes the difficulty in global binarization to find a single value to fit all. It also avoids the common problem in most local thresholding technique of finding a suitable window size. As shown in our experimented result, our method performed well in both text segmentation and binarization by varying the sequence of processing.\",\"PeriodicalId\":249193,\"journal\":{\"name\":\"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.\",\"volume\":\"67 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2003-08-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDAR.2003.1227733\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDAR.2003.1227733","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 6

摘要

与大多数文本提取方法所处理的二值图像相比，灰度图像为提取任务提供了更多的信息。另一方面，在实际文本提取过程开始之前，从其背景区域确定主题文本内容(即阈值化)也会产生复杂性。与通常在实际文本提取之前对文档图像进行二值化的处理顺序不同，本文提出了一种新的方法，首先利用不规则金字塔对单个主题区域进行分割，然后进行二值化处理。这允许将注意力集中在文本识别之前的二值化过程的适当主题领域上。我们的方法克服了全局二值化中难以找到一个值来拟合所有值的困难。它还避免了大多数局部阈值技术中常见的寻找合适窗口大小的问题。实验结果表明，通过改变处理顺序，我们的方法在文本分割和二值化方面都表现良好。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Using irregular pyramid for text segmentation and binarization of gray scale images

Compared to binary images that most text extraction methods work on, gray scale images provide much more information for the extraction task. On the other hand complication also arises in determining the subject textual content from its background region (i.e. thresholding) before the actual text extraction process can begin. Differing from the usual sequence of processes where document images are binarized before the actual text extraction, this paper proposes a new method by first segmenting individual subject areas with the help of an irregular pyramid to be followed by the binarization process. This permits the focus of attention only on the appropriate subject areas for the binarization process before text recognition. Our method overcomes the difficulty in global binarization to find a single value to fit all. It also avoids the common problem in most local thresholding technique of finding a suitable window size. As shown in our experimented result, our method performed well in both text segmentation and binarization by varying the sequence of processing.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings.

自引率

0.00%

发文量

期刊最新文献

Impact of imperfect OCR on part-of-speech tagging Writer identification using innovative binarised features of handwritten numerals Word searching in CCITT group 4 compressed document images Exploiting reliability for dynamic selection of classi .ers by means of genetic algorithms Investigation of off-line Japanese signature verification using a pattern matching