Binarization and Segmentation of Kannada Handwritten Document Images
Vinod H.C, S. Niranjan
2018 Second International Conference on Green Computing and Internet of Things (ICGCIoT), 2018-08-01
https://doi.org/10.1109/ICGCIOT.2018.8753039
Binarization of document images is a major phase in the handwritten text recognition process. Text recognition gives its best results, and is easiest to achieve, for printed documents; handwritten character recognition needs more accurate and faster binarization and segmentation methods to reach high accuracy. In this paper we present two modules: document binarization and segmentation. Document binarization is carried out using Haar wavelet decomposition, a Laplacian mask, maximum gradient difference, a median filter, and morphological operators. Segmentation applies the projection profile method and paragraph skew correction recursively until the height of the segmented line image is less than 7% of the input image; Connected Component Analysis is then used to segment words. The segmented words can be fed to OCR for recognition; the experimental results are encouraging.
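The segmentation stage described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the page is already binarized into a list of rows with 1 for ink and 0 for background, it omits skew correction entirely, and it only flags line bands that still exceed the 7% height criterion instead of re-splitting them. The function names `segment_lines` and `connected_components`, the `max_ratio` parameter, and the choice of 8-connectivity are assumptions made for illustration.

```python
from collections import deque
from typing import List, Tuple

Image = List[List[int]]  # binarized image: 1 = ink, 0 = background


def segment_lines(img: Image, max_ratio: float = 0.07) -> List[Tuple[int, int, bool]]:
    """Split a page into text-line bands using the horizontal projection
    profile (number of ink pixels in each row).

    Returns (top, bottom, done) tuples with bottom exclusive. done is True
    when the band is shorter than max_ratio of the page height, i.e. it
    would need no further skew correction and re-splitting.
    """
    profile = [sum(row) for row in img]
    limit = max_ratio * len(img)
    bands, start = [], None
    for y, ink in enumerate(profile + [0]):  # sentinel closes a trailing band
        if ink > 0 and start is None:
            start = y
        elif ink == 0 and start is not None:
            bands.append((start, y, (y - start) < limit))
            start = None
    return bands


def connected_components(img: Image) -> List[Tuple[int, int, int, int]]:
    """Bounding boxes (top, left, bottom, right), ends exclusive, of the
    8-connected ink components, found by breadth-first search."""
    h = len(img)
    w = len(img[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y + 1, x + 1
            while queue:
                cy, cx = queue.popleft()
                top, left = min(top, cy), min(left, cx)
                bottom, right = max(bottom, cy + 1), max(right, cx + 1)
                for ny in range(max(cy - 1, 0), min(cy + 2, h)):
                    for nx in range(max(cx - 1, 0), min(cx + 2, w)):
                        if img[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy 40-row page: a thin line with two blobs and a thick, unresolved band.
page = [[0] * 12 for _ in range(40)]
for y in (5, 6):
    for x in (2, 3, 4, 7, 8, 9):
        page[y][x] = 1
for y in range(20, 25):
    for x in range(1, 11):
        page[y][x] = 1

lines = segment_lines(page)
top, bottom, _ = lines[0]
blobs = connected_components(page[top:bottom])
print(lines)  # [(5, 7, True), (20, 25, False)]
print(blobs)  # [(0, 2, 2, 5), (0, 7, 2, 10)]
```

In this toy page the first band is 2 rows tall, below 7% of 40 rows, so it is marked as done; the second band is 5 rows tall and would go through another round of skew correction and splitting in the scheme the abstract describes. Grouping component boxes into words (for example by horizontal gap) is left out because the abstract does not specify how it is done.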