Authors: S. Panwar, N. Nain
Venue: 2013 Fourth National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG)
Published: 2013-12-01
DOI: 10.1109/NCVPRIPG.2013.6776232 (https://doi.org/10.1109/NCVPRIPG.2013.6776232)
Cursive stroke sequencing for handwritten text documents recognition
Text segmentation is the process of splitting an image of a handwritten text document into pieces corresponding to single lines, words, and characters. The task is challenging because curved text lines with varying skew and slant angles appear frequently in handwritten documents. After segmenting words or strokes, that is, finding the connected components of the document, the strokes must be sequenced in document order so that the meaning of the document is preserved. In this paper, we use a bottom-up grouping approach for segmentation. A novel connectivity strength parameter is combined with depth-first search to extract the connected components belonging to the same line from the full set of connected components of the document. The exact sequence of connected components is stored in a sequential vector containing the component labels. The proposed cursive stroke sequencing technique is implemented and tested on the benchmark IAM database, with encouraging results. Quantitative analysis also shows that the approach gives better results than existing segmentation techniques and overcomes the problems encountered with hill-and-dale writing styles and with overlapping and touching lines. The accuracy of the proposed sequencing technique is 98%.
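The grouping step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the connectivity strength parameter, so the `strength` function below is a hypothetical stand-in (vertical overlap ratio between bounding boxes, zeroed when the horizontal gap exceeds `max_gap`), and the names `Box`, `sequence_components`, `threshold`, and `max_gap` are assumptions made for illustration, not the authors' definitions. The sketch assumes connected components have already been extracted and reduced to labelled bounding boxes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Bounding box of one connected component (hypothetical representation)."""
    label: int
    x0: int
    y0: int
    x1: int
    y1: int


def strength(a: Box, b: Box, max_gap: int = 50) -> float:
    """Assumed connectivity strength in [0, 1], not the paper's definition.

    Returns the vertical overlap divided by the smaller box height,
    or 0 if the boxes do not overlap vertically or are too far apart.
    """
    overlap = min(a.y1, b.y1) - max(a.y0, b.y0)
    if overlap <= 0:
        return 0.0
    gap = max(a.x0, b.x0) - min(a.x1, b.x1)  # negative if boxes overlap in x
    if gap > max_gap:
        return 0.0
    return overlap / min(a.y1 - a.y0, b.y1 - b.y0)


def sequence_components(boxes, threshold=0.5, max_gap=50):
    """Group boxes into lines with depth-first search, then return the
    labels in reading order (top line first, left to right in each line)."""
    by_label = {b.label: b for b in boxes}
    unvisited = set(by_label)
    lines = []
    while unvisited:
        # Seed each line with the topmost, then leftmost, unvisited component.
        seed = min(unvisited, key=lambda l: (by_label[l].y0, by_label[l].x0))
        unvisited.discard(seed)
        stack, line = [seed], []
        while stack:
            cur = stack.pop()
            line.append(cur)
            # Visit every unvisited component that is strongly connected to cur.
            for other in list(unvisited):
                if strength(by_label[cur], by_label[other], max_gap) >= threshold:
                    unvisited.discard(other)
                    stack.append(other)
        line.sort(key=lambda l: by_label[l].x0)
        lines.append(line)
    lines.sort(key=lambda ln: min(by_label[l].y0 for l in ln))
    # The flattened list plays the role of the sequential vector of labels.
    return [label for ln in lines for label in ln]


if __name__ == "__main__":
    # Two lines; the first drifts downward to mimic a hill-and-dale baseline.
    boxes = [
        Box(3, 50, 4, 70, 14), Box(5, 25, 31, 45, 41), Box(1, 0, 0, 20, 10),
        Box(4, 0, 30, 20, 40), Box(2, 25, 2, 45, 12),
    ]
    print(sequence_components(boxes))  # [1, 2, 3, 4, 5]
```

Because each component is compared only with its neighbours rather than with a fixed baseline, a line that drifts gradually up or down can still be followed as long as consecutive components overlap vertically. Whether this matches the behaviour of the paper's parameter is not stated in the abstract.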