Text line identification in Tagore's manuscript
Authors: Chandranath Adak, B. Chaudhuri
Proceedings of the 2014 IEEE Students' Technology Symposium
Publication date: 2014-08-28
DOI: 10.1109/TechSym.2014.6808048 (https://doi.org/10.1109/TechSym.2014.6808048)
Citation count: 0
Abstract
This paper proposes a text line identification method. Text lines in printed documents are easy to segment because the lines are uniformly straight and the gaps between them are sufficient. In handwritten documents, however, the lines are nonuniform and the interline gaps vary. We use a manuscript by Rabindranath Tagore, as it is among the most difficult manuscripts containing doodles. Our method begins with a preprocessing stage that cleans the document image. We then separate the doodles from the manuscript to obtain the textual region, and finally identify the text lines within that region. For text line identification, we use window examination, black run-length smearing, horizontal histogram and connected component analysis.
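As a rough illustration of two of the steps named above, the following minimal sketch applies a common form of horizontal run-length smearing and then reads line bands off a horizontal histogram. It is not the authors' implementation: the function names `smear_rows` and `line_bands`, the `max_gap` and `min_ink` parameters, and the toy page are all assumptions made for this example, and window examination, doodle separation and connected component analysis are left out.

```python
import numpy as np


def smear_rows(binary, max_gap):
    """Horizontal run-length smearing (assumed variant).

    In each row, background gaps of at most `max_gap` pixels that lie
    between two ink pixels are filled with ink.
    """
    out = binary.copy()
    for r in range(binary.shape[0]):
        cols = np.flatnonzero(binary[r])  # column indices of ink pixels
        for a, b in zip(cols[:-1], cols[1:]):
            # b - a - 1 is the number of background pixels between a and b
            if 1 < b - a <= max_gap + 1:
                out[r, a:b] = True
    return out


def line_bands(binary, min_ink=1):
    """Find candidate text lines from the horizontal histogram.

    Returns (top, bottom) row ranges, bottom exclusive, covering
    consecutive rows whose ink count is at least `min_ink`.
    """
    hist = binary.sum(axis=1)  # ink pixels per row
    active = hist >= min_ink
    bands = []
    start = None
    for r, on in enumerate(active):
        if on and start is None:
            start = r
        elif not on and start is not None:
            bands.append((start, r))
            start = None
    if start is not None:
        bands.append((start, len(active)))
    return bands


# Toy page (hypothetical data): two "words" on one line, one word on another.
page = np.zeros((12, 20), dtype=bool)
page[2:4, 1:5] = True
page[2:4, 8:12] = True
page[7:9, 3:15] = True

smeared = smear_rows(page, max_gap=3)
bands = line_bands(smeared)
```

On a real manuscript, skewed or touching lines make a plain row histogram unreliable, which is presumably why the abstract combines it with window examination and connected component analysis.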