Text Segmentation from Bangla Land Map Images
Samit Biswas, Amit Kumar Das, B. Chanda
Image Processing & Communications. Published 2014-03-01. DOI: 10.1515/ipc-2015-0003 (https://doi.org/10.1515/ipc-2015-0003)
Abstract
Text segmentation from land map images is a non-trivial task because map components are interleaved and overlap in complex spatial arrangements. In most Indic languages, including Bangla (the 6th most spoken language in the world), the characters of a word are connected by a headline ("matra" or "shirorekha"), which makes the whole word a single connected component. It has been observed that the Delaunay triangulation (DT) produces many small triangles in text regions compared with other regions of the map, a property that is especially discernible for text in Bangla and some other Indic scripts. This property is the primary cue exploited here to segment text from the complex background of land map images. The proposed approach is tested and compared with an existing method on a collected dataset of paper map images containing text in Bangla, an Indian regional language, and the results are encouraging.
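
The small-triangle property can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which points are triangulated or how "small" is defined, so the point set, the area threshold `max_area`, and all function names below are assumptions made for illustration. The triangulation uses a brute-force empty-circumcircle test, which is only practical for a handful of points in general position; a real pipeline would use an efficient DT library.

```python
from itertools import combinations


def signed_area2(a, b, c):
    """Twice the signed area of triangle abc (positive if counter-clockwise)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circumcircle of CCW triangle abc."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    return det > 0


def delaunay(points):
    """Brute-force Delaunay triangulation: keep every non-degenerate triple
    whose circumcircle contains no other point. Assumes general position
    (no four cocircular points); intended for tiny illustrative inputs."""
    triangles = []
    n = len(points)
    for i, j, k in combinations(range(n), 3):
        a, b, c = points[i], points[j], points[k]
        area2 = signed_area2(a, b, c)
        if area2 == 0:
            continue  # collinear triple, not a triangle
        if area2 < 0:
            b, c = c, b  # reorder to counter-clockwise
        others = (points[m] for m in range(n) if m not in (i, j, k))
        if not any(in_circumcircle(a, b, c, p) for p in others):
            triangles.append((i, j, k))
    return triangles


def small_triangle_vertices(points, triangles, max_area):
    """Indices of points that belong to at least one triangle whose area
    is at most max_area (a hypothetical threshold, not from the paper)."""
    keep = set()
    for i, j, k in triangles:
        area = abs(signed_area2(points[i], points[j], points[k])) / 2
        if area <= max_area:
            keep.update((i, j, k))
    return keep


# Four tightly packed points stand in for a text region; two distant
# points stand in for sparse map content such as boundary lines.
points = [(0, 0), (2, 0), (1, 2), (3, 2), (30, 10), (10, 40)]
triangles = delaunay(points)
text_like = small_triangle_vertices(points, triangles, max_area=4)
print(sorted(text_like))  # [0, 1, 2, 3]
```

In this toy input, every triangle that touches one of the two distant points has an area of at least 8, so only the four clustered points are retained as text candidates.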