Chen Du, Chunheng Wang, Yanna Wang, Zipeng Feng, Jiyuan Zhang
TextEdge: Multi-oriented Scene Text Detection via Region Segmentation and Edge Classification
2019 International Conference on Document Analysis and Recognition (ICDAR), published 2019-09-01
DOI: 10.1109/ICDAR.2019.00067 (https://doi.org/10.1109/ICDAR.2019.00067)
Citations: 5
Abstract
Scene text detection algorithms based on semantic segmentation typically use bounding-box regions, or shrunken versions of them, to represent text pixels. However, the non-text pixels inside these regions tend to degrade detection performance, because semantic segmentation methods need accurate pixel-level annotations to perform well and are sensitive to noise and interference. In this work, we propose TextEdge, a method based on a fully convolutional network (FCN) for multi-oriented scene text detection. Unlike previous methods that simply use bounding-box regions as a segmentation mask, TextEdge introduces the text-region edge map as a new segmentation mask. Edge information is more representative of text areas and is shown to be effective in improving detection performance. TextEdge is optimized end to end with multi-task outputs: text/non-text classification, text-edge prediction, and text boundary regression. Experiments on standard datasets demonstrate that the proposed method achieves state-of-the-art performance in both accuracy and efficiency. Specifically, it achieves an F-score of 0.88 on the ICDAR 2013 dataset and 0.86 on the ICDAR 2015 dataset.
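
The abstract does not say how the text-region edge map is constructed, so the following is only a rough sketch of the contrast it draws between a bounding-box region mask and an edge mask. It assumes axis-aligned boxes with inclusive pixel coordinates and a fixed-thickness band along the inside of the box border; the names `region_mask`, `edge_mask`, and `thickness` are hypothetical and are not taken from the paper, which targets multi-oriented text.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1), inclusive pixel coordinates.
Box = Tuple[int, int, int, int]


def region_mask(height: int, width: int, box: Box) -> List[List[int]]:
    """Binary mask that is 1 inside the box and 0 elsewhere."""
    x0, y0, x1, y1 = box
    return [
        [1 if x0 <= x <= x1 and y0 <= y <= y1 else 0 for x in range(width)]
        for y in range(height)
    ]


def edge_mask(height: int, width: int, box: Box, thickness: int = 1) -> List[List[int]]:
    """Binary mask that is 1 on a band of `thickness` pixels just inside the box border.

    Assumption for illustration: the band is the box region minus the box
    shrunk by `thickness` on every side.
    """
    x0, y0, x1, y1 = box
    inner = (x0 + thickness, y0 + thickness, x1 - thickness, y1 - thickness)
    outer_m = region_mask(height, width, box)
    inner_m = region_mask(height, width, inner)
    # The shrunk box is always contained in the original, so the difference is 0 or 1.
    return [[o - i for o, i in zip(orow, irow)] for orow, irow in zip(outer_m, inner_m)]


if __name__ == "__main__":
    for row in edge_mask(5, 6, (1, 1, 4, 3)):
        print("".join(str(v) for v in row))
```

In this toy setting the region mask marks every pixel in the box as text, including background pixels between characters, while the edge mask marks only the band around the border. A multi-task network of the kind the abstract describes would be trained against both kinds of target plus a boundary regression target.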