Chen Du, Chunheng Wang, Yanna Wang, Zipeng Feng, Jiyuan Zhang
Published in: 2019 International Conference on Document Analysis and Recognition (ICDAR), 2019-09-01
DOI: 10.1109/ICDAR.2019.00067 (https://doi.org/10.1109/ICDAR.2019.00067)
TextEdge: Multi-oriented Scene Text Detection via Region Segmentation and Edge Classification
Scene text detection algorithms based on semantic segmentation typically use bounding-box regions, or shrunken versions of them, to represent text pixels. However, the non-text pixels inside these regions can easily degrade detection performance, because semantic segmentation methods need accurate pixel-level annotations to perform well and are sensitive to noise and interference. In this work, we propose TextEdge, a method based on a fully convolutional network (FCN) for multi-oriented scene text detection. Whereas previous methods simply use bounding-box regions as the segmentation mask, TextEdge introduces the text-region edge map as a new segmentation mask. Edge information is more representative of text areas and is shown to be effective in improving detection performance. TextEdge is optimized end to end with multi-task outputs: text/non-text classification, text-edge prediction, and text-boundary regression. Experiments on standard datasets demonstrate that the proposed method achieves state-of-the-art performance in both accuracy and efficiency. Specifically, it achieves an F-score of 0.88 on the ICDAR 2013 dataset and 0.86 on the ICDAR 2015 dataset.
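To make the idea of an edge-map mask concrete, the sketch below derives a region mask and an edge mask from a single quadrilateral annotation. The abstract does not say how TextEdge builds its edge labels, so this is an assumption for illustration only: the function names `region_mask` and `edge_mask`, the `thickness` parameter, and the choice of 4-neighbour erosion are all hypothetical, and the quadrilateral is assumed to be convex.

```python
import numpy as np


def region_mask(quad, height, width):
    """Rasterize a convex quadrilateral into a boolean mask.

    `quad` holds four (x, y) vertices in a consistent winding order.
    A pixel is inside when it lies on the same side of all four edges.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    quad = np.asarray(quad, dtype=float)
    crosses = []
    for i in range(4):
        x0, y0 = quad[i]
        x1, y1 = quad[(i + 1) % 4]
        # Cross product of the edge vector with the vector to each pixel.
        crosses.append((x1 - x0) * (ys - y0) - (y1 - y0) * (xs - x0))
    crosses = np.stack(crosses)
    # Accept either winding order.
    return np.all(crosses >= 0, axis=0) | np.all(crosses <= 0, axis=0)


def edge_mask(region, thickness=1):
    """Keep only region pixels within `thickness` 4-neighbour steps of non-text.

    Illustrative assumption: the edge band is the region minus its erosion.
    """
    interior = region.copy()
    for _ in range(thickness):
        padded = np.pad(interior, 1, constant_values=False)
        # A pixel survives erosion only if it and its 4 neighbours are set.
        interior = (
            padded[1:-1, 1:-1]
            & padded[:-2, 1:-1]
            & padded[2:, 1:-1]
            & padded[1:-1, :-2]
            & padded[1:-1, 2:]
        )
    return region & ~interior


# Hypothetical annotation: an axis-aligned box in an 8 x 10 image.
region = region_mask([(2, 2), (7, 2), (7, 5), (2, 5)], height=8, width=10)
edge = edge_mask(region, thickness=1)
```

Under these assumptions, `region` would serve as the target for the text/non-text output and `edge` as the target for the text-edge output; the boundary-regression targets are not sketched here because the abstract gives no detail on their form.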