Fused Confidence for Scene Text Detection via Intersection-over-Union
Guo-lin Zhang, L. Ge, Yu-nuo Yang, Yu-qi Liu, Kexue Sun
2019 IEEE 19th International Conference on Communication Technology (ICCT)
Published: 2019-10-01
DOI: 10.1109/ICCT46805.2019.8947307 (https://doi.org/10.1109/ICCT46805.2019.8947307)
Citations: 3
Abstract
CNN-based scene text detection methods have achieved strong results. Most are built on fully convolutional networks followed by non-maximum suppression (NMS), combining the two tasks of text classification and localization. However, most methods rank and filter bounding boxes in NMS by classification confidence alone. As a result, well-localized text boxes can be suppressed during NMS when another overlapping box has a higher classification confidence. In this paper, we propose an intersection-over-union (IOU) network that predicts the IOU between a bounding box and its matched ground truth. The predicted IOU serves as a localization confidence and is fused with the classification confidence. In NMS, the fused confidence then replaces the classification confidence as the ranking criterion, so that accurately localized text boxes are preserved. Experiments on the ICDAR2011 and ICDAR2013 datasets show that the proposed method improves text detection accuracy.
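The ranking change described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the two confidences are fused, so the plain product in `fuse` is an assumption made here for illustration only, as are the function names, the 0.5 overlap threshold, the axis-aligned box format, and the toy numbers; none of this is the authors' implementation.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse(cls_conf: float, pred_iou: float) -> float:
    """Hypothetical fusion rule (assumption): product of the two confidences."""
    return cls_conf * pred_iou


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score, keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Toy data (invented): boxes 0 and 1 cover the same word. Box 0 has the
# higher classification confidence, box 1 the higher predicted IOU.
boxes = [(0, 0, 100, 30), (5, 2, 105, 32), (200, 0, 300, 30)]
cls_conf = [0.95, 0.90, 0.80]
pred_iou = [0.55, 0.90, 0.85]  # stand-in for the IOU network's output

fused = [fuse(c, p) for c, p in zip(cls_conf, pred_iou)]
kept_by_cls = nms(boxes, cls_conf)   # ranks by classification confidence
kept_by_fused = nms(boxes, fused)    # ranks by fused confidence
print(kept_by_cls, kept_by_fused)
```

In this toy case, ranking by classification confidence keeps box 0 and suppresses the better-localized box 1, while ranking by the fused score keeps box 1 instead. Other fusion rules (for example a weighted sum) would fit the same interface by changing only `fuse`.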