Scene Text Detection with Cascaded Multidimensional Attention
Shan Dai
2023 3rd International Conference on Consumer Electronics and Computer Engineering (ICCECE), published 2023-01-06
DOI: 10.1109/ICCECE58074.2023.10135187
Citations: 0
Abstract
In recent years, scene text detection based on segmentation networks has progressed substantially, because its pixel-level description is well suited to detecting long and curved text. However, limited by their scale robustness and feature representation ability, most existing segmentation-based scene text detectors struggle to handle more complex forms of text, which are common in the real world. To tackle this problem, we propose a cascaded module based on the attention mechanism, termed CMAModule, which integrates a series of basic modules to augment the feature map and thereby improves the feature representation capability of the model. Our proposed network, CMANet, obtains higher recall and precision on two benchmarks.
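The abstract does not describe the internals of CMAModule, so the following is only a minimal sketch of the general idea of cascading attention over different dimensions of a feature map. Every name in it (`channel_attention`, `spatial_attention`, `cascaded_attention`), the parameter-free sigmoid gates, the number of stages, and the residual connection are assumptions made for illustration, not the paper's design. A real module would use learned weights.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function, mapping values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    """Hypothetical basic module: gate each channel by its global average.

    feat has shape (C, H, W). No learned weights are used here.
    """
    gate = sigmoid(feat.mean(axis=(1, 2)))   # shape (C,)
    return feat * gate[:, None, None]        # broadcast over H and W


def spatial_attention(feat):
    """Hypothetical basic module: gate each location by its channel average."""
    gate = sigmoid(feat.mean(axis=0))        # shape (H, W)
    return feat * gate[None, :, :]           # broadcast over C


def cascaded_attention(feat, stages=2):
    """Apply the two basic modules in sequence, `stages` times.

    The residual connection at the end is an assumption, added so the
    augmented map keeps the original features.
    """
    out = feat
    for _ in range(stages):
        out = spatial_attention(channel_attention(out))
    return out + feat


rng = np.random.default_rng(0)
feature_map = rng.standard_normal((8, 16, 16))   # toy (C, H, W) feature map
augmented = cascaded_attention(feature_map)
print(augmented.shape)
```

The cascade keeps the feature map's shape unchanged, so under these assumptions such a block could be inserted between a backbone and a segmentation head without altering the surrounding layers.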