Guangcun Wei, Wansheng Rong, Yongquan Liang, Xinguang Xiao, Xiang Liu
Title: Scene text spotting based on end-to-end
Journal: International Journal of Fuzzy Logic and Intelligent Systems
Publication date: 2021-01-01
DOI: https://doi.org/10.3233/JIFS-200903
Citations: 0
Abstract
To address the problem that traditional OCR pipelines ignore the inherent connection between text detection and text recognition, this paper proposes a novel end-to-end text spotting framework. The framework has three parts: a shared convolutional feature network, a text detector, and a text recognizer. Because the detector and the recognizer share the convolutional feature network, the two networks can be jointly optimized. This sharing reduces the computational burden and lets the model exploit the inherent connection between text detection and text recognition. The detector adds a TCM (Text Context Module) to Mask R-CNN, which effectively addresses the negative sample problem in text detection. The recognizer is based on SAM-BiLSTM (a spatial attention mechanism with BiLSTM), which extracts the semantic information between characters more effectively. The model significantly surpasses state-of-the-art methods on several text detection and text spotting benchmarks, including ICDAR 2015 and Total-Text.
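The idea of jointly optimizing two task heads on top of shared features can be illustrated with a toy sketch. It is not the authors' model: scalar weights stand in for the shared convolutional network, the detector, and the recognizer, and the squared-error losses, the weighting factor `lam`, and all function names are assumptions made for illustration. The point it shows is that the shared parameter receives a gradient contribution from both tasks, while the shared features are computed only once per step.

```python
# Toy illustration only: scalars stand in for the shared CNN and the two heads.
# All names, losses, and hyperparameters here are hypothetical.

def shared_features(w, x):
    """Shared backbone: computed once, reused by both heads."""
    return w * x


def joint_loss_and_grads(params, x, y_det, y_rec, lam=1.0):
    """Return the joint loss and its gradients w.r.t. (w, a, b)."""
    w, a, b = params
    f = shared_features(w, x)   # single forward pass through the shared part
    err_det = a * f - y_det     # detection head error
    err_rec = b * f - y_rec     # recognition head error
    loss = err_det ** 2 + lam * err_rec ** 2
    # Chain rule: the shared weight w is updated by both task losses.
    grad_w = 2 * err_det * a * x + lam * 2 * err_rec * b * x
    grad_a = 2 * err_det * f
    grad_b = lam * 2 * err_rec * f
    return loss, (grad_w, grad_a, grad_b)


def train(params, x, y_det, y_rec, lr=0.05, steps=500, lam=1.0):
    """Plain gradient descent on the joint loss; returns params and loss history."""
    history = []
    for _ in range(steps):
        loss, grads = joint_loss_and_grads(params, x, y_det, y_rec, lam)
        history.append(loss)
        params = tuple(p - lr * g for p, g in zip(params, grads))
    return params, history


params, history = train((0.5, 0.5, 0.5), x=1.0, y_det=1.0, y_rec=2.0)
print(f"initial loss {history[0]:.4f}, final loss {history[-1]:.6f}")
```

In a real text spotter the two heads would be a detection network and a sequence recognizer, and the loss terms would be task-specific, but the structure of the update (one shared forward pass, one summed loss) is the same.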
About the journal:
The International Journal of Fuzzy Logic and Intelligent Systems (pISSN 1598-2645, eISSN 2093-744X) is published quarterly by the Korean Institute of Intelligent Systems. The official title of the journal is International Journal of Fuzzy Logic and Intelligent Systems and the abbreviated title is Int. J. Fuzzy Log. Intell. Syst. Some or all of the articles in the journal are indexed in SCOPUS, Korea Citation Index (KCI), DOI/CrossRef, DBLP, and Google Scholar. The journal was launched in 2001 and is dedicated to disseminating well-defined theoretical and empirical study results that have a potential impact on the realization of intelligent systems based on fuzzy logic and intelligent systems theory. Specific topics include, but are not limited to: a) computational intelligence techniques including fuzzy logic systems, neural networks and evolutionary computation; b) intelligent control, instrumentation and robotics; c) adaptive signal and multimedia processing; d) intelligent information processing including pattern recognition and information processing; e) machine learning and smart systems including data mining and intelligent service practices; f) fuzzy theory and its applications.