CSNet-PGNet: Algorithm for Scene Text Detection and Recognition

Vision | Pub Date: 2022-05-20 | DOI: 10.1109/cvidliccea56201.2022.9824815

Guanjing Li

Vision, volume 3 1, pages 1217-1224. https://doi.org/10.1109/cvidliccea56201.2022.9824815

Citations: 1


CSNet-PGNet: Algorithm for Scene Text Detection and Recognition

Scene text detection and recognition have advanced rapidly in recent years, but two challenges remain largely unsolved. First, semantic analysis built on convolutional neural networks with heavy ImageNet pre-training incurs high computational cost. Second, detection is inaccurate for scene text with irregular shapes and irregular word order. To address these problems, this paper proposes CSNet-PGNet, a novel lightweight network module for real-time reading of text of arbitrary shape and orientation. CSNet (Cross-Stage Cross-Scale Network) is an extremely lightweight cross-stage, cross-scale network that abandons the cumbersome CNN backbone designed for semantic classification and can be trained from scratch. PGNet (Point Gathering Network) is a text detector and recognizer that handles text of any shape without Non-Maximum Suppression (NMS) or Region of Interest (RoI) operations, which keeps it simple and efficient end to end. The resulting CSNet-PGNet method for curved scene text detection and recognition is a step toward more efficient and more precise detection of scene text of any shape.
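For readers unfamiliar with the post-processing step that PGNet is said to avoid, the following is a minimal sketch of greedy Non-Maximum Suppression on axis-aligned boxes. It is not taken from the paper: the function names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box format, and the 0.5 overlap threshold are all assumptions chosen for illustration.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x1, y1, x2, y2), axis-aligned.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes that survive, highest score first."""
    # Visit candidates from most to least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two heavily overlapping detections and one separate detection.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

Each candidate is compared against every box already kept, so the cost grows with the number of candidates, and rectangular overlap is a poor fit for curved text. This is the kind of step the abstract says an NMS-free design leaves out.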


Source journal

Vision

Self-citation rate

0.00%

Publication count