CSNet-PGNet: Algorithm for Scene Text Detection and Recognition

Guanjing Li
DOI: 10.1109/cvidliccea56201.2022.9824815 (https://doi.org/10.1109/cvidliccea56201.2022.9824815)
Published: 2022-05-20
Source (as listed): Vision, volume "3 1", pages 1217-1224
Citations: 1

Abstract

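Scene text detection and recognition have advanced rapidly in recent years, but two challenges remain poorly solved. First, semantic analysis based on convolutional neural networks with powerful ImageNet pre-training incurs high computational cost. Second, detection of scene text with irregular shapes and irregular word order is inaccurate. To address these problems, this paper proposes a novel, lightweight network module (CSNet-PGNet) for real-time reading of text of arbitrary shape and orientation. CSNet (Cross-Stage Cross-Scale network) is an extremely lightweight cross-stage, cross-scale network that abandons the cumbersome CNN backbone (semantic classification) and can be trained from scratch. PGNet (Point Gathering Network) is a text detector and recognizer that handles text of any shape without Non-Maximum Suppression (NMS) or Region of Interest (RoI) operations, giving it the advantages of end-to-end simplicity and efficiency. The proposed CSNet-PGNet method for curved scene text detection and recognition is a step toward more efficient and precise detection of scene text of any shape.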
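For context on the post-processing step the abstract says PGNet does without, the following is a minimal sketch of conventional greedy NMS over axis-aligned boxes. It is not taken from the paper and is not part of CSNet-PGNet; the names `iou` and `greedy_nms`, the `(x1, y1, x2, y2)` box layout, and the 0.5 threshold are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_threshold: float = 0.5) -> List[int]:
    """Return indices of kept boxes, highest score first.

    A box is kept only if it does not overlap an already-kept box
    by more than iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(greedy_nms(demo_boxes, demo_scores))
```

Every candidate is compared against each box kept so far, so the cost grows with the number of proposals. The sketch also relies on axis-aligned rectangles, which fit curved or arbitrarily shaped text poorly.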