A irregular text detection via dilated recombination and efficient reorganization on natural scene

Multimedia Systems (IF 3.5; Zone 3, Computer Science; Q2, COMPUTER SCIENCE, INFORMATION SYSTEMS) Pub Date: 2024-06-02 DOI: 10.1007/s00530-024-01360-6
Liwen Huang, Wenyuan Yang
Citations: 0

Abstract


In recent years, scene text detection has found a growing range of applications. For irregular text detection, however, balancing detection accuracy against real-time speed remains an essential consideration. To address this trade-off, we propose DENet, an efficient scene text detector that combines a Dilated Recombined Unit (DRU) and an Efficient Reorganized Unit (ERU). Input features are first fed into a DR-VanillaNet backbone, in which a DRU is inserted into every block to strengthen the connections between distant pixels. Next, an FPN equipped with the ERU exploits feature redundancy and partially permutes channels. Together, DRU and ERU improve precision at the cost of only a limited drop in speed. In addition, progressive scale expansion is adopted, which preserves the ability to correctly recover adjacent text instances. Experiments on the CTW1500 and Total-Text benchmark datasets show that the proposed model improves precision with only a limited drop in speed: precision reaches 84.29% and 85.30% on the two datasets, at 8.6 and 10.9 FPS, respectively.
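
The abstract mentions progressive scale expansion without describing it. As a rough illustration only, the sketch below shows the general idea as it is commonly described for kernel-based text detectors: connected components are labelled on the smallest predicted text kernel, and the labels are then grown breadth-first into each larger kernel, so that touching text regions keep separate labels. The function names, the 4-connectivity, and the toy grids are assumptions made for this example, not the authors' implementation.

```python
from collections import deque

# 4-connected neighbourhood (an assumption; 8-connectivity is equally plausible).
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def label_components(mask):
    """Label 4-connected components of a binary grid; labels start at 1."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    next_label = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and labels[y][x] == 0:
                next_label += 1
                labels[y][x] = next_label
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
    return labels


def progressive_scale_expansion(kernels):
    """Grow instance labels from the smallest kernel into each larger one.

    kernels: list of binary grids of equal size, ordered from the smallest
    text kernel to the largest (full text region).
    """
    labels = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Seed the queue with every pixel that already has a label.
        queue = deque((y, x) for y in range(h) for x in range(w) if labels[y][x])
        while queue:
            cy, cx = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = cy + dy, cx + dx
                # First label to reach a pixel keeps it (first come, first served).
                if (0 <= ny < h and 0 <= nx < w
                        and kernel[ny][nx] and labels[ny][nx] == 0):
                    labels[ny][nx] = labels[cy][cx]
                    queue.append((ny, nx))
    return labels


# Toy example: two text instances that merge into one blob at full scale.
small = [[0, 1, 0, 0, 0, 1, 0]]
full = [[1, 1, 1, 1, 1, 1, 1]]
print(progressive_scale_expansion([small, full]))  # [[1, 1, 1, 1, 2, 2, 2]]
```

Labelling the full-scale map directly would merge the two instances into a single component. Starting from the small kernels is what keeps them apart.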

Source journal
Multimedia Systems (Engineering and Technology; Computer Science: Theory and Methods)
CiteScore: 5.40
Self-citation rate: 7.70%
Articles published: 148
Review time: 4.5 months
About the journal: This journal details innovative research ideas, emerging technologies, state-of-the-art methods and tools in all aspects of multimedia computing, communication, storage, and applications. It features theoretical, experimental, and survey articles.