A Deep Learning model capable of producing heatmap probabilities for Characters in Natural Scenes.

Allen Joshey, Ashish Tiwari, Rakesh Sankar, Sahil Salim Makandar
{"title":"A Deep Learning model capable of producing heatmap probabilities for Characters in Natural Scenes.","authors":"Allen Joshey, Ashish Tiwari, Rakesh Sankar, Sahil Salim Makandar","doi":"10.1145/3480651.3480662","DOIUrl":null,"url":null,"abstract":"Text appearing in Natural settings come in all shapes, sizes and textures. Classical methods have often failed at extracting accurately the text present in naturally occurring scenes. Text appearing in the wild presents itself in forms of hierarchy organized as sentences, words and characters. Methods for detecting Text from everyday scenes of the real world have found success. Most real world datasets available are annotated on a word level or line level thereby limiting detection to words and not characters. Inspired by the works of Naver Labs on CRAFT [2] and Microsoft Research and Baidu Research's work on WordSup [5] by training models in a weakly supervised manner to gain character level predictions. We propose a computationally efficient architecture capable of providing similar results. Thus our model, once capable of producing character level annotation trained on Synthetic text can be used to fine tune for text appearing in natural settings. The methods discussed prove to be robust enough to identify text that could be curved or somewhat deformed appearing in natural settings. Our approach includes the generation of probabilities of the location of characters and the gaps between characters of which constitute a word, such that it becomes easier to localize characters and words. Our method goes to show comparable results as to CRAFT [2] with only 30% of the number of learnable parameters required.","PeriodicalId":305943,"journal":{"name":"Proceedings of the 2021 International Conference on Pattern Recognition and Intelligent Systems","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-07-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2021 International Conference on Pattern Recognition and Intelligent Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3480651.3480662","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Text appearing in Natural settings come in all shapes, sizes and textures. Classical methods have often failed at extracting accurately the text present in naturally occurring scenes. Text appearing in the wild presents itself in forms of hierarchy organized as sentences, words and characters. Methods for detecting Text from everyday scenes of the real world have found success. Most real world datasets available are annotated on a word level or line level thereby limiting detection to words and not characters. Inspired by the works of Naver Labs on CRAFT [2] and Microsoft Research and Baidu Research's work on WordSup [5] by training models in a weakly supervised manner to gain character level predictions. We propose a computationally efficient architecture capable of providing similar results. Thus our model, once capable of producing character level annotation trained on Synthetic text can be used to fine tune for text appearing in natural settings. The methods discussed prove to be robust enough to identify text that could be curved or somewhat deformed appearing in natural settings. Our approach includes the generation of probabilities of the location of characters and the gaps between characters of which constitute a word, such that it becomes easier to localize characters and words. Our method goes to show comparable results as to CRAFT [2] with only 30% of the number of learnable parameters required.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
一个深度学习模型,能够为自然场景中的角色生成热图概率。
在自然环境中出现的文本有各种形状、大小和纹理。经典的方法往往不能准确地提取自然发生场景中的文本。在野外出现的文本以句子、单词和字符的层次结构形式呈现出来。从现实世界的日常场景中检测文本的方法已经取得了成功。大多数可用的真实世界数据集都是在单词级别或行级别上进行注释的,因此限制了对单词而不是字符的检测。受到Naver实验室在CRAFT[2]上的工作以及微软研究院和百度研究院在WordSup[5]上的工作的启发,以弱监督的方式训练模型以获得字符水平预测。我们提出了一种计算效率高的架构,能够提供类似的结果。因此,我们的模型一旦能够生成在合成文本上训练的字符级注释,就可以用于微调自然环境中出现的文本。所讨论的方法被证明是足够健壮的,可以识别在自然环境中出现的弯曲或有些变形的文本。我们的方法包括生成字符位置的概率和组成单词的字符之间的间隔,这样就更容易定位字符和单词。我们的方法只需要30%的可学习参数,就能显示出与CRAFT[2]相当的结果。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Food safety pre-warning system based on Robust Principal Component Analysis and Improved Apriori Algorithm Synthetic Aperture Radar image target recognition based on hybrid attention mechanism Cell Detection by Robust Self-Trained Networks Research of DBN PLSR algorithm Based on Sparse Constraint Age Estimation from Facial Images using Transfer Learning and K-fold Cross-Validation
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1