A Deep Neural Network-Based Approach for Extracting Textual Images from Deteriorate Images

B. Pandey, D. Pandey, Subodh Wariya, Gaurav Agarwal
Journal: EAI Endorsed Transactions on Industrial Networks and Intelligent Systems (JCR Q2, Engineering), article e3
Publication type: Journal Article
Published: 2021-09-17
DOI: 10.4108/eai.17-9-2021.170961 (https://doi.org/10.4108/eai.17-9-2021.170961)
Cited by: 6

Abstract

INTRODUCTION: The quantity of audio and visual data is increasing exponentially with the internet's rapid growth. The digital information in images and videos can be used for fully automated captioning, indexing, and image structuring. Online image and video collections have grown significantly, and the images and videos in such collections must be retrieved, explored, and inspected.

OBJECTIVES: Text extraction is crucial for locating important data. Noise is a critical factor affecting image quality; it is introduced primarily during image acquisition and transmission, and an image can be contaminated by several types of noise. Text in a complex image carries a variety of information that is used to recognise textual as well as non-textual details. These details in complex, corrupted images are considered important for viewers trying to grasp the full picture. However, text in complex degraded images varies rapidly in form under unconstrained conditions, which makes text recognition difficult.

METHODS: A weighted naïve Bayes algorithm, used as a weighted reading technique, generates the correct text data from complex image regions. Because images usually contain some noise, filtering is applied during the early pre-processing step. To restore image quality, the input image is processed with gradient and contrast methods, and the contrast of the source image is then enhanced using an adaptive image map. Stroke width transform, Gabor transform, and a weighted naïve Bayes classifier are applied to complex degraded images for segmentation, feature extraction, and detection of textual and non-textual elements.

RESULTS: Finally, a combination of deep neural networks and particle swarm optimization is used to recognise the classified text.
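The abstract does not say how particle swarm optimization (PSO) is coupled to the network, so the following is only a minimal sketch of generic PSO, assuming it is used to search a parameter vector that minimises some loss. The function `pso_minimize`, its default coefficients, and the toy `sphere` objective (standing in for a network's validation loss) are illustrative assumptions, not the authors' implementation.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iters=100,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5,
                 bounds=(-5.0, 5.0), seed=0):
    """Generic PSO minimiser (hypothetical helper, not the paper's setup)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers the best position it has visited so far.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    # The swarm shares the best position found by any particle.
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to swarm best.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (g_pos[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def sphere(x):
    """Toy objective standing in for a model's loss (assumption)."""
    return sum(v * v for v in x)


best_x, best_f = pso_minimize(sphere, dim=3)
```

In a setting like the one described, `objective` would wrap a train-and-evaluate step for the network, which makes each evaluation far more expensive than this toy function.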
The IIIT5K dataset is used for development, and the performance of the proposed method is assessed using accuracy, recall, precision, and F1 score. The method performs well on document collections such as articles, even when they are significantly distorted, and is therefore suitable for building library information system databases.

CONCLUSION: A combination of a deep neural network and particle swarm optimization is used to recognise classified text. The IIIT5K dataset is used for development, and although high performance is achieved in terms of accuracy, recall, precision, and F1 score, characters may occasionally deviate. In other cases, the same character is frequently extracted [3] multiple times, which may result in incorrect text being extracted from natural images. An efficient technique for avoiding such flaws in the text retrieval process therefore needs to be developed in future work.
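The four evaluation measures named above can be computed from a confusion count, as in the sketch below. It assumes a binary text / non-text labelling; the abstract does not state the label scheme or any averaging, and the function name `classification_metrics` and the sample labels are hypothetical.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up labels: 1 = text region, 0 = non-text region.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]
metrics = classification_metrics(y_true, y_pred)
# accuracy = 5/8, precision = 2/3, recall = 1/2, f1 = 4/7
```

Reporting precision and recall alongside accuracy matters here because text regions are often a small fraction of an image, so accuracy alone can look high even when most text is missed.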