OCR using CRNN: A Deep Learning Approach for Text Recognition

2023 4th International Conference for Emerging Technology (INCET) | Pub Date: 2023-05-26 | DOI: 10.1109/INCET57972.2023.10170436

Aditya Yadav, Shauryan Singh, Muzzamil Siddique, Nileshkumar Mehta, Archana Kotangale


Citations: 0

Abstract

Optical Character Recognition (OCR) is a widely used technology that converts text in images or handwritten text into digital form. However, recognizing handwritten text, printed text, and image text remains a significant challenge because of variations in writing styles and the complexity of characters. This paper proposes a novel approach to OCR using a Convolutional Recurrent Neural Network (CRNN), which combines convolutional neural networks (CNNs) and recurrent neural networks (RNNs). The proposed CRNN architecture automatically learns and extracts features from raw image pixels and recognizes sequential patterns of characters. The paper presents a robust OCR system built on a CRNN architecture with 7 convolutional layers and 2 LSTM layers for recognizing text in images with complex backgrounds and varying fonts. The proposed system achieved state-of-the-art performance on several benchmark datasets, demonstrating the effectiveness of the approach. Our experimental results show that the proposed CRNN approach outperforms other methods, achieving higher accuracy with lower latency when recognizing text from an image. We also analyze the impact of different parameters, such as the number of layers, filter sizes, and hidden units, on the performance of the CRNN model. This paper provides a comprehensive study of OCR using CRNN and its potential to improve the accuracy and efficiency of text recognition.
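
To make the "7 convolutional layers and 2 LSTM layers" description more concrete, the sketch below traces how an image might shrink into a sequence of feature vectors before it reaches the recurrent layers. The abstract gives only the layer counts, so everything else here is an assumption for illustration: the channel widths, kernel sizes, pooling factors, the 32x128 grayscale input, and the names `ConvStage`, `STAGES`, `feature_map_shape`, and `to_sequence` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConvStage:
    """One stride-1 convolution followed by optional max pooling."""
    out_channels: int
    kernel: int = 3
    padding: int = 1
    pool: tuple = (1, 1)  # (height, width) pooling factors; (1, 1) means no pooling


# Hypothetical 7-layer layout (assumption: the abstract does not list these values).
STAGES = [
    ConvStage(64, pool=(2, 2)),
    ConvStage(128, pool=(2, 2)),
    ConvStage(256),
    ConvStage(256, pool=(2, 1)),   # pool height only, keep horizontal resolution
    ConvStage(512),
    ConvStage(512, pool=(2, 1)),
    ConvStage(512, kernel=2, padding=0),
]


def conv_out(size, kernel, padding):
    """Output length of a stride-1 convolution along one axis."""
    return size + 2 * padding - kernel + 1


def feature_map_shape(height, width, stages=STAGES):
    """Return (channels, height, width) after all convolution stages."""
    channels = 1  # assumed grayscale input
    for stage in stages:
        height = conv_out(height, stage.kernel, stage.padding) // stage.pool[0]
        width = conv_out(width, stage.kernel, stage.padding) // stage.pool[1]
        channels = stage.out_channels
    return channels, height, width


def to_sequence(channels, height, width):
    """Read the feature map column by column: one time step per column.

    Each step is a vector of channels * height features, which is the
    input the recurrent (LSTM) layers would consume.
    """
    return width, channels * height


if __name__ == "__main__":
    shape = feature_map_shape(32, 128)
    steps, features = to_sequence(*shape)
    print("feature map (C, H, W):", shape)
    print("LSTM input: %d steps of %d features" % (steps, features))
```

Under these assumed settings, a 32x128 input collapses to a feature map of height 1, so each remaining column becomes one time step for the two LSTM layers. Changing the entries in `STAGES` is a quick way to see how the number of layers and the filter sizes, two of the parameters the abstract says were analyzed, alter the sequence length and feature size.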
