Handwritten Text Extraction from Application Forms using Image Processing and Deep Learning

2023 3rd International Conference on Innovative Practices in Technology and Management (ICIPTM). Pub Date: 2023-02-22. DOI: 10.1109/ICIPTM57143.2023.10117678

Greeshmanth Penugonda, G. Anuradha, Jasti Lokesh Chowdary, Cheekurthi Abhinav, Kandimalla Naga Dinesh


Citations: 0

Abstract



Handwritten Text Extraction from Application Forms using Image Processing and Deep Learning

Around 400 million people worldwide speak English as their first language, and it is among the most widely used languages in the world. Many government offices and businesses rely on offline forms, most of which must be filled out in English. Digitizing those forms by hand is time-consuming and error-prone, and becomes impractical at scale, so extracting the text automatically can address the problem. Many organizations use forms with text boxes in which the information is entered, which makes text extraction from such forms a practical target. Because the task is essentially image recognition, a neural network is a natural choice: a deep learning model, namely a convolutional neural network (CNN), is used to classify the characters. Handwritten letters and digits were collected from Kaggle and MNIST datasets, and two CNN models were trained on them. Image processing techniques are used to preprocess the input image. Locating the form's coordinates in the image and applying a perspective transform removes unwanted areas of the input image. Horizontal and vertical lines are then detected, which yields the rectangular boxes on the form that contain the data. Depending on the type of the detected box, each character in the box is sent to the corresponding model, which identifies the character; the recognized characters together give the content of that field. All detected content is automatically saved to an Excel sheet. The proposed system achieves an accuracy of 85%.
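The box-finding step described in the abstract can be illustrated with a small sketch. The abstract does not say how the lines were detected, so the following is only a plain-Python illustration under stated assumptions, not the authors' implementation: the image is assumed to be already binarized and deskewed (1 = ink, 0 = background), lines are assumed to be one pixel thick, and the names `long_runs`, `find_boxes`, `crop_interior`, and `make_demo_form` are hypothetical helpers introduced here.

```python
from typing import List, Tuple

Image = List[List[int]]            # assumed binary image: 1 = ink, 0 = background
Box = Tuple[int, int, int, int]    # (top, left, bottom, right), inclusive


def long_runs(line: List[int], min_len: int) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) pairs of ink runs at least min_len long."""
    runs, start = [], None
    for i, v in enumerate(line + [0]):      # trailing 0 closes a final run
        if v and start is None:
            start = i
        elif not v and start is not None:
            if i - start >= min_len:
                runs.append((start, i - 1))
            start = None
    return runs


def find_boxes(img: Image, min_len: int) -> List[Box]:
    """Find cells bounded by consecutive long horizontal and vertical lines."""
    height, width = len(img), len(img[0])
    columns = [[img[r][c] for r in range(height)] for c in range(width)]
    line_rows = [r for r in range(height) if long_runs(img[r], min_len)]
    line_cols = [c for c in range(width) if long_runs(columns[c], min_len)]
    boxes = []
    for top, bottom in zip(line_rows, line_rows[1:]):
        for left, right in zip(line_cols, line_cols[1:]):
            if bottom - top < 2 or right - left < 2:
                continue                    # no interior between the lines
            closed = (
                all(img[top][c] and img[bottom][c] for c in range(left, right + 1))
                and all(img[r][left] and img[r][right] for r in range(top, bottom + 1))
            )
            if closed:                      # all four sides are inked
                boxes.append((top, left, bottom, right))
    return boxes


def crop_interior(img: Image, box: Box) -> Image:
    """Return the pixels strictly inside a box, i.e. the handwriting area."""
    top, left, bottom, right = box
    return [row[left + 1:right] for row in img[top + 1:bottom]]


def make_demo_form() -> Image:
    """Build a 6x11 toy form: two adjacent boxes, a short stroke in the first."""
    height, width = 6, 11
    img = [[0] * width for _ in range(height)]
    for c in range(width):
        img[0][c] = img[height - 1][c] = 1  # top and bottom border lines
    for r in range(height):
        for c in (0, 5, 10):
            img[r][c] = 1                   # left, shared, and right borders
    img[2][2] = img[3][2] = 1               # two pixels of "handwriting"
    return img


if __name__ == "__main__":
    form = make_demo_form()
    for box in find_boxes(form, min_len=5):
        print(box, crop_interior(form, box))
```

In this sketch, `min_len` separates form lines from handwriting strokes: any run of ink shorter than `min_len` is ignored when looking for lines. Each cropped interior would then be passed to whichever character classifier matches the box type; on real scans, thicker or slightly broken lines would need a more tolerant detector than this exact-match check.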
