Document Classification Using Lightweight Neural Network

Chung-Hsing Chen, Ko-Wei Huang
Journal: 網際網路技術學刊 (Journal of Internet Technology), volume (as listed): 17 2
Publication date: 2023-12-01
Publication type: Journal Article
DOI: 10.53106/160792642023122407012
URL: https://doi.org/10.53106/160792642023122407012

Abstract

In recent years, OCR data has been used to learn and analyze document classification. In addition, some neural networks have been trained on document images directly, using networks associated with the ImageNet Large Scale Visual Recognition Challenge, such as AlexNet, GoogLeNet, and MobileNet. Document image classification is an important step in data extraction processes and often requires significant computing power, so it is difficult to run on general-purpose computers without a graphics processing unit (GPU). Therefore, this study proposes a lightweight neural network application that can perform document image classification on general-purpose computers or Internet of Things (IoT) devices without a GPU. Plustek Inc. provided 3065 receipts belonging to 58 categories. Three datasets were held out as test samples, and the remaining data were used as training samples to train the network and obtain a classifier. In the experiments, the classifier achieved 98.26% accuracy, misclassifying only 3 of 174 test samples.
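The relationship between the reported error count and the accuracy figure can be checked with a short calculation. The sketch below is illustrative only: the function name `accuracy_percent` is hypothetical and does not come from the paper, and it assumes accuracy is the pooled ratio of correctly classified samples to all test samples. The abstract does not state how the figure was aggregated across the three test sets, so a small difference from the reported 98.26% may reflect a different averaging or rounding method.

```python
def accuracy_percent(total: int, errors: int) -> float:
    """Return classification accuracy as a percentage of correct samples."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= errors <= total:
        raise ValueError("errors must be between 0 and total")
    return 100.0 * (total - errors) / total


# Figures taken from the abstract: 174 test samples, 3 misclassified.
test_samples = 174
misclassified = 3

# Assumption: accuracy is pooled over all test samples (171 / 174).
pooled = accuracy_percent(test_samples, misclassified)
print(f"Pooled accuracy: {pooled:.2f}%")  # prints "Pooled accuracy: 98.28%"
```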