Document Classification Using Lightweight Neural Network

Journal of Internet Technology (網際網路技術學刊) Pub Date: 2023-12-01 DOI: 10.53106/160792642023122407012

Chung-Hsing Chen, Ko-Wei Huang


Abstract

In recent years, OCR data has been used to learn and analyze document classification. Other approaches train neural networks directly on document images, using architectures known from the ImageNet Large Scale Visual Recognition Challenge, such as AlexNet, GoogLeNet, and MobileNet. Document image classification is an important step in data extraction pipelines, but it often requires significant computing power and is difficult to run on general-purpose computers without a graphics processing unit (GPU). This study therefore proposes a lightweight neural network application that can classify document images on general-purpose computers or Internet of Things (IoT) devices without a GPU. Plustek Inc. provided 3065 receipts belonging to 58 categories. Three datasets were held out as test samples (174 samples in total), and the remaining receipts were used as training samples to train the network and obtain a classifier. In the experiments, the classifier achieved 98.26% accuracy, misclassifying only 3 of the 174 test samples.
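The abstract does not describe how the test samples were drawn. A test set of 174 samples over 58 categories is consistent with holding out three receipts per category, so the sketch below assumes that scheme. It is only an illustration in plain Python: the function name holdout_per_class, the seed, and the synthetic round-robin labels are hypothetical and do not come from the paper or the Plustek data.

```python
import random
from collections import defaultdict


def holdout_per_class(labels, k, seed=0):
    """Return (train_indices, test_indices) with k test samples per class.

    Assumption: a fixed number of samples per category is held out.
    The paper's actual split procedure is not stated in the abstract.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        if len(indices) <= k:
            raise ValueError(f"class {label!r} has too few samples for k={k}")
        rng.shuffle(indices)
        test.extend(indices[:k])    # first k shuffled samples go to the test set
        train.extend(indices[k:])   # everything else is used for training
    return train, test


# Synthetic labels only: 3065 items spread round-robin over 58 classes.
labels = [i % 58 for i in range(3065)]
train_idx, test_idx = holdout_per_class(labels, k=3)
print(len(train_idx), len(test_idx))  # 2891 174
```

Under this assumption the split leaves 2891 receipts for training and 174 for testing, which matches the test-set size reported in the abstract.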


