Exploring recursive neural networks for compact handwritten text recognition models

International Journal on Document Analysis and Recognition (IF 2.5; Zone 4, Computer Science; JCR Q3, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE). Pub Date: 2024-06-27. DOI: 10.1007/s10032-024-00481-y

Enrique Mas-Candela, Jorge Calvo-Zaragoza


Citations: 0

Abstract

This paper addresses the challenge of deploying recognition models in scenarios where memory size matters, such as low-cost devices or browser-based applications. We focus on developing memory-efficient approaches for Handwritten Text Recognition (HTR) by leveraging recursive networks. These networks reuse learned weights across successive layers, which maintains depth, a critical factor associated with model accuracy, without increasing the memory footprint. We apply neural recursion techniques to models typically used in HTR that contain convolutional and recurrent layers. We additionally study the impact of kernel scaling, which allows the activations of these recursive layers to be modified for greater expressiveness at little cost to memory. Our experiments on various HTR benchmarks demonstrate that recursive networks are a good alternative: they not only preserve accuracy but in some instances also enhance it. This establishes the utility of recursive networks in addressing memory constraints in HTR models and positions them as a promising solution for practical deployment in contexts where memory size is a critical consideration.
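The weight-reuse idea in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: it uses a small dense layer in pure Python instead of convolutional or recurrent layers, and it models kernel scaling as one learned scalar per recursion step, which is an assumption made only for illustration. The names `RecursiveStack`, `plain_stack_parameters`, `width`, and `depth` are hypothetical.

```python
import random


def matvec(weights, x):
    """Multiply a square weight matrix (a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


class RecursiveStack:
    """Applies ONE shared weight matrix `depth` times in a row.

    `scales` holds one scalar per step. This is an assumed, simplified
    stand-in for kernel scaling: scaling the shared kernel by s before
    the product equals scaling the product by s, so it is applied there.
    """

    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.kernel = [
            [rng.uniform(-0.5, 0.5) for _ in range(width)]
            for _ in range(width)
        ]
        self.scales = [1.0] * depth  # one extra parameter per step

    def forward(self, x):
        for s in self.scales:
            x = relu([s * v for v in matvec(self.kernel, x)])
        return x

    def num_parameters(self):
        width = len(self.kernel)
        return width * width + len(self.scales)


def plain_stack_parameters(width, depth):
    """Parameter count of a non-recursive stack: one kernel per layer."""
    return depth * width * width


stack = RecursiveStack(width=64, depth=4)
output = stack.forward([1.0] * 64)
print(len(output))                    # 64
print(stack.num_parameters())         # 4100 (64*64 shared + 4 scales)
print(plain_stack_parameters(64, 4))  # 16384 (4 separate 64*64 kernels)
```

In this toy setting the depth is the same in both cases (four layer applications), but the recursive stack stores roughly a quarter of the weights, and the per-step scales add only `depth` extra values.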




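Source journal

International Journal on Document Analysis and Recognition (Engineering & Technology: Computer Science, Artificial Intelligence)

CiteScore: 6.20

Self-citation rate: 4.30%

Number of articles published:

Review time: 7.5 months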

Journal description: The large number of existing documents and the production of a multitude of new ones every year raise important issues in the efficient handling, retrieval and storage of these documents and the information they contain. This has led to the emergence of new research domains dealing with the recognition by computers of the constituent elements of documents, including characters, symbols, text, lines, graphics, images, handwriting, signatures, etc. In addition, these new domains deal with automatic analyses of the overall physical and logical structures of documents, with the ultimate objective of a high-level understanding of their semantic content. We have also seen renewed interest in optical character recognition (OCR) and handwriting recognition during the last decade. Document analysis and recognition are obviously the next stage. Automatic, intelligent processing of documents is at the intersection of many fields of research, especially computer vision, image analysis, pattern recognition and artificial intelligence, as well as studies on reading, handwriting and linguistics. Although quality document-related publications continue to appear in journals dedicated to these domains, the community will benefit from having this journal as a focal point for archival literature dedicated to document analysis and recognition.