利用深度卷积和递归神经网络识别古代手稿中的手写文本并提取信息

IF 3.1 3区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Soft Computing Pub Date : 2024-09-13 DOI:10.1007/s00500-024-09930-6
Hassan El Bahi
{"title":"利用深度卷积和递归神经网络识别古代手稿中的手写文本并提取信息","authors":"Hassan El Bahi","doi":"10.1007/s00500-024-09930-6","DOIUrl":null,"url":null,"abstract":"<p>Digitizing ancient manuscripts and making them accessible to a broader audience is a crucial step in unlocking the wealth of information they hold. However, automatic recognition of handwritten text and the extraction of relevant information such as named entities from these manuscripts are among the most difficult research topics, due to several factors such as poor quality of manuscripts, complex background, presence of ink stains, cursive handwriting, etc. To meet these challenges, we propose two systems, the first system performs the task of handwritten text recognition (HTR) in ancient manuscripts; it starts with a preprocessing operation. Then, a convolutional neural network (CNN) is used to extract the features of each input image. Finally, a recurrent neural network (RNN) which has Long Short-Term Memory (LSTM) blocks with the Connectionist Temporal Classification (CTC) layer will predict the text contained in the image. The second system focuses on recognizing named entities and deciphering the relationships among words directly from images of old manuscripts, bypassing the need for an intermediate text transcription step. Like the previous system, this second system starts with a preprocessing step. Then the data augmentation technique is used to increase the training dataset. After that, the extraction of the most relevant features is done automatically using a CNN model. Finally, the recognition of names entities and the relationship between word images is performed using a bidirectional LSTM. Extensive experiments on the ESPOSALLES dataset demonstrate that the proposed systems achieve the state-of-the-art performance exceeding existing systems.</p>","PeriodicalId":22039,"journal":{"name":"Soft Computing","volume":null,"pages":null},"PeriodicalIF":3.1000,"publicationDate":"2024-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Handwritten text recognition and information extraction from ancient manuscripts using deep convolutional and recurrent neural network\",\"authors\":\"Hassan El Bahi\",\"doi\":\"10.1007/s00500-024-09930-6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Digitizing ancient manuscripts and making them accessible to a broader audience is a crucial step in unlocking the wealth of information they hold. However, automatic recognition of handwritten text and the extraction of relevant information such as named entities from these manuscripts are among the most difficult research topics, due to several factors such as poor quality of manuscripts, complex background, presence of ink stains, cursive handwriting, etc. To meet these challenges, we propose two systems, the first system performs the task of handwritten text recognition (HTR) in ancient manuscripts; it starts with a preprocessing operation. Then, a convolutional neural network (CNN) is used to extract the features of each input image. Finally, a recurrent neural network (RNN) which has Long Short-Term Memory (LSTM) blocks with the Connectionist Temporal Classification (CTC) layer will predict the text contained in the image. The second system focuses on recognizing named entities and deciphering the relationships among words directly from images of old manuscripts, bypassing the need for an intermediate text transcription step. Like the previous system, this second system starts with a preprocessing step. Then the data augmentation technique is used to increase the training dataset. After that, the extraction of the most relevant features is done automatically using a CNN model. Finally, the recognition of names entities and the relationship between word images is performed using a bidirectional LSTM. Extensive experiments on the ESPOSALLES dataset demonstrate that the proposed systems achieve the state-of-the-art performance exceeding existing systems.</p>\",\"PeriodicalId\":22039,\"journal\":{\"name\":\"Soft Computing\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.1000,\"publicationDate\":\"2024-09-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Soft Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s00500-024-09930-6\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Soft Computing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s00500-024-09930-6","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

将古代手稿数字化并使更多人能够获取,是发掘其所蕴含的丰富信息的关键一步。然而,由于手稿质量差、背景复杂、存在墨迹、草书笔迹等多种因素,手写文本的自动识别以及从这些手稿中提取命名实体等相关信息是最困难的研究课题之一。为了应对这些挑战,我们提出了两个系统,第一个系统执行古代手稿中的手写文本识别(HTR)任务;它首先进行预处理操作。然后,使用卷积神经网络(CNN)提取每张输入图像的特征。最后,具有长短期记忆(LSTM)块和联结时态分类(CTC)层的递归神经网络(RNN)将预测图像中包含的文本。第二个系统的重点是直接从旧手稿图像中识别命名实体并破译单词之间的关系,而无需中间的文本转录步骤。与前一个系统一样,第二个系统首先进行预处理。然后使用数据增强技术来增加训练数据集。之后,使用 CNN 模型自动提取最相关的特征。最后,使用双向 LSTM 对名称实体和词图像之间的关系进行识别。在 ESPOSALLES 数据集上进行的大量实验表明,所提出的系统达到了最先进的性能,超过了现有系统。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

摘要图片

查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Handwritten text recognition and information extraction from ancient manuscripts using deep convolutional and recurrent neural network

Digitizing ancient manuscripts and making them accessible to a broader audience is a crucial step in unlocking the wealth of information they hold. However, automatic recognition of handwritten text and the extraction of relevant information such as named entities from these manuscripts are among the most difficult research topics, due to several factors such as poor quality of manuscripts, complex background, presence of ink stains, cursive handwriting, etc. To meet these challenges, we propose two systems, the first system performs the task of handwritten text recognition (HTR) in ancient manuscripts; it starts with a preprocessing operation. Then, a convolutional neural network (CNN) is used to extract the features of each input image. Finally, a recurrent neural network (RNN) which has Long Short-Term Memory (LSTM) blocks with the Connectionist Temporal Classification (CTC) layer will predict the text contained in the image. The second system focuses on recognizing named entities and deciphering the relationships among words directly from images of old manuscripts, bypassing the need for an intermediate text transcription step. Like the previous system, this second system starts with a preprocessing step. Then the data augmentation technique is used to increase the training dataset. After that, the extraction of the most relevant features is done automatically using a CNN model. Finally, the recognition of names entities and the relationship between word images is performed using a bidirectional LSTM. Extensive experiments on the ESPOSALLES dataset demonstrate that the proposed systems achieve the state-of-the-art performance exceeding existing systems.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Soft Computing
Soft Computing 工程技术-计算机:跨学科应用
CiteScore
8.10
自引率
9.80%
发文量
927
审稿时长
7.3 months
期刊介绍: Soft Computing is dedicated to system solutions based on soft computing techniques. It provides rapid dissemination of important results in soft computing technologies, a fusion of research in evolutionary algorithms and genetic programming, neural science and neural net systems, fuzzy set theory and fuzzy systems, and chaos theory and chaotic systems. Soft Computing encourages the integration of soft computing techniques and tools into both everyday and advanced applications. By linking the ideas and techniques of soft computing with other disciplines, the journal serves as a unifying platform that fosters comparisons, extensions, and new applications. As a result, the journal is an international forum for all scientists and engineers engaged in research and development in this fast growing field.
期刊最新文献
Handwritten text recognition and information extraction from ancient manuscripts using deep convolutional and recurrent neural network Optimizing green solid transportation with carbon cap and trade: a multi-objective two-stage approach in a type-2 Pythagorean fuzzy context Production chain modeling based on learning flow stochastic petri nets Multi-population multi-strategy differential evolution algorithm with dynamic population size adjustment Dynamic parameter identification of modular robot manipulators based on hybrid optimization strategy: genetic algorithm and least squares method
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1