International Journal on Document Analysis and Recognition: Latest Articles

Training transformer architectures on few annotated data: an application to historical handwritten text recognition

IF 2.3 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal on Document Analysis and Recognition

Pub Date : 2024-01-25 DOI: 10.1007/s10032-023-00459-2

Killian Barrere, Yann Soullard, Aurélie Lemaitre, Bertrand Coüasnon

Transformer-based architectures show excellent results on the task of handwritten text recognition, becoming the standard architecture for modern datasets. However, they require a significant amount of annotated data to achieve competitive results. They typically rely on synthetic data to solve this problem. Historical handwritten text recognition represents a challenging task due to degradations, specific handwritings for which few examples are available and ancient languages that vary over time. These limitations also make it difficult to generate realistic synthetic data. Given sufficient and appropriate data, Transformer-based architectures could alleviate these concerns, thanks to their ability to have a global view of textual images and their language modeling capabilities. In this paper, we propose the use of a lightweight Transformer model to tackle the task of historical handwritten text recognition. To train the architecture, we introduce realistic looking synthetic data reproducing the style of historical handwritings. We present a specific strategy, both for training and prediction, to deal with historical documents, where only a limited amount of training data are available. We evaluate our approach on the ICFHR 2018 READ dataset which is dedicated to handwriting recognition in specific historical documents. The results show that our Transformer-based approach is able to outperform existing methods.

Citations: 0

Background grid extraction from historical hand-drawn cadastral maps

IF 2.3 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal on Document Analysis and Recognition

Pub Date : 2023-12-08 DOI: 10.1007/s10032-023-00457-4

Tauseef Iftikhar, Nazar Khan

We tackle a novel problem of detecting background grids in hand-drawn cadastral maps. Grid extraction is necessary for accessing and contextualizing the actual map content. The problem is challenging since the background grid is the bottommost map layer that is severely occluded by subsequent map layers. We present a novel automatic method for robust, bottom-up extraction of background grid structures in historical cadastral maps. The proposed algorithm extracts grid structures under significant occlusion, missing information, and noise by iteratively providing an increasingly refined estimate of the grid structure. The key idea is to exploit periodicity of background grid lines to corroborate the existence of each other. We also present an automatic scheme for determining the ‘gridness’ of any detected grid so that the proposed method self-evaluates its result as being good or poor without using ground truth. We present empirical evidence to show that the proposed gridness measure is a good indicator of quality. On a dataset of 268 historical cadastral maps with resolution 1424 × 2136 pixels, the proposed method detects grids in 247 images yielding an average root-mean-square error (RMSE) of 5.0 pixels and average intersection over union (IoU) of 0.990. On grids self-evaluated as being good, we report average RMSE of 4.39 pixels and average IoU of 0.991. To compare with the proposed bottom-up approach, we also develop three increasingly sophisticated top-down algorithms based on RANSAC-based model fitting. Experimental results show that our bottom-up algorithm yields better results than the top-down algorithms. We also demonstrate that using detected background grids for stitching different maps is visually better than both manual and SURF-based stitching.

Citations: 0

Genre as noise

IF 2.3 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal on Document Analysis and Recognition

Pub Date : 2007-12-01 DOI: 10.2307/j.ctv125jncf.8

Andrea Stubbe, Christoph Ringlstetter, Klaus U. Schulz

Given a specific information need, documents of the wrong genre can be considered as noise. From this perspective, genre classification helps to separate relevant documents from noise. Orthographic...

Citations: 0

Text line segmentation of historical documents: a survey

IF 2.3 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal on Document Analysis and Recognition

Pub Date : 2007-04-04 DOI: 10.5555/1237480.1237483

Laurence Likforman-Sulem, Abderrazak Zahour, Bruno Taconet

There is a huge amount of historical documents in libraries and in various National Archives that have not been exploited electronically. Although automatic reading of complete pages remains, in mo...

Citations: 8

Text line segmentation of historical documents

IF 2.3 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal on Document Analysis and Recognition

Pub Date : 2007-04-01 DOI: 10.5555/2722890.2723025

Laurence Likforman-Sulem, Abderrazak Zahour, Bruno Taconet

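There is a huge amount of historical documents in libraries and in various national archives that have not been exploited electronically. Although automatic reading of complete pages remains, in...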

Citations: 7

The recognition of handwritten numeral strings using a two-stage HMM-based method

IF 2.3 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal on Document Analysis and Recognition

Pub Date : 2003-04-01 DOI: 10.1007/s10032-002-0085-5

A. Britto, R. Sabourin, Flávio Bortolozzi

Citations: 50

Adaptive image-smoothing using a coplanar matrix and its application to document image binarization

IF 2.3 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal on Document Analysis and Recognition

Pub Date : 2003-04-01 DOI: 10.1007/s10032-002-0098-0

Lixin Fan, Liying Fan, C. Tan

Citations: 5

Special issue – selected papers from the ICDAR'01 conference

IF 2.3 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal on Document Analysis and Recognition

Pub Date : 2003-04-01 DOI: 10.1007/s10032-002-0093-5

A. Spitz, K. Tombre

Citations: 1

Extraction of special effects caption text events from digital video

IF 2.3 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal on Document Analysis and Recognition

Pub Date : 2003-04-01 DOI: 10.1007/s10032-002-0091-7

David J. Crandall, Sameer Kiran Antani, R. Kasturi

Citations: 94

Online image classification using IHDR

IF 2.3 | Zone 4 (Computer Science) | Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal on Document Analysis and Recognition

Pub Date : 2003-04-01 DOI: 10.1007/s10032-002-0086-4

J. Weng, Wey-Shiuan Hwang

Citations: 24
