改进异构手写表单识别的字段输入

2019 International Conference on Document Analysis and Recognition (ICDAR) Pub Date : 2019-09-01 DOI:10.1109/ICDAR.2019.00084

Ciprian Tomoiaga, Paul Feng, M. Salzmann, PA Jayet

{"title":"改进异构手写表单识别的字段输入","authors":"Ciprian Tomoiaga, Paul Feng, M. Salzmann, PA Jayet","doi":"10.1109/ICDAR.2019.00084","DOIUrl":null,"url":null,"abstract":"Offline handwriting recognition has undergone continuous progress over the past decades. However, existing methods are typically benchmarked on free-form text datasets that are biased towards good-quality images and handwriting styles, and homogeneous content. In this paper, we show that state-of-the-art algorithms, employing long short-term memory (LSTM) layers, do not readily generalize to real-world structured documents, such as forms, due to their highly heterogeneous and out-of-vocabulary content, and to the inherent ambiguities of this content. To address this, we propose to leverage the content type within an LSTM-based architecture. Furthermore, we introduce a procedure to generate synthetic data to train this architecture without requiring expensive manual annotations. We demonstrate the effectiveness of our approach at transcribing text on a challenging, real-world dataset of European Accident Statements.","PeriodicalId":325437,"journal":{"name":"2019 International Conference on Document Analysis and Recognition (ICDAR)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Field Typing for Improved Recognition on Heterogeneous Handwritten Forms\",\"authors\":\"Ciprian Tomoiaga, Paul Feng, M. Salzmann, PA Jayet\",\"doi\":\"10.1109/ICDAR.2019.00084\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Offline handwriting recognition has undergone continuous progress over the past decades. However, existing methods are typically benchmarked on free-form text datasets that are biased towards good-quality images and handwriting styles, and homogeneous content. In this paper, we show that state-of-the-art algorithms, employing long short-term memory (LSTM) layers, do not readily generalize to real-world structured documents, such as forms, due to their highly heterogeneous and out-of-vocabulary content, and to the inherent ambiguities of this content. To address this, we propose to leverage the content type within an LSTM-based architecture. Furthermore, we introduce a procedure to generate synthetic data to train this architecture without requiring expensive manual annotations. We demonstrate the effectiveness of our approach at transcribing text on a challenging, real-world dataset of European Accident Statements.\",\"PeriodicalId\":325437,\"journal\":{\"name\":\"2019 International Conference on Document Analysis and Recognition (ICDAR)\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on Document Analysis and Recognition (ICDAR)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICDAR.2019.00084\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on Document Analysis and Recognition (ICDAR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDAR.2019.00084","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 2

摘要

离线手写识别在过去的几十年里经历了不断的进步。然而，现有的方法通常是在自由格式的文本数据集上进行基准测试的，这些数据集倾向于高质量的图像和手写样式，以及同质的内容。在本文中，我们表明，采用长短期记忆(LSTM)层的最先进算法，由于其高度异构和词汇外的内容，以及这些内容固有的模糊性，不容易推广到现实世界的结构化文档，如表单。为了解决这个问题，我们建议在基于lstm的体系结构中利用内容类型。此外，我们引入了一个过程来生成合成数据来训练这个体系结构，而不需要昂贵的手动注释。我们展示了我们的方法在一个具有挑战性的、真实世界的欧洲事故声明数据集上转录文本的有效性。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Field Typing for Improved Recognition on Heterogeneous Handwritten Forms

Offline handwriting recognition has undergone continuous progress over the past decades. However, existing methods are typically benchmarked on free-form text datasets that are biased towards good-quality images and handwriting styles, and homogeneous content. In this paper, we show that state-of-the-art algorithms, employing long short-term memory (LSTM) layers, do not readily generalize to real-world structured documents, such as forms, due to their highly heterogeneous and out-of-vocabulary content, and to the inherent ambiguities of this content. To address this, we propose to leverage the content type within an LSTM-based architecture. Furthermore, we introduce a procedure to generate synthetic data to train this architecture without requiring expensive manual annotations. We demonstrate the effectiveness of our approach at transcribing text on a challenging, real-world dataset of European Accident Statements.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2019 International Conference on Document Analysis and Recognition (ICDAR)

自引率

0.00%

发文量