Reading handwritten US census forms

Proceedings of 3rd International Conference on Document Analysis and Recognition. Pub Date: 1995-08-14. DOI: 10.1109/ICDAR.1995.598949

S. Madhvanath, V. Govindaraju, V. Ramanaprasad, Dar-Shyang Lee, S. Srihari

Citations: 23

Abstract

Commercial forms-reading systems do not meet acceptable accuracy requirements on forms filled out by hand. In December 1993, NIST invited industry and research organizations working on handwriting recognition to participate in a test to determine the state of the art in the area. A database of form images containing actual responses received by the US Census Bureau was provided. The handwritten responses are very loosely constrained in terms of writing style, format of response, and choice of text. The lexicons provided are very large (about 50000 entries), yet their coverage is incomplete (about 70%). In this paper we discuss the approach taken by CEDAR to automate the task of reading the census forms. The subtasks of field extraction and phrase recognition are described.


