Extracting Handwritten Annotations from Printed Documents Via Infrared Scanning

Andreas Schmid, Lorenz Heckelbacher, Raphael Wimmer
CHI Conference on Human Factors in Computing Systems Extended Abstracts, published 2022-04-27
DOI: 10.1145/3491101.3519872 (https://doi.org/10.1145/3491101.3519872)

Abstract

Despite ever-improving digital ink and paper solutions, many people still prefer printing out documents for close reading, proofreading, or filling out forms. However, to incorporate paper-based annotations into digital workflows, handwritten text and markings need to be extracted. Common computer-vision and machine-learning approaches require extensive sets of training data or a clean digital version of the document. We propose a simple method for extracting handwritten annotations from laser-printed documents using multispectral imaging. While black toner absorbs infrared light, most inks are invisible in the infrared spectrum. We modified an off-the-shelf flatbed scanner by adding a switchable infrared LED to its light guide. By subtracting an infrared scan from a color scan, handwritten text and highlighting can be extracted and added to a PDF version. Initial experiments show accurate, high-quality results on a test data set of 93 annotated pages. Thus, infrared scanning seems like a promising building block for integrating paper-based and digital annotation practices.
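The subtraction step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes both scans are 8-bit, have the same resolution, and are already pixel-aligned, and the function names, the use of the darkest RGB channel as the visible-light signal, and the threshold value are all hypothetical choices made for illustration.

```python
import numpy as np


def extract_annotations(color_scan, ir_scan, threshold=40):
    """Return a boolean mask of pixels that look like ink or highlighter.

    color_scan: uint8 array (H, W, 3), RGB scan of the annotated page.
    ir_scan:    uint8 array (H, W), infrared scan of the same page,
                assumed to be pixel-aligned with color_scan.
    threshold:  hypothetical cut-off, not a value taken from the paper.
    """
    color = np.asarray(color_scan, dtype=np.int16)
    ir = np.asarray(ir_scan, dtype=np.int16)
    if color.ndim != 3 or color.shape[:2] != ir.shape:
        raise ValueError("scans must share the same height and width")
    # Darkest channel per pixel: low for toner, dark ink, and saturated
    # highlighter colors; high for blank paper (assumed heuristic).
    visible = color.min(axis=2)
    # Toner is dark in both scans, so the difference is close to zero.
    # Ink is dark in visible light but bright in infrared, so it stands out.
    difference = ir - visible
    return difference > threshold


def keep_annotations(color_scan, mask):
    """Copy annotation pixels onto a white page, dropping everything else."""
    color = np.asarray(color_scan)
    out = np.full_like(color, 255)
    out[mask] = color[mask]
    return out


# Tiny synthetic page: white paper with one toner, one ink, one highlighter pixel.
page = np.full((4, 4, 3), 255, dtype=np.uint8)
page[0, 0] = (0, 0, 0)        # black toner
page[1, 1] = (0, 0, 200)      # blue pen ink
page[2, 2] = (255, 255, 0)    # yellow highlighter

ir = np.full((4, 4), 255, dtype=np.uint8)
ir[0, 0] = 0                  # only the toner absorbs infrared

mask = extract_annotations(page, ir)
annotations_only = keep_annotations(page, mask)
```

In this toy example the toner pixel is dark in both scans and is therefore excluded, while the ink and highlighter pixels are selected. Real scans would likely also need registration between the two passes and some noise filtering, which this sketch leaves out.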