Text recognition for information retrieval in images of printed circuit boards

Wei Li, Stefan Neullens, Matthias Breier, Marcel Bosling, T. Pretz, D. Merhof
IECON 2014 - 40th Annual Conference of the IEEE Industrial Electronics Society. Published 2014-10-01.
DOI: 10.1109/IECON.2014.7049016 (https://doi.org/10.1109/IECON.2014.7049016)
Citations: 24

Abstract

Efficient and environmentally friendly recycling of printed circuit boards (PCBs) requires a comprehensive analysis of their material composition. Besides sophisticated chemical and physical methods for direct material analysis, an indirect method based on information retrieval provides a less costly and more efficient alternative. During information retrieval, PCBs and their components need to be recognized based on their appearance and the corresponding text information. Their material composition is then available through a pre-established database. Practical text recognition is therefore necessary for a successful data analysis prior to PCB recycling. This paper focuses on two key aspects of text recognition: binarization, and final recognition of text objects using optical character recognition (OCR) engines. For binarization of text contents, a novel local thresholding method using an adaptive window size along with background estimation is presented. Several state-of-the-art algorithms and the proposed method were evaluated to compare their binarization performance on text objects in PCB images. On a data set containing manually created references, the proposed method provides superior results. Furthermore, in contrast to previous work on text recognition, an additional evaluation of available open-source OCR engines was conducted to assess the technical limitations of OCR applications. We show that the quality of text recognition can be significantly improved if the binarization approach accounts for these technical limitations of OCR software. The presented method and results are expected to improve OCR performance in other applications as well.
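
The abstract does not spell out the proposed thresholding algorithm, so the following is only a generic illustration of what "local thresholding" means, not the authors' method. It is a minimal sketch that marks a pixel as text when it is darker than the mean of a fixed square window around it. The fixed `radius`, the `offset` margin, and the function names are assumptions chosen for illustration; the paper's adaptive window size and background estimation are not reproduced here.

```python
from typing import List

Image = List[List[int]]  # grayscale rows, values 0..255


def integral_image(img: Image) -> List[List[int]]:
    """Summed-area table with one extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def local_mean_binarize(img: Image, radius: int = 2, offset: int = 10) -> Image:
    """Return 1 where a pixel is darker than (local mean - offset), else 0.

    Hypothetical helper: fixed window, plain local mean. This is NOT the
    adaptive-window method described in the abstract.
    """
    h, w = len(img), len(img[0])
    sat = integral_image(img)
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        # Clip the window to the image borders.
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        for x in range(w):
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            total = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
            area = (y1 - y0) * (x1 - x0)
            # Multiply through by area so the comparison stays in integers.
            if img[y][x] * area < total - offset * area:
                out[y][x] = 1
    return out


if __name__ == "__main__":
    # Bright 5x5 background with a single dark "ink" pixel in the center.
    sample = [[200] * 5 for _ in range(5)]
    sample[2][2] = 50
    for row in local_mean_binarize(sample):
        print(row)
```

Because the threshold follows the local mean, such a scheme tolerates uneven illumination better than a single global threshold, which is the general motivation for local methods on PCB images.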