A joint study of deep learning-based methods for identity document image binarization and its influence on attribute recognition
R. Sánchez-Rivero, P.V. Bezmaternykh, A.V. Gayer, A. Morales-González, F. José Silva-Mata, K.B. Bulatov
Computer Optics, published 2023-08-01. DOI: https://doi.org/10.18287/2412-6179-co-1207
Abstract
Text recognition has benefited considerably from deep learning research, as have the preprocessing methods included in its workflow. Identity documents are critical in the field of document analysis and should be thoroughly researched in relation to this workflow. We propose to examine the link between deep learning-based binarization and recognition algorithms for this type of document on the MIDV-500 and MIDV-2020 datasets. We provide a series of experiments to illustrate the relation between the quality of the captured images and the binarization results, as well as the influence of the binarization output on final recognition performance. We show that deep learning-based binarization solutions are affected by the capture quality, which implies that they still need significant improvements. We also show that proper binarization results can improve the performance of many recognition methods. Our retrained U-Net-bin outperformed all other binarization methods, and the best result in recognition was obtained by Paddle Paddle OCR v2.
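The workflow described in the abstract (binarize an image, run recognition, compare the recognized attribute with the ground truth) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the fixed global threshold is a simple stand-in for the learned binarizers the paper studies (it is not U-Net-bin), the accuracy measure based on normalized edit distance is one common choice and not necessarily the one the authors used, and the function names `binarize`, `levenshtein`, and `field_accuracy` are hypothetical.

```python
from typing import List, Sequence


def binarize(gray: Sequence[Sequence[int]], threshold: int = 128) -> List[List[int]]:
    """Global-threshold stand-in for a binarizer (assumption, not a learned model).

    Pixels at or above the threshold become 255 (background), the rest 0 (ink).
    """
    return [[255 if px >= threshold else 0 for px in row] for row in gray]


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def field_accuracy(predicted: str, truth: str) -> float:
    """Return 1 minus the edit distance normalized by the longer string length."""
    if not predicted and not truth:
        return 1.0
    return 1.0 - levenshtein(predicted, truth) / max(len(predicted), len(truth))
```

To compare recognition with and without binarization, one would run the same recognizer on the original and on the binarized image and compute `field_accuracy` for each output against the annotated attribute value; a higher score on the binarized input would indicate that binarization helped for that field.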
About the journal:
The journal is intended for researchers and specialists active in the following research areas: Diffractive Optics; Information Optical Technology; Nanophotonics and Optics of Nanostructures; Image Analysis & Understanding; Information Coding & Security; Earth Remote Sensing Technologies; Hyperspectral Data Analysis; Numerical Methods for Optics and Image Processing; Intelligent Video Analysis. The journal "Computer Optics" has been published since 1987 and releases six issues per year.