C H Nachappa, N Shobha Rani, Peeta Basa Pati, M Gokulnath
{"title":"基于文档地图生成的严重扭曲相机捕获文档图像的自适应去翘曲。","authors":"C H Nachappa, N Shobha Rani, Peeta Basa Pati, M Gokulnath","doi":"10.1007/s10032-022-00425-4","DOIUrl":null,"url":null,"abstract":"<p><p>Automated dewarping of camera-captured handwritten documents is a challenging research problem in Computer Vision and Pattern Recognition. Most available systems assume the shape of the camera-captured image boundaries to be anywhere between trapezoidal and octahedral, with linear distortion in areas between the boundaries for dewarping. The majority of the state-of-the-art applications successfully dewarp the simple-to-medium range geometrical distortions with partial selection of control points by a user. The proposed work implements a fully automated technique for control point detection from simple-to-complex geometrical distortions in camera-captured document images. The input image is subject to preprocessing, corner point detection, document map generation, and rendering of the de-warped document image. The proposed algorithm has been tested on five different camera-captured document datasets (one internal and four external publicly available) consisting of 958 images. Both quantitative and qualitative evaluations have been performed to test the efficacy of the proposed system. On the quantitative front, an Intersection Over Union (IoU) score of 0.92, 0.88, and 0.80 for document map generation for low-, medium-, and high-complexity datasets, respectively. Additionally, accuracies of the recognized texts, obtained from a market leading OCR engine, are utilized for quantitative comparative analysis on document images before and after the proposed enhancement. Finally, the qualitative analysis visually establishes the system's reliability by demonstrating improved readability even for severely distorted image samples.</p>","PeriodicalId":50277,"journal":{"name":"International Journal on Document Analysis and Recognition","volume":null,"pages":null},"PeriodicalIF":1.8000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9838515/pdf/","citationCount":"2","resultStr":"{\"title\":\"Adaptive dewarping of severely warped camera-captured document images based on document map generation.\",\"authors\":\"C H Nachappa, N Shobha Rani, Peeta Basa Pati, M Gokulnath\",\"doi\":\"10.1007/s10032-022-00425-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Automated dewarping of camera-captured handwritten documents is a challenging research problem in Computer Vision and Pattern Recognition. Most available systems assume the shape of the camera-captured image boundaries to be anywhere between trapezoidal and octahedral, with linear distortion in areas between the boundaries for dewarping. The majority of the state-of-the-art applications successfully dewarp the simple-to-medium range geometrical distortions with partial selection of control points by a user. The proposed work implements a fully automated technique for control point detection from simple-to-complex geometrical distortions in camera-captured document images. The input image is subject to preprocessing, corner point detection, document map generation, and rendering of the de-warped document image. The proposed algorithm has been tested on five different camera-captured document datasets (one internal and four external publicly available) consisting of 958 images. Both quantitative and qualitative evaluations have been performed to test the efficacy of the proposed system. On the quantitative front, an Intersection Over Union (IoU) score of 0.92, 0.88, and 0.80 for document map generation for low-, medium-, and high-complexity datasets, respectively. Additionally, accuracies of the recognized texts, obtained from a market leading OCR engine, are utilized for quantitative comparative analysis on document images before and after the proposed enhancement. Finally, the qualitative analysis visually establishes the system's reliability by demonstrating improved readability even for severely distorted image samples.</p>\",\"PeriodicalId\":50277,\"journal\":{\"name\":\"International Journal on Document Analysis and Recognition\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":1.8000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9838515/pdf/\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal on Document Analysis and Recognition\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s10032-022-00425-4\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal on Document Analysis and Recognition","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10032-022-00425-4","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Adaptive dewarping of severely warped camera-captured document images based on document map generation.
Automated dewarping of camera-captured handwritten documents is a challenging research problem in Computer Vision and Pattern Recognition. Most available systems assume the shape of the camera-captured image boundaries to be anywhere between trapezoidal and octahedral, with linear distortion in areas between the boundaries for dewarping. The majority of the state-of-the-art applications successfully dewarp the simple-to-medium range geometrical distortions with partial selection of control points by a user. The proposed work implements a fully automated technique for control point detection from simple-to-complex geometrical distortions in camera-captured document images. The input image is subject to preprocessing, corner point detection, document map generation, and rendering of the de-warped document image. The proposed algorithm has been tested on five different camera-captured document datasets (one internal and four external publicly available) consisting of 958 images. Both quantitative and qualitative evaluations have been performed to test the efficacy of the proposed system. On the quantitative front, an Intersection Over Union (IoU) score of 0.92, 0.88, and 0.80 for document map generation for low-, medium-, and high-complexity datasets, respectively. Additionally, accuracies of the recognized texts, obtained from a market leading OCR engine, are utilized for quantitative comparative analysis on document images before and after the proposed enhancement. Finally, the qualitative analysis visually establishes the system's reliability by demonstrating improved readability even for severely distorted image samples.
期刊介绍:
The large number of existing documents and the production of a multitude of new ones every year raise important issues in efficient handling, retrieval and storage of these documents and the information which they contain. This has led to the emergence of new research domains dealing with the recognition by computers of the constituent elements of documents - including characters, symbols, text, lines, graphics, images, handwriting, signatures, etc. In addition, these new domains deal with automatic analyses of the overall physical and logical structures of documents, with the ultimate objective of a high-level understanding of their semantic content. We have also seen renewed interest in optical character recognition (OCR) and handwriting recognition during the last decade. Document analysis and recognition are obviously the next stage.
Automatic, intelligent processing of documents is at the intersections of many fields of research, especially of computer vision, image analysis, pattern recognition and artificial intelligence, as well as studies on reading, handwriting and linguistics. Although quality document related publications continue to appear in journals dedicated to these domains, the community will benefit from having this journal as a focal point for archival literature dedicated to document analysis and recognition.