Data Preprocessing for Learning, Analyzing and Detecting Scene Text Video based on Rotational Gradient
Authors: Manasa Devi Mortha, S. Maddala, V. Raju
Venue (as listed): Data
DOI: 10.1145/3460620.3460621 (https://doi.org/10.1145/3460620.3460621)
Published: 2021-04-05
Citation count: 0
Abstract
Challenging annotated video datasets are in high demand among researchers and the embedded-systems industry, who use them to build artificial intelligence systems that detect, localize and classify objects of interest for applications in pattern recognition and computer vision. Producing such annotated sets for these communities is therefore important. This paper focuses on text as annotated data in video for detection, localization, tracking and classification, with the aim of solving several optical character recognition (OCR) based problems. Text is essential for understanding the content of a video, and it underpins diverse applications in wide use today, such as video retrieval and search, driverless cars, industrial goods automation and geocoding. Hence, it is important to understand how to create, prepare and load datasets so that they are ready for a machine to learn from. First, we apply a bilateral filter, which smooths noise while preserving edge information. Then, we propose a rotational gradient approach to detect text under variable viewpoints. Finally, a combination of morphology and contours is applied to generate blobs with bounding boxes around the detected regions while eliminating quasi-text areas. Simulation results on the ICDAR Robust Reading Competition Text in Video 2013-15 datasets show a better detection rate than traditional techniques.
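The three stages named in the abstract (bilateral filtering, a gradient computed over several orientations, and morphology plus blob extraction) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the abstract does not define the rotational gradient, so `rotated_gradient` below is an assumed stand-in that takes the strongest directional derivative over a few fixed angles. All function names, the angle set, the threshold, and the toy frame are hypothetical choices made for illustration.

```python
import math

def bilateral_filter(img, radius=1, sigma_s=1.0, sigma_r=30.0):
    """Edge-preserving smoothing: weights decay with distance and intensity gap."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        ds = (dx * dx + dy * dy) / (2 * sigma_s ** 2)
                        dr = (img[yy][xx] - img[y][x]) ** 2 / (2 * sigma_r ** 2)
                        wgt = math.exp(-ds - dr)
                        num += wgt * img[yy][xx]
                        den += wgt
            out[y][x] = num / den
    return out

def rotated_gradient(img, angles_deg=(0, 45, 90, 135)):
    """ASSUMPTION: strongest directional derivative over a few orientations."""
    h, w = len(img), len(img[0])
    dirs = [(math.cos(math.radians(a)), math.sin(math.radians(a))) for a in angles_deg]
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (img[y][x + 1] - img[y][x - 1]) / 2.0  # central differences
            gy = (img[y + 1][x] - img[y - 1][x]) / 2.0
            out[y][x] = max(abs(gx * c + gy * s) for c, s in dirs)
    return out

def dilate(mask):
    """3x3 binary dilation, merging nearby edge pixels into one blob."""
    h, w = len(mask), len(mask[0])
    return [[any(mask[yy][xx]
                 for yy in range(max(0, y - 1), min(h, y + 2))
                 for xx in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)] for y in range(h)]

def bounding_boxes(mask, min_area=4):
    """Flood-fill 4-connected blobs; return (x_min, y_min, x_max, y_max) per blob."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            stack, pixels = [(y, x)], []
            seen[y][x] = True
            while stack:
                cy, cx = stack.pop()
                pixels.append((cy, cx))
                for ny, nx in ((cy + 1, cx), (cy - 1, cx), (cy, cx + 1), (cy, cx - 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(pixels) >= min_area:  # drop tiny blobs (a crude non-text filter)
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                boxes.append((min(xs), min(ys), max(xs), max(ys)))
    return boxes

def detect_regions(img, threshold=40.0):
    """Filter, compute oriented gradient, threshold, dilate, then box the blobs."""
    grad = rotated_gradient(bilateral_filter(img))
    mask = dilate([[g > threshold for g in row] for row in grad])
    return bounding_boxes(mask)

# Toy 10x14 frame: one bright rectangular region on a dark background.
frame = [[200.0 if 3 <= y <= 6 and 4 <= x <= 9 else 20.0 for x in range(14)]
         for y in range(10)]
boxes = detect_regions(frame)
```

On the toy frame the sketch returns a single box that is slightly larger than the bright region, because both the central-difference gradient and the dilation widen the edge band. On real video frames one would more plausibly rely on an image-processing library such as OpenCV (for example `cv2.bilateralFilter`, `cv2.morphologyEx` and `cv2.findContours`) rather than nested Python loops.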