Rodolfo Valiente, José C. Gutiérrez, M. T. Sadaike, G. Bressan
Automatic Text Recognition in Web Images
Proceedings of the 23rd Brazillian Symposium on Multimedia and the Web, published 2017-10-17
DOI: 10.1145/3126858.3131570 (https://doi.org/10.1145/3126858.3131570)
Citations: 2
Abstract
Web images play an important role in delivering multimedia content on the Web. The text embedded in web images carries semantic information related to the layout and content of the pages. Statistics show that there is a significant need to detect and recognize text in web images. This paper presents an architecture that efficiently integrates localization, extraction and recognition algorithms for text recognition in web images. For the recognition step, a procedure based on super-resolution and an iterative method is proposed to improve performance. The approach is implemented and evaluated using Matlab and cloud computing, making the system flexible, scalable and robust in detecting text in complex web images with different orientations, dimensions and colors. The results are competitive, in both precision and recognition rate, with other systems in the existing literature.
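The abstract does not give implementation details, so the following is only a minimal sketch of how a recognition step that combines upscaling with an iterative retry might look. It is written in Python rather than Matlab, and every name in it (`upscale_nearest`, `recognize_iteratively`, `toy_recognizer`, `min_confidence`, `max_factor`) is hypothetical. Nearest-neighbour enlargement stands in for a real super-resolution method, and the recognizer is assumed to be any function that returns a text string and a confidence score; none of this is claimed to be the authors' procedure.

```python
from typing import Callable, List, Tuple

Image = List[List[int]]  # grayscale pixels, row-major (assumed representation)
Recognizer = Callable[[Image], Tuple[str, float]]  # returns (text, confidence in [0, 1])


def upscale_nearest(image: Image, factor: int) -> Image:
    """Enlarge an image by an integer factor by repeating pixels.

    Placeholder only: a real system would use a super-resolution method here.
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    out: Image = []
    for row in image:
        # Repeat each pixel horizontally, then repeat the widened row vertically.
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def recognize_iteratively(
    region: Image,
    recognizer: Recognizer,
    min_confidence: float = 0.8,
    max_factor: int = 4,
) -> Tuple[str, float]:
    """Retry recognition on progressively larger versions of a text region.

    Stops as soon as the best confidence seen reaches min_confidence,
    or after max_factor attempts, and returns the best (text, confidence).
    """
    best_text, best_conf = recognizer(upscale_nearest(region, 1))
    for factor in range(2, max_factor + 1):
        if best_conf >= min_confidence:
            break
        text, conf = recognizer(upscale_nearest(region, factor))
        if conf > best_conf:
            best_text, best_conf = text, conf
    return best_text, best_conf


def toy_recognizer(image: Image) -> Tuple[str, float]:
    """Stand-in recognizer whose confidence grows with image height."""
    return ("WEB", min(1.0, len(image) / 8))


if __name__ == "__main__":
    region = [[0, 255], [255, 0]]  # a 2x2 toy "text region"
    print(recognize_iteratively(region, toy_recognizer))
```

In this sketch the loop stops at the first scale where the recognizer is confident enough, which keeps the cost low for regions that are already legible. A localization and extraction stage would be expected to supply `region`; those stages are outside the scope of the example.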