Subhankar Ghosh, P. Bora, Sanjib Das, B. Chaudhuri
Title: Development of an Assamese OCR using Bangla OCR
Venue: DAR '12
Publication date: 2012-12-16
DOI: 10.1145/2432553.2432566
URL: https://doi.org/10.1145/2432553.2432566
Citations: 10
Abstract
This paper describes the development of an OCR system for the Assamese language, built by modifying an existing OCR system for the Bangla language. The modification is feasible because the Assamese script is the same as the Bangla script except for a few characters. The OCR incorporates a two-stage recognizer that uses an SVM classifier, with no post-processing. A spell-checker is also implemented; it detects most errors and interactively recommends some corrections. The OCR was tested on about 1800 pages of good-quality printed documents and achieved an accuracy of about 97%.
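The abstract does not say how the spell-checker detects errors or ranks its recommendations. As a minimal sketch only, assuming a plain lexicon lookup for detection and string similarity for suggestions, the idea could look like the following. The function name `check_text`, the toy `LEXICON`, and the `cutoff` value are all hypothetical and are not taken from the paper.

```python
from difflib import get_close_matches

# Toy lexicon, for illustration only; a real system would load a large word list.
LEXICON = ["অসম", "ভাষা", "কিতাপ", "মানুহ"]


def check_text(words, lexicon, max_suggestions=3, cutoff=0.6):
    """Return {word: [suggestions]} for every word that is not in the lexicon.

    Detection is a simple membership test; suggestions are the lexicon
    entries most similar to the unknown word (assumed approach, not the
    method used in the paper).
    """
    known = set(lexicon)
    report = {}
    for word in words:
        if word not in known:
            report[word] = get_close_matches(
                word, lexicon, n=max_suggestions, cutoff=cutoff
            )
    return report


if __name__ == "__main__":
    # "ভাষ" stands in for an OCR output word with a dropped vowel sign.
    ocr_words = ["অসম", "ভাষ"]
    for word, suggestions in check_text(ocr_words, LEXICON).items():
        print(word, "->", suggestions)
```

In an interactive setting, the suggestion list for each flagged word would be shown to the user, who accepts one or keeps the original; words with an empty list are flagged without a recommendation.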