A systematic feature selection process for a Sinhala character recognition system
T. Kumara, R. Ragel
2016 IEEE International Conference on Information and Automation for Sustainability (ICIAfS), 2016-12-01
DOI: 10.1109/ICIAFS.2016.7946523
Citations: 4
Abstract
Optical Character Recognition (OCR) is a well-researched topic, and feature selection plays a vital role in a functional OCR system. The right feature selection process makes an OCR system faster, more accurate, and more complete. The Sinhala language lacks a complete OCR system. In this paper, we introduce a quantifiable and systematic feature selection process for OCR systems. Using this process, we show that the feature set that usually works well for English characters does not work for Sinhala letters. Further, we examine and compare some existing features from the literature and introduce new features that work well for Sinhala letters. We argue that the features we have identified and introduced will help researchers build the best and most complete OCR system for Sinhala.