A novel SIFT-based codebook generation for handwritten Tamil character recognition
A. Subashini, N. Kodikara
2011 6th International Conference on Industrial and Information Systems, published 2011-10-10
https://doi.org/10.1109/ICIINFS.2011.6038077
A method for the off-line recognition of handwritten Tamil characters based on local feature extraction is investigated. In the proposed method, each pre-processed character is represented by a set of local SIFT feature vectors. The key idea is to create a codebook for each character from a large set of SIFT descriptors using the K-means clustering algorithm. K-means is an optimisation algorithm, but it takes a very long time to converge. We therefore construct an initial codebook using the Linde, Buzo and Gray (LBG) algorithm, so that the convergence time of K-means is reduced considerably. The target character is classified into one of twenty categories by k-nearest neighbour classification. An average recognition rate of 87% at the character level was achieved in experiments using six thousand training and two thousand testing images of twenty selected characters. Further study may include more characters, more samples, and a better classifier.
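The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of seeding K-means with an LBG codebook, not the authors' code. It assumes the textbook LBG splitting scheme (start from the global centroid, split every codeword into two perturbed copies, refine, repeat) and uses small synthetic 2-D points in place of real 128-dimensional SIFT descriptors. The function names, the perturbation `eps`, the number of refinement steps per split `split_iter`, and the toy data are all illustrative assumptions.

```python
import random


def sqdist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def lloyd(data, codebook, max_iter=100, tol=1e-9):
    """K-means (Lloyd) iterations starting from a given codebook.

    Returns the refined codebook and the number of iterations used.
    """
    codebook = [list(c) for c in codebook]
    for it in range(1, max_iter + 1):
        # Assignment step: put each vector in the cell of its nearest codeword.
        cells = [[] for _ in codebook]
        for v in data:
            j = min(range(len(codebook)), key=lambda i: sqdist(v, codebook[i]))
            cells[j].append(v)
        # Update step: move each codeword to its cell centroid (keep it if the cell is empty).
        updated = [centroid(cell) if cell else c for cell, c in zip(cells, codebook)]
        shift = max(sqdist(a, b) for a, b in zip(codebook, updated))
        codebook = updated
        if shift <= tol:
            return codebook, it
    return codebook, max_iter


def lbg_initial_codebook(data, size, eps=0.01, split_iter=5):
    """Grow a codebook by LBG splitting; `size` is assumed to be a power of two."""
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of two")
    codebook = [centroid(data)]
    while len(codebook) < size:
        # Split every codeword into two copies shifted by +eps and -eps.
        codebook = [[x + s * eps for x in c] for c in codebook for s in (1, -1)]
        # A few Lloyd steps pull the split codewords apart.
        codebook, _ = lloyd(data, codebook, max_iter=split_iter)
    return codebook


# Toy stand-in for SIFT descriptors: four well-separated 2-D clusters (assumed data).
rng = random.Random(0)
centres = [(0.0, 0.0), (4.0, 1.0), (9.0, 3.0), (15.0, 2.0)]
data = [[rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)] for cx, cy in centres for _ in range(40)]

initial = lbg_initial_codebook(data, size=4)
codebook, iters = lloyd(data, initial)  # K-means refinement from the LBG seed
print(iters, sorted(codebook))
```

Because the LBG stage already runs a few Lloyd steps after each split, the final K-means pass starts close to a local optimum, which is the intuition behind the reduced convergence time described in the abstract. In a real pipeline the codebook size, distance measure, and per-character training sets would have to follow the paper itself.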