A Self Organizing Map based Urdu Nasakh character recognition
Authors: S. A. Hussain, S. Zaman, M. Ayub
Published in: 2009 International Conference on Emerging Technologies, 2009-12-11
DOI: 10.1109/ICET.2009.5353161 (https://doi.org/10.1109/ICET.2009.5353161)
Character recognition for Urdu script is challenging mainly because of the script's characteristics: its cursive nature, multiple fonts, context-dependent character shapes, and the position of characters relative to the baseline. This paper addresses the recognition of the Nasakh script of the Urdu language. The proposed system takes segmented characters as input and recognizes them in two steps. In the first step, the different shapes of each character are classified into 33 categories using a Kohonen Self-Organizing Map (SOM), which automatically clusters similar ligatures for initial classification. During the feature extraction phase, more than twenty-five different features are extracted from each character; these are further processed for final character recognition.
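The initial SOM classification described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the map topology, the training schedule, or the feature definitions. The sketch assumes a one-dimensional map with 33 units (one per category), 25-dimensional feature vectors filled with random stand-in values, and illustrative hyperparameters. The names `train_som` and `best_matching_unit` are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the index of the unit whose weight vector is closest to x."""
    def sq_dist(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda i: sq_dist(weights[i]))


def train_som(data, n_units=33, epochs=10, lr0=0.5, seed=0):
    """Train a 1-D Kohonen map. All hyperparameters are illustrative assumptions."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initial weights in [0, 1), one vector per map unit.
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    sigma0 = n_units / 2.0
    total_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for idx in order:
            x = data[idx]
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood width
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighborhood centered on the best matching unit.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return weights


# Hypothetical stand-in data: 100 random 25-dimensional feature vectors.
data_rng = random.Random(1)
features = [[data_rng.random() for _ in range(25)] for _ in range(100)]

som = train_som(features)
# Initial category of each character shape = index of its best matching unit.
categories = [best_matching_unit(som, x) for x in features]
```

Each input is assigned the index of its nearest map unit, so similar shapes tend to fall into the same or neighboring categories. A real system would replace the random vectors with features extracted from segmented character images.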