A computational framework for real-time detection and recognition of large number of classes
Li Tao, V. Asari
33rd Applied Imagery Pattern Recognition Workshop (AIPR'04), 2004-10-13
DOI: https://doi.org/10.1109/AIPR.2004.1
Citations: 2
Abstract
Inspired by recent advances in real-time vision for certain applications, we propose a framework for developing and implementing systems capable of detecting and recognizing a large number of objects in real time on a high-end desktop workstation equipped with field-programmable gate array (FPGA) devices. To avoid explicit segmentation, detection and recognition are performed by scanning local windows of the input scene at multiple scales. This is achieved with a new feature family, topological local spectral histogram (ToLoSH) features, which consist of histograms of local regions of filtered images, and with a lookup table decision tree (i.e., a decision tree in which each node is implemented as a lookup table) as the classifier; together they reduce the average time per local window while achieving high accuracy. We show through analysis and empirical studies that ToLoSH features discriminate effectively among a large number of object classes and can be computed using only three instructions. Given the choice of the ToLoSH feature family and lookup table decision tree classifiers, real-time scene interpretation becomes a joint optimization problem: learning an optimal classifier together with its associated optimal ToLoSH features. To show the feasibility of the proposed framework, we have constructed a lookup table decision tree for a dataset consisting of textures, faces, and objects. We argue that the proposed framework may reconcile some of the fundamental issues in visual recognition modeling.
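The window-scanning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, purely for illustration, that local histograms are read from integral histograms (so each bin count costs three additions or subtractions, which may or may not be what "three instructions" refers to) and that each tree node indexes its lookup table by a quantized bin fraction. All names (`LutNode`, `bin_count`, `scan`, the class labels, the toy image) are hypothetical.

```python
from typing import List, Tuple, Union

Image = List[List[int]]  # quantized filter responses, values in [0, n_bins)
LEVELS = 5               # assumed number of entries in every node's lookup table


def integral_histograms(img: Image, n_bins: int) -> List[List[List[int]]]:
    """ih[b][y][x] = number of pixels equal to bin b in rows < y, columns < x."""
    h, w = len(img), len(img[0])
    ih = [[[0] * (w + 1) for _ in range(h + 1)] for _ in range(n_bins)]
    for b in range(n_bins):
        for y in range(h):
            for x in range(w):
                hit = 1 if img[y][x] == b else 0
                ih[b][y + 1][x + 1] = (ih[b][y][x + 1] + ih[b][y + 1][x]
                                       - ih[b][y][x] + hit)
    return ih


def bin_count(ih, b: int, top: int, left: int, size: int) -> int:
    """Count of bin b inside a square window: four lookups, three add/subtracts."""
    t = ih[b]
    bottom, right = top + size, left + size
    return t[bottom][right] - t[top][right] - t[bottom][left] + t[top][left]


class LutNode:
    """Tree node: one histogram bin plus a table mapping its level to a child or label."""

    def __init__(self, bin_index: int, table: List[Union["LutNode", str]]):
        assert len(table) == LEVELS
        self.bin_index = bin_index
        self.table = table


def classify(node: Union[LutNode, str], ih, top: int, left: int, size: int) -> str:
    """Walk the tree; each step is one bin count and one table lookup."""
    while isinstance(node, LutNode):
        count = bin_count(ih, node.bin_index, top, left, size)
        level = count * (LEVELS - 1) // (size * size)  # scale-independent index
        node = node.table[level]
    return node


def scan(root: LutNode, ih, height: int, width: int,
         sizes: Tuple[int, ...], stride: int) -> List[Tuple[int, int, int, str]]:
    """Classify every local window at every scale, with no explicit segmentation."""
    out = []
    for size in sizes:
        for top in range(0, height - size + 1, stride):
            for left in range(0, width - size + 1, stride):
                out.append((top, left, size, classify(root, ih, top, left, size)))
    return out


# Toy 4x4 "filtered image" with two bins (0 = dark response, 1 = bright response).
img = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
ih = integral_histograms(img, n_bins=2)

# Hand-built two-level tree; a real system would learn the tables from data.
child = LutNode(0, ["bright", "mostly-bright", "balanced", "mostly-dark", "dark"])
root = LutNode(1, ["dark", child, child, child, "bright"])

results = scan(root, ih, height=4, width=4, sizes=(2, 4), stride=2)
for top, left, size, label in results:
    print(f"window at ({top},{left}) size {size}: {label}")
```

In this sketch the per-window cost is bounded by the tree depth, since each node needs only one histogram read and one table lookup. That is the kind of property that would keep the average time per local window low, although the abstract does not give the actual node design.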