A computational framework for real-time detection and recognition of large number of classes

33rd Applied Imagery Pattern Recognition Workshop (AIPR'04). Pub Date: 2004-10-13. DOI: 10.1109/AIPR.2004.1

Li Tao, V. Asari


Citations: 2

Abstract

Inspired by recent advances in real-time vision for certain applications, we propose a framework for developing and implementing systems that can detect and recognize a large number of objects in real time on a high-end desktop workstation equipped with field-programmable gate array (FPGA) devices. To avoid explicit segmentation, detection and recognition are performed by scanning local windows of the input scene at multiple scales. To reduce the average time per local window while maintaining high accuracy, we use a new feature family, topological local spectral histogram (ToLoSH) features, which consist of histograms of local regions of filtered images, together with a lookup table decision tree classifier (i.e., a decision tree in which each node is implemented as a lookup table). We show through analysis and empirical studies that ToLoSH features are effective at discriminating a large number of object classes and can be computed using only three instructions. Given the choice of the ToLoSH feature family and lookup table decision tree classifiers, real-time scene interpretation becomes a joint optimization problem: learning an optimal classifier together with its associated optimal ToLoSH features. To show the feasibility of the proposed framework, we have constructed a lookup table decision tree for a dataset consisting of textures, faces, and objects. We argue that the proposed framework may reconcile some of the fundamental issues in visual recognition modeling.
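The pipeline described in the abstract (filter the image, take histograms of local regions, classify each scanned window with a tree of lookup tables) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the filters, the bin layout, or how the three-instruction computation is achieved, so the following are assumptions made purely for illustration: the horizontal-difference filter, the per-bin integral image (which yields one bin count per window from three additions/subtractions), the area-normalized table index, and all names (`filter_and_quantize`, `integral_of_bin`, `bin_count`, `LutNode`, `classify`, `scan`).

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

Image = List[List[int]]  # grayscale image, values in 0..255


def filter_and_quantize(img: Image, n_bins: int) -> Image:
    """Assumed filter: horizontal difference, quantized into n_bins bins."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            left = img[y][x - 1] if x > 0 else img[y][x]
            diff = img[y][x] - left                    # in [-255, 255]
            out[y][x] = (diff + 255) * n_bins // 511   # in [0, n_bins - 1]
    return out


def integral_of_bin(q: Image, b: int) -> Image:
    """Integral image counting the pixels of q that fall into bin b."""
    h, w = len(q), len(q[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += 1 if q[y][x] == b else 0
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def bin_count(ii: Image, top: int, left: int, size: int) -> int:
    """Histogram bin value of a square window: four reads, three add/subs."""
    bottom, right = top + size, left + size
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


@dataclass
class LutNode:
    bin_index: int                # which histogram bin this node inspects
    table: List[Union[int, str]]  # entry: child node index (int) or label (str)


def classify(nodes: List[LutNode], integrals: List[Image],
             top: int, left: int, size: int) -> str:
    """Walk the tree; each node is a single table lookup on one bin count."""
    node = nodes[0]
    while True:
        count = bin_count(integrals[node.bin_index], top, left, size)
        # Normalize by window area so the same table serves every scale.
        entry = node.table[count * len(node.table) // (size * size + 1)]
        if isinstance(entry, str):
            return entry
        node = nodes[entry]


def scan(img: Image, nodes: List[LutNode], n_bins: int,
         sizes: List[int], stride: int) -> List[Tuple[int, int, int, str]]:
    """Scan square windows at several sizes; no explicit segmentation."""
    q = filter_and_quantize(img, n_bins)
    integrals = [integral_of_bin(q, b) for b in range(n_bins)]
    h, w = len(img), len(img[0])
    hits = []
    for size in sizes:
        for top in range(0, h - size + 1, stride):
            for left in range(0, w - size + 1, stride):
                label = classify(nodes, integrals, top, left, size)
                if label != "background":
                    hits.append((top, left, size, label))
    return hits
```

In this sketch the per-window cost is independent of window size, because the integral images are built once per scene and each tree node then costs one bin count plus one table lookup. The tree here is written by hand; in the framework described above, the tree and the features it inspects would be learned jointly.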


