A computational framework for real-time detection and recognition of large number of classes

Li Tao, V. Asari

33rd Applied Imagery Pattern Recognition Workshop (AIPR'04), published 2004-10-13
DOI: 10.1109/AIPR.2004.1 (https://doi.org/10.1109/AIPR.2004.1)
Citations: 2

Abstract

Inspired by recent advances in real-time vision for specific applications, we propose a framework for developing and implementing systems capable of detecting and recognizing a large number of objects in real time on a high-end desktop workstation equipped with field-programmable gate array (FPGA) devices. To avoid explicit segmentation, detection and recognition are performed by scanning local windows of the input scene at multiple scales. This is achieved with a new feature family, topological local spectral histogram (ToLoSH) features, which consist of histograms of local regions of filtered images, together with a lookup table decision tree (i.e., a decision tree in which each node is implemented as a lookup table) as the classifier, which reduces the average time per local window while achieving high accuracy. We show through analysis and empirical studies that ToLoSH features discriminate effectively among a large number of object classes and can be computed using only three instructions. Given the choice of the ToLoSH feature family and lookup table decision tree classifiers, real-time scene interpretation becomes a joint optimization problem: learning an optimal classifier together with its associated optimal ToLoSH features. To show the feasibility of the proposed framework, we have constructed a lookup table decision tree for a dataset consisting of textures, faces, and objects. We argue that the proposed framework may reconcile some of the fundamental issues in visual recognition modeling.
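The pipeline described in the abstract (filter the image, take histograms of local windows at several scales, and classify each histogram with a tree whose nodes are lookup tables) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the filters, the topological arrangement of regions, the three-instruction computation, or how the tables are learned. The filter `horizontal_gradient`, the 4-bin histogram, the `LutNode` class, and the hand-written table below are all assumptions chosen only to make the idea concrete.

```python
from typing import List, Sequence, Tuple, Union

Image = List[List[int]]


def horizontal_gradient(image: Image) -> Image:
    """Hypothetical filter: absolute difference of horizontal neighbours."""
    return [[abs(row[x + 1] - row[x]) for x in range(len(row) - 1)]
            for row in image]


def local_histogram(filtered: Image, top: int, left: int, size: int,
                    bins: int = 4, max_value: int = 256) -> List[int]:
    """Histogram of filter responses inside one size x size window."""
    hist = [0] * bins
    for y in range(top, top + size):
        for x in range(left, left + size):
            hist[min(filtered[y][x] * bins // max_value, bins - 1)] += 1
    return hist


class LutNode:
    """Decision node implemented as a lookup table (illustrative only).

    The share of window pixels falling in one histogram bin is quantized
    into an index; the table entry is a class label or a child node.
    """

    def __init__(self, bin_index: int,
                 table: Sequence[Union[str, "LutNode"]]) -> None:
        self.bin_index = bin_index
        self.table = table

    def classify(self, hist: Sequence[int]) -> str:
        # Normalizing by the window area keeps the key scale independent.
        key = min(hist[self.bin_index] * len(self.table) // sum(hist),
                  len(self.table) - 1)
        entry = self.table[key]
        return entry.classify(hist) if isinstance(entry, LutNode) else entry


def scan(image: Image, tree: LutNode, sizes: Sequence[int] = (4,),
         step: int = 2) -> List[Tuple[int, int, int, str]]:
    """Classify every local window at every requested window size."""
    filtered = horizontal_gradient(image)
    height, width = len(filtered), len(filtered[0])
    results = []
    for size in sizes:
        for top in range(0, height - size + 1, step):
            for left in range(0, width - size + 1, step):
                hist = local_histogram(filtered, top, left, size)
                results.append((top, left, size, tree.classify(hist)))
    return results


# Hand-written table (an assumption): windows dominated by near-zero
# filter responses are labeled "flat", all others "textured".
tree = LutNode(0, ["textured", "flat"])
flat = [[7] * 5 for _ in range(4)]
textured = [[0, 255, 0, 255, 0] for _ in range(4)]
print(scan(flat, tree, sizes=(4,)))      # one window, labeled "flat"
print(scan(textured, tree, sizes=(4,)))  # one window, labeled "textured"
```

In this sketch each node costs one table lookup per window, which is the property the abstract relies on to keep the average time per local window low. In the framework itself, the tables and the choice of features would be learned jointly rather than written by hand.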