A Scalable 3D HOG Model for Fast Object Detection and Viewpoint Estimation

M. Pedersoli, T. Tuytelaars
Published in: 2014 2nd International Conference on 3D Vision, 2014-12-08. DOI: 10.1109/3DV.2014.82 (https://doi.org/10.1109/3DV.2014.82)

Abstract

In this paper we present a scalable way to learn and detect objects using a 3D representation based on HOG patches placed on a 3D cuboid. The model consists of a single 3D representation that is shared among views. As in the work of Fidler et al. [5], at detection time this representation is projected onto the image plane for each desired viewpoint. However, whereas in [5] the projection is done at image level, so that the computational cost is linear in the number of views, in our model every view is approximated at feature level as a linear combination of the pre-computed fronto-parallel views. As a result, once the fronto-parallel views have been computed, the cost of computing new views is almost negligible, which allows the model to be evaluated on many more viewpoints. In the experimental results we show that the proposed model achieves detection and pose estimation performance comparable to standard multiview HOG detectors, while being faster, scaling very well with the number of views, and generalizing better to unseen views. Finally, we show that, with a procedure similar to label propagation, the model can be trained even without pose annotations.
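
The abstract's claim that new views are nearly free once the fronto-parallel views are computed rests on the linearity of filter responses. The following is a minimal sketch of that property only, not the authors' implementation: it assumes a single 2D feature channel in place of HOG features, random stand-in filters, and fixed scalar weights, whereas the abstract does not say how the paper derives the combination for a given viewpoint. All names (`correlate_valid`, `base_filters`, `view_response`, `weights`) are hypothetical.

```python
import numpy as np

def correlate_valid(feat, filt):
    """Plain 'valid' cross-correlation of a 2D feature map with a 2D filter."""
    fh, fw = filt.shape
    out_h = feat.shape[0] - fh + 1
    out_w = feat.shape[1] - fw + 1
    out = np.empty((out_h, out_w))
    for y in range(out_h):
        for x in range(out_w):
            out[y, x] = np.sum(feat[y:y + fh, x:x + fw] * filt)
    return out

rng = np.random.default_rng(0)
feat = rng.standard_normal((20, 24))           # stand-in for one feature channel
base_filters = rng.standard_normal((3, 5, 5))  # stand-ins for fronto-parallel filters

# Expensive step, done once per image: one response map per base filter.
base_responses = np.stack([correlate_valid(feat, f) for f in base_filters])

def view_response(weights):
    """Cheap step, done per view: weighted sum of the cached response maps."""
    return np.tensordot(weights, base_responses, axes=1)

# Hypothetical weights for one view (assumed, not taken from the paper).
weights = np.array([0.7, 0.2, 0.1])
cheap = view_response(weights)

# Reference: combine the filters first, then run the full correlation.
direct = correlate_valid(feat, np.tensordot(weights, base_filters, axes=1))
max_abs_diff = float(np.max(np.abs(cheap - direct)))
```

Under these assumptions, `cheap` and `direct` agree up to floating-point error, while `view_response` costs only a weighted sum of cached maps per additional view instead of a new correlation over the image.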