
2009 IEEE Conference on Computer Vision and Pattern Recognition: Latest Publications

Abnormal crowd behavior detection using social force model
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206641
Ramin Mehran, Alexis Oyama, M. Shah
In this paper we introduce a novel method to detect and localize abnormal behaviors in crowd videos using the Social Force model. For this purpose, a grid of particles is placed over the image and advected with the space-time average of the optical flow. By treating the moving particles as individuals, their interaction forces are estimated using the social force model. The interaction force is then mapped into the image plane to obtain a Force Flow for every pixel in every frame. Randomly selected spatio-temporal volumes of Force Flow are used to model the normal behavior of the crowd. We classify frames as normal or abnormal using a bag-of-words approach. The regions of anomalies in abnormal frames are localized using the interaction forces. The experiments are conducted on a publicly available dataset from the University of Minnesota for escape-panic scenarios and on a challenging dataset of crowd videos taken from the web. The experiments show that the proposed method successfully captures the dynamics of crowd behavior. In addition, we show that the social force approach outperforms similar approaches based on pure optical flow.
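A minimal NumPy sketch of the force estimate this abstract describes, assuming the social force relation F_int = dv/dt - (v_des - v)/tau with the space-time averaged flow standing in for the desired velocity; the function name, the value of tau, and the particle bookkeeping are illustrative, not the authors' implementation:

    import numpy as np

    def interaction_force(flow, flow_avg, prev_vel, tau=0.5, dt=1.0):
        # flow, flow_avg, prev_vel: (N, 2) arrays of (vx, vy) sampled at the
        # advected particle positions. The instantaneous flow plays the role
        # of the actual velocity; the averaged flow the desired velocity.
        dv_dt = (flow - prev_vel) / dt        # observed particle acceleration
        relax = (flow_avg - flow) / tau       # relaxation toward desired velocity
        return dv_dt - relax                  # interaction (social) force

Mapping the force magnitudes back to pixels over time yields the Force Flow volumes that the bag-of-words classifier consumes.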
Citations: 1650
Extraction of tubular structures over an orientation domain
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206782
M. Pechaud, R. Keriven, G. Peyré
This paper presents a new method to extract tubular structures from two-dimensional images. The core of the proposed algorithm is the computation of geodesic curves over a four-dimensional space that includes local orientation and scale. These shortest paths closely follow the centerline of tubular structures, provide an estimate of the radius, and deal robustly with crossings over the image plane. Numerical experiments on a database of synthetic and natural images show the superiority of the proposed approach over several methods based on shortest-path extraction.
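To make the four-dimensional search space concrete, here is a toy discrete analogue of the geodesic computation, assuming a strictly positive cost volume indexed as (y, x, orientation, scale): Dijkstra's algorithm on the grid graph, with the orientation axis wrapping around. The paper solves the continuous problem; this sketch only illustrates the domain.

    import heapq
    import numpy as np

    def geodesic_4d(cost, start, end):
        # cost: 4-D array over (y, x, orientation, scale), all entries > 0.
        # start, end: 4-tuples of indices. Returns the minimal-cost path.
        dist = np.full(cost.shape, np.inf)
        dist[start] = cost[start]
        parent = {}
        heap = [(cost[start], start)]
        while heap:
            d, node = heapq.heappop(heap)
            if node == end:
                break
            if d > dist[node]:
                continue                       # stale heap entry
            y, x, o, s = node
            for dy, dx, do, ds in ((1, 0, 0, 0), (-1, 0, 0, 0),
                                   (0, 1, 0, 0), (0, -1, 0, 0),
                                   (0, 0, 1, 0), (0, 0, -1, 0),
                                   (0, 0, 0, 1), (0, 0, 0, -1)):
                ny, nx, ns = y + dy, x + dx, s + ds
                no = (o + do) % cost.shape[2]  # orientation is periodic
                if (0 <= ny < cost.shape[0] and 0 <= nx < cost.shape[1]
                        and 0 <= ns < cost.shape[3]):
                    nd = d + cost[ny, nx, no, ns]
                    if nd < dist[ny, nx, no, ns]:
                        dist[ny, nx, no, ns] = nd
                        parent[(ny, nx, no, ns)] = node
                        heapq.heappush(heap, (nd, (ny, nx, no, ns)))
        path, node = [end], end
        while node != start:                   # walk back to recover the curve
            node = parent[node]
            path.append(node)
        return path[::-1]

Projecting the (y, x) components of the returned path gives the centerline, and the scale coordinate along the path gives the radius estimate.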
Citations: 69
Learning color and locality cues for moving object detection and segmentation
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206857
Feng Liu, Michael Gleicher
This paper presents an algorithm for automatically detecting and segmenting a moving object from a monocular video. Detecting and segmenting a moving object from a video with limited object motion is challenging. Since existing automatic algorithms rely on motion to detect the moving object, they cannot work well when the object motion is sparse and insufficient. In this paper, we present an unsupervised algorithm to learn object color and locality cues from the sparse motion information. We first detect key frames with reliable motion cues and then estimate moving sub-objects based on these motion cues using a Markov Random Field (MRF) framework. From these sub-objects, we learn an appearance model as a color Gaussian Mixture Model. To avoid the false classification of background pixels with similar color to the moving objects, the locations of these sub-objects are propagated to neighboring frames as locality cues. Finally, robust moving object segmentation is achieved by combining these learned color and locality cues with motion cues in a MRF framework. Experiments on videos with a variety of object and camera motion demonstrate the effectiveness of this algorithm.
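A sketch of the learned color cue, assuming scikit-learn's GaussianMixture over RGB rows (the component count and color space are guesses, not the paper's settings); the resulting log-likelihood ratio would enter the MRF data term alongside the locality and motion cues:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    def learn_color_models(fg_pixels, bg_pixels, n_components=5):
        # fg_pixels, bg_pixels: (N, 3) arrays of colors gathered from the
        # moving sub-objects and the remaining background in key frames.
        fg = GaussianMixture(n_components=n_components).fit(fg_pixels)
        bg = GaussianMixture(n_components=n_components).fit(bg_pixels)
        return fg, bg

    def color_log_ratio(fg, bg, pixels):
        # log p(color | object) - log p(color | background), per pixel;
        # positive values favor the object label in the MRF.
        return fg.score_samples(pixels) - bg.score_samples(pixels)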
Citations: 54
Contextual classification with functional Max-Margin Markov Networks
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206590
Daniel Munoz, Andrew Bagnell
We address the problem of label assignment in computer vision: given a novel 3D or 2D scene, we wish to assign a unique label to every site (voxel, pixel, superpixel, etc.). To this end, the Markov Random Field framework has proven to be a model of choice as it uses contextual information to yield improved classification results over locally independent classifiers. In this work we adapt a functional gradient approach for learning high-dimensional parameters of random fields in order to perform discrete, multi-label classification. With this approach we can learn robust models involving high-order interactions better than the previously used learning method. We validate the approach in the context of point cloud classification and improve the state of the art. In addition, we successfully demonstrate the generality of the approach on the challenging vision problem of recovering 3-D geometric surfaces from images.
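A heavily simplified, hypothetical rendering of one functional-gradient round for binary site labels: the functional gradient of the margin objective is positive at ground-truth sites and negative at currently predicted sites, and a weak regressor is fitted to that signed target and added to the score ensemble. Loss augmentation, the high-order interactions, and the MRF inference step are all elided here.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def functional_gradient_round(features, y_true, y_pred, ensemble, step=0.5):
        # features: (N, d) per-site descriptors; y_true, y_pred: (N,) in {0, 1}.
        # y_pred would come from MRF inference under the current score function.
        target = y_true.astype(float) - y_pred.astype(float)
        weak = DecisionTreeRegressor(max_depth=3).fit(features, target)
        ensemble.append((step, weak))
        return ensemble

    def site_scores(features, ensemble):
        # The learned score function is an additive ensemble of weak learners.
        return sum(s * w.predict(features) for s, w in ensemble)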
Citations: 336
Classifier grids for robust adaptive object detection
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206616
P. Roth, Sabine Sternig, H. Grabner, H. Bischof
In this paper we present an adaptive but robust object detector for static cameras by introducing classifier grids. Instead of using a sliding window for object detection, we propose to train a separate classifier for each image location, obtaining a very specific object detector with a low false alarm rate. For each classifier corresponding to a grid element we estimate two generative representations in parallel, one describing the object class and one describing the background. These are combined in order to obtain a discriminative model. To adapt to changing environments, these classifiers are learned on-line (i.e., by boosting). Continuous learning (24 hours a day, 7 days a week) requires a stable system. In our method this is ensured by a fixed object representation, while only the representation of the background is updated. We demonstrate the stability in a long-term experiment by running the system for a whole week, which shows stable performance over time. In addition, we compare the proposed approach to state-of-the-art methods in the field of person and car detection. In both cases we obtain competitive results.
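A toy per-grid-cell classifier in the spirit described above: the object representation stays fixed while only the background statistics adapt on-line. Single diagonal Gaussians stand in for the paper's boosted classifiers; the feature vector, names, and learning rate are assumptions.

    import numpy as np

    class GridCellClassifier:
        def __init__(self, obj_mean, obj_var):
            self.obj_mean = np.asarray(obj_mean, float)   # fixed object model
            self.obj_var = np.asarray(obj_var, float)
            self.bg_mean = np.zeros_like(self.obj_mean)   # adapts on-line
            self.bg_var = np.ones_like(self.obj_var)

        def update_background(self, feat, lr=0.05):
            # Running Gaussian estimate of this cell's background appearance.
            self.bg_mean = (1 - lr) * self.bg_mean + lr * feat
            self.bg_var = (1 - lr) * self.bg_var + lr * (feat - self.bg_mean) ** 2

        def log_ratio(self, feat):
            # Discriminative score: object log-likelihood minus background's.
            def loglik(x, m, v):
                return -0.5 * np.sum(np.log(2 * np.pi * v) + (x - m) ** 2 / v)
            return (loglik(feat, self.obj_mean, self.obj_var)
                    - loglik(feat, self.bg_mean, self.bg_var))

One such instance per grid location gives a detector that never drifts on the object side, which is what makes week-long continuous operation plausible.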
Citations: 82
Robust guidewire tracking in fluoroscopy
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206692
Peng Wang, Terrence Chen, Ying Zhu, Wei Zhang, S. Zhou, D. Comaniciu
A guidewire is a medical device inserted into vessels during image-guided interventions for balloon inflation. During interventions, the guidewire undergoes non-rigid deformation due to the patient's breathing and cardiac motion, and these 3D motions become complicated when projected onto 2D fluoroscopy. Furthermore, fluoroscopy exhibits severe image artifacts and other wire-like structures. All of this makes robust guidewire tracking challenging. To address these challenges, this paper presents a probabilistic framework for robust guidewire tracking. We first introduce a semantic guidewire model with three parts: a catheter tip, a guidewire tip and a guidewire body. Measurements of the different parts are integrated into a Bayesian framework as measurements of a whole guidewire for robust tracking. Moreover, for each part, two types of measurements, one from learning-based detectors and the other from online appearance models, are applied and combined. A hierarchical, multi-resolution tracking scheme based on kernel-based measurement smoothing is then developed to track guidewires effectively and efficiently in a coarse-to-fine manner. The presented framework has been validated on a test set of 47 sequences and achieves a mean tracking error of less than 2 pixels. This demonstrates the great potential of our method for clinical applications.
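Two small fragments sketching the measurement handling described above, assuming log-linear fusion of the two measurement types and a 1-D Gaussian kernel for the measurement smoothing along the wire; the paper's Bayesian part-based model and its tracking state space are considerably richer.

    import numpy as np

    def fused_log_measurement(det_score, app_score, w_det=0.5, eps=1e-9):
        # Combine a learning-based detector response with an online
        # appearance-model response for one guidewire hypothesis.
        return (w_det * np.log(det_score + eps)
                + (1 - w_det) * np.log(app_score + eps))

    def kernel_smoothed(scores, positions, query, bandwidth=3.0):
        # Kernel-based measurement smoothing: the response at a query point
        # along the wire is a Gaussian-weighted average of nearby responses.
        w = np.exp(-0.5 * ((positions - query) / bandwidth) ** 2)
        return float(np.sum(w * scores) / np.sum(w))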
Citations: 73
Learning shape prior models for object matching
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206568
Tingting Jiang, F. Jurie, C. Schmid
The aim of this work is to learn a shape prior model for an object class and to improve shape matching with the learned prior. Given images of example instances, we learn a mean shape of the object class as well as the variations of the non-affine and affine transformations separately, based on the thin plate spline (TPS) parameterization. Unlike previous methods, we represent shapes for learning by vector fields instead of features, which makes our learning approach general. During shape matching, we inject the shape prior knowledge and make the matching result consistent with the training examples. This is achieved by an extension of the TPS-RPM algorithm which finds a closed-form solution for the TPS transformation coherent with the learned transformations. We test our approach by using it to learn shape prior models for all five object classes in the ETHZ Shape Classes. The results show that the learning accuracy is better than in previous work and that the learned shape prior models are helpful for object matching in real applications such as object classification.
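Since the prior is built on the TPS parameterization, the standard 2-D thin plate spline solve is the core primitive. Below is the conventional closed-form fit with kernel U(r) = r^2 log r, returning the non-affine weights and the affine block that the paper models separately; the small regularizer is an assumption for numerical stability.

    import numpy as np

    def tps_kernel(r):
        out = np.zeros_like(r)
        nz = r > 0
        out[nz] = r[nz] ** 2 * np.log(r[nz])   # U(r) = r^2 log r, U(0) = 0
        return out

    def fit_tps(src, dst, reg=1e-6):
        # src, dst: (n, 2) corresponding control points. Solves the usual
        # TPS linear system [[K + reg*I, P], [P^T, 0]] [W; A] = [dst; 0].
        n = len(src)
        K = tps_kernel(np.linalg.norm(src[:, None] - src[None, :], axis=-1))
        P = np.hstack([np.ones((n, 1)), src])
        L = np.zeros((n + 3, n + 3))
        L[:n, :n] = K + reg * np.eye(n)
        L[:n, n:] = P
        L[n:, :n] = P.T
        Y = np.vstack([dst, np.zeros((3, 2))])
        params = np.linalg.solve(L, Y)
        return params[:n], params[n:]           # W (non-affine), A (affine)

    def tps_transform(points, src, W, A):
        U = tps_kernel(np.linalg.norm(points[:, None] - src[None, :], axis=-1))
        return A[0] + points @ A[1:] + U @ W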
Citations: 51
Fast concurrent object localization and recognition
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206805
Tom Yeh, John J. Lee, Trevor Darrell
Object localization and recognition are important problems in computer vision. However, in many applications, exhaustive search over all object models and image locations is computationally prohibitive. While several methods have been proposed to make either recognition or localization more efficient, few have dealt with both tasks simultaneously. This paper proposes an efficient method for concurrent object localization and recognition based on a data-dependent multi-class branch-and-bound formalism. Existing bag-of-features recognition techniques that can be expressed as weighted combinations of feature counts can be readily adapted to our method. We present experimental results that demonstrate the merit of our algorithm in terms of recognition accuracy, localization accuracy, and speed, compared to baseline approaches including exhaustive search, the implicit shape model (ISM), and efficient subwindow search (ESS). Moreover, we develop two extensions that consider non-rectangular bounding regions (composite boxes and polygons) and demonstrate their ability to achieve higher recognition scores than traditional rectangular bounding boxes.
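The rectangular baseline this paper extends, efficient subwindow search, is compact enough to sketch: best-first branch and bound over sets of rectangles, bounding every set by the positive per-pixel weights of its largest member plus the negative weights of its smallest. This single-class version over a precomputed weight map is a stand-in for the paper's multi-class, data-dependent formalism.

    import heapq
    import numpy as np

    def ess_search(weights):
        # weights: (h, w) per-pixel contributions to the window score,
        # e.g. bag-of-features SVM weights splatted at feature locations.
        pos = np.clip(weights, 0, None).cumsum(0).cumsum(1)   # integral images
        neg = np.clip(weights, None, 0).cumsum(0).cumsum(1)
        h, w = weights.shape

        def rsum(ii, t, b, l, r):
            if t > b or l > r:                 # empty rectangle
                return 0.0
            s = ii[b, r]
            if t > 0: s -= ii[t - 1, r]
            if l > 0: s -= ii[b, l - 1]
            if t > 0 and l > 0: s += ii[t - 1, l - 1]
            return s

        def bound(t1, t2, b1, b2, l1, l2, r1, r2):
            # Positives over the largest member, negatives over the smallest:
            # an upper bound for every rectangle in the set.
            return rsum(pos, t1, b2, l1, r2) + rsum(neg, t2, b1, l2, r1)

        box = (0, h - 1, 0, h - 1, 0, w - 1, 0, w - 1)   # t1,t2,b1,b2,l1,l2,r1,r2
        heap = [(-bound(*box), box)]
        while heap:
            nb, box = heapq.heappop(heap)
            t1, t2, b1, b2, l1, l2, r1, r2 = box
            spans = (t2 - t1, b2 - b1, l2 - l1, r2 - r1)
            if max(spans) == 0:                # all intervals are singletons
                return -nb, (t1, b1, l1, r1)   # score, (top, bottom, left, right)
            d = spans.index(max(spans))        # split the widest interval
            lo, hi = box[2 * d], box[2 * d + 1]
            mid = (lo + hi) // 2
            for a, b in ((lo, mid), (mid + 1, hi)):
                child = list(box)
                child[2 * d], child[2 * d + 1] = a, b
                if child[0] <= child[3] and child[4] <= child[7]:
                    heapq.heappush(heap, (-bound(*child), tuple(child)))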
Citations: 77
Flow mosaicking: Real-time pedestrian counting without scene-specific learning
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206648
Yang Cong, Haifeng Gong, Song-Chun Zhu, Yandong Tang
In this paper, we present a novel algorithm based on flow velocity field estimation to count the number of pedestrians crossing a detection line or inside a specified region. We regard pedestrians crossing the line as fluid flow and design a novel model to estimate the flow velocity field. By integrating over time, dynamic mosaics are constructed to count the number of pixels and edges passing through the line. Consequently, the number of pedestrians can be estimated by quadratic regression, with the number of weighted pixels and edges as input. The regressors are learned off-line for several camera tilt angles and take the calibration information into account. We use tilt-angle-specific learning to ensure direct deployment and avoid overfitting, whereas the commonly used scene-specific learning scheme needs on-site annotation and tends to overfit. Experiments on a variety of videos verified that the proposed method gives accurate estimates under different camera setups in real time.
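The final counting step is an ordinary least-squares quadratic regression from the accumulated weighted pixel and edge counts to a pedestrian count; a sketch follows, with the exact quadratic feature set being a guess.

    import numpy as np

    def fit_count_regressor(pixels, edges, counts):
        # pixels, edges: (N,) accumulated weighted pixel/edge counts per
        # training clip; counts: (N,) annotated pedestrian counts.
        X = np.column_stack([np.ones_like(pixels, dtype=float), pixels, edges,
                             pixels ** 2, edges ** 2, pixels * edges])
        beta, *_ = np.linalg.lstsq(X, counts, rcond=None)
        return beta

    def predict_count(beta, pixels, edges):
        x = np.array([1.0, pixels, edges, pixels ** 2, edges ** 2, pixels * edges])
        return float(x @ beta)

A separate beta would be fitted per camera tilt angle, matching the tilt-angle-specific learning described above.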
Citations: 90
Learning general optical flow subspaces for egomotion estimation and detection of motion anomalies
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206538
Richard Roberts, C. Potthast, F. Dellaert
This paper deals with estimation of dense optical flow and ego-motion in a generalized imaging system by exploiting probabilistic linear subspace constraints on the flow. We deal with the extended motion of the imaging system through an environment that we assume to have some degree of statistical regularity. For example, in autonomous ground vehicles the structure of the environment around the vehicle is far from arbitrary, and the depth at each pixel is often approximately constant. The subspace constraints hold not only for perspective cameras, but in fact for a very general class of imaging systems, including catadioptric and multiple-view systems. Using minimal assumptions about the imaging system, we learn a probabilistic subspace constraint that captures the statistical regularity of the scene geometry relative to an imaging system. We propose an extension to probabilistic PCA (Tipping and Bishop, 1999) as a way to robustly learn this subspace from recorded imagery, and demonstrate its use in conjunction with a sparse optical flow algorithm. To deal with the sparseness of the input flow, we use a generative model to estimate the subspace using only the observed flow measurements. Additionally, to identify and cope with image regions that violate subspace constraints, such as moving objects, objects that violate the depth regularity, or gross flow estimation errors, we employ a per-pixel Gaussian mixture outlier process. We demonstrate results of finding the optical flow subspaces and employing them to estimate dense flow and to recover camera motion for a variety of imaging systems in several different environments.
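The closed-form maximum-likelihood PPCA solution of Tipping and Bishop (1999) that this work extends is compact enough to state directly; the robust handling of sparse flow and the per-pixel outlier process described above are not reproduced here.

    import numpy as np

    def ppca_fit(flows, q):
        # flows: (N, D) matrix, one flattened flow field per row; q: subspace
        # dimension. Returns the mean, factor matrix W, and noise variance.
        mu = flows.mean(axis=0)
        X = flows - mu
        cov = X.T @ X / len(flows)
        vals, vecs = np.linalg.eigh(cov)            # ascending eigenvalues
        vals, vecs = vals[::-1], vecs[:, ::-1]      # sort descending
        sigma2 = vals[q:].mean()                    # mean discarded eigenvalue
        W = vecs[:, :q] @ np.diag(np.sqrt(np.maximum(vals[:q] - sigma2, 0.0)))
        return mu, W, sigma2

New flow measurements can then be scored under the learned subspace model, flagging pixels or frames that violate the constraint as motion anomalies.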
Citations: 63