
2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06): Latest Publications

A Dynamic Bayesian Network Model for Autonomous 3D Reconstruction from a Single Indoor Image
E. Delage, Honglak Lee, A. Ng
When we look at a picture, our prior knowledge about the world allows us to resolve some of the ambiguities that are inherent to monocular vision, and thereby infer 3D information about the scene. We also recognize different objects, decide on their orientations, and identify how they are connected to their environment. Focusing on the problem of autonomous 3D reconstruction of indoor scenes, in this paper we present a dynamic Bayesian network model capable of resolving some of these ambiguities and recovering 3D information for many images. Our model assumes a "floor-wall" geometry for the scene and is trained to recognize the floor-wall boundary in each column of the image. When the image is produced under perspective geometry, we show that this model can be used for 3D reconstruction from a single image. To our knowledge, this was the first monocular approach to automatically recover 3D reconstructions from single indoor images.
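Once the floor-wall boundary row is known in each column, perspective geometry turns it into depth directly. A minimal sketch, assuming a level pinhole camera with focal length `f` (pixels), principal-point row `v0`, and known height above the floor (all hypothetical parameters; the paper's DBN supplies the boundary rows):

```python
import numpy as np

def floor_depth_from_boundary(v_boundary, f, v0, cam_height):
    """Depth of the floor-wall boundary in each image column, assuming a
    level pinhole camera at height cam_height above a horizontal floor.
    By similar triangles, a floor pixel at row v below the horizon row v0
    lies at depth f * cam_height / (v - v0)."""
    v = np.asarray(v_boundary, dtype=float)
    return f * cam_height / (v - v0)
```

With the boundary row per column in hand, this one-line relation is what makes single-image reconstruction possible under the floor-wall assumption.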
DOI: 10.1109/CVPR.2006.23
Citations: 244
Polarization-based Surface Reconstruction via Patch Matching
G. Atkinson, E. Hancock
A new method for multiple-viewpoint 3D shape reconstruction is presented that relies on the polarization properties of surface reflection. The method is intended to complement existing stereo techniques by establishing correspondence for surfaces without salient features. The phase and degree of polarization from two views of an object are used to reconstruct surface patches. Local surface properties are then used both to align these patches and to compute a cost for that alignment. This cost is used as a basis to establish correspondence between the two views. The method is tested on an object library comprising shapes of varying complexity and material. An accuracy assessment is also presented in which real-world data are compared to ground truth.
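The degree of diffuse polarization constrains the surface zenith angle through the Fresnel equations, which is the physical basis for recovering patch shape from polarization images. A sketch of that standard relation (the refractive index n = 1.5 is an assumed material property, not a value from the paper):

```python
import numpy as np

def diffuse_polarization_degree(theta, n=1.5):
    """Degree of diffuse polarization as a function of zenith angle theta
    (radians) for refractive index n -- the Fresnel-based relation commonly
    used in shape-from-polarization. It is zero for a fronto-parallel
    surface and rises monotonically towards grazing angles, so it can be
    inverted numerically to estimate theta per pixel."""
    s2 = np.sin(theta) ** 2
    num = (n - 1.0 / n) ** 2 * s2
    den = (2 + 2 * n**2 - (n + 1.0 / n) ** 2 * s2
           + 4 * np.cos(theta) * np.sqrt(n**2 - s2))
    return num / den
```

The polarization phase supplies the azimuth of the surface normal, and this degree-of-polarization curve supplies the zenith; together they determine the normal up to ambiguities that patch matching across two views helps resolve.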
DOI: 10.1109/CVPR.2006.226
Citations: 13
Real Time Localization and 3D Reconstruction
E. Mouragnon, M. Lhuillier, M. Dhome, F. Dekeyser, P. Sayd
In this paper we describe a method that estimates the motion of a calibrated camera (mounted on an experimental vehicle) and the three-dimensional geometry of the environment. The only data used is a video input: interest points are tracked and matched between frames at video rate. Robust estimates of the camera motion are computed in real time, and key-frames are selected to permit 3D reconstruction of the features. The algorithm is particularly appropriate for the reconstruction of long image sequences thanks to the introduction of a fast, local bundle adjustment method that ensures both good accuracy and consistency of the estimated camera poses along the sequence. It also greatly reduces computational complexity compared to a global bundle adjustment. Experiments on real data were carried out to evaluate the speed and robustness of the method on a sequence about one kilometer long. Results are also compared to ground truth measured with a differential GPS.
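The key idea of the fast, local bundle adjustment is to refine only the most recent key-frame poses while older ones stay fixed. A toy sketch of that windowed refinement (translation-only poses with identity rotation and known 3D points are simplifying assumptions for illustration, not the paper's parametrisation):

```python
import numpy as np
from scipy.optimize import least_squares

def project(pts3d, t):
    """Toy pinhole camera: identity rotation, unit focal length,
    pose reduced to a translation t."""
    p = pts3d - t
    return p[:, :2] / p[:, 2:3]

def local_ba(cam_ts, pts3d, obs, window=3):
    """Refine only the last `window` camera poses against their 2D
    observations (obs[i] = projections of pts3d seen by camera i);
    older poses stay fixed. This sliding window is the 'local' in
    local bundle adjustment."""
    fixed, free = cam_ts[:-window], cam_ts[-window:]

    def residuals(x):
        rs = []
        for t, uv in zip(x.reshape(-1, 3), obs[-window:]):
            rs.append((project(pts3d, t) - uv).ravel())
        return np.concatenate(rs)

    sol = least_squares(residuals, free.ravel())
    return np.vstack([fixed, sol.x.reshape(-1, 3)])
```

Because the normal equations only involve the window's poses and their points, each refinement step stays cheap no matter how long the trajectory grows, which is what keeps the method real-time.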
DOI: 10.1109/CVPR.2006.236
Citations: 443
Fusion of Summation Invariants in 3D Human Face Recognition
Wei-Yang Lin, Kin-Chung Wong, N. Boston, Y. Hu
A novel family of 2D and 3D geometrically invariant features, called summation invariants, is proposed for recognition of the 3D surface of human faces. Focusing on a rectangular region surrounding the nose of a 3D facial depth map, a subset of the so-called semi-local summation invariant features is extracted. The similarity between a pair of 3D facial depth maps is then computed to determine whether they belong to the same person. Out of the many possible combinations of this set of features, we select, through careful experimentation, a subset that yields the best combined performance. Tested with 3D facial data from the ongoing Face Recognition Grand Challenge v1.0 dataset, the proposed new features exhibit significant performance improvement over the baseline algorithm distributed with the dataset.
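Summation invariants are built from sums of coordinate monomials over a patch, which makes them cheap to compute and tolerant of noise. A sketch of such raw summation features over a depth patch (an illustrative subset of monomial sums, not the paper's exact invariant combinations):

```python
import numpy as np

def summation_features(patch):
    """Raw summation features over a depth patch: sums of coordinate
    monomials weighted by depth z. Ratios and polynomial combinations of
    such sums can be constructed to cancel the effect of transformations
    such as translation, which is the idea behind summation invariants."""
    h, w = patch.shape
    y, x = np.mgrid[0:h, 0:w].astype(float)
    z = patch.astype(float)
    return np.array([z.sum(), (x * z).sum(), (y * z).sum(), (x * y * z).sum()])
```

Since every feature is a single pass of additions over the patch, extracting them from the nose region of a depth map is fast even for dense scans.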
DOI: 10.1109/CVPR.2006.124
Citations: 20
Reconstructing Occluded Surfaces Using Synthetic Apertures: Stereo, Focus and Robust Measures
V. Vaish, M. Levoy, R. Szeliski, C. L. Zitnick, S. B. Kang
Most algorithms for 3D reconstruction from images use cost functions based on SSD (sum of squared differences), which assume that the surfaces being reconstructed are visible to all cameras. This makes it difficult to reconstruct objects which are partially occluded. Recently, researchers working with large camera arrays have shown it is possible to "see through" occlusions using a technique called synthetic aperture focusing. This suggests that we can design alternative cost functions that are robust to occlusions using synthetic apertures. Our paper explores this design space. We compare classical shape from stereo with shape from synthetic aperture focus. We also describe two variants of multi-view stereo, based on color medians and entropy, that increase robustness to occlusions. We present an experimental comparison of these cost functions on complex light fields, measuring their accuracy against the amount of occlusion.
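The contrast between an SSD-style cost and a robust alternative shows up directly on the per-pixel color samples a synthetic aperture gathers across the camera array at one depth hypothesis. A small sketch (the median-absolute-deviation cost here is a robust measure in the spirit of the paper's median-based variant, not its exact definition):

```python
import numpy as np

def sa_costs(samples):
    """Per-pixel matching costs from color samples gathered across a
    camera array at a single depth hypothesis (samples: cameras x pixels).
    The variance (an SSD-like cost) assumes every camera sees the surface,
    so one occluded view inflates it; the median absolute deviation stays
    small as long as a majority of views see the true surface."""
    ssd = samples.var(axis=0)
    med = np.median(samples, axis=0)
    robust = np.median(np.abs(samples - med), axis=0)
    return ssd, robust
```

Sweeping the depth hypothesis and taking the minimum of the robust cost per pixel is what lets the reconstruction "see through" partial occluders.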
DOI: 10.1109/CVPR.2006.244
Citations: 226
A Mean Field EM-algorithm for Coherent Occlusion Handling in MAP-Estimation Problems
R. Fransens, C. Strecha, L. Gool
This paper presents a generative-model-based approach to dealing with occlusions in vision problems that can be formulated as MAP-estimation problems. The approach is generic and targets applications in diverse domains such as model-based object recognition, depth-from-stereo, and image registration. It relies on a probabilistic imaging model in which visible regions and occlusions are generated by two separate processes. The partitioning into visible and occluded regions is made explicit by the introduction of a hidden binary visibility map which, to account for the coherent nature of occlusions, is modelled as a Markov random field. Inference is made tractable by a mean field EM-algorithm, which alternates between estimation of visibility and optimisation of model parameters. We demonstrate the effectiveness of the approach with two examples. First, in an N-view stereo experiment, we compute a dense depth map of a scene that is contaminated by multiple occluding objects. Finally, in a 2D face recognition experiment, we try to identify people from partially occluded facial images.
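The E-step idea, estimating a per-pixel visibility probability that is smoothed by its neighbours, can be sketched as a mean-field update for a binary map under an Ising prior (an illustrative sketch; the paper's likelihood models and MRF details differ):

```python
import numpy as np

def mean_field_visibility(log_lik_vis, log_lik_occ, beta=1.0, iters=20):
    """Mean-field E-step for a binary visibility map with an Ising
    smoothness prior (coupling strength beta over the 4-neighbourhood).
    log_lik_vis / log_lik_occ are per-pixel log-likelihoods under the
    visible and occlusion processes; returns q, the per-pixel probability
    of being visible."""
    q = np.full(log_lik_vis.shape, 0.5)
    cnt = np.zeros_like(q)            # number of neighbours per pixel
    cnt[1:, :] += 1; cnt[:-1, :] += 1; cnt[:, 1:] += 1; cnt[:, :-1] += 1
    for _ in range(iters):
        nb = np.zeros_like(q)         # sum of neighbours' visibilities
        nb[1:, :] += q[:-1, :]; nb[:-1, :] += q[1:, :]
        nb[:, 1:] += q[:, :-1]; nb[:, :-1] += q[:, 1:]
        # data term plus the neighbours' expected spins (2q - 1)
        logit = (log_lik_vis - log_lik_occ) + beta * (2 * nb - cnt)
        q = 1.0 / (1.0 + np.exp(-logit))
    return q
```

The coupling term is what makes the recovered occlusions coherent blobs rather than salt-and-pepper noise; in the full EM loop the model parameters would be re-optimised against this soft visibility map at each M-step.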
DOI: 10.1109/CVPR.2006.31
Citations: 25
Learning Semantic Patterns with Discriminant Localized Binary Projections
Shuicheng Yan, Tianqiang Yuan, Xiaoou Tang
In this paper, we present a novel approach to learning semantic localized patterns with binary projections in a supervised manner. The pursuit of these binary projections is reformulated as a feature-clustering problem, which optimizes the separability of different classes by taking the members within each cluster as the nonzero entries of a projection vector. An efficient greedy procedure is proposed to incrementally combine the sub-clusters while ensuring the cardinality constraints of the projections and the increase of the objective function. Compared with other algorithms for sparse representations, our proposed algorithm, referred to as Discriminant Localized Binary Projections (dlb), has the following characteristics: 1) dlb is supervised, hence much more effective in terms of classification power than unsupervised sparse algorithms such as Non-negative Matrix Factorization (NMF); 2) like NMF, dlb can derive spatially localized sparse bases; furthermore, the sparsity of dlb is controllable, and an interesting result is that the bases have explicit semantics in human perception, such as eyes and mouth; and 3) classification with dlb is extremely efficient, with only addition operations required for dimensionality reduction. Extensive experimental results show significant improvements of dlb in sparsity and face recognition accuracy in comparison to state-of-the-art algorithms for dimensionality reduction and sparse representations.
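The efficiency claim in point 3 follows because a binary projection matrix reduces dimensionality with sums alone. A minimal sketch (the cluster assignments here are hypothetical placeholders, whereas the paper learns them by supervised feature clustering):

```python
import numpy as np

def binary_project(x, clusters):
    """Apply a binary projection: output coordinate k is the plain sum of
    the input features assigned to cluster k, i.e. a matrix-vector product
    with a 0/1 matrix, so only additions are needed at test time."""
    return np.array([x[idx].sum() for idx in clusters])
```

Because each cluster corresponds to a spatially localized set of pixels, the resulting basis vectors are the localized, human-interpretable patterns the abstract describes.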
DOI: 10.1109/CVPR.2006.173
Citations: 3
Shape Representation based on Integral Kernels: Application to Image Matching and Segmentation
Byung-Woo Hong, E. Prados, Stefano Soatto, L. Vese
This paper presents a shape representation and a variational framework for the construction of diffeomorphisms that establish "meaningful" correspondences between images, in that they preserve the local geometry of singularities such as region boundaries. At the same time, the shape representation allows shape information to be enforced locally in determining such region boundaries. Our representation is based on a kernel descriptor that characterizes local shape. This shape descriptor is robust to noise and forms a scale-space in which an appropriate scale can be chosen depending on the size of the features of interest in the scene. In order to preserve local shape during the matching procedure, we introduce a novel constraint into traditional energy-based approaches to estimating diffeomorphic deformations, and enforce it in a variational framework.
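One common instance of an integral-kernel shape descriptor is the fraction of a disk centred at each pixel that lies inside the shape; varying the radius yields exactly the kind of scale-space described above. A brute-force sketch on a binary mask (illustrative; not necessarily the paper's exact kernel):

```python
import numpy as np

def disk_area_descriptor(mask, radius):
    """For each pixel, the fraction of a disk of the given radius lying
    inside the shape (mask == 1). Computed by direct correlation of the
    zero-padded mask with a disk kernel; interior pixels score 1, points
    on a straight boundary about 0.5, and convex corners less."""
    r = radius
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    disk = (x * x + y * y <= r * r).astype(float)
    pad = np.pad(mask.astype(float), r)
    h, w = mask.shape
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (pad[i:i + 2 * r + 1, j:j + 2 * r + 1] * disk).sum()
    return out / disk.sum()
```

Because the descriptor is an area integral rather than a derivative, isolated noisy pixels barely perturb it, which is the robustness property the abstract relies on.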
DOI: 10.1109/CVPR.2006.277
Citations: 39
Covariance Tracking using Model Update Based on Lie Algebra
F. Porikli, Oncel Tuzel, P. Meer
We propose a simple and elegant algorithm to track nonrigid objects using a covariance-based object description and a Lie-algebra-based update mechanism. We represent an object window as the covariance matrix of its features, thereby capturing the spatial and statistical properties, as well as their correlation, within a single representation. The covariance matrix enables efficient fusion of different types of features and modalities, and its dimensionality is small. We incorporate a model update algorithm that uses the Lie group structure of positive definite matrices. The update mechanism effectively adapts to ongoing object deformations and appearance changes. The covariance tracking method makes no assumptions about the measurement noise or the motion of the tracked objects, and provides the globally optimal solution. We show that it is capable of accurately detecting nonrigid, moving objects in non-stationary camera sequences while achieving a promising detection rate of 97.4 percent.
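The descriptor and the dissimilarity used to compare candidate windows can be sketched directly: the covariance of per-pixel feature vectors, matched via generalized eigenvalues (the standard metric for covariance descriptors; the Lie-algebra model update is a separate step not shown here):

```python
import numpy as np

def covariance_descriptor(features):
    """Covariance descriptor of a window: features is an N x d array of
    per-pixel feature vectors (e.g. position, intensity, gradients), so
    the descriptor is a small d x d matrix regardless of window size."""
    return np.cov(features, rowvar=False)

def covariance_distance(c1, c2):
    """Dissimilarity of two covariance matrices from their generalized
    eigenvalues lambda_k (solutions of c1 v = lambda c2 v):
    sqrt(sum_k log^2 lambda_k). Zero iff the matrices are equal."""
    lam = np.linalg.eigvals(np.linalg.solve(c2, c1)).real
    return float(np.sqrt((np.log(lam) ** 2).sum()))
```

At each frame the tracker evaluates this distance between the model covariance and candidate windows, picking the minimum; because covariance matrices do not form a vector space, averaging them for the model update requires the Lie group machinery the paper introduces.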
DOI: 10.1109/CVPR.2006.94
Citations: 630
Tracking of Multiple, Partially Occluded Humans based on Static Body Part Detection
Bo Wu, R. Nevatia
Tracking of humans in videos is important for many applications. A major source of difficulty in performing this task is inter-human or scene occlusion. We present an approach based on representing humans as an assembly of four body parts and detecting the body parts in single frames, which makes the method insensitive to camera motion. The responses of the body-part detectors and a combined human detector provide the "observations" used for tracking. Trajectory initialization and termination are both fully automatic and rely on confidences computed from the detection responses. An object is tracked by data association if its corresponding detection response can be found; otherwise it is tracked by a mean-shift style tracker. Our method can track humans under both inter-object and scene occlusions. The system is evaluated on three sets of videos and compared with a previous method.
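The data-association step can be sketched as greedy nearest-neighbour matching with a gating distance, with unmatched detections available to spawn new trajectories (a simplified stand-in for the paper's detection-confidence-based affinities):

```python
def associate(tracks, detections, max_dist=50.0):
    """Greedily match each track (predicted (x, y) position) to its
    nearest unclaimed detection within a gating distance. Returns the
    track->detection assignment and the indices of leftover detections,
    which a tracker would use to initialize new trajectories; tracks
    with no match would fall back to an appearance-based tracker."""
    assigned = {}
    used = set()
    for ti, t in enumerate(tracks):
        best, best_d = None, max_dist
        for di, d in enumerate(detections):
            if di in used:
                continue
            dist = ((t[0] - d[0]) ** 2 + (t[1] - d[1]) ** 2) ** 0.5
            if dist < best_d:
                best, best_d = di, dist
        if best is not None:
            assigned[ti] = best
            used.add(best)
    new = [di for di in range(len(detections)) if di not in used]
    return assigned, new
```

The part-based detections make this association robust under partial occlusion: a track can still be matched when only some of the four body parts respond.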
DOI: 10.1109/CVPR.2006.312
Citations: 275