
Latest publications: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06)

Intelligent Collaborative Tracking by Mining Auxiliary Objects
Ming Yang, Ying Wu, S. Lao
Many tracking methods face a fundamental dilemma in practice: tracking has to be computationally efficient, but verifying whether the tracker is following the true target tends to be demanding, especially when the background is cluttered and/or occlusion occurs. Due to the lack of a good solution to this problem, many existing methods tend to be either computationally intensive, through the use of sophisticated image observation models, or vulnerable to false alarms. This greatly threatens long-duration robust tracking. This paper presents a novel solution to this dilemma by integrating into the tracking process a set of auxiliary objects that are automatically discovered in the video on the fly by data mining. Auxiliary objects have three properties, at least over a short time interval: (1) persistent co-occurrence with the target; (2) consistent motion correlation with the target; and (3) being easy to track. The collaborative tracking of these auxiliary objects leads to efficient computation as well as strong verification. Our extensive experiments have exhibited exciting performance in very challenging real-world testing cases.
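The first two auxiliary-object properties can be checked cheaply per candidate. A minimal sketch, assuming per-frame target and candidate positions plus a visibility mask; the function name and thresholds are illustrative, not from the paper:

```python
import numpy as np

def is_auxiliary(target_traj, cand_traj, visible, min_cooccur=0.9, min_corr=0.8):
    """Hypothetical check of properties (1) and (2): persistent
    co-occurrence and consistent motion correlation with the target."""
    # Property (1): fraction of frames where the candidate is visible
    # alongside the target.
    if np.asarray(visible, float).mean() < min_cooccur:
        return False
    # Property (2): correlation between frame-to-frame displacements
    # of target and candidate, averaged over the x and y components.
    dt = np.diff(np.asarray(target_traj, float), axis=0)
    dc = np.diff(np.asarray(cand_traj, float), axis=0)
    corrs = [np.corrcoef(dt[:, k], dc[:, k])[0, 1] for k in range(2)]
    return float(np.mean(corrs)) >= min_corr
```

A candidate passing both tests (and being easy to track) would qualify as an auxiliary object under this simplified reading.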
DOI: 10.1109/CVPR.2006.157 · Published: 2006-06-17
Citations: 35
Structure and View Estimation for Tomographic Reconstruction: A Bayesian Approach
S. P. Mallick, Sameer Agarwal, D. Kriegman, Serge J. Belongie, B. Carragher, C. Potter
This paper addresses the problem of reconstructing the density of a scene from multiple projection images produced by modalities such as x-ray, electron microscopy, etc., where an image value is related to the integral of the scene density along a 3D line segment between a radiation source and a point on the image plane. While computed tomography (CT) addresses this problem when the absolute orientation of the image plane and radiation source directions are known, this paper addresses the problem when the orientations are unknown - it is akin to the structure-from-motion (SFM) problem when the extrinsic camera parameters are unknown. We study the problem within the context of reconstructing the density of protein macro-molecules in Cryogenic Electron Microscopy (cryo-EM), where images are very noisy and existing techniques use several thousand images. In a non-degenerate configuration, the viewing planes corresponding to two projections intersect in a line in 3D. Using the geometry of the imaging setup, it is possible to determine the projections of this 3D line on the two image planes. In turn, the problem can be formulated as a type of orthographic structure from motion from line correspondences, where the line correspondences between two views are unreliable due to image noise. We formulate the task as the problem of denoising a correspondence matrix and present a Bayesian solution to it. Subsequently, the absolute orientation of each projection is determined, followed by density reconstruction. We show results on cryo-EM images of proteins and compare our results to that of Electron Micrograph Analysis (EMAN) - a widely used reconstruction tool in cryo-EM.
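The common 3D line of two non-parallel viewing planes follows directly from the plane equations: its direction is the cross product of the normals, and a point on it solves both equations. A minimal sketch, assuming planes in the form n·x = d (notation illustrative, not the paper's):

```python
import numpy as np

def common_line(n1, d1, n2, d2):
    """Direction and one point of the 3D line where two non-parallel
    planes n1·x = d1 and n2·x = d2 intersect."""
    n1, n2 = np.asarray(n1, float), np.asarray(n2, float)
    direction = np.cross(n1, n2)
    if np.linalg.norm(direction) < 1e-12:
        raise ValueError("planes are parallel; no unique common line")
    # Any point satisfying both plane equations: the minimum-norm
    # least-squares solution of the 2x3 system.
    A = np.stack([n1, n2])
    point, *_ = np.linalg.lstsq(A, np.array([d1, d2], float), rcond=None)
    return direction / np.linalg.norm(direction), point
```

Projecting this line onto each image plane then yields the line correspondences the paper denoises.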
DOI: 10.1109/CVPR.2006.295 · Published: 2006-06-17
Citations: 30
Interactive Feature Tracking using K-D Trees and Dynamic Programming
Aeron Buchanan, A. Fitzgibbon
A new approach to template tracking is presented, incorporating three distinct contributions. Firstly, an explicit definition for a feature track is given. Secondly, the advantages of an image preprocessing stage are demonstrated and, in particular, the effectiveness of highly compressed image patch data stored in k-d trees for fast and discriminatory image patch searches. Thirdly, the k-d trees are used to generate multiple track hypotheses which are efficiently merged to give the optimal solution using dynamic programming. The explicit separation of feature detection and trajectory determination creates the basis for the novel use of k-d trees and dynamic programming. Multiple appearances and occlusion handling are seamlessly integrated into this framework. Appearance variation through the sequence is robustly handled in an iterative process. The work presented is a significant foundation for a powerful off-line feature tracking system, particularly in the context of interactive applications.
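The third contribution, merging multiple track hypotheses by dynamic programming, can be sketched as a Viterbi-style pass over per-frame candidate detections; the k-d tree search that proposes the candidates is assumed to have already run, and the cost model below is illustrative:

```python
import numpy as np

def best_track(candidates, app_cost, smooth_weight=1.0):
    """Pick one candidate position per frame minimising appearance cost
    plus a motion-smoothness penalty, via dynamic programming.
    candidates[t]: list of (x, y) detections in frame t;
    app_cost[t]: matching appearance costs."""
    T = len(candidates)
    cost = [np.asarray(app_cost[0], float)]  # cumulative cost per candidate
    back = []                                # backpointers per frame
    for t in range(1, T):
        prev = np.asarray(candidates[t - 1], float)   # (m, 2)
        cur = np.asarray(candidates[t], float)        # (n, 2)
        # pairwise jump penalty between consecutive frames
        jump = np.linalg.norm(cur[:, None, :] - prev[None, :, :], axis=2)
        total = cost[-1][None, :] + smooth_weight * jump  # (n, m)
        back.append(total.argmin(axis=1))
        cost.append(total.min(axis=1) + np.asarray(app_cost[t], float))
    # backtrack the optimal hypothesis chain
    path = [int(np.argmin(cost[-1]))]
    for t in range(T - 2, -1, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]
```

Because the DP is global over the sequence, a momentarily cheaper but inconsistent detection (e.g. a distractor patch) is rejected in favour of the smooth trajectory.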
DOI: 10.1109/CVPR.2006.158 · Published: 2006-06-17
Citations: 76
Graph Partitioning by Spectral Rounding: Applications in Image Segmentation and Clustering
David Tolliver, G. Miller
We introduce a family of spectral partitioning methods. Edge separators of a graph are produced by iteratively reweighting the edges until the graph disconnects into the prescribed number of components. At each iteration a small number of eigenvectors with small eigenvalues are computed and used to determine the reweighting. In this way spectral rounding directly produces discrete solutions, whereas current spectral algorithms must map the continuous eigenvectors to discrete solutions by employing a heuristic geometric separator (e.g. k-means). We show that spectral rounding compares favorably to current spectral approximations on the Normalized Cut criterion (NCut). Results are given for natural image segmentation, medical image segmentation, and clustering. A practical version is shown to converge.
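A toy sketch of the reweighting idea, heavily simplified from the paper (a single Fiedler vector, exponential attenuation, and pruning of near-zero edges are assumptions of this illustration, not the authors' exact scheme):

```python
import numpy as np
from collections import deque

def components(W):
    """Connected-component labels of a weighted adjacency matrix (BFS)."""
    n = W.shape[0]
    label = -np.ones(n, int)
    c = 0
    for s in range(n):
        if label[s] >= 0:
            continue
        q = deque([s]); label[s] = c
        while q:
            u = q.popleft()
            for v in np.nonzero(W[u] > 0)[0]:
                if label[v] < 0:
                    label[v] = c; q.append(v)
        c += 1
    return label

def spectral_round(W, iters=50, beta=4.0):
    """Attenuate edges whose endpoints disagree in an eigenvector with
    small eigenvalue until the graph disconnects; the discrete
    segmentation is then the component labelling itself."""
    W = np.asarray(W, float).copy()
    for _ in range(iters):
        lab = components(W)
        if lab.max() >= 1:          # graph has disconnected: done
            return lab
        L = np.diag(W.sum(1)) - W   # graph Laplacian
        f = np.linalg.eigh(L)[1][:, 1]          # Fiedler vector
        gap = np.abs(f[:, None] - f[None, :])
        W *= np.exp(-beta * gap / (gap.max() + 1e-12))
        W[W < 1e-3 * W.max()] = 0.0             # prune near-zero edges
    return components(W)
```

Note how no k-means step is needed at the end: the partition is read directly off the disconnected graph, which is the point of the spectral-rounding formulation.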
DOI: 10.1109/CVPR.2006.129 · Published: 2006-06-17
Citations: 109
Bottom-Up & Top-down Object Detection using Primal Sketch Features and Graphical Models
Iasonas Kokkinos, P. Maragos, A. Yuille
A combination of techniques that is becoming increasingly popular is the construction of part-based object representations using the outputs of interest-point detectors. Our contributions in this paper are twofold: first, we propose a primal-sketch-based set of image tokens that are used for object representation and detection. Second, top-down information is introduced based on an efficient method for the evaluation of the likelihood of hypothesized part locations. This allows us to use graphical model techniques to complement bottom-up detection, by proposing and finding the parts of the object that were missed by the front-end feature detection stage. Detection results for four object categories validate the merits of this joint top-down and bottom-up approach.
DOI: 10.1109/CVPR.2006.74 · Published: 2006-06-17
Citations: 44
Discriminative Object Class Models of Appearance and Shape by Correlatons
S. Savarese, J. Winn, A. Criminisi
This paper presents a new model of object classes which incorporates appearance and shape information jointly. Modeling object appearance by distributions of visual words has recently proven successful. Here appearance-based models are augmented by capturing the spatial arrangement of visual words. Compact spatial modeling without loss of discrimination is achieved through the introduction of adaptive vector quantized correlograms, which we call correlatons. Efficiency is further improved by means of integral images. The robustness of our new models to geometric transformations, severe occlusions and missing information is also demonstrated. The accuracy of discrimination of the proposed models is assessed with respect to existing databases with large numbers of object classes viewed under general conditions, and shown to outperform appearance-only models.
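A plain (non-adaptive) spatial correlogram, the structure that correlatons build on, can be sketched for a small 2-D map of visual-word indices; the distance bands and the collapse to a "same word" probability are simplifications of this illustration, not the paper's adaptive quantization:

```python
import numpy as np

def correlogram(words, radii):
    """For each distance band (lo, hi], the probability that two pixels
    that far apart carry the same visual word. `words` is a 2-D array
    of visual-word indices."""
    H, Wd = words.shape
    ys, xs = np.mgrid[0:H, 0:Wd]
    pts = np.stack([ys.ravel(), xs.ravel()], 1).astype(float)
    w = words.ravel()
    # all pairwise pixel distances and word agreements (fine for toy sizes)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=2)
    same = (w[:, None] == w[None, :])
    out = []
    for lo, hi in zip([0.0] + list(radii[:-1]), radii):
        band = (d > lo) & (d <= hi)
        out.append(same[band].mean() if band.any() else 0.0)
    return np.array(out)
```

The resulting per-band curves are what an adaptive vector quantizer would then compress into a compact correlaton codebook.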
DOI: 10.1109/CVPR.2006.102 · Published: 2006-06-17
Citations: 241
Acceleration Strategies for Gaussian Mean-Shift Image Segmentation
M. A. Carreira-Perpiñán
Gaussian mean-shift (GMS) is a clustering algorithm that has been shown to produce good image segmentations (where each pixel is represented as a feature vector with spatial and range components). GMS operates by defining a Gaussian kernel density estimate for the data and clustering together points that converge to the same mode under a fixed-point iterative scheme. However, the algorithm is slow, since its complexity is O(kN²), where N is the number of pixels and k the average number of iterations per pixel. We study four acceleration strategies for GMS based on the spatial structure of images and on the fact that GMS is an expectation-maximisation (EM) algorithm: spatial discretisation, spatial neighbourhood, sparse EM and the EM-Newton algorithm. We show that the spatial discretisation strategy can accelerate GMS by one to two orders of magnitude while achieving essentially the same segmentation, and that the other strategies attain speedups of less than an order of magnitude.
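The baseline fixed-point iteration being accelerated can be sketched directly; the mode-merging radius and stopping rule below are illustrative choices, not the paper's:

```python
import numpy as np

def gms_modes(X, sigma=1.0, iters=100, tol=1e-5):
    """Plain Gaussian mean-shift: run the fixed-point update
    x <- sum_i w_i(x) x_i / sum_i w_i(x) from every data point and
    cluster together points that converge to the same mode."""
    X = np.asarray(X, float)
    modes = X.copy()
    for _ in range(iters):
        # Gaussian kernel weights of every current mode against all data
        d2 = ((modes[:, None, :] - X[None, :, :]) ** 2).sum(2)
        w = np.exp(-0.5 * d2 / sigma**2)
        new = w @ X / w.sum(1, keepdims=True)   # EM-style fixed point
        done = np.abs(new - modes).max() < tol
        modes = new
        if done:
            break
    # merge modes closer than a small radius into cluster labels
    labels = -np.ones(len(X), int)
    c = 0
    for i in range(len(X)):
        if labels[i] >= 0:
            continue
        near = np.linalg.norm(modes - modes[i], axis=1) < 0.1 * sigma
        labels[near] = c
        c += 1
    return labels, modes
```

Each sweep over all N points costs O(N²), which is exactly the per-iteration cost the paper's spatial discretisation and sparse-EM strategies attack.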
DOI: 10.1109/CVPR.2006.44 · Published: 2006-06-17
Citations: 88
Classifying Human Dynamics Without Contact Forces
A. Bissacco, Stefano Soatto
We develop a classification algorithm for hybrid autoregressive models of human motion for the purpose of video-based analysis and recognition. We assume that some temporal statistics are extracted from the images, and we use them to infer a dynamical system that explicitly models contact forces. We then develop a distance between such models that explicitly factors out exogenous inputs that are not unique to an individual or her gait. We show that such a distance is more discriminative than the distance between simple linear systems, where most of the energy is devoted to modeling the dynamics of spurious nuisances such as contact forces.
DOI: 10.1109/CVPR.2006.75 · Published: 2006-06-17
Citations: 15
Improving Recognition of Novel Input with Similarity
Jerod J. Weinman, E. Learned-Miller
Many sources of information relevant to computer vision and machine learning tasks are often underused. One example is the similarity between the elements from a novel source, such as a speaker, writer, or printed font. By comparing instances emitted by a source, we help ensure that similar instances are given the same label. Previous approaches have clustered instances prior to recognition. We propose a probabilistic framework that unifies similarity with prior identity and contextual information. By fusing information sources in a single model, we eliminate unrecoverable errors that result from processing the information in separate stages and improve overall accuracy. The framework also naturally integrates dissimilarity information, which has previously been ignored. We demonstrate with an application in printed character recognition from images of signs in natural scenes.
DOI: 10.1109/CVPR.2006.151 · Published: 2006-06-17
Citations: 26
Diffusion Distance for Histogram Comparison
Haibin Ling, K. Okada
In this paper we propose diffusion distance, a new dissimilarity measure between histogram-based descriptors. We define the difference between two histograms to be a temperature field. We then study the relationship between histogram similarity and a diffusion process, showing how diffusion handles deformation as well as quantization effects. As a result, the diffusion distance is derived as the sum of dissimilarities over scales. Being a cross-bin histogram distance, the diffusion distance is robust to deformation, lighting change and noise in histogram-based local descriptors. In addition, it enjoys linear computational complexity, which significantly improves on previously proposed cross-bin distances with quadratic or higher complexity. We tested the proposed approach on both shape recognition and interest point matching tasks using several multi-dimensional histogram-based descriptors including shape context, SIFT, and spin images. In all experiments, the diffusion distance performs excellently in both accuracy and efficiency in comparison with other state-of-the-art distance measures. In particular, it performs as accurately as the Earth Mover’s Distance with much greater efficiency.
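For 1-D histograms, the scheme above reduces to repeatedly smoothing the difference field (a diffusion step), downsampling, and accumulating L1 norms across scales. A minimal sketch, with an illustrative smoothing kernel and level count:

```python
import numpy as np

def diffusion_distance(h1, h2, levels=4, kernel=(0.25, 0.5, 0.25)):
    """Sum of L1 norms of the histogram difference over a Gaussian-like
    smoothing-and-downsampling pyramid."""
    d = np.asarray(h1, float) - np.asarray(h2, float)  # temperature field
    total = np.abs(d).sum()
    k = np.asarray(kernel)
    for _ in range(levels):
        if len(d) < 3:
            break
        d = np.convolve(d, k, mode='same')  # one diffusion (smoothing) step
        d = d[::2]                          # downsample by 2
        total += np.abs(d).sum()
    return total
```

Unlike a plain bin-to-bin L1 distance, which scores a one-bin shift and a four-bin shift identically, the diffusion sum lets nearby mass cancel at coarser scales, so small shifts cost less.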
DOI: 10.1109/CVPR.2006.99 · Published: 2006-06-17
Citations: 270