Proceedings Ninth IEEE International Conference on Computer Vision: Latest Papers

Computing geodesics and minimal surfaces via graph cuts
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238310
Yuri Boykov, V. Kolmogorov
Geodesic active contours and graph cuts are two standard image segmentation techniques. We introduce a new segmentation method combining some of their benefits. Our main intuition is that any cut on a graph embedded in some continuous space can be interpreted as a contour (in 2D) or a surface (in 3D). We show how to build a grid graph and set its edge weights so that the cost of cuts is arbitrarily close to the length (area) of the corresponding contours (surfaces) for any anisotropic Riemannian metric. There are two interesting consequences of this technical result. First, graph cut algorithms can be used to find globally minimum geodesic contours (minimal surfaces in 3D) under an arbitrary Riemannian metric for a given set of boundary conditions. Second, we show how to minimize metrication artifacts in existing graph-cut based methods in vision. Theoretically speaking, our work provides an interesting link between several branches of mathematics: differential geometry, integral geometry, and combinatorial optimization. The main technical problem is solved using the Cauchy-Crofton formula from integral geometry.
Citations: 681
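The abstract of "Computing geodesics and minimal surfaces via graph cuts" rests on the Cauchy-Crofton formula, which expresses the Euclidean length of a plane curve as half the measure of all straight lines crossing it, counted with multiplicity. The sketch below checks that statement numerically for a closed polygon. It is only an illustration of the formula, not the authors' graph construction; the function name `crofton_length` and the grid resolutions are assumptions made for this example.

```python
import math

def crofton_length(vertices, n_theta=180, n_rho=800):
    """Estimate the perimeter of a closed polygon via the Cauchy-Crofton formula.

    A line is parameterized as x*cos(theta) + y*sin(theta) = rho with
    theta in [0, pi) and rho in [-r_max, r_max]. The perimeter equals
    one half of the integral of the crossing count over (theta, rho).
    """
    edges = list(zip(vertices, vertices[1:] + vertices[:1]))
    # Lines with |rho| > r_max cannot touch the polygon.
    r_max = max(math.hypot(x, y) for x, y in vertices)
    d_theta = math.pi / n_theta
    d_rho = 2.0 * r_max / n_rho
    total_crossings = 0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta  # midpoint rule in theta
        c, s = math.cos(theta), math.sin(theta)
        # Project both endpoints of every edge onto the line normal.
        proj = [(ax * c + ay * s, bx * c + by * s)
                for (ax, ay), (bx, by) in edges]
        for j in range(n_rho):
            rho = -r_max + (j + 0.5) * d_rho  # midpoint rule in rho
            # The line crosses an edge when the endpoints lie on opposite sides.
            total_crossings += sum(
                1 for pa, pb in proj if (pa - rho) * (pb - rho) < 0
            )
    return 0.5 * total_crossings * d_theta * d_rho

unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
estimate = crofton_length(unit_square)
print(f"Crofton estimate: {estimate:.3f} (true perimeter is 4)")
```

The estimate approaches the true perimeter as the two grids are refined. Replacing the uniform family of lines with a finite set of grid edge directions is the step that, per the abstract, lets cut cost approximate contour length.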
Learning how to inpaint from global image statistics
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238360
Anat Levin, A. Zomet, Yair Weiss
Inpainting is the problem of filling in holes in images. Considerable progress has been made by techniques that use the immediate boundary of the hole and some prior information on images to solve this problem. These algorithms successfully solve the local inpainting problem but they must, by definition, give the same completion to any two holes that have the same boundary, even when the rest of the image is vastly different. We address a different, more global inpainting problem: how can we use the rest of the image in order to learn how to inpaint? We approach this problem from the context of statistical learning. Given a training image, we build an exponential family distribution over images that is based on the histograms of local features. We then use this image-specific distribution to inpaint the hole by finding the most probable image given the boundary and the distribution. The optimization is done using loopy belief propagation. We show that our method can successfully complete holes while taking into account the specific image statistics. In particular, it can give vastly different completions even when the local neighborhoods are identical.
Citations: 354
Towards a mathematical theory of primal sketch and sketchability
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238631
Cheng-en Guo, Song-Chun Zhu, Y. Wu
In this paper, we present a mathematical theory for Marr's primal sketch. We first conduct a theoretical study of the descriptive Markov random field model and the generative wavelet/sparse coding model from the perspective of entropy and complexity. The competition between the two types of models defines the concept of "sketchability", which divides an image into texture and geometry. We then propose a primal sketch model that integrates the two models and, in addition, a Gestalt field model for spatial organization. We also propose a sketching pursuit process that coordinates the competition between two pursuit algorithms, matching pursuit (Mallat and Zhang, 1993) and filter pursuit (Zhu et al., 1997), which seek to explain the image by bases and filters, respectively. The model can be used to learn a dictionary of image primitives, or textons in Julesz's language, for natural images. The primal sketch model is not only parsimonious for image representation but also produces meaningful sketches over a large number of generic images.
Citations: 112
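The primal sketch abstract names matching pursuit (Mallat and Zhang, 1993) as one of the two competing pursuit algorithms. As a reminder of what that algorithm does, here is a minimal sketch of the generic greedy procedure on plain Python lists. It assumes unit-norm atoms and is not the sketching pursuit process of the paper; the function name `matching_pursuit` and the toy dictionary are illustrative choices.

```python
def matching_pursuit(signal, atoms, n_iter):
    """Greedy matching pursuit over a dictionary of unit-norm atoms.

    Each iteration picks the atom with the largest absolute inner
    product with the current residual, records that coefficient,
    and subtracts the atom's contribution from the residual.
    """
    residual = [float(v) for v in signal]
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        inner = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(inner[k]))
        coeffs[best] += inner[best]
        residual = [r - inner[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual

# Toy dictionary (an assumption for this example): the standard basis of R^3.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs, residual = matching_pursuit([3.0, 0.0, -2.0], basis, n_iter=2)
print(coeffs, residual)
```

With an orthonormal dictionary the residual vanishes after one step per nonzero component. With an overcomplete dictionary the same loop still reduces the residual energy at every step but may need more iterations.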
Object recognition with informative features and linear classification
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238356
Michel Vidal-Naquet, S. Ullman
We show that efficient object recognition can be obtained by combining informative features with linear classification. The results demonstrate the superiority of informative class-specific features, as compared with generic features such as wavelets, for the task of object recognition. We show that information-rich features can reach optimal performance with simple linear separation rules, while classifiers based on generic features require more complex classification schemes. This is significant because efficient and optimal methods have been developed for spaces that allow linear separation. To compare different strategies for feature extraction, we trained and compared classifiers working in feature spaces of the same low dimensionality, using two feature types (image fragments vs. wavelets) and two classification rules (linear hyperplane and a Bayesian network). The results show that by maximizing the individual information of the features, it is possible to obtain efficient classification by a simple linear separating rule, as well as more efficient learning.
Citations: 268
Landmark-based shape deformation with topology-preserving constraints
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238447
Song Wang, J. Ji, Zhi-Pei Liang
This paper presents a novel approach for landmark-based shape deformation, in which fitting error and shape difference are formulated as a support vector machine (SVM) regression problem. To describe nonrigid shape deformation well, this paper measures the shape difference using a thin-plate spline model. The proposed approach is capable of preserving the topology of the template shape in the deformation. This property is achieved by inserting a set of additional points and imposing a set of linear equality and/or inequality constraints. The underlying optimization problem is solved using a quadratic programming algorithm. The proposed method has been tested using practical data in the context of shape-based image segmentation. Some relevant practical issues, such as missing detected landmarks and selection of the regularization parameter, are also briefly discussed.
Citations: 12
Dense shape reconstruction of a moving object under arbitrary, unknown lighting
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238628
D. Simakov, D. Frolova, R. Basri
We present a method for shape reconstruction from several images of a moving object. The reconstruction is dense (up to image resolution). The method assumes that the motion is known, e.g., by tracking a small number of feature points on the object. The object is assumed to be Lambertian (completely matte); light sources should not be very close to the object but are otherwise arbitrary; and no knowledge of lighting conditions is required. An object changes its appearance significantly when it changes its orientation relative to light sources, causing violation of the common brightness constancy assumption. While a lot of effort is devoted to dealing with this violation, we demonstrate how to exploit it to recover 3D structure from 2D images. We propose a new correspondence measure that enables point matching across views of a moving object. The method has been tested both on computer-simulated examples and on a real object.
Citations: 60
Image registration with global and local luminance alignment
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238331
Jiaya Jia, Chi-Keung Tang
Inspired by tensor voting, we present luminance voting, a novel approach for image registration with global and local luminance alignment. The key to our modeless approach is the direct estimation of the replacement function, by reducing the complex estimation problem to robust 2D tensor voting in the corresponding voting spaces. No model for the replacement function is assumed. Luminance data are first encoded into 2D ball tensors. Subject to the monotonic constraint only, we vote for an optimal replacement function by propagating the smoothness constraint using a dense tensor field. Our method effectively infers missing curve segments and rejects image outliers without assuming any simplifying or complex curve model. The voted replacement functions are used in our iterative registration algorithm for computing the best warping matrix. Unlike previous approaches, our robust method corrects exposure disparity even if the two overlapping images are initially misaligned. Luminance voting is effective in correcting exposure difference, eliminating vignettes, and thus improving image registration. We present results on a variety of images.
Citations: 45
Regression based bandwidth selection for segmentation using Parzen windows
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238307
Maneesh Kumar Singh, N. Ahuja
We consider the problem of segmentation of images that can be modelled as piecewise continuous signals having unknown, nonstationary statistics. We propose a solution to this problem which first uses a regression framework to estimate the image PDF, and then mean-shift to find the modes of this PDF. The segmentation follows from mode identification wherein pixel clusters or image segments are identified with unique modes of the multimodal PDF. Each pixel is mapped to a mode using a convergent, iterative process. The effectiveness of the approach depends upon the accuracy of the (implicit) estimate of the underlying multimodal density function and thus on the bandwidth parameters used for its estimate using Parzen windows. Automatic selection of bandwidth parameters is a desired feature of the algorithm. We show that the proposed regression-based model admits a realistic framework to automatically choose bandwidth parameters that minimize a global error criterion. We validate the theory with results on real images.
Citations: 37
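The Parzen-window abstract describes mapping each pixel to a mode of the estimated density by a convergent, iterative process (mean shift). The sketch below shows that mode-seeking step in one dimension with a Gaussian Parzen window. The bandwidth is fixed by hand here, which is exactly the choice the paper proposes to automate; the regression-based selection is not reproduced. The function name `mean_shift_mode` and the sample values are assumptions for illustration.

```python
import math

def mean_shift_mode(x, samples, bandwidth, tol=1e-6, max_iter=500):
    """Follow mean-shift updates from x to a mode of a Gaussian Parzen estimate.

    Each update moves x to the kernel-weighted mean of the samples,
    which is a step uphill on the estimated density.
    """
    for _ in range(max_iter):
        weights = [math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples]
        new_x = sum(w * s for w, s in zip(weights, samples)) / sum(weights)
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    return x

# Hypothetical 1D intensities forming two clusters.
samples = [0.9, 0.95, 1.0, 1.05, 1.1, 4.9, 4.95, 5.0, 5.05, 5.1]
modes = [mean_shift_mode(s, samples, bandwidth=0.5) for s in samples]
# Samples that converge to the same mode belong to the same segment.
labels = [round(m) for m in modes]
print(labels)
```

A much larger bandwidth would merge the two clusters into a single mode and a much smaller one would split them into spurious modes, which is why the abstract ties segmentation quality to bandwidth selection.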
Reinforcement learning for combining relevance feedback techniques
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238390
Peng-Yeng Yin, B. Bhanu, Kuang-Cheng Chang, Anlei Dong
Relevance feedback (RF) is an interactive process which refines the retrievals by utilizing the user's feedback history. Most researchers strive to develop new RF techniques and ignore the advantages of existing ones. We propose an image relevance reinforcement learning (IRRL) model for integrating existing RF techniques. Various integration schemes are presented and a long-term shared memory is used to exploit the retrieval experience from multiple users. Also, a concept digesting method is proposed to reduce the complexity of storage demand. The experimental results show that the integration of multiple RF approaches gives better retrieval performance than using one RF technique alone, and that the sharing of relevance knowledge between multiple query sessions also contributes significantly to the improvement. Further, the storage demand is significantly reduced by the concept digesting technique. This shows the scalability of the proposed model to a growing database.
Citations: 5
Tracking articulated hand motion with eigen dynamics analysis
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238472
Hanning Zhou, Thomas S. Huang
This paper introduces the concept of eigen-dynamics and proposes an eigen dynamics analysis (EDA) method to learn the dynamics of natural hand motion from labelled sets of motion captured with a data glove. The result is parameterized with a high-order stochastic linear dynamic system (LDS) consisting of five lower-order LDSs, each corresponding to one eigen-dynamics. Based on the EDA model, we construct a dynamic Bayesian network (DBN) to analyze the generative process of an image sequence of natural hand motion. Using the DBN, a hand tracking system is implemented. Experiments on both synthesized and real-world data demonstrate the robustness and effectiveness of these techniques.
Citations: 123