
Proceedings Ninth IEEE International Conference on Computer Vision: Latest Publications

Surface reflectance modeling of real objects with interreflections
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238335
Takashi Machida, N. Yokoya, H. Takemura
In mixed reality, especially in augmented virtuality which virtualizes real objects, it is important to estimate object surface reflectance properties to render the objects under arbitrary illumination conditions. Though several methods have been explored to estimate the surface reflectance properties, it is still difficult to estimate surface reflectance parameters faithfully for complex objects which have nonuniform surface reflectance properties and exhibit interreflections. We describe a new method for densely estimating nonuniform surface reflectance properties of real objects constructed of convex and concave surfaces with interreflections. We use registered range and surface color texture images obtained by a laser rangefinder. Experiments show the usefulness of the proposed method.
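The per-point estimation the abstract describes can be illustrated, in a much simplified form, with a purely Lambertian model (the paper additionally handles non-uniform specular parameters and interreflections). Everything here, from the function name to the I = rho * max(0, n.l) model, is an illustrative assumption rather than the authors' algorithm:

```python
import numpy as np

# Minimal sketch (not the paper's algorithm): recover a per-point
# Lambertian albedo rho from registered range + color data, given a
# known distant light direction. I = rho * max(0, n.l) is assumed.
def estimate_albedo(intensity, normals, light_dir):
    l = np.asarray(light_dir, dtype=float)
    l /= np.linalg.norm(l)
    shading = np.clip(normals @ l, 1e-6, None)  # n.l, clamped to avoid /0
    return intensity / shading                  # rho per surface point

# Two surface points: one facing the light, one tilted away from it.
normals = np.array([[0.0, 0.0, 1.0],
                    [0.0, 0.6, 0.8]])
intensity = np.array([0.9, 0.72])
rho = estimate_albedo(intensity, normals, light_dir=[0, 0, 1])
```

Both points recover the same albedo once shading is divided out, which is the sense in which the estimate is "dense": it is computed independently at every registered range point.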
Citations: 24
Modeling textured motion: particle, wave and sketch
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238343
Yizhou Wang, Song-Chun Zhu
We present a generative model for textured motion phenomena, such as falling snow, wavy rivers and dancing grass. Firstly, we represent an image as a linear superposition of image bases selected from a generic and over-complete dictionary. The dictionary contains Gabor bases for point/particle elements and Fourier bases for wave elements. These bases compete to explain the input images. The transform from a raw image to a base or a token representation leads to a large dimension reduction. Secondly, we introduce a unified motion equation to characterize the motion of these bases and the interactions between waves and particles, e.g. a ball floating on water. We use a statistical learning algorithm to identify the structure of moving objects and their trajectories automatically. Novel sequences can then be synthesized easily from the motion and image models. Thirdly, we replace the dictionary of Gabor and Fourier bases with symbolic sketches (also bases). With the same image and motion model, we can render realistic and stylish cartoon animation. In our view, cartoon and sketch are symbolic visualizations of the inner representation for visual perception. The success of the cartoon animation, in turn, suggests that our image and motion models capture the essence of visual perception of textured motion.
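The "bases compete to explain the input images" step can be sketched with greedy matching pursuit over an over-complete dictionary; random unit atoms stand in for the Gabor and Fourier bases here, and all names are illustrative:

```python
import numpy as np

# Sketch of competition among dictionary atoms: at each step the atom
# most correlated with the residual wins and explains away its part.
def matching_pursuit(signal, dictionary, n_iter=10):
    residual = np.array(signal, dtype=float)
    coeffs = np.zeros(dictionary.shape[0])
    for _ in range(n_iter):
        scores = dictionary @ residual           # correlation with each atom
        k = np.argmax(np.abs(scores))            # best-matching atom wins
        coeffs[k] += scores[k]
        residual -= scores[k] * dictionary[k]    # remove its contribution
    return coeffs, residual

rng = np.random.default_rng(0)
D = rng.normal(size=(32, 16))                    # 32 atoms in 16 dimensions
D /= np.linalg.norm(D, axis=1, keepdims=True)    # unit-norm atoms
x = 2.0 * D[3] + 0.5 * D[10]                     # signal built from two atoms
coeffs, residual = matching_pursuit(x, D, n_iter=20)
```

The dominant atom is recovered with a large coefficient while the residual shrinks, which is the dimension reduction the abstract refers to: a raw signal becomes a handful of tokens.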
Citations: 53
Calibrating pan-tilt cameras in wide-area surveillance networks
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238329
James Davis, Xing Chen
Pan-tilt cameras are often used as components of wide-area surveillance systems. It is necessary to calibrate these cameras in relation to one another in order to obtain a consistent representation of the entire space. Existing methods for calibrating pan-tilt cameras have assumed an idealized model of camera mechanics. In addition, most methods have been calibrated using only a small range of camera motion. We present a method for calibrating pan-tilt cameras that introduces a more complete model of camera motion. Pan and tilt rotations are modeled as occurring around arbitrary axes in space. In addition, the wide area surveillance system itself is used to build a large virtual calibration object, resulting in better calibration than would be possible with a single small calibration target. Finally, the proposed enhancements are validated experimentally, with comparisons showing the improvement provided over more traditional methods.
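The more general motion model, rotation about an arbitrary axis in space (a direction plus a point the axis passes through, rather than an axis through the optical center), can be sketched with Rodrigues' formula. Names are illustrative:

```python
import numpy as np

# Sketch of a pan (or tilt) as rotation about an arbitrary axis in space.
# The axis is given by a direction and a point it passes through.
def rotate_about_axis(points, axis_dir, axis_point, angle):
    k = np.asarray(axis_dir, dtype=float)
    k /= np.linalg.norm(k)
    p = np.asarray(points, dtype=float) - axis_point   # move axis to origin
    c, s = np.cos(angle), np.sin(angle)
    # Rodrigues: v' = v cos(a) + (k x v) sin(a) + k (k.v)(1 - cos(a))
    rotated = p * c + np.cross(k, p) * s + np.outer(p @ k, k) * (1 - c)
    return rotated + axis_point                        # move axis back

# Rotate (2,0,0) by 90 degrees about the z-axis through (1,0,0).
pt = rotate_about_axis([[2.0, 0.0, 0.0]], [0, 0, 1], [1.0, 0.0, 0.0],
                       np.pi / 2)
```

The point ends up at (1, 1, 0), not (0, 2, 0): an axis offset from the optical center produces a visibly different motion, which is why the idealized on-center model leaves calibration error behind.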
Citations: 106
Eye gaze estimation from a single image of one eye
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238328
Jian-Gang Wang, E. Sung, R. Venkateswarlu
We present a novel approach, called the "one-circle" algorithm, for measuring the eye gaze using a monocular image that zooms in on only one eye of a person. Observing that the iris contour is a circle, we estimate the normal direction of this iris circle, considered as the eye gaze, from its elliptical image. From basic projective geometry, an ellipse can be back-projected into space onto two circles of different orientations. However, by using an anthropometric property of the eyeball, the correct solution can be disambiguated. This allows us to obtain a higher resolution image of the iris with a zoom-in camera and thereby achieve higher accuracy in the estimation. The robustness of our gaze determination approach was verified statistically by extensive experiments on synthetic and real image data. The two key contributions are that we show the possibility of finding the unique eye gaze direction from a single image of one eye and that one can obtain better accuracy as a consequence of this.
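The two-circle ambiguity behind the "one-circle" idea can be illustrated under weak perspective, where a circle slanted by sigma projects to an ellipse with axis ratio b/a = cos(sigma), leaving two mirror-image candidate normals for the anthropometric constraint to disambiguate. This is a geometric illustration, not the paper's full projective construction:

```python
import numpy as np

# Sketch: recover the two candidate normals of a circle from its
# elliptical (weak-perspective) image. a, b are the semi-axes of the
# fitted ellipse; tilt is the in-image orientation of its major axis.
def candidate_normals(a, b, tilt):
    sigma = np.arccos(b / a)                 # slant of the circle's plane
    n = np.array([np.sin(sigma) * np.cos(tilt),
                  np.sin(sigma) * np.sin(tilt),
                  np.cos(sigma)])
    m = n * np.array([-1.0, -1.0, 1.0])      # mirror-image twin solution
    return n, m

# An ellipse twice as wide as it is tall implies a 60-degree slant.
n1, n2 = candidate_normals(a=2.0, b=1.0, tilt=0.0)
```

Both candidates share the same slant; only their tilt differs by 180 degrees, which is exactly the ambiguity the eyeball's anthropometric property resolves.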
Citations: 195
Minimally-supervised classification using multiple observation sets
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238358
C. Stauffer
We discuss building complex classifiers from a single labeled example and a vast number of unlabeled observation sets, each derived from observation of a single process or object. When data can be measured by observation, it is often plentiful, and it is often possible to make more than one observation of the state of a process or object. We discuss how to exploit the variability across such sets of observations of the same object to estimate class labels for unlabeled examples given a minimal number of labeled examples. In contrast to similar semisupervised classification procedures that define the likelihood that two observations share a label as a function of the embedded distance between the two observations, this method uses the Naive Bayes estimate of how often the two observations did result from the same observed process. Exploiting this additional source of information in an iterative estimation procedure can generalize complex classification models from single labeled observations. Some examples involving classification of tracked objects in a low-dimensional feature space given thousands of unlabeled observation sets are used to illustrate the effectiveness of this method.
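The core leverage, that a label earned by any member of an observation set propagates to the whole set, can be sketched with a toy nearest-labeled-example rule standing in for the paper's Naive Bayes same-process estimate; all names and the radius threshold are illustrative:

```python
import numpy as np

# Sketch: every observation in a set comes from one object, so matching
# any single member against the labeled example labels the entire set.
def label_sets(labeled_x, labeled_y, observation_sets, radius=1.0):
    labels = []
    for obs_set in observation_sets:
        # distance from every member of the set to every labeled example
        dists = np.linalg.norm(obs_set[:, None, :] - labeled_x[None], axis=2)
        i, j = np.unravel_index(np.argmin(dists), dists.shape)
        labels.append(labeled_y[j] if dists[i, j] <= radius else None)
    return labels

labeled_x = np.array([[0.0, 0.0]])               # the single labeled example
labeled_y = ["pedestrian"]
sets = [np.array([[5.0, 5.0], [0.2, 0.1]]),      # one member matches
        np.array([[9.0, 9.0], [8.0, 7.0]])]      # no member is close enough
labels = label_sets(labeled_x, labeled_y, sets)
```

The first set is labeled even though one of its members is far from the labeled example, because a different member of the same set matched; that within-set propagation is what a single-observation classifier cannot do.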
Citations: 13
Using specularities for recognition
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238669
Margarita Osadchy, D. Jacobs, R. Ramamoorthi
Recognition systems have generally treated specular highlights as noise. We show how to use these highlights as a positive source of information that improves recognition of shiny objects. This also enables us to recognize very challenging shiny transparent objects, such as wine glasses. Specifically, we show how to find highlights that are consistent with a hypothesized pose of an object of known 3D shape. We do this using only a qualitative description of highlight formation that is consistent with most models of specular reflection, so no specific knowledge of an object's reflectance properties is needed. We first present a method that finds highlights produced by a dominant compact light source, whose position is roughly known. We then show how to estimate the lighting automatically for objects whose reflection is part specular and part Lambertian. We demonstrate this method for two classes of objects. First, we show that specular information alone can suffice to identify objects with no Lambertian reflectance, such as transparent wine glasses. Second, we use our complete system to recognize shiny objects, such as pottery.
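The qualitative highlight test can be sketched with the standard half-vector condition: a specular highlight can appear only where the surface normal nearly bisects the light and viewing directions. The threshold and all names are assumptions, not the paper's exact criterion:

```python
import numpy as np

# Sketch: given a hypothesized pose (surface normals), a light direction
# and a view direction, flag the points where a highlight is geometrically
# consistent. This needs no reflectance model, only mirror-like geometry.
def highlight_consistent(normals, light_dir, view_dir, cos_thresh=0.95):
    l = np.asarray(light_dir, dtype=float); l /= np.linalg.norm(l)
    v = np.asarray(view_dir, dtype=float); v /= np.linalg.norm(v)
    h = l + v
    h /= np.linalg.norm(h)                   # half-vector between l and v
    return normals @ h >= cos_thresh         # True where a highlight may sit

normals = np.array([[0.0, 0.0, 1.0],         # faces the half-vector
                    [1.0, 0.0, 0.0]])        # cannot produce this highlight
mask = highlight_consistent(normals, light_dir=[0, 0, 1], view_dir=[0, 0, 1])
```

An observed highlight at a point whose normal fails this test contradicts the hypothesized pose, which is how highlights become evidence rather than noise.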
Citations: 62
3D tracking = classification + interpolation
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238659
Carlo Tomasi, Slav Petrov, A. Sastry
Hand gestures are examples of fast and complex motions. Computers fail to track these in fast video, but sleight of hand fools humans as well: what happens too quickly we just cannot see. We show a 3D tracker for these types of motions that relies on the recognition of familiar configurations in 2D images (classification), and fills the gaps in-between (interpolation). We illustrate this idea with experiments on hand motions similar to finger spelling. The penalty for a recognition failure is often small: if two configurations are confused, they are often similar to each other, and the illusion works well enough, for instance, to drive a graphics animation of the moving hand. We contribute advances in both feature design and classifier training: our image features are invariant to image scale, translation, and rotation, and we propose a classification method that combines VQPCA with discrimination trees.
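The title's equation can be sketched directly: classification pins down the 3D pose at frames where a familiar configuration is recognized, and interpolation fills the frames in between. Linear pose interpolation is an illustrative stand-in for the paper's scheme, and all names are assumptions:

```python
import numpy as np

# Sketch: key_frames are the frame indices where classification fixed a
# pose; every other frame's pose is interpolated per dimension.
def interpolate_track(key_frames, key_poses, n_frames):
    key_poses = np.asarray(key_poses, dtype=float)
    track = np.empty((n_frames, key_poses.shape[1]))
    for d in range(key_poses.shape[1]):
        track[:, d] = np.interp(np.arange(n_frames),
                                key_frames, key_poses[:, d])
    return track

# Poses recognized at frames 0 and 4; frames 1-3 are filled in between.
track = interpolate_track([0, 4], [[0.0, 0.0], [4.0, 8.0]], n_frames=5)
```

This also shows why a confusion between similar configurations is cheap: interpolating from a near-miss keyframe still yields a plausible in-between trajectory.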
Citations: 113
Maintaining multimodality through mixture tracking
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238473
J. Vermaak, A. Doucet, P. Pérez
In recent years particle filters have become a tremendously popular tool to perform tracking for nonlinear and/or non-Gaussian models. This is due to their simplicity, generality and success over a wide range of challenging applications. Particle filters, and Monte Carlo methods in general, are, however, poor at consistently maintaining the multimodality of the target distributions that may arise due to ambiguity or the presence of multiple objects. To address this shortcoming, this paper proposes to model the target distribution as a nonparametric mixture model, and presents the general tracking recursion in this case. It is shown how a Monte Carlo implementation of the general recursion leads to a mixture of particle filters that interact only in the computation of the mixture weights, thus leading to an efficient numerical algorithm, where all the results pertaining to standard particle filters apply. The ability of the new method to maintain posterior multimodality is illustrated on a synthetic example and a real world tracking problem involving the tracking of football players in a video sequence.
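The bookkeeping in which component filters "interact only in the computation of the mixture weights" can be sketched in one dimension; the Gaussian likelihood and all names are illustrative assumptions, not the paper's recursion:

```python
import numpy as np

# Sketch: each mixture component runs its own weighted particle set.
# The only cross-component step is updating the mixture weights from
# each component's total (unnormalised) likelihood of the observation.
def update_mixture(particles, weights, mix_weights, observation, sigma=1.0):
    new_w, new_pi = [], []
    for x, w, pi in zip(particles, weights, mix_weights):
        lik = np.exp(-0.5 * ((x - observation) / sigma) ** 2)
        unnorm = w * lik
        new_pi.append(pi * unnorm.sum())      # component evidence for z
        new_w.append(unnorm / unnorm.sum())   # per-component normalisation
    new_pi = np.array(new_pi)
    return new_w, new_pi / new_pi.sum()

particles = [np.array([0.0, 0.1]), np.array([5.0, 5.2])]   # two modes
weights = [np.full(2, 0.5), np.full(2, 0.5)]
mix = np.array([0.5, 0.5])
weights, mix = update_mixture(particles, weights, mix, observation=0.05)
```

Each component's particle weights stay normalised on their own, so standard particle-filter results apply within a component, while the mixture weights carry the competition between modes.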
Citations: 463
Bayesian clustering of optical flow fields
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238470
J. Hoey, J. Little
We present a method for unsupervised learning of classes of motions in video. We project optical flow fields to a complete, orthogonal, a-priori set of basis functions in a probabilistic fashion, which improves the estimation of the projections by incorporating uncertainties in the flows. We then cluster the projections using a mixture of feature-weighted Gaussians over optical flow fields. The resulting model extracts a concise probabilistic description of the major classes of optical flow present. The method is demonstrated on a video of a person's facial expressions.
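The representation step, projecting each flattened flow field onto an orthonormal basis before clustering the coefficients, can be sketched as follows. A random orthonormal basis obtained by QR stands in for the paper's a-priori basis set, the probabilistic projection and feature-weighted Gaussian mixture are omitted, and all names are illustrative:

```python
import numpy as np

# Sketch: flatten each flow field and project it onto the first
# n_coeffs columns of an orthonormal basis, giving a short coefficient
# vector per field on which clustering can then operate.
def project_flows(flows, n_coeffs):
    flat = np.asarray(flows, dtype=float).reshape(len(flows), -1)
    rand = np.random.default_rng(1).normal(size=(flat.shape[1],) * 2)
    basis, _ = np.linalg.qr(rand)             # orthonormal columns
    return flat @ basis[:, :n_coeffs]         # coefficient vectors

rng = np.random.default_rng(0)
left = rng.normal([-1.0, 0.0], 0.05, size=(10, 2))   # flows pointing left
right = rng.normal([1.0, 0.0], 0.05, size=(10, 2))   # flows pointing right
coeffs = project_flows(np.vstack([left, right]), n_coeffs=2)
```

Because the basis is orthonormal, distances between fields survive the projection, so the two motion classes remain separated in coefficient space and a mixture model can pick them apart.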
Citations: 17
Computing MAP trajectories by representing, propagating and combining PDFs over groups
Pub Date : 2003-10-13 DOI: 10.1109/ICCV.2003.1238637
Paul Smith, T. Drummond, K. Roussopoulos
This paper addresses the problem of computing the trajectory of a camera from sparse positional measurements that have been obtained from visual localisation, and dense differential measurements from odometry or inertial sensors. A fast method is presented for fusing these two sources of information to obtain the maximum a posteriori estimate of the trajectory. A formalism is introduced for representing probability density functions over Euclidean transformations, and it is shown how these density functions can be propagated along the data sequence and how multiple estimates of a transformation can be combined. A three-pass algorithm is described which makes use of these results to yield the trajectory of the camera. Simulation results are presented which are validated against a physical analogue of the vision problem, and results are then shown from sequences of approximately 1,800 frames captured from a video camera mounted on a go-kart. Several of these frames are processed using computer vision to obtain estimates of the position of the go-kart. The algorithm fuses these estimates with odometry from the entire sequence in 150 ms to obtain the trajectory of the kart.
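A one-dimensional caricature of the fusion problem: integrate the dense odometry, then spread the drift revealed at each sparse absolute fix linearly over the intervening frames. This illustrates the goal only; the paper's three-pass MAP algorithm over Euclidean transformations is far more general, and all names here are assumptions:

```python
import numpy as np

# Sketch: odometry gives dense relative increments, fixes gives sparse
# absolute positions {frame: position}. Dead-reckon, then interpolate
# the correction between consecutive fixes.
def fuse(odometry, fixes):
    traj = np.concatenate([[0.0], np.cumsum(odometry)])   # dead reckoning
    frames = sorted(fixes)
    for a, b in zip(frames[:-1], frames[1:]):
        drift_a = fixes[a] - traj[a]
        drift_b = fixes[b] - traj[b]
        t = np.linspace(0.0, 1.0, b - a + 1)
        traj[a:b + 1] += (1 - t) * drift_a + t * drift_b  # spread the drift
    return traj

# Odometry overestimates each unit step by 10%; fixes at frames 0 and 4.
traj = fuse(odometry=[1.1, 1.1, 1.1, 1.1], fixes={0: 0.0, 4: 4.0})
```

The corrected trajectory passes through both fixes while staying smooth in between, which is the qualitative behaviour the MAP estimate delivers for full 3D camera poses.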
Citations: 32