Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238310
Yuri Boykov, V. Kolmogorov
Geodesic active contours and graph cuts are two standard image segmentation techniques. We introduce a new segmentation method that combines some of their benefits. Our main intuition is that any cut on a graph embedded in some continuous space can be interpreted as a contour (in 2D) or a surface (in 3D). We show how to build a grid graph and set its edge weights so that the cost of cuts is arbitrarily close to the length (area) of the corresponding contours (surfaces) for any anisotropic Riemannian metric. This technical result has two interesting consequences. First, graph cut algorithms can be used to find globally minimum geodesic contours (minimal surfaces in 3D) under an arbitrary Riemannian metric for a given set of boundary conditions. Second, we show how to minimize metrication artifacts in existing graph-cut based methods in vision. Theoretically, our work provides an interesting link between several branches of mathematics: differential geometry, integral geometry, and combinatorial optimization. The main technical problem is solved using the Cauchy-Crofton formula from integral geometry.
Title: Computing geodesics and minimal surfaces via graph cuts (Proceedings Ninth IEEE International Conference on Computer Vision)
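The central idea, that a discrete min cut can stand in for a continuous contour, rests on standard max-flow machinery. The sketch below is a toy illustration only: an Edmonds-Karp cut on a 4-connected grid with an ad-hoc contrast-based weight, not the paper's Cauchy-Crofton edge weights or its anisotropic metric.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp max-flow. `cap` is a dict-of-dicts of residual capacities
    and is modified in place."""
    flow = 0.0
    while True:
        # breadth-first search for a shortest augmenting path
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:
            u = q.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push                    # forward residual
            cap[v][u] = cap[v].get(u, 0) + push  # backward residual
        flow += push

def segment(image, fg_seeds, bg_seeds, smooth=2.0):
    """Binary min-cut segmentation of a small 2D intensity image."""
    h, w = len(image), len(image[0])
    S, T, INF = 's', 't', 1e9
    cap = {S: {}, T: {}}
    for y in range(h):
        for x in range(w):
            cap[(y, x)] = {}
    for y in range(h):
        for x in range(w):
            p = (y, x)
            # n-links: cheap to cut across strong intensity edges
            for n in ((y + 1, x), (y, x + 1)):
                if n[0] < h and n[1] < w:
                    wgt = smooth / (1.0 + abs(image[y][x] - image[n[0]][n[1]]))
                    cap[p][n] = cap[n][p] = wgt
            # t-links: hard seed constraints
            if p in fg_seeds:
                cap[S][p] = INF
            if p in bg_seeds:
                cap[p][T] = INF
    max_flow(cap, S, T)
    # pixels still reachable from the source lie on the foreground side of the cut
    seen, q = {S}, deque([S])
    while q:
        u = q.popleft()
        for v, c in cap[u].items():
            if c > 0 and v not in seen:
                seen.add(v)
                q.append(v)
    return [[1 if (y, x) in seen else 0 for x in range(w)] for y in range(h)]
```

On an image with a dark left half and bright right half, seeding one pixel on each side makes the minimum cut follow the intensity boundary.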
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238360
Anat Levin, A. Zomet, Yair Weiss
Inpainting is the problem of filling in holes in images. Considerable progress has been made by techniques that use the immediate boundary of the hole and some prior information on images to solve this problem. These algorithms successfully solve the local inpainting problem, but they must, by definition, give the same completion to any two holes that have the same boundary, even when the rest of the image is vastly different. We address a different, more global inpainting problem: how can we use the rest of the image to learn how to inpaint? We approach this problem from the context of statistical learning. Given a training image, we build an exponential-family distribution over images that is based on the histograms of local features. We then use this image-specific distribution to inpaint the hole by finding the most probable image given the boundary and the distribution. The optimization is done using loopy belief propagation. We show that our method can successfully complete holes while taking into account the specific image statistics. In particular, it can give vastly different completions even when the local neighborhoods are identical.
Title: Learning how to inpaint from global image statistics
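The optimization step (loopy belief propagation over an image-specific MRF) can be illustrated with a generic min-sum BP sketch. The unary and pairwise costs below are simple hand-picked placeholders, not the learned exponential-family potentials the paper describes.

```python
def inpaint_bp(image, hole, labels, beta=1.0, iters=30):
    """Min-sum loopy belief propagation on a 4-connected grid MRF.
    Observed pixels get a (near-)hard unary cost; hole pixels a flat one."""
    h, w = len(image), len(image[0])
    L = len(labels)

    def unary(p, k):
        if p in hole:
            return 0.0
        return 0.0 if labels[k] == image[p[0]][p[1]] else 1e6

    def pair(k, l):  # smoothness: penalize label differences
        return beta * abs(labels[k] - labels[l])

    nbrs = {(y, x): [(y + dy, x + dx)
                     for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if 0 <= y + dy < h and 0 <= x + dx < w]
            for y in range(h) for x in range(w)}
    msgs = {(p, q): [0.0] * L for p in nbrs for q in nbrs[p]}
    for _ in range(iters):
        new = {}
        for p, q in msgs:
            incoming = [msgs[(r, p)] for r in nbrs[p] if r != q]
            out = [min(unary(p, k) + pair(k, l) + sum(m[k] for m in incoming)
                       for k in range(L)) for l in range(L)]
            lo = min(out)
            new[(p, q)] = [v - lo for v in out]  # normalize for stability
        msgs = new
    result = [row[:] for row in image]
    for p in hole:
        belief = [unary(p, k) + sum(msgs[(r, p)][k] for r in nbrs[p])
                  for k in range(L)]
        result[p[0]][p[1]] = labels[belief.index(min(belief))]
    return result
```

With a constant surround, the hole pixel's belief is dominated by the smoothness messages and the completion matches its neighborhood.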
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238631
Cheng-en Guo, Song-Chun Zhu, Y. Wu
In this paper, we present a mathematical theory for Marr's primal sketch. We first conduct a theoretical study of the descriptive Markov random field model and the generative wavelet/sparse coding model from the perspective of entropy and complexity. The competition between the two types of models defines the concept of "sketchability", which divides the image into texture and geometry. We then propose a primal sketch model that integrates the two models and, in addition, a Gestalt field model for spatial organization. We also propose a sketching pursuit process that coordinates the competition between two pursuit algorithms, matching pursuit (Mallat and Zhang, 1993) and filter pursuit (Zhu et al., 1997), which seek to explain the image by bases and filters, respectively. The model can be used to learn a dictionary of image primitives, or textons in Julesz's language, for natural images. The primal sketch model is not only parsimonious for image representation but also produces meaningful sketches over a large number of generic images.
Title: Towards a mathematical theory of primal sketch and sketchability
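One half of the pursuit competition is Mallat and Zhang's greedy matching pursuit. A minimal version over a unit-norm dictionary (the signal and dictionary here are illustrative) might look like:

```python
def matching_pursuit(signal, dictionary, n_atoms=3):
    """Greedy matching pursuit (Mallat & Zhang, 1993): repeatedly project the
    residual onto the best-correlated unit-norm atom."""
    residual = list(signal)
    code = []  # (atom index, coefficient) pairs
    for _ in range(n_atoms):
        best_i, best_c = None, 0.0
        for i, atom in enumerate(dictionary):
            c = sum(r * a for r, a in zip(residual, atom))  # <residual, atom>
            if abs(c) > abs(best_c):
                best_i, best_c = i, c
        if best_i is None:
            break  # nothing left to explain
        code.append((best_i, best_c))
        residual = [r - best_c * a for r, a in zip(residual, dictionary[best_i])]
    return code, residual
```

For a signal that is an exact two-atom combination, pursuit recovers both atoms and drives the residual to zero.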
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238356
Michel Vidal-Naquet, S. Ullman
We show that efficient object recognition can be obtained by combining informative features with linear classification. The results demonstrate the superiority of informative class-specific features, as compared with generic features such as wavelets, for the task of object recognition. We show that information-rich features can reach optimal performance with simple linear separation rules, while classifiers based on generic features require more complex classification schemes. This is significant because efficient and optimal methods have been developed for spaces that allow linear separation. To compare different strategies for feature extraction, we trained and compared classifiers working in feature spaces of the same low dimensionality, using two feature types (image fragments vs. wavelets) and two classification rules (a linear hyperplane and a Bayesian network). The results show that by maximizing the individual information of the features, it is possible to obtain efficient classification with a simple linear separating rule, as well as more efficient learning.
Title: Object recognition with informative features and linear classification
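"Maximizing the individual information of the features" can be sketched as ranking binary feature columns by their mutual information with a binary class label. This is a simplified stand-in for the paper's fragment-selection procedure; all names below are illustrative.

```python
import math

def mutual_info(feature, labels):
    """I(F;Y) in bits for binary feature/label vectors of equal length."""
    n = len(labels)
    info = 0.0
    for f in (0, 1):
        for y in (0, 1):
            p_fy = sum(1 for a, b in zip(feature, labels) if a == f and b == y) / n
            p_f = sum(1 for a in feature if a == f) / n
            p_y = sum(1 for b in labels if b == y) / n
            if p_fy > 0:
                info += p_fy * math.log2(p_fy / (p_f * p_y))
    return info

def select_informative(X, y, k):
    """Rank feature columns of X by individual mutual information with y
    and return the indices of the top k."""
    d = len(X[0])
    cols = [[row[j] for row in X] for j in range(d)]
    scores = [(mutual_info(col, y), j) for j, col in enumerate(cols)]
    return [j for _, j in sorted(scores, reverse=True)[:k]]
```

A feature identical to the label carries one full bit and is selected first; an uncorrelated feature scores zero.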
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238447
Song Wang, J. Ji, Zhi-Pei Liang
This paper presents a novel approach for landmark-based shape deformation, in which fitting error and shape difference are formulated as a support vector machine (SVM) regression problem. To describe nonrigid shape deformation well, the shape difference is measured using a thin-plate spline model. The proposed approach is capable of preserving the topology of the template shape during deformation. This property is achieved by inserting a set of additional points and imposing a set of linear equality and/or inequality constraints. The underlying optimization problem is solved using a quadratic programming algorithm. The proposed method has been tested on practical data in the context of shape-based image segmentation. Some relevant practical issues, such as missing detected landmarks and selection of the regularization parameter, are also briefly discussed.
Title: Landmark-based shape deformation with topology-preserving constraints
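For the thin-plate spline shape measure, the textbook 2D TPS interpolation fit (kernel U(r) = r^2 log r) can be sketched as below. This is the standard unconstrained construction, not the paper's constrained QP formulation.

```python
import numpy as np

def tps_fit(points, values, reg=0.0):
    """Fit a 2D thin-plate spline f(x, y) interpolating `values` at `points`.
    Kernel U(r) = r^2 log r with U(0) = 0; `reg` adds optional smoothing."""
    pts = np.asarray(points, float)
    n = len(pts)
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    K = np.zeros_like(d)
    mask = d > 0
    K[mask] = d[mask] ** 2 * np.log(d[mask])
    K += reg * np.eye(n)
    P = np.hstack([np.ones((n, 1)), pts])          # affine part [1, x, y]
    A = np.zeros((n + 3, n + 3))
    A[:n, :n], A[:n, n:], A[n:, :n] = K, P, P.T
    b = np.concatenate([np.asarray(values, float), np.zeros(3)])
    coef = np.linalg.solve(A, b)
    w, a = coef[:n], coef[n:]

    def f(x, y):
        r = np.linalg.norm(pts - np.array([x, y], float), axis=1)
        u = np.zeros_like(r)
        m = r > 0
        u[m] = r[m] ** 2 * np.log(r[m])
        return float(u @ w + a[0] + a[1] * x + a[2] * y)

    return f
```

Because the side conditions annihilate affine functions, data sampled from a plane is reproduced exactly (the warp coefficients w vanish).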
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238628
D. Simakov, D. Frolova, R. Basri
We present a method for shape reconstruction from several images of a moving object. The reconstruction is dense (up to image resolution). The method assumes that the motion is known, e.g., by tracking a small number of feature points on the object. The object is assumed to be Lambertian (completely matte); light sources should not be very close to the object but are otherwise arbitrary, and no knowledge of the lighting conditions is required. An object changes its appearance significantly when it changes its orientation relative to the light sources, violating the common brightness constancy assumption. While much effort has been devoted to dealing with this violation, we demonstrate how to exploit it to recover 3D structure from 2D images. We propose a new correspondence measure that enables point matching across views of a moving object. The method has been tested on both computer-simulated examples and a real object.
Title: Dense shape reconstruction of a moving object under arbitrary, unknown lighting
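As a point of reference for what a correspondence measure is: the standard normalized cross-correlation score, shown below, is invariant only to affine brightness changes between patches. It is a common baseline, not the lighting-aware measure this paper proposes.

```python
def ncc(patch_a, patch_b):
    """Normalized cross-correlation between two equal-length patches.
    Returns a score in [-1, 1]; 1 means a perfect match up to an affine
    brightness change (gain and bias)."""
    n = len(patch_a)
    ma = sum(patch_a) / n
    mb = sum(patch_b) / n
    da = [a - ma for a in patch_a]   # zero-mean versions
    db = [b - mb for b in patch_b]
    num = sum(x * y for x, y in zip(da, db))
    den = (sum(x * x for x in da) * sum(y * y for y in db)) ** 0.5
    return num / den if den > 0 else 0.0
```

Scaling and shifting one patch leaves the score at 1, which is exactly the invariance that breaks down when object rotation changes the shading itself.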
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238331
Jiaya Jia, Chi-Keung Tang
Inspired by tensor voting, we present luminance voting, a novel approach to image registration with global and local luminance alignment. The key to our modeless approach is the direct estimation of the replacement function, achieved by reducing the complex estimation problem to robust 2D tensor voting in the corresponding voting spaces. No model for the replacement function is assumed. Luminance data are first encoded as 2D ball tensors. Subject only to the monotonic constraint, we vote for an optimal replacement function by propagating the smoothness constraint using a dense tensor field. Our method effectively infers missing curve segments and rejects image outliers without assuming any simplifying or complex curve model. The voted replacement functions are used in our iterative registration algorithm to compute the best warping matrix. Unlike previous approaches, our robust method corrects exposure disparity even if the two overlapping images are initially misaligned. Luminance voting is effective in correcting exposure differences and eliminating vignetting, and thus improves image registration. We present results on a variety of images.
Title: Image registration with global and local luminance alignment
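The monotonic constraint on the replacement function can also be enforced by classical isotonic regression. The pool-adjacent-violators sketch below is an illustrative alternative to the paper's tensor-voting machinery, fitting the best non-decreasing curve to noisy (luminance-in, luminance-out) samples.

```python
def isotonic_fit(xs, ys):
    """Pool-adjacent-violators algorithm: least-squares non-decreasing fit
    to (xs, ys). Returns the xs sorted ascending and the fitted values
    aligned with them."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    y = [ys[i] for i in order]
    blocks = []  # each block is [sum, count]; its level is sum / count
    for v in y:
        blocks.append([v, 1])
        # merge backwards while a block level exceeds its successor's
        while (len(blocks) > 1 and
               blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return [xs[i] for i in order], fit
```

A local decrease in the data is pooled into a flat segment, so the output is guaranteed monotonic.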
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238307
Maneesh Kumar Singh, N. Ahuja
We consider the problem of segmenting images that can be modelled as piecewise continuous signals with unknown, nonstationary statistics. We propose a solution that first uses a regression framework to estimate the image PDF, and then uses mean shift to find the modes of this PDF. The segmentation follows from mode identification, wherein pixel clusters or image segments are identified with unique modes of the multimodal PDF. Each pixel is mapped to a mode using a convergent, iterative process. The effectiveness of the approach depends upon the accuracy of the (implicit) estimate of the underlying multimodal density function, and thus on the bandwidth parameters used for its Parzen-window estimate. Automatic selection of the bandwidth parameters is a desired feature of the algorithm. We show that the proposed regression-based model admits a realistic framework for automatically choosing bandwidth parameters that minimize a global error criterion. We validate the theory with results on real images.
Title: Regression based bandwidth selection for segmentation using Parzen windows
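The mode-finding step can be illustrated with plain 1D mean shift under a Gaussian Parzen window. Note the bandwidth here is fixed by hand, which is precisely the parameter the paper's regression framework selects automatically.

```python
import math

def mean_shift_modes(data, bandwidth, tol=1e-6, max_iter=500):
    """Map each 1D sample to a mode of the Gaussian Parzen-window density
    estimate, then merge points that converge to the same mode."""
    def shift(x):
        for _ in range(max_iter):
            w = [math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data]
            new_x = sum(wi * di for wi, di in zip(w, data)) / sum(w)
            if abs(new_x - x) < tol:
                break
            x = new_x
        return x

    modes = []
    for d in data:
        m = shift(d)
        for known in modes:            # cluster coincident modes
            if abs(known - m) < bandwidth / 10:
                break
        else:
            modes.append(m)
    return sorted(modes)
```

Two well-separated clusters yield two modes, one per cluster, and every sample is assigned to one of them by its convergence point.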
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238390
Peng-Yeng Yin, B. Bhanu, Kuang-Cheng Chang, Anlei Dong
Relevance feedback (RF) is an interactive process that refines retrievals by utilizing the user's feedback history. Most researchers strive to develop new RF techniques and ignore the advantages of existing ones. We propose an image relevance reinforcement learning (IRRL) model for integrating existing RF techniques. Various integration schemes are presented, and a long-term shared memory is used to exploit the retrieval experience of multiple users. In addition, a concept-digesting method is proposed to reduce the storage demand. The experimental results show that integrating multiple RF approaches gives better retrieval performance than using any single RF technique alone, and that sharing relevance knowledge between multiple query sessions also contributes significantly to the improvement. Further, the storage demand is significantly reduced by the concept-digesting technique, which demonstrates the scalability of the proposed model on a database of growing size.
Title: Reinforcement learning for combining relevance feedback techniques
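The action-selection core of such a reinforcement-learning integration can be sketched as an epsilon-greedy bandit over candidate RF techniques. This is a drastic simplification of the IRRL model (no shared memory, no concept digesting); the function names are illustrative.

```python
import random

def choose_rf(q_values, epsilon=0.1, rng=random):
    """Epsilon-greedy selection: explore a random RF technique with
    probability epsilon, otherwise exploit the highest Q-value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)

def update_q(q_values, counts, arm, reward):
    """Incremental-mean update of the chosen technique's Q-value from the
    user's feedback reward."""
    counts[arm] += 1
    q_values[arm] += (reward - q_values[arm]) / counts[arm]
```

Each query session would call `choose_rf`, apply the chosen technique, and feed the resulting retrieval quality back through `update_q`.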
Pub Date: 2003-10-13 | DOI: 10.1109/ICCV.2003.1238472
Hanning Zhou, Thomas S. Huang
This paper introduces the concept of eigen-dynamics and proposes an eigen-dynamics analysis (EDA) method to learn the dynamics of natural hand motion from labelled sets of motion captured with a data glove. The result is parameterized as a high-order stochastic linear dynamic system (LDS) consisting of five lower-order LDSs, each corresponding to one eigen-dynamics. Based on the EDA model, we construct a dynamic Bayesian network (DBN) to analyze the generative process of an image sequence of natural hand motion. Using the DBN, a hand tracking system is implemented. Experiments on both synthesized and real-world data demonstrate the robustness and effectiveness of these techniques.
Title: Tracking articulated hand motion with eigen dynamics analysis
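Learning the deterministic part of a linear dynamic system from motion-capture sequences reduces to a least-squares fit of the transition matrix. The minimal sketch below fits a single LDS, not the paper's five-LDS eigen-dynamics decomposition or its stochastic terms.

```python
import numpy as np

def fit_lds(states):
    """Least-squares estimate of A in x_{t+1} ≈ A x_t from a sequence of
    states (one state per row): minimizes ||X1 - A X0||_F."""
    X = np.asarray(states, float)
    X0, X1 = X[:-1].T, X[1:].T        # d x (T-1) snapshot matrices
    return X1 @ np.linalg.pinv(X0)

def simulate(A, x0, steps):
    """Roll the (noise-free) system forward from x0 for `steps` steps."""
    xs = [np.asarray(x0, float)]
    for _ in range(steps):
        xs.append(A @ xs[-1])
    return np.array(xs)
```

On a noise-free trajectory whose states span the state space, the estimate recovers the true transition matrix exactly.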