Pub Date: 2003-10-13 · DOI: 10.1109/ICCV.2003.1238433
E. Prados, O. Faugeras
This article proposes a solution to the Lambertian shape from shading (SFS) problem in the case of a pinhole camera model (performing a perspective projection). Our approach is based upon the notion of viscosity solutions of Hamilton-Jacobi equations. This approach allows us to naturally deal with nonsmooth solutions and provides a mathematical framework for proving the correctness of our algorithms. Our work extends previous work in the area in three respects. First, it models the camera as a pinhole, whereas most authors assume an orthographic projection, thereby extending the applicability of shape from shading methods to more realistic images. In particular, it extends the work of E. Prados et al. (2002) and E. Rouy et al. (1992). Second, by adapting the brightness equation to the perspective problem, we obtain a new partial differential equation (PDE). Results about the existence and uniqueness of its solution are also obtained. Third, it allows us to come up with a new approximation scheme and a new algorithm for computing numerical approximations of the "continuous" solution, as well as a proof of their convergence toward that solution.
Title: "Perspective shape from shading" and viscosity solutions
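The orthographic special case that this paper generalizes gives a feel for the numerics: for a Lambertian surface under frontal lighting, the brightness equation reduces to the eikonal equation |∇u| = sqrt(1/I² − 1), which upwind fixed-point sweeps in the spirit of Rouy and Tourin can solve. A minimal 1-D sketch, not the paper's perspective scheme — function names and discretization choices are illustrative:

```python
import numpy as np

def eikonal_1d(c, h, n_sweeps=4):
    """Upwind fixed-point sweeps for |u'(x)| = c(x), with u = 0 on the
    boundary. Alternating sweep directions propagate boundary information
    both ways; the limit is the viscosity solution of the 1-D eikonal."""
    u = np.full(len(c), np.inf)
    u[0] = u[-1] = 0.0
    for _ in range(n_sweeps):
        for i in range(1, len(c) - 1):            # left-to-right sweep
            u[i] = min(u[i], min(u[i - 1], u[i + 1]) + h * c[i])
        for i in range(len(c) - 2, 0, -1):        # right-to-left sweep
            u[i] = min(u[i], min(u[i - 1], u[i + 1]) + h * c[i])
    return u

# Constant brightness I = 1/sqrt(2) implies slope |u'| = 1 everywhere,
# so the recovered "surface" is the tent function dist(x, boundary) --
# a nonsmooth solution of exactly the kind viscosity theory handles.
I = np.full(101, 1.0 / np.sqrt(2.0))
u = eikonal_1d(np.sqrt(1.0 / I**2 - 1.0), h=0.01)
```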
Pub Date: 2003-10-13 · DOI: 10.1109/ICCV.2003.1238629
C. Rother
This paper presents a new linear method for simultaneously reconstructing 3D features (points, lines and planes) and cameras from many perspective views by solving a single linear system. It assumes that a real or virtual reference plane is visible in all views. We call it the Direct Reference Plane (DRP) method. It is well known that the projection relationship between uncalibrated cameras and 3D features is nonlinear in the absence of a reference plane. With a known reference plane, points and cameras have a linear relationship, as shown by Rother and Carlsson (2001). The main contribution of this paper is that lines and cameras, as well as planes and cameras, also have a linear relationship. Consequently, all 3D features and all cameras can be reconstructed simultaneously from a single linear system, which handles missing image measurements naturally. A further contribution is an extensive experimental comparison, using real data, of different reference plane and nonreference plane reconstruction methods. For difficult reference plane scenarios, with point or line features, the DRP method is superior to all compared methods. Finally, an extensive list of reference plane scenarios is presented, which shows the wide applicability of the DRP method.
Title: Linear multiview reconstruction of points, lines, planes and cameras using a reference plane
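The core numerical step that single-linear-system formulations of this kind rest on is solving a stacked homogeneous system A x = 0 in the least-squares sense, via the SVD null vector. A generic sketch — the constraint matrix below is synthetic, not the paper's actual feature-camera equations:

```python
import numpy as np

def solve_homogeneous(A):
    """Least-squares solution of A x = 0 with ||x|| = 1: the right singular
    vector belonging to the smallest singular value of the stacked system."""
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]

# Toy check: build a constraint matrix with a known null vector, recover it.
rng = np.random.default_rng(0)
x_true = rng.normal(size=5)
x_true /= np.linalg.norm(x_true)
B = rng.normal(size=(20, 5))
A = B - (B @ x_true)[:, None] * x_true[None, :]   # force A @ x_true = 0
x_hat = solve_homogeneous(A)
# Null vector is defined up to sign, so compare against both signs.
err = min(np.linalg.norm(x_hat - x_true), np.linalg.norm(x_hat + x_true))
```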
Pub Date: 2003-10-13 · DOI: 10.1109/ICCV.2003.1238367
M. Pavan, M. Pelillo
Dominant sets are a new graph-theoretic concept that has proven to be relevant in partitional (flat) clustering as well as image segmentation problems. However, in many computer vision applications, such as the organization of an image database, it is important to provide the data to be clustered with a hierarchical organization, and it is not clear how to do this within the dominant set framework. We address precisely this problem, and present a simple and elegant solution to it. To this end, we consider a family of (continuous) quadratic programs, which contain a parameterized regularization term that controls the global shape of the energy landscape. When the regularization parameter is zero the local solutions are known to be in one-to-one correspondence with dominant sets, but when it is positive an interesting picture emerges. We determine bounds for the regularization parameter that allow us to exclude from the set of local solutions those inducing clusters of size smaller than a prescribed threshold. This suggests a new (divisive) hierarchical approach to clustering, which is based on the idea of properly varying the regularization parameter during the clustering process. Straightforward dynamics from evolutionary game theory are used to locate the solutions of the quadratic programs at each level of the hierarchy. We apply the proposed framework to the problem of organizing a shape database. Experiments with three different similarity matrices (and databases) reported in the literature have been conducted, and the results confirm the effectiveness of our approach.
Title: Dominant sets and hierarchical clustering
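The "straightforward dynamics from evolutionary game theory" mentioned above are the classical discrete-time replicator dynamics, which converge to local solutions of a quadratic program over the simplex; with zero regularization these correspond to dominant sets. A toy sketch with a made-up similarity matrix:

```python
import numpy as np

def replicator(A, x0, n_iter=100):
    """Discrete-time replicator dynamics: x_i <- x_i (A x)_i / (x' A x).
    The simplex is invariant (components stay nonnegative and sum to 1),
    and fixed points are local solutions of max x' A x over the simplex."""
    x = x0.copy()
    for _ in range(n_iter):
        Ax = A @ x
        x = x * Ax / (x @ Ax)
    return x

# Items 0 and 1 are mutually similar; item 2 is an outlier, so the
# dynamics drive its weight to zero and keep the coherent pair.
A = np.array([[0.0, 0.9, 0.1],
              [0.9, 0.0, 0.1],
              [0.1, 0.1, 0.0]])
x = replicator(A, np.full(3, 1.0 / 3.0))
```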
Pub Date: 2003-10-13 · DOI: 10.1109/ICCV.2003.1238437
Jiahua Wu, M. Chantler
We present a new texture classification scheme which is invariant to surface rotation. Many texture classification approaches presented in the past are image-rotation invariant; however, image rotation is not necessarily the same as surface rotation. We have therefore developed a classifier that uses invariants derived from surface properties rather than image properties. Previously we developed a scheme that used surface gradient (normal) fields estimated using photometric stereo. In this paper we augment these data with albedo information and also employ an additional feature set: the radial spectrum. We used 30 real textures to test the new classifier. A classification accuracy of 91% was achieved when albedo and gradient 1D polar and radial features were combined. The best performance, a classification accuracy of 99%, was achieved using 2D albedo and gradient spectra.
Title: Combining gradient and albedo data for rotation invariant classification of 3D surface texture
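A radial spectrum feature of the general kind described can be sketched as the angular average of the FFT magnitude: rotating the input rotates the spectrum but leaves its radial profile essentially unchanged, which is what buys rotation invariance. A minimal sketch — the binning scheme is an illustrative choice, not the paper's exact feature:

```python
import numpy as np

def radial_spectrum(img, n_bins=16):
    """1-D radial magnitude spectrum: average |FFT| over annuli of equal
    radius about the DC component. Rotation permutes angular content but
    (approximately) preserves this radial distribution."""
    F = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    h, w = img.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h // 2, x - w // 2)
    bins = np.minimum((r / (r.max() + 1e-9) * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(bins.ravel(), weights=F.ravel(), minlength=n_bins)
    counts = np.bincount(bins.ravel(), minlength=n_bins)
    return sums / np.maximum(counts, 1)

# Odd-sized random "texture": a 90-degree rotation leaves the radial
# spectrum unchanged up to floating point.
rng = np.random.default_rng(1)
img = rng.normal(size=(33, 33))
s_orig = radial_spectrum(img)
s_rot = radial_spectrum(np.rot90(img))
```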
Pub Date: 2003-10-13 · DOI: 10.1109/ICCV.2003.1238351
C. Wallraven, B. Caputo, Arnulf B. A. Graf
Recent developments in computer vision have shown that local features can provide efficient representations suitable for robust object recognition. Support vector machines have been established as powerful learning algorithms with good generalization capabilities. We combine these two approaches and propose a general kernel method for recognition with local features. We show that the proposed kernel satisfies the Mercer condition and that it is suitable for many established local feature frameworks. Large-scale recognition results are presented on three different databases, which demonstrate that SVMs with the proposed kernel perform better than standard matching techniques on local features. In addition, experiments on noisy and occluded images show that local feature representations significantly outperform global approaches.
Title: Recognition with local features: the kernel recipe
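The general shape of kernels built on sets of local features can be illustrated with a symmetrized best-match kernel: every local descriptor is scored against its best counterpart in the other set under a minor kernel. This sketch is illustrative only — the RBF minor kernel and the max-then-average combination are assumptions, not necessarily the paper's exact construction:

```python
import numpy as np

def local_kernel(a, b, gamma=1.0):
    """RBF kernel between two local descriptors (an illustrative choice)."""
    return np.exp(-gamma * np.sum((a - b) ** 2))

def match_kernel(X, Y, gamma=1.0):
    """Symmetrized matching kernel over two sets of local features:
    each feature votes for its best match in the other set, and the
    two directions are averaged so the kernel is symmetric."""
    kxy = np.mean([max(local_kernel(x, y, gamma) for y in Y) for x in X])
    kyx = np.mean([max(local_kernel(y, x, gamma) for x in X) for y in Y])
    return 0.5 * (kxy + kyx)

# Two small feature sets of 2-D descriptors.
X = [np.array([0.0, 0.0]), np.array([1.0, 1.0])]
Y = [np.array([0.1, 0.0]), np.array([2.0, 2.0])]
```

Comparing a set with itself yields 1 (every feature matches itself perfectly), and the construction is symmetric by design.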
Pub Date: 2003-10-13 · DOI: 10.1109/ICCV.2003.1238636
T. Kadir, M. Brady
We present a novel non-parametric unsupervised segmentation algorithm based on region competition (Zhu and Yuille, 1996), but implemented within a level sets framework (Osher and Sethian, 1988). The key novelty of the algorithm is that it can solve N ≥ 2 class segmentation problems using just one embedded surface; this is achieved by controlling the merging and splitting behaviour of the level sets according to a minimum description length (MDL) (Leclerc (1989) and Rissanen (1985)) cost function. This is in contrast to N class region-based level set segmentation methods to date, which operate by evolving multiple coupled embedded surfaces in parallel (Chan et al., 2002). Furthermore, it operates in an unsupervised manner; it is necessary neither to specify the value of N nor the class models a priori. We argue that the level sets methodology provides a more convenient framework for the implementation of the region competition algorithm, which is conventionally implemented using region membership arrays due to the lack of an intrinsic curve representation. Finally, we generalise the Gaussian region model used in standard region competition to the non-parametric case. The region boundary motion and merge equations become simple expressions containing cross-entropy and entropy terms.
Title: Unsupervised non-parametric region segmentation using level sets
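The MDL control of merging can be sketched with histogram (non-parametric) region models: merging two regions saves the description cost of one model, but may lengthen the data code by the gap between the merged entropy and the separate entropies. The model cost below is a made-up constant, and the whole sketch is an illustration of the MDL trade-off rather than the paper's cost function:

```python
import numpy as np

def code_length(hist):
    """Approximate description length (in nats) of a region's pixels coded
    with the region's own intensity histogram: n * entropy(p)."""
    n = hist.sum()
    p = hist[hist > 0] / n
    return -n * np.sum(p * np.log(p))

def merge_gain(h1, h2, model_cost=50.0):
    """MDL gain of merging two regions: one model's cost is saved, while
    the data code grows if the pooled histogram has higher entropy.
    Positive gain means the merge shortens the total description."""
    return code_length(h1) + code_length(h2) + model_cost - code_length(h1 + h2)

# Identical regions merge (gain > 0); very different ones do not.
same = merge_gain(np.array([100.0, 0.0]), np.array([100.0, 0.0]))
diff = merge_gain(np.array([100.0, 0.0]), np.array([0.0, 100.0]))
```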
Pub Date: 2003-10-13 · DOI: 10.1109/ICCV.2003.1238467
B. Stenger, A. Thayananthan, P. Torr, R. Cipolla
Within this paper a new framework for Bayesian tracking is presented, which approximates the posterior distribution at multiple resolutions. We propose a tree-based representation of the distribution, where the leaves define a partition of the state space with piecewise constant density. The advantage of this representation is that regions with low probability mass can be rapidly discarded in a hierarchical search, and the distribution can be approximated to arbitrary precision. We demonstrate the effectiveness of the technique by using it to track 3D articulated and nonrigid motion in front of a cluttered background. More specifically, we are interested in estimating the joint angles, position and orientation of a 3D hand model in order to drive an avatar.
Title: Filtering using a tree-based estimator
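The piecewise-constant tree idea can be sketched in 1-D: keep a set of cells (the tree leaves) and repeatedly split the cell carrying the most estimated mass, so resolution concentrates where the posterior does while low-mass regions stay coarse. The greedy split rule and the midpoint mass estimate below are illustrative simplifications, not the paper's algorithm:

```python
import numpy as np

def refine_partition(likelihood, n_splits=30, n_init=4):
    """Adaptive piecewise-constant approximation of a 1-D density on [0,1]:
    start from a coarse partition and repeatedly halve the cell whose
    (midpoint-estimated) probability mass is largest."""
    cells = [(i / n_init, (i + 1) / n_init) for i in range(n_init)]
    mass = lambda lo, hi: likelihood(0.5 * (lo + hi)) * (hi - lo)
    for _ in range(n_splits):
        lo, hi = max(cells, key=lambda c: mass(*c))   # richest cell
        cells.remove((lo, hi))
        mid = 0.5 * (lo + hi)
        cells += [(lo, mid), (mid, hi)]               # split it in two
    return cells

# Sharp unimodal likelihood: cells shrink around the mode at 0.3,
# while low-mass regions are never refined.
like = lambda x: np.exp(-200.0 * (x - 0.3) ** 2)
cells = refine_partition(like)
```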
Pub Date: 2003-10-13 · DOI: 10.1109/ICCV.2003.1238412
Anand Rangarajan, J. Coughlan, A. Yuille
A Bayesian network formulation for relational shape matching is presented. The main advantage of the relational shape matching approach is the obviation of the nonrigid spatial mappings used by recent nonrigid matching approaches. The basic variables that need to be estimated in the relational shape matching objective function are the global rotation and scale and the local displacements and correspondences. The new Bethe free energy approach is used to estimate the pairwise correspondences between links of the template graphs and the data. The resulting framework is useful in both registration and recognition contexts. Results are shown on hand-drawn templates and on 2D transverse T1-weighted MR images.
Title: A Bayesian network framework for relational shape matching
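Of the variables listed, the global rotation and scale admit a closed-form least-squares estimate once correspondences are fixed (the paper estimates correspondences jointly via the Bethe free energy; this Procrustes-style sketch holds them fixed and ignores the reflection case):

```python
import numpy as np

def fit_rotation_scale(X, Y):
    """Least-squares global rotation R and scale s such that Y ~ s * X @ R.T,
    with translation removed by centering. Standard orthogonal Procrustes:
    R comes from the SVD of the cross-covariance (reflections ignored)."""
    Xc, Yc = X - X.mean(0), Y - Y.mean(0)
    U, S, Vt = np.linalg.svd(Xc.T @ Yc)
    R = Vt.T @ U.T                      # maximizes trace(R @ Xc.T @ Yc)
    s = S.sum() / (Xc ** 2).sum()       # optimal scale given R
    return s, R

# Recover a known similarity transform from matched 2-D point sets.
rng = np.random.default_rng(2)
X = rng.normal(size=(10, 2))
theta = 0.7
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
Y = 2.0 * X @ R_true.T + np.array([3.0, -1.0])   # scale, rotate, translate
s_hat, R_hat = fit_rotation_scale(X, Y)
```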
Pub Date: 2003-10-13 · DOI: 10.1109/ICCV.2003.1238383
Changjiang Yang, R. Duraiswami, N. Gumerov, L. Davis
Evaluating sums of multivariate Gaussians is a common computational task in computer vision and pattern recognition, including in the general and powerful kernel density estimation technique. The quadratic computational complexity of the summation is a significant barrier to the scalability of this algorithm to practical applications. The fast Gauss transform (FGT) has successfully accelerated the kernel density estimation to linear running time for low-dimensional problems. Unfortunately, the cost of a direct extension of the FGT to higher-dimensional problems grows exponentially with dimension, making it impractical for dimensions above 3. We develop an improved fast Gauss transform to efficiently estimate sums of Gaussians in higher dimensions, where a new multivariate expansion scheme and an adaptive space subdivision technique dramatically improve the performance. The improved FGT has been applied to the mean shift algorithm achieving linear computational complexity. Experimental results demonstrate the efficiency and effectiveness of our algorithm.
Title: Improved fast Gauss transform and efficient kernel density estimation
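The object being accelerated is the discrete Gauss transform G(t_j) = Σ_i w_i exp(−‖t_j − s_i‖² / h²). Evaluated directly it costs O(NM) — the quadratic bottleneck the improved FGT reduces to linear time. A direct reference implementation (useful as ground truth against which any fast approximation can be checked):

```python
import numpy as np

def gauss_transform_direct(sources, targets, weights, h):
    """Direct O(N*M) evaluation of the discrete Gauss transform:
    G(t_j) = sum_i w_i * exp(-||t_j - s_i||^2 / h^2).
    sources: (N, d), targets: (M, d), weights: (N,)."""
    d2 = ((targets[:, None, :] - sources[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / h**2) @ weights

# One target, two weighted sources: G = 1*exp(0) + 3*exp(-1).
sources = np.array([[0.0, 0.0], [1.0, 0.0]])
targets = np.array([[0.0, 0.0]])
G = gauss_transform_direct(sources, targets, np.array([1.0, 3.0]), h=1.0)
```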
Pub Date: 2003-10-13 · DOI: 10.1109/ICCV.2003.1238469
Yue Zhou, Hai Tao
Motion layer estimation has recently emerged as a promising object tracking method. In this paper, we extend previous research on layer-based trackers by introducing the concept of background occluding layers and explicitly inferring the depth ordering of foreground layers. The background occluding layers lie in front of, behind, and in between foreground layers. Each pixel in the background regions belongs to one of these layers and occludes all the foreground layers behind it. Together with the foreground ordering, our representation includes the complete information necessary for reliably tracking objects through occlusion. An MAP estimation framework is developed to simultaneously update the motion layer parameters, the ordering parameters, and the background occluding layers. Experimental results show that, under various conditions with occlusion, including moving objects undergoing complex motions or having complex interactions, our tracking algorithm handles many difficult tracking tasks reliably.
Title: A background layer model for object tracking through occlusion
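The occlusion reasoning that a complete depth ordering supports can be sketched as front-to-back compositing: the layer observed at a pixel is the first one in depth order whose support mask covers it, and it occludes everything behind it. A toy sketch — the layer names and masks are invented for illustration:

```python
def visible_layer(depth_order, masks, x, y):
    """Return the front-most layer covering pixel (x, y). Scanning the
    depth ordering front to back means each layer occludes all layers
    behind it, which is exactly the ordering constraint being tracked."""
    for layer in depth_order:
        if masks[layer][y][x]:
            return layer
    return None

# Layer 0 (right column) in front of layer 1 (top row); the background
# layer covers everything and sits last in the ordering.
masks = {
    0:    [[0, 1], [0, 1]],
    1:    [[1, 1], [0, 0]],
    "bg": [[1, 1], [1, 1]],
}
order = [0, 1, "bg"]
```

Where layers 0 and 1 overlap, the depth ordering decides which one is observed; pixels covered by no foreground layer fall through to the background.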