Fast, robust, and consistent camera motion estimation
Tong Zhang, Carlo Tomasi
Pub Date: 1999-06-23 | DOI: 10.1109/CVPR.1999.786934 | CVPR 1999, pp. 164-170, Vol. 1
Previous algorithms that recover camera motion from image velocities suffer from both bias and excessive variance in the results. We propose a robust estimator of camera motion that is statistically consistent when image noise is isotropic. Consistency means that the estimated motion converges in probability to the true value as the number of image points increases. An algorithm based on reweighted Gauss-Newton iterations handles 100 velocity measurements in about 50 milliseconds on a workstation.
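For a linear residual model, reweighted Gauss-Newton reduces to iteratively reweighted least squares; a minimal sketch with Huber weights (the motion model, weighting function, and constants here are illustrative assumptions, not the paper's):

```python
import numpy as np

def irls(A, b, iters=20, delta=1.0):
    """Robust linear estimation by iteratively reweighted least squares
    with Huber weights: outliers get weight delta/|r| instead of 1."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]          # ordinary LS start
    for _ in range(iters):
        r = A @ x - b
        w = np.where(np.abs(r) <= delta, 1.0,
                     delta / np.maximum(np.abs(r), 1e-12))
        sw = np.sqrt(w)
        x = np.linalg.lstsq(sw[:, None] * A, sw * b, rcond=None)[0]
    return x

# 100 noisy measurements of a 2-parameter linear model, plus gross outliers
rng = np.random.default_rng(0)
A = np.column_stack([rng.normal(size=100), np.ones(100)])
x_true = np.array([2.0, -1.0])
b = A @ x_true + 0.01 * rng.normal(size=100)
b[:5] += 10.0                                          # 5 gross outliers
x_hat = irls(A, b)
```

The reweighting step downweights large residuals, so the five corrupted measurements barely influence the final estimate, whereas ordinary least squares would be visibly biased by them.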
Algebraic curves that work better
T. Tasdizen, Jean-Philippe Tarel, D. Cooper
Pub Date: 1999-06-23 | DOI: 10.1109/CVPR.1999.784605 | CVPR 1999, pp. 35-41, Vol. 2
An algebraic curve is defined as the zero set of a polynomial in two variables. Algebraic curves are practical for modeling shapes much more complicated than conics or superquadrics. The main drawback in representing shapes by algebraic curves has been the lack of repeatability in fitting algebraic curves to data. A regularized fast linear fitting method based on ridge regression and on restricting the representation to well-behaved subsets of polynomials is proposed, and its properties are investigated. The fitting algorithm is stable enough for very fast position-invariant shape recognition, position estimation, and shape tracking based on new invariants and representations, and it applies to open as well as closed curves of unorganized data. Among appropriate applications is shape-based indexing into image databases.
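Ridge-regularized implicit fitting can be sketched for the simplest case, a conic: minimize ||Mw - 1||^2 + lambda*||w||^2 over monomial coefficients (a hypothetical simplification; the paper restricts fitting to well-behaved polynomial subsets and handles higher degrees):

```python
import numpy as np

def fit_implicit_conic(pts, lam=1e-6):
    """Fit f(x,y) = a*x^2 + b*x*y + c*y^2 + d*x + e*y - 1 ~ 0 to the
    points by ridge regression: min ||M w - 1||^2 + lam ||w||^2."""
    x, y = pts[:, 0], pts[:, 1]
    M = np.column_stack([x * x, x * y, y * y, x, y])
    return np.linalg.solve(M.T @ M + lam * np.eye(5),
                           M.T @ np.ones(len(pts)))

# noisy samples of the unit circle x^2 + y^2 = 1
rng = np.random.default_rng(1)
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
pts = np.column_stack([np.cos(t), np.sin(t)]) + 0.005 * rng.normal(size=(200, 2))
w = fit_implicit_conic(pts)   # expect w close to [1, 0, 1, 0, 0]
```

The ridge term keeps the normal equations well conditioned when the monomial matrix is nearly rank-deficient, which is one source of the repeatability the abstract refers to.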
Using a linear subspace approach for invariant subpixel material identification in airborne hyperspectral imagery
Bea Thai, G. Healey
Pub Date: 1999-06-23 | DOI: 10.1109/CVPR.1999.786995 | CVPR 1999, pp. 567-572, Vol. 1
We present an algorithm for subpixel material identification that is invariant to illumination and atmospheric conditions. The target material spectral reflectance is the only prior information required by the algorithm. A target material subspace model is constructed from the reflectance using a physical model, and a background subspace model is estimated directly from the image. These two subspace models are used to compute maximum likelihood estimates for the target material component and the background component at each image pixel. These estimates form the basis of a generalized likelihood ratio test for subpixel material identification. We present experimental results using HYDICE imagery that demonstrate the utility of the algorithm for subpixel material identification under varying illumination and atmospheric conditions.
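The subspace GLRT can be sketched as a ratio of projection residuals: background-only fit versus joint target-plus-background fit (the synthetic bases below are illustrative; the paper derives the target subspace from reflectance via a physical model):

```python
import numpy as np

def glrt_statistic(x, T, B):
    """Generalized likelihood ratio for 'pixel contains target' versus
    'background only'. T and B are orthonormal bases of the target and
    background subspaces; larger values favour target presence."""
    def resid(v, S):
        r = v - S @ (S.T @ v)        # residual after projecting onto span(S)
        return r @ r
    TB = np.linalg.qr(np.column_stack([T, B]))[0]   # joint subspace basis
    return resid(x, B) / resid(x, TB)

rng = np.random.default_rng(2)
d = 50
T = np.linalg.qr(rng.normal(size=(d, 2)))[0]        # target subspace basis
B = np.linalg.qr(rng.normal(size=(d, 5)))[0]        # background subspace basis
bg_pixel = B @ rng.normal(size=5) + 0.1 * rng.normal(size=d)
tg_pixel = bg_pixel + T @ np.array([1.0, 0.5])      # add a target component
stat_tg = glrt_statistic(tg_pixel, T, B)
stat_bg = glrt_statistic(bg_pixel, T, B)
```

A pixel carrying a target component leaves a large residual under the background-only model but a small one under the joint model, so its ratio is well above that of a pure-background pixel.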
Integrating shape from shading and range data using neural networks
M. Mostafa, S. Yamany, A. Farag
Pub Date: 1999-06-23 | DOI: 10.1109/CVPR.1999.784602 | CVPR 1999, pp. 15-20, Vol. 2
This paper presents a framework for integrating two sources of sensory data, sparse range data and dense depth maps from shape from shading, in order to improve the 3D reconstruction of the visible surfaces of 3D objects. The integration process propagates the error difference between the two data sets by fitting a surface to that difference and using it to correct the visible surface obtained from shape from shading. A feedforward neural network is used to fit a surface to the sparse data. We also study the use of the extended Kalman filter for supervised learning and compare it with the backpropagation algorithm. A performance analysis is carried out to select the best neural network architecture and learning algorithm. We find that integrating sparse depth measurements greatly improves the metric accuracy of the 3D visible surface obtained from shape from shading.
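The integration step (fit a smooth surface to the sparse range-minus-SFS differences, then use it as a correction) can be sketched with Gaussian kernel ridge regression standing in for the paper's feedforward network; the substitution and all constants below are assumptions made for brevity:

```python
import numpy as np

def kernel_ridge_fit(X, y, sigma=0.5, lam=1e-3):
    """Fit a smooth correction surface to sparse residuals y at 2D
    locations X (Gaussian kernel stand-in for the paper's MLP)."""
    sq = np.sum((X[:, None] - X[None]) ** 2, -1)
    K = np.exp(-sq / (2 * sigma ** 2))
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    def predict(Xq):
        sq = np.sum((Xq[:, None] - X[None]) ** 2, -1)
        return np.exp(-sq / (2 * sigma ** 2)) @ alpha
    return predict

# sparse range-minus-SFS depth differences sampled from a smooth bias field
rng = np.random.default_rng(3)
X = rng.uniform(-1, 1, size=(40, 2))
bias = lambda P: 0.3 * P[:, 0] + 0.2 * np.sin(2 * P[:, 1])
predict = kernel_ridge_fit(X, bias(X))
Xq = rng.uniform(-0.8, 0.8, size=(10, 2))            # held-out locations
corrected_err = np.abs(predict(Xq) - bias(Xq))       # error after correction
```

The fitted surface recovers the smooth bias almost everywhere inside the sampled region, which is the mechanism by which sparse range data corrects the dense but biased shape-from-shading surface.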
Automatic reconstruction of piecewise planar models from multiple views
C. Baillard, Andrew Zisserman
Pub Date: 1999-06-23 | DOI: 10.1109/CVPR.1999.784966 | CVPR 1999, pp. 559-565, Vol. 2
A new method is described for automatically reconstructing 3D planar faces from multiple images of a scene. The novelty of the approach lies in the use of inter-image homographies to validate and best estimate the plane, and in the minimal initialization requirements: only a single 3D line with a textured neighbourhood is required to generate a plane hypothesis. The planar facets enable line grouping and also the construction of parts of the wireframe which were missed due to the inevitable shortcomings of feature detection and matching. The method allows a piecewise planar model of a scene to be built completely automatically, with no user intervention at any stage, given only the images and camera projection matrices as input. The robustness and reliability of the method are illustrated on several examples, from both aerial and interior views.
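Plane validation via inter-image homographies rests on estimating H from coplanar correspondences and checking the transfer error of further matches; a minimal DLT sketch on synthetic data (not the paper's hypothesis-generation pipeline):

```python
import numpy as np

def homography_dlt(p1, p2):
    """Estimate the 3x3 homography H with p2 ~ H p1 from >= 4 point
    correspondences (inhomogeneous Nx2 arrays) via the DLT."""
    rows = []
    for (x, y), (u, v) in zip(p1, p2):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(rows, float))
    return Vt[-1].reshape(3, 3)          # nullspace vector, up to scale

def transfer(H, p):
    q = H @ np.append(p, 1.0)
    return q[:2] / q[2]

# synthetic plane-induced homography; validate a held-out coplanar match
H_true = np.array([[1.1, 0.05, 3.0], [-0.02, 0.95, -2.0], [1e-4, 2e-4, 1.0]])
p1 = np.array([[0, 0], [10, 0], [10, 10], [0, 10], [5, 7]], float)
p2 = np.array([transfer(H_true, p) for p in p1])
H = homography_dlt(p1[:4], p2[:4])       # estimate from 4 correspondences
err = np.linalg.norm(transfer(H, p1[4]) - p2[4])   # transfer error, 5th match
```

A small transfer error on additional matches supports the coplanarity hypothesis; a large one rejects it, which is the validation role homographies play in the method.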
A new visualization paradigm for multispectral imagery and data fusion
Diego A. Socolinsky, L. B. Wolff
Pub Date: 1999-06-23 | DOI: 10.1109/CVPR.1999.786958 | CVPR 1999, pp. 319-324, Vol. 1
We present a new formalism for the treatment and understanding of multispectral and multisensor imagery based on first-order contrast information. Although little attention has been paid to the utility of multispectral contrast, we develop a theory of multispectral contrast that enables us to produce an optimal grayscale visualization of the first-order contrast of an image with an arbitrary number of bands. We demonstrate how our technique can reveal significantly more interpretive information to an image analyst, who can use it in a number of image understanding algorithms. We review existing grayscale visualization strategies and discuss why our algorithm is optimal and outperforms them. A variety of experimental results are presented.
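First-order multispectral contrast is commonly summarized by the Di Zenzo structure tensor, whose dominant eigenvalue gives a per-pixel contrast strength for any number of bands; a sketch of that quantity (the paper's optimal grayscale assignment itself is more involved, so this is only the contrast measure, not the full algorithm):

```python
import numpy as np

def multispectral_contrast(img):
    """Per-pixel first-order contrast of an H x W x B image: square root
    of the largest eigenvalue of the 2x2 structure tensor summed over
    bands (Di Zenzo)."""
    gy, gx = np.gradient(img.astype(float), axis=(0, 1))
    Exx = (gx * gx).sum(-1)
    Eyy = (gy * gy).sum(-1)
    Exy = (gx * gy).sum(-1)
    tr, det = Exx + Eyy, Exx * Eyy - Exy ** 2
    lam = 0.5 * (tr + np.sqrt(np.maximum(tr ** 2 - 4 * det, 0.0)))
    return np.sqrt(lam)

# two-band image whose bands have opposite horizontal edges at row 4
img = np.zeros((8, 8, 2))
img[:4, :, 0] = 1.0
img[4:, :, 1] = 1.0
C = multispectral_contrast(img)   # large at the edge rows, zero elsewhere
```

Note that the two band gradients at the edge point in opposite directions and would cancel in a naive band average, yet the structure tensor still reports strong contrast there; this is exactly the kind of information a multiband-aware visualization can preserve.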
Histogram clustering for unsupervised image segmentation
J. Puzicha, J. Buhmann, Thomas Hofmann
Pub Date: 1999-06-23 | DOI: 10.1109/CVPR.1999.784981 | CVPR 1999, pp. 602-608, Vol. 2
This paper introduces a novel statistical mixture model for probabilistic grouping of distributional (histogram) data. Adopting the Bayesian framework, we propose to perform annealed maximum a posteriori estimation to compute optimal clustering solutions. In order to accelerate the optimization process, an efficient multiscale formulation is developed. We present a prototypical application of this method for the unsupervised segmentation of textured images based on local distributions of Gabor coefficients. Benchmark results indicate superior performance compared to K-means clustering and proximity-based algorithms.
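A stripped-down version of histogram clustering, using hard KL-divergence assignments and prototype averaging in place of annealed MAP estimation, can be sketched as:

```python
import numpy as np

def cluster_histograms(H, k, iters=20):
    """Group rows of H (normalized histograms) into k clusters by
    alternating KL-divergence assignment and prototype averaging -- a
    hard, un-annealed stand-in for the paper's mixture estimation."""
    eps = 1e-12
    protos = [H[0]]                       # farthest-point initialization
    for _ in range(k - 1):
        d = np.min([-(H * np.log(p + eps)).sum(1) for p in protos], axis=0)
        protos.append(H[d.argmax()])
    protos = np.array(protos)
    for _ in range(iters):
        # cross-entropy = KL(h || proto) up to a term constant in proto
        cost = -(H[:, None] * np.log(protos[None] + eps)).sum(-1)
        labels = cost.argmin(1)
        protos = np.array([H[labels == j].mean(0) if (labels == j).any()
                           else protos[j] for j in range(k)])
    return labels

# two populations of 4-bin histograms with different dominant bins
rng = np.random.default_rng(4)
H = np.vstack([rng.dirichlet([8, 1, 1, 1], 30),
               rng.dirichlet([1, 1, 1, 8], 30)])
labels = cluster_histograms(H, 2)
```

KL divergence respects the distributional nature of the data, which is why the paper's mixture approach beats plain K-means on histogram features; annealing, which this sketch omits, additionally avoids poor local optima.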
Vision-based speaker detection using Bayesian networks
James M. Rehg, Kevin P. Murphy, P. Fieguth
Pub Date: 1999-06-23 | DOI: 10.1109/CVPR.1999.784617 | CVPR 1999, pp. 110-116, Vol. 2
The development of user interfaces based on vision and speech requires the solution of a challenging statistical inference problem: the intentions and actions of multiple individuals must be inferred from noisy and ambiguous data. We argue that Bayesian network models are an attractive statistical framework for cue fusion in these applications. Bayes nets combine a natural mechanism for expressing contextual information with efficient algorithms for learning and inference. We illustrate these points through the development of a Bayes net model for detecting when a user is speaking. The model combines four simple vision sensors: face detection, skin color, skin texture, and mouth motion. We present some promising experimental results.
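Cue fusion of this kind can be sketched with a naive-Bayes approximation of the network; the four cues match the paper, but the conditional probabilities below are made-up illustrative numbers:

```python
# Minimal cue-fusion sketch: naive Bayes over four binary vision cues.
# The sensor reliabilities in p_cue are illustrative assumptions, not
# values from the paper.
cues = ["face", "skin_color", "skin_texture", "mouth_motion"]
p_cue = {  # (P(cue fires | speaking), P(cue fires | not speaking))
    "face": (0.95, 0.30), "skin_color": (0.90, 0.40),
    "skin_texture": (0.85, 0.35), "mouth_motion": (0.80, 0.10),
}
prior = 0.5

def p_speaking(observed):
    """Posterior P(speaking | cue observations) under naive Bayes."""
    num, den = prior, 1.0 - prior
    for c, on in observed.items():
        ps, pn = p_cue[c]
        num *= ps if on else 1.0 - ps
        den *= pn if on else 1.0 - pn
    return num / (num + den)

post_all = p_speaking({c: True for c in cues})    # every cue fires
post_none = p_speaking({c: False for c in cues})  # no cue fires
```

A full Bayes net would add contextual dependencies between cues (e.g. mouth motion is only meaningful given a detected face); naive Bayes is the degenerate case where all cues are conditionally independent.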
Toward a scale-space aspect graph: solids of revolution
S. Pae, J. Ponce
Pub Date: 1999-06-23 | DOI: 10.1109/CVPR.1999.784629 | CVPR 1999, pp. 196-201, Vol. 2
This paper addresses the problem of constructing the scale-space aspect graph of a solid of revolution whose surface is the zero set of a polynomial volumetric density undergoing a Gaussian diffusion process. Equations for the associated visual event surfaces are derived, and polynomial curve tracing techniques are used to delineate these surfaces. An implementation and examples are presented, and limitations as well as extensions of the proposed approach are discussed.
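Gaussian diffusion of a polynomial density has a closed form (the Weierstrass transform maps polynomials to polynomials), which makes the evolving zero set tractable; a numeric check on the 1D example f(x) = x^2 - 1, whose smoothed version is x^2 + sigma^2 - 1 with zero set contracting to +-sqrt(1 - sigma^2):

```python
import numpy as np

def gauss_smooth_poly(coeffs, sigma, x):
    """Evaluate the Gaussian-smoothed polynomial (Weierstrass transform)
    at points x by Gauss-Hermite quadrature: E[f(x + sigma*Z)], Z~N(0,1)."""
    z, w = np.polynomial.hermite_e.hermegauss(40)   # probabilists' nodes
    w = w / w.sum()                                  # expectation weights
    vals = np.polyval(coeffs, x[:, None] + sigma * z[None])
    return vals @ w

sigma = 0.5
x = np.array([np.sqrt(1 - sigma ** 2)])     # predicted zero of smoothed f
val = gauss_smooth_poly([1.0, 0.0, -1.0], sigma, x)   # f(x) = x^2 - 1
```

The quadrature is exact for polynomials of this degree, so the evaluated value at the predicted zero vanishes to machine precision; for the volumetric densities in the paper the same principle applies per coordinate.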
Independent motion: the importance of history
Robert Pless, T. Brodský, Y. Aloimonos
Pub Date: 1999-06-23 | DOI: 10.1109/CVPR.1999.784614 | CVPR 1999, pp. 92-97, Vol. 2
We consider a problem central to aerial visual surveillance applications: the detection and tracking of small, independently moving objects in long and noisy video sequences. We directly use spatiotemporal image intensity gradient measurements to compute an exact model of background motion. This allows the creation of accurate mosaics over many frames and the definition of a constraint violation function which acts as an indicator of independent motion. A novel temporal integration method maintains confidence measures over long subsequences without computing the optic flow, requiring object models, or using a Kalman filter. The mosaic acts as a stable feature frame, allowing precise localization of the independently moving objects. We present a statistical analysis of the effects of image noise on the constraint violation measure and find a good match between the predicted probability distribution function and the measured sample frequencies in a test sequence.
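The indicator of independent motion can be sketched as the brightness-constancy residual under the estimated background motion, evaluated directly from spatiotemporal gradients (the synthetic gradients below are illustrative; the paper's exact background model and temporal integration are omitted):

```python
import numpy as np

def motion_residual(Ix, Iy, It, u, v):
    """Brightness-constancy residual Ix*u + Iy*v + It under the
    estimated background flow (u, v); large |residual| flags pixels
    whose motion is inconsistent with the background model."""
    return Ix * u + Iy * v + It

# background translating by (u, v) = (1, 0) px/frame;
# one pixel belongs to an object moving (0, 1) instead
rng = np.random.default_rng(5)
Ix, Iy = rng.normal(size=(2, 16, 16))        # spatial gradients
u, v = 1.0, 0.0
It = -(Ix * u + Iy * v)                      # consistent with background...
It[8, 8] = -(Ix[8, 8] * 0.0 + Iy[8, 8] * 1.0)   # ...except one pixel
r = np.abs(motion_residual(Ix, Iy, It, u, v))
```

Every background pixel satisfies the constraint exactly, so the residual map is zero except at the independently moving pixel; in real sequences the residual is noisy, which is why the paper accumulates confidence over long temporal histories.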