Affine invariant medial axis and skew symmetry
P. Giblin, G. Sapiro
Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271) | Pub Date: 1998-01-04 | DOI: 10.1109/ICCV.1998.710814

Affine invariant medial axes and symmetry sets of planar shapes are introduced and studied in this paper. Two different approaches are presented. The first is based on affine invariant distances, and defines the symmetry set, a set containing the medial axis, as the closure of the locus of points lying on (at least) two affine normals and affine-equidistant from the corresponding points on the curve. The second approach is based on affine bitangent conics. In this case the symmetry set is defined as the closure of the locus of centers of conics with (at least) three-point contact with two or more distinct points on the curve. This is equivalent to the conic and the curve having, at those points, the same affine tangent, or the same Euclidean tangent and curvature. Although the two analogous definitions for the classical Euclidean symmetry set (medial axis) are equivalent, this is not the case for the affine group. We then show how to use the symmetry set to detect affine skew symmetry, proving that the contact-based symmetry set is a straight line if and only if the given shape is the affine transformation of a symmetric object.
The cascaded Hough transform as an aid in aerial image interpretation
T. Tuytelaars, L. Gool, M. Proesmans, T. Moons
Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271) | Pub Date: 1998-01-04 | DOI: 10.1109/ICCV.1998.710702

Cartography and other applications of remote sensing have led to an increased interest in the (semi-)automatic interpretation of structures in aerial images of urban and suburban areas. Although these areas are particularly challenging because of their complexity, the degree of regularity in such man-made structures also helps to tackle the problems. The paper presents the iterated application of the Hough transform as a means to exploit such regularities. It shows how such a 'cascaded Hough transform' (or CHT for short) yields straight lines, vanishing points, and vanishing lines. It also illustrates how the latter assist in improving the precision of the former. The examples are based on real aerial photographs.
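The key idea behind cascading is that lines through a common (vanishing) point become collinear points in line-parameter space, so a second Hough stage over the detected lines' parameters finds vanishing points. A minimal numerical sketch of that second stage, assuming a slope/intercept parameterization and a least-squares fit in place of the paper's actual accumulator:

```python
import numpy as np

def second_stage_vanishing_point(lines):
    """Each line y = a*x + b maps to the point (a, b) in parameter space.
    Lines through a common point (x0, y0) satisfy b = y0 - x0*a, so their
    parameter points are collinear; fitting that line recovers (x0, y0)."""
    a = np.array([l[0] for l in lines])
    b = np.array([l[1] for l in lines])
    # least-squares fit b = c0 + c1*a, then x0 = -c1, y0 = c0
    A = np.vstack([np.ones_like(a), a]).T
    c0, c1 = np.linalg.lstsq(A, b, rcond=None)[0]
    return -c1, c0

# four lines through the common point (2, 3): intercept b = 3 - 2*slope
lines = [(s, 3.0 - 2.0 * s) for s in (-1.0, 0.5, 1.5, 4.0)]
x0, y0 = second_stage_vanishing_point(lines)
```

In the paper's formulation the same Hough machinery is applied at every stage; the explicit fit here is only to make the duality between concurrent lines and collinear parameter points concrete.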
A maximum-flow formulation of the N-camera stereo correspondence problem
S. Roy, I. Cox
Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271) | Pub Date: 1998-01-04 | DOI: 10.1109/ICCV.1998.710763

This paper describes a new algorithm for solving the N-camera stereo correspondence problem by transforming it into a maximum-flow problem. Once solved, the minimum cut associated with the maximum flow yields a disparity surface for the whole image at once. This global approach to stereo analysis provides a more accurate and coherent depth map than traditional line-by-line stereo. Moreover, the optimality of the depth surface is guaranteed, and the method can be shown to be a generalization of the dynamic programming approach that is widely used in standard stereo. Results show improved depth estimation as well as better handling of depth discontinuities. While the worst-case running time is O(n^2 d^2 log(nd)), the observed average running time is O(n^1.2 d^1.3) for an image of n pixels and depth resolution d.
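The reduction rests on computing a maximum flow and reading the disparity surface off the associated minimum cut. As an illustration of the flow machinery alone (not the paper's actual pixel/disparity graph construction), a minimal Edmonds-Karp solver on a toy graph:

```python
from collections import deque

def max_flow(capacity, s, t):
    """Edmonds-Karp maximum flow; capacity is a dict of dicts of edge capacities."""
    # build the residual graph, adding explicit zero-capacity reverse edges
    residual = {u: dict(vs) for u, vs in capacity.items()}
    for u, vs in capacity.items():
        for v in vs:
            residual.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in residual.get(u, {}).items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:          # no augmenting path left: flow is maximal
            return flow
        # trace the path back, find its bottleneck capacity, and augment
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        flow += bottleneck

capacity = {"s": {"a": 3, "b": 2}, "a": {"t": 2, "b": 1}, "b": {"t": 3}}
value = max_flow(capacity, "s", "t")
```

In the stereo setting, nodes correspond to (pixel, disparity) hypotheses with capacities derived from matching costs, and the minimum cut separating source from sink selects one disparity per pixel globally; the graph above is purely illustrative.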
Representation and self-similarity of shapes
Tyng-Luh Liu, D. Geiger, R. Kohn
Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271) | Pub Date: 1998-01-04 | DOI: 10.1109/ICCV.1998.710858

Representing shapes is a significant problem for vision systems that must recognize or classify objects. We derive a representation for a given shape by investigating its self-similarities, and constructing its shape axis (SA) and shape axis tree (SA-tree). We start with a shape, its boundary contour, and two different parameterizations for the contour. To measure its self-similarity we consider matching pairs of points (and their tangents) along the boundary contour, i.e., matching the two parameterizations. The matching, or self-similarity, criteria may vary, e.g., co-circularity, parallelism, distance, or region homogeneity. The loci of midpoints of the paired contour points form the shape axis, and they can be grouped into a unique tree graph, the SA-tree. The shape axis for the co-circularity criterion is compared to the symmetry axis. An interpretation in terms of object parts is also presented.
Transinformation for active object recognition
B. Schiele, J. Crowley
Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271) | Pub Date: 1998-01-04 | DOI: 10.1109/ICCV.1998.710726

This article develops an analogy between object recognition and the transmission of information through a channel, based on the statistical representation of the appearances of 3D objects. This analogy provides a means to quantitatively evaluate the contribution of individual receptive field vectors, and to predict the performance of the object recognition process. Transinformation also provides a quantitative measure of the discrimination provided by each viewpoint, thus permitting the determination of the most discriminant viewpoints. As an application, the article develops an active object recognition algorithm which is able to resolve ambiguities inherent in a single-view recognition algorithm.
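Transinformation is the mutual information between object identity and observed features, which quantifies how discriminative a viewpoint is. A minimal computation from a discrete joint distribution (the toy distributions below are illustrative, not from the paper):

```python
import math

def mutual_information(joint):
    """I(X;Y) = sum_{x,y} p(x,y) * log2( p(x,y) / (p(x) p(y)) ), in bits.
    `joint` is a row-stochastic-summing matrix: joint[i][j] = p(x_i, y_j)."""
    px = [sum(row) for row in joint]                 # marginal p(x)
    py = [sum(col) for col in zip(*joint)]           # marginal p(y)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, p in enumerate(row):
            if p > 0:
                mi += p * math.log2(p / (px[i] * py[j]))
    return mi

# a feature that identifies the object perfectly carries 1 bit about 2 objects
perfect = [[0.5, 0.0], [0.0, 0.5]]
# an independent feature carries no information
useless = [[0.25, 0.25], [0.25, 0.25]]
```

An active recognizer in this spirit would pick the next viewpoint maximizing this quantity between object class and the features observable from that view.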
Intensity and feature based stereo matching by disparity parameterization
G. Wei, G. Hirzinger
Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271) | Pub Date: 1998-01-04 | DOI: 10.1109/ICCV.1998.710844

In this paper, we propose a new solution to the stereo correspondence problem by combining feature and intensity based matching. The features we use are intensity gradients in both the x and y directions of the left and the deformed right images. Although a uniform smoothness constraint is still used, it is nevertheless applied only to non-feature regions. To avoid local minima in function minimization, we propose to parameterize the disparity function by hierarchical Gaussians. A simple stochastic gradient method is used to estimate the Gaussian weights. Experiments with various real stereo images show robust performance.
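Parameterizing the disparity function by Gaussian basis functions can be sketched as below. The single-scale basis and the specific centers and widths are illustrative assumptions; the paper uses a hierarchy of such Gaussians and fits the weights by stochastic gradient descent:

```python
import numpy as np

def disparity(x, centers, sigmas, weights):
    """Disparity d(x) as a weighted sum of Gaussian basis functions:
    d(x) = sum_k w_k * exp(-(x - c_k)^2 / (2 * sigma_k^2))."""
    x = np.asarray(x, dtype=float)[:, None]          # shape (N, 1)
    return (weights * np.exp(-(x - centers) ** 2 / (2.0 * sigmas ** 2))).sum(axis=1)

# two well-separated basis functions: at each center, the other contributes ~0
centers = np.array([0.0, 100.0])
sigmas = np.array([1.0, 2.0])
weights = np.array([2.0, 3.0])
d = disparity([0.0, 100.0], centers, sigmas, weights)
```

Optimizing only the low-dimensional weight vector, coarse scales first, is what smooths the cost surface and helps the minimization escape local minima.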
Physics-based 3D position analysis of a soccer ball from monocular image sequences
Taeone Kim, Y. Seo, K. Hong
Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271) | Pub Date: 1998-01-04 | DOI: 10.1109/ICCV.1998.710797

In this paper, we propose a method for locating the 3D position of a soccer ball from monocular image sequences of soccer games. Toward this goal, we adopt a ground-model-to-image transformation together with a physics-based approach in which the ball follows a parabolic trajectory in the air. Using the transformation, the heights of the ball can be calculated from simple triangular geometric relations given the start and end positions of the ball on the ground; the heights are expressed in terms of a player's height. Even if the end position of the ball is not available on the ground, because the ball is kicked or headed before it touches down, the most probable trajectory can still be determined by a search based on the physical fact that the ball follows a parabolic trajectory in the air. We have tested the method on a real image sequence, and the results seem promising.
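The parabolic-flight constraint can be sketched in a few lines. Gravity and the drag-free assumption are the only physics; the helper below is a hypothetical illustration of the trajectory model, not the paper's image-based height computation:

```python
def ball_height(t, T, g=9.81):
    """Height (m) of a ball that leaves the ground at t=0 and lands at t=T,
    assuming a drag-free parabolic trajectory. The landing condition h(T)=0
    fixes the vertical launch speed: v0 = g*T/2."""
    v0 = g * T / 2.0
    return v0 * t - 0.5 * g * t * t
```

Given the start and end ground positions (from the ground-to-image transformation) and the flight duration, this one free parameter determines the full height profile, which is why searching over candidate trajectories is tractable.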
A task driven 3D object recognition system using Bayesian networks
Björn Krebs, B. Korn, M. Burkhardt
Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271) | Pub Date: 1998-01-04 | DOI: 10.1109/ICCV.1998.710767

In this paper we propose a general framework to build a task oriented 3D object recognition system for CAD based vision (CBV). Features from 3D space curves representing the object's rims provide sufficient information to allow identification and pose estimation of industrial CAD models. However, features relying on differential surface properties tend to be very vulnerable with respect to noise. To model the statistical behavior of the data we introduce Bayesian nets which model the relationship between objects and observable features. Furthermore, task oriented selection of the optimal action to reduce the uncertainty of recognition results is incorporated into the Bayesian nets. This enables the integration of intelligent recognition strategies, depending on the already acquired evidence, into a robust and efficient 3D CAD based recognition system.
View-based object matching
A. Shokoufandeh, I. Marsic, Sven J. Dickinson
Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271) | Pub Date: 1998-01-04 | DOI: 10.1109/ICCV.1998.710777

We introduce a novel view-based object representation, called the saliency map graph (SMG), which captures the salient regions of an object view at multiple scales using a wavelet transform. This compact representation is highly invariant to translation, rotation (image and depth), and scaling, and offers the locality of representation required for occluded object recognition. To compare two saliency map graphs, we introduce two graph similarity algorithms. The first computes the topological similarity between two SMGs, providing a coarse-level matching of two graphs. The second computes the geometrical similarity between two SMGs, providing a fine-level matching of two graphs. We test and compare these two algorithms on a large database of model object views.
Visual motion estimation and prediction: a probabilistic network model for temporal coherence
A. Yuille, Pierre-Yves Burgi, N. Grzywacz
Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271) | Pub Date: 1998-01-04 | DOI: 10.1109/ICCV.1998.710834

We develop a theory for the temporal integration of visual motion motivated by psychophysical experiments. The theory proposes that input data are temporally grouped and used to predict and estimate motion flows in image sequences. Our theory is expressed in terms of the Bayesian generalization of standard Kalman filtering, which allows us to solve temporal grouping in conjunction with prediction and estimation. As demonstrated for tracking isolated contours, the Bayesian formulation is superior to approaches which use data association as a first stage followed by conventional Kalman filtering. Our computer simulations demonstrate that our theory qualitatively accounts for several psychophysical experiments on motion occlusion and motion outliers.
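The standard Kalman filter that the Bayesian formulation generalizes can be illustrated with a minimal 1D constant-velocity tracker. The matrices and noise parameters below are illustrative assumptions, not values from the paper:

```python
import numpy as np

def kalman_1d(measurements, dt=1.0, q=1e-3, r=0.25):
    """Track a 1D position with a constant-velocity Kalman filter.
    State is [position, velocity]; only position is observed."""
    F = np.array([[1.0, dt], [0.0, 1.0]])    # state transition (prediction)
    H = np.array([[1.0, 0.0]])               # measurement model
    Q = q * np.eye(2)                        # process noise covariance
    R = np.array([[r]])                      # measurement noise covariance
    x = np.array([[measurements[0]], [0.0]]) # initial state estimate
    P = np.eye(2)                            # initial state covariance
    estimates = []
    for z in measurements:
        # predict step: propagate state and uncertainty forward in time
        x = F @ x
        P = F @ P @ F.T + Q
        # update step: blend prediction with the new measurement
        y = np.array([[z]]) - H @ x          # innovation
        S = H @ P @ H.T + R                  # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
        x = x + K @ y
        P = (np.eye(2) - K @ H) @ P
        estimates.append(float(x[0, 0]))
    return estimates

# a point moving at constant velocity: the filter locks on to the ramp
estimates = kalman_1d([float(t) for t in range(10)])
```

The paper's contribution sits on top of this machinery: rather than committing to a single data association before filtering, the Bayesian generalization keeps grouping, prediction, and estimation coupled.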