"Non-rigid Registration Meets Surface Reconstruction," Mohammad Rouhani, Edmond Boyer, A. Sappa. 2014 2nd International Conference on 3D Vision. DOI: 10.1109/3DV.2014.80

Abstract: Non-rigid registration is an important task in computer vision with many applications in shape and motion modeling. A fundamental step of the registration is the data association between the source and the target sets. Such association proves difficult in practice, due to the discrete nature of the information and its corruption by various types of noise, e.g., outliers and missing data. In this paper we investigate the benefit of implicit representations for the non-rigid registration of 3D point clouds. First, the target points are described with small quadratic patches that are blended through partition-of-unity weighting. Then, the discrete association between the source and the target can be replaced by a continuous distance field induced by the interface. By combining this distance field with a proper deformation term, the registration energy can be expressed in a linear least-squares form that is easy and fast to solve. This significantly eases the registration by avoiding direct association between points. Moreover, a hierarchical approach can easily be implemented by employing coarse-to-fine representations. Experimental results are provided for point clouds from multi-view data sets. The qualitative and quantitative comparisons show the superior performance and robustness of our framework.
"A Scalable 3D HOG Model for Fast Object Detection and Viewpoint Estimation," M. Pedersoli, T. Tuytelaars. 2014 2nd International Conference on 3D Vision. DOI: 10.1109/3DV.2014.82

Abstract: In this paper we present a scalable way to learn and detect objects using a 3D representation based on HOG patches placed on a 3D cuboid. The model consists of a single 3D representation that is shared among views. Similarly to the work of Fidler et al. [5], at detection time this representation is projected on the image plane over the desired viewpoints. However, whereas in [5] the projection is done at image level and therefore the computational cost is linear in the number of views, in our model every view is approximated at feature level as a linear combination of the pre-computed fronto-parallel views. As a result, once the fronto-parallel views have been computed, the cost of computing new views is almost negligible. This allows the model to be evaluated on many more viewpoints. In the experimental results we show that the proposed model has detection and pose estimation performance comparable to standard multiview HOG detectors, but it is faster, it scales very well with the number of views, and it can better generalize to unseen views. Finally, we also show that with a procedure similar to label propagation it is possible to train the model even without using pose annotations at training time.
"A Decision-Theoretic Formulation for Sparse Stereo Correspondence Problems," T. Botterill, R. Green, S. Mills. 2014 2nd International Conference on 3D Vision. DOI: 10.1109/3DV.2014.34

Abstract: Stereo reconstruction is challenging in scenes with many similar-looking objects, as matches between features are often ambiguous. Features matched incorrectly lead to an incorrect 3D reconstruction, whereas if correct matches are missed, the reconstruction will be incomplete. Previous systems for selecting a correspondence (a set of matched features) select either a maximum likelihood correspondence, which may contain many incorrect matches, or use some heuristic for discarding ambiguous matches. In this paper we propose a new method for selecting a correspondence: we select the correspondence which minimises an expected loss function. Match probabilities are computed by Gibbs sampling, then the minimum expected loss correspondence is selected based on these probabilities. A parameter of the loss function controls the trade-off between selecting incorrect matches versus missing correct matches. The proposed correspondence selection method is evaluated in a model-based framework for reconstructing branching plants, and on simulated data. In both cases it outperforms alternative approaches in terms of precision and recall, giving more complete and accurate 3D models.
"Probabilistic Phase Unwrapping for Single-Frequency Time-of-Flight Range Cameras," Ryan Crabb, R. Manduchi. 2014 2nd International Conference on 3D Vision. DOI: 10.1109/3DV.2014.89

Abstract: This paper proposes a solution to the 2-D phase unwrapping problem, inherent to time-of-flight range sensing technology due to the cyclic nature of phase. Our method uses a single-frequency capture period to improve frame rate and reduce the motion artifacts encountered in multiple-frequency solutions. We present a probabilistic framework that considers the intensity image in addition to the phase image. The phase unwrapping problem is cast in terms of global optimization of a carefully chosen objective function. Comparative experimental results confirm the effectiveness of the proposed approach.
"Tackling Shapes and BRDFs Head-On," Stamatios Georgoulis, M. Proesmans, L. Gool. 2014 2nd International Conference on 3D Vision. DOI: 10.1109/3DV.2014.81

Abstract: In this work, we investigate the use of simple flash-based photography to capture an object's 3D shape and reflectance characteristics at the same time. The presented method is based on the principles of Structure from Motion (SfM) and Photometric Stereo (PS), yet we make sure to use nothing beyond readily available consumer equipment, such as a camera with flash. Starting from an SfM-generated mesh, we apply PS to refine both geometry and reflectance, where the latter is expressed in terms of data-driven Bidirectional Reflectance Distribution Function (BRDF) representations. We also introduce a novel approach to infer complete BRDFs from the sparsely sampled data-driven reflectance information captured with this setup. Our approach is experimentally validated by modeling several challenging objects, both synthetic and real.
"Automated gbXML-Based Building Model Creation for Thermal Building Simulation," Chao Wang, Y. Cho. 2014 2nd International Conference on 3D Vision. DOI: 10.1109/3DV.2014.109

Abstract: In the area of as-is BIM creation from point clouds, researchers have begun to explore the potential of using point clouds as 3D scenes of the construction job site to monitor construction progress and safety. However, limited contributions have been made in the AEC/FM domain to assist the decision-making process for building retrofit and renovation. This paper presents a method for automatic gbXML-based building model generation from a thermal point cloud. Through the proposed geometry extraction and thermal resistance value estimation techniques, the size and thermal information of the building envelope components are automatically obtained in order to quickly generate building models ready for energy performance simulation. The registered point cloud of a residential house was used as a case study to validate the proposed method.
"Real-Time Direct Dense Matching on Fisheye Images Using Plane-Sweeping Stereo," Christian Häne, Lionel Heng, Gim Hee Lee, A. Sizov, M. Pollefeys. 2014 2nd International Conference on 3D Vision. DOI: 10.1109/3DV.2014.77

Abstract: In this paper, we propose an adaptation of camera projection models for fisheye cameras into the plane-sweeping stereo matching algorithm. This adaptation allows us to do plane-sweeping stereo directly on fisheye images. Our approach also works for other non-pinhole cameras, such as omnidirectional and catadioptric cameras, when using the unified projection model. Despite the simplicity of our proposed approach, we are able to obtain full, good-quality, high-resolution depth maps from the fisheye images. To verify our approach, we show experimental results based on depth maps generated by our approach, and dense models produced from these depth maps.
"Matching Features Correctly through Semantic Understanding," Nikolay Kobyshev, Hayko Riemenschneider, L. Gool. 2014 2nd International Conference on 3D Vision. DOI: 10.1109/3DV.2014.15

Abstract: Image-to-image feature matching is the single most restrictive time bottleneck in any matching pipeline. We propose two methods for improving its speed and quality by employing semantic scene segmentation. First, we introduce a way of capturing the semantic scene context of a keypoint in a compact description. Second, we propose to learn the correct matchability of descriptors from these semantic contexts. Finally, we further reduce the complexity of matching to only a pre-computed set of semantically close keypoints. All methods can be used independently, and in the evaluation we show combinations for maximum speed benefits. Overall, our proposed methods outperform all baselines, providing significant improvements in accuracy and keypoint matching that is an order of magnitude faster.
"Variational Regularization and Fusion of Surface Normal Maps," Bernhard Zeisl, C. Zach, M. Pollefeys. 2014 2nd International Conference on 3D Vision. DOI: 10.1109/3DV.2014.92

Abstract: In this work we propose an optimization scheme for variational, vectorial denoising and fusion of surface normal maps. These are common outputs of shape-from-shading, photometric stereo, or single-image reconstruction methods, but they tend to be noisy and require post-processing for further usage. Processing of normal maps, which do not provide knowledge about the underlying scene depth, is complicated by their unit-length constraint, which renders the optimization non-linear and non-convex. The presented approach builds upon a linearization of the constraint to obtain a convex relaxation while guaranteeing convergence. Experimental results demonstrate that our algorithm generates more consistent representations from estimated and potentially complementary normal maps.
"Surface Detection Using Round Cut," Vedrana Andersen Dahl, A. Dahl, R. Larsen. 2014 2nd International Conference on 3D Vision. DOI: 10.1109/3DV.2014.60

Abstract: We propose an iterative method for detecting closed surfaces in volumetric data, where an optimal search is performed in a graph built upon a triangular mesh. Our approach is based on previous techniques for detecting an optimal terrain-like or tubular surface employing a regular grid. Unlike similar adaptations for triangle meshes, our method is capable of capturing complex geometries by iteratively refining the surface, and we obtain a high level of robustness by applying explicit mesh processing to intermediate results. Our method uses on-surface data support, but it also exploits data information about the regions inside and outside the surface. This provides additional robustness to the algorithm. We demonstrate the capabilities of the approach by detecting surfaces of CT-scanned objects.