
Latest Publications: 2016 Fourth International Conference on 3D Vision (3DV)

Shape Analysis with Anisotropic Windowed Fourier Transform
Pub Date: 2016-10-01 DOI: 10.1109/3DV.2016.57
S. Melzi, E. Rodolà, U. Castellani, M. Bronstein
We propose Anisotropic Windowed Fourier Transform (AWFT), a framework for localized space-frequency analysis of deformable 3D shapes. With AWFT, we are able to extract meaningful intrinsic localized orientation-sensitive structures on surfaces, and use them in applications such as shape segmentation, salient point detection, feature point description, and matching. Our method outperforms previous approaches in the considered applications.
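A minimal sketch of a windowed Fourier transform on a mesh in the spirit of this framework, using an isotropic Laplacian and a heat-kernel window; the paper's AWFT swaps in an anisotropic Laplacian, which is omitted here. All names and parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.sparse.linalg import eigsh

def wft_coefficients(L, M, f, i, k=50, t=0.1):
    """Windowed Fourier coefficients of a signal f localized at vertex i.

    L : (n, n) sparse cotangent Laplacian (stiffness) matrix
    M : (n, n) sparse lumped mass matrix
    f : (n,) signal sampled on the mesh vertices
    """
    # First k eigenpairs of L @ phi = lam * M @ phi (shifted slightly off
    # zero because the Laplacian itself is singular).
    lam, phi = eigsh(L, k=k, M=M, sigma=-1e-6)        # phi: (n, k)
    # Heat-kernel window centered at vertex i.
    w = phi @ (np.exp(-t * lam) * phi[i])             # (n,)
    # Localized spectral coefficients <f * w, phi_k> in the mass inner product.
    return phi.T @ (M @ (f * w))                      # (k,)
```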
Citations: 19
Energy-Based Global Ternary Image for Action Recognition Using Sole Depth Sequences
Pub Date: 2016-10-01 DOI: 10.1109/3DV.2016.14
Mengyuan Liu, Hong Liu, Chen Chen, M. Najafian
In order to efficiently recognize actions from depth sequences, we propose a novel feature, called Global Ternary Image (GTI), which implicitly encodes both motion regions and motion directions between consecutive depth frames by recording the changes of depth pixels. In this study, each pixel in GTI indicates one of three possible states, namely positive, negative and neutral, which represent increased, decreased and unchanged depth values, respectively. Since GTI is sensitive to the subject's speed, we obtain energy-based GTI (E-GTI) by extracting GTI from pairwise depth frames with equal motion energy. To involve temporal information among depth frames, we extract E-GTI using multiple settings of motion energy. Here, the noise can be effectively suppressed by describing E-GTIs using the Radon Transform (RT). The 3D action representation is formed by feeding the hierarchical combination of RTs to the Bag of Visual Words (BoVW) model. From the extensive experiments on four benchmark datasets, namely MSRAction3D, DHA, MSRGesture3D and SKIG, it is evident that the hierarchical E-GTI outperforms the existing methods in 3D action recognition. We tested our proposed approach on the extended MSRAction3D dataset to further investigate and verify its robustness against partial occlusions, noise and speed variations.
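A minimal sketch of the ternary encoding described in the abstract: each pixel of the GTI is +1, -1 or 0 depending on whether depth increased, decreased or stayed (approximately) constant between two frames. The noise threshold tau is an illustrative assumption, and the energy-based frame pairing of E-GTI is omitted.

```python
import numpy as np

def global_ternary_image(depth_a, depth_b, tau=10.0):
    """Ternary image between two depth frames of the same shape (e.g. in mm)."""
    diff = depth_b.astype(np.int32) - depth_a.astype(np.int32)
    gti = np.zeros(diff.shape, dtype=np.int8)
    gti[diff > tau] = 1        # depth increased: positive state
    gti[diff < -tau] = -1      # depth decreased: negative state
    return gti                 # zeros mark the neutral state
```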
Citations: 12
Point Cloud Noise and Outlier Removal for Image-Based 3D Reconstruction
Pub Date: 2016-10-01 DOI: 10.1109/3DV.2016.20
Katja Wolff, Changil Kim, H. Zimmer, Christopher Schroers, M. Botsch, O. Sorkine-Hornung, A. Sorkine-Hornung
Point sets generated by image-based 3D reconstruction techniques are often much noisier than those obtained using active techniques like laser scanning. Therefore, they pose greater challenges to the subsequent surface reconstruction (meshing) stage. We present a simple and effective method for removing noise and outliers from such point sets. Our algorithm uses the input images and corresponding depth maps to remove pixels which are geometrically or photometrically inconsistent with the colored surface implied by the input. This allows standard surface reconstruction methods (such as Poisson surface reconstruction) to perform less smoothing and thus achieve higher quality surfaces with more features. Our algorithm is efficient, easy to implement, and robust to varying amounts of noise. We demonstrate the benefits of our algorithm in combination with a variety of state-of-the-art depth and surface reconstruction methods.
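A hedged sketch of the consistency test the abstract describes: a point observed in one view is projected into a second calibrated view and rejected when its depth or color disagrees with what that view recorded. The calibration inputs and thresholds are illustrative; the paper's actual criteria are more elaborate.

```python
import numpy as np

def consistent(p, color, K, R, t, depth_map, image, d_eps=0.02, c_eps=30.0):
    """Check one world-space point (with its color) against a second view."""
    q = K @ (R @ p + t)                  # project into the second camera
    if q[2] <= 0:
        return False                     # behind the camera
    u, v = int(round(q[0] / q[2])), int(round(q[1] / q[2]))
    h, w = depth_map.shape
    if not (0 <= v < h and 0 <= u < w):
        return False                     # outside the image
    geometric = abs(depth_map[v, u] - q[2]) < d_eps        # depth agreement
    photometric = np.linalg.norm(image[v, u].astype(float) - color) < c_eps
    return geometric and photometric
```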
Citations: 89
HS-Nets: Estimating Human Body Shape from Silhouettes with Convolutional Neural Networks
Pub Date: 2016-10-01 DOI: 10.1109/3DV.2016.19
E. Dibra, H. Jain, Cengiz Oztireli, R. Ziegler, M. Gross
We represent human body shape estimation from binary silhouettes or shaded images as a regression problem, and describe a novel method to tackle it using CNNs. Utilizing a parametric body model, we train CNNs to learn a global mapping from the input to the shape parameters used to reconstruct the shapes of people in neutral poses, with the application of garment fitting in mind. This results in an accurate, robust and automatic system, orders of magnitude faster than the methods we compare to, enabling interactive applications. In addition, we show how to combine silhouettes from two views to improve prediction over a single view. The method is extensively evaluated on thousands of synthetic shapes and real data and compared to state-of-the-art approaches, clearly outperforming methods based on global fitting and strongly competing with more expensive ones based on local fitting.
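A minimal sketch of the regression setup described above: a small CNN maps a binary silhouette to a vector of body-shape parameters. The architecture and sizes are illustrative assumptions, not the paper's HS-Net.

```python
import torch
import torch.nn as nn

class SilhouetteRegressor(nn.Module):
    def __init__(self, n_params=20):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        # Global mapping from pooled features to the shape-parameter space.
        self.head = nn.Linear(64 * 4 * 4, n_params)

    def forward(self, x):                 # x: (B, 1, H, W) binary silhouettes
        return self.head(self.features(x).flatten(1))
```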
Citations: 81
Robust Real-Time 3D Face Tracking from RGBD Videos under Extreme Pose, Depth, and Expression Variation
Pub Date: 2016-10-01 DOI: 10.1109/3DV.2016.54
Hai Xuan Pham, V. Pavlovic
We introduce a novel end-to-end real-time pose-robust 3D face tracking framework for RGBD videos, which is capable of tracking head pose and facial actions simultaneously in unconstrained environments without intervention or pre-calibration from a user. In particular, we emphasize tracking the head pose from profile to profile and improving tracking performance in challenging instances, where the tracked subject is at a considerably large distance from the camera and the quality of the data deteriorates severely. To achieve these goals, the tracker is guided by an efficient multi-view 3D shape regressor, trained on generic RGB datasets, which is able to predict model parameters despite large head rotations or tracking range. Specifically, the shape regressor is made aware of the head pose by inferring the likelihood of particular facial landmarks being visible through a joint regression-classification local random forest framework, and piecewise linear regression models effectively map visibility features to shape parameters. In addition, the regressor is combined with a joint 2D+3D optimization that sparsely exploits depth information to further refine the shape parameters and maintain tracking accuracy over time. The result is a robust online RGBD 3D face tracker that can model extreme head poses and facial expressions accurately in challenging scenes, as demonstrated in our extensive experiments.
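A hedged sketch of a joint 2D+3D data term of the kind the refinement step describes: a visibility-weighted reprojection error on 2D landmarks plus a sparse depth term on model points. The weights, projection model and all names are illustrative assumptions, not the paper's energy.

```python
import numpy as np

def joint_energy(X, x2d, vis, K, depth, lam=0.5):
    """X: (n, 3) model landmarks in camera space; x2d: (n, 2) detections;
    vis: (n,) visibility weights; depth: (n,) measured depth (NaN = missing)."""
    proj = (K @ X.T).T
    proj = proj[:, :2] / proj[:, 2:3]                  # pinhole projection
    e2d = np.sum(vis[:, None] * (proj - x2d) ** 2)     # 2D reprojection term
    valid = ~np.isnan(depth)
    e3d = np.sum((X[valid, 2] - depth[valid]) ** 2)    # sparse depth term
    return e2d + lam * e3d
```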
Citations: 9
A Hybrid Structure/Trajectory Constraint for Visual SLAM
Pub Date: 2016-10-01 DOI: 10.1109/3DV.2016.12
Angélique Loesch, S. Bourgeois, V. Gay-Bellile, M. Dhome
This paper presents a hybrid structure/trajectory constraint that uses the output camera poses of a model-based tracker for object localization with a SLAM algorithm. This constraint takes into account the structure information given by a CAD model while relying on the formalism of trajectory constraints. It has the advantages of being compact in memory and of accelerating the SLAM optimization process. The accuracy and robustness of the resulting localization, as well as the memory and time gains, are evaluated on synthetic and real data. Videos are available as supplementary material.
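A minimal sketch of a pose-prior residual in the spirit of the hybrid constraint: keyframe poses estimated by SLAM are softly tied to the poses output by a model-based tracker. The error parameterization and weights are illustrative assumptions.

```python
import numpy as np

def pose_prior_residual(R_slam, t_slam, R_track, t_track, w_rot=1.0, w_trans=1.0):
    """Residual between a SLAM keyframe pose and the tracker's pose for that frame."""
    dR = R_track.T @ R_slam                      # relative rotation
    # Rotation angle recovered from the trace (axis-angle magnitude of dR).
    angle = np.arccos(np.clip((np.trace(dR) - 1.0) / 2.0, -1.0, 1.0))
    trans = np.linalg.norm(t_slam - t_track)
    return w_rot * angle ** 2 + w_trans * trans ** 2
```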
Citations: 2
A 3D Reconstruction with High Density and Accuracy Using Laser Profiler and Camera Fusion System on a Rover
Pub Date: 2016-10-01 DOI: 10.1109/3DV.2016.70
Ryoichi Ishikawa, Menandro Roxas, Yoshihiro Sato, Takeshi Oishi, T. Masuda, K. Ikeuchi
3D sensing systems mounted on mobile platforms are emerging and have been developed for various applications. In this paper, we propose a profiler scanning system mounted on a rover to scan and reconstruct a bas-relief with high density and accuracy. Our hardware system consists of an omnidirectional camera and a 3D laser scanner. Our method selects good projection points for tracking in order to estimate motion stably, and rejects mismatches caused by the difference between the positions of the laser scanner and the camera, using an error metric based on the distance from the omnidirectional camera to the scanned point. We demonstrate that our results have better accuracy than a comparable approach. In addition to the local motion estimation method, we propose a global pose refinement method using multi-modal 2D-3D registration, and our results show good consistency between the reflectance image and the 2D RGB image.
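A hedged sketch of distance-based mismatch rejection of the kind the abstract describes: a 2D-3D match is kept only when its reprojection error stays within a tolerance that grows with the point's distance from the omnidirectional camera. The tolerance model and constants are illustrative assumptions.

```python
import numpy as np

def keep_match(err_px, point_cam, base_tol=2.0, k=0.5):
    """err_px: reprojection error in pixels; point_cam: 3D point in camera frame."""
    dist = np.linalg.norm(point_cam)
    return err_px < base_tol + k * dist   # farther points tolerate larger error
```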
Citations: 7
CNN-Based Object Segmentation in Urban LIDAR with Missing Points
Pub Date: 2016-10-01 DOI: 10.1109/3DV.2016.51
Allan Zelener, I. Stamos
We examine the task of point-level object segmentation in outdoor urban LIDAR scans. A key challenge in this area is the problem of missing points in the scans due to technical limitations of the LIDAR sensors. Our core contributions are demonstrating the benefit of reframing the segmentation task over the scan acquisition grid, as opposed to considering only the acquired 3D point cloud, and developing a pipeline for training and applying a convolutional neural network to accomplish this segmentation on large-scale LIDAR scenes. By labeling missing points in the scanning grid, we show that we can train our classifier to achieve a more accurate and complete segmentation mask for the vehicle object category, which is particularly prone to missing points. Additionally, we show that the choice of input feature maps to the CNN significantly affects the accuracy of the segmentation, and these features should be chosen to fully encapsulate the 3D scene structure. We evaluate our model on a LIDAR dataset collected by Google Street View cars over a large area of New York City.
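A minimal sketch of a grid-based input encoding in the spirit of the paper: the LIDAR acquisition grid is turned into a multi-channel image, with missing returns kept as an explicit mask channel so the network can label them too. The channel choice is an illustrative assumption.

```python
import numpy as np

def scan_grid_features(xyz, valid):
    """xyz: (H, W, 3) points on the acquisition grid; valid: (H, W) bool mask."""
    depth = np.linalg.norm(xyz, axis=2)
    depth[~valid] = 0.0                          # zero-fill missing returns
    feats = np.stack([depth,
                      xyz[..., 2] * valid,       # height channel
                      valid.astype(np.float32)], # missing-point mask channel
                     axis=0)
    return feats                                 # (3, H, W) CNN input
```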
Citations: 19
Matching Deformable Objects in Clutter
Pub Date: 2016-10-01 DOI: 10.1109/3DV.2016.10
L. Cosmo, E. Rodolà, Jonathan Masci, A. Torsello, M. Bronstein
We consider the problem of deformable object detection and dense correspondence in cluttered 3D scenes. A key ingredient of our method is the choice of representation: we formulate the problem in the spectral domain using the functional maps framework, where we seek the most regular nearly-isometric parts in the model and the scene that minimize the correspondence error. The problem is initialized by solving a sparse relaxation of a quadratic assignment problem on features obtained via data-driven metric learning. The resulting matching pipeline is solved efficiently, and yields accurate results in challenging settings that were previously left unexplored in the literature.
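A hedged sketch of functional-map estimation in the least-squares form commonly used with this framework: find the map C aligning descriptor coefficients of the model to those of the scene part. Regularization and the paper's sparse QAP initialization are omitted; the names are illustrative.

```python
import numpy as np

def functional_map(A, B):
    """A: (k, d) model descriptor coefficients; B: (k, d) scene coefficients.
    Solves min_C ||C A - B||_F^2 via least squares."""
    # C A = B  is equivalent to  A^T C^T = B^T ; solve for C^T.
    Ct, *_ = np.linalg.lstsq(A.T, B.T, rcond=None)
    return Ct.T                          # (k, k) functional map
```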
Citations: 55
Comparison of Radial and Tangential Geometries for Cylindrical Panorama
Pub Date: 2016-10-01 DOI: 10.1109/3DV.2016.81
F. Amjadi, S. Roy
This paper presents a new approach that builds 360-degree cylindrical panoramic images from multiple cameras. In order to ensure a perceptually correct result, mosaicing typically requires either a planar or near-planar scene, parallax-free camera motion between source frames, or a dense sampling of the scene. When these conditions are not satisfied, various artifacts may appear, and many algorithms exist to overcome these problems. We propose a panoramic setup where cameras are placed evenly around a circle. Instead of looking outward, which is the traditional configuration, we propose to make the optical axes tangent to the camera circle, a "tangential" configuration. We will demonstrate that this configuration is very insensitive to depth estimation, which reduces stitching artifacts. This property is only limited by the fact that tangential cameras usually occlude each other along the circle. Besides an analysis and comparison of the radial and tangential geometries, we provide an experimental setup with real panoramas obtained under realistic conditions.
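A minimal sketch of the two rig geometries compared above: cameras evenly spaced on a circle, with optical axes either radial (pointing outward) or tangential (tangent to the circle). It returns the camera centers and viewing directions in the circle's plane; the function name is illustrative.

```python
import numpy as np

def rig_geometry(n_cams, radius, tangential=True):
    """Centers and 2D optical-axis directions for cameras on a circle."""
    angles = 2.0 * np.pi * np.arange(n_cams) / n_cams
    centers = radius * np.stack([np.cos(angles), np.sin(angles)], axis=1)
    if tangential:
        axes = np.stack([-np.sin(angles), np.cos(angles)], axis=1)  # tangent
    else:
        axes = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # radial
    return centers, axes
```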
Citations: 1