VIPNet: A Fast and Accurate Single-View Volumetric Reconstruction by Learning Sparse Implicit Point Guidance
Dong Du, Zhiyi Zhang, Xiaoguang Han, Shuguang Cui, Ligang Liu
2020 International Conference on 3D Vision (3DV), November 2020. DOI: 10.1109/3DV50981.2020.00065
Citations: 4
Abstract
With the advent of deep neural networks, learning-based single-view reconstruction has gained popularity. However, in 3D, no single representation is both computationally efficient and accurate while still allowing reconstruction of high-resolution geometry of arbitrary topology. Accurate implicit methods are time-consuming because they require dense sampling and inference, while volumetric approaches are fast but limited by heavy memory usage and low accuracy. In this paper, we propose VIPNet, an end-to-end hybrid representation learning framework for fast and accurate single-view reconstruction under sparse implicit point guidance. Given an image, it first generates a volumetric result; meanwhile, a corresponding implicit shape representation is learned. To balance efficiency and accuracy, we adopt PointGenNet to learn a set of representative points that guide voxel refinement via the corresponding sparse implicit inference. A patch-based synthesis strategy with global-local features under implicit guidance further reduces the memory required to generate high-resolution output. Extensive experiments demonstrate the effectiveness of our method both qualitatively and quantitatively, indicating that the proposed hybrid learning outperforms learning each representation separately. Specifically, our network not only runs 60 times faster than implicit methods but also achieves accuracy gains. We hope this work will inspire a rethinking of hybrid representation learning.
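To make the pipeline described above concrete, the following is a minimal, hypothetical PyTorch sketch of the coarse-to-refined flow: an image is encoded, a coarse voxel grid and an implicit occupancy function are predicted, a point generator proposes a sparse set of query points, and the implicit predictions at those points are written back into the voxel grid. All module names (ImageEncoder, VoxelDecoder, ImplicitDecoder, PointGenNet), layer sizes, and the refinement rule are assumptions for illustration only and are not taken from the paper.

```python
# Hypothetical sketch of a voxel + sparse-implicit-guidance pipeline (not the
# authors' implementation). All architectures and sizes are placeholders.
import torch
import torch.nn as nn


class ImageEncoder(nn.Module):
    """Encodes an RGB image into a global feature vector (placeholder CNN)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, img):                        # img: (B, 3, H, W)
        return self.fc(self.net(img).flatten(1))   # (B, feat_dim)


class VoxelDecoder(nn.Module):
    """Predicts a coarse occupancy grid from the global feature."""
    def __init__(self, feat_dim=128, res=32):
        super().__init__()
        self.res = res
        self.fc = nn.Linear(feat_dim, res ** 3)

    def forward(self, feat):
        v = torch.sigmoid(self.fc(feat))
        return v.view(-1, self.res, self.res, self.res)


class ImplicitDecoder(nn.Module):
    """Predicts occupancy at continuous 3D query points (implicit function)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim + 3, 256), nn.ReLU(),
                                 nn.Linear(256, 1))

    def forward(self, feat, pts):                  # pts: (B, N, 3)
        f = feat.unsqueeze(1).expand(-1, pts.shape[1], -1)
        return torch.sigmoid(self.mlp(torch.cat([f, pts], dim=-1))).squeeze(-1)


class PointGenNet(nn.Module):
    """Generates a sparse set of representative query points in [-1, 1]^3."""
    def __init__(self, feat_dim=128, n_pts=256):
        super().__init__()
        self.n_pts = n_pts
        self.fc = nn.Linear(feat_dim, n_pts * 3)

    def forward(self, feat):
        return torch.tanh(self.fc(feat)).view(-1, self.n_pts, 3)


def reconstruct(img, enc, vox_dec, imp_dec, pt_gen):
    """Coarse voxels + sparse implicit queries at generated points -> refined voxels."""
    feat = enc(img)
    voxels = vox_dec(feat)            # coarse volumetric prediction
    pts = pt_gen(feat)                # sparse representative points
    occ = imp_dec(feat, pts)          # implicit occupancy at those points
    # Write sparse implicit predictions back into the nearest voxel cells
    # (a simple stand-in for guided voxel refinement).
    res = voxels.shape[-1]
    idx = ((pts + 1) * 0.5 * (res - 1)).long().clamp(0, res - 1)
    refined = voxels.clone()
    for b in range(img.shape[0]):
        refined[b, idx[b, :, 0], idx[b, :, 1], idx[b, :, 2]] = occ[b]
    return refined


if __name__ == "__main__":
    enc, vox_dec = ImageEncoder(), VoxelDecoder()
    imp_dec, pt_gen = ImplicitDecoder(), PointGenNet()
    out = reconstruct(torch.rand(2, 3, 128, 128), enc, vox_dec, imp_dec, pt_gen)
    print(out.shape)  # torch.Size([2, 32, 32, 32])
```

Because the implicit decoder is only queried at the sparse generated points rather than on a dense grid, the cost of implicit inference stays small, which is the efficiency argument made in the abstract; the patch-based high-resolution synthesis step is omitted from this sketch.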