VIPNet: A Fast and Accurate Single-View Volumetric Reconstruction by Learning Sparse Implicit Point Guidance
Dong Du, Zhiyi Zhang, Xiaoguang Han, Shuguang Cui, Ligang Liu
2020 International Conference on 3D Vision (3DV), November 2020. DOI: 10.1109/3DV50981.2020.00065
Citations: 4
Abstract
With the advent of deep neural networks, learning-based single-view reconstruction has gained popularity. However, in 3D, no single representation is both computationally efficient and accurate while still allowing reconstruction of high-resolution geometry of arbitrary topology. Accurate implicit methods are time-consuming because they require dense sampling and inference, while volumetric approaches are fast but limited by heavy memory usage and low accuracy. In this paper, we propose VIPNet, an end-to-end hybrid representation learning framework for fast and accurate single-view reconstruction under sparse implicit point guidance. Given an image, it first generates a volumetric result; meanwhile, a corresponding implicit shape representation is learned. To balance efficiency and accuracy, we adopt PointGenNet to learn a set of representative points that guide voxel refinement via the corresponding sparse implicit inference. A patch-based synthesis strategy with global-local features under implicit guidance further reduces the memory required to generate high-resolution output. Extensive experiments demonstrate the effectiveness of our method both qualitatively and quantitatively, indicating that the proposed hybrid learning outperforms learning each representation separately. Specifically, our network not only runs 60 times faster than implicit methods but also achieves accuracy gains. We hope this work will inspire a rethinking of hybrid representation learning.
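To make the pipeline described above concrete, the following is a minimal, hypothetical PyTorch sketch of the coarse-to-refined flow: an image is encoded, a coarse voxel grid and an implicit occupancy function are predicted, a point generator proposes a sparse set of query points, and the implicit predictions at those points are written back into the voxel grid. All module names (ImageEncoder, VoxelDecoder, ImplicitDecoder, PointGenNet), layer sizes, and the refinement rule are assumptions for illustration only and are not taken from the paper.

```python
# Hypothetical sketch of a voxel + sparse-implicit-guidance pipeline (not the
# authors' implementation). All architectures and sizes are placeholders.
import torch
import torch.nn as nn


class ImageEncoder(nn.Module):
    """Encodes an RGB image into a global feature vector (placeholder CNN)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, img):                        # img: (B, 3, H, W)
        return self.fc(self.net(img).flatten(1))   # (B, feat_dim)


class VoxelDecoder(nn.Module):
    """Predicts a coarse occupancy grid from the global feature."""
    def __init__(self, feat_dim=128, res=32):
        super().__init__()
        self.res = res
        self.fc = nn.Linear(feat_dim, res ** 3)

    def forward(self, feat):
        v = torch.sigmoid(self.fc(feat))
        return v.view(-1, self.res, self.res, self.res)


class ImplicitDecoder(nn.Module):
    """Predicts occupancy at continuous 3D query points (implicit function)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(feat_dim + 3, 256), nn.ReLU(),
                                 nn.Linear(256, 1))

    def forward(self, feat, pts):                  # pts: (B, N, 3)
        f = feat.unsqueeze(1).expand(-1, pts.shape[1], -1)
        return torch.sigmoid(self.mlp(torch.cat([f, pts], dim=-1))).squeeze(-1)


class PointGenNet(nn.Module):
    """Generates a sparse set of representative query points in [-1, 1]^3."""
    def __init__(self, feat_dim=128, n_pts=256):
        super().__init__()
        self.n_pts = n_pts
        self.fc = nn.Linear(feat_dim, n_pts * 3)

    def forward(self, feat):
        return torch.tanh(self.fc(feat)).view(-1, self.n_pts, 3)


def reconstruct(img, enc, vox_dec, imp_dec, pt_gen):
    """Coarse voxels + sparse implicit queries at generated points -> refined voxels."""
    feat = enc(img)
    voxels = vox_dec(feat)            # coarse volumetric prediction
    pts = pt_gen(feat)                # sparse representative points
    occ = imp_dec(feat, pts)          # implicit occupancy at those points
    # Write sparse implicit predictions back into the nearest voxel cells
    # (a simple stand-in for guided voxel refinement).
    res = voxels.shape[-1]
    idx = ((pts + 1) * 0.5 * (res - 1)).long().clamp(0, res - 1)
    refined = voxels.clone()
    for b in range(img.shape[0]):
        refined[b, idx[b, :, 0], idx[b, :, 1], idx[b, :, 2]] = occ[b]
    return refined


if __name__ == "__main__":
    enc, vox_dec = ImageEncoder(), VoxelDecoder()
    imp_dec, pt_gen = ImplicitDecoder(), PointGenNet()
    out = reconstruct(torch.rand(2, 3, 128, 128), enc, vox_dec, imp_dec, pt_gen)
    print(out.shape)  # torch.Size([2, 32, 32, 32])
```

Because the implicit decoder is only queried at the sparse generated points rather than on a dense grid, the cost of implicit inference stays small, which is the efficiency argument made in the abstract; the patch-based high-resolution synthesis step is omitted from this sketch.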