Two-stage Self-supervised MVS Network using Adaptive Depth Sampling
Authors: Yangyan Deng, Ding Yuan, Hong Zhang
Venue: 2022 IEEE International Conference on Real-time Computing and Robotics (RCAR)
Publication date: 2022-07-17
DOI: 10.1109/RCAR54675.2022.9872231
URL: https://doi.org/10.1109/RCAR54675.2022.9872231
Abstract
Driven by advances in deep learning, multi-view stereo (MVS) has progressed significantly in recent years. Because three-dimensional supervision is expensive to obtain, self-supervised methods have greater potential. This work proposes a novel two-stage self-supervised learning framework for multi-view stereo that aims to overcome photometric dependency and the effect of foreshortening. Because accurate depth hypotheses play an important role in depth estimation, the work concentrates on designing an adaptive depth sampling module based on propagation from neighboring spatial patches, with the propagation determined by the normal maps. This motivates a two-stage process: the first stage produces coarse initial depth maps and normal maps, and the network in the second stage refines the depth sampling module by taking the influence of foreshortening into account. Furthermore, the loss functions include a feature-metric consistency term to overcome the photometric inconsistency caused by lighting variation, as well as a term enforcing consistency between depth maps and normal maps. The proposed two-stage framework is evaluated on the DTU dataset. The experimental results demonstrate that the self-supervised framework performs well compared with the baseline methods.
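The abstract does not spell out how depth hypotheses are propagated between neighboring patches, so the following is only a minimal sketch of one common way a surface normal can guide such propagation, not the authors' module. It assumes a pinhole camera with intrinsics `K`, a depth defined as the z-coordinate in the camera frame, and a locally planar surface: if pixel `p` has depth `d_p` and normal `n`, the depth that the same plane induces at a neighboring pixel `q` is `d_q = d_p * (n . K^-1 p) / (n . K^-1 q)`. The function names and the example intrinsics are hypothetical.

```python
import numpy as np


def backproject(K_inv, pixel):
    """Return the ray K^-1 [u, v, 1]^T; its z-component is 1 for a pinhole K."""
    u, v = pixel
    return K_inv @ np.array([u, v, 1.0])


def propagate_depth(K, p, q, depth_p, normal, eps=1e-8):
    """Depth at pixel q induced by the local plane through pixel p.

    Assumption (not from the paper): the surface around p is a plane with
    normal `normal` (camera frame) passing through the 3D point of p.
    """
    K_inv = np.linalg.inv(K)
    ray_p = backproject(K_inv, p)
    ray_q = backproject(K_inv, q)
    denom = float(normal @ ray_q)
    if abs(denom) < eps:
        # The viewing ray through q is (nearly) parallel to the plane.
        raise ValueError("plane does not give a stable depth at q")
    return depth_p * float(normal @ ray_p) / denom


if __name__ == "__main__":
    # Hypothetical intrinsics, chosen only for illustration.
    K = np.array([[500.0, 0.0, 320.0],
                  [0.0, 500.0, 240.0],
                  [0.0, 0.0, 1.0]])
    slanted = np.array([0.3, 0.0, -1.0])
    slanted /= np.linalg.norm(slanted)
    print(propagate_depth(K, (320, 240), (330, 240), 2.0, slanted))
```

For a fronto-parallel plane (normal along the optical axis) the propagated depth equals the source depth, while a slanted normal shifts the hypothesis along the surface. This is the intuition behind using normals to place depth samples on foreshortened surfaces.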