{"title":"Multi-view stereo with recurrent neural networks for spatio-temporal consistent depth maps","authors":"Hosung Son, Suk-ju Kang","doi":"10.1109/ICEIC57457.2023.10049937","DOIUrl":null,"url":null,"abstract":"Depth estimation methods based on deep learning have been studied to improve depth estimation accuracy. However, obtaining inter-frame consistency in depth maps in video depth estimation remains a challenge. Therefore, we proposed an application methodology for spatio-temporal consistency enhancement in video depth estimation based on convolutional neural networks (CNNs) and recurrent neural networks (RNNs). In other words, the convolutional long-short term memory (ConvLSTM) module was added to the decoder of depth estimation network to enable the use of the information from the previous frames. Additionally, the one-stage learning process was implemented to ensure ease of training. In conclusion, we experimentally show that the proposed method can achieve not only improved accuracy also consistency between depth map frames.","PeriodicalId":373752,"journal":{"name":"2023 International Conference on Electronics, Information, and Communication (ICEIC)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 International Conference on Electronics, Information, and Communication (ICEIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICEIC57457.2023.10049937","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Depth estimation methods based on deep learning have been studied to improve depth estimation accuracy. However, obtaining inter-frame consistency in depth maps in video depth estimation remains a challenge. Therefore, we proposed an application methodology for spatio-temporal consistency enhancement in video depth estimation based on convolutional neural networks (CNNs) and recurrent neural networks (RNNs). In other words, the convolutional long-short term memory (ConvLSTM) module was added to the decoder of depth estimation network to enable the use of the information from the previous frames. Additionally, the one-stage learning process was implemented to ensure ease of training. In conclusion, we experimentally show that the proposed method can achieve not only improved accuracy also consistency between depth map frames.