Monocular depth estimation based on Chained Residual Pooling and Gradient Weighted Loss
Jiajun Han, Zhengang Jiang, Guanyuan Feng
2023 3rd International Conference on Consumer Electronics and Computer Engineering (ICCECE)
Publication date: 2023-01-06
DOI: 10.1109/ICCECE58074.2023.10135509 (https://doi.org/10.1109/ICCECE58074.2023.10135509)
Abstract
In recent years, self-supervised monocular depth estimation has achieved remarkable results in the field of autonomous driving. These methods adopt the brightness consistency assumption to guide network training: image brightness must remain constant across adjacent frames. However, this assumption does not hold in laparoscopic scenarios, where the light intensity on the same tissue changes over time during surgery. In addition, the limited receptive fields in laparoscopy lead to low utilization of structural cues, so the predicted depth map performs poorly on tissue contours when a laparoscopic frame is fed into the depth estimation network. To address the brightness consistency problem, this work proposes integrating the second-order gradient of the image and the second-order gradient of the disparity (parallax) map into the photometric reconstruction error that guides the network. To address the low utilization of contextual cues in laparoscopic images, these cues are weighted in the decoder part of the network to improve the reuse of low-resolution feature maps for tissue contour cues. Experiments were performed on the SCARED dataset, with the new loss and the new module added to the network separately to verify their effectiveness; the results showed good performance on all four commonly used evaluation metrics.
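
The abstract does not give the exact form of the loss, so the following is only a minimal NumPy sketch of one way a photometric reconstruction error could be combined with second-order gradients of the image and of the disparity (parallax) map. The function names, the weights `alpha` and `beta`, the L1 intensity term, and the edge-aware second-order smoothness term are all assumptions made for illustration, not the authors' formulation. The sketch relies on one property that is certain: a second-order finite difference is unchanged when a constant offset is added to the image, which is why such a term is less sensitive to a uniform brightness shift between frames.

```python
import numpy as np


def second_order_gradients(img):
    """Second-order finite differences of a 2D array along x and y.

    Edge padding keeps the output the same shape as the input.
    """
    padded = np.pad(img, 1, mode="edge")
    center = padded[1:-1, 1:-1]
    d2x = padded[1:-1, 2:] - 2.0 * center + padded[1:-1, :-2]
    d2y = padded[2:, 1:-1] - 2.0 * center + padded[:-2, 1:-1]
    return d2x, d2y


def gradient_weighted_loss(target, reconstructed, disparity, alpha=0.5, beta=0.1):
    """Hypothetical loss: intensity error + second-order gradient error
    + edge-aware second-order smoothness of the disparity map.

    All inputs are 2D float arrays (grayscale images) of the same shape.
    alpha and beta are illustrative weights, not values from the paper.
    """
    # Plain L1 photometric term; sensitive to brightness changes.
    photo = np.abs(target - reconstructed).mean()

    # Second-order gradient term; unaffected by a constant brightness offset.
    t_xx, t_yy = second_order_gradients(target)
    r_xx, r_yy = second_order_gradients(reconstructed)
    grad = (np.abs(t_xx - r_xx) + np.abs(t_yy - r_yy)).mean()

    # Penalize disparity curvature, down-weighted where the image itself
    # has strong second-order structure (for example at tissue contours).
    d_xx, d_yy = second_order_gradients(disparity)
    smooth = (np.abs(d_xx) * np.exp(-np.abs(t_xx))
              + np.abs(d_yy) * np.exp(-np.abs(t_yy))).mean()

    return (1.0 - alpha) * photo + alpha * grad + beta * smooth
```

With a flat disparity map and a reconstruction that differs from the target only by a constant offset, the gradient and smoothness terms vanish and only the intensity term contributes, which illustrates why shifting weight toward the gradient term reduces the penalty for a global illumination change.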