Monocular Depth Estimation Using Encoder-Decoder Architecture and Transfer Learning from Single RGB Image

Hritam Basak, Sagnik Ghosal, Mainak Sarkar, Mayukhmali Das, Soham Chattopadhyay
Published in: 2020 IEEE 7th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON)
Publication date: 2020-11-27
DOI: 10.1109/UPCON50219.2020.9376365 (https://doi.org/10.1109/UPCON50219.2020.9376365)
Citations: 6

Abstract

Depth estimation from a single RGB image has recently become an important research topic because of its applications in self-supervised driving in autonomous cars, image reconstruction, and scene segmentation. Estimating depth from a single monocular image is more challenging than estimating it from stereo images, because a single frame lacks the spatio-temporal features that make 3D depth perception easier. Existing monocular depth estimation models often produce low-resolution, blurry depth maps and often fail to identify small object boundaries. In this paper, we propose a simple encoder-decoder network that predicts high-quality depth images from single RGB images using transfer learning. We initialize the encoder with features extracted from pre-trained networks, apply fine-tuning and augmentation strategies, and use the decoder to compute the final high-quality depth maps. Although the network has fewer trainable parameters and requires fewer training iterations, it outperforms the existing state-of-the-art methods and captures accurate boundaries when evaluated on two standard datasets, KITTI and NYU Depth V2.
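
The abstract does not say which metrics were used for the KITTI and NYU Depth V2 evaluation. As an illustration only, the sketch below computes three error measures that are commonly reported for monocular depth benchmarks: absolute relative error, root-mean-square error, and the fraction of pixels whose prediction-to-ground-truth ratio is below 1.25. The function name `depth_metrics`, the flat-list input format, and the sample values are assumptions made for this example and are not taken from the paper.

```python
import math


def depth_metrics(pred, gt):
    """Compute common depth errors over pixels with valid (positive) depths.

    pred, gt: equal-length sequences of depth values (hypothetical format).
    Pixels with non-positive ground truth or prediction are skipped.
    """
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    if not pairs:
        raise ValueError("no valid pixels to evaluate")
    n = len(pairs)
    # Mean absolute error, normalised by the ground-truth depth.
    abs_rel = sum(abs(p - g) / g for p, g in pairs) / n
    # Root-mean-square error in the same units as the depth values.
    rmse = math.sqrt(sum((p - g) ** 2 for p, g in pairs) / n)
    # Share of pixels where max(pred/gt, gt/pred) is below 1.25.
    delta1 = sum(1 for p, g in pairs if max(p / g, g / p) < 1.25) / n
    return {"abs_rel": abs_rel, "rmse": rmse, "delta1": delta1}


# Made-up values; the third pixel has no ground truth and is ignored.
pred = [2.0, 4.5, 3.0, 12.0]
gt = [2.0, 5.0, 0.0, 8.0]
metrics = depth_metrics(pred, gt)
```

Lower `abs_rel` and `rmse` and a higher `delta1` indicate a better depth map. Masking out pixels without ground truth matters in practice because depth ground truth is often sparse or incomplete.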