Title: Supervised depth estimation for visual perception on low-end platforms
Authors: Sabri Abderrazzak, Souissi Omar, Bouyahyaoui Abdelmalik
Venue: 2020 IEEE International Conference on Technology Management, Operations and Decisions (ICTMOD)
Published: 2020-11-24
DOI: 10.1109/ICTMOD49425.2020.9380618
Citations: 0
Abstract
Depth estimation is a very useful task in robotics and computer vision, with applications such as obstacle avoidance, navigation, and localization in GPS-denied environments. Stereo vision approaches are common in the state of the art and are dominated by methods from classical Euclidean geometry and computational photography. However, most of these methods are computationally expensive, which limits their applicability to high-end setups and platforms. In light of the recent boom in deep learning, and especially CNNs, we present in this paper a lightweight CNN-based encoder-decoder model for monocular depth estimation from single RGB images. The model is designed to run on low-cost platforms. In particular, we explore the performance of the most recent version of the MobileNet family as the encoder, and then gradually rebuild the depth map using the optimized UpConv block for decoding.
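To make the encoder-decoder idea concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It rests on assumptions that the abstract does not state: that the encoder halves the spatial resolution five times (an overall stride of 32, as in MobileNet classification backbones), that each decoder stage doubles the resolution, and that the resize step is nearest-neighbour upsampling. All function names are hypothetical. It also compares weight counts for a standard convolution and a depthwise separable one, the building block that keeps the MobileNet family lightweight.

```python
def standard_conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(kernel, c_in, c_out):
    """Weights in a depthwise (kernel x kernel per channel) plus pointwise (1x1) pair."""
    return kernel * kernel * c_in + c_in * c_out


def trace_shapes(height, width, stages=5):
    """Return (encoder_shapes, decoder_shapes) for a symmetric encoder-decoder.

    Assumption: every encoder stage halves the resolution and every decoder
    stage doubles it; the paper's actual stage layout is not given here.
    """
    encoder = [(height, width)]
    for _ in range(stages):
        h, w = encoder[-1]
        encoder.append((h // 2, w // 2))
    decoder = [encoder[-1]]
    for _ in range(stages):
        h, w = decoder[-1]
        decoder.append((h * 2, w * 2))
    return encoder, decoder


def upsample_nearest_2x(grid):
    """Nearest-neighbour 2x upsampling of a 2D list.

    Assumption: this stands in for the resize step of an up-convolution
    style block; a learned convolution would normally follow it.
    """
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


if __name__ == "__main__":
    enc, dec = trace_shapes(224, 224)
    print("encoder:", enc)
    print("decoder:", dec)
    print("standard 3x3, 32->64:", standard_conv_params(3, 32, 64))
    print("separable 3x3, 32->64:", depthwise_separable_params(3, 32, 64))
```

Under these assumptions, a 224 x 224 input shrinks to 7 x 7 at the bottleneck and is restored to 224 x 224 by five doubling stages, while a 3 x 3 layer mapping 32 channels to 64 needs 2,336 weights in separable form against 18,432 in standard form.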