A Learning-Based Framework for Depth Perception using Dense Light Fields

A. P. Ferrugem, B. Zatt, L. Agostini
{"title":"A Learning-Based Framework for Depth Perception using Dense Light Fields","authors":"A. P. Ferrugem, B. Zatt, L. Agostini","doi":"10.1145/3539637.3557062","DOIUrl":null,"url":null,"abstract":"The rapid development of optical sensors technology has accompanied a growing demand for visual measurement systems in emerging areas that need to interpret the real three-dimensional physical world, such as self-driving cars, mobile robotics, Advanced Driver Assistance Systems (ADAS), and medical diagnostic in 3D imaging. In these systems, for modeling the physical world, it is necessary to unify visual information with depth measurements. Light Field cameras have the potential to be used in such systems as a versatile hypersensor. Since Light Fields represent the scene’s visual information from multiple viewpoints, it is possible to calculate the depth information through trigonometric operations. This paper proposes a learning-based framework that allows unifying scene depth with visual information obtained from Light Fields. The structure of the proposed framework is composed of four main modules. The deep learning modules consist of (i) a depth map estimation using a siamese convolutional neural network and (ii) an instance segmentation employing region-based convolutional neural network. The others two modules apply linear transformations: (iii) a module which applies the matrix transformations with camera intrinsic parameters to generated a new depth map of absolute distances and (iv) a module to return the distance of the selected objects. For the depth map estimation module this framework proposal a siamese neural network called EPINET-FAST, which allows for generating depth maps in less than half the time of the original EPINET. A case study is presented using Dense Light Fields captured by a Lytro Illum camera (plenotic 1.0). The case study seeks to exemplify the processing time of each module, allowing researchers to isolate critical points and propose changes in the future, seeking a processing that can be applied in real time.","PeriodicalId":350776,"journal":{"name":"Proceedings of the Brazilian Symposium on Multimedia and the Web","volume":"167 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Brazilian Symposium on Multimedia and the Web","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3539637.3557062","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

The rapid development of optical sensor technology has been accompanied by a growing demand for visual measurement systems in emerging areas that must interpret the real three-dimensional physical world, such as self-driving cars, mobile robotics, Advanced Driver Assistance Systems (ADAS), and medical diagnosis with 3D imaging. To model the physical world, these systems must unify visual information with depth measurements. Light Field cameras have the potential to serve in such systems as a versatile hypersensor. Since Light Fields represent the scene's visual information from multiple viewpoints, depth information can be computed through trigonometric operations. This paper proposes a learning-based framework that unifies scene depth with the visual information obtained from Light Fields. The proposed framework is composed of four main modules. The two deep learning modules are (i) depth map estimation using a Siamese convolutional neural network and (ii) instance segmentation using a region-based convolutional neural network. The other two modules apply linear transformations: (iii) a module that applies matrix transformations with the camera's intrinsic parameters to generate a new depth map of absolute distances, and (iv) a module that returns the distance of the selected objects. For the depth map estimation module, this framework proposes a Siamese neural network called EPINET-FAST, which generates depth maps in less than half the time of the original EPINET. A case study is presented using Dense Light Fields captured by a Lytro Illum camera (plenoptic 1.0). The case study measures the processing time of each module, allowing researchers to isolate bottlenecks and propose future changes toward processing that can be applied in real time.
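
To make the two linear-transformation modules concrete, the sketch below illustrates how (iii) a disparity map can be converted to absolute distances with camera intrinsics, and (iv) how a distance can be returned for a segmented object. It assumes the standard plenoptic/stereo relation Z = f * b / d (focal length f in pixels, sub-aperture baseline b in meters, disparity d in pixels); the paper does not publish its code, so all function and parameter names here are hypothetical, and this is a minimal illustration rather than the authors' implementation.

import numpy as np

def disparity_to_depth(disparity, focal_length_px, baseline_m, eps=1e-6):
    # Module (iii), hypothetical: convert a sub-aperture disparity map
    # (pixels) to metric depth (meters) via Z = f * b / d.
    return (focal_length_px * baseline_m) / np.maximum(disparity, eps)

def object_distance(depth_map, instance_mask):
    # Module (iv), hypothetical: report the distance of a selected object
    # as the median depth over its instance-segmentation mask.
    return float(np.median(depth_map[instance_mask]))

# Hypothetical usage with a 512x512 disparity map and a boolean mask:
disparity = np.full((512, 512), 2.0)   # pixels, e.g. from the depth module
mask = np.zeros((512, 512), dtype=bool)
mask[200:300, 200:300] = True          # e.g. from the segmentation module
depth = disparity_to_depth(disparity, focal_length_px=1000.0, baseline_m=0.001)
print(f"object distance: {object_distance(depth, mask):.2f} m")

The median over the mask is one robust choice for summarizing an object's distance; a real system might instead report the minimum depth (closest point) for collision avoidance.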