FaDIV-Syn: Fast Depth-Independent View Synthesis using Soft Masks and Implicit Blending

Andre Rochow, Max Schwarz, Michael Weinmann, Sven Behnke
Published in: Robotics: Science and Systems XVIII
DOI: 10.15607/rss.2022.xviii.054 (https://doi.org/10.15607/rss.2022.xviii.054)
Publication date (as listed): 2021-06-24

Abstract

Novel view synthesis is required in many robotic applications, such as VR teleoperation and scene reconstruction. Existing methods are often too slow for these contexts, cannot handle dynamic scenes, and are limited by their explicit depth estimation stage, where incorrect depth predictions can lead to large projection errors. Our proposed method runs in real time on live streaming data and avoids explicit depth estimation by efficiently warping input images into the target frame for a range of assumed depth planes. The resulting plane sweep volume (PSV) is fed directly into our network, which first estimates soft PSV masks in a self-supervised manner and then directly produces the novel output view. This improves efficiency and performance on transparent, reflective, thin, and featureless scene parts. FaDIV-Syn can perform both interpolation and extrapolation tasks at 540p in real time and outperforms state-of-the-art extrapolation methods on the large-scale RealEstate10k dataset. We thoroughly evaluate ablations, such as removing the Soft-Masking network and training from fewer examples, as well as generalization to higher resolutions and stronger depth discretization. Our implementation is available.
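
The warping step mentioned in the abstract (projecting input images into the target frame for a range of assumed depth planes) can be illustrated with the standard plane-induced homography. The following is a minimal NumPy sketch and not the authors' implementation: the function names, the pose convention (R, t map target-camera coordinates to source-camera coordinates), the inverse-depth plane spacing, the nearest-neighbour sampling, and the equal source/target resolution are all assumptions made for illustration.

```python
import numpy as np

def inverse_depth_planes(near, far, num_planes):
    """Plane depths spaced uniformly in inverse depth (an assumed choice)."""
    return 1.0 / np.linspace(1.0 / near, 1.0 / far, num_planes)

def plane_homography(K_src, K_tgt, R, t, depth):
    """Homography taking target pixels to source pixels for the plane
    z = depth, fronto-parallel in the target camera.
    Assumed convention: X_src = R @ X_tgt + t."""
    n = np.array([0.0, 0.0, 1.0])  # plane normal in the target frame
    return K_src @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K_tgt)

def warp_to_target(image, H):
    """Backward warp with nearest-neighbour lookup. `image` has shape
    (h, w, c); pixels that fall outside the source image stay zero."""
    h, w, c = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    p = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # 3 x N
    q = H @ p
    valid = q[2] > 1e-9  # keep only points in front of the source camera
    u = np.full(h * w, -1.0)
    v = np.full(h * w, -1.0)
    u[valid] = q[0, valid] / q[2, valid]
    v[valid] = q[1, valid] / q[2, valid]
    ui = np.rint(u).astype(int)
    vi = np.rint(v).astype(int)
    inside = valid & (ui >= 0) & (ui < w) & (vi >= 0) & (vi < h)
    out = np.zeros((h * w, c), dtype=image.dtype)
    out[inside] = image[vi[inside], ui[inside]]
    return out.reshape(h, w, c)

def build_psv(images, Ks, poses, K_tgt, depths):
    """Stack every input view warped onto every depth plane.
    Returns an array of shape (num_planes, num_views, h, w, c)."""
    return np.stack([
        np.stack([
            warp_to_target(img, plane_homography(K, K_tgt, R, t, d))
            for img, K, (R, t) in zip(images, Ks, poses)
        ])
        for d in depths
    ])
```

In this sketch each slice of the volume holds the input views as they would appear if the whole scene lay on one depth plane. A network operating on such a volume can then weight the slices per pixel instead of committing to a single explicit depth estimate.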