Recurrent Multiscale Feature Modulation for Geometry Consistent Depth Learning

Zhongkai Zhou, Xinnan Fan, Pengfei Shi, Yuanxue Xin, Dongliang Duan, Liuqing Yang
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 46, no. 12, pp. 9551-9566
DOI: 10.1109/TPAMI.2024.3420165
Published: 2024-06-27
URL: https://ieeexplore.ieee.org/document/10574331/

Abstract

The U-Net-like coarse-to-fine network design is currently the dominant choice for dense prediction tasks. Although this design often achieves competitive performance, it suffers from inherent limitations, such as the propagation of training error from low to high resolution and a dependence on deeper, heavier backbones. To design an effective network that performs better, we instead propose Recurrent Multiscale Feature Modulation (R-MSFM), a new lightweight network design for self-supervised monocular depth estimation. R-MSFM extracts per-pixel features, builds a multiscale feature modulation module, and performs recurrent depth refinement through a parameter-shared decoder at a fixed resolution. This design keeps R-MSFM lightweight and fundamentally avoids the error propagation caused by the coarse-to-fine design. Furthermore, we introduce a mask geometry consistency loss to enable geometry-consistent depth learning in R-MSFM. This loss penalizes inconsistency between the depths estimated for adjacent views within non-occluded and non-stationary regions. Experimental results demonstrate the superiority of our proposed R-MSFM in both model size and inference speed, and show state-of-the-art results on two datasets: KITTI and Make3D.
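The abstract does not give the exact formulation of the mask geometry consistency loss; the sketch below is only an illustration of the general idea, assuming a normalized depth-difference term (a common choice in self-supervised depth work) and a precomputed binary mask over non-occluded, non-stationary pixels. The function name and the warping step are hypothetical, not taken from the paper.

```python
import numpy as np

def masked_geometry_consistency_loss(depth_t, depth_t_warped, mask):
    """Illustrative sketch of a masked geometry consistency loss.

    depth_t        : (H, W) depth predicted for the current view
    depth_t_warped : (H, W) depth of an adjacent view, assumed already
                     warped into the current view's frame
    mask           : (H, W) binary mask, 1 on non-occluded and
                     non-stationary pixels, 0 elsewhere
    """
    # Normalized absolute depth difference: |d1 - d2| / (d1 + d2),
    # which lies in [0, 1) and is scale-balanced.
    diff = np.abs(depth_t - depth_t_warped) / (depth_t + depth_t_warped)
    # Average the penalty only over the pixels the mask keeps, so
    # occluded and stationary regions contribute no gradient.
    return (diff * mask).sum() / np.maximum(mask.sum(), 1.0)
```

Masking out occluded and stationary regions matters because depths there legitimately disagree between views (occlusion) or violate the static-scene assumption (moving objects), so penalizing them would corrupt training.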