NeSLAM: Neural Implicit Mapping and Self-Supervised Feature Tracking With Depth Completion and Denoising

IF 6.4 | CAS Zone 2, Computer Science | JCR Q1, Automation & Control Systems | IEEE Transactions on Automation Science and Engineering | Pub Date: 2025-02-11 | DOI: 10.1109/TASE.2025.3541064

Tianchen Deng;Yanbo Wang;Hongle Xie;Hesheng Wang;Rui Guo;Jingchuan Wang;Danwei Wang;Weidong Chen

IEEE Transactions on Automation Science and Engineering, vol. 22, pp. 12309-12321. Publication type: Journal Article. Open access: no. Source: https://ieeexplore.ieee.org/document/10879467/

Citations: 0

Abstract

In recent years, 3D reconstruction and dense RGB-D SLAM systems have advanced significantly. One notable development is the application of Neural Radiance Fields (NeRF) in these systems, which use implicit neural representations to encode 3D scenes. However, depth images from consumer-grade RGB-D sensors are often sparse and noisy, which poses significant challenges for 3D reconstruction and degrades the accuracy of the scene geometry representation. Furthermore, existing methods select random pixels for camera tracking, leading to inaccurate localization in real-world indoor environments. To this end, we present NeSLAM, a framework that achieves accurate and dense depth estimation, robust camera tracking, and realistic novel view synthesis. First, a depth completion and denoising network is designed to provide a dense geometry prior and guide the optimization of the neural implicit representation. Second, we propose a NeRF-based self-supervised feature tracking algorithm for robust real-time tracking. Experiments on various indoor datasets demonstrate the effectiveness and accuracy of the system in reconstruction, tracking quality, and novel view synthesis.

Note to Practitioners: Traditional SLAM methods usually represent the scene with a sparse point cloud, which limits their scene representation capability. Our method combines a neural implicit representation with a depth completion and denoising network and a feature tracking method, achieving accurate scene reconstruction and pose estimation in various indoor scenes. The depth completion and denoising network provides accurate depth information together with depth uncertainty, which is used to improve geometric consistency. The NeRF-based self-supervised feature tracking method improves the accuracy and robustness of camera tracking. The experimental results demonstrate the accuracy and effectiveness of this method in different scenes.
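The abstract does not give the loss used to combine the depth prior with its uncertainty. As a rough illustration only, one common way to let a per-pixel uncertainty guide optimization is to down-weight depth residuals where the prior is less certain. The sketch below assumes a Gaussian-style weighting; the function name `uncertainty_weighted_depth_loss` and its parameters are hypothetical and are not taken from the paper.

```python
import numpy as np


def uncertainty_weighted_depth_loss(rendered, prior_depth, prior_std, eps=1e-6):
    """Mean squared depth residual, divided per pixel by the prior variance.

    Assumption (not from the paper): the completion network outputs a depth
    estimate and a standard deviation for each pixel, and residuals are
    weighted by the inverse variance so uncertain pixels contribute less.
    """
    rendered = np.asarray(rendered, dtype=np.float64)
    prior_depth = np.asarray(prior_depth, dtype=np.float64)
    variance = np.asarray(prior_std, dtype=np.float64) ** 2 + eps  # eps avoids division by zero
    residual_sq = (rendered - prior_depth) ** 2
    return float(np.mean(residual_sq / variance))


# Usage: the same residuals are penalized less when the prior is uncertain.
rendered = np.array([1.0, 2.0])
prior = np.array([1.1, 2.5])
loss_confident = uncertainty_weighted_depth_loss(rendered, prior, np.array([0.1, 0.1]))
loss_uncertain = uncertainty_weighted_depth_loss(rendered, prior, np.array([1.0, 1.0]))
```

With this weighting, a noisy sensor region with a large predicted standard deviation pulls the rendered depth toward the prior much more weakly than a confident region does.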




Source journal

IEEE Transactions on Automation Science and Engineering (Engineering & Technology: Automation and Control Systems)

CiteScore: 12.50

Self-citation rate: 14.30%

Articles published: 404

Review time: 3.0 months

Journal description: The IEEE Transactions on Automation Science and Engineering (T-ASE) publishes fundamental papers on Automation, emphasizing scientific results that advance efficiency, quality, productivity, and reliability. T-ASE encourages interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, operations research, and other fields. T-ASE welcomes results relevant to industries such as agriculture, biotechnology, healthcare, home automation, maintenance, manufacturing, pharmaceuticals, retail, security, service, supply chains, and transportation. T-ASE addresses a research community willing to integrate knowledge across disciplines and industries. For this purpose, each paper includes a Note to Practitioners that summarizes how its results can be applied or how they might be extended to apply in practice.