CMFormer: Non-line-of-sight imaging with a memory-efficient MetaFormer network

IF 3.7 | JCR Q2 (Optics) | CAS Zone 2 (Engineering & Technology) | Optics and Lasers in Engineering | Pub Date: 2025-04-01 | Epub Date: 2025-02-12 | DOI: 10.1016/j.optlaseng.2025.108875
Shihao Zhang, Shaohui Jin, Hao Liu, Yue Li, Xiaoheng Jiang, Mingliang Xu
{"title":"CMFormer: Non-line-of-sight imaging with a memory-efficient MetaFormer network","authors":"Shihao Zhang ,&nbsp;Shaohui Jin ,&nbsp;Hao Liu ,&nbsp;Yue Li ,&nbsp;Xiaoheng Jiang ,&nbsp;Mingliang Xu","doi":"10.1016/j.optlaseng.2025.108875","DOIUrl":null,"url":null,"abstract":"<div><div>Non-line-of-sight (NLOS) imaging aims to overcome the limitation of traditional sensors that can only detect targets within the line of sight. While existing NLOS imaging algorithms have achieved notable imaging quality, they are constrained by significant memory requirements due to the 3D nature of transient measurements. In this paper, we propose a new memory-efficient MetaFormer-based NLOS imaging method, named CMFormer, which enables NLOS imaging with lower memory usage and faster imaging speed, facilitating deployment on consumer-grade GPUs. Specifically, we design a lightweight module based on MetaFormer, which employs multi-dimensional global convolution and multi-scale dilated convolution as token mixers. This approach leverages the strong temporal-spatial correlation more effectively without separating the transient data into distinct temporal and spatial components for feature extraction. With the unique characteristics of this token mixer, we propose aggregate feature transmission to replace conventional skip connections, achieving better performance without needing to increase network width at the decoder stage. Additionally, to mitigate the loss of important detail features during downsampling, we design a cross-layer integration attention module to enhance the interaction between the adjacent hierarchical features. Leveraging gradient checkpointing technology, the proposed method can be easily trained and inferred on consumer-grade GPUs, significantly less than the current best imaging algorithm NLOST, and achieves an imaging speed of 8 FPS. We employ the UNet hierarchical structure to build our pipeline, ensuring that our network can better denoise and enhance generalization to real-world scenarios even when trained on synthetic datasets. Extensive experimental results demonstrate that our method achieves the best performance on both synthetic and real-world data with low memory cost and higher imaging speed. The code will be released soon.</div></div>","PeriodicalId":49719,"journal":{"name":"Optics and Lasers in Engineering","volume":"187 ","pages":"Article 108875"},"PeriodicalIF":3.7000,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Optics and Lasers in Engineering","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0143816625000624","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/2/12 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"OPTICS","Score":null,"Total":0}
Citations: 0

Abstract

Non-line-of-sight (NLOS) imaging aims to overcome the limitation of traditional sensors, which can only detect targets within the line of sight. While existing NLOS imaging algorithms have achieved notable imaging quality, they are constrained by significant memory requirements due to the 3D nature of transient measurements. In this paper, we propose a new memory-efficient MetaFormer-based NLOS imaging method, named CMFormer, which enables NLOS imaging with lower memory usage and faster imaging speed, facilitating deployment on consumer-grade GPUs. Specifically, we design a lightweight module based on MetaFormer, which employs multi-dimensional global convolution and multi-scale dilated convolution as token mixers. This approach exploits the strong temporal-spatial correlation of the transient data more effectively, without separating it into distinct temporal and spatial components for feature extraction. Building on the unique characteristics of this token mixer, we propose aggregate feature transmission to replace conventional skip connections, achieving better performance without increasing network width at the decoder stage. Additionally, to mitigate the loss of important detail features during downsampling, we design a cross-layer integration attention module to enhance the interaction between adjacent hierarchical features. Leveraging gradient checkpointing, the proposed method can be easily trained and run for inference on consumer-grade GPUs, with a memory footprint significantly lower than that of the current best imaging algorithm, NLOST, while achieving an imaging speed of 8 FPS. We adopt the UNet hierarchical structure to build our pipeline, ensuring that the network denoises effectively and generalizes to real-world scenarios even when trained on synthetic datasets. Extensive experimental results demonstrate that our method achieves the best performance on both synthetic and real-world data with low memory cost and higher imaging speed. The code will be released soon.
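To make the memory-saving ideas in the abstract concrete, below is a minimal, hypothetical PyTorch sketch (not the authors' released code) of a MetaFormer-style block whose token mixer is a multi-scale dilated 3D convolution over a transient volume (time x height x width), wrapped with gradient checkpointing so that activations are recomputed during backpropagation instead of stored. All names (e.g., MultiScaleDilatedMixer), channel counts, kernel sizes, and dilation rates are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of a MetaFormer block with a multi-scale dilated 3D
# convolution token mixer and gradient checkpointing. Dimensions and module
# names are assumptions for illustration only.
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint


class MultiScaleDilatedMixer(nn.Module):
    """Token mixer: parallel depthwise dilated 3D convs fused by a 1x1x1 conv."""

    def __init__(self, channels: int, dilations=(1, 2, 3)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv3d(channels, channels, kernel_size=3,
                      padding=d, dilation=d, groups=channels)
            for d in dilations
        ])
        self.fuse = nn.Conv3d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x):  # x: (B, C, T, H, W)
        return self.fuse(torch.cat([branch(x) for branch in self.branches], dim=1))


class MetaFormerBlock(nn.Module):
    """MetaFormer layout: norm -> token mixer -> residual, norm -> channel MLP -> residual."""

    def __init__(self, channels: int, mlp_ratio: int = 2):
        super().__init__()
        self.norm1 = nn.GroupNorm(1, channels)
        self.mixer = MultiScaleDilatedMixer(channels)
        self.norm2 = nn.GroupNorm(1, channels)
        self.mlp = nn.Sequential(
            nn.Conv3d(channels, channels * mlp_ratio, kernel_size=1),
            nn.GELU(),
            nn.Conv3d(channels * mlp_ratio, channels, kernel_size=1),
        )

    def _inner(self, x):
        x = x + self.mixer(self.norm1(x))
        return x + self.mlp(self.norm2(x))

    def forward(self, x):
        # Gradient checkpointing: intermediate activations are recomputed in
        # the backward pass rather than kept in memory, lowering peak usage.
        if self.training and x.requires_grad:
            return checkpoint(self._inner, x, use_reentrant=False)
        return self._inner(x)


if __name__ == "__main__":
    block = MetaFormerBlock(channels=8)
    transient = torch.randn(1, 8, 32, 32, 32, requires_grad=True)  # toy-sized volume
    print(block(transient).shape)  # torch.Size([1, 8, 32, 32, 32])
```

The design choice mirrors the abstract's claim: the 3D token mixer operates on the full transient volume without splitting it into temporal and spatial streams, while checkpointing trades extra forward computation for a reduced activation-memory footprint during training.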
Source journal
Optics and Lasers in Engineering (Engineering & Technology – Optics)
CiteScore: 8.90
Self-citation rate: 8.70%
Articles per year: 384
Review time: 42 days
Journal description: Optics and Lasers in Engineering aims at providing an international forum for the interchange of information on the development of optical techniques and laser technology in engineering. Emphasis is placed on contributions targeted at the practical use of methods and devices, the development and enhancement of solutions, and new theoretical concepts for experimental methods. Optics and Lasers in Engineering reflects the main areas in which optical methods are being used and developed for an engineering environment. Manuscripts should offer clear evidence of novelty and significance. Papers focusing on parameter optimization or computational issues are not suitable. Similarly, papers focused on an application rather than the optical method fall outside the journal's scope. The scope of the journal is defined to include the following:
- Optical Metrology
- Optical Methods for 3D visualization and virtual engineering
- Optical Techniques for Microsystems
- Imaging, Microscopy and Adaptive Optics
- Computational Imaging
- Laser methods in manufacturing
- Integrated optical and photonic sensors
- Optics and Photonics in Life Science
- Hyperspectral and spectroscopic methods
- Infrared and Terahertz techniques
Latest articles in this journal
- Generative reconstruction of photoelastic fringe patterns for transparent components using pressure-derived latent features
- Enhanced gray code pattern for high dynamic range three-dimensional measurement
- High stability and accuracy 543.5 nm laser referenced to optical frequency comb by PPLN crystal frequency doubling
- Accurate distributed fiber-optic disturbance sensing in phase-sensitive OTDR system with an improved PGC-based demodulation scheme
- Precise internal transmittance measurements of highly transparent optical materials at 355 nm with pulsed cavity ring-down technique