Depth on Demand: Streaming Dense Depth from a Low Frame Rate Active Sensor

arXiv - CS - Computer Vision and Pattern Recognition | Pub Date: 2024-09-12 | DOI: arxiv-2409.08277

Andrea Conti, Matteo Poggi, Valerio Cambareri, Stefano Mattoccia


Citations: 0

Abstract

Depth on Demand: Streaming Dense Depth from a Low Frame Rate Active Sensor

High frame rate and accurate depth estimation plays an important role in several tasks crucial to robotics and automotive perception. To date, this can be achieved through ToF and LiDAR devices for indoor and outdoor applications, respectively. However, their applicability is limited by low frame rate, energy consumption, and spatial sparsity. Depth on Demand (DoD) allows for accurate temporal and spatial depth densification achieved by exploiting a high frame rate RGB sensor coupled with a potentially lower frame rate and sparse active depth sensor. Our proposal jointly enables lower energy consumption and denser shape reconstruction, by significantly reducing the streaming requirements on the depth sensor thanks to its three core stages: i) multi-modal encoding, ii) iterative multi-modal integration, and iii) depth decoding. We present extended evidence assessing the effectiveness of DoD on indoor and outdoor video datasets, covering both environment scanning and automotive perception use cases.
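The abstract does not say how measurements from the slower depth sensor are related to the more frequent RGB frames. As a purely illustrative sketch, and not the authors' method, one common geometric building block for this kind of temporal densification is to reproject the sparse depth points from the last depth frame into the current RGB view. The function name `reproject_sparse_depth`, the pinhole intrinsics `fx, fy, cx, cy`, and the known relative pose `R, t` are all assumptions introduced here for illustration.

```python
from typing import Dict, List, Tuple

Pixel = Tuple[int, int]  # (u, v) = (column, row)


def reproject_sparse_depth(
    sparse_depth: Dict[Pixel, float],
    fx: float, fy: float, cx: float, cy: float,
    R: List[List[float]], t: List[float],
    width: int, height: int,
) -> Dict[Pixel, float]:
    """Warp sparse depth samples from a source view into a target view.

    Hypothetical helper: assumes a pinhole camera shared by both views and a
    known rigid transform (R, t) from the source to the target camera frame.
    """
    warped: Dict[Pixel, float] = {}
    for (u, v), z in sparse_depth.items():
        if z <= 0:
            continue  # skip invalid measurements
        # Back-project the pixel to a 3D point in the source camera frame.
        p = ((u - cx) / fx * z, (v - cy) / fy * z, z)
        # Apply the rigid transform to reach the target camera frame.
        xt, yt, zt = (
            sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)
        )
        if zt <= 0:
            continue  # point ends up behind the target camera
        # Project onto the target image plane and snap to the pixel grid.
        ut = round(fx * xt / zt + cx)
        vt = round(fy * yt / zt + cy)
        if 0 <= ut < width and 0 <= vt < height:
            # If two points land on the same pixel, keep the nearer one.
            if (ut, vt) not in warped or zt < warped[(ut, vt)]:
                warped[(ut, vt)] = zt
    return warped
```

In a pipeline of the kind the abstract describes, a warped sparse map like this could serve as one input alongside the current RGB frame. The multi-modal encoding, iterative integration, and depth decoding stages themselves are not sketched here, since the abstract gives no detail on them.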
