Robust Classification and 6D Pose Estimation by Sensor Dual Fusion of Image and Point Cloud Data

ACM Transactions on Sensor Networks (IF 3.9; Zone 4, Computer Science; JCR Q2, Computer Science, Information Systems). Pub Date: 2024-01-05. DOI: 10.1145/3639705

Yaming Xu, Yan Wang, Boliang Li

Citations: 0

Robust Classification and 6D Pose Estimation by Sensor Dual Fusion of Image and Point Cloud Data

Fully leveraging the complementary strengths of image and point cloud sensors is important for object classification and 6D pose estimation. Prior works extract object categories from a single sensor, such as an RGB camera or LiDAR, which limits their robustness when a key sensor is severely blocked or fails. In this work, we present a robust object classification and 6D object pose estimation strategy based on dual fusion of image and point cloud data. Instead of relying solely on 3D proposals or mature 2D object detectors, our model deeply integrates 2D and 3D information from heterogeneous data sources through a robust dual fusion network and an attention-based nonlinear fusion function Attn-fun(.), achieving efficient and highly accurate classification even when some data sources are missing. Our method can also precisely estimate the transformation matrix between two input objects by minimizing the feature difference, which yields the 6D object pose even under strong noise or with outliers. We deploy the proposed method not only on the ModelNet40 dataset but also on a real fusion vision rotating platform that tracks objects in outer space based on the estimated pose.
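The abstract does not define Attn-fun(.), so the following is only a generic sketch of how attention-weighted fusion can tolerate a missing modality, not the authors' implementation. The function name `attention_fuse`, the `query` vector, and the scaled dot-product scoring are all assumptions made for illustration.

```python
import numpy as np

def attention_fuse(features, available, query):
    """Fuse per-modality feature vectors with softmax attention.

    features : (M, D) array, one D-dim feature vector per modality
    available: (M,) boolean array, False where a sensor is missing
    query    : (D,) scoring vector (hypothetical; learned in practice)
    """
    features = np.asarray(features, dtype=float)
    available = np.asarray(available, dtype=bool)
    query = np.asarray(query, dtype=float)
    if not available.any():
        raise ValueError("at least one modality must be available")

    # Scaled dot-product score for each modality.
    scores = features @ query / np.sqrt(features.shape[1])
    # Missing modalities get -inf, so their softmax weight is exactly zero.
    scores = np.where(available, scores, -np.inf)
    scores = scores - scores[available].max()  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum()

    # Zero out missing rows so placeholder values (e.g. NaN) cannot leak in.
    masked = np.where(available[:, None], features, 0.0)
    fused = weights @ masked
    return fused, weights

# Hypothetical image and point cloud features of equal dimension.
image_feat = [1.0, 0.0, 2.0]
cloud_feat = [0.0, 1.0, 2.0]
fused, weights = attention_fuse([image_feat, cloud_feat], [True, True], np.ones(3))
# Image sensor blocked: the fused vector falls back to the point cloud feature.
fused_pc, weights_pc = attention_fuse([image_feat, cloud_feat], [False, True], np.ones(3))
```

Masking the scores before the softmax, rather than zeroing the features afterwards, keeps the remaining weights normalized, so the fused vector stays on the same scale whether one or both sensors contribute.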

Source journal: ACM Transactions on Sensor Networks (Engineering & Technology, Telecommunications)

CiteScore: 5.90
Self-citation rate: 7.30%
Articles published: 131
Review time: 6 months

Journal description: ACM Transactions on Sensor Networks (TOSN) is a central publication by the ACM in the interdisciplinary area of sensor networks, spanning a broad range of disciplines from signal processing, networking and protocols, embedded systems, and information management to distributed algorithms. It covers research contributions that introduce new concepts, techniques, analyses, or architectures, as well as applied contributions that report on the development of new tools and systems or on experiences and experiments with high-impact, innovative applications. The Transactions places special attention on contributions to systemic approaches to sensor networks as well as fundamental contributions.