Role of Simulated Lidar Data for Training 3D Deep Learning Models: An Exhaustive Analysis

Journal of the Indian Society of Remote Sensing. Pub Date: 2024-06-27. DOI: 10.1007/s12524-024-01905-2. IF 2.2; JCR Q3 (Environmental Sciences); Zone 4 (Earth Sciences).

Bharat Lohani, Parvej Khan, Vaibhav Kumar, Siddhartha Gupta

Citations: 0

Abstract

The use of 3D Deep Learning (DL) models for LiDAR data segmentation has attracted much interest in recent years. However, generating labeled point cloud data, a prerequisite for training DL models, is highly resource-intensive. Simulated LiDAR data, which are labeled by construction, provide a cost-effective alternative, but their efficacy and usefulness must be evaluated. This paper examines the role of simulated LiDAR point clouds in training DL models. A high-fidelity 3D terrain model representing the real environment is developed, and the in-house physics-based simulator “Limulator” is used to generate labeled point clouds through various realizations. The paper sets out a few major hypotheses to assess the usefulness of simulated data in training DL models. The hypotheses address three uses of simulated data: alone, in combination with real data, and with strategic boosting of minor classes. Several experiments are carried out to test these hypotheses. Each experiment trains a DL model, PointCNN in this case, on a particular combination of simulated and real LiDAR data and measures how well it segments the test data. Results show that training on simulated data alone can produce an overall accuracy (OA) of 89% and a weighted-average F1 score of 88.81%. Training on a combination of simulated and real data can achieve accuracies comparable to those obtained with a large quantity of real data alone. Strategic boosting of minor classes in simulated data improves the accuracies of minor classes by up to 23% compared with training on real data alone. Because simulated data are easy to generate and have a positive impact on segmentation accuracy, training DL models on them can be highly beneficial when applying DL to LiDAR data. The use of simulated data for training has the potential to minimize the resource-intensive exercise of developing labeled real data.
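The two headline metrics in the abstract, overall accuracy (OA) and the weighted-average F1 score, can be illustrated with a minimal NumPy sketch. It assumes the conventional definitions (OA as the fraction of correctly labeled points; per-class F1 weighted by each class's number of true points). The abstract does not describe the paper's exact evaluation protocol, and the function names and toy labels below are hypothetical, not taken from the paper.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count points per (true class, predicted class) pair."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    # np.add.at accumulates correctly when the same (row, col) pair repeats.
    np.add.at(cm, (y_true, y_pred), 1)
    return cm


def overall_accuracy(cm):
    """Fraction of all points whose predicted class equals the true class."""
    return float(np.trace(cm) / cm.sum())


def weighted_f1(cm):
    """Per-class F1, averaged with weights equal to each class's true point count."""
    tp = np.diag(cm).astype(float)
    support = cm.sum(axis=1).astype(float)    # true points per class
    predicted = cm.sum(axis=0).astype(float)  # predicted points per class
    denom = support + predicted               # equals 2*TP + FP + FN
    # F1 = 2*TP / (2*TP + FP + FN); set to 0 for a class with no points at all.
    f1 = np.divide(2.0 * tp, denom, out=np.zeros_like(tp), where=denom > 0)
    return float(np.sum(f1 * support) / support.sum())


# Hypothetical labels for 10 points and 3 classes (illustrative only).
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 2])
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 2, 2])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
oa = overall_accuracy(cm)
wf1 = weighted_f1(cm)
print(f"OA = {oa:.4f}, weighted F1 = {wf1:.4f}")
```

Because the weights follow class size, a weighted F1 can stay high while small classes are segmented poorly, so per-class scores are worth inspecting alongside it.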

Source journal

Journal of the Indian Society of Remote Sensing (Environmental Sciences, Remote Sensing)

CiteScore: 4.80
Self-citation rate: 8.00%
Articles published: 163
Review time: 7 months

Journal description: The Journal of the Indian Society of Remote Sensing aims to help advance, disseminate and apply knowledge of remote sensing technology. This is taken to include photo interpretation, photogrammetry, aerial photography, image processing and related technologies, as used in the survey, planning and management of natural resources and in other areas where the technology is considered appropriate. The journal also aims to promote interaction among all persons, bodies, institutions (private and/or state-owned) and industries interested in advancing, disseminating and applying the technology; to encourage and undertake research in remote sensing and related technologies; and to undertake and execute all acts that promote any of the aims and objectives of the Indian Society of Remote Sensing.