Diego Teijeiro Paredes, Margarita Amor López, Sandra Buján, Rico Richter, Jürgen Döllner
The Journal of Supercomputing, published 2024-08-17. DOI: 10.1007/s11227-024-06406-0 (https://doi.org/10.1007/s11227-024-06406-0)
Multistage strategy for ground point filtering on large-scale datasets
Ground point filtering on national-level datasets is challenging because such datasets span multiple types of landscape. This limitation affects not only individual users; it is particularly relevant for the national institutions in charge of providing national-level Light Detection and Ranging (LiDAR) point clouds. Each type of landscape is typically filtered best by a different algorithm or parameter set; therefore, to obtain the best classification quality, the LiDAR point cloud should be divided by landscape type before the filtering algorithms are run. Although manual segmentation and identification of landscapes can be very time-intensive, only a few studies have addressed this issue. In this work, we present a multistage approach that automates the identification of the landscape type using several metrics extracted from the LiDAR point cloud and matches each landscape type with its best-performing filtering algorithm. As an additional contribution, we present a parallel implementation for distributed-memory systems, built on Apache Spark, that achieves a speedup of up to \(34\times\) using 12 compute nodes.
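The multistage idea described in the abstract (extract metrics per tile, label the landscape, then dispatch to a filter suited to that label) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two metrics (`relief`, `roughness`), the thresholds, the landscape labels, and the stand-in `lowest_fraction_filter`, which merely keeps the lowest points and is not a real ground-filtering algorithm.

```python
from dataclasses import dataclass
from statistics import pstdev
from typing import Callable, Dict, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z); units assumed to be metres
GroundFilter = Callable[[List[Point]], List[Point]]


@dataclass(frozen=True)
class TileMetrics:
    relief: float     # max z minus min z within the tile
    roughness: float  # population standard deviation of z


def tile_metrics(points: List[Point]) -> TileMetrics:
    """Stage 1: compute simple (hypothetical) metrics for one tile."""
    zs = [p[2] for p in points]
    return TileMetrics(relief=max(zs) - min(zs), roughness=pstdev(zs))


def classify_landscape(m: TileMetrics) -> str:
    """Stage 2: label the tile. Thresholds are illustrative assumptions."""
    if m.relief < 5.0:
        return "flat"
    if m.roughness < 10.0:
        return "hilly"
    return "mountainous"


def lowest_fraction_filter(fraction: float) -> GroundFilter:
    """Stand-in filter: keep the lowest `fraction` of points by height."""
    def run(points: List[Point]) -> List[Point]:
        ordered = sorted(points, key=lambda p: p[2])
        keep = max(1, int(len(ordered) * fraction))
        return ordered[:keep]
    return run


# Stage 3: one filter configuration per landscape label (assumed mapping).
FILTER_BY_LANDSCAPE: Dict[str, GroundFilter] = {
    "flat": lowest_fraction_filter(0.75),
    "hilly": lowest_fraction_filter(0.5),
    "mountainous": lowest_fraction_filter(0.25),
}


def filter_tile(points: List[Point]) -> Tuple[str, List[Point]]:
    """Run all three stages on a single tile."""
    label = classify_landscape(tile_metrics(points))
    return label, FILTER_BY_LANDSCAPE[label](points)


if __name__ == "__main__":
    tile = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (0.0, 1.0, 2.0), (1.0, 1.0, 3.0)]
    label, ground = filter_tile(tile)
    print(label, len(ground))
```

Because `filter_tile` depends only on the points of its own tile, tiles can be processed independently. That independence is the property a distributed implementation would rely on, although this sketch does not reproduce the Apache Spark implementation reported in the paper.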