{"title":"ISPRS ICWG IV/III, WG IV/11: Workshop on 3D digital modelling for SDGs","authors":"","doi":"10.1111/phor.1_12501","DOIUrl":"https://doi.org/10.1111/phor.1_12501","url":null,"abstract":"in","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"134 S231","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141413950","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"31st international conference on Geoinformatics","authors":"","doi":"10.1111/phor.3_12501","DOIUrl":"https://doi.org/10.1111/phor.3_12501","url":null,"abstract":"","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"124 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141402507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Innsbruck Summer School of alpine research—close range sensing techniques in alpine terrain","authors":"","doi":"10.1111/phor.4_12501","DOIUrl":"https://doi.org/10.1111/phor.4_12501","url":null,"abstract":"","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"84 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141408879","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"International workshop on ‘Photogrammetric Data Analysis’","authors":"","doi":"10.1111/phor.5_12501","DOIUrl":"https://doi.org/10.1111/phor.5_12501","url":null,"abstract":"","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"13 12","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141409220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Indoor hierarchy relation graph construction method based on RGB‐D","authors":"Jianwu Jiang, Zhizhong Kang, Jingwen Li","doi":"10.1111/phor.12499","DOIUrl":"https://doi.org/10.1111/phor.12499","url":null,"abstract":"Fine‐grained indoor navigation services require obstacle‐level indoor maps to support, but since indoor environments are affected by human activities, resulting in frequent changes in indoor spatial layouts, and indoor environments are easily affected by light and occlusion, the vast majority of indoor maps are at room level, limiting indoor obstacle‐level navigation path planning. To solve this problem, this paper proposes a hierarchy relation graph (HRG) construction method based on RGB‐D. Firstly, the semantic information extraction of indoor scenes and elements is realized by the output transformed PSPNet and YOLO V8 models, and the bounding box of each element is obtained based on YOLO V8. Then an algorithm for determining the hierarchical relationship of indoor elements is proposed, which calculates the correlation between the two elements from both plane and depth dimensions and constructs a HRG of indoor elements based on directed trees. Finally, comparative experiments are designed to validate the proposed method. Experiments showed that the proposed method can construct HRGs in a variety of scenes; the hierarchy relation detection rate is 88.28%; the accuracy of hierarchy relation determination is 73.44%; and the single‐scene HRG can be generated in 3.81 s.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"20 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141152151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hierarchical occupancy network with multi‐height attention for vision‐centric 3D occupancy prediction","authors":"Can Li, Zhi Gao, Zhipeng Lin, Tonghui Ye, Ziyao Li","doi":"10.1111/phor.12500","DOIUrl":"https://doi.org/10.1111/phor.12500","url":null,"abstract":"The precise geometric representation and ability to handle long‐tail targets have led to the increasing attention towards vision‐centric 3D occupancy prediction, which models the real world as a voxel‐wise model solely through visual inputs. Despite some notable achievements in this field, many prior or concurrent approaches simply adapt existing spatial cross‐attention (SCA) as their 2D–3D transformation module, which may lead to informative coupling or compromise the global receptive field along the height dimension. To overcome these limitations, we propose a hierarchical occupancy (HierOcc) network featuring our innovative height‐aware cross‐attention (HACA) and hierarchical self‐attention (HSA) as its core modules to achieve enhanced precision and completeness in 3D occupancy prediction. The former module enables 2D–3D transformation, while the latter promotes voxels’ intercommunication. The key insight behind both modules is our multi‐height attention mechanism which ensures each attention head corresponds explicitly to a specific height, thereby decoupling height information while maintaining global attention across the height dimension. Extensive experiments show that our method brings significant improvements compared to baseline and surpasses all concurrent methods, demonstrating its superiority.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141063106","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cross‐attention neural network for land cover change detection with remote sensing images","authors":"Zhiyong Lv, Pingdong Zhong, Wei Wang, Weiwei Sun, Tao Lei, Falco Nicola","doi":"10.1111/phor.12492","DOIUrl":"https://doi.org/10.1111/phor.12492","url":null,"abstract":"Land cover change detection (LCCD) with remote sensing images (RSIs) is important for observing the land cover change of the Earth's surface. Considering the insufficient performance of the traditional self‐attention mechanism used in a neural network to smoothen the noise of LCCD with RSIs, in this study a novel cross‐attention neural network (CANN) was proposed for improving the performance of LCCD with RSIs. In the proposed CANN, a cross‐attention mechanism was achieved by employing another temporal image to enhance attention performance and improve detection accuracies. First, a feature difference module was embedded in the backbone of the proposed CANN to generate a change magnitude image and guide the learning progress. A self‐attention module based on the cross‐attention mechanism was then proposed and embedded in the encoder of the proposed network to make the network pay attention to the changed area. Finally, the encoded features were decoded to obtain binary change detection with the ArgMax function. Compared with five methods, the experimental results based on six pairs of real RSIs well demonstrated the feasibility and superiority of the proposed network for achieving LCCD with RSIs. For example, the improvement for overall accuracy for the six pairs of real RSIs improved by our proposed approach is about 0.72–2.56%.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"42 8","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140972905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D LiDAR SLAM: A survey","authors":"Yongjun Zhang, Pengcheng Shi, Jiayuan Li","doi":"10.1111/phor.12497","DOIUrl":"https://doi.org/10.1111/phor.12497","url":null,"abstract":"Simultaneous localization and mapping (SLAM) is a very challenging yet fundamental problem in the field of robotics and photogrammetry, and it is also a prerequisite for intelligent perception of unmanned systems. In recent years, 3D LiDAR SLAM technology has made remarkable progress. However, to the best of our knowledge, almost all existing surveys focus on visual SLAM methods. To bridge the gap, this paper provides a comprehensive review that summarizes the scientific connotation, key difficulties, research status, and future trends of 3D LiDAR SLAM, aiming to give readers a better understanding of LiDAR SLAM technology, thereby inspiring future research. Specifically, it summarizes the contents and characteristics of the main steps of LiDAR SLAM, introduces the key difficulties it faces, and gives the relationship with existing reviews; it provides an overview of current research hotspots, including LiDAR‐only methods and multi‐sensor fusion methods, and gives milestone algorithms and open‐source tools in each category; it summarizes common datasets, evaluation metrics and representative commercial SLAM solutions, and provides the evaluation results of mainstream methods on public datasets; it looks forward to the development trend of LiDAR SLAM, and considers the preliminary ideas of multi‐modal SLAM, event SLAM, and quantum SLAM.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"29 19","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140982814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hyperspectral image classification based on superpixel merging and broad learning system","authors":"Fuding Xie, Rui Wang, Cui Jin, Geng Wang","doi":"10.1111/phor.12493","DOIUrl":"https://doi.org/10.1111/phor.12493","url":null,"abstract":"Most spectral–spatial classification methods for hyperspectral images (HSIs) can achieve satisfactory classification results. However, the common problem faced with these approaches is the need for a long training time and sufficient training samples. To address this issue, this study proposes an effective spectral–spatial HSI classification method based on superpixel merging, superpixel smoothing and broad learning system (SMS‐BLS). The newly introduced parameter‐free superpixel merging technique based on local modularity not only enhances the role of local spatial information in classification, but also maintains class boundary information as much as possible. In addition, the spectral and spatial information of HSIs is further fused during the superpixel smoothing process. As a result, with limited training samples, using merged and smoothed superpixels instead of pixels as input to the broad learning system significantly improves its classification performance. Moreover, the merged superpixels weaken the dependence of the classification results on the superpixel segmentation scale. The effectiveness of the proposed method was validated on three HSI benchmarks, namely Indian Pines, Pavia University and Salinas. Experimental and comparative results show the superiority of the method to other state‐of‐the‐art approaches in terms of overall accuracy and running time.","PeriodicalId":22881,"journal":{"name":"The Photogrammetric Record","volume":"40 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-05-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140939149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}