Pub Date : 2024-10-03 DOI: 10.1109/JSTARS.2024.3472751
Yijian Duan;Liwen Meng;Yanmei Meng;Jihong Zhu;Jiacheng Zhang;Jinlai Zhang;Xin Liu
Given the inherent limitations of camera-only and LiDAR-only methods for semantic segmentation in large-scale complex environments, multimodal information fusion has become a focal point of contemporary research. However, significant modal disparities mean that existing fusion-based methods often struggle with low segmentation accuracy and limited efficiency in such environments. To address these challenges, we propose a semantic segmentation network with camera–LiDAR cross-attention fusion based on fast neighbor feature aggregation (MFSA-Net), which is better suited to large-scale semantic segmentation in complex environments. First, we propose a dual-distance attention feature aggregation module based on rapid 3-D nearest neighbor search. The module applies a sliding window to the point cloud's perspective projection for fast proximity search and efficiently combines feature-distance and Euclidean-distance information to learn more distinctive local features, improving segmentation accuracy while preserving computational efficiency. Furthermore, we propose a residual-based two-stream cross-attention fusion network that integrates camera information into the LiDAR data stream more effectively, enhancing both accuracy and robustness. Extensive experiments on the large-scale point cloud datasets SemanticKITTI and nuScenes demonstrate that the proposed algorithm outperforms comparable algorithms in semantic segmentation performance in large-scale complex environments.
"MFSA-Net: Semantic Segmentation With Camera-LiDAR Cross-Attention Fusion Based on Fast Neighbor Feature Aggregation," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 19627–19639, 2024.
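The dual-distance aggregation idea described above can be sketched as follows. This is an illustrative NumPy reimplementation under assumed tensor shapes, not the authors' code: the blend weight `alpha` and the softmax weighting scheme are assumptions, and a real implementation would operate on learned features inside the network.

```python
import numpy as np

def dual_distance_aggregate(xyz, feats, k=3, alpha=0.5):
    """Aggregate neighbor features inside a k x k sliding window of a
    range-projected point cloud, weighting each neighbor by a blend of
    Euclidean (geometric) distance and feature-space distance.

    xyz:   (H, W, 3) point coordinates under the perspective projection
    feats: (H, W, C) per-point features
    Neighbors close in both spaces receive larger softmax weights.
    """
    H, W, _ = xyz.shape
    pad = k // 2
    xyz_p = np.pad(xyz, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    f_p = np.pad(feats, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    out = np.zeros_like(feats)
    for i in range(H):
        for j in range(W):
            nx = xyz_p[i:i + k, j:j + k].reshape(-1, 3)          # window xyz
            nf = f_p[i:i + k, j:j + k].reshape(-1, feats.shape[2])
            d_e = np.linalg.norm(nx - xyz[i, j], axis=1)         # Euclidean distance
            d_f = np.linalg.norm(nf - feats[i, j], axis=1)       # feature distance
            logits = -(alpha * d_e + (1.0 - alpha) * d_f)        # nearer -> heavier
            w = np.exp(logits - logits.max())
            w /= w.sum()
            out[i, j] = (w[:, None] * nf).sum(axis=0)            # weighted mean
    return out
```

The sliding window over the 2-D projection is what makes the neighbor search fast: it avoids a KD-tree query in 3-D while still recovering approximate spatial neighbors.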
Pub Date : 2024-10-02 DOI: 10.1109/JSTARS.2024.3472296
Qinfeng Zhu;Yuan Fang;Yuanzhi Cai;Cheng Chen;Lei Fan
Deep learning methods, especially convolutional neural networks (CNNs) and vision transformers (ViTs), are frequently employed to perform semantic segmentation of high-resolution remotely sensed images. However, CNNs are constrained by their restricted receptive fields, while ViTs face challenges due to their quadratic complexity. Recently, the Mamba model, featuring linear complexity and a global receptive field, has gained extensive attention for vision tasks. In such tasks, images need to be serialized to form sequences compatible with the Mamba model. Numerous research efforts have explored scanning strategies to serialize images, aiming to enhance the Mamba model's understanding of images. However, the effectiveness of these scanning strategies remains uncertain. In this research, we conduct a comprehensive experimental investigation on the impact of mainstream scanning directions and their combinations on semantic segmentation of remotely sensed images. Through extensive experiments on the LoveDA, ISPRS Potsdam, ISPRS Vaihingen, and UAVid datasets, we demonstrate that no single scanning strategy outperforms others, regardless of their complexity or the number of scanning directions involved. A simple, single scanning direction is deemed sufficient for semantic segmentation of high-resolution remotely sensed images. Relevant directions for future research are also recommended.
"Rethinking Scanning Strategies With Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 18223–18234, 2024.
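The serialization step the study compares can be illustrated with a minimal helper. The function name and direction labels are hypothetical; in practice the scan operates on patch embeddings inside the network, and combined strategies run several such scans and merge the resulting sequences.

```python
import numpy as np

def serialize(patches, direction="row"):
    """Serialize a (H, W, C) grid of patch embeddings into an (H*W, C)
    sequence for a Mamba-style state-space model.  Illustrative sketch of
    the scanning directions compared in the study."""
    H, W, C = patches.shape
    if direction == "row":        # left-to-right, top-to-bottom
        return patches.reshape(H * W, C)
    if direction == "col":        # top-to-bottom, left-to-right
        return patches.transpose(1, 0, 2).reshape(H * W, C)
    if direction == "row_rev":    # reversed row-major (bidirectional pair)
        return patches.reshape(H * W, C)[::-1]
    raise ValueError(f"unknown direction: {direction}")
```

The study's finding is that, for high-resolution remote sensing segmentation, the choice among such orderings (or combinations of them) makes little measurable difference, so the cheap single `"row"` scan suffices.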
Pub Date : 2024-10-02 DOI: 10.1109/JSTARS.2024.3472220
Hui Liu;Bochen Zhou;Changwei Miao;Shihuan Li;Lei Xu;Ke Zheng;Geshuang Li;Shiji Yang;Mengyuan Zhu
This study introduces a new turnbuckle-adjustable corner reflector (CR) whose positive-and-negative screw structure greatly improves the CR's adjustability in both the vertical and horizontal directions, making on-site deployment considerably more convenient. Based on the performance of dihedral and trihedral CRs installed back-to-back along the South-to-North Water Diversion Channel, as observed in TerraSAR-X and Sentinel-1A images, we analyze in depth how CRs of different structures perform in complex environments, especially under heavy precipitation. The experimental results show that the trihedral CR maintains stable monitoring performance even in extreme weather with precipitation exceeding 10 mm, whereas the monitoring performance of the traditional dihedral CR drops sharply and becomes almost ineffective in such conditions. Comparing theoretical radar cross section (RCS) values with measured RCS values confirms the decisive influence of CR geometry and deployment strategy on monitoring stability and accuracy. A precise comparison of CR-InSAR monitoring results against second-order leveling shows that, with the trihedral CR, the system's average error is kept within 2–3 mm, a significant accuracy improvement over both the dihedral CR and InSAR without CRs. This study provides theoretical support and practical guidance for the design, optimization, and practical application of CR systems.
"Refinement Analysis of Real Dihedral and Trihedral CR-InSAR Based on TerraSAR-X and Sentinel-1A Images," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 18739–18750, 2024.
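For context on the theoretical RCS comparison mentioned above, the standard textbook peak-RCS formulas for triangular trihedral and dihedral reflectors can be evaluated as below. These are generic formulas and example dimensions, not values taken from the paper.

```python
import math

def rcs_triangular_trihedral(leg_m: float, wavelength_m: float) -> float:
    """Peak theoretical RCS (m^2) of a triangular trihedral corner
    reflector with inner leg length leg_m: sigma = 4*pi*a^4 / (3*lambda^2)."""
    return 4.0 * math.pi * leg_m**4 / (3.0 * wavelength_m**2)

def rcs_dihedral(a_m: float, b_m: float, wavelength_m: float) -> float:
    """Peak theoretical RCS (m^2) of a dihedral of a x b plates:
    sigma = 8*pi*(a*b)^2 / lambda^2."""
    return 8.0 * math.pi * (a_m * b_m)**2 / wavelength_m**2

# Illustrative wavelengths for the two sensors used in the study:
# TerraSAR-X is X-band (~9.65 GHz), Sentinel-1A is C-band (~5.405 GHz).
# The 0.5 m leg length is a made-up example, not the deployed CR size.
c = 299_792_458.0
for name, f_hz in [("TerraSAR-X", 9.65e9), ("Sentinel-1A", 5.405e9)]:
    lam = c / f_hz
    sigma = rcs_triangular_trihedral(0.5, lam)
    print(f"{name}: sigma = {sigma:.1f} m^2 ({10 * math.log10(sigma):.1f} dBsm)")
```

The fourth-power dependence on leg length and inverse-square dependence on wavelength explain why the same reflector appears much brighter in X-band than in C-band imagery.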
Pub Date : 2024-10-02 DOI: 10.1109/JSTARS.2024.3470508
Yuan Hu;Jingxin Wang;Wei Liu;Xintai Yuan;Jens Wickert
Global navigation satellite system reflectometry (GNSS-R) has shown significant potential for retrieving snow depth from signal-to-noise ratio (SNR) data. However, compared with traditional in situ snow depth measurements, the accuracy and performance of GNSS-R degrade noticeably under certain conditions, particularly as the elevation angle increases. This is due to attenuation of the multipath effect, which is especially evident during snow-free periods and under shallow-snow conditions with depths below 50 cm. To address these limitations, we propose a snow depth inversion method that integrates SNR signals with the support vector regression (SVR) algorithm, using SNR sequences as feature inputs. We conducted experiments at stations P351 and P030 over elevation-angle ranges of 5°–20°, 5°–25°, and 5°–30°. The results show that the root-mean-square error at both stations decreased by 50% or more compared with traditional methods, demonstrating improved inversion accuracy across the elevation-angle ranges. More importantly, the accuracy of our method at higher elevation angles does not significantly lag behind that at lower elevation angles, indicating strong performance under challenging conditions. These findings highlight the contribution of our method to more accurate snow depth retrieval and its potential to drive further advances in GNSS-R snow depth inversion.
"GNSS-R Snow Depth Inversion Study Based on SNR-SVR," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 18025–18037, 2024.
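A minimal sketch of the SNR-SVR idea: fixed-length SNR sequences serve as feature vectors and in situ snow depth is the regression target. The data below is synthetic, and the sequence length, hyperparameters, and train/test split are assumptions, not those of the study.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)
n_passes, n_bins = 200, 64                       # assumed: one SNR sequence per pass
X = rng.normal(size=(n_passes, n_bins))          # synthetic detrended SNR sequences
# Synthetic "snow depth" target loosely tied to the early part of each sequence.
y = 0.1 * X[:, :8].sum(axis=1) + rng.normal(scale=0.05, size=n_passes)

# Scale features, then fit an RBF-kernel SVR on the SNR sequences.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
model.fit(X[:150], y[:150])
pred = model.predict(X[150:])
rmse = float(np.sqrt(np.mean((pred - y[150:]) ** 2)))
print(f"held-out RMSE: {rmse:.3f}")
```

Using the whole SNR sequence as the feature vector, rather than first fitting a dominant multipath frequency, is what lets the method remain accurate when the multipath oscillation weakens at higher elevation angles.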
Pub Date : 2024-10-01 DOI: 10.1109/JSTARS.2024.3472021
Nguyen Thi Thu Ha;Pham Quang Vinh;Nguyen Thien Phuong Thao;Pham Ha Linh;Michael Parsons;Nguyen Van Manh
The effective monitoring of eutrophication in inland water bodies is crucial for environmental management and pollution prevention. This study presents a comprehensive analysis of in situ hyperspectral reflectance data (400–900 nm) and the trophic state index (TSI) obtained at 365 points across ten lakes and reservoirs in Northern Vietnam, proposing a trophic classification based on features of the water reflectance spectra and a TSI estimation model for diagnosing and assessing lake trophic status. By analyzing the number of reflectance peaks and their heights, the study identifies three distinct classes of water reflectance spectra corresponding to three trophic levels: mesotrophic to lightly eutrophic, highly eutrophic, and hypertrophic. This classification enables quick identification of trophic levels directly at the in situ radiometric measurement sites. Our study demonstrates that a logarithmic function of the band ratio R_rs(715)/R_rs(560)
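The band-ratio model can be illustrated as follows. The sentence above is truncated in this excerpt, so the coefficients `a` and `b` below are placeholders, not the fitted values from the study; only the functional form (a logarithm of R_rs(715)/R_rs(560)) comes from the text.

```python
import numpy as np

def band_ratio_tsi(wavelengths_nm, rrs, a=25.0, b=55.0):
    """Estimate TSI from a hyperspectral Rrs spectrum (400-900 nm) via a
    logarithmic band-ratio model, TSI = a * ln(Rrs(715)/Rrs(560)).
    The coefficients a and b are hypothetical placeholders."""
    r715 = np.interp(715.0, wavelengths_nm, rrs)   # reflectance near 715 nm
    r560 = np.interp(560.0, wavelengths_nm, rrs)   # reflectance near 560 nm
    return a * np.log(r715 / r560) + b
```

The 715 nm band sits near a chlorophyll-related reflectance peak and 560 nm near the green reflectance maximum, which is why their ratio tracks trophic status.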