Shi Yi, Mengting Chen, Xuesong Yuan, Si Guo, Jiashuai Wang
Title: An interactive fusion attention-guided network for ground surface hot spring fluids segmentation in dual-spectrum UAV images
Journal: ISPRS Journal of Photogrammetry and Remote Sensing, Volume 220, Pages 661-691
Publication date: 2025-02-01
DOI: 10.1016/j.isprsjprs.2025.01.022
URL: https://www.sciencedirect.com/science/article/pii/S092427162500022X
Impact factor: 12.2 (JCR Q1, Geography, Physical)
Abstract
Investigating the distribution of ground surface hot spring fluids is crucial for the exploitation and utilization of geothermal resources. The detailed information in dual-spectrum images captured by unmanned aerial vehicles (UAVs) flying at low altitudes helps segment ground surface hot spring fluids accurately. However, existing image segmentation methods struggle with hot spring fluids because fluid boundaries vary frequently and irregularly, and substances within the fluids introduce segmentation uncertainty. In addition, there is currently no benchmark dataset dedicated to ground surface hot spring fluid segmentation in dual-spectrum UAV images. To this end, this study constructs a benchmark dataset, the dual-spectrum hot spring fluid segmentation (DHFS) dataset, for segmenting ground surface hot spring fluids in dual-spectrum UAV images. It also proposes IFAGNet, a novel interactive fusion attention-guided RGB-Thermal (RGB-T) semantic segmentation network, for accurately segmenting ground surface hot spring fluids in such images. IFAGNet consists of two sub-networks that leverage two feature fusion architectures, and a two-stage feature fusion module is designed to achieve optimal intermediate feature fusion. Furthermore, IFAGNet uses an interactive fusion attention-guided architecture to guide the two sub-networks to further process the extracted features through complementary information exchange, which significantly boosts hot spring fluid segmentation accuracy. Additionally, two down-up full-scale feature pyramid network (FPN) decoders are developed for each sub-network to fully utilize multi-stage fused features and better preserve detailed information during hot spring fluid segmentation.
Moreover, IFAGNet is trained with a hybrid consistency learning strategy that combines fully supervised learning with consistency learning between each sub-network and their fusion results, further optimizing the segmentation accuracy of hot spring fluids in RGB-T UAV images. The optimal IFAGNet model was tested on the proposed DHFS dataset. The experimental results show that IFAGNet outperforms existing image segmentation frameworks in segmentation accuracy for hot spring fluids in dual-spectrum UAV images, achieving a Pixel Accuracy (PA) of 96.1%, Precision of 93.2%, Recall of 85.9%, Intersection over Union (IoU) of 78.3%, and F1-score (F1) of 89.4%. It also overcomes segmentation uncertainties to a great extent while maintaining competitive computational efficiency. Ablation studies confirm the effectiveness of each main innovation in IFAGNet for improving the accuracy of hot spring fluid segmentation. Therefore, the proposed DHFS dataset and IFAGNet lay a foundation for segmenting ground surface hot spring fluids in dual-spectrum UAV images, which has significant potential value for the exploitation of geothermal hot spring resources. The DHFS dataset and the code of IFAGNet will be available at https://github.com/Ys-Master-CDUT/IFAGNet.
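The abstract describes the hybrid consistency learning strategy only in words, so the following is a minimal sketch of one way such an objective could be assembled, not the authors' formulation. Everything specific here is an assumption made for illustration: the fused result is taken as the plain average of the two sub-network probabilities, supervision uses binary cross-entropy, consistency uses mean squared error, and the names `bce`, `mse`, `hybrid_consistency_loss`, and `weight` are hypothetical.

```python
import math
from typing import Sequence


def bce(probs: Sequence[float], labels: Sequence[int], eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between per-pixel probabilities and 0/1 labels."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared difference between two probability maps."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def hybrid_consistency_loss(
    probs_a: Sequence[float],
    probs_b: Sequence[float],
    labels: Sequence[int],
    weight: float = 0.5,
) -> float:
    """Supervised loss on both sub-network outputs and their fusion, plus a
    consistency penalty pulling each sub-network towards the fused result.

    ASSUMPTION: the fusion is a simple average and `weight` is an arbitrary
    illustrative value; the paper's actual fusion and weighting are not
    given in the abstract.
    """
    fused = [(pa + pb) / 2.0 for pa, pb in zip(probs_a, probs_b)]
    supervised = bce(probs_a, labels) + bce(probs_b, labels) + bce(fused, labels)
    consistency = mse(probs_a, fused) + mse(probs_b, fused)
    return supervised + weight * consistency
```

When both sub-networks already agree, the consistency term vanishes and only the supervised part remains; the more they disagree, the larger the penalty, which is the general idea behind combining supervised and consistency learning.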
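The five metrics reported in the abstract (PA, Precision, Recall, IoU, F1) can be illustrated with their standard pixel-wise definitions for a binary mask. The sketch below assumes flattened 0/1 masks where 1 marks hot spring fluid; the function name `binary_segmentation_metrics` is hypothetical, and the abstract does not say how the authors aggregate scores across images or classes, so this is not necessarily their evaluation protocol.

```python
from typing import Dict, Sequence


def binary_segmentation_metrics(pred: Sequence[int], target: Sequence[int]) -> Dict[str, float]:
    """Pixel-wise metrics for flattened binary masks (1 = fluid, 0 = background)."""
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and the same length")

    # Count the four confusion-matrix cells.
    tp = fp = fn = tn = 0
    for p, t in zip(pred, target):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1 and t == 0:
            fp += 1
        elif p == 0 and t == 1:
            fn += 1
        else:
            tn += 1

    def safe_div(num: float, den: float) -> float:
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    return {
        "PA": safe_div(tp + tn, tp + tn + fp + fn),
        "Precision": precision,
        "Recall": recall,
        "IoU": safe_div(tp, tp + fp + fn),
        "F1": safe_div(2 * precision * recall, precision + recall),
    }
```

As a quick check on the reported numbers, the harmonic mean of the stated Precision and Recall, 2 × 0.932 × 0.859 / (0.932 + 0.859), is approximately 0.894, which agrees with the reported F1 of 89.4%.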
About the journal:
The ISPRS Journal of Photogrammetry and Remote Sensing (P&RS) serves as the official journal of the International Society for Photogrammetry and Remote Sensing (ISPRS). It acts as a platform for scientists and professionals worldwide who are involved in various disciplines that utilize photogrammetry, remote sensing, spatial information systems, computer vision, and related fields. The journal aims to facilitate communication and dissemination of advancements in these disciplines, while also acting as a comprehensive source of reference and archive.
P&RS endeavors to publish high-quality, peer-reviewed research papers that are preferably original and have not been published before. These papers can cover scientific/research, technological development, or application/practical aspects. Additionally, the journal welcomes papers that are based on presentations from ISPRS meetings, as long as they are considered significant contributions to the aforementioned fields.
In particular, P&RS encourages the submission of papers that are of broad scientific interest, showcase innovative applications (especially in emerging fields), have an interdisciplinary focus, discuss topics that have received limited attention in P&RS or related journals, or explore new directions in scientific or professional realms. It is preferred that theoretical papers include practical applications, while papers focusing on systems and applications should include a theoretical background.