IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing: Latest Articles

DDRNet: Dual-Domain Refinement Network for Remote Sensing Image Semantic Segmentation
IF 4.7 | Zone 2 (Earth Sciences) | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-11-04 | DOI: 10.1109/JSTARS.2024.3490584
Zhenhao Yang;Fukun Bi;Xinghai Hou;Dehao Zhou;Yanping Wang
Semantic segmentation is crucial for interpreting remote sensing images. The segmentation performance has been significantly improved recently with the development of deep learning. However, complex background samples and small objects greatly increase the challenge of the semantic segmentation task for remote sensing images. To address these challenges, we propose a dual-domain refinement network (DDRNet) for accurate segmentation. Specifically, we first propose a spatial and frequency feature reconstruction module, which separately utilizes the characteristics of the frequency and spatial domains to refine the global salient features and the fine-grained spatial features of objects. This process enhances the foreground saliency and adaptively suppresses background noise. Subsequently, we propose a feature alignment module that selectively couples the features refined from both domains via cross-attention, achieving semantic alignment between frequency and spatial domains. In addition, a meticulously designed detail-aware attention module is introduced to compensate for the loss of small objects during feature propagation. This module leverages cross-correlation matrices between high-level features and the original image to quantify the similarities among objects belonging to the same category, thereby transmitting rich semantic information from high-level features to small objects. The results on multiple datasets validate that our method outperforms the existing methods and achieves a good compromise between computational overhead and accuracy.
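The coupling of spatial-domain and frequency-domain features via cross-attention can be illustrated with a minimal NumPy sketch. This is not the DDRNet implementation: the feature map, the use of the FFT magnitude as the frequency branch, and the absence of learned query/key/value projections are all simplifying assumptions made here for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    """Scaled dot-product cross-attention without learned projections.

    queries: (n_q, d) tokens that ask for information
    context: (n_c, d) tokens that supply keys and values
    Returns the attended output (n_q, d) and the weights (n_q, n_c).
    """
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)
    weights = softmax(scores, axis=-1)
    return weights @ context, weights

rng = np.random.default_rng(0)
feature_map = rng.normal(size=(4, 4, 8))  # hypothetical (H, W, C) features

# Spatial branch: one token per pixel.
spatial_tokens = feature_map.reshape(-1, 8)

# Frequency branch (illustrative assumption): magnitude of the 2-D FFT over H and W.
freq_map = np.abs(np.fft.fft2(feature_map, axes=(0, 1)))
freq_tokens = freq_map.reshape(-1, 8)

# Spatial tokens query the frequency tokens.
fused, attn = cross_attention(spatial_tokens, freq_tokens)
print(fused.shape, attn.shape)
```

Each row of `attn` is a probability distribution over the frequency tokens, so every spatial position receives a weighted mixture of frequency-domain features.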
Vol. 17, pp. 20177-20189 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10741324
Citations: 0
Joint Grid-Based Attention and Multilevel Feature Fusion for Landslide Recognition
IF 4.7 | Zone 2 (Earth Sciences) | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-11-04 | DOI: 10.1109/JSTARS.2024.3491216
Xinran Li;Tao Chen;Gang Liu;Jie Dou;Ruiqing Niu;Antonio Plaza
Landslide recognition (LR) is a fundamental task for disaster prevention and control. Convolutional neural networks (CNNs) and transformer architectures have been widely used for extracting landslide information. However, CNNs cannot accurately characterize long-distance dependencies and global information, while transformers may not be as effective as CNNs in capturing local features and spatial information. To address these limitations, we construct a new LR network based on grid-based attention and multilevel feature fusion (GAMTNet). We complement CNNs by adding a transformer-based structure in a layer-by-layer fashion and by improving the methods for sequence generation and attention weight calculation. As a result, GAMTNet effectively learns global and local information about landslides across various spatial scales. We evaluated our model using landslide data collected from the southwest region of Jiuzhaigou County, Aba Tibetan and Qiang Autonomous Prefecture, Sichuan Province, China. The results demonstrate that the proposed GAMTNet model achieves an F1-score of 0.8951, a Kappa coefficient of 0.8807, and an MIoU of 0.8908, indicating its capability for accurate landslide identification and its potential application in LR tasks.
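For readers unfamiliar with the three reported metrics, the sketch below computes the F1-score, Cohen's Kappa coefficient, and mean IoU from a confusion matrix using their standard definitions. The 2x2 matrix is made-up example data, not the paper's results, and the function name `segmentation_metrics` is hypothetical.

```python
import numpy as np

def segmentation_metrics(conf, positive=1):
    """F1 (for one class), Cohen's kappa, and mean IoU.

    conf[i, j] counts pixels whose true class is i and predicted class is j.
    """
    conf = np.asarray(conf, dtype=float)
    total = conf.sum()
    tp = np.diag(conf)
    fp = conf.sum(axis=0) - tp  # predicted as the class, but actually another
    fn = conf.sum(axis=1) - tp  # belong to the class, but predicted as another

    f1 = 2 * tp[positive] / (2 * tp[positive] + fp[positive] + fn[positive])

    observed = tp.sum() / total                                   # overall accuracy
    expected = (conf.sum(axis=0) * conf.sum(axis=1)).sum() / total**2
    kappa = (observed - expected) / (1 - expected)

    miou = np.mean(tp / (tp + fp + fn))
    return f1, kappa, miou

# Made-up counts: rows are true (background, landslide), columns are predicted.
example = [[90, 10],
           [5, 95]]
f1, kappa, miou = segmentation_metrics(example)
print(f"F1={f1:.4f}  Kappa={kappa:.4f}  MIoU={miou:.4f}")
```

Kappa corrects the overall accuracy for chance agreement, which matters for landslide maps because the background class usually dominates.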
Vol. 17, pp. 19911-19922 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10742385
Citations: 0
Infrared Moving Small Target Detection Based on Spatial–Temporal Feature Fusion Tensor Model
IF 4.7 | Zone 2 (Earth Sciences) | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-11-04 | DOI: 10.1109/JSTARS.2024.3491221
Deyong Lu;Wei An;Haibo Wang;Qiang Ling;Dong Cao;Miao Li;Zaiping Lin
Infrared moving small target detection is an important and challenging task in infrared search and track systems, especially in the case of low signal-to-clutter ratio (SCR) and complex scenes. Spatial–temporal information has not been fully utilized, and its spatial and temporal parts are exploited in a seriously imbalanced way; in particular, long-term temporal characteristics are neglected. In this article, a novel method based on the spatial–temporal feature fusion tensor model is proposed to solve these problems. By directly stacking raw infrared images, the sequence can be transformed into a third-order tensor in which the spatial–temporal features are not reduced or destroyed. Its horizontal and lateral slices can be viewed as 2-D images that show how the gray values of pixels along a fixed row or column change over time. Then, a new tensor composed of several serial slices is decomposed into low-rank background components and sparse target components, which makes full use of the temporal similarity and spatial correlation of the background. The partial tubal nuclear norm is introduced to constrain the low-rank background, and the tensor robust principal component analysis problem is solved quickly by the alternating direction method of multipliers. By superimposing all the decomposed sparse components into the target tensor, small targets can be segmented from the reconstructed target image. Experimental results on synthetic and real data demonstrate that the proposed method is superior to other state-of-the-art methods in visual and numerical results for targets with different sizes, velocities, and SCR values under different complex backgrounds.
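The tensor construction described above can be sketched in a few lines of NumPy. The frame sizes and contents below are made-up placeholders, and the soft-thresholding function is only the generic proximal operator of the l1 norm that robust-PCA-style ADMM solvers commonly use for the sparse term. The partial tubal nuclear norm step of the paper is not reproduced here.

```python
import numpy as np

def stack_frames(frames):
    """Stack T frames of shape (H, W) into a third-order tensor (H, W, T)."""
    return np.stack(frames, axis=-1)

def soft_threshold(x, tau):
    """Element-wise shrinkage: the proximal operator of tau * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

H, W, T = 6, 8, 5
# Placeholder sequence: frame t is filled with the value t.
frames = [np.full((H, W), float(t)) for t in range(T)]
tensor = stack_frames(frames)

# Horizontal slice: fix a row and view (column, time) as a 2-D image.
horizontal = tensor[2, :, :]   # shape (W, T)
# Lateral slice: fix a column and view (row, time) as a 2-D image.
lateral = tensor[:, 3, :]      # shape (H, T)
# Frontal slice: an original frame.
frontal = tensor[:, :, 0]      # shape (H, W)

print(tensor.shape, horizontal.shape, lateral.shape, frontal.shape)
print(soft_threshold(np.array([-3.0, 0.5, 2.0]), 1.0))
```

In the horizontal and lateral slices one axis is time, so a slowly varying background tends to produce highly correlated columns while a moving target appears as a sparse trace. That is the intuition behind the low-rank plus sparse split.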
Vol. 18, pp. 78-99 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10742415
Citations: 0
Evaluation of Total Precipitable Water Trends From Reprocessed MiRS SNPP ATMS Observations, 2012–2021
IF 4.7 | Zone 2 (Earth Sciences) | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-11-01 | DOI: 10.1109/JSTARS.2024.3481444
Yan Zhou;Christopher Grassotti;Quanhua Liu;Shuyan Liu;Yong-Keun Lee
Total precipitable water (TPW) is defined as the vertically integrated column water vapor from the Earth's surface to the top of the atmosphere. TPW is a key element of the hydrological cycle and is responsive to changes in global climate related to greenhouse-gas-induced warming. In this research, we focus on trend analysis using the TPW retrieval product from the recently reprocessed Microwave Integrated Retrieval System (MiRS) Suomi National Polar-Orbiting Partnership (SNPP) Advanced Technology Microwave Sounder (ATMS) data and compare it with ERA5 reanalysis. The primary results show that the global TPW trend during 2012–2021 from reprocessed SNPP ATMS is 0.46 mm/decade, in relatively good agreement with the trend from ERA5 of 0.39 mm/decade. Trends for tropical and mid-latitude subregions are also in good agreement, with essentially the same trend of 0.43 mm/decade seen in both datasets in the mid-latitudes. Both datasets show a large positive anomaly associated with the strong El Niño event in 2015–2016, which increased TPW amounts in the tropics. We also found that the TPW trend is not uniformly distributed spatially, with significant regional variations in both sign and amplitude. Nevertheless, the spatial patterns from MiRS SNPP ATMS retrievals and ERA5 analyses are in very good agreement. Both datasets show that positive TPW trends, expressed as relative percentages, in the polar regions were on par with those seen in lower latitudes. The results suggest that water vapor observations from a single polar-orbiting microwave instrument with only two local observation times daily may be sufficient to characterize trends in TPW.
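A trend in mm/decade is commonly obtained as the slope of a least-squares line through a time series, multiplied by ten when time is measured in years. The sketch below illustrates that calculation on a synthetic monthly series. The abstract does not state the authors' actual trend-estimation procedure, so this is an assumed, generic approach, and the 25 mm baseline is an arbitrary placeholder.

```python
import numpy as np

def trend_per_decade(time_years, values):
    """Least-squares linear trend, expressed per decade.

    time_years: sample times in (fractional) years
    values:     the quantity being tracked, e.g. TPW in mm
    """
    t = np.asarray(time_years, dtype=float)
    y = np.asarray(values, dtype=float)
    # Centering the time axis keeps the fit well conditioned.
    slope, _ = np.polyfit(t - t.mean(), y, deg=1)
    return 10.0 * slope

# Synthetic example: 120 monthly samples starting in 2012 with a built-in
# slope of 0.046 mm/year, i.e. 0.46 mm/decade.
years = 2012.0 + np.arange(120) / 12.0
synthetic_tpw = 25.0 + 0.046 * (years - 2012.0)

trend = trend_per_decade(years, synthetic_tpw)
print(f"{trend:.2f} mm/decade")
```

With real data the seasonal cycle would normally be removed first, for example by working with monthly anomalies, so that it does not leak into the slope estimate.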
Vol. 17, pp. 19798-19804 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10740803
Citations: 0
Multiscale Attention-UNet-Based Near-Real-Time Precipitation Estimation From FY-4A/AGRI and Doppler Radar Observations
IF 4.7 | Zone 2 (Earth Sciences) | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-11-01 | DOI: 10.1109/JSTARS.2024.3488854
Dongling Wang;Shanmin Yang;Xiaojie Li;Jing Peng;Hongjiang Ma;Xi Wu
Extreme precipitation events greatly threaten people's daily lives and safety, making accurate and timely precipitation estimation especially critical. However, common methods such as radar and satellite remote sensing are limited by coverage and environmental factors, and existing deep learning models struggle with complex scenarios and multisource data correlations. These issues make precipitation estimation challenging. This article proposes a Multiscale Dual Cross-Attention UNet (MS-DCA-UNet) model for near-real-time precipitation estimation. It integrates Doppler weather radar and FY-4A satellite data to overcome single-source data limitations. To narrow the semantic gap among the encoder feature maps, the MS-DCA-UNet model introduces a dual cross-attention (DCA) module at the skip connections of the U-Net backbone. The DCA module mainly employs channel cross-attention and spatial cross-attention to capture long-range dependencies and enable multiscale feature fusion. A multiscale convolution module, a multibranch upsampling strategy that runs parallel to the decoder, is designed to reduce the risk of the model falling into local optima. Experimental results show that, with the hourly CMPAS precipitation as the benchmark, the Critical Success Index (CSI), Root Mean Square Error (RMSE), and Pearson's Correlation Coefficient (CC) of MS-DCA-UNet are 0.6033, 0.5949 mm/h, and 0.8460, respectively. These results outperform the compared methods, including FY-4A QPE, GPM IMERG, U-Net, Attention-UNet, and DCA-UNet, on the CSI, RMSE, and CC metrics. Relative to Attention-UNet, U-Net, and DCA-UNet, MS-DCA-UNet reduces the RMSE by 34.68% (0.5949 mm/h versus 0.9107 mm/h), 10.24% (0.5949 mm/h versus 0.6628 mm/h), and 6.96% (0.5949 mm/h versus 0.6394 mm/h), respectively.
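The quoted RMSE reductions follow from the usual relative-reduction formula, (baseline - improved) / baseline. The short check below recomputes them from the RMSE values given in the abstract. The function name `relative_reduction` is hypothetical and used only for this illustration.

```python
def relative_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline

ms_dca_unet_rmse = 0.5949  # mm/h, from the abstract
baseline_rmse = {          # mm/h, from the abstract
    "Attention-UNet": 0.9107,
    "U-Net": 0.6628,
    "DCA-UNet": 0.6394,
}

reductions = {
    name: round(relative_reduction(rmse, ms_dca_unet_rmse), 2)
    for name, rmse in baseline_rmse.items()
}
for name, pct in reductions.items():
    print(f"{name}: {pct:.2f}% lower RMSE")
```

The recomputed values agree with the percentages stated in the abstract to two decimal places.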
Vol. 17, pp. 19998-20011 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10740264
Citations: 0
Evaluation and Modeling of Image Sharpness of Chinese Gaofen-1/2/6/7 Optical Remote-Sensing Satellites Over Time
IF 4.7 | Zone 2 (Earth Sciences) | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-11-01 | DOI: 10.1109/JSTARS.2024.3490738
Jiayang Cao;Litao Li;Yonghua Jiang;Xin Shen;Deren Li;Meilin Tan
Image sharpness characterizes how visible details are in remote-sensing images and reflects a sensor's ability to resolve fine detail. Sensor aging and environmental changes can degrade image sharpness and quality. The Gaofen (GF) satellites provide diverse remote-sensing imagery, but evaluations of their sharpness are limited. In this study, for the GF1/2/6/7 optical remote-sensing satellites in the space-based system of the China High-Resolution Earth Observation System (CHEOS) major special project, we evaluated the relative edge response (RER), full width at half maximum (FWHM), and modulation transfer function (MTF) of the images, using nearly ten years of ground target image data. These metrics quantify image sharpness, and we model how it changes over time for the different sensors. Over ten years of on-orbit operation, the RER values of GF1 and GF2 are 0.51 and 0.50, and their MTF values at the Nyquist frequency are 0.15 and 0.11, respectively. This indicates good image edge and high-frequency detail responsiveness, with FWHM of 1.16 and 1.17, respectively, showing a slight image sharpening. For GF6, the RER, MTF at the Nyquist frequency, and FWHM were 0.42, 0.09, and 1.39, indicating improved sharpening compared with GF1/2 but decreased edge and high-frequency detail response. The RER, MTF at the Nyquist frequency, and FWHM of the panchromatic images of GF7 were 0.32, 0.04, and 1.91, which indicate image blur. Meanwhile, the corresponding indicators for the multispectral images were 0.45, 0.14, and 1.40, better than those of the panchromatic images. Long-term data showed periodic sharpness variations in satellite images, with GF6 showing superior stability and minimal track differences. The dynamic change pattern corresponds to a fourth-order polynomial model.
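Fitting a fourth-order polynomial to a sharpness metric over time can be sketched with `numpy.polyfit`. The time axis, the coefficients, and the resulting RER-like curve below are synthetic placeholders chosen only to show the mechanics. They are not values from the study.

```python
import numpy as np

# Synthetic time axis: 40 samples over ten years on orbit.
t = np.linspace(0.0, 10.0, 40)

# Hypothetical fourth-order coefficients, highest degree first.
true_coeffs = np.array([1e-4, -2e-3, 1e-2, -2e-2, 0.5])
rer_like = np.polyval(true_coeffs, t)

# Least-squares fit of a degree-4 polynomial to the samples.
fitted_coeffs = np.polyfit(t, rer_like, deg=4)
fitted_curve = np.polyval(fitted_coeffs, t)

max_residual = np.max(np.abs(fitted_curve - rer_like))
print(fitted_coeffs.shape, f"max residual = {max_residual:.2e}")
```

A fourth-order model has five coefficients, so it can represent up to three turning points over the fitted interval. That flexibility suits a metric that rises and falls several times during a mission, but it also means the fit should not be extrapolated far beyond the observed period.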
Vol. 17, pp. 20150-20163 | PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10741340
Citations: 0
Impact Assessment of Flood Events Based on Multisource Satellite Remote Sensing: The Case of Kahovka Dam
IF 4.7 | Zone 2 (Earth Sciences) | Q1 ENGINEERING, ELECTRICAL & ELECTRONIC | Pub Date: 2024-11-01 | DOI: 10.1109/JSTARS.2024.3490756
Chen Zuo;Haowei Zhang;Xin Ma;Wei Gong
On June 6, 2023, the Kakhovka Dam in Ukraine was damaged, causing a flood that could have had a significant impact on society. A single satellite may not provide sufficient data for timely disaster response. To overcome this, a combination of optical images, data from a new generation of hydrological monitoring satellites, and nighttime light data was employed to analyze the disaster. Sentinel-3 provides useful hydrological information for quickly identifying and locating disaster areas, while Surface Water and Ocean Topography (SWOT) data can detect changes in water surface elevation, providing a direct view of the flood's impact on the water area and surface elevation. Together, the two datasets describe the extent of the flooded area in more detail than a single source can. Furthermore, the NPP-VIIRS data not only reflect the indirect impact of the disaster on the lives and production of the local population, but also provide an intuitive assessment of the damage to the Zaporizhzhya nuclear power plant's power supply. The data showed that the disaster affected an area of about 8 km on either side of the river downstream of the dam. This enables the prediction of the damage and of the postdisaster reconstruction strategy from a humanistic perspective. The combination of these three data types enables the specific impact of the disaster to be gauged in terms of its scope, extent, impact on human life, and the postdisaster recovery situation. This provides a scientific reference for the timely formulation and implementation of disaster relief and postdisaster reconstruction measures.
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 20164-20176, 2024. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10741335
Citations: 0
MMAPP: Multibranch and Multiscale Adaptive Progressive Pyramid Network for Multispectral Image Pansharpening
IF 4.7 Zone 2 Earth Science Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2024-11-01 DOI: 10.1109/JSTARS.2024.3490755
Zhiqi Zhang;Chuang Liu;Lu Wei;Shao Xiang
Pansharpening is the process of integrating two heterogeneous remote sensing images to obtain high-resolution multispectral images, which is crucial for downstream tasks. Existing methods based on advanced deep-learning techniques achieve good sharpening results. However, they do not sufficiently consider the heterogeneity between the source images, which leads to distortions in the sharpened output. To address this gap, we have developed a multibranch pyramid structure that builds bridges between the source images at various scales. It contains three distinct branches (the PAN branch, the MS branch, and the fusion branch) and uses the pyramid structure to integrate the data flow across them efficiently and seamlessly. Furthermore, to retain more advantageous information, we have developed a specialized adaptive extraction and integration module (AEIM) for each branch, namely, the texture shrinkage adaptive module for the PAN branch, the spectral information consistency module for the MS branch, and the adaptive fusion module for the fusion branch. These AEIMs are specifically designed to cater to the different sources and distinct stages of the pansharpening process. The adaptive weights they generate can be used to extract and fuse more advantageous information. Ultimately, high-fidelity sharpening results are obtained by minimizing the reconstruction errors at various scales in the distinct branches. Extensive experiments show that our method surpasses representative advanced methods while maintaining a high level of efficiency. All implementations will be published at MMAPP.
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 20129-20149, 2024. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10741347
Citations: 0
Assessing Land Degradation and Restoration in Eastern China Grasslands from 1985 to 2018 Using Multitemporal Landsat Data
IF 4.7 Zone 2 Earth Science Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2024-10-31 DOI: 10.1109/JSTARS.2024.3483992
Caixia Liu;Huabing Huang;John M. Melack;Ye Tian;Jinxiong Jiang;Xiao Fu;Zhiguo Cao;Shaohua Wang
The grassland ecosystems of Xilingol, China, a characteristic part of the vast Eurasian steppe, currently face two challenges: natural variations and anthropogenic stress, which are leading to significant degradation. This article uses a sequence of high-resolution (30 m) land cover and greenness trend maps derived from multiyear Landsat imagery to describe these ecologically critical shifts over a landscape spanning more than 200 000 km². By leveraging random forest models complemented with phenological patterns, we streamlined the generation of land cover maps, achieving overall accuracies above 94% across eight land cover classes, as substantiated by rigorous validation. Between 1985 and 2000, there were significant changes in the landscape, such as an increase in farmland of about 4.0 × 10³ km², mostly at the expense of natural grasslands and wetlands. Throughout the study period, water bodies shrank noticeably, and the largest reduction in wetlands occurred between 1995 and 2015. Open-pit mining regions began to expand at the start of the 21st century, and from 1985 to the present, urbanization has driven the growth of impervious surfaces. These maps offer powerful visual representations of major land use changes, capturing the expansion of surface mining, the retreat of wetland areas, and the growth of urban areas. Our findings therefore form an essential part of the documentation and understanding of wetland reduction, cropland intensification, surface water decline, and rapid urban growth, providing crucial information to conservationists and policymakers working toward sustainable ecosystem management.
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 19328-19342, 2024. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10740496
Citations: 0
Popeye: A Unified Visual-Language Model for Multisource Ship Detection From Remote Sensing Imagery
IF 4.7 Zone 2 Earth Science Q1 ENGINEERING, ELECTRICAL & ELECTRONIC Pub Date : 2024-10-30 DOI: 10.1109/JSTARS.2024.3488034
Wei Zhang;Miaoxin Cai;Tong Zhang;Guoqiang Lei;Yin Zhuang;Xuerui Mao
Ship detection aims to identify ship locations in remote sensing (RS) scenes. Because of different imaging payloads, the varied appearance of ships, and complicated background interference from the bird's eye view, it is difficult to set up a unified paradigm for multisource ship detection. To address this challenge, this article leverages the powerful generalization ability of large language models and proposes a unified visual-language model called Popeye for multisource ship detection from RS imagery. Specifically, to bridge the interpretation gap across multisource images for ship detection, a novel unified labeling paradigm is designed to integrate different visual modalities and the various ship detection formats, i.e., horizontal bounding box and oriented bounding box. Subsequently, a hybrid experts encoder is designed to refine multiscale visual features, thereby enhancing visual perception. Then, a visual-language alignment method is developed for Popeye to enhance interactive comprehension between visual and language content. Furthermore, an instruction adaption mechanism is proposed for transferring the pretrained visual-language knowledge from natural scenes into the RS domain for multisource ship detection. In addition, the Segment Anything Model is seamlessly integrated into Popeye to achieve pixel-level ship segmentation without additional training costs. Finally, extensive experiments are conducted on the newly constructed ship instruction dataset named MMShip, and the results indicate that Popeye outperforms current specialist, open-vocabulary, and other visual-language models in various zero-shot multisource ship detection tasks.
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 17, pp. 20050-20063, 2024. PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10738390
Citations: 0