
Latest Publications: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing

DualFocus-CapNet: A Dual-Stream Network for Real Change and Interscale Relationship-Aware Change Captioning in Remote Sensing Images
IF 5.3 | Tier 2 (Earth Science) | Q1 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-11 | DOI: 10.1109/JSTARS.2025.3642993
Xianqi Meng;Yuefeng Zhao;Kaifa Cao;Qifei Wang;Junjie Wang;Nannan Hu
Remote sensing image change captioning (RSICC) aims to generate textual descriptions of changes between bitemporal images. However, accurately describing fine-grained changes while capturing interscale relationships as well as distinguishing real changes from spurious changes (e.g., illumination, seasonal variations) are still major challenges for current methods. To address these issues, we propose DualFocus-CapNet, a novel model tailored for RSICC. DualFocus-CapNet employs a dual-stream architecture, where each stream is dedicated to processing a distinct pair of bitemporal features. Crucially, we introduce a scale-wise progressive convolution (ScalePro Conv) that employs a progressive scale-specific approach to decompose remote sensing features into pixel-level variations, regional continuities, and linear structures. Unlike conventional parallel multiscale processing methods, ScalePro Conv adopts a serial progressive structure to establish interscale relationships, thereby avoiding the fragmentation of feature information. Then, the bi-directional difference guided transformer (BDiGTrans) is proposed to eliminate interference from spurious changes by dynamically masking invariant regions and extracting bidirectional differential features. Furthermore, the cross-temporal adaptive fusion module (CTAF) is introduced to dynamically balance bitemporal features using learnable gating to enhance change discrimination and robust caption generation. Comprehensive experiments on the benchmark datasets LEVIR-CC and WHU-CDC show that our DualFocus-CapNet surpasses state-of-the-art change captioning methods in various evaluation metrics.
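To make the serial-versus-parallel distinction concrete, below is a minimal PyTorch sketch of a serial progressive multiscale block in the spirit of ScalePro Conv; the module name, kernel sizes, and residual connection are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SerialProgressiveConv(nn.Module):
    """Toy serial multiscale block: each scale refines the previous one
    instead of processing the raw input in parallel (illustrative only)."""
    def __init__(self, channels: int):
        super().__init__()
        self.pixel = nn.Conv2d(channels, channels, kernel_size=1)                   # pixel-level
        self.region = nn.Conv2d(channels, channels, kernel_size=3, padding=1)       # regional
        self.linear = nn.Conv2d(channels, channels, kernel_size=(1, 7), padding=(0, 3))  # linear structures

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        p = self.pixel(x)      # pixel-level variations
        r = self.region(p)     # regional continuities build on p (serial, not parallel)
        l = self.linear(r)     # linear structures build on r
        return x + l           # residual connection keeps the original detail

if __name__ == "__main__":
    feats = torch.randn(2, 32, 64, 64)                 # fake bitemporal feature map
    print(SerialProgressiveConv(32)(feats).shape)      # torch.Size([2, 32, 64, 64])
```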
{"title":"DualFocus-CapNet: A Dual-Stream Network for Real Change and Interscale Relationship-Aware Change Captioning in Remote Sensing Images","authors":"Xianqi Meng;Yuefeng Zhao;Kaifa Cao;Qifei Wang;Junjie Wang;Nannan Hu","doi":"10.1109/JSTARS.2025.3642993","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3642993","url":null,"abstract":"Remote sensing image change captioning (RSICC) aims to generate textual descriptions of changes between bitemporal images. However, accurately describing fine-grained changes while capturing interscale relationships as well as distinguishing real changes from spurious changes (e.g., illumination, seasonal variations) are still major challenges for current methods. To address these issues, we propose DualFocus-CapNet, a novel model tailored for RSICC. DualFocus-CapNet employs a dual-stream architecture, where each stream is dedicated to processing a distinct pair of bitemporal features. Crucially, we introduce a scale-wise progressive convolution (ScalePro Conv) that employs a progressive scale-specific approach to decompose remote sensing features into pixel-level variations, regional continuities, and linear structures. Unlike conventional parallel multiscale processing methods, ScalePro Conv adopts a serial progressive structure to establish interscale relationships, thereby avoiding the fragmentation of feature information. Then, the bi-directional difference guided transformer (BDiGTrans) is proposed to eliminate interference from spurious changes by dynamically masking invariant regions and extracting bidirectional differential features. Furthermore, the cross-temporal adaptive fusion module (CTAF) is introduced to dynamically balance bitemporal features using learnable gating to enhance change discrimination and robust caption generation. Comprehensive experiments on the benchmark datasets LEVIR-CC and WHU-CDC show that our DualFocus-CapNet surpasses state-of-the-art change captioning methods in various evaluation metrics.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"2045-2059"},"PeriodicalIF":5.3,"publicationDate":"2025-12-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11297771","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Spatial–Frequency Collaborative Network for Remote Sensing Image Change Detection
IF 5.3 | Tier 2 (Earth Science) | Q1 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-10 | DOI: 10.1109/JSTARS.2025.3642571
Rong Deng;Yunwei Pu;Mingda Hao
The accuracy of change detection (CD) in high-resolution remote sensing imagery can be compromised due to complex surface textures, variations in illumination, and atmospheric disturbances. Existing methods struggle to simultaneously achieve noise suppression, long-range dependence modeling, and computational efficiency. In response to the aforementioned challenges, this study proposes a spatial–frequency collaborative detection network (SFCNet) that combines VMamba’s linear-complexity global modeling with frequency-domain techniques. The designed network adopts a VMamba-based encoder to efficiently extract multiscale features and further integrates a frequency–spatial interaction module to perform content-aware noise suppression and detail enhancement through spatial-domain-guided adaptive frequency filtering. A progressive decoder, equipped with a hybrid pooling channel attention module, integrates multilevel features through hybrid pooling and progressive fusion strategies. To validate the effectiveness of the proposed SFCNet, extensive experiments were conducted on three widely used public datasets: LEVIR-CD, WHU-CD, and CLCD. The network achieved F1-scores of 91.17%, 94.29%, and 74.98%, along with intersection over union values of 83.78%, 89.21%, and 59.98%, respectively. These results clearly demonstrate that the proposed method enhances detection accuracy and robustness under complex scenarios.
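A minimal sketch of the spatial-domain-guided frequency filtering idea, assuming a simple per-channel gate; it only illustrates rescaling FFT coefficients with weights learned from the spatial domain and is not the paper's frequency–spatial interaction module.

```python
import torch
import torch.nn as nn

class FreqSpatialGate(nn.Module):
    """Toy frequency filter: a spatial-domain gate rescales FFT coefficients
    per channel, then the signal is transformed back (illustrative only)."""
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),             # spatial context -> 1x1
            nn.Conv2d(channels, channels, 1),
            nn.Sigmoid(),                        # per-channel weight in [0, 1]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        g = self.gate(x)                                  # (B, C, 1, 1)
        spec = torch.fft.rfft2(x, norm="ortho")           # complex spectrum
        spec = spec * g                                   # content-aware rescaling
        return torch.fft.irfft2(spec, s=x.shape[-2:], norm="ortho")

if __name__ == "__main__":
    x = torch.randn(1, 16, 32, 32)
    print(FreqSpatialGate(16)(x).shape)                   # torch.Size([1, 16, 32, 32])
```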
{"title":"Spatial–Frequency Collaborative Network for Remote Sensing Image Change Detection","authors":"Rong Deng;Yunwei Pu;Mingda Hao","doi":"10.1109/JSTARS.2025.3642571","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3642571","url":null,"abstract":"The accuracy of change detection (CD) in high-resolution remote sensing imagery can be compromised due to complex surface textures, variations in illumination, and atmospheric disturbances. Existing methods struggle to simultaneously achieve noise suppression, long-range dependence modeling, and computational efficiency. In response to the aforementioned challenges, this study proposes a spatial–frequency collaborative detection network (SFCNet) that combines VMamba’s linear-complexity global modeling with frequency-domain techniques. The designed network adopts a VMamba-based encoder to efficiently extract multiscale features and further integrates a frequency–spatial interaction module to perform content-aware noise suppression and detail enhancement through spatial-domain-guided adaptive frequency filtering. A progressive decoder, equipped with a hybrid pooling channel attention module, integrates multilevel features through hybrid pooling and progressive fusion strategies. To validate the effectiveness of the proposed SFCNet, extensive experiments were conducted on three widely used public datasets: LEVIR-CD, WHU-CD, and CLCD. The network achieved F1-scores of 91.17%, 94.29%, and 74.98%, along with intersection over union values of 83.78%, 89.21%, and 59.98%, respectively. These results clearly demonstrate that the proposed method enhances detection accuracy and robustness under complex scenarios.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"2485-2496"},"PeriodicalIF":5.3,"publicationDate":"2025-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11293753","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145929347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Synergistic Fusion of Sentinel-1 and Sentinel-2 for Global LULC Mapping: The Multimodal Network LULC-Former and Dynamic World+ Dataset
IF 5.3 | Tier 2 (Earth Science) | Q1 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-09 | DOI: 10.1109/JSTARS.2025.3641788
Hao Yu;Gen Li;Haoyu Liu;Songyan Zhu;Jian Xu;Wenquan Dong;Changjian Li;Jiancheng Shi
Accurate, high-resolution global land use and land cover (LULC) mapping is crucial for environmental monitoring, but remains challenging when relying solely on multispectral data. Most existing global LULC mapping studies rely exclusively on multispectral observations, and even those that incorporate synthetic aperture radar (SAR) data often fail to fully exploit the information it provides. SAR provides an all-weather sensing capability and is uniquely sensitive to surface structure, texture, and moisture—critical information for LULC classes that are often spectrally ambiguous. To address this data gap, we introduce the Dynamic World+ dataset, a new global benchmark that expands the authoritative Dynamic World by aligning it with Sentinel-1 SAR data. In addition, to facilitate the combination of multispectral and SAR data, we propose a lightweight transformer architecture termed LULC-Former. It incorporates two innovative modules, the dual-modal enhancement module and mutual modal aggregation module, designed to exploit cross-information between the two modalities in a split-fusion manner. These modules enable spectral features to guide the interpretation of SAR texture, and vice versa, thereby improving the overall performance of global LULC semantic segmentation. Furthermore, we adopt an imbalanced parameter allocation strategy, which assigns parameters to different modalities based on the distinct physical information each provides for LULC characterization. Experiments demonstrate that our network outperforms existing transformer and CNN-based models, achieving a mean intersection over union of 59.58%, an overall accuracy of 79.48%, and an F1 score of 71.68% with only 26.70 M parameters. Furthermore, the generated national-scale LULC maps across diverse regions demonstrate the effectiveness of the proposed dataset and network.
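The split-fusion and imbalanced-allocation ideas can be illustrated with a toy two-branch module; channel widths, band counts, and the gating scheme below are assumptions, not the LULC-Former design.

```python
import torch
import torch.nn as nn

class DualModalFusion(nn.Module):
    """Toy two-branch fusion: the optical branch gets more capacity than the
    SAR branch (imbalanced allocation) and each modality gates the other."""
    def __init__(self, opt_bands=10, sar_bands=2, opt_ch=48, sar_ch=16, classes=9):
        super().__init__()
        self.opt = nn.Sequential(nn.Conv2d(opt_bands, opt_ch, 3, padding=1), nn.ReLU())
        self.sar = nn.Sequential(nn.Conv2d(sar_bands, sar_ch, 3, padding=1), nn.ReLU())
        self.gate_opt = nn.Sequential(nn.Conv2d(sar_ch, opt_ch, 1), nn.Sigmoid())
        self.gate_sar = nn.Sequential(nn.Conv2d(opt_ch, sar_ch, 1), nn.Sigmoid())
        self.head = nn.Conv2d(opt_ch + sar_ch, classes, 1)

    def forward(self, s2, s1):
        fo, fs = self.opt(s2), self.sar(s1)
        fo = fo * self.gate_opt(fs)        # SAR texture guides the spectral features
        fs = fs * self.gate_sar(fo)        # spectra guide the SAR interpretation
        return self.head(torch.cat([fo, fs], dim=1))

if __name__ == "__main__":
    s2 = torch.randn(1, 10, 64, 64)        # Sentinel-2 bands (assumed count)
    s1 = torch.randn(1, 2, 64, 64)         # Sentinel-1 VV/VH
    print(DualModalFusion()(s2, s1).shape) # torch.Size([1, 9, 64, 64])
```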
{"title":"Synergistic Fusion of Sentinel-1 and Sentinel-2 for Global LULC Mapping: The Multimodal Network LULC-Former and Dynamic World+ Dataset","authors":"Hao Yu;Gen Li;Haoyu Liu;Songyan Zhu;Jian Xu;Wenquan Dong;Changjian Li;Jiancheng Shi","doi":"10.1109/JSTARS.2025.3641788","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3641788","url":null,"abstract":"Accurate, high-resolution global land use and land cover (LULC) mapping is crucial for environmental monitoring, but remains challenging when relying solely on multispectral data. Most existing global LULC mapping studies rely exclusively on multispectral observations, and even those that incorporate synthetic aperture radar (SAR) data often fail to fully exploit the information it provides. SAR provides an all-weather sensing capability and is uniquely sensitive to surface structure, texture, and moisture—critical information for LULC classes that are often spectrally ambiguous. To address this data gap, we introduce the <italic>Dynamic World+</i> dataset, a new global benchmark that expands the authoritative Dynamic World by aligning it with Sentinel-1 SAR data. In addition, to facilitate the combination of multispectral and SAR data, we propose a lightweight transformer architecture termed <italic>LULC-Former</i>. It incorporates two innovative modules, the dual-modal enhancement module and mutual modal aggregation module, designed to exploit cross-information between the two modalities in a split-fusion manner. These modules enable spectral features to guide the interpretation of SAR texture, and vice versa, thereby improving the overall performance of global LULC semantic segmentation. Furthermore, we adopt an imbalanced parameter allocation strategy, which assigns parameters to different modalities based on the distinct physical information each provides for LULC characterization. Experiments demonstrate that our network outperforms existing transformer and CNN-based models, achieving a mean intersection over union of 59.58%, an overall accuracy of 79.48%, and an F1 score of 71.68% with only 26.70 M parameters. Furthermore, the generated national-scale LULC maps across diverse regions demonstrate the effectiveness of the proposed dataset and network.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"2511-2524"},"PeriodicalIF":5.3,"publicationDate":"2025-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11286222","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145929431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
SAM4CH4: Zero-Shot Methane Plume Mapping With Segment Anything and Vision-Language Models
IF 5.3 | Tier 2 (Earth Science) | Q1 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-09 | DOI: 10.1109/JSTARS.2025.3642040
Masoud Mahdianpari;Ali Radman;Daniel J. Varon;Fariba Mohammadimanesh
Recent advances in foundation models, including large language models and advanced computer vision techniques, have opened new possibilities in remote sensing applications. One such model is the Segment Anything Model (SAM), which can perform image segmentation without task-specific training data. This is especially useful for detecting methane plumes in satellite imagery, where it is important to accurately separate methane column enhancements from complex background conditions. SAM’s prompt-based segmentation approach helps address these challenges and reduces the need for large annotated datasets. In this study, we introduce SAM4CH4, a zero-shot segmentation framework that applies SAM for methane plume detection using Sentinel-2 imagery, with segmentation prompts automatically generated by text encoder models including contrastive language-image pretraining (CLIP), CLIP Surgery, and Grounding DINO. We evaluate the approach on both a synthetically generated benchmark dataset and real Sentinel-2 images. Results show that bounding-box prompts from the Swin-L variant of Grounding DINO, combined with the latest version of SAM, consistently achieve high accuracy, exceeding 72% in F1-score and 95% in overall accuracy, and outperform a widely used statistical thresholding method by approximately 15% in F1-score. These results are also competitive with supervised deep learning methods, which typically require large labeled datasets and significant computational resources. By leveraging pretrained models and removing the need for manual annotation, the proposed SAM4CH4 framework offers a zero-shot scalable and efficient solution for operational methane plume detection and monitoring.
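A minimal sketch of the box-prompted, zero-shot segmentation step using the segment-anything package; the checkpoint path is a placeholder, and the detector that produces the box (e.g., Grounding DINO) is assumed to run upstream. This is an illustration of the mechanism, not the SAM4CH4 pipeline itself.

```python
# Box-prompted SAM segmentation of one candidate region (illustrative sketch).
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

def segment_plume(image_rgb: np.ndarray, box_xyxy: np.ndarray,
                  checkpoint: str = "sam_vit_h.pth") -> np.ndarray:
    """Return a binary mask for one candidate plume given a bounding box
    (assumed to come from an open-vocabulary detector such as Grounding DINO)."""
    sam = sam_model_registry["vit_h"](checkpoint=checkpoint)   # placeholder checkpoint path
    predictor = SamPredictor(sam)
    predictor.set_image(image_rgb)                       # HxWx3 uint8 image
    masks, scores, _ = predictor.predict(box=box_xyxy,   # XYXY pixel coordinates
                                         multimask_output=False)
    return masks[0]                                      # HxW boolean mask
```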
{"title":"SAM4CH4: Zero-Shot Methane Plume Mapping With Segment Anything and Vision-Language Models","authors":"Masoud Mahdianpari;Ali Radman;Daniel J. Varon;Fariba Mohammadimanesh","doi":"10.1109/JSTARS.2025.3642040","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3642040","url":null,"abstract":"Recent advances in foundation models, including large language models and advanced computer vision techniques, have opened new possibilities in remote sensing applications. One such model is the Segment Anything Model (SAM), which can perform image segmentation without task-specific training data. This is especially useful for detecting methane plumes in satellite imagery, where it is important to accurately separate methane column enhancements from complex background conditions. SAM’s prompt-based segmentation approach helps address these challenges and reduces the need for large annotated datasets. In this study, we introduce SAM4CH4, a zero-shot segmentation framework that applies SAM for methane plume detection using Sentinel-2 imagery, with segmentation prompts automatically generated by text encoder models including contrastive language-image pretraining (CLIP), CLIP Surgery, and Grounding DINO. We evaluate the approach on both a synthetically generated benchmark dataset and real Sentinel-2 images. Results show that bounding-box prompts from the Swin-L variant of Grounding DINO, combined with the latest version of SAM, consistently achieve high accuracy, exceeding 72% in F1-score and 95% in overall accuracy, and outperform a widely used statistical thresholding method by approximately 15% in F1-score. These results are also competitive with supervised deep learning methods, which typically require large labeled datasets and significant computational resources. By leveraging pretrained models and removing the need for manual annotation, the proposed SAM4CH4 framework offers a zero-shot scalable and efficient solution for operational methane plume detection and monitoring.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"2273-2284"},"PeriodicalIF":5.3,"publicationDate":"2025-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11286214","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Chlorophyll Concentration Inversion Method Based on OWTs and 1D CNN-Transformer Feature Extraction
IF 5.3 | Tier 2 (Earth Science) | Q1 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-08 | DOI: 10.1109/JSTARS.2025.3641630
Yusheng Cui;Tao Xie;Jian Li;Xuehong Zhang;Shuying Bai;Chao Wang;Hui Liu
The inversion of chlorophyll-a (Chla) across water bodies is challenged by the diversity of optical water types (OWTs) and by the nonlinear superposition of optical signals from optically active constituents. To address these challenges, this study introduces an inversion method based on water classification that integrates dual-stream feature extraction (via a one-dimensional convolutional neural network (1-D CNN)-Transformer) and multimodel evaluation, termed OWT-CCINET. We first applied k-means clustering to the reflectance spectra, dividing the samples into four OWTs: eutrophic, turbid, clear, and moderately turbid. For each group, feature extraction was adjusted using either sequential or parallel branches. Several feature combinations were then tested with different regression models, and the most suitable model for each OWT was determined based on a comparative evaluation. Compared with conventional models such as the OCx and CI hybrid algorithms, the model with targeted feature extraction and model selection based on water classification significantly improved the accuracy and applicability for Chla inversion across various OWTs. By applying OWT-CCINET, the natural log (ln)-transformed Chla concentrations were estimated, and the model achieved outstanding results, with a mean absolute error of 0.410, a root mean square error of 0.549, and a coefficient of determination (R2) of 0.899. After minimally harmonizing MODIS and MERIS to SeaWiFS-equivalent bands, OWT-CCINET maintained stable cross-sensor performance, supporting long-term, multimission applications. This study provides novel methodologies and theoretical support for the development of cross-water type models, offering significant value for practical applications such as marine environmental monitoring.
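A toy version of the classify-then-invert workflow with scikit-learn: cluster reflectance spectra into four OWTs, then fit one regressor per type. The synthetic data and the random-forest regressor are assumptions standing in for the paper's 1-D CNN-Transformer features and multimodel evaluation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
spectra = rng.random((500, 8))                 # 500 samples x 8 bands (synthetic)
ln_chla = rng.normal(size=500)                 # synthetic ln(Chl-a) targets

# Step 1: group spectra into four optical water types.
kmeans = KMeans(n_clusters=4, random_state=0, n_init=10).fit(spectra)
owt = kmeans.labels_

# Step 2: fit one OWT-specific regressor per cluster.
models = {k: RandomForestRegressor(random_state=0).fit(spectra[owt == k], ln_chla[owt == k])
          for k in range(4)}

# Inference: route a new spectrum to its OWT-specific model.
x_new = rng.random((1, 8))
k_new = kmeans.predict(x_new)[0]
print(k_new, models[k_new].predict(x_new))     # cluster id and ln(Chl-a) estimate
```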
{"title":"A Chlorophyll Concentration Inversion Method Based on OWTs and 1D CNN-Transformer Feature Extraction","authors":"Yusheng Cui;Tao Xie;Jian Li;Xuehong Zhang;Shuying Bai;Chao Wang;Hui Liu","doi":"10.1109/JSTARS.2025.3641630","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3641630","url":null,"abstract":"The inversion of chlorophyll-a (Chla) across water bodies is challenged by the diversity of optical water types (OWTs) and by the nonlinear superposition of optical signals from optically active constituents. To address these challenges, this study introduces an inversion method based on water classification that integrates dual-stream feature extraction (via a one-dimensional convolutional neural network (1-D CNN)-Transformer) and multimodel evaluation, termed OWT-CCINET. We first applied <italic>k</i>-means clustering to the reflectance spectra, dividing the samples into four OWTs: eutrophic, turbid, clear, and moderately turbid. For each group, feature extraction was adjusted using either sequential or parallel branches. Several feature combinations were then tested with different regression models, and the most suitable model for each OWT was determined based on a comparative evaluation. Compared with conventional models such as the OCx and CI hybrid algorithms, the model with targeted feature extraction and model selection based on water classification significantly improved the accuracy and applicability for Chla inversion across various OWTs. By applying OWT-CCINET, the natural log (ln)-transformed Chla concentrations were estimated, and the model achieved outstanding results, with a mean absolute error of 0.410, a root mean square error of 0.549, and a coefficient of determination (<italic>R</i><sup>2</sup>) of 0.899. After minimally harmonizing MODIS and MERIS to SeaWiFS-equivalent bands, OWT-CCINET maintained stable cross-sensor performance, supporting long-term, multimission applications. This study provides novel methodologies and theoretical support for the <italic>development</i> of cross-water type models, offering significant value for practical applications such as marine environmental monitoring.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"2006-2032"},"PeriodicalIF":5.3,"publicationDate":"2025-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11284877","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145886588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Elevation and Vegetation Cover Dominate Inter-Basin Water Use Efficiency Patterns in China
IF 5.3 | Tier 2 (Earth Science) | Q1 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-05 | DOI: 10.1109/JSTARS.2025.3640403
Jun Hu;Hongjun Su;Yiping Chen;Yuanwei Qin;Zhaohui Xue;Qian Du
Water use efficiency (WUE) is a fundamental indicator of the balance between ecosystem carbon assimilation and water consumption. However, its spatial variability and dominant environmental drivers across China's river basins remain unclear, posing challenges for basin-scale management. In this study, a comprehensive WUE analysis framework was established through the integration of multisource remote sensing and auxiliary datasets. In this framework, multisource vegetation, climate, topography, and land-use data were integrated to estimate WUE from the GPP-to-ET ratio, and a novel basin-scale dataset covering 25 major river basins in China from 2002 to 2021 was generated (CBS-WUE, https://doi.org/10.5281/zenodo.17402779), which was validated against FLUXNET2015 observations. With this new dataset, inter-basin comparisons were conducted to characterize spatial heterogeneity and temporal dynamics, while multivariate statistical and machine learning analyses were employed to identify the relative contributions of climatic, biotic, and land-use drivers. Results indicated that elevation and vegetation structure were the primary factors influencing basin-scale WUE differences. The national average WUE was 1.13 g C kg−1 H2O, with basin-level values ranging from 0.11 to 1.80 g C kg−1 H2O. Among them, higher WUE was in basins of moderate elevation and dense vegetation, and lower WUE was in high-elevation or arid basins. This integrative analysis highlights the dominant role of topography and vegetation in shaping WUE patterns and provides a scientific basis for enhancing water resource efficiency and ecological sustainability under changing environmental conditions.
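The WUE definition used above is simply the GPP-to-ET ratio in g C per kg H2O; the short NumPy sketch below applies it to made-up values purely for illustration.

```python
import numpy as np

# WUE = GPP / ET, in g C per kg H2O. Input values are illustrative, not data.
gpp = np.array([2.6, 1.1, 0.4])     # gross primary production, g C m^-2 day^-1
et  = np.array([2.3, 1.0, 3.5])     # evapotranspiration, kg H2O m^-2 day^-1
wue = np.where(et > 0, gpp / et, np.nan)   # guard against zero ET
print(wue.round(2))                  # [1.13 1.1  0.11]
```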
{"title":"Elevation and Vegetation Cover Dominate Inter-Basin Water Use Efficiency Patterns in China","authors":"Jun Hu;Hongjun Su;Yiping Chen;Yuanwei Qin;Zhaohui Xue;Qian Du","doi":"10.1109/JSTARS.2025.3640403","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3640403","url":null,"abstract":"Water use efficiency (WUE) is a fundamental indicator of the balance between ecosystem carbon assimilation and water consumption. However, its spatial variability and dominant environmental drivers across China's river basins remain unclear, posing challenges for basin-scale management. In this study, a comprehensive WUE analysis framework was established through the integration of multisource remote sensing and auxiliary datasets. In this framework, multisource vegetation, climate, topography, and land-use data were integrated to estimate WUE from the GPP-to-ET ratio, and a novel basin-scale dataset covering 25 major river basins in China from 2002 to 2021 was generated (CBS-WUE, <uri>https://doi.org/10.5281/zenodo.17402779</uri>), which was validated against FLUXNET2015 observations. With this new dataset, inter-basin comparisons were conducted to characterize spatial heterogeneity and temporal dynamics, while multivariate statistical and machine learning analyses were employed to identify the relative contributions of climatic, biotic, and land-use drivers. Results indicated that elevation and vegetation structure were the primary factors influencing basin-scale WUE differences. The national average WUE was 1.13 g C kg<sup>−1</sup> H<sub>2</sub>O, with basin-level values ranging from 0.11 to 1.80 g C kg<sup>−1</sup> H<sub>2</sub>O. Among them, higher WUE was in basins of moderate elevation and dense vegetation, and lower WUE was in high-elevation or arid basins. This integrative analysis highlights the dominant role of topography and vegetation in shaping WUE patterns and provides a scientific basis for enhancing water resource efficiency and ecological sustainability under changing environmental conditions.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1533-1548"},"PeriodicalIF":5.3,"publicationDate":"2025-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11278658","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778394","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Rotation and Scale Invariant Map Matching Method for UAV Visual Geolocalization
IF 5.3 | Tier 2 (Earth Science) | Q1 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-04 | DOI: 10.1109/JSTARS.2025.3639900
Yu Liu;Jing Bai;Zhu Xiao;Yuheng Lian;Licheng Jiao
Absolute visual localization based on onboard cameras and georeferenced data is the dominant solution for unmanned aerial vehicles (UAVs) when the global navigation satellite system is not available. However, current image-based matching methods are easily affected by significant feature differences between UAV images and georeferenced images, and map-based matching methods are affected by image rotation and resolution differences. To address the above-mentioned challenges, this article proposes a rotation and scale invariance map matching method (RSIM) to realize UAV visual localization in urban scenarios. Specifically, RSIM first extracts building contour information from UAV images. Then, based on the shape and spatial relationship features of the buildings, a scene matching method is designed to match the UAV image and the vector e-map. Finally, the centers of the matched building individuals are used as control points for UAV position solving. Extensive experiments conducted on the LoFTR and DKM datasets demonstrate that the proposed RSIM algorithm achieves superior localization performance compared to existing deep learning-based image matching methods. Moreover, our RSIM effectively enables UAV localization in urban scenarios even when the orientation and resolution of UAV images are unknown, achieving performance comparable to the SSRM map matching method, which relies on prior knowledge of these parameters.
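Solving the UAV position from matched building centers amounts to fitting a transform between image and map control points. The sketch below estimates a 2-D similarity transform (Umeyama least squares) from synthetic points; it illustrates only that final pose-solving step, not RSIM itself.

```python
import numpy as np

def fit_similarity(src: np.ndarray, dst: np.ndarray):
    """Least-squares similarity transform (scale s, rotation R, translation t)
    mapping src control points to dst: dst ~ s * R @ src + t (Umeyama)."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    s_c, d_c = src - mu_s, dst - mu_d
    cov = d_c.T @ s_c / len(src)
    U, S, Vt = np.linalg.svd(cov)
    D = np.diag([1.0, np.sign(np.linalg.det(U @ Vt))])   # avoid reflections
    R = U @ D @ Vt
    scale = np.trace(np.diag(S) @ D) / s_c.var(axis=0).sum()
    t = mu_d - scale * R @ mu_s
    return scale, R, t

if __name__ == "__main__":
    src = np.array([[0., 0.], [10., 0.], [0., 5.], [8., 7.]])   # image centers (px)
    theta = np.deg2rad(30)
    R_true = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
    dst = 2.0 * src @ R_true.T + np.array([100., 200.])         # map coordinates
    s, R, t = fit_similarity(src, dst)
    print(round(s, 3), t.round(1))                              # 2.0 [100. 200.]
```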
{"title":"A Rotation and Scale Invariant Map Matching Method for UAV Visual Geolocalization","authors":"Yu Liu;Jing Bai;Zhu Xiao;Yuheng Lian;Licheng Jiao","doi":"10.1109/JSTARS.2025.3639900","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3639900","url":null,"abstract":"Absolute visual localization based on onboard cameras and georeferenced data is the dominant solution for unmanned aerial vehicles (UAVs) when the global navigation satellite system is not available. However, current image-based matching methods are easily affected by significant feature differences between UAV images and georeferenced images, and map-based matching methods are affected by image rotation and resolution differences. To address the above-mentioned challenges, this article proposes a rotation and scale invariance map matching method (RSIM) to realize UAV visual localization in urban scenarios. Specifically, RSIM first extracts building contour information from UAV images. Then, based on the shape and spatial relationship features of the buildings, a scene matching method is designed to match the UAV image and the vector e-map. Finally, the centers of the matched building individuals are used as control points for UAV position solving. Extensive experiments conducted on the LoFTR and DKM datasets demonstrate that the proposed RSIM algorithm achieves superior localization performance compared to existing deep learning-based image matching methods. Moreover, our RSIM effectively enables UAV localization in urban scenarios even when the orientation and resolution of UAV images are unknown, achieving performance comparable to the SSRM map matching method, which relies on prior knowledge of these parameters.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1616-1627"},"PeriodicalIF":5.3,"publicationDate":"2025-12-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11278038","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145830804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
TranSTD: A Wavelet-Driven Transformer-Based SAR Target Detection Framework With Adaptive Feature Enhancement and Fusion
IF 5.3 | Tier 2 (Earth Science) | Q1 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-03 | DOI: 10.1109/JSTARS.2025.3639785
Bobo Xi;Jiaqi Chen;Yan Huang;Jiaojiao Li;Yunsong Li;Zan Li;Xiang-Gen Xia
Target detection in Synthetic Aperture Radar (SAR) images is of great importance in civilian monitoring and military reconnaissance. However, the unique speckle noise inherent in SAR images leads to semantic information loss, while traditional convolutional neural network downsampling methods exacerbate this issue, impacting detection accuracy and robustness. Moreover, some dense target scenarios and weak scattering features of targets make it challenging to achieve sufficient feature discriminability, adding complexity to the detection task. In addition, the multiscale characteristic of SAR targets presents difficulties in balancing detection performance with computational efficiency in complex scenes. To tackle these difficulties, this article introduces a wavelet-driven transformer-based SAR target detection framework called TranSTD. Specifically, it incorporates the Haar wavelet dynamic downsampling and semantic preserving dynamic downsampling modules, which effectively suppress noise and preserve semantic information using techniques such as Haar wavelet denoise and input-driven dynamic pooling downsampling. Furthermore, the SAR adaptive convolution (SAC) bottleneck is proposed for enhancing the discrimination of features. To optimize performance and efficiency across varying scene complexities, a multiscale SAR attention fusion encoder is developed. Extensive experiments are carried out on three datasets, showing that our proposed algorithm outperforms the current state-of-the-art benchmarks in SAR target detection, offering a robust solution for the detection of targets in complex SAR scenes.
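The Haar-wavelet downsampling idea can be sketched in a few lines: each 2x2 block is replaced by one low-pass and three detail responses stacked on the channel axis, so resolution halves without discarding information. The function below is an illustration of that principle, not the paper's module.

```python
import torch

def haar_downsample(x: torch.Tensor) -> torch.Tensor:
    """Toy Haar decomposition as a downsampling step: (B, C, H, W) -> (B, 4C, H/2, W/2)."""
    a = x[..., 0::2, 0::2]
    b = x[..., 0::2, 1::2]
    c = x[..., 1::2, 0::2]
    d = x[..., 1::2, 1::2]
    ll = (a + b + c + d) / 2      # low-pass approximation
    lh = (a - b + c - d) / 2      # horizontal detail
    hl = (a + b - c - d) / 2      # vertical detail
    hh = (a - b - c + d) / 2      # diagonal detail
    return torch.cat([ll, lh, hl, hh], dim=1)

if __name__ == "__main__":
    x = torch.randn(1, 8, 64, 64)
    print(haar_downsample(x).shape)   # torch.Size([1, 32, 32, 32])
```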
{"title":"TranSTD: A Wavelet-Driven Transformer-Based SAR Target Detection Framework With Adaptive Feature Enhancement and Fusion","authors":"Bobo Xi;Jiaqi Chen;Yan Huang;Jiaojiao Li;Yunsong Li;Zan Li;Xiang-Gen Xia","doi":"10.1109/JSTARS.2025.3639785","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3639785","url":null,"abstract":"Target detection in Synthetic Aperture Radar (SAR) images is of great importance in civilian monitoring and military reconnaissance. However, the unique speckle noise inherent in SAR images leads to semantic information loss, while traditional convolutional neural network downsampling methods exacerbate this issue, impacting detection accuracy and robustness. Moreover, some dense target scenarios and weak scattering features of targets make it challenging to achieve sufficient feature discriminability, adding complexity to the detection task. In addition, the multiscale characteristic of SAR targets presents difficulties in balancing detection performance with computational efficiency in complex scenes. To tackle these difficulties, this article introduces a wavelet-driven transformer-based SAR target detection framework called TranSTD. Specifically, it incorporates the Haar wavelet dynamic downsampling and semantic preserving dynamic downsampling modules, which effectively suppress noise and preserve semantic information using techniques such as Haar wavelet denoise and input-driven dynamic pooling downsampling. Furthermore, the SAR adaptive convolution (SAC) bottleneck is proposed for enhancing the discrimination of features. To optimize performance and efficiency across varying scene complexities, a multiscale SAR attention fusion encoder is developed. Extensive experiments are carried out on three datasets, showing that our proposed algorithm outperforms the current state-of-the-art benchmarks in SAR target detection, offering a robust solution for the detection of targets in complex SAR scenes.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1197-1211"},"PeriodicalIF":5.3,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11275702","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778392","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Dual-Branch EfficientNetV2-S-Based Method for Marine Oil Spill Detection Using Multisource Satellite Data Fusion
IF 5.3 | Tier 2 (Earth Science) | Q1 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-03 | DOI: 10.1109/JSTARS.2025.3639503
Yong Wan;Liyan Peng;Rui Zhang;Ruyue Zhang;Haowen Wang
As one of the most severe forms of pollution, oil spills pose significant threats to the marine environment. Synthetic aperture radar (SAR), an active microwave remote sensing technology, enables sea surface monitoring under all weather and lighting conditions and provides high spatial resolution. It has been widely used in the field of marine oil spill detection. However, other natural phenomena, such as low wind regions and biogenic oil films, can also produce dark spot features in SAR imagery that resemble oil spills, leading to false alarms. Global navigation satellite system-reflectometry (GNSS-R), as an emerging remote sensing technique for ocean observation, offers distinct advantages, including high temporal resolution and multisource observation capabilities. By combining SAR backscattering coefficients with GNSS-R delay doppler map, it becomes possible to characterize the impact of oil spills on sea surface roughness from both backscattering and forward-scattering perspectives. This joint approach enables more accurate oil spill detection and has the potential to reduce the false alarms. Nevertheless, limited measured data for multisource remote sensing oil spill detection hinders robust multisensor fusion model development. To address this, this study proposes a synchronized data generation method, creating a joint SAR and GNSS-R oil spill dataset, and on this basis, a dual-branch EfficientNetV2-S architecture is adopted to build a multisource satellite oil spill data fusion model, which is applied to offshore oil spill detection. According to experimental results, the suggested model detects oil spills with an accuracy of 94.97%. Compared with SAR-only detection models, the false alarm rate is reduced by 3.6%, demonstrating that the dual-payload approach effectively lowers the rate of false detections in marine oil spill monitoring.
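A toy dual-branch fusion sketch in PyTorch: one encoder per modality (SAR backscatter patch and GNSS-R delay-Doppler map) with feature concatenation before the classifier. The tiny CNN encoders and input sizes are assumptions standing in for the paper's EfficientNetV2-S backbones.

```python
import torch
import torch.nn as nn

class DualBranchDetector(nn.Module):
    """Toy dual-branch oil-spill classifier fusing SAR and GNSS-R DDM features."""
    def __init__(self, n_classes: int = 2):
        super().__init__()
        def enc(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.sar_enc, self.ddm_enc = enc(1), enc(1)
        self.head = nn.Linear(64, n_classes)   # 32 + 32 fused features

    def forward(self, sar, ddm):
        return self.head(torch.cat([self.sar_enc(sar), self.ddm_enc(ddm)], dim=1))

if __name__ == "__main__":
    sar = torch.randn(4, 1, 128, 128)          # backscatter patches
    ddm = torch.randn(4, 1, 17, 11)            # delay-Doppler maps (size is an assumption)
    print(DualBranchDetector()(sar, ddm).shape)  # torch.Size([4, 2])
```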
{"title":"A Dual-Branch EfficientNetV2-S-Based Method for Marine Oil Spill Detection Using Multisource Satellite Data Fusion","authors":"Yong Wan;Liyan Peng;Rui Zhang;Ruyue Zhang;Haowen Wang","doi":"10.1109/JSTARS.2025.3639503","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3639503","url":null,"abstract":"As one of the most severe forms of pollution, oil spills pose significant threats to the marine environment. Synthetic aperture radar (SAR), an active microwave remote sensing technology, enables sea surface monitoring under all weather and lighting conditions and provides high spatial resolution. It has been widely used in the field of marine oil spill detection. However, other natural phenomena, such as low wind regions and biogenic oil films, can also produce dark spot features in SAR imagery that resemble oil spills, leading to false alarms. Global navigation satellite system-reflectometry (GNSS-R), as an emerging remote sensing technique for ocean observation, offers distinct advantages, including high temporal resolution and multisource observation capabilities. By combining SAR backscattering coefficients with GNSS-R delay doppler map, it becomes possible to characterize the impact of oil spills on sea surface roughness from both backscattering and forward-scattering perspectives. This joint approach enables more accurate oil spill detection and has the potential to reduce the false alarms. Nevertheless, limited measured data for multisource remote sensing oil spill detection hinders robust multisensor fusion model development. To address this, this study proposes a synchronized data generation method, creating a joint SAR and GNSS-R oil spill dataset, and on this basis, a dual-branch EfficientNetV2-S architecture is adopted to build a multisource satellite oil spill data fusion model, which is applied to offshore oil spill detection. According to experimental results, the suggested model detects oil spills with an accuracy of 94.97%. Compared with SAR-only detection models, the false alarm rate is reduced by 3.6%, demonstrating that the dual-payload approach effectively lowers the rate of false detections in marine oil spill monitoring.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1549-1566"},"PeriodicalIF":5.3,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11275680","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
AF2-MSA Net: Attention-Fusion Focused Multiscale Architecture Network for Remote Sensing Scene Classification
IF 5.3 | Tier 2 (Earth Science) | Q1 (Engineering, Electrical & Electronic) | Pub Date: 2025-12-03 | DOI: 10.1109/JSTARS.2025.3639670
Cuiping Shi;Yimin Wang;Liguo Wang
With the rapid development of deep learning technology, significant progress has been made in the field of remote sensing (RS) scene image classification. However, the large intraclass distance and high interclass similarity still pose significant challenges for RS scene classification. In addition, there are multiscale targets in RS images, which make significant differences in target characteristics. To overcome the above limitations, this article proposes a novel attention-fusion focused multiscale architecture network (AF2-MSA Net). First, a multilevel feature extraction module (MFEM) was designed to extract semantic and detail information at different scales from RS images. Subsequently, an intricately designed global context recalibration module (GCRM) was embedded into MFEM, and the features at each level were enhanced through a global context recalibration mechanism, enabling the model to dynamically focus on key semantic regions and important contextual information. Next, an axis-aligned feature harmonization module (AAFHM) was constructed to fuse multiscale features from adjacent stages layer by layer. This module combines attention mechanisms from both channel and spatial branches to adaptively coordinate and fuse multiscale contextual information, achieving deep collaborative optimization of different scale features. Finally, the GCRM and AAFHM are integrated into a unified framework called AF2-MSA Net to achieve collaborative optimization of global semantics and multiscale discriminative features. Extensive experiments on three commonly used datasets have shown that the proposed AF2-MSA Net outperforms some state-of-the-art methods in RS image scene classification tasks.
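The channel-plus-spatial attention fusion of adjacent stages can be sketched with a CBAM-style module; the layer choices below are assumptions, and the code illustrates the general mechanism rather than the paper's GCRM/AAFHM.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionFusion(nn.Module):
    """Toy fusion of two adjacent-stage feature maps with channel and spatial attention."""
    def __init__(self, channels: int):
        super().__init__()
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(channels, channels, 1), nn.Sigmoid())
        self.spatial = nn.Sequential(nn.Conv2d(2, 1, 7, padding=3), nn.Sigmoid())

    def forward(self, shallow, deep):
        # upsample the coarser (deeper) map to the shallow map's resolution
        deep = F.interpolate(deep, size=shallow.shape[-2:], mode="bilinear",
                             align_corners=False)
        x = shallow + deep
        x = x * self.channel(x)                               # channel attention
        s = torch.cat([x.mean(1, keepdim=True),
                       x.max(1, keepdim=True).values], dim=1)
        return x * self.spatial(s)                            # spatial attention

if __name__ == "__main__":
    f1, f2 = torch.randn(1, 32, 64, 64), torch.randn(1, 32, 32, 32)
    print(AttentionFusion(32)(f1, f2).shape)                  # torch.Size([1, 32, 64, 64])
```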
{"title":"AF2-MSA Net: Attention-Fusion Focused Multiscale Architecture Network for Remote Sensing Scene Classification","authors":"Cuiping Shi;Yimin Wang;Liguo Wang","doi":"10.1109/JSTARS.2025.3639670","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3639670","url":null,"abstract":"With the rapid development of deep learning technology, significant progress has been made in the field of remote sensing (RS) scene image classification. However, the large intraclass distance and high interclass similarity still pose significant challenges for RS scene classification. In addition, there are multiscale targets in RS images, which make significant differences in target characteristics. To overcome the above limitations, this article proposes a novel attention-fusion focused multiscale architecture network (AF<sup>2</sup>-MSA Net). First, a multilevel feature extraction module (MFEM) was designed to extract semantic and detail information at different scales from RS images. Subsequently, an intricately designed global context recalibration module (GCRM) was embedded into MFEM, and the features at each level were enhanced through a global context recalibration mechanism, enabling the model to dynamically focus on key semantic regions and important contextual information. Next, an axis-aligned feature harmonization module (AAFHM) was constructed to fuse multiscale features from adjacent stages layer by layer. This module combines attention mechanisms from both channel and spatial branches to adaptively coordinate and fuse multiscale contextual information, achieving deep collaborative optimization of different scale features. Finally, the GCRM and AAFHM are integrated into a unified framework called AF<sup>2</sup>-MSA Net to achieve collaborative optimization of global semantics and multiscale discriminative features. Extensive experiments on three commonly used datasets have shown that the proposed AF<sup>2</sup>-MSA Net outperforms some state-of-the-art methods in RS image scene classification tasks.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1150-1164"},"PeriodicalIF":5.3,"publicationDate":"2025-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11275695","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729382","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0