
IEEE Geoscience and Remote Sensing Letters (a publication of the IEEE Geoscience and Remote Sensing Society): latest articles

Multiscale Low-Rank and Sparse Attention-Based Transformer for Hyperspectral Image Classification
Jinliang An;Longlong Dai;Muzi Wang;Weidong Zhang
Recently, transformer-based approaches have emerged as powerful tools for hyperspectral image (HSI) classification. HSI inherently exhibits low-rank and sparse properties due to spatial continuity and spectral redundancy. However, most existing methods directly adopt standard transformer architectures, overlooking the distinctive priors inherent in HSI, which limits the classification performance and modeling efficiency. To address these challenges, this letter proposes a multiscale low-rank and sparse transformer (MLSFormer) that effectively integrates both low-rank and sparse priors. Specifically, we leverage tensor low-rank decomposition (TLRD) to factorize the query, key, and value matrices into low-rank tensor products, capturing dominant low-rank structures. In parallel, we introduce a sparse attention mechanism to retain only the most important connections. Furthermore, a multiscale attention mechanism is designed to hierarchically partition attention heads into global, medium, and local groups, each assigned tailored decomposition ranks and sparsity ratios, enabling comprehensive multiscale feature extraction. Extensive experiments on three benchmark datasets demonstrate that MLSFormer achieves superior classification performance compared to state-of-the-art methods.
IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5, published 2025-08-22. DOI: 10.1109/LGRS.2025.3601670 (https://doi.org/10.1109/LGRS.2025.3601670). Journal article; impact factor 4.4.
Citations: 0
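The MLSFormer abstract above combines two ideas: low-rank factorization of the query, key, and value projections, and sparse attention that keeps only the strongest connections. The sketch below illustrates both in plain Python. It is not the authors' implementation: it substitutes an ordinary matrix factorization W ≈ UV for the tensor low-rank decomposition (TLRD) named in the abstract, and the function names `low_rank_project` and `topk_sparse_attention` and the parameter `keep` are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def low_rank_project(x, u, v):
    """Project x (n x d) through a rank-r factorization W ~ U V.

    U is d x r and V is r x d_out, so the projection costs O(r * (d + d_out))
    parameters instead of O(d * d_out). This is a matrix stand-in for TLRD.
    """
    return matmul(matmul(x, u), v)


def topk_sparse_attention(q, k, v, keep):
    """Scaled dot-product attention that keeps only the `keep` largest scores per query."""
    d = len(q[0])
    out = []
    for qi in q:
        scores = [sum(a * b for a, b in zip(qi, kj)) / math.sqrt(d) for kj in k]
        # Indices of the `keep` largest scores; all other connections are dropped.
        kept = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:keep]
        m = max(scores[j] for j in kept)  # subtract the max for a stable softmax
        weights = {j: math.exp(scores[j] - m) for j in kept}
        z = sum(weights.values())
        out.append([sum(weights[j] / z * v[j][c] for j in kept) for c in range(len(v[0]))])
    return out
```

In a multiscale variant, each group of heads (global, medium, local) would call these functions with its own rank and its own `keep` value; the specific values used in the paper are not given in the abstract.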
MACNet: A Multiscale Attention-Guided Contextual Network for Hyperspectral Anomaly Detection
Yuquan Gan;Xingyu Li;Siyu Wu;Mengjiao Wang
Hyperspectral anomaly detection (HAD) aims to identify anomalous targets that differ from the background in high-dimensional spectral images, and is widely applied in fields such as military reconnaissance and environmental monitoring. However, the diversity of anomaly scales, interference from complex backgrounds, and redundancy of spectral information pose significant challenges to achieving high detection accuracy. To address these issues, this letter proposes a multiscale attention-guided context network (MACNet) to enhance the perception of anomalous regions. MACNet consists of three components: a multiscale local feature extractor (MSLFE) that effectively captures edge structures and subtle anomalies at different scales, a global context awareness module (GCAM) that fuses local and global contextual information to improve discrimination under complex backgrounds, and a refined reconstruction and contrast enhancement module (RRCE) that employs channel attention and spatial reconstruction mechanisms to enhance the response differences between anomalies and background. Experiments on four publicly available hyperspectral datasets demonstrate that MACNet achieves superior detection accuracy compared to existing mainstream methods, validating the effectiveness of the proposed approach.
IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5, published 2025-08-22. DOI: 10.1109/LGRS.2025.3601600 (https://doi.org/10.1109/LGRS.2025.3601600). Journal article; impact factor 4.4.
Citations: 0
Partial Attention Feature Aggregation Network for Lightweight Remote Sensing Image Super-Resolution
Wei Xue;Tiancheng Shao;Mingyang Du;Xiao Zheng;Ping Zhong
Most lightweight super-resolution networks are designed to improve performance by introducing an attention mechanism and to reduce model parameters by designing lightweight convolutional layers. However, the introduction of the attention mechanism often leads to an increase in the number of parameters. In addition, the lightweight convolutional layer has a limited receptive field and cannot effectively capture long-range dependencies. In this letter, we design a novel lightweight base module called partial attention convolution (PAConv) and develop three variants of PAConv with different receptive fields to collaboratively exploit nonlocal information. Based on PAConv, we further propose a lightweight super-resolution network called partial attention feature aggregation network (PAFAN). Specifically, we arrange the PAConv variants in a progressive iterative manner to form the attention progressive feature distillation block (APFDB), which aims to gradually optimize the distilled features. Furthermore, we construct a multilevel aggregation spatial attention (MASA) by stacking the PAConv variants to systematically coordinate multiscale structural information. Extensive experiments conducted on benchmark datasets show that PAFAN achieves an optimal balance between reconstruction quality and computational efficiency. In particular, with only 123K parameters and 0.49G FLOPs, PAFAN maintains performance comparable to that of state-of-the-art (SOTA) methods.
IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5, published 2025-08-22. DOI: 10.1109/LGRS.2025.3601595 (https://doi.org/10.1109/LGRS.2025.3601595). Journal article; impact factor 4.4.
Citations: 0
G2L2Net: A Road Extraction Method for Remote Sensing Images via Gated Global–Local Linear Attention
Zhilin Qu;Mingzhe Li;Chenggong Wang;Zehua Chen
Road extraction from remote sensing imagery plays a pivotal role in a wide range of geospatial and urban applications. Nevertheless, this task remains inherently challenging due to the intricate morphological variations of roads and frequent occlusions or interference caused by complex background environments. To address these challenges, we propose a road extraction network based on gated global–local linear attention (G$^2$L$^2$Attention). First, we introduce a linear deformable convolution and design a linear input-dependent deformable convolution (LID2Conv), which adaptively modulates convolution offsets and weights in a content-aware manner. In addition, we design a top-K-based sparse gated weight (TGW). This gate serves as a shared weight that multiplies both the local and the global information to achieve G2L2Attention. Local information is obtained by LID2Conv, and global information is obtained by a 2-D selective scan (SS2D). These two pathways are integrated through the proposed G2L2Attention, enabling an efficient and consistent fusion of hierarchical spatial features. The extracted features are passed to the decoder. This approach improves road detail representation and provides accurate contextual information. Experiments conducted on three public road datasets demonstrate that G2L2Net outperforms the existing methods in various evaluation metrics. Our source code is available at https://github.com/ZehuaChenLab
IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5, published 2025-08-22. DOI: 10.1109/LGRS.2025.3601585 (https://doi.org/10.1109/LGRS.2025.3601585). Journal article; impact factor 4.4.
Citations: 0
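The G2L2Net abstract above describes a top-K-based sparse gated weight that is shared between a local branch and a global branch. A minimal sketch of that idea follows, on flat feature vectors. The names `topk_gate` and `apply_shared_gate` are hypothetical, and fusing the two gated branches by addition is an assumption: the abstract only says the gate multiplies both kinds of information.

```python
def topk_gate(scores, k):
    """Return a gate that keeps the k largest scores and zeroes the rest."""
    keep = set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k])
    return [s if i in keep else 0.0 for i, s in enumerate(scores)]


def apply_shared_gate(gate, local_feat, global_feat):
    """Multiply the same gate into both branches, then add them.

    Fusion by addition is an illustrative assumption, not taken from the paper.
    """
    gated_local = [g * x for g, x in zip(gate, local_feat)]
    gated_global = [g * x for g, x in zip(gate, global_feat)]
    return [a + b for a, b in zip(gated_local, gated_global)]
```

Because one gate feeds both branches, a position suppressed by the top-K selection is suppressed in the local and the global pathway alike, which keeps the two pathways spatially consistent.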
A Crossformer-Based Method for Sea Surface Height Prediction Using Delay–Doppler Map Feature Points
Jin Xing;Feng Wang;Dongkai Yang;Chuanrui Tan;Xiangchao Ma;Wenqian Chen;Guangmiao Ji
Global navigation satellite system-reflectometry (GNSS-R) provides an effective remote sensing technique for accurate retrieval of sea surface height (SSH) measurements. However, accuracy is severely affected by environmental disturbances such as wind-induced sea clutter and wave interference, degrading delay–Doppler map (DDM)-derived measurements. In this study, we propose an advanced trajectory-based deep learning model, Crossformer, explicitly designed to capture temporal dependencies inherent in GNSS-R sequential data. The method leverages five distinct DDM features: peak power point (PPP), maximum slope point (MSP), center pixel intensity (CPI), average power point (APP), and kurtosis (KUR). A dimension-segmentwise (DSW) embedding technique combined with a two-stage attention (TSA) mechanism effectively models both temporal and cross-dimensional correlations. Evaluation using CYGNSS data validated against Jason-3 Level 2 measurements demonstrates the superior performance of our approach, yielding a root mean square error (RMSE) of 0.93 m, mean absolute error (MAE) of 0.65 m, and a coefficient of determination ($R^{2}$) of 0.9901. Comparative analyses with baseline methods confirm significant improvements in robustness and predictive accuracy, particularly across varying sea states. This research underscores the potential of advanced temporal modeling techniques in GNSS-R altimetry applications.
IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5, published 2025-08-21. DOI: 10.1109/LGRS.2025.3601112 (https://doi.org/10.1109/LGRS.2025.3601112). Journal article; impact factor 4.4.
Citations: 0
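The Crossformer abstract above reports RMSE, MAE, and the coefficient of determination. As a reminder of how these three scores are computed from predicted and reference heights, here is a small sketch using their standard definitions. The sample values in the usage note are made up for illustration and are unrelated to the CYGNSS or Jason-3 data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

For reference heights `[0.0, 2.0, 4.0]` and predictions `[1.0, 2.0, 3.0]`, the residuals are -1, 0, and 1, so MAE is 2/3, RMSE is the square root of 2/3, and R² is 1 - 2/8 = 0.75. RMSE is never smaller than MAE, which is consistent with the reported 0.93 m versus 0.65 m.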
ProFus: Progressive Radar–Vision Heterogeneous Modality Fusion for Maritime Target Detection
Jingang Wang;Shikai Wu;Peng Liu
Maritime monitoring is crucial in both civilian and military applications, with shore-based radar and visual systems widely used due to their cost effectiveness. However, single-sensor methods have notable limitations: radar systems, while offering wide detection coverage, suffer from high false alarm rates and lack detailed target information, whereas visual systems provide rich details but perform poorly in adverse weather conditions such as rain and fog. To address these issues, this letter proposes a progressive radar–vision fusion method for surface target detection. Due to the significant differences in data characteristics between radar and visual sensors, direct fusion is nearly infeasible. Instead, the proposed method adopts a stepwise fusion strategy, consisting of coordinate calibration, shallow feature fusion, and deep feature integration. Experimental results show that this approach achieves an $\text{mAP}_{50}$ of 86.7% and an $\text{mAP}_{75}$ of 54.5%, outperforming YOLOv10 by 1.0% and 1.5%, respectively. Moreover, the proposed method significantly surpasses existing state-of-the-art radar–vision fusion approaches, demonstrating its superior effectiveness in complex environments.
IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5, published 2025-08-21. DOI: 10.1109/LGRS.2025.3601131 (https://doi.org/10.1109/LGRS.2025.3601131). Journal article; impact factor 4.4.
Citations: 0
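The ProFus abstract above reports mAP at two intersection-over-union (IoU) thresholds, 0.50 and 0.75. The sketch below shows the IoU computation that underlies both scores and why the stricter threshold yields a lower number. Boxes are assumed to be `(x1, y1, x2, y2)` tuples in pixel coordinates, and the helper name `is_match` is hypothetical; the full mAP computation (ranking by confidence and averaging precision over classes) is omitted.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)


def is_match(pred_box, gt_box, threshold):
    """A prediction counts as a true positive when its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold
```

A prediction `(0, 0, 10, 6)` against a ground-truth box `(0, 0, 10, 10)` has an IoU of 0.6, so it counts as a hit at the 0.50 threshold but as a miss at 0.75. Loosely localized detections of this kind explain the gap between the two reported scores.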
Combining Contrastive Learning and Diffusion Model for Hyperspectral Image Classification
Xiaorun Li;Jinhui Li;Shuhan Chen;Zeyu Cao
In recent years, self-supervised learning has made significant strides in hyperspectral image classification (HSIC). However, different approaches come with distinct strengths and limitations. Contrastive learning excels at extracting key information from large volumes of redundant data, but its training objective can inadvertently increase intraclass feature distance. To address this limitation, we leverage diffusion models (DMs) for their proven ability to refine and aggregate features by modeling complex data distributions. Specifically, the inherent denoising and generative processes of DMs are theoretically well suited to enhancing intraclass compactness by learning to reconstruct clean, representative features from perturbed inputs. We propose a new method, ContrastDM. This approach generates synthetic features, improving and enriching feature representation, and partially addressing the issue of sample sparsity. Classification experiments on three publicly available datasets demonstrate that ContrastDM significantly outperforms state-of-the-art methods.
IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5, published 2025-08-21. DOI: 10.1109/LGRS.2025.3601152 (https://doi.org/10.1109/LGRS.2025.3601152). Journal article; impact factor 4.4.
Citations: 0
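The ContrastDM abstract above relies on a diffusion model learning to reconstruct clean features from perturbed inputs. The sketch below shows only the perturbation side, using the standard DDPM forward process x_t = sqrt(ᾱ_t) x_0 + sqrt(1 - ᾱ_t) ε with a linear beta schedule. The schedule endpoints and the function names are assumptions chosen for illustration; the abstract does not state which schedule or feature space the authors use.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule."""
    bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        bars.append(prod)
    return bars


def perturb(x0, alpha_bar, noise=None, rng=random):
    """Sample x_t from x_0 in one shot: scale the signal and add Gaussian noise."""
    if noise is None:
        noise = [rng.gauss(0.0, 1.0) for _ in x0]
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]
```

A denoising network would then be trained to recover the clean feature (or the added noise) from the output of `perturb`. Since ᾱ_t shrinks as t grows, later steps carry less of the original feature and more noise.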
S3LBI: Spectral–Spatial Segmentation-Based Local Bicubic Interpolation for Single Hyperspectral Image Super-Resolution
Yubo Ma;Wei He;Siyu Cai;Qingke Zou
Single hyperspectral image (HSI) super-resolution (SR), which is limited by the lack of exterior information, has always been a challenging task. A lot of effort has gone into fully mining spectral information or adopting pretrained models to enhance spatial resolution. However, few SR approaches take into account structural features from the perspective of multidimensional segmentation of the image. Therefore, a novel spectral–spatial segmentation-based local bicubic interpolation (S3LBI) is proposed to implement segmented and blocked interpolation according to the characteristics of HSI. Specifically, the bands of an HSI are clustered into several spectral segments. Then, super-pixel segmentation is carried out in each spectral segment. After that, the bicubic interpolations are separately conducted on different spectral–spatial segments. Experiments demonstrate the superiority of our S3LBI over the compared HSI SR approaches.
单幅高光谱图像(HSI)的超分辨率一直是一项具有挑战性的任务,但受外部信息缺乏的限制。在充分挖掘光谱信息或采用预训练模型来提高空间分辨率方面已经付出了大量的努力。然而,很少有SR方法从图像的多维分割角度考虑结构特征。因此,根据HSI的特点,提出了一种基于频谱空间分割的局部双三次插值方法(S3LBI)来实现分割和块插值。具体地说,HSI的波段被聚集成几个光谱段。然后,对每个光谱段进行超像素分割。然后分别对不同的光谱空间段进行双三次插值。实验证明了我们的S3LBI优于比较的HSI SR方法。
{"title":"S3LBI: Spectral–Spatial Segmentation-Based Local Bicubic Interpolation for Single Hyperspectral Image Super-Resolution","authors":"Yubo Ma;Wei He;Siyu Cai;Qingke Zou","doi":"10.1109/LGRS.2025.3601230","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3601230","url":null,"abstract":"Single hyperspectral image (HSI) super-resolution (SR), which is limited by the lack of exterior information, has always been a challenging task. A lot of effort has gone into fully mining spectral information or adopting pretrained models to enhance spatial resolution. However, few SR approaches take into account structural features from the perspective of multidimensional segmentation of the image. Therefore, a novel spectral–spatial segmentation-based local bicubic interpolation (S3LBI) is proposed to implement segmented and blocked interpolation according to the characteristics of HSI. Specifically, the bands of an HSI are clustered into several spectral segments. Then, super-pixel segmentation is carried out in each spectral segment. After that, the bicubic interpolations are separately conducted on different spectral–spatial segments. Experiments demonstrate the superiority of our S3LBI over the compared HSI SR approaches.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144914174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
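The S3LBI entry above outlines a three-step pipeline: cluster the bands into spectral segments, run super-pixel segmentation inside each segment, then apply bicubic interpolation per spectral-spatial segment. The sketch below illustrates only the first step, and only under an assumption: the abstract does not say which clustering rule is used, so the adjacent-band correlation threshold here is a hypothetical stand-in, and the names `pearson`, `spectral_segments`, and `threshold` are illustrative rather than taken from the paper.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation between two equal-length sequences of floats."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant band carries no correlation information
    return cov / sqrt(var_a * var_b)


def spectral_segments(bands, threshold=0.9):
    """Group contiguous band indices into spectral segments.

    ASSUMPTION (not from the paper): a new segment starts whenever the
    correlation between a band and the previous band drops below `threshold`.
    """
    if not bands:
        return []
    segments = [[0]]
    for i in range(1, len(bands)):
        if pearson(bands[i - 1], bands[i]) >= threshold:
            segments[-1].append(i)  # still similar: extend current segment
        else:
            segments.append([i])    # dissimilar: open a new segment
    return segments


# Toy data: 4 bands, each flattened to 4 pixel values.
bands = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [4.0, 3.0, 2.0, 1.0],
    [4.1, 2.8, 2.1, 0.9],
]
print(spectral_segments(bands))  # [[0, 1], [2, 3]]
```

Each returned list of band indices would then be handed to the spatial step (super-pixel segmentation followed by per-segment bicubic interpolation), which is not sketched here.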
Multimodal-Guided Transformer Architecture for Remote Sensing Salient Object Detection
Bei Cheng;Zao Liu;Huxiao Tang;Qingwang Wang;Wenhao Chen;Tao Chen;Tao Shen
The latest remote sensing image saliency detectors primarily rely on RGB information alone. However, spatial and geometric information embedded in depth images is robust to variations in lighting and color. Integrating depth information with RGB images can enhance the spatial structure of objects. In light of this, we innovatively propose a remote sensing image saliency detection model that fuses RGB and depth information, named the multimodal-guided transformer architecture (MGTA). Specifically, we first introduce the strongly correlated complementary fusion (SCCF) module to explore cross-modal consistency and similarity, maintaining consistency across different modalities while uncovering multidimensional common information. In addition, the global–local context information interaction (GLCII) module is designed to extract global semantic information and local detail information, effectively utilizing contextual information while reducing the number of parameters. Finally, a cascaded feature-guided decoder (CFGD) is employed to gradually fuse hierarchical decoding features, effectively integrating multilevel data and accurately locating target positions. Extensive experiments demonstrate that our proposed model outperforms 14 state-of-the-art methods. The code and results of our method are available at https://github.com/Zackisliuzao/MGTANet
DOI: 10.1109/LGRS.2025.3601083 (https://doi.org/10.1109/LGRS.2025.3601083). IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5. Journal Article. Published 2025-08-21. Journal impact factor: 4.4.
Citations: 0
Channel Characterization Based on 3-D TransUnet-CBAM With Multiloss Function
Binpeng Yan;Jiaqi Zhao;Mutian Li;Rui Pan
The channel system is intimately linked to the formation of oil and gas reservoirs. In petroliferous basins, channel deposits frequently serve as both storage spaces and fluid conduits. Accurate identification of channels in 3-D seismic data is therefore critical for reservoir prediction. Traditional seismic attribute-based methods can outline channel boundaries, but noise and stratigraphic complexity introduce discontinuities that reduce accuracy and require extensive manual correction. Deep learning-based methods outperform conventional methods in terms of efficiency and precision. However, the similar seismic signatures of channels and continuous karst caves in seismic profiles can still mislead existing models. To address this challenge, we propose an improved variant of the 3-D TransUnet model for 3-D seismic data recognition. The model incorporates channel and spatial attention mechanisms into the skip connections of the TransUnet architecture, effectively enhancing its feature representation capability and recognition accuracy. In addition, a multiloss function is introduced to improve the delineation and continuity of the channel while increasing the model’s robustness against nonchannel interference features. Experiments on synthetic and field seismic data confirm superior boundary delineation, continuity, and noise resistance compared with baseline methods.
DOI: 10.1109/LGRS.2025.3601200 (https://doi.org/10.1109/LGRS.2025.3601200). IEEE Geoscience and Remote Sensing Letters, vol. 22, pp. 1-5. Journal Article. Published 2025-08-21. Journal impact factor: 4.4.
Citations: 0
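The channel-characterization entry above mentions a multiloss function but does not name its terms. As a rough illustration of what a multi-term segmentation loss can look like, the sketch below combines binary cross-entropy with a soft Dice loss. This pairing, the equal weights, and the names `bce_loss`, `dice_loss`, `multi_loss`, `w_bce`, and `w_dice` are all assumptions made for illustration, not the authors' formulation.

```python
from math import log


def bce_loss(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flattened voxel probabilities."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= t * log(p) + (1.0 - t) * log(1.0 - p)
    return total / len(pred)


def dice_loss(pred, target, smooth=1.0):
    """Soft Dice loss; rewards overlap between prediction and label mask."""
    inter = sum(p * t for p, t in zip(pred, target))
    denom = sum(pred) + sum(target)
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)


def multi_loss(pred, target, w_bce=0.5, w_dice=0.5):
    """Weighted sum of the two terms (ASSUMED weights, not from the paper)."""
    return w_bce * bce_loss(pred, target) + w_dice * dice_loss(pred, target)


# Toy example: 4 voxels, label 1 = channel, 0 = background.
target = [1.0, 0.0, 1.0, 0.0]
good = [0.9, 0.1, 0.8, 0.2]   # mostly correct probabilities
bad = [0.2, 0.8, 0.3, 0.7]    # mostly wrong probabilities
print(multi_loss(good, target) < multi_loss(bad, target))  # True
```

The usual motivation for pairing the two terms is that cross-entropy scores each voxel independently while Dice scores region overlap, which matters when channel voxels are a small fraction of the volume. Whether the paper uses these terms is not stated in the abstract.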