TinyRS-R1: Compact Vision Language Model for Remote Sensing
Pub Date : 2025-10-20 | DOI: 10.1109/LGRS.2025.3623244
Aybora Köksal;A. Aydın Alatan
Remote sensing (RS) applications often rely on edge hardware that cannot host today's 7B-parameter vision-language models. This letter presents TinyRS, the first 2B-parameter vision-language model (VLM) optimized for RS, and TinyRS-R1, its reasoning-augmented variant. Based on Qwen2-VL-2B, TinyRS is trained via a four-stage pipeline: pretraining on million-scale satellite imagery, instruction tuning, fine-tuning with chain-of-thought (CoT) annotations from a new reasoning dataset, and group relative policy optimization (GRPO)-based alignment. TinyRS-R1 matches or surpasses recent 7B RS models in classification, visual question answering (VQA), grounding, and open-ended QA, while requiring roughly one-third of their memory and latency. CoT reasoning improves grounding and scene understanding, while the base TinyRS excels at concise, low-latency VQA. TinyRS-R1 is the first domain-specialized small VLM with GRPO-aligned CoT reasoning for general-purpose RS. The code, models, and caption datasets are available at https://github.com/aybora/TinyRS
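For readers unfamiliar with GRPO, the sketch below illustrates the group-relative advantage computation at its core: rewards for a group of sampled responses are standardized within the group, so no learned critic is needed. The reward values and group size are illustrative assumptions, not details from the letter.

```python
# Minimal sketch of the group-relative advantage used in GRPO-style alignment.
# Illustrative only: reward values and group size are made up, not from the paper.
import torch

def group_relative_advantages(rewards: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """rewards: (num_prompts, group_size) scalar rewards of sampled responses.
    Each response's advantage is its reward standardized within its own group."""
    mean = rewards.mean(dim=1, keepdim=True)
    std = rewards.std(dim=1, keepdim=True)
    return (rewards - mean) / (std + eps)

# Example: 2 prompts, 4 sampled answers each (e.g., scored for correctness/format).
rewards = torch.tensor([[1.0, 0.0, 1.0, 0.5],
                        [0.0, 0.0, 1.0, 0.0]])
print(group_relative_advantages(rewards))
```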
{"title":"TinyRS-R1: Compact Vision Language Model for Remote Sensing","authors":"Aybora Köksal;A. Aydın Alatan","doi":"10.1109/LGRS.2025.3623244","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3623244","url":null,"abstract":"Remote sensing (RS) applications often rely on edge hardware that cannot host the models in the 7B parametric vision language of today. This letter presents TinyRS, the first 2B-parameter vision language models (VLMs) optimized for RS, and TinyRS-R1, its reasoning-augmented variant. Based on Qwen2-VL-2B, TinyRS is trained via a four-stage pipeline: pretraining on million-scale satellite images, instruction tuning, fine-tuning with chain-of-thought (CoT) annotations from a new reasoning dataset, and group relative policy optimization (GRPO)-based alignment. TinyRS-R1 matches or surpasses recent 7B RS models in classification, visual question answering (VQA), grounding, and open-ended QA—while using one third of the memory and latency. CoT reasoning improves grounding and scene understanding, while TinyRS excels at concise, low-latency VQA. TinyRS-R1 is the first domain-specialized small VLM with GRPO-aligned CoT reasoning for general-purpose RS. The code, models, and caption datasets are available at <uri>https://github.com/aybora/TinyRS</uri>","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145405232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Multiscale Window Attention Channel Enhanced for Remote Sensing Image Super-Resolution
Pub Date : 2025-10-13 | DOI: 10.1109/LGRS.2025.3620872
Jingfan Wang;Wen Lu;Zeming Zhang;Zhaoyang Wang;Zhe Li
Transformer-based methods for remote sensing image super-resolution (SR) face challenges in reconstructing high-frequency textures due to the interference from large flat regions, such as farmlands and water bodies. To address these limitations, we propose a channel-enhanced multiscale window attention mechanism, which is designed to minimize the impact of flat regions on high-frequency area reconstruction while effectively utilizing the intrinsic multiscale features of remote sensing images. To better capture the multiscale features of remote sensing images, we introduce a series of depthwise separable convolution kernels of varying sizes during the shallow feature extraction stage. Experimental results demonstrate that the proposed method achieves superior peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) scores across multiple remote sensing benchmark datasets and scaling factors, validating its effectiveness.
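As a rough illustration of the shallow-feature idea described above, the sketch below builds a block of depthwise separable convolutions with several kernel sizes and fuses their outputs. Kernel sizes, channel count, and fusion by a 1x1 convolution are assumptions for illustration, not the paper's configuration.

```python
# Sketch of a multi-kernel depthwise-separable convolution block of the kind the
# abstract describes for shallow feature extraction; all hyperparameters are assumed.
import torch
import torch.nn as nn

class MultiScaleDWConv(nn.Module):
    def __init__(self, channels: int, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                # depthwise: one filter per channel, padding keeps spatial size
                nn.Conv2d(channels, channels, k, padding=k // 2, groups=channels),
                # pointwise: 1x1 conv mixes channels
                nn.Conv2d(channels, channels, 1),
            )
            for k in kernel_sizes
        ])
        self.fuse = nn.Conv2d(channels * len(kernel_sizes), channels, 1)

    def forward(self, x):
        # concatenate the multiscale responses and fuse back to `channels`
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

x = torch.randn(1, 64, 48, 48)          # e.g., shallow features of an RS patch
print(MultiScaleDWConv(64)(x).shape)     # torch.Size([1, 64, 48, 48])
```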
{"title":"Multiscale Window Attention Channel Enhanced for Remote Sensing Image Super-Resolution","authors":"Jingfan Wang;Wen Lu;Zeming Zhang;Zhaoyang Wang;Zhe Li","doi":"10.1109/LGRS.2025.3620872","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3620872","url":null,"abstract":"Transformer-based methods for remote sensing image super-resolution (SR) face challenges in reconstructing high-frequency textures due to the interference from large flat regions, such as farmlands and water bodies. To address these limitations, we propose a channel-enhanced multiscale window attention mechanism, which is designed to minimize the impact of flat regions on high-frequency area reconstruction while effectively utilizing the intrinsic multiscale features of remote sensing images. To better capture the multiscale features of remote sensing images, we introduce a series of depthwise separable convolution kernels of varying sizes during the shallow feature extraction stage. Experimental results demonstrate that the proposed method achieves superior peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) scores across multiple remote sensing benchmark datasets and scaling factors, validating its effectiveness.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"23 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-10-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Integrating Global and Local Information for Remote Sensing Image–Text Retrieval
Pub Date : 2025-10-01 | DOI: 10.1109/LGRS.2025.3616154
Ziyun Chen;Fan Liu;Zhangqingyun Guan;Qian Zhou;Xiaocong Zhou;Chuanyi Zhang
Pretrained vision–language models (VLMs) have demonstrated promising performance in remote sensing (RS) image–text retrieval. However, the scarcity of high-quality image–text datasets remains a challenge when fine-tuning VLMs for RS: the captions in existing datasets tend to be uniform and lack detail. To fully exploit the rich, detailed information in RS images, we propose a method to fine-tune VLMs. We first construct a new vision–language dataset that balances global and local information for RS (GLRS) image–text retrieval. Specifically, a multimodal large language model (MLLM) is used to generate captions for local patches and global captions for the entire image. To use local information effectively, we propose a global and local image captioning method (GLCap). With a large language model (LLM), we further obtain higher quality captions by merging the global and local captions. Finally, we fine-tune the weights of RS-M-CLIP (contrastive language–image pretraining) with a progressive global–local fine-tuning strategy on GLRS. Experimental results demonstrate that our method outperforms state-of-the-art (SoTA) approaches on two common RS image–text retrieval downstream tasks. Our code and dataset are available at https://github.com/hhu-czy/GLRS
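The abstract does not spell out the fine-tuning objective; a standard CLIP-style symmetric contrastive (InfoNCE) loss, sketched below, is the usual choice for image-text retrieval and is shown only as a plausible stand-in. The embedding dimension and temperature are assumptions.

```python
# Sketch of the symmetric image-text contrastive (InfoNCE) loss commonly used when
# fine-tuning CLIP-style retrieval models; an assumption, not code from the GLRS paper.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(img_emb, txt_emb, temperature: float = 0.07):
    """img_emb, txt_emb: (batch, dim) embeddings of matched image/caption pairs."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature          # (batch, batch) similarity
    targets = torch.arange(len(logits), device=logits.device)
    # each image should match its own caption and vice versa
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))

loss = clip_contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```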
{"title":"Integrating Global and Local Information for Remote Sensing Image–Text Retrieval","authors":"Ziyun Chen;Fan Liu;Zhangqingyun Guan;Qian Zhou;Xiaocong Zhou;Chuanyi Zhang","doi":"10.1109/LGRS.2025.3616154","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3616154","url":null,"abstract":"Pretrained vision–language models (VLMs) have demonstrated promising performance in remote sensing (RS) image–text retrieval tasks. However, the scarcity of high-quality image–text datasets remains a challenge in fine-tuning VLMs for RS. The captions in existing datasets tend to be uniform and lack details. To fully use rich detailed information from RS images, we propose a method to fine-tune VLMs. We first construct a new visual–language dataset that balances both global and local information for RS (GLRS) image–text retrieval. Specifically, a multimodal large language model (MLLM) is used to generate captions for local patches and global captions for the entire image. To effectively use local information, we propose a global and local image captioning method (GLCap). With a large language model (LLM), we further obtain higher quality captions by merging both global and local captions. Finally, we fine-tune the weights of RS-M-contrastive language image pretraining (CLIP) with a progressive global–local fine-tuning strategy on GLRS. Experimental results demonstrate that our method outperforms state-of-the-art (SoTA) approaches on two common RS image–text retrieval downstream tasks. Our code and dataset are available at <uri>https://github.com/hhu-czy/GLRS</uri>","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145455801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Semi-Supervised Triple-GAN With Similarity Constraint for Automatic Underground Object Classification Using Ground Penetrating Radar Data
Pub Date : 2025-09-12 | DOI: 10.1109/LGRS.2025.3609444
Li Liu;Yongcheng Zhou;Hang Xu;Jingxia Li;Jianguo Zhang;Lijun Zhou;Bingjie Wang
Automatic underground object classification based on deep learning (DL) has been widely used in ground penetrating radar (GPR) applications. However, its performance depends heavily on sufficient labeled training data, and in GPR work large amounts of labeled data are difficult to obtain because manual annotation is time-consuming and experience-dependent. To address the issue of limited labeled data, we propose a novel semi-supervised learning (SSL) method for urban-road underground multiclass object classification that fully exploits abundant unlabeled data alongside the limited labeled data to enhance classification performance. We adopt a variant of the triple-GAN (TGAN) model and modify it by introducing a similarity constraint that is tied to the geometric features of GPR data and helps produce high-quality generated images. Experiments on laboratory and field data show that the method achieves higher accuracy than representative baseline methods under limited labeled data.
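The exact form of the similarity constraint is not given in the abstract; the sketch below shows one plausible way such a term could be combined with the supervised loss on the scarce labeled data, using a cosine-similarity penalty between features of generated and real GPR patches. The `similarity_constraint` helper and the weighting factor are hypothetical.

```python
# Hypothetical sketch: a similarity term added to a semi-supervised classification loss.
# The paper's actual constraint (tied to GPR hyperbola geometry) is not specified here.
import torch
import torch.nn.functional as F

def similarity_constraint(fake_feats, real_feats):
    """Penalize dissimilarity between features of generated and real GPR patches.
    fake_feats, real_feats: (batch, dim) feature vectors from a shared encoder."""
    return 1.0 - F.cosine_similarity(fake_feats, real_feats, dim=-1).mean()

def classifier_loss(logits_labeled, labels, fake_feats, real_feats, lam: float = 0.1):
    # supervised term on the scarce labeled data + weighted similarity term
    return F.cross_entropy(logits_labeled, labels) + lam * similarity_constraint(fake_feats, real_feats)

loss = classifier_loss(torch.randn(4, 3), torch.tensor([0, 2, 1, 1]),
                       torch.randn(4, 128), torch.randn(4, 128))
print(loss.item())
```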
{"title":"Semi-Supervised Triple-GAN With Similarity Constraint for Automatic Underground Object Classification Using Ground Penetrating Radar Data","authors":"Li Liu;Yongcheng Zhou;Hang Xu;Jingxia Li;Jianguo Zhang;Lijun Zhou;Bingjie Wang","doi":"10.1109/LGRS.2025.3609444","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3609444","url":null,"abstract":"Automatic underground object classification based on deep learning (DL) has been widely used in ground penetrating radar (GPR) fields. However, its excellent performance heavily depends on sufficient labeled training data. In GPR fields, large amounts of labeled data are difficult to obtain due to time-consuming and experience-dependent manual annotation work. To address the issue of limited labeled data, we propose a novel semi-supervised learning (SSL) method for urban-road underground multiclass object classification. It fully utilizes abundant unlabeled data and limited labeled data to enhance classification performance. We applied a variant of the triple-GAN (TGAN) model and modified it by introducing a similarity constraint, which is associated with GPR data geometric features and can help to produce high-quality generated images. Experimental results of laboratory and field data show that it has higher accuracy than representative baseline methods under limited labeled data.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145078645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Large-Scale Traveling Ionospheric Disturbances Over North America and Europe During the May 2024 Extreme Geomagnetic Storm
Pub Date : 2025-09-11 | DOI: 10.1109/LGRS.2025.3608704
Long Tang;Hong Zhang;Yumei Li;Fan Xu;Fang Zou
This study investigates the large-scale traveling ionospheric disturbances (LSTIDs) over North America and Europe associated with the intense geomagnetic storm of May 2024, using total electron content (TEC) data derived from ground-based Global Navigation Satellite System (GNSS) stations. The findings reveal that the observed LSTIDs in both regions exhibited an unusually prolonged duration, lasting over 10 h from 17:00 UT on May 10 to 03:30 UT on May 11, 2024. This extended duration may be attributed to the continuous triggering of LSTIDs by auroral energy input during the geomagnetic storm. Additionally, significant differences in propagation characteristics, including velocities, azimuths, wavelengths, and traveling distances of the LSTIDs, were observed between the two regions. These disparities are likely due to variations in the magnitude of energy input in the polar regions and to the local time difference between North America (14:00 LT) and Europe (19:00 LT), which produces a diurnal electron-density contrast that influences LSTID propagation.
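As background on how wave-like disturbances are typically isolated from GNSS TEC series, the sketch below removes a smooth background trend and keeps the perturbation. The letter's actual processing chain is not described in the abstract, so the filter choice, window length, and sampling interval are assumptions.

```python
# Illustrative sketch of a common detrending step for extracting TID signatures from
# GNSS TEC time series (remove a smooth background, keep the perturbation).
import numpy as np
from scipy.signal import savgol_filter

dt = 30.0                                     # assumed sample interval in seconds
t = np.arange(0, 4 * 3600, dt)                # 4 hours of synthetic TEC samples
background = 20 + 5 * np.sin(2 * np.pi * t / (8 * 3600))   # slow, diurnal-like trend (TECU)
lstid = 0.8 * np.sin(2 * np.pi * t / 3600)                  # ~60-min wave-like disturbance
tec = background + lstid + 0.1 * np.random.randn(t.size)

trend = savgol_filter(tec, window_length=241, polyorder=3)  # ~2-h smoothing window
dtec = tec - trend                                          # detrended TEC perturbation
print(f"peak-to-peak perturbation: {dtec.max() - dtec.min():.2f} TECU")
```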
{"title":"Large-Scale Traveling Ionospheric Disturbances Over North America and Europe During the May 2024 Extreme Geomagnetic Storm","authors":"Long Tang;Hong Zhang;Yumei Li;Fan Xu;Fang Zou","doi":"10.1109/LGRS.2025.3608704","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3608704","url":null,"abstract":"This study investigates the large-scale ionospheric traveling disturbances (LSTIDs) over North America and Europe associated with the intense geomagnetic storm in May 2024, utilizing total electron content (TEC) data derived from ground-based Global Navigation Satellite System (GNSS) stations. The findings reveal that the observed LSTIDs in both regions exhibited an unusually prolonged duration, lasting for over 10 h from 17:00 UT on May 10 to 03:30 UT on May 11, 2024. This extended duration may be attributed to the continuous triggering of LSTIDs by auroral energy input during the geomagnetic storm. Additionally, significant differences in propagation characteristics, including velocities, azimuths, wavelengths, and traveling distances of LSTIDs, were observed between the two regions. These disparities in LSTID parameters are likely due to variations in the magnitude of energy input in the polar regions and local time differences in North America (14:00 LT) and Europe (19:00 LT), which cause diurnal electron-density contrast to influence LSTID propagation.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145090175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
KD-RSCC: A Karras Diffusion Framework for Efficient Remote Sensing Change Captioning
Pub Date : 2025-09-10 | DOI: 10.1109/LGRS.2025.3608489
Xiaofei Yu;Jie Ma;Liqiang Qiao
Remote sensing image change captioning (RSICC) is a challenging task that involves describing surface changes between bitemporal or multitemporal satellite images using natural language. This task requires both fine-grained visual understanding and expressive language generation. Transformer-based and long short-term memory (LSTM)-based models have shown promising results in this domain. However, they may encounter difficulties in generating flexible and diverse captions, particularly when training data are limited or imbalanced. While diffusion models provide richer textual outputs, they are often constrained by long inference times. To address these issues, we propose a novel diffusion-based framework, KD-RSCC, for efficient and expressive remote sensing change captioning. This framework utilizes the Karras sampling method to significantly reduce the number of steps required during inference, while preserving the quality and diversity of the generated captions. In addition, we introduce a large language model (LLM)-based evaluation strategy $\text{G-Eval}_{\text{RSCC}}$ to conduct a more comprehensive assessment of the semantic accuracy, fluency, and linguistic diversity of the generated descriptions. Experimental results demonstrate that KD-RSCC achieves an optimal balance between generation quality and inference speed, enhancing the flexibility and readability of its outputs. The code and supplementary materials are available at https://github.com/Fay-Y/KD_RSCC
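"Karras sampling" refers to the noise-level schedule of Karras et al. (EDM); the sketch below reproduces that schedule with the commonly used default sigma_min, sigma_max, and rho, which are assumptions rather than the KD-RSCC settings.

```python
# Sketch of the Karras et al. (EDM) noise-level schedule behind "Karras sampling".
import numpy as np

def karras_sigmas(n_steps: int, sigma_min: float = 0.002, sigma_max: float = 80.0, rho: float = 7.0):
    """Noise levels spaced so that sigma^(1/rho) is linear in the step index,
    which concentrates steps at low noise and allows far fewer sampling steps."""
    ramp = np.linspace(0.0, 1.0, n_steps)
    inv_rho = sigma_max ** (1 / rho) + ramp * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))
    return inv_rho ** rho

print(karras_sigmas(10).round(3))   # 10 noise levels from sigma_max down to sigma_min
```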
{"title":"KD-RSCC: A Karras Diffusion Framework for Efficient Remote Sensing Change Captioning","authors":"Xiaofei Yu;Jie Ma;Liqiang Qiao","doi":"10.1109/LGRS.2025.3608489","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3608489","url":null,"abstract":"Remote sensing image change captioning (RSICC) is a challenging task that involves describing surface changes between bitemporal or multitemporal satellite images using natural language. This task requires both fine-grained visual understanding and expressive language generation. Transformer-based and long short-term memory (LSTM)-based models have shown promising results in this domain. However, they may encounter difficulties in generating flexible and diverse captions, particularly when training data are limited or imbalanced. While diffusion models provide richer textual outputs, they are often constrained by long inference times. To address these issues, we propose a novel diffusion-based framework, KD-RSCC, for efficient and expressive remote sensing change captioning. This framework utilizes the Karras sampling method to significantly reduce the number of steps required during inference, while preserving the quality and diversity of the generated captions. In addition, we introduce a large language model (LLM)-based evaluation strategy <inline-formula> <tex-math>$text {G-Eval}_{text {RSCC}}$ </tex-math></inline-formula> to conduct a more comprehensive assessment of the semantic accuracy, fluency, and linguistic diversity of the generated descriptions. Experimental results demonstrate that KD-RSCC achieves an optimal balance between generation quality and inference speed, enhancing the flexibility and readability of its outputs. The code and supplementary materials are available at <uri>https://github.com/Fay-Y/KD_RSCC</uri>","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145090174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
RoGLSNet: An Efficient Global–Local Scene Awareness Network With Rotary Position Embedding for Remote Image Segmentation
Pub Date : 2025-09-09 | DOI: 10.1109/LGRS.2025.3607840
Xiaosheng Yu;Weiqi Bai;Jubo Chen;Jiawei Huang;Zhuoqun Fang;Zhaokui Li
Accurate segmentation of very high-resolution remote sensing images is vital for downstream tasks. Most semantic segmentation methods fail to fully consider the inherent characteristics of the images, such as intricate backgrounds, significant intraclass variance, and spatial interdependence of geographic object distribution. To address these challenges, we propose an efficient global–local scene awareness network with rotary position embedding (RoGLSNet). Specifically, we introduce the dynamic global filter (DGF) module to adaptively select frequency components, thereby mitigating interference from background noise. For high intraclass variance, the class center aware block (CCAB) performs class-level contextual modeling with spatial information integration. Additionally, the rotary position embedding (RoPE) is incorporated into vanilla attention to indirectly model the positional and distance relationships of geographic target objects. Extensive experimental results on two widely used datasets demonstrate that RoGLSNet outperforms the state-of-the-art (SOTA) segmentation methods. The code is available at https://github.com/bai101315/RoGLSNet
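For reference, the sketch below shows a minimal 1-D rotary position embedding of the kind incorporated into vanilla attention; the head dimension and base frequency are the standard defaults, not values from RoGLSNet, and a 2-D image variant would apply separate rotations per spatial axis.

```python
# Minimal sketch of rotary position embedding (RoPE) applied to query/key vectors.
import torch

def apply_rope(x: torch.Tensor, positions: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """x: (..., seq, dim) queries or keys with even dim; positions: (seq,) token indices.
    Rotates each (even, odd) feature pair by an angle proportional to position, so the
    query-key dot product depends on their relative offset."""
    dim = x.shape[-1]
    freqs = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)   # (dim/2,)
    angles = positions[:, None].float() * freqs[None, :]                    # (seq, dim/2)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

q = torch.randn(1, 16, 64)                       # (batch, seq, head_dim)
print(apply_rope(q, torch.arange(16)).shape)
```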
{"title":"RoGLSNet: An Efficient Global–Local Scene Awareness Network With Rotary Position Embedding for Remote Image Segmentation","authors":"Xiaosheng Yu;Weiqi Bai;Jubo Chen;Jiawei Huang;Zhuoqun Fang;Zhaokui Li","doi":"10.1109/LGRS.2025.3607840","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3607840","url":null,"abstract":"Accurate segmentation of very high-resolution remote sensing images is vital for downstream tasks. Most semantic segmentation methods fail to fully consider the inherent characteristics of the images, such as intricate backgrounds, significant intraclass variance, and spatial interdependence of geographic object distribution. To address these challenges, we propose an efficient global–local scene awareness network with rotary position embedding (RoGLSNet). Specifically, we introduce the dynamic global filter (DGF) module to adaptively select frequency components, thereby mitigating interference from background noise. For high intraclass variance, the class center aware block (CCAB) performs class-level contextual modeling with spatial information integration. Additionally, the rotary position embedding (RoPE) is incorporated into vanilla attention to indirectly model the positional and distance relationships of geographic target objects. Extensive experimental results on two widely used datasets demonstrate that RoGLSNet outperforms the state-of-the-art (SOTA) segmentation methods. The code is available at <uri>https://github.com/bai101315/RoGLSNet</uri>","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145073326","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Three-Dimensional Controlled-Source Electromagnetic Modeling Using Octree-Based Spectral Element Method
Pub Date : 2025-09-08 | DOI: 10.1109/LGRS.2025.3606934
Jintong Xu;Xiao Xiao;Jingtian Tang
The controlled-source electromagnetic (CSEM) method is an important geophysical tool for sensing and studying subsurface conductivity structures, and advanced forward modeling techniques are crucial for the inversion and imaging of CSEM data. In this letter, we develop an accurate and efficient 3-D forward modeling algorithm for CSEM problems that combines the spectral element method (SEM) with octree meshes. The SEM, built on high-order basis functions, provides accurate CSEM responses, while the octree meshes enable local refinement, so models can be discretized with fewer elements than the structured hexahedral meshes used in conventional SEM while still accommodating complex structures. Two synthetic examples verify the accuracy and efficiency of the algorithm, and its utility is further demonstrated on a realistic model with complex geometry.
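As a small illustration of the high-order basis underlying the SEM, the sketch below computes 1-D Gauss-Lobatto-Legendre (GLL) collocation nodes; the polynomial order is arbitrary, and the actual 3-D octree assembly in the letter is of course far more involved.

```python
# Sketch: 1-D Gauss-Lobatto-Legendre (GLL) nodes, the collocation points used by
# high-order spectral-element bases. Order 4 is an arbitrary illustrative choice.
import numpy as np
from numpy.polynomial import legendre

def gll_nodes(order: int) -> np.ndarray:
    """GLL nodes on [-1, 1] for polynomial order `order`: the endpoints plus the
    roots of the derivative of the Legendre polynomial of that degree."""
    interior = legendre.Legendre.basis(order).deriv().roots()
    return np.concatenate(([-1.0], np.sort(interior.real), [1.0]))

print(gll_nodes(4))   # 5 nodes for a 4th-order element, clustered toward the ends
```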
{"title":"Three-Dimensional Controlled-Source Electromagnetic Modeling Using Octree-Based Spectral Element Method","authors":"Jintong Xu;Xiao Xiao;Jingtian Tang","doi":"10.1109/LGRS.2025.3606934","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3606934","url":null,"abstract":"The controlled-source electromagnetic (CSEM) method is an important geophysical tool for sensing and studying subsurface conductivity structures. Advanced forward modeling techniques are crucial for the inversion and imaging of CSEM data. In this letter, we develop an accurate and efficient 3-D forward modeling algorithm for CSEM problems, combining spectral element method (SEM) and octree meshes. The SEM based on high-order basis functions can provide accurate CSEM responses, and the octree meshes enable local refinement, allowing for the discretization of models with fewer elements compared to the structured hexahedral meshes used in conventional SEM, while also providing the capability to handle complex models. Two synthetic examples are presented to verify the accuracy and efficiency of the algorithm. The utility of the algorithm is verified by a realistic model with complex geometry.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145078642","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fluid Mobility Attribute Extraction Based on Optimized Second-Order Synchroextracting Wavelet Transform
Pub Date : 2025-09-08 | DOI: 10.1109/LGRS.2025.3607097
Yu Wang;Xiao Pan;Kang Shao;Ning Wang;Yuqiang Zhang;Xinyu Zhang;Chaoyang Lei;Xiaotao Wen
The resolution of time–frequency-based seismic attributes depends mainly on the underlying time–frequency analysis tool. This study proposes an improved second-order synchroextracting wavelet transform (SSEWT) that optimizes the scale parameters and the extraction scheme. Time–frequency computation on synthetic data shows a 5% improvement in efficiency. We then apply the proposed transform to fluid mobility calculation on field data, yielding a 5.6% increase in computational efficiency and an 11.26% improvement in resolution. Field data tests confirm that the proposed transform and the resulting fluid mobility attribute outperform conventional methods. Despite remaining computational challenges, the method offers significant advances in reservoir characterization and fluid detection.
{"title":"Fluid Mobility Attribute Extraction Based on Optimized Second-Order Synchroextracting Wavelet Transform","authors":"Yu Wang;Xiao Pan;Kang Shao;Ning Wang;Yuqiang Zhang;Xinyu Zhang;Chaoyang Lei;Xiaotao Wen","doi":"10.1109/LGRS.2025.3607097","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3607097","url":null,"abstract":"Resolution of time–frequency-based seismic attributes mainly relies on the time–frequency analysis tool. This study proposes an improved second-order synchroextracting wavelet transform (SSEWT) by optimizing the scale parameters and extraction scheme. Time–frequency computation on synthetic data shows a 5% improvement in efficiency. Then, we apply the proposed transform to fluid mobility calculation on field data, yielding a 5.6% increase in computational efficiency and an 11.26% improvement in resolution, demonstrating its superior performance. Field data tests demonstrate that the proposed transform and the related fluid mobility result outperform conventional methods. Despite remaining computational challenges, the method offers significant advancements in reservoir characterization and fluid detection.","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145073140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AFIMNet: An Adaptive Feature Interaction Network for Remote Sensing Scene Classification
Pub Date : 2025-09-08 | DOI: 10.1109/LGRS.2025.3607205
Xiao Wang;Yisha Sun;Pan He
Convolutional neural network (CNN)-based methods have been widely applied to remote sensing scene classification (RSSC) and have achieved remarkable results. However, traditional CNN methods are limited in extracting global features and capturing image semantics, especially in complex remote sensing (RS) scenes. The transformer can capture global features directly through its self-attention mechanism, but it is weaker at handling local details, and existing methods that directly combine CNN and transformer features suffer from feature imbalance and introduce redundant information. To address these issues, we propose AFIMNet, an adaptive feature interaction network for RSSC. First, we use a dual-branch network structure (based on ResNet34 and Swin-S) to extract local and global features from RS scene images. Second, we design an adaptive feature interaction module (AFIM) that effectively enhances the interaction and correlation between local and global features. Third, we use a spatial-channel fusion module (SCFM) to aggregate the interacted features, further strengthening feature representation. The proposed method is validated on three public RS datasets, and experimental results show that AFIMNet offers stronger feature representation than current popular RS image classification methods, significantly improving classification accuracy. The source code will be publicly accessible at https://github.com/xavi276310/AFIMNet
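The AFIM module itself is not detailed in the abstract; the sketch below shows one generic way two branches can interact adaptively, via a learned per-channel gate, purely as an assumed illustration of the idea rather than the paper's design.

```python
# Hypothetical sketch of gated fusion between a CNN (local) branch and a Swin (global)
# branch: a learned per-channel gate decides how much of each branch to keep.
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                 # global context per channel
            nn.Conv2d(2 * channels, channels, 1),
            nn.Sigmoid(),
        )

    def forward(self, local_feat, global_feat):
        # local_feat, global_feat: (batch, channels, H, W) from the two branches
        g = self.gate(torch.cat([local_feat, global_feat], dim=1))
        return g * local_feat + (1 - g) * global_feat

local_feat = torch.randn(2, 256, 14, 14)     # e.g., ResNet stage output
global_feat = torch.randn(2, 256, 14, 14)    # e.g., Swin stage output, resized to match
print(GatedFusion(256)(local_feat, global_feat).shape)
```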
{"title":"AFIMNet: An Adaptive Feature Interaction Network for Remote Sensing Scene Classification","authors":"Xiao Wang;Yisha Sun;Pan He","doi":"10.1109/LGRS.2025.3607205","DOIUrl":"https://doi.org/10.1109/LGRS.2025.3607205","url":null,"abstract":"Convolutional neural network (CNN)-based methods have been widely applied in remote sensing scene classification (RSSC) and have achieved remarkable classification results. However, traditional CNN methods have certain limitations in extracting global features and capturing image semantics, especially in complex remote sensing (RS) image scenes. The Transformer can directly capture global features through the self-attention mechanism, but its performance is weaker when handling local details. Currently, methods that directly combine CNN and transformer features lead to feature imbalance and introduce redundant information. To address these issues, we propose AFIMNet, an adaptive feature interaction network for RSSC. First, we use a dual-branch network structure (based on ResNet34 and Swin-S) to extract local and global features from RS scene images. Second, we design an adaptive feature interaction module (AFIM) that effectively enhances the interaction and correlation between local and global features. Third, we use a spatial-channel fusion module (SCFM) to aggregate the interacted features, further strengthening feature representation capabilities. Our proposed method is validated on three public RS datasets, and experimental results show that AFIMNet has a stronger feature representation ability compared to current popular RS image classification methods, significantly improving classification accuracy. The source code will be publicly accessible at <uri>https://github.com/xavi276310/AFIMNet</uri>","PeriodicalId":91017,"journal":{"name":"IEEE geoscience and remote sensing letters : a publication of the IEEE Geoscience and Remote Sensing Society","volume":"22 ","pages":"1-5"},"PeriodicalIF":4.4,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145061905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}