Stochastic Frequency-Dependent Velocity and Attenuation Inversion for Hydrocarbon Detection
Pub Date: 2025-11-25 DOI: 10.1109/LGRS.2025.3637108
Fang Ouyang;Jianguo Zhao;Xinze Liu;Bin Wang;Yu Zhang;Bohong Yan
Rock-physics theories and experiments have demonstrated that seismic wave velocity dispersion and attenuation are closely related to hydrocarbon deposits. To obtain the velocity at different seismic frequencies, the frequency-dependent amplitude variation with angle (AVA) inversion method has been developed to invert the dispersive velocity from frequency-domain P-wave reflection coefficients. Such a method overcomes the inability of conventional AVA inversion to account for seismic dispersion. However, it considers only velocity dispersion and neglects the effects of seismic attenuation. In this letter, we propose a new frequency-dependent AVA method that simultaneously inverts the reservoir thickness and the complex dispersive P-wave velocity, which carries both dispersion and attenuation information. To better capture the characteristics of the reflections and transmissions between layers, the reflectivity method is adopted as the forward modeling engine. Furthermore, a modified simulated annealing method is developed that combines the parameter-by-parameter optimization idea of the heat-bath algorithm with the acceptance criterion of the Metropolis algorithm, achieving efficient global optimization for this highly nonlinear and ill-posed inversion problem. Compared with previous frequency-dependent AVA methods, our improved approach predicts not only the P-wave velocity dispersion but also the frequency-dependent inverse quality factor of the reservoir layer. Tests on synthetic records and field data crossing a drilled well verify the effectiveness and applicability of the proposed method for hydrocarbon indication.
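The hybrid optimizer is the letter's key algorithmic ingredient. As a rough illustration only, the Python sketch below combines heat-bath-style parameter-by-parameter sweeps with the Metropolis acceptance criterion; the misfit function, parameter bounds, proposal scale, and geometric cooling schedule are placeholder assumptions, not the authors' actual settings.

```python
import numpy as np

def hybrid_anneal(misfit, bounds, n_iter=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing with heat-bath-style sweeps (one parameter at a
    time) and Metropolis acceptance. `bounds` is an (n_params, 2) array."""
    rng = np.random.default_rng(seed)
    bounds = np.asarray(bounds, dtype=float)
    x = rng.uniform(bounds[:, 0], bounds[:, 1])   # random start inside bounds
    e = misfit(x)
    best_x, best_e, temp = x.copy(), e, t0
    for _ in range(n_iter):
        for i in range(len(x)):                   # parameter-by-parameter sweep
            cand = x.copy()
            # Proposal scale shrinks with temperature; clip to stay in bounds.
            step = (bounds[i, 1] - bounds[i, 0]) * temp * rng.standard_normal()
            cand[i] = np.clip(cand[i] + step, bounds[i, 0], bounds[i, 1])
            e_cand = misfit(cand)
            # Metropolis acceptance criterion.
            if e_cand < e or rng.random() < np.exp(-(e_cand - e) / temp):
                x, e = cand, e_cand
                if e < best_e:
                    best_x, best_e = x.copy(), e
        temp *= cooling                           # geometric cooling schedule
    return best_x, best_e
```

For example, `hybrid_anneal(lambda m: float(np.sum((m - 0.3) ** 2)), [[0, 1]] * 4)` recovers a toy four-parameter model to within the proposal resolution.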
EFD-YOLO: An Improved YOLOv8 Network for River Floating Debris Object Detection
Pub Date: 2025-11-24 DOI: 10.1109/LGRS.2025.3636279
Yier Yan;Zhibin Liang;Changhong Liu;Tao Zou
With the rapid development of uncrewed aerial vehicle (UAV) technology, UAVs have provided an innovative solution for floating debris monitoring. However, object detection in UAV images remains challenging due to high miss rates for small objects, insufficient low-level feature extraction, and computational redundancy. This letter proposes an efficient floating debris detection model based on YOLOv8n, named EFD-YOLO (efficient floating debris-you only look once), to address these issues. First, the edge fusion stem (EFStem) module is proposed to enhance low-level feature extraction through an integrated gate-attention mechanism. Second, the multibranch efficient reparameterization block (MBERB) is designed to achieve efficient cross-layer feature fusion. Experimental results demonstrate that compared to YOLOv8n, our model achieves a 6.3% improvement in mean average precision (mAP) on the UAV floating debris dataset, while simultaneously reducing parameters by 26.7% and improving small object recall by 21.9%. The inference time of EFD-YOLO on the RK3588 edge device is as low as 30.5 ms, demonstrating real-time capability.
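The abstract does not detail EFStem's internals, so the PyTorch sketch below is only one plausible reading of a gate-attention edge fusion stem: a fixed Sobel branch supplies edge evidence, and a learned sigmoid gate decides, per pixel and channel, how much of it to mix into the stem features. All names and shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GatedEdgeStem(nn.Module):
    """Hypothetical gate-attention stem in the spirit of EFStem."""

    def __init__(self, in_ch=3, out_ch=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(out_ch), nn.SiLU())
        # Fixed Sobel kernels (x and y), applied depthwise per input channel.
        sobel = torch.tensor([[[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]]])
        kernel = torch.cat([sobel, sobel.transpose(1, 2)], dim=0)   # (2,3,3)
        kernel = kernel.unsqueeze(1).repeat(in_ch, 1, 1, 1)         # (2*in_ch,1,3,3)
        self.edge = nn.Conv2d(in_ch, 2 * in_ch, 3, stride=2, padding=1,
                              groups=in_ch, bias=False)
        self.edge.weight.data.copy_(kernel)
        self.edge.weight.requires_grad_(False)
        self.proj = nn.Conv2d(2 * in_ch, out_ch, 1, bias=False)
        self.gate = nn.Sequential(nn.Conv2d(out_ch, out_ch, 1), nn.Sigmoid())

    def forward(self, x):
        f = self.conv(x)             # learned stem features
        e = self.proj(self.edge(x))  # edge evidence at the same resolution
        return f + self.gate(f) * e  # gated fusion of edge information
```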
A Self-Prompt Calibration Network Based on Segment Anything Model 2 for High-Resolution Remote Sensing Image Segmentation
Pub Date: 2025-11-24 DOI: 10.1109/LGRS.2025.3636177
Yizhou Lan;Daoyuan Zheng;Xinge Zhao;Ke Shang;Feizhou Zhang
Remote sensing image segmentation is particularly difficult due to the coexistence of large-scale variations and fine-grained structures in very high-resolution imagery. Conventional CNN-based or transformer-based networks often struggle to capture global context while preserving boundary details, leading to degraded performance on small or thin objects. To address these challenges, we propose a self-prompt calibration network based on segment anything model 2 (SC-SAM). The SC-SAM achieves self-prompting by feeding mask prompts from a lightweight decoder into the frozen prompt encoder. Output calibration is achieved through the proposed cross-probability guided calibration (CPGC) module, which employs cross-probability uncertainty as complementary guidance to refine final predictions via self-prompted outputs. Furthermore, to better preserve contextual and structural information across multiple scales, a scale-decoupled kernel mixture (SDKM) module is designed. Experimental results on the ISPRS Vaihingen and Potsdam datasets demonstrate that the proposed approach surpasses state-of-the-art methods by 1.02% and 1.34% in mIoU, respectively, highlighting its effectiveness. This study provides new insights into adapting SAM for domain-specific remote sensing segmentation tasks.
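To make the calibration idea concrete, here is a minimal sketch of uncertainty-guided fusion of the two branch outputs. The entropy-based weighting stands in for the letter's cross-probability uncertainty, and `p_main`/`p_prompt` are hypothetical names for the softmax maps of the lightweight decoder and the self-prompted SAM decoder.

```python
import math
import torch

def cross_probability_calibration(p_main, p_prompt, eps=1e-6):
    """Fuse two (B, C, H, W) probability maps so that each branch's weight
    grows where the *other* branch is more uncertain (higher entropy)."""
    def entropy(p):
        h = -(p.clamp_min(eps) * p.clamp_min(eps).log()).sum(1, keepdim=True)
        return h / math.log(p.shape[1])          # normalized to [0, 1]

    u_main, u_prompt = entropy(p_main), entropy(p_prompt)
    w_main = u_prompt / (u_main + u_prompt + eps)
    return w_main * p_main + (1.0 - w_main) * p_prompt
```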
User-Driven Land Cover Change Prediction Map Tool for Land Conservation Planning
Pub Date: 2025-11-24 DOI: 10.1109/LGRS.2025.3636286
Pui-Yu Ling;Laura Nunes;Jonathan Srinivasan;Nasir Popalzay;Palmer Wilson;Jameson Quisenberry;Alex Borowicz
To be effective, ecosystem and habitat conservation must not only look at past losses but also understand the effects of current and future decisions on landscapes. Here, we present a transformative, user-driven land cover change prediction tool designed to aid land planners in strategic decision-making for conservation and habitat protection. Within an integrated map-based prediction pipeline, the tool uses machine learning (ML) and deep learning (DL) models to classify satellite images and make predictions of near-term land cover changes. The tool facilitates user interaction with a cloud-hosted ML model, making it accessible to nontechnical users for generating map-based predictions using big data. The tool’s key strength lies in its dynamic variable adjustment feature, empowering users to tailor scenarios related to potential future development planning. Through the integration of cloud-hosted ML and DL models with a user-centric interface, the tool has the potential to allow stakeholders and land planners to make informed decisions, actively minimizing habitat destruction and aligning with broader conservation objectives. We tested our approach in central Texas, USA, to evaluate its effectiveness in diverse conservation scenarios, with an average overall accuracy of 88% for the land cover class maps over four years and over 72% for the five-year land cover change prediction. While our approach has the potential to improve land management and planning for conservation, we also acknowledge the importance of rigorous model validation and ongoing refinement and highlight the need for technological advancement to be developed with strong stakeholder engagement.
Federated Aerial Video Captioning With Effective Temporal Adaptation
Pub Date: 2025-11-24 DOI: 10.1109/LGRS.2025.3636165
Nguyen Anh Tu;Nursultan Makhanov;Kenzhebek Taniyev;Ton Duc Do
Aerial video captioning (VC) facilitates the automatic interpretation of dynamic scenes in remote sensing (RS), supporting critical applications such as disaster response, traffic monitoring, and environmental surveillance. However, challenges such as extreme angles and continuous camera motion require adaptive modeling of complex temporal relationships. To tackle these challenges, we leverage an image-language model as the vision encoder and introduce a temporal adaptation module that combines convolution with self-attention layers to both capture local semantics across neighboring frames and model global temporal dependencies. This design allows our model to exploit the multimodal knowledge of the vision encoder while effectively reasoning over the spatiotemporal dynamics. In addition, privacy concerns often restrict access to annotated aerial datasets, posing further challenges for model training. To address this, we develop a federated learning (FL) framework that enables collaborative model training across decentralized clients. Within this framework, we establish a unified benchmark for systematic comparison of temporal adapters, text decoders, and FL strategies, hence filling a gap in the existing literature. Extensive experiments validate the robustness of our approach and its potential for advancing aerial VC.
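Among the FL strategies the letter benchmarks, federated averaging (FedAvg) is the canonical baseline; a minimal sketch follows. The assumption that the captioning model's forward pass returns the training loss, along with all hyperparameters, is illustrative only.

```python
import copy
import torch

def fedavg(global_model, client_loaders, rounds=10, local_steps=50, lr=1e-4):
    """Each round: clients fine-tune a copy of the global model on private
    aerial video data; the server averages the resulting weights."""
    for _ in range(rounds):
        states = []
        for loader in client_loaders:
            local = copy.deepcopy(global_model)
            opt = torch.optim.AdamW(local.parameters(), lr=lr)
            for _, (videos, captions) in zip(range(local_steps), loader):
                loss = local(videos, captions)   # assumed to return the loss
                opt.zero_grad()
                loss.backward()
                opt.step()
            states.append(local.state_dict())
        # Server aggregation: element-wise average with equal client weights.
        avg = {k: torch.stack([s[k].float() for s in states]).mean(0)
               for k in states[0]}
        global_model.load_state_dict(avg)
    return global_model
```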
An Integrated Framework for Estimating the All-Sky Surface Downward Longwave Radiation From FY-3D/MERSI-II Imagery
Pub Date: 2025-11-24 DOI: 10.1109/LGRS.2025.3636236
Qi Zeng;Wanchun Zhang;Jie Cheng
This study develops an integrated framework for all-sky surface longwave downward radiation (SLDR) estimation for the medium resolution spectral imager-II (MERSI-II) onboard the Fengyun-3D (FY-3D) satellite. The framework comprises a hybrid method for the clear-sky SLDR estimate and a cloud base temperature (CBT)-based single-layer cloud model (SLCM) for the cloudy-sky SLDR estimate. In situ validation indicates that the hybrid method yields a bias/RMSE of −0.78/21.70 W/m², whereas the SLCM achieves a bias/RMSE of 5.79/23.61 W/m². The bias/RMSE of the all-sky SLDR is 3.37/22.93 W/m². The estimated all-sky instantaneous SLDR was combined with ERA5 temporal information to derive daily SLDR using a bias-corrected sinusoidal integration method, yielding a bias of 0.04 W/m² and an RMSE of 16.77 W/m². These results demonstrate the robustness of the proposed framework and its substantial potential in generating both instantaneous and daily SLDR products at 1 km spatial resolution.
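The temporal-upscaling step lends itself to a short worked sketch. Below, a one-harmonic sinusoid is least-squares fit to the ERA5 diurnal cycle and then shifted so it passes through the satellite instantaneous value at overpass time; the 24-h mean of the shifted curve is the daily SLDR. The additive shift is an assumed stand-in for the letter's bias correction, whose exact form is not given in the abstract.

```python
import numpy as np

def daily_sldr_sinusoidal(r_inst, t_obs, r_era5_hourly):
    """r_inst: satellite SLDR at overpass (W/m^2); t_obs: overpass hour;
    r_era5_hourly: 24 ERA5 SLDR values defining the diurnal shape."""
    t = np.arange(24.0)
    w = 2.0 * np.pi / 24.0
    # Fit r(t) = a0 + a1*cos(w*t) + a2*sin(w*t) by least squares.
    A = np.column_stack([np.ones_like(t), np.cos(w * t), np.sin(w * t)])
    a0, a1, a2 = np.linalg.lstsq(A, np.asarray(r_era5_hourly, float), rcond=None)[0]
    model_at_obs = a0 + a1 * np.cos(w * t_obs) + a2 * np.sin(w * t_obs)
    # Shift the curve through the satellite estimate ("bias correction");
    # the harmonics average to zero, so the daily mean is a0 plus the shift.
    return a0 + (r_inst - model_at_obs)
```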
Improving New Zealand’s Vegetation Mapping Using Weakly Supervised Learning
Pub Date: 2025-11-20 DOI: 10.1109/LGRS.2025.3635413
Brent Martin;Norman W. H. Mason;James D. Shepherd;Jan Schindler
The New Zealand Land Use Carbon Analysis System Land Use Map (LUCAS LUM) is a series of land use layers that map land use classes, including both exotic and native forest, dating back to 1990 and updated every four years since 2008. This map is a rich resource, but the significant effort required to update it means errors may creep in without detection. We trialed whether a deep learning model could be trained on this imperfect data. We found the model predicts exotic forestry nationally to a higher level of accuracy than previously achieved. The resulting layer was used to detect and correct missed exotic forest plantations in the current LUCAS LUM. We also demonstrate that the exotic forestry prediction is sufficiently sensitive to detect wilding conifer infestations and estimate infestation density. Our results highlight the effectiveness of weakly supervised learning, enabling accurate and scalable national land use and land cover mapping while drastically reducing manual labeling efforts.
Forest Tree Species Classification Based on Deep Ensemble Learning by Fusing High-Resolution, Multitemporal, and Hyperspectral Multisource Remote Sensing Data
Pub Date: 2025-11-19 DOI: 10.1109/LGRS.2025.3634553
Dengli Yu;Lilin Tu;Ziqing Wei;Fuyao Zhu;Chengjun Yu;Denghong Wang;Jiayi Li;Xin Huang
Forest tree species classification is of great significance for the sustainable development of forest resources. Multisource remote sensing data provide abundant temporal, spatial, and spectral information for tree species classification. However, methods that comprehensively capture and fuse spatio–temporal–spectral information are still lacking. Therefore, a tree species classification method based on deep ensemble learning of multisource spatio–temporal–spectral remote sensing data is proposed. First, multitemporal, high-resolution, and hyperspectral data are utilized to train temporal, spatial, and spectral deep networks. Furthermore, deep ensemble learning is developed for the fusion of the spatio–temporal–spectral network outputs, where weighted fusion is implemented via dynamic weight optimization based on the spatio–temporal–spectral features. Experimental results indicate that temporal features are more important than spatial information, and spectral networks perform best among all network structures. After spatio–temporal–spectral ensemble learning, the performance of tree species classification is further improved, and the overall accuracy (OA) of the proposed method reaches above 90%. The proposed algorithm realizes precise and fine-scale tree species classification and provides technical support for the monitoring and conservation of forest resources.
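As a minimal illustration of the fusion stage, the sketch below combines the temporal, spatial, and spectral networks' class probabilities with softmax-normalized weights. Deriving the weights from per-network validation scores is a simplification; the letter optimizes them dynamically from the spatio–temporal–spectral features.

```python
import numpy as np

def weighted_ensemble(probs, scores):
    """probs: (n_nets, n_samples, n_classes) softmax outputs of the three
    networks; scores: per-network quality measures used to set weights."""
    probs = np.asarray(probs, dtype=float)
    w = np.exp(np.asarray(scores, dtype=float))
    w /= w.sum()                               # softmax-normalized weights
    fused = np.tensordot(w, probs, axes=1)     # weighted sum over networks
    return fused.argmax(axis=-1), fused        # labels and fused probabilities
```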
YOLO-MFG: Multiscale and Feature-Preserving YOLO With Gated Attention for Remote Sensing Object Detection
Pub Date: 2025-11-19 DOI: 10.1109/LGRS.2025.3634593
HengYu Li;Bo Huang;JianYong Lv
Driven by the increasing demand for intelligent Earth observation and large-scale scene understanding, remote sensing object detection has gained significant academic and practical importance. Despite notable progress in feature extraction and computational efficiency, many recent approaches still struggle to effectively handle issues such as detecting objects at multiple scales and preserving small targets. In this letter, an efficient remote sensing object detector called multiscale and feature-preserving YOLO with gated attention (YOLO-MFG) is proposed to address these challenges. First, a multiscale group shuffle attention (MGSA) module is introduced to adaptively aggregate multiscale spatial features, improving the model’s sensitivity to objects of diverse sizes. Second, the use of feature-preserving downsampling (FPD) enhances the downsampling process by introducing a triple-branch fusion mechanism that mitigates aliasing while jointly preserving semantics, saliency, and geometry. Finally, gated enhanced attention (GEA) is integrated to capture long-range dependencies and contextual cues crucial for remote sensing scenarios. The experimental results demonstrate that the proposed YOLO-MFG achieves a 2.9% improvement in mean average precision at an intersection over union (IoU) threshold of 0.5 (mAP50) on the optical remote sensing dataset SIMD compared with YOLO11. In addition, the mAP50 of detection results is improved by 1.4% and 4.2% on the DIOR and NWPU VHR-10 datasets, respectively.
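The abstract names but does not specify the MGSA design; the PyTorch sketch below illustrates the generic pattern its name suggests: split channels into groups, apply depthwise convolutions at several kernel sizes, shuffle channels across groups, and reweight with a channel-attention gate. Every detail here is an illustrative assumption.

```python
import torch
import torch.nn as nn

def channel_shuffle(x, groups):
    """Interleave channels across groups so grouped branches exchange info."""
    b, c, h, w = x.shape
    return x.view(b, groups, c // groups, h, w).transpose(1, 2).reshape(b, c, h, w)

class MultiScaleGroupShuffleAttention(nn.Module):
    """Hypothetical multiscale group shuffle attention block."""

    def __init__(self, ch, scales=(3, 5, 7)):
        super().__init__()
        assert ch % len(scales) == 0, "channels must split evenly over scales"
        self.split, self.groups = ch // len(scales), len(scales)
        self.branches = nn.ModuleList(
            nn.Conv2d(self.split, self.split, k, padding=k // 2,
                      groups=self.split, bias=False) for k in scales)
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(ch, ch, 1), nn.Sigmoid())

    def forward(self, x):
        parts = torch.split(x, self.split, dim=1)
        y = torch.cat([b(p) for b, p in zip(self.branches, parts)], dim=1)
        y = channel_shuffle(y, self.groups)   # mix information across scales
        return x + y * self.gate(y)           # residual, attention-weighted
```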
The Sparse Adaptive Generalized S Transform
Pub Date: 2025-11-19 DOI: 10.1109/LGRS.2025.3634759
Shengyi Wang;Xuehua Chen;Cong Wang;Junjie Liu;Xin Luo
High-resolution time–frequency analysis is crucial for seismic interpretation. Conventional sparse time–frequency transforms, such as the sparse generalized S transform (SGST), are not adaptive to the intrinsic characteristics of the signal. To address this limitation, we propose a sparse adaptive generalized S transform (SAGST). This method incorporates the signal amplitude spectrum into the Gaussian window function, allowing the window to adapt dynamically to the signal characteristics. This adaptive mechanism enables the construction of wavelet bases that are better matched to the signal. We apply the SAGST to the time–frequency analysis of both a synthetic signal and field seismic data. The synthetic signal test shows that the SAGST achieves higher energy concentration, superior computational efficiency, and enhanced weak signal extraction compared with the sparse adaptive S transform (SAST) and SGST. A field example demonstrates that the SAGST can be used to indicate low-frequency shadow associated with hydrocarbon reservoirs.
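The adaptive window can be sketched compactly. In the frequency-domain implementation of the generalized S transform, each output row is the inverse FFT of the shifted spectrum multiplied by the Gaussian window exp(-2*pi^2*alpha^2*lambda^2/|f|^(2p)); the code below additionally modulates the window width by the normalized amplitude spectrum. The abstract does not specify how the amplitude spectrum enters the SAGST window, so the `(1 + amp)` modulation is a placeholder assumption.

```python
import numpy as np

def generalized_s_transform(x, fs, lam=1.0, p=1.0, adaptive=True):
    """Frequency-domain (adaptive) generalized S transform of a real signal
    x sampled at fs Hz. Returns an (n//2, n) complex time-frequency map."""
    n = len(x)
    X = np.fft.fft(x)
    freqs = np.fft.fftfreq(n, d=1.0 / fs)
    alpha = freqs                                 # frequency-shift variable
    amp = np.abs(X) / (np.abs(X).max() + 1e-12)   # normalized amplitude spectrum
    out = np.zeros((n // 2, n), dtype=complex)
    for k in range(1, n // 2):                    # skip f = 0 (window undefined)
        width = lam / abs(freqs[k]) ** p          # time-domain window std
        if adaptive:
            width *= 1.0 + amp[k]                 # assumption: widen the time
                                                  # window at strong components
        window = np.exp(-2.0 * (np.pi * alpha * width) ** 2)
        out[k] = np.fft.ifft(np.roll(X, -k) * window)
    return out
```

With `adaptive=False` and `lam = p = 1`, this reduces to the standard S transform, which makes the role of the amplitude-spectrum modulation easy to isolate in tests.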