Ecological restoration increasingly requires decision-support frameworks that can integrate diverse remote sensing indicators while maintaining interpretability for practical planning. Existing approaches often emphasize ecosystem assessment but provide limited guidance for selecting specific restoration projects, due to indicator redundancy and the opacity of closed-box deep learning models. To bridge this gap, this study proposes a weakly interpretable machine learning framework that combines principal component analysis (PCA) with k-nearest neighbors (KNN), termed PCA-KNN, for ecological restoration engineering planning. PCA fuses multisource ecological indicators into principal components while retaining transparent loadings linked to underlying ecological processes, and KNN provides sample-based classification with traceable decision logic. This structure allows ecological indicators and engineering decisions to be explicitly connected, balancing predictive performance with practical interpretability. The framework is applied to the northern slope of the Qinling Mountains, China, an ecologically significant region facing soil erosion, vegetation degradation, land-use conflict, and wetland decline. Using 13 remote sensing-derived indicators and field-verified engineering labels, the method achieves 87.6% block-level accuracy and 66.8% pixel-level accuracy, effectively mapping four restoration engineering types and reducing the planning cycle by more than half. Results demonstrate that the proposed PCA-KNN framework translates remote sensing ecological indicators into actionable engineering decisions, offering an operational and scalable pathway for data-driven restoration planning. This work advances remote sensing-based ecological restoration from post-hoc evaluation toward transparent, engineering-oriented planning support.
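A minimal sketch of the PCA-then-KNN pipeline this abstract describes, on synthetic data; the indicator values, class count, component count, and k below are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA on an (n_samples, n_features) indicator matrix.
    Returns the mean, component loadings, and explained-variance ratios."""
    mu = X.mean(axis=0)
    Xc = X - mu
    # SVD of the centered data gives the principal axes (loadings)
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    var = S**2 / (len(X) - 1)
    ratio = var / var.sum()
    return mu, Vt[:n_components], ratio[:n_components]

def pca_transform(X, mu, components):
    return (X - mu) @ components.T

def knn_predict(X_train, y_train, X_query, k=3):
    """Majority vote among the k nearest training samples (Euclidean)."""
    preds = []
    for q in X_query:
        d = np.linalg.norm(X_train - q, axis=1)
        nearest = y_train[np.argsort(d)[:k]]
        vals, counts = np.unique(nearest, return_counts=True)
        preds.append(vals[np.argmax(counts)])
    return np.array(preds)

# Synthetic example: 13 indicators, two engineering classes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (40, 13)),
               rng.normal(3.0, 1.0, (40, 13))])
y = np.array([0] * 40 + [1] * 40)

mu, comps, ratio = pca_fit(X, n_components=3)
Z = pca_transform(X, mu, comps)
pred = knn_predict(Z, y, Z, k=5)
acc = (pred == y).mean()
```

The PCA loadings (`comps`) stay inspectable, which is the interpretability hook the abstract emphasizes: each component can be read off as a weighted mix of the original indicators.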
Yuwan Xue, Xiaoping Li, Hui Li, Qiangjun Yang, Yan Zhang, Taoli Yang, and Hanwen Yu, "Application of Weakly Interpretable PCA-KNN Framework in Ecological Restoration Engineering Planning," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 1277-1287, Nov. 28, 2025. DOI: 10.1109/JSTARS.2025.3638651.
Pub Date: 2025-11-28 | DOI: 10.1109/JSTARS.2025.3638412
Kuo-Liang Chung;Jui-Che Chang
Feature matching is a fundamental task in remote sensing and 3-D vision. In this article, a new feature matching algorithm is proposed under the random sample consensus (RANSAC) interaction model, in which the global RANSAC works on the initial correspondence set $\mathbf{C}$ and the local RANSAC works on the reliable local correspondence set, which is initially constructed by removing outliers from $\mathbf{C}$. To increase matching accuracy, after each RANSAC interaction round, the proposed self-adjusting strategy updates the local correspondence set adaptively by adding potential correspondences from $\mathbf{C}$ and removing unreliable local correspondences. By combining the global and local confidence-level conditions with two early termination conditions, namely the local early termination condition and the global maximal RANSAC interaction round constraint, the algorithm achieves a favorable compromise between matching accuracy and runtime across different inlier rates. Finally, we apply the weighted SVD-based method to estimate the global model solution. Based on 873 testing image pairs, comprehensive experimental results justify the matching accuracy and execution time merits of our algorithm relative to state-of-the-art methods.
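The RANSAC loop underlying this family of methods can be sketched on a toy problem; this is a generic single-model RANSAC for 2-D line fitting, not the paper's global/local interaction scheme, and the threshold, iteration count, and data are illustrative:

```python
import numpy as np

def ransac_line(points, n_iters=200, thresh=0.1, seed=0):
    """Basic RANSAC: repeatedly fit a line y = a*x + b to a random
    2-point sample and keep the model with the largest inlier set."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(points), dtype=bool)
    for _ in range(n_iters):
        i, j = rng.choice(len(points), size=2, replace=False)
        (x1, y1), (x2, y2) = points[i], points[j]
        if abs(x2 - x1) < 1e-12:
            continue  # degenerate sample, skip
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        resid = np.abs(points[:, 1] - (a * points[:, 0] + b))
        inliers = resid < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the final (reliable) inlier set by least squares
    x, y = points[best_inliers, 0], points[best_inliers, 1]
    A = np.stack([x, np.ones_like(x)], axis=1)
    a, b = np.linalg.lstsq(A, y, rcond=None)[0]
    return a, b, best_inliers

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0, 0.02, 100)
pts = np.stack([x, y], axis=1)
pts[:20, 1] += rng.uniform(5, 10, 20)  # inject 20% outliers
a, b, inl = ransac_line(pts)
```

The paper's contribution sits on top of this skeleton: maintaining a second, local correspondence set that is adjusted after every round, plus early termination tests on both loops.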
"Feature Matching via Self-Adjusting Reliable Correspondence Set and Early Termination," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 1165-1182, 2025. DOI: 10.1109/JSTARS.2025.3638412.
In the monitoring of discontinuous ground-based synthetic aperture radar (GB-SAR), challenges such as repositioning error and atmospheric phase screen (APS) can significantly impact the accuracy of deformation inversion. Existing compensation methods are limited to specific scanning modes (linear-scanning or arc-scanning) and lack a unified framework, leading to suboptimal performance in complex scenarios. We propose a novel joint compensation model applicable to both linear-scanning and arc-scanning GB-SAR. By formulating repositioning error as ternary functions of positional shifts and linearizing them through first-order approximation, the method establishes a unified phase error model. A high-order range error component is integrated to characterize APS effects. The combined model parameters are optimized by gradient descent. Experimental validation using near-field and far-field datasets demonstrates significant improvements: in linear-scanning mode, the residual phase RMSE is reduced by 40.4%, while in arc-scanning mode, it decreases by 6.8%. The proposed framework effectively compensates for both error sources, outperforming conventional approaches by unifying compensation across scanning geometries. This study enables high-precision deformation monitoring in diverse GB-SAR applications, advancing the reliability of geological hazard early warning and infrastructure assessment.
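Fitting a linearized phase-error model by gradient descent, as the abstract describes, can be illustrated on synthetic data. The basis terms below (planar repositioning terms plus a quadratic range term standing in for the APS component) are a simplified assumption for illustration, not the paper's exact model; coordinates are normalized so plain gradient descent stays stable:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
x = rng.uniform(-1, 1, n)   # normalized cross-range position
y = rng.uniform(-1, 1, n)   # normalized height position
r = rng.uniform(0, 1, n)    # normalized slant range
theta_true = np.array([0.8, -0.5, 0.3, 1.2])  # [a, b, c, d]

# Linearized error model: planar repositioning terms + a quadratic
# range term standing in for the atmospheric (APS) component.
F = np.stack([x, y, np.ones(n), r**2], axis=1)
phase = F @ theta_true + rng.normal(0, 0.01, n)

theta = np.zeros(4)
lr = 0.5
for _ in range(2000):
    grad = F.T @ (F @ theta - phase) / n  # gradient of mean squared residual
    theta -= lr * grad

rmse = np.sqrt(np.mean((F @ theta - phase) ** 2))
```

After convergence the recovered coefficients match the ground truth to within the noise level, and the residual RMSE drops to roughly the injected phase noise.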
Zechao Bai, Peng Cui, Yanping Wang, Hui Liu, Yun Lin, Yang Li, and Wenjie Shen, "A Joint Compensation Method for Repositioning Error and Atmospheric Phase Screen of Ground-Based SAR," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 1995-2005, Nov. 28, 2025. DOI: 10.1109/JSTARS.2025.3638786.
Pub Date: 2025-11-28 | DOI: 10.1109/JSTARS.2025.3638382
Tianyu Wei;He Chen;Wenchao Liu;Liang Chen;Panzhe Gu;Jue Wang
Building footprint extraction using optical and synthetic aperture radar (SAR) images enables all-weather capability and significantly boosts performance. In practical scenarios, optical data may not be available, leading to the missing-modality challenge. To overcome this challenge, advanced methods employ mainstream knowledge distillation approaches with hallucination network schemes to improve performance. However, under complex SAR backgrounds, current hallucination-network-based methods suffer from cross-modal information transfer failure between the optical and hallucination models. To solve this problem, this study introduces a cross-modal hallucination collaborative learning (CMH-CL) method, consisting of two components: modality-share information alignment learning (MSAL) and multimodal fusion information alignment learning (MFAL). The MSAL method facilitates cross-modal knowledge transfer between the optical and hallucination encoders, thereby enabling the hallucination model to effectively mimic the missing optical modality. The MFAL method aligns semantic information between the OPT-SAR and HAL-SAR fusion heads to strengthen their semantic consistency, thereby improving HAL-SAR fusion performance. By combining MSAL and MFAL, the CMH-CL method collaboratively alleviates the cross-modal transfer failure problem between the optical and hallucination models, thereby improving performance in missing-modality building footprint extraction. Extensive experimental results obtained on a public dataset demonstrate the effectiveness of the proposed CMH-CL. The source code is available at https://github.com/TINYWAI/CMH-CL.
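Feature-space alignment of the kind MSAL performs can be sketched with a linear toy model: a student projection is trained by gradient descent to minimize the MSE between hallucination-branch features and frozen optical "teacher" features. The linear projection and synthetic features are illustrative assumptions; the actual method operates on deep encoder features:

```python
import numpy as np

# Toy setup: optical "teacher" features are assumed reachable from
# SAR-derived features through a linear projection (illustration only).
rng = np.random.default_rng(3)
f_sar = rng.normal(size=(64, 8))        # hallucination-branch inputs
W_true = rng.normal(size=(8, 8))
f_opt = f_sar @ W_true + rng.normal(scale=0.01, size=(64, 8))

W = np.zeros((8, 8))                    # student projection to learn
lr = 1.0
for _ in range(500):
    diff = f_sar @ W - f_opt
    loss = np.mean(diff ** 2)           # alignment (distillation) loss
    W -= lr * (2 / diff.size) * f_sar.T @ diff
```

Driving this loss down is what lets the hallucination branch stand in for the missing optical encoder at inference time.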
"Optical and SAR Cross-Modal Hallucination Collaborative Learning for Remote Sensing Missing-Modality Building Footprint Extraction," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 1183-1196, 2025. DOI: 10.1109/JSTARS.2025.3638382.
Pub Date: 2025-11-27 | DOI: 10.1109/JSTARS.2025.3637818
Yilu Gong;Yifan Wang;Jun Yang
Research on the thermal environmental effects of urban green spaces has traditionally been constrained to the block scale due to the lack of high-resolution remote sensing data capable of accurately capturing small and fragmented green patches, thereby limiting the spatial precision of existing findings. This study employs multitemporal high-resolution remote sensing imagery and multiple fractal models—including grid dimension (GD), boundary dimension A (BD-A), boundary dimension B, and radius dimension—to investigate both linear and nonlinear relationships between green space fractal characteristics and surface temperature across temperature-zone scales. The results show that: 1) the linear and nonlinear effects of fractal features differ substantially, with GD identified as the dominant indicator and BD-A as the most variable; 2) fractal effects exhibit significant scale dependence, with the medium-temperature zone showing the strongest sensitivity to fractal structure; and 3) strong interactions exist among key indicators, particularly between BD-A and GD, where their combined influence in the medium-temperature zone displays complex nonlinear patterns, potentially driven by the mixture of natural and artificial green spaces. A major contribution of this study lies in the integration of extreme gradient boosting for predictive modeling and Shapley additive explanations for interpretative analysis, enabling the decomposition of nonlinear model outputs into explicit contributions of individual fractal indicators. This combination enhances the interpretability of machine learning predictions and clarifies the mechanisms through which fractal geometry influences surface thermal dynamics. Furthermore, the introduction of the radius-based fractal dimension at the temperature-zone scale provides a new geometric perspective on spatial diffusion and morphological continuity of green patches. 
By leveraging high-resolution remote sensing and explainable modeling, this research establishes a robust framework for quantifying the multiscale thermal effects of urban green spaces and offers scientifically grounded guidance for optimizing urban thermal environment regulation.
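Of the fractal indicators listed above, the grid (box-counting) dimension is the most direct to compute: count the boxes a patch occupies at several box sizes and fit the log-log slope. A minimal NumPy implementation on binary patch masks (the box sizes and test masks are illustrative):

```python
import numpy as np

def grid_dimension(mask, sizes=(1, 2, 4, 8, 16)):
    """Estimate the box-counting (grid) dimension of a binary patch mask:
    count occupied boxes N(s) at several box sizes s, then fit
    log N(s) = -D * log s + c and return D."""
    counts = []
    for s in sizes:
        h, w = mask.shape
        sub = mask[: h - h % s, : w - w % s]  # trim so the grid tiles exactly
        boxes = sub.reshape(sub.shape[0] // s, s, sub.shape[1] // s, s)
        counts.append(boxes.any(axis=(1, 3)).sum())
    slope, _ = np.polyfit(np.log(sizes), np.log(counts), 1)
    return -slope

# A filled square should have dimension close to 2
d_filled = grid_dimension(np.ones((64, 64), dtype=bool))

# A single straight line should have dimension close to 1
line = np.zeros((64, 64), dtype=bool)
line[32, :] = True
d_line = grid_dimension(line)
```

Real green-space patches land between these two extremes, and it is that intermediate value the study relates to surface temperature.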
"Impacts of Urban Green Space Fractal on Surface Thermal Environment at Temperature Zone Scale Based on High-Resolution Remote Sensing Images," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 1038-1053, 2025. DOI: 10.1109/JSTARS.2025.3637818.
Multitemporal differential interferometric synthetic aperture radar (MT-DInSAR) techniques have become essential tools for monitoring ground displacement with millimeter-scale precision. While widely applied to spaceborne SAR systems for global, long-term monitoring, and to ground-based SAR (GBSAR) for continuous tracking in localized areas, these platforms remain constrained by fixed acquisition geometries and line-of-sight (LOS) sensitivity. In addition, spaceborne systems face inherent limitations in capturing high-magnitude displacements due to their relatively long revisit times, often leading to temporal decorrelation in such scenarios. In this context, airborne SAR systems offer a compromise by enabling flexible acquisition geometries and on-demand revisit times, with drone platforms emerging as a cost-effective alternative to the high operational and logistical demands of conventional airborne systems. This article presents the SAR-Drone system, a Ku-band drone-based SAR, together with a coregistration and interferometric processor and two MT-DInSAR methodologies adapted to the SAR-Drone data. The first is displacement-based and designed for scenarios with moderate motion, where the interferometric phase remains coherent between consecutive acquisitions; the second is velocity-based, targeting scenarios with high-magnitude displacements where decorrelation occurs even when only a few hours separate acquisitions.
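For context, interferometric phase maps to line-of-sight displacement through the standard two-way DInSAR relation d = lam * dphi / (4 * pi), which is what makes millimeter precision possible at centimeter wavelengths. The Ku-band carrier frequency below is an assumed value (not the SAR-Drone system's published specification), and the sign convention varies between processors:

```python
import numpy as np

C = 299_792_458.0          # speed of light, m/s
f_ku = 17.2e9              # an assumed Ku-band carrier frequency, Hz
wavelength = C / f_ku      # roughly 1.7 cm

def phase_to_los_displacement(dphi, lam=wavelength):
    """Standard DInSAR conversion: one full 2*pi interferometric cycle
    corresponds to lam/2 of line-of-sight motion (two-way path)."""
    return dphi * lam / (4 * np.pi)

# One full fringe -> half a wavelength of LOS displacement
d = phase_to_los_displacement(2 * np.pi)
```

The short wavelength is a double-edged sword: fine sensitivity per fringe, but rapid decorrelation under large motion, which is exactly why the velocity-based methodology exists.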
Gerard Ruiz-Carregal, Gerard Masalias, Luis Yam, Eduard Makhoul, Rubén Iglesias, Marc Lort, Dani Monells, Azadeh Faridi, Giuseppe Centolanza, Antonio Heredia, Álex González, Nieves Pasqualotto, Marc Palmada, Diego Santamaría, Carlos López-Martínez, and Javier Duro, "Drone-Based MT-DInSAR for High-Magnitude 3-D Displacement Retrieval With Daily Revisits," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 850-874, Nov. 27, 2025. DOI: 10.1109/JSTARS.2025.3637980.
Object detection in remote-sensing imagery plays a crucial role in providing precise geospatial information for urban planning and environmental monitoring. However, real-world remote-sensing scenarios often involve complex conditions, such as varying illumination, weather interference, and low signal-to-noise ratios, which significantly degrade the performance of traditional single-modal detection methods. To overcome these limitations, multimodal object detection has emerged, demonstrating great potential by integrating complementary information from multiple modalities. Nevertheless, existing multimodal frameworks still face challenges, such as insufficient cross-modal interaction, limited learning of complementary features, and high computational costs due to redundant fusion in complex environments. To address these challenges, we propose an enhanced multimodal fusion strategy aimed at maximizing cross-modal feature learning capabilities. Our method employs a dual-backbone architecture to extract mode-specific representations independently, integrating a direction attention module at an early stage of each backbone to enhance discriminative feature extraction. We then introduce a dual-stream feature fusion network to effectively fuse cross-modal features, generating rich representations for the detection head. In addition, we embed a local-global channel attention mechanism in the head stage to strengthen feature learning in the channel dimension before generating the final prediction. Extensive experiments on the widely used vehicle detection in aerial imagery multimodal remote-sensing dataset demonstrate that our method achieves state-of-the-art performance, while evaluations on single-modal datasets confirm its exceptional generalization capability.
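Channel attention of the kind embedded in the head stage can be sketched in squeeze-and-excitation style; this generic form, with random weights on a toy feature map, is an illustration of the mechanism, not the paper's specific local-global module:

```python
import numpy as np

def channel_attention(feat, W1, W2):
    """Squeeze-and-excitation style channel attention on a (C, H, W)
    feature map: global average pool -> two dense layers -> sigmoid
    gate -> rescale each channel by its gate weight."""
    squeeze = feat.mean(axis=(1, 2))             # (C,) global channel context
    hidden = np.maximum(squeeze @ W1, 0.0)       # ReLU bottleneck
    gate = 1.0 / (1.0 + np.exp(-(hidden @ W2)))  # per-channel weight in (0, 1)
    return feat * gate[:, None, None], gate

rng = np.random.default_rng(4)
C, H, W = 8, 4, 4
feat = rng.normal(size=(C, H, W))
W1 = rng.normal(scale=0.5, size=(C, C // 2))     # bottleneck projection
W2 = rng.normal(scale=0.5, size=(C // 2, C))     # expansion back to C channels
out, gate = channel_attention(feat, W1, W2)
```

In a trained network the gate learns to amplify channels that carry cross-modal evidence and suppress redundant ones, which is the fusion-efficiency argument the abstract makes.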
Youxiang Huang, Zhuo Wang, Tiantian Tang, Tomoaki Ohtsuki, and Guan Gui, "Dual-Stream Multimodal Fusion With Local-Global Attention for Remote-Sensing Object Detection," IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 19, pp. 1691-1702, Nov. 27, 2025. DOI: 10.1109/JSTARS.2025.3637891.
Pub Date: 2025-11-26, DOI: 10.1109/JSTARS.2025.3637264
Jun Xie;Hua Huang;Lingfei Song
Due to the thermal imaging mechanism, infrared images are inevitably contaminated by thermal radiation bias fields. Correcting this low-frequency nonuniformity is challenging because the bias field is difficult to separate from the low-frequency scene content. Existing methods fit parametric surface models to estimate bias fields. However, variations in camera operating states and environmental conditions often lead to irregular bias fields that such surface models cannot accurately capture. This article proposes a novel model-free method for irregular bias field correction. The proposed method utilizes the normalized deconvolution module to restore the clean image in the probability density function (PDF) domain, in which the PDF of the bias field can be approximated by a series of specific Gaussian kernel functions. Accordingly, the complete bias field can be obtained gradually in the PDF domain by incremental updating. To transform the image PDF back to the intensity domain, the corresponding conditional expectation is computed in this article. Note that the proposed method does not require any explicit parametric modeling of the bias field. Therefore, the proposed method is able to correct irregular bias fields effectively. Extensive experiments demonstrate that the proposed model-free method outperforms existing methods on synthesized and real infrared images.
{"title":"A Model-Free Method for Irregular Bias Field Correction in Infrared Images","authors":"Jun Xie;Hua Huang;Lingfei Song","doi":"10.1109/JSTARS.2025.3637264","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3637264","url":null,"abstract":"Due to the thermal imaging mechanism, infrared images are inevitably contaminated by thermal radiation bias fields. Correcting this low-frequency nonuniformity is challenging because the bias field is difficult to separate from the low-frequency scene content. Existing methods fit parametric surface models to estimate bias fields. However, variations in camera operating states and environmental conditions often lead to irregular bias fields that such surface models cannot accurately capture. This article proposes a novel model-free method for irregular bias field correction. The proposed method utilizes the normalized deconvolution module to restore the clean image in the probability density function (PDF) domain, in which the PDF of the bias field can be approximated by a series of specific Gaussian kernel functions. Accordingly, the complete bias field can be obtained gradually in the PDF domain by incremental updating. To transform the image PDF back to the intensity domain, the corresponding conditional expectation is computed in this article. Note that the proposed method does not require any explicit parametric modeling of the bias field. Therefore, the proposed method is able to correct irregular bias fields effectively. 
Extensive experiments demonstrate that the proposed model-free method outperforms existing methods in synthesized and real infrared images.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1583-1597"},"PeriodicalIF":5.3,"publicationDate":"2025-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11269800","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
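Two building blocks of the PDF-domain idea in this abstract can be illustrated in a few lines: a PDF approximated as a sum of Gaussian kernels on an intensity grid, and the conditional expectation that maps a PDF back to the intensity domain. This is a hedged sketch of those two steps only (the grid, helper names, and mixture parameters are invented), not the paper's normalized deconvolution or incremental-updating scheme.

```python
import numpy as np

# Intensity grid on which PDFs are discretised (assumed range [0, 1]).
grid = np.linspace(0.0, 1.0, 256)
dx = grid[1] - grid[0]

def gaussian_kernel_pdf(centers, sigmas, weights):
    """Approximate a PDF as a weighted sum of Gaussian kernels on the
    intensity grid, normalised to unit mass."""
    pdf = np.zeros_like(grid)
    for c, s, w in zip(centers, sigmas, weights):
        pdf += w * np.exp(-0.5 * ((grid - c) / s) ** 2)
    return pdf / (pdf.sum() * dx)  # Riemann-sum normalisation

def conditional_expectation(pdf):
    """Collapse a PDF on the grid back to one intensity: E[x] = sum x p(x) dx."""
    return float((grid * pdf).sum() * dx)

# A toy bias-field PDF built from two Gaussian kernels.
bias_pdf = gaussian_kernel_pdf([0.3, 0.5], [0.05, 0.08], [0.6, 0.4])
mean_bias = conditional_expectation(bias_pdf)
print(mean_bias)
```

In the paper, such kernels approximate the bias-field PDF during incremental updating, and the conditional expectation is what transforms the restored image PDF back into intensity values.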
Combining optical remote sensing images for ship monitoring is a practical approach for maritime surveillance. However, existing methods lack sufficient detection accuracy and fail to consider computational resource constraints in ship detection processing. This article proposes a novel lightweight rotating ship target detection model. First, we enhance the detection accuracy by expanding the YOLOv8n-obb model with the Large Selective Kernel (LSK) attention mechanism, Weight-Fusion Multi-Branch Auxiliary FPN (WFMAFPN), and Dynamic Task-Aligned Detection Head (DTAH). Specifically, the LSK attention mechanism dynamically adjusts the receptive field, effectively capturing multiscale features. The WFMAFPN improves feature fusion capacity through multidirectional paths and adaptive weight assignment to individual feature maps. The DTAH further enhances detection performance by improving task interaction between classification and localization. Second, we reduce the computational resource consumption of our model. This is achieved by applying layer-adaptive magnitude pruning to the enhanced architecture and designing the DTAH module with shared parameters. Considering the above improvements, we name our model RYOLO-LWMD-Lite. Finally, we construct a large-scale dataset for rotating ships, named AShipClass9, with diverse ship categories to evaluate our model. Experimental results indicate that the RYOLO-LWMD-Lite model achieves higher detection accuracy while maintaining a lower parameter count. Specifically, the model’s parameter count is approximately 2/3 that of YOLOv8n-obb, and the test accuracy on AShipClass9 reaches 48.2% (in terms of AP$_{50}$), a 6% improvement over the baseline. In addition, experiments conducted on the DOTA1.5 dataset validate the generalization capability of the proposed model.
{"title":"RYOLO-LWMD-Lite: A Lightweight Rotating Ship Target Detection Model for Optical Remote Sensing Images","authors":"Zhaohui Li;Sheng Qi;Haohao Yang;Haolin Li;Hongyu Jia","doi":"10.1109/JSTARS.2025.3637224","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3637224","url":null,"abstract":"Combining optical remote sensing images for ship monitoring is a practical approach for maritime surveillance. However, existing methods lack sufficient detection accuracy and fail to consider computational resource constraints in ship detection processing. This article proposes a novel lightweight rotating ship target detection model. First, we enhance the detection accuracy by expanding the YOLOv8n-obb model with the Large Selective Kernel (LSK) attention mechanism, Weight-Fusion Multi-Branch Auxiliary FPN (WFMAFPN), and Dynamic Task-Aligned Detection Head (DTAH). Specifically, the LSK attention mechanism dynamically adjusts the receptive field, effectively capturing multiscale features. The WFMAFPN improves feature fusion capacity through multidirectional paths and adaptive weight assignment to individual feature maps. The DTAH further enhances detection performance by improving task interaction between classification and localization. Second, we reduce the computational resource consumption of our model. This is achieved by applying layer-adaptive magnitude pruning to the enhanced architecture and designing the DTAH module with shared parameters. Considering the above improvements, we name our model RYOLO-LWMD-Lite. Finally, we construct a large-scale dataset for rotating ships, named AShipClass9, with diverse ship categories to evaluate our model. Experimental results indicate that the RYOLO-LWMD-Lite model achieves higher detection accuracy while maintaining a lower parameter count. 
Specifically, the model’s parameter count is approximately 2/3 that of YOLOv8n-obb, and the test accuracy on AShipClass9 reaches 48.2% (in terms of AP$_{50}$), a 6% improvement over the baseline. In addition, experiments conducted on the DOTA1.5 dataset validate the generalization capability of the proposed model.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"1415-1424"},"PeriodicalIF":5.3,"publicationDate":"2025-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11269318","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145778389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
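The layer-adaptive magnitude pruning mentioned in this abstract can be sketched in generic form: each layer is assigned its own sparsity target, and the smallest-magnitude weights in that layer are zeroed. This is a minimal NumPy illustration of the general technique (the layer shapes and per-layer sparsities are invented), not the authors' specific pruning schedule or criterion for choosing sparsities.

```python
import numpy as np

def prune_layer(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries so that roughly `sparsity`
    fraction of this layer's weights become zero."""
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights.copy()
    # k-th smallest absolute value is the layer's pruning threshold
    thresh = np.sort(np.abs(weights), axis=None)[k - 1]
    mask = np.abs(weights) > thresh
    return weights * mask

def prune_model(layers, sparsities):
    """Layer-adaptive magnitude pruning sketch: each layer gets its own
    sparsity target instead of one global magnitude threshold."""
    return [prune_layer(w, s) for w, s in zip(layers, sparsities)]

rng = np.random.default_rng(0)
layers = [rng.standard_normal((64, 64)), rng.standard_normal((128, 64))]
pruned = prune_model(layers, sparsities=[0.5, 0.8])
print([float((p == 0).mean()) for p in pruned])
```

The adaptive part is simply that `sparsities` differs per layer, which matters in practice because early convolutional layers usually tolerate far less pruning than large later layers.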
Internal solitary waves (ISWs) induce sea surface height (SSH) perturbations that encode subsurface stratification, yet their detection and quantification from satellite altimetry remain limited. Here, we leverage the wide-swath interferometric capability of the surface water and ocean topography (SWOT) mission and develop a two-part framework for ISW characterization. First, a dedicated SSH anomaly (SSHA) processing workflow and a two-dimensional feature-extraction algorithm jointly enhance ISW detectability and delineate perturbation stripes with high fidelity. Second, an amplitude inversion scheme, combining corrected vertical mode functions, enables quantitative retrieval of strongly nonlinear ISW amplitudes from SWOT SSHA. Application to the Maluku Sea reveals basin-scale ISW activity, with perturbations of 10–20 cm and enhanced SSHA amplitudes exceeding 30 cm. Comparative case studies over the Amazon Shelf and Sulu Sea highlight SWOT’s superior spatial resolution relative to Sentinel-3, while coordinated SWOT–mooring experiments in the South China Sea yield ISW amplitude inversion relative errors of 11% and 7.7%. The results demonstrate that SWOT observations, combined with the proposed framework, provide a powerful means for quantitative detection and characterization of ISWs from SSH measurements.
{"title":"SWOT-Based Detection of Sea Surface Height Variations Induced by Internal Solitary Waves: Feature Extraction and Amplitude Inversion","authors":"Hao Zhang;Chenqing Fan;Longyu Huang;Lina Sun;Junmin Meng","doi":"10.1109/JSTARS.2025.3637183","DOIUrl":"https://doi.org/10.1109/JSTARS.2025.3637183","url":null,"abstract":"Internal solitary waves (ISWs) induce sea surface height (SSH) perturbations that encode subsurface stratification, yet their detection and quantification from satellite altimetry remain limited. Here, we leverage the wide-swath interferometric capability of the surface water and ocean topography (SWOT) mission and develop a two-part framework for ISW characterization. First, a dedicated SSH anomaly (SSHA) processing workflow and a two-dimensional feature-extraction algorithm jointly enhance ISW detectability and delineate perturbation stripes with high fidelity. Second, an amplitude inversion scheme, combining corrected vertical mode functions, enables quantitative retrieval of strongly nonlinear ISW amplitudes from SWOT SSHA. Application to the Maluku Sea reveals basin-scale ISW activity, with perturbations of 10–20 cm and enhanced SSHA amplitudes exceeding 30 cm. 
Comparative case studies over the Amazon Shelf and Sulu Sea highlight SWOT’s superior spatial resolution relative to Sentinel-3, while coordinated SWOT–mooring experiments in the South China Sea yield ISW amplitude inversion relative errors of 11% and 7.7%. The results demonstrate that SWOT observations, combined with the proposed framework, provide a powerful means for quantitative detection and characterization of ISWs from SSH measurements.","PeriodicalId":13116,"journal":{"name":"IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing","volume":"19 ","pages":"805-814"},"PeriodicalIF":5.3,"publicationDate":"2025-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=11269836","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145729330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"地球科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
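The amplitude inversion step can be motivated with a textbook two-layer approximation: the surface expression of an internal wave scales with the relative density jump across the interface, eta_surface ≈ eta_interface × (rho2 − rho1) / rho2, so inverting that ratio recovers an approximate interface displacement from an SSHA measurement. This back-of-envelope sketch is NOT the paper's corrected vertical-mode scheme; the density values and the 15 cm SSHA input are illustrative assumptions.

```python
def isw_amplitude(ssha_m: float, rho_upper: float = 1022.0,
                  rho_lower: float = 1026.0) -> float:
    """Invert a sea-surface height anomaly (metres) to an approximate
    internal-wave interface displacement (metres) under a two-layer
    ocean approximation: eta_i = eta_s * rho2 / (rho2 - rho1)."""
    return ssha_m * rho_lower / (rho_lower - rho_upper)

# A 15 cm SSHA maps to an O(10 m) interface displacement for this
# density contrast, consistent with the order of magnitude of ISWs.
amp = isw_amplitude(0.15)
print(amp)
```

The key point the sketch conveys is the amplification factor rho2/(rho2 − rho1), typically a few hundred, which is why decimetre-scale SSH signatures in SWOT swaths can correspond to internal waves tens of metres in amplitude.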