Progressive Symmetric Registration for Multimodal Remote Sensing Imagery
Authors: Heng Yan; Ailong Ma; Yanfei Zhong
Journal: IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-17
Publication date: 2024-12-09
Publication type: Journal Article
DOI: 10.1109/TGRS.2024.3514305
URL: https://ieeexplore.ieee.org/document/10787065/
Impact factor: 7.5 (JCR Q1, Engineering, Electrical & Electronic)
Open access: No
Citations: 0
Abstract
Image registration is the foundation of collaborative processing of multimodal remote sensing imagery (MRSI). However, high-resolution MRSIs frequently exhibit complex distortions caused by imaging characteristics and terrain variations, with global and local distortions present at the same time. Handling these distortions requires corresponding points that are uniformly and densely distributed across the entire image. Existing methods focus primarily on global affine distortions and often extract only sparse, unevenly distributed corresponding points, which makes it difficult to handle the coexisting distortions. To address this problem, we propose a progressive symmetric registration learning network (PSRNet) for MRSIs. In PSRNet, multimodal remote sensing image registration (MRSIR) is redefined as a symmetric dense regression task, in contrast to the traditional pipeline, which concentrates on unidirectional prediction of sparse transformation parameters. Specifically, PSRNet consists of three primary components: 1) a multiscale feature projector (MFP), which employs a dual-branch structure with nonshared weights to obtain modality-specific representations of the different modal images across multiple scales; 2) a progressive cross-modal transformer (PCMT), which further mines modality-invariant features and progressively predicts symmetric deformation fields; and 3) a symmetric consistency loss (SCL) function, comprising an endpoint error loss, a bidirectional alignment loss, and a smoothness loss, which enables high-precision reversible alignment of image pairs. Experimental results demonstrate that PSRNet achieves more comprehensive and advanced registration performance on our self-constructed large-scale high-resolution MRSIR dataset, which includes complex global-local geometric distortions and significant nonlinear radiometric differences (NRD).
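The abstract names the three terms of the symmetric consistency loss but does not give their formulas. The sketch below is one plausible reading, not the authors' formulation. It assumes a deformation field is an H x W grid of (dx, dy) displacements in pixels, that reference fields are available for both directions, and that the bidirectional term can be approximated by a forward-backward (inverse) consistency residual. The function names and the weights `w_cons` and `w_smooth` are hypothetical.

```python
import math

# A deformation field is a list of rows; each entry is a (dx, dy) displacement in pixels.

def endpoint_error(pred, target):
    """Mean Euclidean distance between predicted and reference displacement vectors."""
    total, count = 0.0, 0
    for row_p, row_t in zip(pred, target):
        for (px, py), (tx, ty) in zip(row_p, row_t):
            total += math.hypot(px - tx, py - ty)
            count += 1
    return total / count

def smoothness(flow):
    """Mean squared difference between horizontally and vertically adjacent vectors."""
    h, w = len(flow), len(flow[0])
    total, count = 0.0, 0
    for y in range(h):
        for x in range(w):
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < h and nx < w:
                    dx = flow[ny][nx][0] - flow[y][x][0]
                    dy = flow[ny][nx][1] - flow[y][x][1]
                    total += dx * dx + dy * dy
                    count += 1
    return total / count if count else 0.0

def sample_bilinear(flow, x, y):
    """Bilinearly interpolate a field at a real-valued (x, y), clamped to the grid."""
    h, w = len(flow), len(flow[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    out = []
    for c in (0, 1):
        top = (1 - ax) * flow[y0][x0][c] + ax * flow[y0][x1][c]
        bottom = (1 - ax) * flow[y1][x0][c] + ax * flow[y1][x1][c]
        out.append((1 - ay) * top + ay * bottom)
    return out[0], out[1]

def inverse_consistency(flow_ab, flow_ba):
    """Mean of |F_ab(p) + F_ba(p + F_ab(p))|; zero when the fields are exact inverses.

    Assumption: this stands in for the paper's bidirectional alignment loss.
    """
    h, w = len(flow_ab), len(flow_ab[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            fx, fy = flow_ab[y][x]
            bx, by = sample_bilinear(flow_ba, x + fx, y + fy)
            total += math.hypot(fx + bx, fy + by)
    return total / (h * w)

def symmetric_loss(flow_ab, flow_ba, gt_ab, gt_ba, w_cons=1.0, w_smooth=0.1):
    """Weighted sum of the three terms, each evaluated in both directions."""
    epe = endpoint_error(flow_ab, gt_ab) + endpoint_error(flow_ba, gt_ba)
    cons = inverse_consistency(flow_ab, flow_ba) + inverse_consistency(flow_ba, flow_ab)
    smooth = smoothness(flow_ab) + smoothness(flow_ba)
    return epe + w_cons * cons + w_smooth * smooth
```

Evaluating every term in both directions is what makes the loss symmetric in this sketch: swapping the two images leaves the value unchanged. A pure translation paired with its exact negation, for example, drives all three terms to zero when the reference fields match. A training implementation would use a differentiable array library in place of nested lists.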
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.