Chentong Huang;Junming Hou;Chenxu Wu;Xiaofeng Cong;Man Zhou;Junling Li;Danfeng Hong
IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1-14. Published 2025-02-20. DOI: 10.1109/TGRS.2025.3541561. Impact factor: 8.6; JCR Q1 (Engineering, Electrical & Electronic). https://ieeexplore.ieee.org/document/10897307/
A General Cooperative Optimization Driven High-Frequency Enhancement Framework for Multispectral Image Fusion
Pan-sharpening essentially aims to boost the spatial resolution of a multispectral (MS) image under the guidance of its paired panchromatic (PAN) image. In other words, the process integrates the high-frequency components extracted from texture-rich PAN images into low-resolution (LR) MS images, resulting in texture-rich MS images. Although existing deep learning (DL)-based techniques have achieved impressive performance compared with traditional algorithms, they still struggle to accurately restore high-frequency details in MS images, which limits overall pan-sharpening performance. In addition, reference high-resolution (HR) MS images are often underutilized, typically serving only as training labels. In this work, we present a general high-frequency enhancement framework for pan-sharpening, implemented through a cooperative optimization strategy that combines mutual information (MI) maximization and contrastive learning. Specifically, our model comprises two fundamental modules: the high-frequency feature alignment (HFFA) module and the high-frequency detail calibration (HFDC) module. The former employs MI maximization to align the high-frequency semantic statistical distributions of PAN images and reference HRMS images. The latter calibrates the high-frequency components of the MS modality under the guidance of their PAN counterparts through a contrastive learning constraint, thereby producing more accurate high-frequency information for the MS modality. By integrating the calibrated high-frequency features of the MS modality with those of the PAN modality, we obtain a more comprehensive and precise high-frequency feature representation of the two modalities, facilitating the reconstruction of LRMS images. Our model, incorporating these key elements, significantly surpasses other state-of-the-art (SOTA) techniques across multiple satellite datasets in both quantitative and qualitative experiments. Moreover, real-world full-resolution and cross-sensor assessments testify to its exceptional generalization capabilities. The code is available at https://github.com/Vcocoi/CONet.
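To make the two ingredients named in the abstract more concrete, the following is a minimal NumPy sketch of (a) extracting a high-frequency residual from an image and (b) a contrastive loss that pulls an MS feature toward its PAN counterpart. It is not the authors' CONet implementation: the abstract does not specify how high-frequency components are extracted or which contrastive objective is used, so the box-blur residual, the InfoNCE form of the loss, and all function and parameter names (`box_blur`, `high_frequency`, `info_nce`, `k`, `temperature`) are assumptions chosen for illustration.

```python
import numpy as np


def box_blur(img, k=5):
    """Low-pass filter a 2-D image: mean over a k x k window (k odd), edge-padded."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img, dtype=np.float64)
    h, w = img.shape
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)


def high_frequency(img, k=5):
    """Assumed high-frequency proxy: the image minus its low-pass version."""
    return img.astype(np.float64) - box_blur(img, k)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor vector, one positive, and N negatives.

    anchor, positive: shape (D,); negatives: shape (N, D).
    The loss is small when the anchor is closer (in cosine similarity)
    to the positive than to every negative.
    """
    def unit(v):
        return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-12)

    a, p, n = unit(anchor), unit(positive), unit(negatives)
    logits = np.concatenate([[a @ p], n @ a]) / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    return float(-(logits[0] - np.log(np.exp(logits).sum())))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pan = rng.random((32, 32))             # stand-in for a PAN patch
    pan_hf = high_frequency(pan)           # high-frequency residual of the patch
    ms_feat = rng.normal(size=16)          # hypothetical MS high-frequency feature
    pan_feat = ms_feat + 0.05 * rng.normal(size=16)  # matching PAN feature
    others = rng.normal(size=(8, 16))      # features from unrelated patches
    print(pan_hf.shape, info_nce(ms_feat, pan_feat, others))
```

In this sketch the PAN feature plays the role of the positive sample and features from other patches act as negatives; a learned encoder and the MI-based alignment term described in the abstract would replace the random vectors in a real model.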
About the journal:
IEEE Transactions on Geoscience and Remote Sensing (TGRS) is a monthly publication that focuses on the theory, concepts, and techniques of science and engineering as applied to sensing the land, oceans, atmosphere, and space; and the processing, interpretation, and dissemination of this information.