INF³: Implicit Neural Feature Fusion Function for Multispectral and Hyperspectral Image Fusion
Ruo-Cheng Wu, Shangqi Deng, Ran Ran, Hong-Xia Dou, Liang-Jian Deng
IEEE Transactions on Computational Imaging, vol. 10, pp. 1547-1558, published 2024-11-11. DOI: 10.1109/TCI.2024.3488569
URL: https://ieeexplore.ieee.org/document/10750035/
Citations: 0
Abstract
Multispectral and Hyperspectral Image Fusion (MHIF) aims to fuse a high-resolution multispectral image (HR-MSI) and a low-resolution hyperspectral image (LR-HSI) acquired over the same scene to obtain a high-resolution hyperspectral image (HR-HSI). Benefiting from their powerful inductive bias, convolutional neural network (CNN) based methods have achieved great success on the MHIF task. However, they lack flexibility when processing multi-scale images and require convolutional structures to be stacked to improve performance. Implicit neural representation (INR) has recently achieved good performance and interpretability in 2D processing tasks thanks to its ability to locally interpolate samples and to exploit multimodal content such as pixels and coordinates. Although INR-based approaches show promising results, they place additional demands on high-frequency information (e.g., positional encoding). In this paper, we propose using the HR-MSI as an auxiliary input of high-frequency detail, thereby introducing a new INR-based hyperspectral fusion function called the implicit neural feature fusion function (INF³). The method overcomes the inherent shortcomings of vanilla INR and thereby solves the MHIF problem. Specifically, our INF³ adopts a dual high-frequency fusion (DHFF) structure that extracts high-frequency information from the HR-MSI and LR-HSI and fuses it with coordinate information. Moreover, the proposed INF³ incorporates a parameter-free method called INR with cosine similarity (INR-CS), which uses cosine similarity between feature vectors to generate local weights. Building on INF³, we construct an implicit neural fusion network (INFN) that achieves state-of-the-art performance on the MHIF task on two public datasets, CAVE and Harvard. It also reaches a competitive level on the pansharpening task, demonstrating the flexibility of the proposed approach.
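The parameter-free INR-CS weighting described above can be illustrated with a short sketch: a query feature vector (e.g., drawn from the HR-MSI) is compared against the feature vectors of its nearest LR-HSI latent codes via cosine similarity, and the similarities are normalized into local interpolation weights. This is a minimal sketch based on the abstract, not the authors' released code; the tensor shapes, the choice of four neighbors, and the non-negativity clamp are assumptions made for the sake of a runnable example.

```python
# Sketch of cosine-similarity-based local weighting in the spirit of INR-CS.
# Shapes, the 4-neighbor layout, and the clamp are illustrative assumptions.
import torch
import torch.nn.functional as F


def cosine_similarity_weights(query_feat: torch.Tensor,
                              neighbor_feats: torch.Tensor,
                              eps: float = 1e-8) -> torch.Tensor:
    """Normalized cosine-similarity weights.

    query_feat:     (N, C)    feature vector at each query coordinate
    neighbor_feats: (N, K, C) features of the K nearest latent codes
    returns:        (N, K)    weights summing to 1 over K
    """
    sim = F.cosine_similarity(query_feat.unsqueeze(1), neighbor_feats, dim=-1)  # (N, K)
    sim = sim.clamp(min=0)  # keep only non-negative affinities (assumption)
    return sim / (sim.sum(dim=-1, keepdim=True) + eps)


def interpolate_neighbors(neighbor_vals: torch.Tensor,
                          weights: torch.Tensor) -> torch.Tensor:
    """Blend the K per-neighbor predictions with the similarity weights.

    neighbor_vals: (N, K, C_out) per-neighbor predictions (e.g., spectra)
    weights:       (N, K)
    returns:       (N, C_out)
    """
    return (weights.unsqueeze(-1) * neighbor_vals).sum(dim=1)


if __name__ == "__main__":
    N, K, C, C_out = 1024, 4, 64, 31           # e.g., 31 spectral bands as in CAVE
    q = torch.randn(N, C)                      # query features (e.g., from HR-MSI)
    nb = torch.randn(N, K, C)                  # features of 4 surrounding LR-HSI codes
    vals = torch.randn(N, K, C_out)            # per-neighbor spectral predictions
    w = cosine_similarity_weights(q, nb)
    out = interpolate_neighbors(vals, w)
    print(out.shape)                           # torch.Size([1024, 31])
```

Because the weights come directly from feature-space similarity, this step introduces no learnable parameters, which matches the abstract's description of INR-CS as parameter-free.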
Journal Introduction
The IEEE Transactions on Computational Imaging will publish articles where computation plays an integral role in the image formation process. Papers will cover all areas of computational imaging ranging from fundamental theoretical methods to the latest innovative computational imaging system designs. Topics of interest will include advanced algorithms and mathematical techniques, model-based data inversion, methods for image and signal recovery from sparse and incomplete data, techniques for non-traditional sensing of image data, methods for dynamic information acquisition and extraction from imaging sensors, software and hardware for efficient computation in imaging systems, and highly novel imaging system design.