Commonality Feature Representation Learning for Unsupervised Multimodal Change Detection

IEEE Transactions on Image Processing: a publication of the IEEE Signal Processing Society (IF 13.7) | Pub Date: 2025-02-17 | DOI: 10.1109/TIP.2025.3539461

Tongfei Liu; Mingyang Zhang; Maoguo Gong; Qingfu Zhang; Fenlong Jiang; Hanhong Zheng; Di Lu

Volume 34, pages 1219-1233. Journal Article. https://ieeexplore.ieee.org/document/10891329/

Citations: 0

Abstract

The main challenge of multimodal change detection (MCD) is that multimodal bitemporal images (MBIs) cannot be compared directly to identify changes. To overcome this problem, this paper proposes a novel commonality feature representation learning (CFRL) method and constructs a CFRL-based unsupervised MCD framework. CFRL is composed of a Siamese-based encoder and two decoders. First, the Siamese-based encoder maps the original MBIs into the same feature space to extract the representative features of each modality. Then, the two decoders reconstruct the original MBIs, each regressing its own input. Meanwhile, the decoders are swapped to reconstruct pseudo-MBIs for modality alignment. Subsequently, all reconstructed images are fed into the Siamese-based encoder again to map them into the same feature space, yielding their representative features. On this basis, latent commonality features between the MBIs are extracted by minimizing the distance between these representative features. These latent commonality features are comparable and can be used to identify changes. Notably, the proposed CFRL can be performed simultaneously in the two modalities corresponding to the MBIs. Therefore, two change magnitude images (CMIs) can be generated simultaneously by measuring the difference between the commonality features of the MBIs. Finally, a simple thresholding algorithm or a clustering algorithm can be employed to divide the CMIs into binary change maps. Extensive experiments on six publicly available MCD datasets show that the proposed CFRL-based framework achieves superior performance compared with other state-of-the-art approaches.
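The last two steps of the abstract (measuring the difference between commonality features to obtain a CMI, then thresholding it into a binary change map) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the commonality features are already available as two H x W grids of feature vectors, uses per-pixel Euclidean distance as the difference measure, and picks Otsu's method as the "simple threshold algorithm"; the abstract specifies none of these choices, and the function names `change_magnitude`, `otsu_threshold`, and `binary_change_map` are hypothetical.

```python
import math


def change_magnitude(feats_a, feats_b):
    """Per-pixel Euclidean distance between two H x W grids of feature vectors.

    Assumption: Euclidean distance stands in for the paper's unspecified
    difference measure between commonality features.
    """
    return [
        [math.dist(fa, fb) for fa, fb in zip(row_a, row_b)]
        for row_a, row_b in zip(feats_a, feats_b)
    ]


def otsu_threshold(values, bins=256):
    """Return the threshold that maximizes between-class variance (Otsu)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return lo  # constant image: no meaningful split exists
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        hist[min(int((v - lo) / width), bins - 1)] += 1

    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_var, best_bin = -1.0, 0
    w0, sum0 = 0, 0.0
    for i, h in enumerate(hist):
        w0 += h            # pixel count in the "unchanged" class
        sum0 += i * h
        if w0 == 0:
            continue
        w1 = total - w0    # pixel count in the "changed" class
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2  # between-class variance (unnormalized)
        if var > best_var:
            best_var, best_bin = var, i
    # Use the upper edge of the best bin as the cut point.
    return lo + (best_bin + 1) * width


def binary_change_map(feats_a, feats_b):
    """Build a CMI from two feature grids and binarize it (1 = changed)."""
    cmi = change_magnitude(feats_a, feats_b)
    t = otsu_threshold([v for row in cmi for v in row])
    return [[1 if v > t else 0 for v in row] for row in cmi]
```

In the framework described above, two CMIs are produced, one per modality; this sketch handles a single CMI, and how the two resulting maps would be combined is left open because the abstract does not say.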




Source journal

IEEE transactions on image processing : a publication of the IEEE Signal Processing Society

Self-citation rate

0.00%

Number of publications