Mahdi Alehdaghi, Arthur Josi, Rafael M. O. Cruz, Pourya Shamsolmoali, Eric Granger
Title: Adaptive Generation of Privileged Intermediate Information for Visible-Infrared Person Re-Identification
DOI: 10.1109/TIFS.2025.3541969
Journal: IEEE Transactions on Information Forensics and Security, vol. 20, pp. 3400-3413
Publication date: 2025-02-14 (Journal Article)
Impact factor: 8.0; JCR: Q1 (Computer Science, Theory & Methods)
URL: https://ieeexplore.ieee.org/document/10887926/
Citations: 0
Abstract
Visible-infrared person re-identification (V-I ReID) seeks to retrieve images of the same individual captured over a distributed network of RGB and IR sensors. Several V-I ReID approaches directly integrate the V and I modalities to represent images within a shared space. However, given the significant gap between the data distributions of the V and I modalities, cross-modal V-I ReID remains challenging. One solution is to introduce a privileged intermediate space that bridges the modalities, but in practice such data is rarely available, so effective mechanisms for producing informative intermediate domains must be selected or designed. This paper introduces the Adaptive Generation of Privileged Intermediate Information (AGPI²) training approach to adapt and generate a virtual domain that bridges discriminative information between the V and I modalities. AGPI² enhances the training of a deep V-I ReID backbone by generating and then leveraging bridging privileged information, without modifying the model at inference time. This information captures shared discriminative attributes that are not easily ascertainable by the model within the individual V or I modalities. Toward this goal, a non-linear generative module is trained with adversarial objectives to transform V attributes into an intermediate space that also contains I features. The resulting domain exhibits less shift from the I domain than the V domain does. Meanwhile, the embedding module within AGPI² extracts discriminative modality-invariant features for both modalities by leveraging modality-free descriptors from the generated images, making those images a bridge between the main modalities. Experiments conducted on challenging V-I ReID datasets indicate that AGPI² consistently increases matching accuracy without requiring additional computational resources during inference.
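The intermediate-domain idea above can be illustrated with a deliberately simplified sketch. Here the paper's non-linear, adversarially trained generator is replaced by a plain linear channel mixture that collapses a visible (RGB) image into a single-channel image, which sits "between" the RGB and single-channel infrared domains. All names, the mixing-weight scheme, and the toy data below are assumptions for illustration only, not the authors' implementation.

```python
import numpy as np

def make_intermediate(rgb, weights):
    """Map an RGB image to a single-channel 'intermediate' image via a
    normalized channel mixture -- a linear stand-in (assumption) for the
    non-linear generative module described in the abstract."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the output is a convex combination of channels
    # contract the last (channel) axis of rgb against the weight vector
    return np.tensordot(rgb, w, axes=([-1], [0]))

rng = np.random.default_rng(0)
rgb = rng.random((4, 4, 3))          # toy 4x4 RGB image, values in [0, 1]
inter = make_intermediate(rgb, [0.4, 0.4, 0.2])
assert inter.shape == (4, 4)          # single-channel, IR-like shape
```

In the actual AGPI² approach, the mixing is replaced by a learned non-linear transform whose output is pushed toward the I domain with adversarial objectives, and the intermediate images are used only during training, which is why inference cost is unchanged.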
Journal introduction:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.