Mahdi Alehdaghi, Arthur Josi, Rafael M. O. Cruz, Pourya Shamsolmoali, Eric Granger
Title: Adaptive Generation of Privileged Intermediate Information for Visible-Infrared Person Re-Identification
DOI: 10.1109/TIFS.2025.3541969
Journal: IEEE Transactions on Information Forensics and Security, vol. 20, pp. 3400-3413
Publication date: 2025-02-14 (Journal Article)
Impact factor: 8.0; JCR: Q1 (Computer Science, Theory & Methods)
URL: https://ieeexplore.ieee.org/document/10887926/
Citations: 0
Abstract
Visible-infrared person re-identification (V-I ReID) seeks to retrieve images of the same individual captured over a distributed network of RGB and IR sensors. Several V-I ReID approaches directly integrate the V and I modalities to represent images within a shared space. However, given the significant gap between the data distributions of the V and I modalities, cross-modal V-I ReID remains challenging. One solution is to introduce a privileged intermediate space that bridges the modalities, but in practice such data is rarely available, so effective mechanisms for producing informative intermediate domains must be selected or designed. This paper introduces the Adaptive Generation of Privileged Intermediate Information (AGPI²) training approach to adapt and generate a virtual domain that bridges discriminative information between the V and I modalities. AGPI² enhances the training of a deep V-I ReID backbone by generating and then leveraging bridging privileged information, without modifying the model at inference time. This information captures shared discriminative attributes that are not easily ascertainable by the model within the individual V or I modalities. Toward this goal, a non-linear generative module is trained with adversarial objectives to transform V attributes into an intermediate space that also contains I features. The resulting domain exhibits less shift from the I domain than the V domain does. Meanwhile, the embedding module within AGPI² extracts discriminative modality-invariant features for both modalities by leveraging modality-free descriptors from the generated images, making those images a bridge between the main modalities. Experiments conducted on challenging V-I ReID datasets indicate that AGPI² consistently increases matching accuracy without requiring additional computational resources during inference.
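The intermediate-domain idea above can be illustrated with a deliberately simplified sketch. Here the paper's non-linear, adversarially trained generator is replaced by a plain linear channel mixture that collapses a visible (RGB) image into a single-channel image, which sits "between" the RGB and single-channel infrared domains. All names, the mixing-weight scheme, and the toy data below are assumptions for illustration only, not the authors' implementation.

```python
import numpy as np

def make_intermediate(rgb, weights):
    """Map an RGB image to a single-channel 'intermediate' image via a
    normalized channel mixture -- a linear stand-in (assumption) for the
    non-linear generative module described in the abstract."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalize so the output is a convex combination of channels
    # contract the last (channel) axis of rgb against the weight vector
    return np.tensordot(rgb, w, axes=([-1], [0]))

rng = np.random.default_rng(0)
rgb = rng.random((4, 4, 3))          # toy 4x4 RGB image, values in [0, 1]
inter = make_intermediate(rgb, [0.4, 0.4, 0.2])
assert inter.shape == (4, 4)          # single-channel, IR-like shape
```

In the actual AGPI² approach, the mixing is replaced by a learned non-linear transform whose output is pushed toward the I domain with adversarial objectives, and the intermediate images are used only during training, which is why inference cost is unchanged.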
Journal introduction:
The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.