利用中间域缩小源到目标的差距，实现跨域人员再识别

IF 11.6 2区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE International Journal of Computer Vision Pub Date : 2024-07-31 DOI:10.1007/s11263-024-02169-6

Yongxing Dai, Yifan Sun, Jun Liu, Zekun Tong, Ling-Yu Duan

{"title":"利用中间域缩小源到目标的差距，实现跨域人员再识别","authors":"Yongxing Dai, Yifan Sun, Jun Liu, Zekun Tong, Ling-Yu Duan","doi":"10.1007/s11263-024-02169-6","DOIUrl":null,"url":null,"abstract":"<p>Cross-domain person re-identification (re-ID), such as unsupervised domain adaptive re-ID (UDA re-ID), aims to transfer the identity-discriminative knowledge from the source to the target domain. Existing methods commonly consider the source and target domains are isolated from each other, i.e., no intermediate status is modeled between the source and target domains. Directly transferring the knowledge between two isolated domains can be very difficult, especially when the domain gap is large. This paper, from a novel perspective, assumes these two domains are not completely isolated, but can be connected through a series of intermediate domains. Instead of directly aligning the source and target domains against each other, we propose to align the source and target domains against their intermediate domains so as to facilitate a smooth knowledge transfer. To discover and utilize these intermediate domains, this paper proposes an Intermediate Domain Module (IDM) and a Mirrors Generation Module (MGM). IDM has two functions: (1) it generates multiple intermediate domains by mixing the hidden-layer features from source and target domains and (2) it dynamically reduces the domain gap between the source/target domain features and the intermediate domain features. While IDM achieves good domain alignment effect, it introduces a side effect, i.e., the mix-up operation may mix the identities into a new identity and lose the original identities. Accordingly, MGM is introduced to compensate the loss of the original identity by mapping the features into the IDM-generated intermediate domains without changing their original identity. It allows to focus on minimizing domain variations to further promote the alignment between the source/target domain and intermediate domains, which reinforces IDM into IDM++. We extensively evaluate our method under both the UDA and domain generalization (DG) scenarios and observe that IDM++ yields consistent (and usually significant) performance improvement for cross-domain re-ID, achieving new state of the art. For example, on the challenging MSMT17 benchmark, IDM++ surpasses the prior state of the art by a large margin (e.g., up to 9.9% and 7.8% rank-1 accuracy) for UDA and DG scenarios, respectively. Code is available at https://github.com/SikaStar/IDM.</p>","PeriodicalId":13752,"journal":{"name":"International Journal of Computer Vision","volume":"98 1","pages":""},"PeriodicalIF":11.6000,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Bridging the Source-to-Target Gap for Cross-Domain Person Re-identification with Intermediate Domains\",\"authors\":\"Yongxing Dai, Yifan Sun, Jun Liu, Zekun Tong, Ling-Yu Duan\",\"doi\":\"10.1007/s11263-024-02169-6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Cross-domain person re-identification (re-ID), such as unsupervised domain adaptive re-ID (UDA re-ID), aims to transfer the identity-discriminative knowledge from the source to the target domain. Existing methods commonly consider the source and target domains are isolated from each other, i.e., no intermediate status is modeled between the source and target domains. Directly transferring the knowledge between two isolated domains can be very difficult, especially when the domain gap is large. This paper, from a novel perspective, assumes these two domains are not completely isolated, but can be connected through a series of intermediate domains. Instead of directly aligning the source and target domains against each other, we propose to align the source and target domains against their intermediate domains so as to facilitate a smooth knowledge transfer. To discover and utilize these intermediate domains, this paper proposes an Intermediate Domain Module (IDM) and a Mirrors Generation Module (MGM). IDM has two functions: (1) it generates multiple intermediate domains by mixing the hidden-layer features from source and target domains and (2) it dynamically reduces the domain gap between the source/target domain features and the intermediate domain features. While IDM achieves good domain alignment effect, it introduces a side effect, i.e., the mix-up operation may mix the identities into a new identity and lose the original identities. Accordingly, MGM is introduced to compensate the loss of the original identity by mapping the features into the IDM-generated intermediate domains without changing their original identity. It allows to focus on minimizing domain variations to further promote the alignment between the source/target domain and intermediate domains, which reinforces IDM into IDM++. We extensively evaluate our method under both the UDA and domain generalization (DG) scenarios and observe that IDM++ yields consistent (and usually significant) performance improvement for cross-domain re-ID, achieving new state of the art. For example, on the challenging MSMT17 benchmark, IDM++ surpasses the prior state of the art by a large margin (e.g., up to 9.9% and 7.8% rank-1 accuracy) for UDA and DG scenarios, respectively. Code is available at https://github.com/SikaStar/IDM.</p>\",\"PeriodicalId\":13752,\"journal\":{\"name\":\"International Journal of Computer Vision\",\"volume\":\"98 1\",\"pages\":\"\"},\"PeriodicalIF\":11.6000,\"publicationDate\":\"2024-07-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Journal of Computer Vision\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s11263-024-02169-6\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal of Computer Vision","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11263-024-02169-6","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

跨域人员再识别（re-ID），如无监督域自适应再识别（UDA re-ID），旨在将身份识别知识从源域转移到目标域。现有方法通常认为源域和目标域是相互隔离的，即源域和目标域之间没有中间状态模型。在两个孤立的域之间直接转移知识可能非常困难，尤其是当域差距较大时。本文从一个新颖的角度出发，假设这两个域并非完全孤立，而是可以通过一系列中间域连接起来。我们建议不直接将源域和目标域对齐，而是将源域和目标域对齐其中间域，以促进知识的顺利转移。为了发现和利用这些中间域，本文提出了一个中间域模块（IDM）和一个镜像生成模块（MGM）。IDM 有两个功能：(1) 通过混合源域和目标域的隐藏层特征生成多个中间域；(2) 动态减少源域/目标域特征与中间域特征之间的域差距。虽然 IDM 实现了良好的域对齐效果，但它也带来了副作用，即混合操作可能会将标识混合成一个新标识，从而丢失原始标识。因此，我们引入了 MGM，通过将特征映射到 IDM 生成的中间域而不改变其原始标识，以补偿原始标识的丢失。它可以将重点放在最小化域变化上，进一步促进源域/目标域和中间域之间的一致性，从而将 IDM 强化为 IDM++。我们在 UDA 和域泛化 (DG) 场景下对我们的方法进行了广泛评估，观察到 IDM++ 在跨域重新识别（cross-domain re-ID）方面产生了一致（通常是显著）的性能改进，达到了新的技术水平。例如，在具有挑战性的 MSMT17 基准上，IDM++ 在 UDA 和 DG 场景下的排名-1 准确率分别高达 9.9% 和 7.8%，远远超过了之前的技术水平。代码见 https://github.com/SikaStar/IDM。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

摘要图片

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Bridging the Source-to-Target Gap for Cross-Domain Person Re-identification with Intermediate Domains

Cross-domain person re-identification (re-ID), such as unsupervised domain adaptive re-ID (UDA re-ID), aims to transfer the identity-discriminative knowledge from the source to the target domain. Existing methods commonly consider the source and target domains are isolated from each other, i.e., no intermediate status is modeled between the source and target domains. Directly transferring the knowledge between two isolated domains can be very difficult, especially when the domain gap is large. This paper, from a novel perspective, assumes these two domains are not completely isolated, but can be connected through a series of intermediate domains. Instead of directly aligning the source and target domains against each other, we propose to align the source and target domains against their intermediate domains so as to facilitate a smooth knowledge transfer. To discover and utilize these intermediate domains, this paper proposes an Intermediate Domain Module (IDM) and a Mirrors Generation Module (MGM). IDM has two functions: (1) it generates multiple intermediate domains by mixing the hidden-layer features from source and target domains and (2) it dynamically reduces the domain gap between the source/target domain features and the intermediate domain features. While IDM achieves good domain alignment effect, it introduces a side effect, i.e., the mix-up operation may mix the identities into a new identity and lose the original identities. Accordingly, MGM is introduced to compensate the loss of the original identity by mapping the features into the IDM-generated intermediate domains without changing their original identity. It allows to focus on minimizing domain variations to further promote the alignment between the source/target domain and intermediate domains, which reinforces IDM into IDM++. We extensively evaluate our method under both the UDA and domain generalization (DG) scenarios and observe that IDM++ yields consistent (and usually significant) performance improvement for cross-domain re-ID, achieving new state of the art. For example, on the challenging MSMT17 benchmark, IDM++ surpasses the prior state of the art by a large margin (e.g., up to 9.9% and 7.8% rank-1 accuracy) for UDA and DG scenarios, respectively. Code is available at https://github.com/SikaStar/IDM.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

International Journal of Computer Vision 工程技术-计算机：人工智能

CiteScore

29.80

自引率

2.10%

发文量

163

审稿时长

6 months

期刊介绍： The International Journal of Computer Vision (IJCV) serves as a platform for sharing new research findings in the rapidly growing field of computer vision. It publishes 12 issues annually and presents high-quality, original contributions to the science and engineering of computer vision. The journal encompasses various types of articles to cater to different research outputs. Regular articles, which span up to 25 journal pages, focus on significant technical advancements that are of broad interest to the field. These articles showcase substantial progress in computer vision. Short articles, limited to 10 pages, offer a swift publication path for novel research outcomes. They provide a quicker means for sharing new findings with the computer vision community. Survey articles, comprising up to 30 pages, offer critical evaluations of the current state of the art in computer vision or offer tutorial presentations of relevant topics. These articles provide comprehensive and insightful overviews of specific subject areas. In addition to technical articles, the journal also includes book reviews, position papers, and editorials by prominent scientific figures. These contributions serve to complement the technical content and provide valuable perspectives. The journal encourages authors to include supplementary material online, such as images, video sequences, data sets, and software. This additional material enhances the understanding and reproducibility of the published research. Overall, the International Journal of Computer Vision is a comprehensive publication that caters to researchers in this rapidly growing field. It covers a range of article types, offers additional online resources, and facilitates the dissemination of impactful research.

期刊最新文献

CS-CoLBP: Cross-Scale Co-occurrence Local Binary Pattern for Image Classification Warping the Residuals for Image Editing with StyleGAN Pulling Target to Source: A New Perspective on Domain Adaptive Semantic Segmentation Feature Matching via Graph Clustering with Local Affine Consensus Learning to Detect Novel Species with SAM in the Wild