Yongxing Dai, Yifan Sun, Jun Liu, Zekun Tong, Ling-Yu Duan
International Journal of Computer Vision (JCR Q1, impact factor 11.6), published 2024-07-31. DOI: https://doi.org/10.1007/s11263-024-02169-6
Bridging the Source-to-Target Gap for Cross-Domain Person Re-identification with Intermediate Domains
Cross-domain person re-identification (re-ID), such as unsupervised domain adaptive re-ID (UDA re-ID), aims to transfer identity-discriminative knowledge from the source domain to the target domain. Existing methods commonly treat the source and target domains as isolated from each other, i.e., no intermediate state is modeled between them. Directly transferring knowledge between two isolated domains can be very difficult, especially when the domain gap is large. This paper, from a novel perspective, assumes that the two domains are not completely isolated but can be connected through a series of intermediate domains. Instead of directly aligning the source and target domains against each other, we propose to align them against their intermediate domains so as to facilitate smooth knowledge transfer. To discover and utilize these intermediate domains, this paper proposes an Intermediate Domain Module (IDM) and a Mirrors Generation Module (MGM). IDM has two functions: (1) it generates multiple intermediate domains by mixing the hidden-layer features from the source and target domains, and (2) it dynamically reduces the domain gap between the source/target domain features and the intermediate domain features. While IDM achieves good domain alignment, it introduces a side effect: the mix-up operation may blend identities into a new identity and lose the original ones. Accordingly, MGM is introduced to compensate for the loss of the original identities by mapping the features into the IDM-generated intermediate domains without changing their original identity. This allows the method to focus on minimizing domain variations, further promoting the alignment between the source/target domains and the intermediate domains, and reinforces IDM into IDM++.
We extensively evaluate our method under both UDA and domain generalization (DG) scenarios and observe that IDM++ yields consistent (and usually significant) performance improvements for cross-domain re-ID, achieving a new state of the art. For example, on the challenging MSMT17 benchmark, IDM++ surpasses the prior state of the art by a large margin (up to 9.9% and 7.8% in rank-1 accuracy) for the UDA and DG scenarios, respectively. Code is available at https://github.com/SikaStar/IDM.
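The feature-mixing idea behind IDM can be illustrated with a minimal sketch. The abstract does not say how the mixing weights are obtained, so the code below assumes a plain convex combination of one source feature vector and one target feature vector, with randomly sampled weights. The function names `mix_features` and `intermediate_domains` and the parameter `a_source` are hypothetical and are not taken from the authors' repository.

```python
import random


def mix_features(source_feat, target_feat, a_source):
    """Convex mix of two same-length feature vectors.

    a_source in [0, 1] is the weight on the source feature; the target
    weight is 1 - a_source. This mixing rule is an assumption made for
    illustration, not the rule stated in the paper.
    """
    if len(source_feat) != len(target_feat):
        raise ValueError("feature vectors must have the same length")
    if not 0.0 <= a_source <= 1.0:
        raise ValueError("a_source must lie in [0, 1]")
    a_target = 1.0 - a_source
    return [a_source * s + a_target * t
            for s, t in zip(source_feat, target_feat)]


def intermediate_domains(source_feat, target_feat, num_domains, rng=None):
    """Sample several mixing weights to obtain multiple intermediate features.

    Returns a list of (weight, mixed_feature) pairs. Random sampling of the
    weights is a placeholder assumption.
    """
    rng = rng or random.Random(0)
    weights = [rng.random() for _ in range(num_domains)]
    return [(a, mix_features(source_feat, target_feat, a)) for a in weights]


if __name__ == "__main__":
    source = [1.0, 0.0, 2.0]
    target = [0.0, 1.0, 4.0]
    print(mix_features(source, target, 0.25))  # [0.25, 0.75, 3.5]
```

With `a_source = 1.0` the mixed feature equals the source feature, and with `a_source = 0.0` it equals the target feature, so the weight traces a path between the two domains. Because each mixed feature blends two different people, it no longer corresponds to either original identity, which is the side effect that MGM is introduced to address.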
About the journal:
The International Journal of Computer Vision (IJCV) serves as a platform for sharing new research findings in the rapidly growing field of computer vision. It publishes 12 issues annually and presents high-quality, original contributions to the science and engineering of computer vision. The journal encompasses various types of articles to cater to different research outputs.
Regular articles, which span up to 25 journal pages, focus on significant technical advancements that are of broad interest to the field. These articles showcase substantial progress in computer vision.
Short articles, limited to 10 pages, offer a swift publication path for novel research outcomes. They provide a quicker means for sharing new findings with the computer vision community.
Survey articles, comprising up to 30 pages, provide critical evaluations of the current state of the art in computer vision or tutorial presentations of relevant topics. These articles give comprehensive and insightful overviews of specific subject areas.
In addition to technical articles, the journal also includes book reviews, position papers, and editorials by prominent scientific figures. These contributions serve to complement the technical content and provide valuable perspectives.
The journal encourages authors to include supplementary material online, such as images, video sequences, data sets, and software. This additional material enhances the understanding and reproducibility of the published research.
Overall, the International Journal of Computer Vision is a comprehensive publication that caters to researchers in this rapidly growing field. It covers a range of article types, offers additional online resources, and facilitates the dissemination of impactful research.